Hypothetical Indexes in PostgreSQL

PostgreSQL LogoAt recent conferences, we have received a curious question from users who have used Oracle in the past and are now using PostgreSQL: “Do we have hypothetical indexes in PostgreSQL ?“. The answer to that question is YES. The true meaning of hypothetical is imaginary but not real. We can try creating indexes that are imaginary for the PostgreSQL optimizer which don’t exist in reality. Let’s look at some detailed examples.

How are hypothetical indexes helpful?

The execution time of an SQL in most of the relational databases depends on the cost of the execution. An optimizer chooses the plan that has the lowest cost, and thus considers further phases such as execute and fetch using that plan. One of the easiest ways of optimizing a well-written SQL is through appropriate indexes that are suitable for that query. An index may reduce the number of pages fetched from disk and may live in the cache due to its size (<<< table size). So, indexing is always a low-hanging fruit for admins and developers wishing to tune an SQL.

But often developers wish to see if an index can be really helpful. The only option available is to create the index and check to see if the queries are able to use it with a reduced cost. Creating the index may be fine if it is in a test environment, or a table that is not huge in size, but if for some reason you wish to see if an index can be helpful for an SQL without actually creating it, then you may test and try this extension to create hypothetical indexes.

PostgreSQL Extension for hypothetical indexes

In order to achieve the functionality of creating imaginary indexes in PostgreSQL, we need to use an extension named hypopg. This extension is made available in the PGDG repository from PostgreSQL 9.2 and works until the latest active release PostgreSQL 11.3. In order to create this extension, you may just use yum or apt depending on your linux distro or compile it from source.

Installing hypopg

In RedHat/CentOS

For Debian/Ubuntu, you must make sure you have the development package named: postgresql-server-dev-X where X is the major version.

Creating the Extension

In order to show you the list of functions created by hypopg, I have created this extension in a different schema. Here is the list:

hypopg Functions

hypopg_create_index: This function is used to create a hypothetical index. We do not have to specify the name of the index because it is ignored by this function anyway.

hypopg_drop_index: This function is used to drop a hypothetical index upon testing. As soon as you disconnect your session, the index automatically gets dropped as it is only created in the private memory of that session.

hypopg_get_indexdef: The definition of the hypothetical index we have created using this function.

hypopg_list_indexes: This function helps us list all the hypothetical indexes we have created in our session.

hypopg_relation_size: This function helps us estimate the index size, so we know the approximate amount of space this index could occupy.

hypopg_reset: This function drops all the hypothetical indexes created by us in a single command.

How does a hypothetical index work in reality?

When you use the hypopg_create_index function to create a hypothetical index, it creates an index in our connection’s private memory. None of the catalog tables nor the original tables are actually touched. The only way to see if we can benefit from that index is by running an EXPLAIN <QUERY>. However, if you wish to run an EXPLAIN ANALYZE that runs that SQL and provides the run time stats, it would not be able to use that hypothetical index. This is because a hypothetical index does not exist in reality.

Testing hypothetical indexes

While testing hypothetical indexes, I want to check if there are any cases where it may not work. Because our main reason for using this extension is to test whether the optimizer could identify the index we wish to create, I have thought of testing almost all types of the most-used indexes in PostgreSQL such as BTREE, HASH, GIN, GiST, BRIN, and BLOOM indexes. Only BTREE indexes are currently mentioned as applicable for hypothetical indexes.

For this reason, I have created the following table which may not make much sense for a real-time use case but helps us test all the types of indexes we could create.