EXPLAIN FORMAT=JSON: cost_info knows why optimizer prefers one index to another

February 22, 2016
Author
Sveta Smirnova
Share this Post:

EXPLAIN FORMAT=JSONTime for another entry in the EXPLAIN FORMAT=JSON is cool! series of blog posts. This time we’ll discuss how using EXPLAIN FORMAT=JSON allows you to see that cost_info knows why the optimizer prefers one index to another.

Tables often have more than one index. Any of these indexes can be used to resolve query. The optimizer has to make a choice in this case. One of the metrics that can be used to help make the choice is the potential cost of the query evaluation.

For example, let’s take the table titles from the standard employees database:

It has two indexes: emp_no and PRIMARY, each of which could be used to resolve query:

At first glance, emp_no doesn’t really fit for this query. PRIMARY does fit, because it contains both the title and from_date fields. Unfortunately, it cannot be used to resolve the query, because we don’t limit the search by emp_no and title . It can, however, be used to select rows from the index. When we use EXPLAIN , though, it shows us that the optimizer has chosen index emp_no (every secondary index in InnoDB contains a link to the clustered index anyway):

PRIMARY KEY exists in the field possible_keys, but was not chosen. EXPLAIN FORMAT=JSON can show us why.

First let’s run it on the original query:

The important part here is:

Which shows that the overall query_cost is 89796.80. We don’t really know what the units are for this cost, or how it is actually measured. It isn’t important; the only thing that is important for now is that smaller is better. (Think of it like shopping for a product: it doesn’t matter which you buy it from, just that you buy it at the lowest price.)

Another important member of the index is cost_info, which belongs to the table itself:

Here we get even more details, such as cost of read operation and evaluation. prefix_cost is not useful for this example, because it contains the cost of joining to the next table in JOIN. Since we don’t join the table titles with any other value of prefix_cost, is equivalent to the cost of the full query. data_read_per_join contains the amount of data that should be read for each JOIN operation. In our case it is once again the same as how much data we should read to fully evaluate the query.

Now let’s force index PRIMARY and examine the EXPLAIN FORMAT=JSON output:

Notice the numbers are different this time. The total query cost is 531269.80, which is about 6 times greater than 89796.80:

read_cost is 442725.00, which is 353 times greater than 1252.00. However, the eval_cost and data_read_per_join are the same as the query that uses index emp_no :

These numbers clearly explain why the optimizer prefers the index emp_no to PRIMARY KEY.

In our example above this behavior is correct. In a real life scenario, if the optimizer’s choice is wrong. these numbers can show either that there is a bug in the optimizer or that the table’s statistics are outdated and need to be updated.

Conclusion: EXPLAIN FORMAT=JSON can be used together with FORCE INDEX to find out why the optimizer prefers one index to another.

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved