Using Hints to Analyze Queries

Hints to Analyze QueriesIn this blog post, we’ll look at using hints to analyze queries.

There are a lot of things that you can do wrong when writing a query, which means that there a lot of things that you can do to make it better. From my personal experience there are two things you should review first:

  1. The table join order
  2. Which index is being used

Why only those two? Because many other alternatives that are more expensive, and at the end query optimization is a cost-effectiveness analysis. This is why we must start with the simplest fixes. We can control this with the hints “straight_join” and “force index”. These allow us to execute the query with the plan that we would like to test.

Join Order

In a query where we use multiple tables or subqueries, we have some particular fields that we are going to use to join the tables. Those fields could be the Primary Key of the table, the first part of a secondary index, neither or both. But before we analyze possible scenarios, table structure or indexes, we need to establish what is the best order for that query to join the tables.

When we talked about join order and the several tables to join, one possible scenario is that a table is using a primary key to join a table, and another field to join to other tables. For instance:

We get this explain:

It is filtering by value1 on table_a, which joins with table_b with the primary key, and table_c uses the value of id_c which it gets from table_b.

But we can change the table order and use straight_join:

The query is semantically the same, but now we get this explain:

In this case, we are performing a full table scan over table_c, which then joins with table_b using index over id_c to finally join table_a using the primary key.

Sometimes the optimizer chooses the incorrect join order because of bad statistics. I found myself reviewing the first query with the second explain plan, where the only thing that I did to find the query problem was to add “STRAIGHT_JOIN” to the query.

Taking into account that the optimizer could fail on this task, we found a practical way to force it to do what we want (change the join order).

It is also useful to find out when an index is missing. For example:

The explain plan shows:

It is starting with costs_spac_types and then using the clustered index for the next two tables. The explain doesn’t look bad!

However, it was taking longer than this:

0.17 sec versus 0.09 sec. This is the explain plan:

Reviewing the table structure:

I saw that the unique index was over cost_id and then spac_type_id. After adding this index:

Now, the explain plan without STRIGHT_JOIN is:

Which is much better, as it is scanning fewer rows and the query time is just 0.01 seconds.


The optimizer has the choice of using a clustered index, a secondary index, a partial secondary index or no index at all, which means that it uses the clustered index.

Sometimes the optimizer ignores the use of an index because it thinks reading the rows directly is faster than an index lookup:

In both cases, we are reading directly from the clustered index.

Then, we have secondary indexes that are partially used or/and that are partially useful for the query. This means that we are going to scan the index and then we are going to lookup in the clustered index. YES! TWO STRUCTURES WILL BE USED! We usually don’t realize any of this, but this is like an extra join between the secondary index and the clustered index.

Finally, the covering index, which is simple to identify as “Using index” in the extra column:

Index Analysis

As I told you before, this is a cost-effectiveness analysis from the point of view of query performance. Most of the time it is faster to use covering indexes than secondary indexes, and finally the clustered index. However, usually covering indexes are more expensive for writes, as you need more fields to cover the query needs. So we are going to use a secondary index that also uses the clustered index. If the amount of rows is not large and it is selecting most of the rows, however, it could be even faster to perform a full table scan. Another thing to take into account is that the amount of indexes affects the write rate.

Let’s do an analysis. This is a common query:

It is using all the fields of each table.

This is more restrictive:

But it is performing a full table scan over t2, and then is using t2.value1 to lookup on t1 using the clustered index.

Let’s add an index on table_index_analisis_2 over value1:

The explain shows that it is not being used, not even when we force it:

This is because the optimizer considers performing a full table scan better than using a part of the index.

Now we are going to add an index over value1 and value2:

We can see that now it is using the index, and in the extra column says “Using index” — which means that it is not using the clustered index.

Finally, we are going to add an index over table_index_analisis_1, in the best way that it is going to be used for this query:

However, it is not selected by the optimizer. That is why we need to force it:

Now, we are just using the secondary index in both cases.


There are many more hints to analyze queries we could review, like handler used, table design, etc. However, in my opinion, it is useful to focus on these at the beginning of the analysis.

I will also like to point out that using hints is not a long-term solution! Hints should be used just in the analysis phase.

Share this post