MySQL 101: How to Find and Tune a Slow SQL Query

tune a slow sql queryOne of the most common support tickets we get at Percona is the infamous “database is running slower” ticket.  While this can be caused by a multitude of factors, it is more often than not caused by a bad query.  While everyone always hopes to recover through some quick config tuning, the real fix is to identify and fix the problem query.  Sure, we can generally alleviate some pain by throwing more resources at the server.  But this is almost always a short term bandaid and not the proper fix.

With Percona Monitoring and Management

So how do we find the queries causing problems and fix them?  If you have Percona Monitoring and Management (PMM) installed, the identification process is swift.  With the Query Analytics enabled (QAN) in PMM, you can simply look at the table to identify the top query:

Percona Monitoring and Management queries

When you click on the query in the table, you should see some statistics about that query and also (in most cases), an example:

Percona Monitoring and Management queries

View a Demo of Percona Monitoring and Management

Without Percona Monitoring and Management

Now, let’s assume that you don’t have PMM installed yet (I’m sure that is being worked on as you read this).  To find the problem queries, you’ll need to do some manual collection and processing that PMM does for you.  The following is the best process for collecting and aggregating the top queries:

  1. Set long_query_time = 0 (in some cases, you may need to rate limit to not flood the log)
  2. Enable the slow log and collect for some time (slow_query_log = 1)
  3. Stop collection and process the log with pt-query-digest
  4. Begin reviewing the top queries in times of resource usage

Note – you can also use the performance schema to identify queries, but setting that up is outside the scope of this post.  Here is a good reference on how to use P_S to find suboptimal queries.

When looking for bad queries, one of the top indicators is a large discrepancy between rows_examined and rows_sent.  In cases of suboptimal queries, the rows examined will be very large compared with a small number of rows sent.

Once you have identified your query, it is time to start the optimization process.  The odds are that the queries at the top of your list (either in PMM or the digest report) lack indices.  Indexes allow the optimizer to target the rows you need rather than scanning everything and discarding non-matching values.   Let’s take the following sample query as an example:

This looks like a straightforward query that should be pretty simple.  However, it is showing up as a resource hog and is bogging down the server.  Here is how it showed up in the pt-query-digest output:

We can see right away the high number of rows examined vs. the rows sent, as highlighted above.  So now that we’ve identified the problem query let’s start optimizing it.  Step 1 in optimizing the query would be to run an EXPLAIN plan:

The EXPLAIN output is the first clue that this query is not properly indexed.  The type: ALL indicates that the entire table is being scanned to find a single record.  In many cases, this will lead to I/O pressure on the system if your dataset exceeds memory.  The Using filesort indicates that once it goes through the entire table to find your rows, it has to then sort them (a common symptom of CPU spikes).

Limiting Rows Examined

One thing that is critical to understand is that query tuning is an iterative process.  You won’t always get it right the first time and data access patterns may change over time.  In terms of optimization, the first thing we want to do is get this query using an index and not using a full scan.  For this, we want to look at the WHERE clause: where username = “admin1”.

With this column theoretically being selective, an index on username would be a good start.  Let’s add the index and re-run the query:

Optimizing Sorts

So we are halfway there!  The type: ref indicates we are now using an index, and you can see the rows dropped from 635k down to 1.  This example isn’t the best as this finds one row, but the next thing we want to address is the filesort.  For this, we’ll need to change our username index to be a composite index (multiple columns).  The rule of thumb for a composite index is to work your way from the most selective to the least selective columns, and then if you need sorting, keep that as the last field.  Given that premise, let’s modify the index we just added to include the last_login field:

And there we have it!  Even if this query scanned more than one row, it would read them in sorted order, so the extra CPU needed for the sorting is eliminated.  To show this, let’s do this same index on a non-unique column (I left email as non-unique for this demo):

In summary, the general process to tune a SQL query follows this process:

  1. Identify the query (either manually or with a tool like PMM)
  2. Check the EXPLAIN plan of the query
  3. Review the table definition
  4. Create indexes
    1. Start with columns in the WHERE clause
    2. For composite indexes, start with the most selective column and work to the least selective column
    3. Ensure sorted columns are at the end of the composite index
  5. Review the updated explain plan and revise as needed
  6. Continue to review the server to identify changes in access patterns that require new indexing

While query optimization can seem daunting, using a process can make it much easier to achieve.  Naturally, optimizing complex queries isn’t trivial like the above example, but is definitely possible when broken down.  And remember that Percona engineers are always available to help you when you get stuck!  Happy optimizing!

View a Demo of Percona Monitoring and Management

Our solution brief “Get Up and Running with Percona Server for MySQL” outlines setting up a MySQL® database on-premises using Percona Server for MySQL. It includes failover and basic business continuity components.

Download PDF

Share this post

Leave a Reply