Identifying High Load Spots in MySQL Using Slow Query Log and pt-query-digest

Identifying High Load Spots in MySQL Using Slow Query Log and pt-query-digest

PREVIOUS POST
NEXT POST

pt-query-digest MySQL slow queriespt-query-digest is one of the most commonly used tool when it comes to query auditing in MySQL®. By default, pt-query-digest reports the top ten queries consuming the most amount of time inside MySQL. A query that takes more time than the set threshold for completion is considered slow but it’s not always true that tuning such queries makes them faster. Sometimes, when resources on server are busy, it will impact every other operation on the server, and so will impact queries too. In such cases, you will see the proportion of slow queries goes up. That can also include queries that work fine in general.

This article explains a small trick to identify such spots using pt-query-digest and the slow query log. pt-query-digest is a component of Percona Toolkit, open source software that is free to download and use.

Some sample data

Let’s have a look at sample data in Percona Server 5.7. Slow query log is configured to capture queries longer than ten seconds with no limit on rate of logging, which is generally considered to throttle the IO that comes while writing slow queries to the log file.

When I run pt-query-digest, I see in the summary report that 80% of the queries have come from just three query patterns.

Query #1 is generated by the qan-agent from PMM and runs approximately once a minute. These results will be handed over to PMM Server. Similarly queries #2 & #3 are pretty simple. I mean, they scan just one row and will return either zero or one rows. They also use indexing, which makes me think that this is not because of something just with in MySQL. I wanted to know if I could find any common aspect of all these occurrences.

Let’s take a closer look at the queries recorded in slow query log.

Now you can see two things.

  • All of them have same Unix timestamp
  • All of them were spending more than 70% of their execution time waiting for some lock.

Analyzing the data from pt-query-digest

Now I want to check if I can group the count of queries based on their time of execution. If there are multiple queries at a given time captured into the slow query log, time will be printed for the first query but not all. Fortunately, in this case I can rely on the Unix timestamp to compute the counts. The timestamp is gets captured for every query. Luckily, without a long struggle, a combination of grep and awk utilities have displayed what I wanted to display.

You can use the command below to check the regular date time format of a given timestamp. So, Oct 3, 05:32 is when there was something wrong on the server:

Query tuning can be carried out alongside this, but identifying such spots helps avoiding spending time on query tuning where badly written queries are not the problem. Having said that, from this point, further troubleshooting may take different sub paths such as checking log files at that particular time, looking at CPU reports, reviewing past pt-stalk reports if set up to run in the background, and dmesg etc. This approach is useful for identifying at what time (or time range) MySQL was more stressed just using slow query log when no robust monitoring tools, like Percona Monitoring and Management (PMM), are deployed.

Using PMM to monitor queries

If you have PMM, you can review Query Analytics to see the topmost slow queries, along with details like execution counts, load etc. Below is a sample screen copy for your reference:

Slow query log from PMM dashboard

NOTE: If you use Percona Server for MySQL, slow query log can report time in micro seconds. It also supports extended logging of  other statistics about query execution. These provide extra power to see the insights of query processing. You can see more information about these options here.

PREVIOUS POST
NEXT POST

Share this post

Leave a Reply