What do we optimize with mk-query-digest ?Peter Zaitsev
When we’re looking at mk-query-digest report we typically look at the Queries causing the most impact (sum of the query execution times) as well as queries having some longest samples. Why are we looking at these ?
Queries with highest Impact are important because looking at these queries and optimizing them typically helps to improve system capacity, which in the end often improves response time of the system for variety of queries. Also queries causing highest impact often correspond to either frequent or slowest transactions in the system which makes these queries important to look at.
Slowest Queries correspond to bad response times. If there is 30 sec query this means somebody had to wait for 30 seconds which is typically not acceptable for interactive queries. It is best to look at 95 percentile “time” rather than slowest time for the queries as there are often exceptionally high time which you can see which are unrelated to query itself – general system overload, row level locks, table locks all could cause query response time to be unreasonable. If Slowest query is not interactive query, for example part of a batch job it is often fine, assuming it is not causing system overload, interactive queries, which have real users waiting for them are often high priority to be optimized. It helps if batch jobs use different MySQL user compared to the web site itself as it makes it easier to remove such queries from analyzes.
What is Important to mention neither of these two query types really guarantees improvement for application performance (and even more so transaction we’re trying to optimize), it typically does so as transactions you’re trying to optimize typically will manifest itself in one of these query sets.
If you are looking for improving response time of particular transaction/user interaction you really have to look at it, rather than aggregating data for the all server – understanding what queries are being ran (together with other interactions – memcache, remote sites etc) is a key to understanding what causes performance issues.
If you have slow queries from interactive application it is true somebody is suffering (though it is unclear how critical this suffering is) – it is however perfectly possible to get bad performance with MySQL responsible for large portion of response time without any slow queries. We’ve seen pages which take 10.000, 20.000 and even more queries to generate. In such cases even with queries taking 1 ms in average you’re looking at 10,20 or more seconds to respond.