I don’t use many tools in my consulting practice, but the ones I do use, I try to know as well as I can. I’ve been using mk-query-digest for almost as long as it has existed, yet it still surprises me in ways I couldn’t have imagined. This time I’d like to share a quick tip on how mk-query-digest lets you slice your data in a completely different way than it does by default.
Disclaimer: this only works when persistent connections or connection pools aren’t used, and it is only accurate when a single MySQL connection is used for the duration of a request.
If you are trying to reduce the load on the database server and [as a result] improve response time for some random user requests, you are usually interested in the queries that consume the most MySQL time, and that is how mk-query-digest groups and orders data by default. Fixing the top 10 queries on that list will indeed most likely reduce the load and improve response time for some requests. But what if some pages are still slow to load because of time spent in the database, and you either can’t or don’t want to profile or debug the application to figure out what’s happening under the hood?
That sounds like what I was working on today: I had a slow query log (captured with long_query_time=0 and all the eXtra benefits of the Percona slow query log patch), and I knew some particular pages were taking minutes to load, which is exactly what the customer asked me to focus on. So instead of asking mk-query-digest for the top slowest queries, I asked it for the top slowest sessions:
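As an aside, capturing a log like that involves settings along these lines. This is only a sketch; the exact option names differ between MySQL versions and Percona patch releases, so verify them against your server before using:

```
# my.cnf sketch (assumed option names; check your server's version)
[mysqld]
log_slow_queries   = /var/log/mysql/mysql-slow.log
long_query_time    = 0       # log every query (microsecond patch required on 5.0)
log_slow_verbosity = full    # Percona patch: extended per-query statistics
```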
mk-query-digest --group-by=Thread_id --order-by=Query_time:sum in > out
Spot on: the session I needed to focus on was right at the top. And what do you know, 519 queries were run during that session, taking 148 seconds overall:
# ########################################################################
# Report grouped by Thread_id
# ########################################################################
# Item 1: 3.41 QPS, 0.97x concurrency, ID 0xABCE5AD2A2DD1BA1 at byte 288124661
# This item is included in the report because it matches --limit.
# Scores: Apdex = 0.97 [1.0], V/M = 19.02
# Query_time sparkline: | ^______|
# Time range: 2011-04-05 16:12:13 to 16:14:45
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count          0     519
# Exec time      2    148s    11us     33s   285ms    53ms      2s    26us
# Lock time      0     5ms       0   334us     9us    66us    32us       0
# Rows sent      0      41       0       1    0.08    0.99    0.27       0
# Rows examine   1   4.97M       0 445.49k   9.80k   5.73k  49.33k       0
# Rows affecte   0       2       0       1    0.00       0    0.06       0
# Rows read      1   2.01M       0 250.47k   3.96k    1.96  27.94k    0.99
# Bytes sent     0 241.20k      11   8.01k  475.89  918.49  689.98  258.32
# Merge passes   0       0       0       0       0       0       0       0
# Tmp tables     0      15       0       1    0.03       0    0.17       0
# Tmp disk tbl   0       3       0       1    0.01       0    0.08       0
# Tmp tbl size   0   4.78k       0   4.78k    9.43       0  211.60       0
# Query size     0 100.95k      19   2.71k  199.17  363.48  206.60  151.03
# InnoDB:
# IO r bytes     0       0       0       0       0       0       0       0
# IO r ops       0       0       0       0       0       0       0       0
# IO r wait      0       0       0       0       0       0       0       0
# pages distin   1  67.99k       0  10.64k   1.26k   3.88k   2.47k   31.70
# queue wait     0       0       0       0       0       0       0       0
# rec lock wai   0       0       0       0       0       0       0       0
# Boolean:
# Filesort       0% yes,  99% no
# Full scan      7% yes,  92% no
# QC Hit        78% yes,  21% no
# Tmp table      2% yes,  97% no
# Tmp table on   0% yes,  99% no
# String:
# Databases    prod_db
# Hosts        localhost
# InnoDB trxID 1153145C (2/0%), 11531626 (2/0%)... 43 more
# Last errno   0
# Users        prod
# Query_time distribution
#   1us
#  10us  ################################################################
# 100us  #########
#   1ms  #
#  10ms  #
# 100ms  #
#    1s  ###
#  10s+  #
160847
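To make it concrete, here is a rough sketch of what --group-by=Thread_id --order-by=Query_time:sum aggregates, done with plain awk over a tiny made-up Percona-format log. The sample log, thread ids, and totals below are invented purely for illustration:

```shell
# Sketch: sum Query_time per session (Thread_id) from a Percona-format
# slow log. The sample data is fabricated for illustration.
cat > /tmp/sample-slow.log <<'EOF'
# Time: 110405 16:12:13
# User@Host: prod[prod] @ localhost []
# Thread_id: 160847  Schema: prod_db
# Query_time: 2.500000  Lock_time: 0.000000  Rows_sent: 1  Rows_examined: 10
SELECT * FROM t1;
# Time: 110405 16:12:14
# User@Host: prod[prod] @ localhost []
# Thread_id: 160848  Schema: prod_db
# Query_time: 0.100000  Lock_time: 0.000000  Rows_sent: 1  Rows_examined: 1
SELECT * FROM t2;
# Time: 110405 16:12:15
# User@Host: prod[prod] @ localhost []
# Thread_id: 160847  Schema: prod_db
# Query_time: 1.500000  Lock_time: 0.000000  Rows_sent: 1  Rows_examined: 5
SELECT * FROM t3;
EOF
# Remember the current event's Thread_id, add its Query_time to that
# session's total, then print sessions ordered by total time, slowest first.
awk '
  /^# Thread_id:/  { tid = $3 }
  /^# Query_time:/ { sum[tid] += $3 }
  END { for (t in sum) printf "%s %.3f\n", t, sum[t] }
' /tmp/sample-slow.log | sort -k2 -rn
```

Of course, mk-query-digest also keeps the full per-attribute statistics (lock time, rows examined, and so on) that this sketch throws away.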
The stats here are aggregated across all of the session’s queries, which is great, but I still need to figure out which queries were run. I could use mk-log-player to split the log into sessions, but mk-log-player would not preserve all the other useful information, not even query timing. Instead, I used mk-query-digest again:
mk-query-digest --filter='$event->{Thread_id} == 160847' in > out
Now I know exactly what needs to be fixed first to make the greatest impact on this page’s response time. I can also convert that into a slow query log listing all the queries executed during this session, in the order they were executed:
mk-query-digest --filter='$event->{Thread_id} == 160847' --no-report --print in > out
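If you only needed the raw events for one session and didn’t have mk-query-digest handy, a crude awk approximation of that filter-and-print step might look like the sketch below. It assumes the Percona slow log layout where each event starts with a "# Time:" header followed by a "# Thread_id:" line; the sample log is invented:

```shell
# Sketch: print only the events belonging to one session, a rough awk
# stand-in for --filter='$event->{Thread_id} == 160847' --no-report --print.
# Sample data is fabricated for illustration.
cat > /tmp/sample-session.log <<'EOF'
# Time: 110405 16:12:13
# User@Host: prod[prod] @ localhost []
# Thread_id: 160847  Schema: prod_db
# Query_time: 2.500000  Lock_time: 0.000000  Rows_sent: 1  Rows_examined: 10
SELECT * FROM t1;
# Time: 110405 16:12:14
# User@Host: prod[prod] @ localhost []
# Thread_id: 160848  Schema: prod_db
# Query_time: 0.100000  Lock_time: 0.000000  Rows_sent: 1  Rows_examined: 1
SELECT * FROM t2;
EOF
# Buffer each event; flush it only if its Thread_id matched.
awk -v tid=160847 '
  /^# Time:/ { if (keep) printf "%s", buf; buf = ""; keep = 0 }  # new event: flush previous
  { buf = buf $0 "\n" }
  /^# Thread_id:/ { if ($3 == tid) keep = 1 }
  END { if (keep) printf "%s", buf }
' /tmp/sample-session.log
```

Unlike mk-query-digest, this keeps no statistics and knows nothing about multi-line headers beyond what is matched above, which is exactly why the real tool is the better choice.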
Pretty cool, isn’t it? Sure, it would be even better if mk-query-digest could do a nested group-by with an order-by within each group, so I could avoid the extra step; better still, it could just optimize the queries for me! Unfortunately mk-query-digest won’t do that for you, but then there’s mk-query-advisor 😉