Often I run into problems when trying to use mk-query-digest with tcpdump on “very” busy hosts. You might be thinking, “very busy is a relative and unquantifiable term,” and you’d be right, so I’ll phrase this differently. First, a little background on the problem. mk-query-digest tries to handle dropped or missing packets gracefully, but even so they can skew results dramatically. Imagine a single connection that sends two queries and gets two responses, with a response time of R for each request and a wait time of W between the requests. If the first response and the second request are dropped by the kernel, then from mk-query-digest’s perspective the database took 2R + W to respond to the first request.
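To make the timing math concrete, here is a small shell sketch (the R and W values are hypothetical) of how a digest tool that pairs each request with the next response it sees misattributes time when the packets in the middle are lost:

```shell
#!/bin/sh
R=100   # hypothetical response time per query, in ms
W=50    # hypothetical wait time between the two queries, in ms

# Timeline of the full conversation (ms since the first request):
t_req1=0
t_resp1=$((t_req1 + R))     # dropped by the kernel
t_req2=$((t_resp1 + W))     # dropped by the kernel
t_resp2=$((t_req2 + R))

# The digest only sees req1 and resp2, so it pairs them together:
observed=$((t_resp2 - t_req1))
echo "true response time: ${R}ms, observed: ${observed}ms"   # observed = 2R + W = 250ms
```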
Back to the question: what is a very busy host? In my experience, if even 5% of tcpdump packets are dropped by the kernel, the results can be skewed enough to cause confusion about which queries are really slow. Recently, I got more than 60% dropped packets on a server with roughly 50MB/s of traffic on port 3306, a system load of about 10, and 8 CPU cores. The resulting mk-query-digest output was obviously bogus when compared to the host’s slow query log (for example, none of the top 5 slow queries reported by mkqd appeared in the actual slow log file). After a little brainstorming, we came up with a few solutions:

1. Filter the mk-query-digest report to ignore queries that appear faster than long-query-time.
2. Restrict the tcpdump capture to a handful of client hosts.
3. Use a smarter tcpdump filter to capture a deterministic sample of whole connections.
#1 has an obvious flaw: if your long-query-time is 1 second, and mkqd believes that a query which actually took 10ms instead took 0.9s, the results are still useless. That is to say, this doesn’t actually solve the real problem of dropped packets; it just applies a mask to the output. #2 seems like the simplest good solution, but when I tested it, I still got a very high percentage of dropped packets (around 30% when filtering down to only 4 of hundreds of active clients). While this is lower than without the filter, it is still unusable. #3 actually worked very well and resulted in about 0.2% packet loss on this host, which is acceptable; the variances are statistically smoothed out and don’t noticeably affect the results. Here is the tcpdump command I used:
tcpdump -i eth0 -s 65535 -x -n -q -tttt 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2'
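Why this samples connections rather than packets: tcp[1] and tcp[3] are the low bytes of the TCP source and destination ports, and the low byte of 3306 (0xEA) happens to satisfy 0xEA & 7 == 2, so the server side of every packet always passes; the test on the other byte passes for roughly 1 in 8 client ports. Since a connection’s port pair is fixed for its lifetime, a matching connection is captured in full, in both directions. A quick shell check of the arithmetic (the ephemeral port range used here is just an example):

```shell
#!/bin/sh
# The server port always matches: the low byte of 3306 is 0xEA = 234, and 234 & 7 == 2
echo "3306 low byte & 7 = $(( (3306 & 0xff) & 7 ))"

# Roughly 1 in 8 client ports match: count matches over one full low-byte cycle.
# (p & 0xff) & 7 is the same as p & 7, since only the low 3 bits are tested.
match=0
for p in $(seq 32768 33023); do          # 256 consecutive ephemeral ports
    [ $(( p & 7 )) -eq 2 ] && match=$((match + 1))
done
echo "$match of 256 client ports match"  # 32 of 256, i.e. 1/8
```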
I also had to forward the tcpdump output to another host for processing, because the database host couldn’t handle the additional I/O or CPU pressure of either writing it to a file or piping it to mk-query-digest. Here is a draft of a script to automate this; use at your own risk and update it to suit your needs.
#!/bin/bash

# Set some defaults
d=$(date +%F-%T | tr :- _)
REMOTEHOST='some.other.hostname'  # CHANGEME
TMP=/tmp
self=$(basename "$0")
LOCKFILE="$self.lock"
TMPfile="tcpdump-$d.txt"          # raw capture, written on the remote host
RESfile="digest-$d.txt"           # mk-query-digest report
LIMIT=50                          # limit # of queries in report
SLEEPTIME=1200                    # duration to capture tcpdump data in seconds (1200 = 20 min)

# check lock file
if [ -e "$TMP/$LOCKFILE" ]; then
    echo "$self: lock file $TMP/$LOCKFILE already exists, aborting"
    exit 1
fi
touch "$TMP/$LOCKFILE"

# run the tcpdump & write to remote file and sleep for a bit;
# $! is the ssh end of the pipeline -- killing it makes tcpdump
# exit on the resulting broken pipe
tcpdump -i eth0 -s 65535 -x -n -q -tttt \
    'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2' 2>/dev/null \
    | ssh $REMOTEHOST -- "cat - > $TMP/$TMPfile" &
PID=$!

# set trap to be sure tcpdump doesn't run for ever
# and clean up the temp file too
trap 'rm -f "$TMP/$LOCKFILE"; kill $PID 2>/dev/null; ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile"; exit' INT TERM EXIT

sleep "$SLEEPTIME"
kill $PID 2>/dev/null
wait

# set trap to be sure both remote files are removed
trap 'ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile $TMP/$RESfile"; rm -f "$TMP/$LOCKFILE" "$RESfile"; exit' INT TERM EXIT

# digest the result on the remote host, copy to localhost, then email it
ssh $REMOTEHOST -- "mk-query-digest --type tcpdump --limit $LIMIT < $TMP/$TMPfile > $TMP/$RESfile 2>&1"
scp -q $REMOTEHOST:$TMP/$RESfile "$RESfile"
# email $RESfile using your preferred transport

# clean up remote and local files.
ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile $TMP/$RESfile"
rm -f "$RESfile" "$TMP/$LOCKFILE"
trap - INT TERM EXIT