How to use tcpdump on "Very" Busy Hosts

April 18, 2011

Author

Devananda van der Veen

Insight for DBAs

MySQL

Share this Post:

Tcpdump Packet Loss on Busy Servers

Often I run into problems when trying to use mk-query-digest with tcpdump on “very” busy hosts. You might be thinking, “very busy is a relative and unquantifiable term,” and you’d be right, so I’ll phrase this differently. Let me give a little background to the problem first. Mk-query-digest tries to handle dropped or missing packets gracefully, but even so they can skew results dramatically. Imagine a situation where a single connection sends two queries and gets two responses, with a response time of R for each request, and a wait time of W between the requests. If the first response and second request are dropped by the kernel, the result – from mk-query-digest’s perspective – is that the database took 2R + W time to respond to the first request.

Possible Solutions to Using tcpdump on Busy Servers

Back to the question of, “what is a very busy host?” In my experience, if you are getting even 5% of tcpdump packets dropped by the kernel, the results can be skewed enough to cause confusion about which queries are really slow. Recently, I got more than 60% dropped packets on a server with roughly 50MB/s of traffic on port 3306, system load of about 10, and 8 CPU cores. The resulting mk-query-digest output was obviously bogus when compared to the host’s slow-query-log (for example, none of the top 5 slow queries reported by mkqd appeared in the actual slow log file). After a little brain-storming, we came up with a few solutions:

use “mk-query-digest –filter ‘$event->{time} && $event->{time} > 1′” to exclude all queries which it believes took longer than the servers long-query-time of 1 second

tcpdump traffic only from a small and representative subset of clients

tcpdump a modulo of incoming ports (including port 3306, because we must capture the responses from mysqld)

#1 has an obvious flaw — if your long-query-time is 1 second, and mkqd believes that a query which actually took 10ms instead took 0.9s, the results are still useless. That is to say, this doesn’t actually solve the real problem of dropped packets, it just applies a mask to the output. #2 seems like the simplest good solution, but when I tested this, I still got very high percentage of dropped packets (around 30% when filtering only 4 out of hundreds of active clients). While this is lower than without the filter, it is still unusable. #3 actually worked very well and resulted in about 0.2% packet loss on this host, which is acceptable — the variances are statistically smoothed out and don’t noticeably affect the results. Here is the tcpdump command used.

tcpdump -i eth0 -s 65535 -x -n -q -tttt 'port 3306 and tcp[1] &amp; 7 == 2 and tcp[3] &amp; 7 == 2'

1	tcpdump -i eth0 -s 65535 -x -n -q -tttt 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2'

I also had to forward the tcpdump outputÂ to another host for processing because the database host couldn’t handle the additional IO or CPU pressure of either writing it to a file or piping it to mk-query-digest. Here is a draft of a script to automate this; use at your own risk and update to suit your needs.

#!/bin/bash<br># Set some defaults<br>d=$(date +%F-%T | tr :- _)<br>self=`basename $0`<br>REMOTEHOST='some.other.hostname'           # CHANGEME<br>LIMIT=50Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  # limit # of queries in report<br>SLEEPTIME=1200Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  # duration to capture tcpdump data in seconds (1200 = 20 min)<br>TMP='/tmp'<br>LOCKFILE="$self.lockfile"<br>TMPfile="$self.temp"<br>RESfile="$self.result"<br><br># check lock file<br>if [ -e "$TMP/$LOCKFILE" ]; then<br> echo "$self: lock file $LOCKFILE already exists, aborting"<br> exit 1<br>fi<br><br># set trap to be sure tcpdump doesn't run for ever<br># and clean up the temp file too<br>trapÂ  'rm -f $LOCKFILE; kill $PID; ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile"; exit' INT TERM EXIT<br>touch $TMP/$LOCKFILE<br><br># run the tcpdump &amp; write to remote file and sleep for a bit<br>tcpdump -i eth0 -s 65535 -x -n -q -tttt 'port 3306 and tcp[1] &amp; 7 == 2 and tcp[3] &amp; 7 == 2' 2&gt;/dev/null <br> | ssh $REMOTEHOST -- "cat - &gt; $TMP/$TMPfile" &amp;<br><br>PID=$!<br>sleep $SLEEPTIME<br>kill $PID<br><br># set trap to be sure both remote files are removed<br>trap 'ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile $TMP/$RESfile"; rm -f $LOCKFILE $RESfile; exit' INT TERM EXIT<br><br># digest the result, copy to localhost, then email it<br>ssh $REMOTEHOST -- "mk-query-digest --type tcpdump --limit $LIMIT &lt; $TMP/$TMPfile 2&gt;&amp;1 &gt; $TMP/$RESfile"<br>scp -q $REMOTEHOST:$TMP/$RESfile $RESfile<br><br># email $RESfile using your preferred transport<br><br># clean up remote and local files.<br>ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile $TMP/$RESfile"<br>rm -f $RESfile $LOCKFILE<br><br>trap - INT TERM EXIT<br>exit 0

1

#!/bin/bash # Set some defaults d=$(date +%F-%T | tr :- _) self=`basename $0` REMOTEHOST='some.other.hostname' # CHANGEME LIMIT=50Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â # limit # of queries in report SLEEPTIME=1200Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â # duration to capture tcpdump data in seconds (1200 = 20 min) TMP='/tmp' LOCKFILE="$self.lockfile" TMPfile="$self.temp" RESfile="$self.result" # check lock file if [ -e "$TMP/$LOCKFILE" ]; then echo "$self: lock file $LOCKFILE already exists, aborting" exit 1 fi # set trap to be sure tcpdump doesn't run for ever # and clean up the temp file too trapÂ 'rm -f $LOCKFILE; kill $PID; ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile"; exit' INT TERM EXIT touch $TMP/$LOCKFILE # run the tcpdump & write to remote file and sleep for a bit tcpdump -i eth0 -s 65535 -x -n -q -tttt 'port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2' 2>/dev/null | ssh $REMOTEHOST -- "cat - > $TMP/$TMPfile" & PID=$! sleep $SLEEPTIME kill $PID # set trap to be sure both remote files are removed trap 'ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile $TMP/$RESfile"; rm -f $LOCKFILE $RESfile; exit' INT TERM EXIT # digest the result, copy to localhost, then email it ssh $REMOTEHOST -- "mk-query-digest --type tcpdump --limit $LIMIT < $TMP/$TMPfile 2>&1 > $TMP/$RESfile" scp -q $REMOTEHOST:$TMP/$RESfile $RESfile # email $RESfile using your preferred transport # clean up remote and local files. ssh $REMOTEHOST -- "rm -f $TMP/$TMPfile $TMP/$RESfile" rm -f $RESfile $LOCKFILE trap - INT TERM EXIT exit 0