Have you ever wondered how optimized your Memcached installation is? There is a common misconception that one doesn’t have to think too deeply about Memcached performance, but that is not true. If your setup is inefficient, you could:
Percona does a lot of consulting around Memcached, so we try to take a quantitative, scientific approach to measuring memcached performance, just like everything else we do.
memcached is basically a key-value in-memory database, so it works well to analyze its traffic with Maatkit’s mk-query-digest tool. You can now use mk-query-digest to get “query event” information and report on memcached effectiveness. As far as we know, this is the first and only tool of its kind.
The output is very similar to that which would be displayed for MySQL queries. Commands are fingerprinted so that similar things can be aggregated together. The output begins with an overall report:
|
1 2 3 4 5 6 7 8 9 10 11 12 13 |
# Overall: 2.69k total, 113 unique, 28.36 QPS, 0.04x concurrency _________ # total min max avg 95% stddev median # Exec time 4s 11us 2s 1ms 403us 43ms 28us # Hosts 2694 # Time range 2009-06-11 21:54:47.089744 to 2009-06-11 21:56:22.136503 # bytes 7.34M 0 265.49k 2.79k 4.71k 16.50k 51.63 # key print 2694 # 4% (119) Memc_delete # 6% (163) Memc_error # 77% (2k) Memc_get # 0% (4) Memc_incr # 15% (415) Memc_miss # 17% (483) Memc_set |
We can see that 77% of commands (cmds) were get, 17% set, 4% delete and 4 incr (increment) cmds. At present, Memc_error only refers to gets (or, retrieval cmds in general) that were interrupted (some things are TODO). Memc_miss is for any cmd that resulted in a NOT_FOUND result. So in this case, 15% of all cmds tried to do something with a nonexistent key.
The next bit aggregates similar queries into classes and prints out the most expensive ones. Here’s the top one:
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
# Query 1: 7.06 QPS, 0.04x concurrency, ID 0x7AFBE9433477C7BC at byte 5983429 # This item is included in the report because it matches --limit. # pct total min max avg 95% stddev median # Count 24 657 # Exec time 87 3s 19us 2s 5ms 1ms 87ms 31us # Hosts 1 127.0.0.1 # Time range 2009-06-11 21:54:48.050834 to 2009-06-11 21:56:21.669302 # bytes 38 2.79M 0 4.75k 4.35k 4.71k 1.31k 4.71k # key print 1 TEST_KEY # 17% (114) Memc_error # 100% (657) Memc_get # Query_time distribution # 1us # 10us ################################################################ # 100us ############# # 1ms #### # 10ms # 100ms # 1s # # 10s+ get TEST_KEYG |
Interestingly, apparently 17% of all “get TEST_KEY” were interrupted (Memc_error), probably by the TCP connection being closed client-side.
You’ll notice 100% Memc_get. This is a result of the default –report and –group-by, which is “fingerprint” just like for slow query logs. So the fingerprint for this event is “get TEST_KEY” and since the cmd is “get”, 100% Memc_get. There’s another way to aggregate — by the fingerprinted key, omitting the verb (get/set/etc). To do this, you’d specify key_print as the group-by value. More details are below.
This feature is still in beta. Some things are TODO; the report format could be cleaned up a little, etc. But we’re releasing early and often, and we’re calling on the community to help us out with testing! Here’s how:
In code:
|
1 2 3 |
wget http://maatkit.googlecode.com/svn/trunk/mk-query-digest/mk-query-digest sudo tcpdump -s 65535 -x -n -q -tttt -i eth0 port 11211 > memc_tcpdump.txt mk-query-digest --type memcached memc_tcpdump.txt |
Depending on what you’re interested in, you may want to try these other reports:
These instructions could change in the future — this blog post is not authoritative documentation. The most current documentation is always embedded in the tool itself, and is readable with perldoc! Let us know what you think. Please use the Maatkit mailing list and and post bug reports on Google Code.
Resources
RELATED POSTS