What’s new in TokuMX 1.4, Bonus Part 6: Faster count queries

We just released version 1.4.0 of TokuMX, our high-performance distribution of MongoDB. There are a lot of improvements in this version (release notes), the most of any release yet. In this series of blog posts, we describe the most interesting changes and how they’ll affect users.

TokuMX 1.4 just came out, and it has many goodies. In this post, I want to talk about an improvement we made so early in the development cycle that we practically forgot about it: faster count queries.

The Count

MongoDB has optimized count queries to be very fast. As a result, MongoDB developers have become accustomed to fast counts and have designed applications with this in mind. Prior to 1.4, TokuMX count queries were not optimized: they did just as much work as equivalent non-counting queries, only without returning the results over the network. In 1.4, we addressed this.

What did we do?

Typically, a TokuMX in-memory query requires a bunch of computation. The document is retrieved from the storage layer, copied to an in-memory buffer, then processed, and then finally returned. But for count queries, much of this work is unnecessary. If we know we are counting a range in an index, and not actually reporting any documents, we can skip copying each document to an in-memory buffer and processing, and simply increment a count. In fact, we can integrate this counting into the storage layer. This makes counting of ranges very fast.
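The contrast can be sketched in plain JavaScript. This is a toy model only, not TokuMX internals: a sorted array stands in for an index, and the "naive" path simulates the per-document copy and processing that the optimized path skips.

```javascript
// Illustrative sketch only -- not TokuMX code.

// Naive path: materialize each entry as a document, process it, count it.
function countByFetching(index, lo, hi) {
  let count = 0;
  for (const key of index) {
    if (key > lo && key <= hi) {
      const doc = { a: key };   // simulate copying the document to a buffer
      JSON.stringify(doc);      // simulate per-document processing
      count += 1;
    }
  }
  return count;
}

// Optimized path: locate the range bounds in the index and subtract,
// never materializing a document (the storage layer can do this directly).
function countInStorageLayer(index, lo, hi) {
  const lowerBound = index.findIndex(k => k > lo);  // first key > lo
  let upperBound = index.findIndex(k => k > hi);    // first key > hi
  if (upperBound === -1) upperBound = index.length;
  return lowerBound === -1 ? 0 : upperBound - lowerBound;
}

const index = [1, 5, 10, 12, 15, 20, 20, 25, 30];
console.log(countByFetching(index, 10, 20));      // 4 (12, 15, 20, 20)
console.log(countInStorageLayer(index, 10, 20));  // 4
```

The second version never touches a document at all, which is the essence of why pushing the count down into the storage layer is so much cheaper.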

When does this optimization take effect?

When we are doing a count of a range that does not require a matcher, we use this optimization. If a matcher is required to filter documents in the range, the optimization does not take effect. So, for example, if collection foo has an index on {a: 1}, and we run db.foo.count({a: {$gt: 10, $lte: 20}}), this case will be eligible for this optimization. However, if we run db.foo.count({a: {$gt: 10, $lte: 20}, b: 100}), then the optimization is not used because we need to filter out documents where b != 100, and this uses a matcher.
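As a rough sketch of that rule, the eligibility test can be thought of as checking whether every queried field is covered by the index bounds. The function below is a hypothetical simplification for illustration, not the actual query planner logic:

```javascript
// Hypothetical simplification of the rule described above: a count can skip
// the matcher only if every queried field is answered by the index bounds.
function countAvoidsMatcher(queryFields, indexFields) {
  return queryFields.every(f => indexFields.includes(f));
}

// With an index on {a: 1}:
console.log(countAvoidsMatcher(['a'], ['a']));
// true  -> db.foo.count({a: {$gt: 10, $lte: 20}}) is eligible
console.log(countAvoidsMatcher(['a', 'b'], ['a']));
// false -> db.foo.count({a: {$gt: 10, $lte: 20}, b: 100}) needs a matcher
```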

How does performance compare with MongoDB?

For counts of the entire collection, MongoDB is significantly better, because MongoDB can take advantage of its database-level reader/writer lock to maintain an accurate collection count in a single number at all times. If TokuMX were to use such an optimization, the concurrency gains made from document-level locking would disappear. I wrote about this in detail a few months ago. The optimization described here does speed up full collection counts, but it doesn't (and never will) come close to what MongoDB can do.

As for other count queries, let’s run a simple in-memory experiment:

I loaded 10 million documents with an index on {a: 1} into TokuMX 1.3.3, TokuMX 1.4.0, MongoDB 2.4.9, and MongoDB 2.6.0-rc0. The documents are of this form:
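The document listing didn't survive in this copy of the post. Based on the description of the experiment (random a values between 0 and 1 million, plus a field x used by the matcher variants), a plausible generator might look like the following; the _id scheme and the shape of x are assumptions, not the original schema:

```javascript
// Hypothetical document shape; the original listing is missing.
// Only the random `a` in [0, 1,000,000) is stated in the post; `x` is
// assumed to be the extra field the matcher tests filter on.
function makeDoc(i) {
  return {
    _id: i,                                  // sequential _id (assumption)
    a: Math.floor(Math.random() * 1000000),  // random a in [0, 1,000,000)
    x: i % 2                                 // extra field for the matcher (assumption)
  };
}

// In the mongo shell, the load might then look like:
//   for (var i = 0; i < 10000000; i++) { db.bar.insert(makeDoc(i)); }
//   db.bar.ensureIndex({a: 1});
```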

I gathered the size on disk as reported by the filesystem and by MongoDB/TokuMX, along with the resident set size (RSS) after running the touch command to bring data and indexes into memory. I then ran full collection counts (db.bar.count()) and 10-, 100-, 1000-, and 10000-key range counts on a, both with and without a matcher on x. Note that because the a values are random values between 0 and 1 million and there are 10 million documents, there are duplicate a values, so the larger range counts return more documents than the size of their range.

Tests were run on my desktop, which has a 4-core Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz and 16GB of RAM. Full collection count timings are averages of 100 runs on TokuMX and 10000 runs on MongoDB, and range count timings are averages of 10000 runs. The timing code is available on github. Sizes are in MB, times are in microseconds.
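The real timing code is linked above; as a rough sketch of the approach (averaging many repeated runs and reporting microseconds per run), a minimal harness could look like this. It is not the actual benchmark script:

```javascript
// Minimal timing harness in the spirit of the linked benchmark code
// (a sketch, not the actual script from github).
function averageMicros(fn, runs) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < runs; i++) fn();
  const elapsedNs = process.hrtime.bigint() - start;
  return Number(elapsedNs / BigInt(runs)) / 1000;  // nanoseconds -> microseconds
}

// Example: average a cheap operation over 10000 runs.
const avg = averageMicros(() => Math.sqrt(12345), 10000);
console.log(`${avg.toFixed(2)} us per run`);
```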

                                         MongoDB 2.4.9   MongoDB 2.6.0-rc0   TokuMX 1.3.3   TokuMX 1.4.0
filesystem size on disk                          6,397               8,650            235            236
db.stats() size on disk                          3,966               5,602            234            235
resident set size                                3,008               5,740          3,352          3,341
full collection count()                            129                 124      1,999,950        550,320
10-key range count, with matcher                   201                 186            246            237
100-key range count, with matcher                  358                 305            811            727
1000-key range count, with matcher               1,820               1,200          5,664          5,027
10000-key range count, with matcher             16,060               9,671         53,912         47,754
10-key range count, without matcher                159                 162            180            176
100-key range count, without matcher               166                 188            204            179
1000-key range count, without matcher              267                 449            457            236
10000-key range count, without matcher           1,251               2,732          2,424            841

One tangential thing to notice is that the size on disk grew between MongoDB 2.4.9 and MongoDB 2.6.0-rc0. I am pretty sure this is due to the change to defaulting to powerOf2Sizes.

We can see some interesting things here with regard to performance. First, since the optimization works on full collection counts as well, we can see that it gets about a 3.6x speedup in the case where it can be most effective. We can also see that TokuMX is still considerably slower than MongoDB at range counts when a matcher is required. MongoDB 2.6.0-rc0 has a rewritten matcher that appears to be faster than the old one; that's a good sign, because it will probably get merged into TokuMX soon.

As for the case we wanted to optimize, we can see that TokuMX 1.3.3 takes about twice as long to perform range counts without a matcher as MongoDB 2.4.9. We can also see that the optimization in TokuMX 1.4.0 sped it up enough to actually beat MongoDB 2.4.9 by 32% on the largest range. At the same time, there was a performance regression in MongoDB 2.6.0, which performed the slowest on large range counts without a matcher.

As you can see, the results are mixed, but for range counts without a matcher, TokuMX is now competitive. Hopefully, with this work, these count queries no longer pose a problem for TokuMX users. We have some work ahead of us to bring this same optimization to counts with a matcher (and to the aggregation framework eventually), but these results are encouraging.

Want to check out the newest version of TokuMX? Download TokuMX 1.4.0 here.


