Thoughts on Small Datum – Part 3

Background: If you did not read my first blog post about why I am sharing my thoughts on the benchmarks published by Mark Callaghan on Small Datum, you may want to skim through it now for a little context: Thoughts on Small Datum – Part 1


Last time, in Thoughts on Small Datum – Part 2, I shared my cliff notes and a graph on Mark Callaghan’s (@markcallaghan) March 11th insertion rate benchmarks using flash storage media. In those tests he compares MySQL outfitted with the InnoDB storage engine against two distributions of MongoDB: basic MongoDB from MongoDB, Inc. and TokuMX (the high-performance distribution of MongoDB from Tokutek).

Later, in his March 24th post “TokuMX, MongoDB and InnoDB Versus the Insert Benchmark with Disks,” Mark presents similar findings for a new set of insertion rate tests using a different benchmark and the same DBMS products. This time, however, he uses servers configured with traditional disk storage media instead of flash. He also configures the products and tests differently than he did in the flash storage benchmarks.

As the saying goes, a picture is worth a thousand words. The X-axis here is the number of rows being inserted at each stage of the test. The Y-axis is the insertion rate recorded at those levels (and in this case, bigger is better).

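[Graph: insertion rate (Y-axis) versus number of rows inserted (X-axis) for TokuMX, MySQL/InnoDB and MongoDB, from the March 24th Small Datum benchmark]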

As you can see, Mark found that TokuMX outperforms MySQL/InnoDB as well as basic MongoDB. He also found that shortly after 500M rows it became impractical to test MongoDB (it was taking an unreasonably long time to let the test run to completion). The same thing happened with MySQL/InnoDB after 1.6B rows. TokuMX was still running strong at 2B rows.

Note: Mark tests several different configurations of MongoDB, trying to find the optimum configuration. For the purposes of my visual aid I selected the fastest / best MongoDB configuration at each level of 100M rows. That’s not very scientific of me but I wanted to be as fair as possible in the visual comparison.
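For anyone who wants to repeat that selection step, here is a minimal sketch of how it could be done. The function name, the configuration names and the numbers are all made up for illustration; none of them are taken from Mark's results or reflect how I actually built the graph.

```python
def best_rate_per_checkpoint(samples):
    """Pick the fastest configuration at each row-count checkpoint.

    samples: iterable of (config_name, rows_inserted, inserts_per_sec).
    Returns {rows_inserted: (config_name, inserts_per_sec)}, ordered by row count.
    """
    best = {}
    for config, rows, rate in samples:
        # Keep this sample only if it beats the best rate seen so far
        # at the same checkpoint.
        if rows not in best or rate > best[rows][1]:
            best[rows] = (config, rate)
    return dict(sorted(best.items()))


# Placeholder numbers for illustration only, NOT values from the benchmark.
samples = [
    ("config_a", 100_000_000, 1000.0),
    ("config_b", 100_000_000, 1200.0),
    ("config_a", 200_000_000, 900.0),
    ("config_b", 200_000_000, 800.0),
]

for rows, (config, rate) in best_rate_per_checkpoint(samples).items():
    print(rows, config, rate)
```

Note that the winning configuration can change from one checkpoint to the next, so the resulting line on the graph is a best case rather than the behavior of any single MongoDB setup.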

Bottom Line: Like the flash storage test covered last time, the tests with traditional disk storage show that both MySQL with InnoDB and TokuMX significantly outperform basic MongoDB in benchmarks testing for write-intensive applications. Both MongoDB (at about 540M rows) and MySQL/InnoDB (at about 1.6B rows) slowed to the point that the tests became impractical to finish as the database grew.

This suggests that if your application is a write-intensive NoSQL one and your servers are outfitted with traditional disk storage, it will perform significantly better on the TokuMX high-performance distribution of MongoDB, and that with TokuMX, performance will not degrade significantly as the database grows. It also suggests that basic MongoDB may not even be suitable for write-intensive applications that are expected to grow beyond 500M rows.

One footnote: TokuDB (the Tokutek high-performance MySQL storage engine alternative to InnoDB that employs the same underlying technology as TokuMX) is not covered in Mark’s benchmark. That’s too bad because it delivers better performance and scalability than InnoDB for your NewSQL applications.

You can read all the gory details on Mark’s March 24th insertion rate benchmark here. And, you can download and try TokuMX for yourself (for free) here.

As always, your thoughts and comments are welcome below. You can also reach me on Twitter via @dcrosenlund.

Next time, in Thoughts on Small Datum – Part 4, I will share this marketer’s summary and graphs for Mark’s IO-bound point query tests using sysbench.


