
Thoughts on Small Datum – Part 2



If you did not read my first blog post about Mark Callaghan’s (@markcallaghan) benchmarks as documented in his blog, Small Datum, you may want to skim through it now for a little context.

——————-

On March 11th, Mark, formerly of Google and now a database guru at Facebook, published an insertion rate benchmark comparing MySQL with the InnoDB storage engine against two NoSQL alternatives: basic MongoDB and TokuMX (the Tokutek high-performance distribution of MongoDB). In these particular tests Mark uses flash storage media. Here are my cliff notes (a shoutout to @mipsytipsy for the apt description) and my thoughts on the business implications.

If your big data applications are write-intensive, you may already know that their performance is primarily governed by insertion rate, which in turn is governed by write efficiency. In this benchmark Mark compares the insertion rates for the aforementioned databases using the open-source Indexed Insertion Benchmark (aka iiBench). He generates his first set of results using a 100 million row database, then runs the benchmark again, adding another 400 million rows.
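For readers who want to see the shape of an insertion rate test, here is a minimal sketch in Python. It is not iiBench and not Mark's setup: it uses the standard-library `sqlite3` module as a stand-in database, and the table name, column names, index names, and the `run_insert_benchmark` function are all hypothetical. It only illustrates the idea of inserting rows into a table with secondary indexes and reporting rows per second at intervals.

```python
import random
import sqlite3
import time


def run_insert_benchmark(conn, total_rows, batch_size=1000, report_every=10000):
    """Insert total_rows rows in batches.

    Returns a list of (rows_inserted_this_run, rows_per_sec) samples,
    one per reporting interval.
    """
    # Hypothetical schema, chosen only for illustration.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS purchases ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "cashregister INTEGER, customer INTEGER, product INTEGER, price REAL)"
    )
    # Secondary indexes on randomly distributed values are what make
    # an indexed insertion workload expensive for the storage engine.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_register_price "
        "ON purchases (cashregister, price)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_price_customer "
        "ON purchases (price, customer)"
    )

    rng = random.Random(42)  # fixed seed so runs are repeatable
    samples = []
    inserted = 0
    interval_rows = 0
    interval_start = time.perf_counter()

    while inserted < total_rows:
        n = min(batch_size, total_rows - inserted)
        batch = [
            (rng.randrange(1000), rng.randrange(100000),
             rng.randrange(10000), rng.random() * 500)
            for _ in range(n)
        ]
        with conn:  # one transaction per batch, committed on exit
            conn.executemany(
                "INSERT INTO purchases (cashregister, customer, product, price) "
                "VALUES (?, ?, ?, ?)",
                batch,
            )
        inserted += n
        interval_rows += n

        if interval_rows >= report_every or inserted == total_rows:
            elapsed = time.perf_counter() - interval_start
            rate = interval_rows / elapsed if elapsed > 0 else float("inf")
            samples.append((inserted, rate))
            interval_rows = 0
            interval_start = time.perf_counter()

    return samples


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    for rows, rate in run_insert_benchmark(conn, 50000):
        print(f"{rows:>8} rows  {rate:,.0f} rows/sec")
```

Calling the function a second time on the same connection adds rows to the existing table, which loosely mirrors the "load, then add more" pattern described above. An in-memory SQLite database says nothing about how MySQL, MongoDB or TokuMX behave on flash; the sketch only shows what is being measured.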

Mark’s tests clearly show that MySQL (with the InnoDB storage engine) and TokuMX both outperform basic MongoDB by a wide margin. In fact, in these tests TokuMX is at least twice as fast as basic MongoDB.

I’ve graphed some of these insertion rate results for those of us who tend toward visual learning. I used the second, larger of the two tests for the graph and included just two of his MySQL tests for simplicity (he tried a larger number of MySQL configurations with similar results). I’ve labeled the two I am using with “(c)” for compressed MySQL and “(u)” for uncompressed.

[Graph: insertion rates for MySQL (c), MySQL (u), TokuMX and basic MongoDB]

I show uncompressed MySQL results even though they show a far better insertion rate than either TokuMX or MongoDB (not trying to hide from it). But size really does matter. MySQL without compression has better insertion rates, but its rate of database growth and its write amplification characteristics are undesirable. In other words, I feel the MySQL results with compression are the apples-to-apples comparison. You should also check out my footnote at the bottom of this post.
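To make the two size-related terms concrete, here is a small sketch of how they are commonly computed. Write amplification is usually defined as the bytes physically written to storage divided by the bytes the application asked to write, and growth is simply the change in database size per rows added. The function names and parameters below are hypothetical, and how you collect the byte counts depends on your own system's tooling.

```python
def write_amplification(device_bytes_written, app_bytes_written):
    """Ratio of bytes written to storage to bytes written by the application.

    A value of 1.0 means no amplification; higher values mean the storage
    device absorbs more writes than the application logically issued.
    """
    if app_bytes_written <= 0:
        raise ValueError("app_bytes_written must be positive")
    return device_bytes_written / app_bytes_written


def growth_per_million_rows(size_before_bytes, size_after_bytes, rows_added):
    """Database size growth, in bytes, per one million inserted rows."""
    if rows_added <= 0:
        raise ValueError("rows_added must be positive")
    return (size_after_bytes - size_before_bytes) / rows_added * 1_000_000
```

Comparing these two numbers alongside the raw insertion rate is one way to judge whether a faster configuration is paying for its speed in storage wear and disk footprint.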

Bottom line: Mark’s insertion rate tests clearly show that MySQL with InnoDB and TokuMX significantly outperform basic MongoDB. If your application is a write-intensive NoSQL application, it will perform significantly better with TokuMX than with basic MongoDB. In fact, real-world customer results and other benchmark data suggest this is just the tip of the insertion rate iceberg. With TokuMX you will more likely see a 20x to 80x improvement.

But you don’t have to take our word for it.  You can try these tests, or, even better, test your own MongoDB applications running on TokuMX in your own environment by downloading the free community version of TokuMX (or TokuDB) here. If you need it, the iiBench benchmark is available here. If you run your own tests, I’d love to hear from you.

One footnote: TokuDB (the Tokutek high-performance storage engine alternative to InnoDB) is not covered in Mark’s benchmark.  It delivers better performance, smaller database size, and better write amplification characteristics than MySQL with InnoDB.  But that’s a story for another blog.

You can see all the gory details on Mark’s insertion rate benchmark here.

As always, your thoughts and comments are welcome.  You can also reach me on Twitter via @dcrosenlund.

Next time, in Thoughts on Small Datum – Part 3, look for this marketer’s summary and graphs for Mark’s benchmark on TokuMX, MongoDB and InnoDB versus the insert benchmark with disks.

