UPDATE: Since this post was published, Tokutek has published preliminary performance benchmark results for TokuMXse, and we have released the fourth release candidate of TokuMXse for testing. Also, MongoDB has since decided to name the new release MongoDB v3.0 (formerly known as v2.8).
The major work in MongoDB 2.8 is the creation of a “storage engine” API within the server. This means that in the future, it will be much easier to create a product like TokuMX which uses a different storage implementation but keeps the same networking, clustering, and data modeling stack as MongoDB. TokuMXse is the name of a new product we’ll add to the line, existing alongside TokuMX, which is an implementation of that storage engine interface using the same core TokuFT storage library as TokuDB and TokuMX. We’ve written about some preliminary performance results comparing TokuMXse to TokuMX.
First things first: you can learn how to download TokuMXse v1.0.0-rc.4 here.
The pros and cons of a storage engine interface are complicated, but it comes down to compatibility versus innovation. TokuMXse, since its code is sequestered behind an API which is intended to be static, should be able to follow future upstream development with relative ease; in comparison, since TokuMX is a fork in which we’ve made changes to many parts of the codebase, it takes much more work to incorporate upstream development, and so TokuMX lags behind MongoDB development. However, behind this small API, there is a limited set of improvements we can make. TokuMX features like clustering indexes, primary keys, read-free replication and sharding migrations, fast updates, and multi-document ACID semantics require changes in other parts of the codebase, and thus can’t be implemented from within the storage engine API as it is today. This is likely to remain the case for most such features.
So, long story short, TokuMXse is MongoDB 2.8-compatible, and has the enhanced speed, concurrency, and compression of Fractal Tree storage, but won’t have all of the advanced features of TokuMX. It’s a good choice for applications which require the latest MongoDB features and just want reliably high performance and compression.
When running the server, make sure to start it with --storageEngine=tokuft, and check the additional options starting with --tokuft in the --help text. There is also a tokuft section in the db.serverStatus() output and in db.collection.stats(). Let us know what you think on the tokumx-user mailing list, or on JIRA.
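As a rough illustration of the steps above, here is a minimal Python sketch. It only builds the command line and filters text; it does not start a server. The helper names (start_command, tokuft_options, tokuft_section) are hypothetical and not part of TokuMXse, the binary name mongod on the PATH and the data directory are assumptions, and only the --storageEngine=tokuft flag, the --tokuft option prefix, and the tokuft section name come from this post.

```python
# Hypothetical helpers for trying out TokuMXse; not part of any Tokutek tooling.

MONGOD = "mongod"  # assumption: the TokuMXse server binary is on PATH under this name


def start_command(dbpath):
    """Build the argv list that starts the server with the TokuFT storage engine."""
    return [MONGOD, "--storageEngine=tokuft", "--dbpath", dbpath]


def tokuft_options(help_text):
    """Return the lines of the --help text that describe --tokuft options."""
    return [
        line.strip()
        for line in help_text.splitlines()
        if line.strip().startswith("--tokuft")
    ]


def tokuft_section(status):
    """Return the tokuft section of a serverStatus-style document, or {} if absent."""
    return status.get("tokuft", {})


if __name__ == "__main__":
    # "/data/db" is a placeholder data directory, not a recommendation.
    print(" ".join(start_command("/data/db")))
```

In practice, the help text would come from running the server binary with --help, and the status document from db.serverStatus() in the shell or an equivalent driver call; both are left out here so the sketch runs without a server installed.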