MongoDB revs you up: What storage engine is right for you? (Part 4)David Avery
In this series of posts, we discussed what a storage engine is, and how you can determine the characteristics of one versus the other:
“A database storage engine is the underlying software that a DBMS uses to create, read, update and delete data from a database. The storage engine should be thought of as a “bolt on” to the database (server daemon), which controls the database’s interaction with memory and storage subsystems.”
Generally speaking, it’s important to understand what type of work environment the database is going to interact with, and to select a storage engine that is tailored to that environment.
The first post looked at MMAPv1, the original default engine for MongoDB (through release 3.0). The second post examined WiredTiger, the new default MongoDB engine. The third post reviewed RocksDB, an engine developed for the Facebook environment.
This post will cover PerconaFT. PerconaFT was developed out of Percona’s acquisition of Tokutek, from their TokuDB product.
Find it in: Percona Builds
PerconaFT is the newest version of the Fractal Tree storage engine that was designed and implemented by Tokutek, which was acquired by Percona in April of 2015. Designed at MIT, SUNY Stony Brook and Rutgers, the Fractal Tree is a data structure that aimed to remove disk bottlenecks from databases that were using the B-tree with datasets that were several times larger that cache.
PerconaFT is arguably the most “mature” storage engine for MongoDB, with support for document level concurrency and compression. The Fractal Tree was first commercially implemented in June of 2013 in TokuMX, a fork of MongoDB, with an advanced feature set.
As described previously, the Fractal Tree (which is available for MongoDB in the PerconaFT storage engine) is a write-optimized data structure utilizing many log-like “queues” called message buffers, but has an arrangement like that of a read-optimized data structure. With the combination of these properties, PerconaFT can provide high performance for applications with high insert rates, while providing very efficient lookups for update/query-based applications. This will theoretically provide very predictable and consistent performance as the database grows. Furthermore, PerconaFT typically provides, comparatively, the deepest compression rates of any of the engines we’ve discussed in this series.
An ideal fit for the PerconaFT storage engine is a system with varied workloads, where predictable vertical scaling is required in addition to the horizontal scaling provide MongoDB. Furthermore, the ability of PerconaFT to maintain performance while compressing – along with support for multiple compression algorithms (snappy, quicklz, zlib and lzma) – make it one of the best options for users looking to optimize their data footprint.
Most people don’t know that they have a choice when it comes to storage engines, and that the choice should be based on what the database workload will look like. Percona’s Vadim Tkachenko performed an excellent benchmark test comparing the performances of PerconaFT and WiredTiger to help specifically differentiate between these engines.