MongoDB revs you up: What storage engine is right for you? (Part 2)Dave Avery
In our last post, we discussed what a storage engine is, and how you can determine the characteristics of one versus the other. From that post:
“A database storage engine is the underlying software that a DBMS uses to create, read, update and delete data from a database. The storage engine should be thought of as a “bolt on” to the database (server daemon), which controls the database’s interaction with memory and storage subsystems.”
Generally speaking, it’s important to understand what type of work environment the database is going to interact with, and to select a storage engine that is tailored to that environment.
The last post looked at MMAPv1, the original default engine for MongoDB (through release 3.0). This post will examine the new default MongoDB engine: WiredTiger.
Find it in: MongoDB or Percona builds
MongoDB, Inc. introduced WiredTiger for document-level concurrency control for write operations in MongoDB v3.0. As a result of this introduction, multiple clients can now modify different documents of a collection at the same time. WiredTiger in MongoDB currently only supports B-trees for the data structure. However, it also has the ability to use LSM-trees, but it is not currently implemented in the MongoDB version of the engine.
WiredTiger has a few interesting features, most notably compression, document-level locking, and index prefix compression. B-trees, due to their rigidity in disk interaction and chattiness with storage, are not typically known for their performance when used with compression. However, WiredTiger has done an excellent job of maintaining good performance with compression and gives a decent performance/compression ratio with the “snappy” compression algorithm. Be that as it may, if deeper compression is necessary, you may want to evaluate another storage engine. Index prefix compression is a unique feature that should improve the usefulness of the cache by decreasing the size of indexes in memory (especially very repetitive indexes).
WiredTiger’s ideal use cases include data that are likely to stay within a few multiples of cache size. One can also expect good performance from TTL-like workloads, especially when data is within the limit previously mentioned.
Most people don’t know that they have a choice when it comes to storage engines, and that the choice should be based on what the database workload will look like. Percona’s Vadim Tkachenko performed an excellent benchmark test comparing the performances of RocksDB, PerconaFT and WiredTiger to help specifically differentiate between these engines.
In Part Three of this blog series, we’ll take a closer look at the RocksDB storage engine.