MongoDB revs you up: What storage engine is right for you? (Part 1)David Avery
The tremendous data growth of the last decade has affected almost all aspects of applications and application use. Since nearly all applications interact with a database at some point, this means databases needed to adapt to the change in usage conditions as well. Database technology has grown significantly in the last decade to meet the needs of constantly changing applications. Enterprises often need to scale, modify, or replace their databases in order to meet new business demands.
Within a database management system (DBMS), there are many levels that can affect performance, including the choice of your database storage engine. Surprisingly, many enterprises don’t know they have a choice of storage engines, or that specific storage engine types are architected to handle specific scenarios. Often the best option depends on what function the database in question is designed to fulfill.
MongoDB is a cross-platform, NoSQL, document-oriented database. It doesn’t use the traditional table-based relational database structure, and instead employs JSON-type documents with dynamic schemas. The intention is making the integration of certain application data types easier and faster.
This blog (the first in a series) will briefly review some of the available options for a MongoDB database storage engine, and the pros and cons of each. Hopefully it will help database administrators, IT staff, and enterprises realize that when it comes to MongoDB, you aren’t limited to a single storage engine choice.
What is a Storage Engine?
A database storage engine is the underlying software that a DBMS uses to create, read, update and delete data from a database. The storage engine should be thought of as a “bolt on” to the database (server daemon), which controls the database’s interaction with memory and storage subsystems. Thus, the storage engine is not actually the database, but a service that the database consumes for the storage and retrieval of information. Given that the storage engine is responsible for managing the information stored in the database, it greatly affects the overall performance of the database (or lack thereof, if the wrong engine is chosen).
Most storage engines are organized using one of the following structures: a Log-Structured Merge (LSM) tree, B-Tree or Fractal tree.
- LSM Tree. An LSM tree has performance characteristics that make it attractive for providing indexed access to files with high insert volume. LSM trees seek to provide the excellent insertion performance of log type storage engines, while minimizing the impact of searches in a data structure that is “sorted” strictly on insertion order. LSMs buffer inserts, updates and deletes by using layers of logs that increase in size, and then get merged in sorted order to increase the efficiency of searches.
- B-Tree. B-Trees are the most commonly implemented data structure in databases. Having been around since the early 1970’s, they are one of the most time-tested storage engine “methodologies.” B-Trees method of data maintenance makes searches very efficient. However, the need to maintain a well-ordered data structure can have a detrimental effect on insertion performance.
- Fractal Tree. A Fractal Tree index is a tree data structure much like that of a B-tree (designed for efficient searches), but also ingests data into log-like structures for efficient memory usage in order to facilitate high-insertion performance. Fractal Trees were designed to ingest data at high rates of speed in order to interact efficiently with the storage for high bandwidth applications.
Fractal Trees and the LSM trees sound very similar. The main differentiating factor, however, is the manner in which they sort the data into the tree for efficient searches. LSM trees merge data into a tree from a series of logs as the logs fill up. Fractal Trees sort data into log-like structures (message buffers) along the proper data path in the tree.
What storage engine is best?
That question is not a simple one. In order decide which engine to choose, it’s necessary to determine the core functionality provided in each engine. Core functionality can generally be aggregated into three areas:
- Locking types. Locking within database engines defines how access and updates to information are controlled. When an object in the database is locked for updating, other processes cannot modify (or in some cases read) the data until the update has completed. Locking not only affects how many different applications can update the information in the database, it can also affect queries on that data. It is important to monitor how queries access data, as the data could be altered or updated as it is being accessed. In general, such delays are minimal. The bulk of the locking mechanism is devoted to preventing multiple processes updating the same data. Since both additions (INSERT statements) and alterations (UPDATE statements) to the data require locking, you can imagine that multiple applications using the same database can have a significant impact. Thus, the “granularity” of the locking mechanism can drastically affect the throughput of the database in “multi-user” (or “highly-concurrent”) environments.
- Indexing. The indexing method can dramatically increase database performance when searching and recovering data. Different storage engines provide different indexing techniques, and some may be better suited for the type of data you are storing. Typically, every index defined on a collection is another data structure of the particular type the engine uses (B-tree for WiredTiger, Fractal Tree for PerconaFT, and so forth). The efficiency of that data structure in relation to your workload is very important. An easy way of thinking about it is viewing every extra index as having performance overhead. A data structure that is write-optimized will have lower overhead for every index in a high-insert application environment than a non-write optimized data structure would. For use cases that require a large number of indexes, choosing an appropriate storage engine can have a dramatic impact.
- Transactions. Transactions provide data reliability during the update or insert of information by enabling you to add data to the database, but only to commit that data when other conditions and stages in the application execution have completed successfully. For example, when transferring information (like a monetary credit) from one account to another, you would use transactions to ensure that both the debit from one account and the credit to the other completed successfully. Often, you will hear this referred to as “atomicity.” This means the operations that are bundled together are an immutable unit: either all operations complete successfully, or none do. Despite the ability of RocksDB, PerconaFT and WiredTiger to support transactions, as of version 3.2 this functionality is not available in the MongoDB storage engine API. Multi-document transactions cannot be used in MongoDB. However, atomicity can be achieved at the single document level. According to statements from MongoDB, Inc., multi-document transactions will be supported in the future, but a firm date has not been set as of this writing.
Now that we’ve established a general framework, we’ll move onto discussing engines. For the first blog in this series, we’ll look at MMAPv1 (the default storage engine that comes with MongoDB up until the release 3.0).
Find it in: MongoDB or Percona builds
MMAPv1 is MongoDB’s original storage engine, and was the default engine in MongoDB 3.0 and earlier. It is a B-tree based system that offloads much of the functions of storage interaction and memory management to the operating system. MongoDB is based on memory mapped files.
The MMAP storage engine uses a process called “record allocation” to grab disk space for document storage. All records are contiguously located on disk, and when a document becomes larger than the allocated record, it must allocate a new record. New allocations require moving a document and updating all indexes that refer to the document, which takes more time than in-place updates and leads to storage fragmentation. Furthermore, MMAPv1 in it’s current iterations usually leads to high space utilization on your filesystem due to over-allocation of record space and it’s lack of support for compression.
As mentioned previously, a storage engine’s locking scheme is one of the most important factors in overall database performance. MMAPv1 has collection-level locking – meaning only one insert, update or delete operation can use a collection at a time. This type of locking scheme creates a very common scenario in concurrent workloads, where update/delete/insert operations are always waiting for the operation(s) in front of them to complete. Furthermore, oftentimes those operations are flowing in more quickly than they can be completed in serial fashion by the storage engine. To put it in context, imagine a giant supermarket on Sunday afternoon that only has one checkout line open: plenty of customers, but low throughput!
Given the storage engine choices brought about by the storage engine API in MongoDB 3.0, it is hard to imagine an application that demands the MMAPv1 storage engine for optimized performance. If you read between the lines, you could conclude that MongoDB, Inc. would agree given that the default engine was switched to WiredTiger in v3.2.
Most people don’t know that they have a choice when it comes to storage engines, and that the choice should be based on what the database workload will look like. Percona’s Vadim Tkachenko performed an excellent benchmark test comparing the performances of RocksDB, PerconaFT and WiredTiger to help specifically differentiate between these engines.
In Part Two of this blog series, we examine the ins and outs of MongoDB’s new default storage engine, WiredTiger.