
RocksDB 101

 | October 21, 2015 |  Posted In: MongoDB


After we announced that Percona offers support for RocksDB, many people asked for more details about this storage engine. Here is a quick list of the questions we get most often.

Q: What is RocksDB?

A: Quoting the homepage of the project:

RocksDB is an embeddable persistent key-value store for fast storage. RocksDB can also be the foundation for a client-server database, but our current focus is on embedded workloads.

RocksDB builds on LevelDB to be scalable to run on servers with many CPU cores, to efficiently use fast storage, to support IO-bound, in-memory and write-once workloads, and to be flexible to allow for innovation.

Q: Where is it available?

A: You have 2 main options to get RocksDB:

In both cases, you will then need to start MongoDB with --storageEngine=rocksdb.

Q: What are the main features of RocksDB?

A: Because of its design using LSM trees, RocksDB offers excellent write performance without sacrificing too much read performance. And as a modern storage engine, it compresses data.

So whenever MongoDB write performance is a concern, RocksDB is a good candidate.

Also note that RocksDB has been developed with fast storage in mind.

Q: Why is RocksDB write optimized?

A: RocksDB uses LSM trees to store data, unlike most other storage engines, which use B-Trees.

In most cases, B-Trees offer a very good tradeoff between read performance and write performance; this is why they are so widely used in the database world. However, when the working set no longer fits in memory, writes become extremely slow because at least one I/O is needed for each write operation.

LSM trees are designed to amortize the cost of writes: data is written to log files that are sequentially written to disk and never modified. Then a background thread merges the log files (compaction). With this design a single I/O can flush to disk tens or hundreds of write operations.
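The write path described above can be illustrated with a toy model. This is a minimal sketch, not RocksDB's implementation: the class name ToyLSM, the memtable_limit parameter and the use of in-memory tuples as stand-ins for on-disk files are all assumptions made for illustration.

```python
class ToyLSM:
    """Toy LSM write path: buffer writes in memory, flush them as immutable sorted runs."""

    def __init__(self, memtable_limit=100):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}   # in-memory buffer of recent writes
        self.runs = []       # immutable sorted runs, oldest first (stand-ins for files)
        self.flushes = 0     # number of simulated sequential writes to disk

    def put(self, key, value):
        # A write only touches memory until the buffer is full.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # One sequential write persists every buffered operation at once.
        if not self.memtable:
            return
        self.runs.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.flushes += 1

    def compact(self):
        # Merge all runs into one; iterating oldest first lets newer values win.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [tuple(sorted(merged.items()))] if merged else []


db = ToyLSM(memtable_limit=100)
for i in range(1000):
    db.put(f"key{i % 250}", i)   # 1000 writes over 250 distinct keys
print(db.flushes, len(db.runs))  # flush count and number of runs before compaction
db.compact()
print(len(db.runs), len(db.runs[0]))  # runs and distinct keys after compaction
```

In this model each flush persists 100 write operations in one sequential write, which is the amortization the LSM design relies on. Compaction then discards the overwritten versions.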

The tradeoff is that reading a document is more complex and therefore slower than with a B-Tree: because we don’t know in advance which log file holds the latest version of the data, we may need to read multiple files to perform a single read. Techniques such as Bloom filters help alleviate this issue.
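The read path and the role of a Bloom filter can be sketched as follows. This is an illustrative model under stated assumptions, not RocksDB's code: BloomFilter, SortedRun and get are hypothetical names, the filter size and hash count are arbitrary, and each "file" is just a Python dict.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored in a single integer

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SortedRun:
    """Stand-in for one immutable file: its data plus a Bloom filter over its keys."""

    def __init__(self, items):
        self.data = dict(items)
        self.bloom = BloomFilter()
        for key in self.data:
            self.bloom.add(key)


def get(runs_newest_first, key):
    """Return (value, files_read), checking the newest run first."""
    files_read = 0
    for run in runs_newest_first:
        if not run.bloom.might_contain(key):
            continue  # the filter proves the key is absent: skip this file
        files_read += 1
        if key in run.data:
            return run.data[key], files_read  # newest version found, stop here
    return None, files_read


runs = [SortedRun({"a": 3}), SortedRun({"b": 2}), SortedRun({"a": 1, "c": 0})]
value, files_read = get(runs, "a")  # newest version of "a" is in the first run
```

Without the filters, a lookup for a key stored only in the oldest run would read every file. With them, most files that do not contain the key are skipped without being read.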

Q: How does RocksDB performance compare to other storage engines?

A: Mark Callaghan from Facebook published results for cached databases (data fits in memory) some time ago.

Vadim Tkachenko from Percona published additional results when data is larger than memory.

Q: Where can I find RocksDB support?

A: You can report issues here, go to this Facebook group to discuss RocksDB-related topics, or hire us.

Q: How can I run backups?

A: Storage-engine agnostic methods like cold backups or volume snapshots work with RocksDB.

RocksDB also has native support for hot backups with the following command:

See this post from the Facebook/Parse engineering team for more details.

Because files in an LSM tree are never modified once written, incremental backups are much easier than with many other technologies, and rocks-strata is probably a good place to start.
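To see why immutable files make incremental backups simple, consider the sketch below. It is not how rocks-strata works internally. The function name incremental_copy and the file names are hypothetical, and it assumes every file in the snapshot directory is immutable and uniquely named. A real tool would also have to handle any files that are rewritten in place.

```python
import shutil
import tempfile
from pathlib import Path


def incremental_copy(snapshot_dir, backup_dir):
    """Copy only the files that the backup does not already hold.

    Assumes files are never modified after creation, so a file name that is
    already present in the backup never needs to be copied again.
    """
    snapshot_dir, backup_dir = Path(snapshot_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    already_backed_up = {p.name for p in backup_dir.iterdir() if p.is_file()}
    copied = []
    for src in sorted(snapshot_dir.iterdir()):
        if src.is_file() and src.name not in already_backed_up:
            shutil.copy2(src, backup_dir / src.name)
            copied.append(src.name)
    return copied


# Usage with throwaway directories and made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    snapshot, backup = Path(tmp) / "snapshot", Path(tmp) / "backup"
    snapshot.mkdir()
    (snapshot / "000001.sst").write_text("first file")
    first = incremental_copy(snapshot, backup)    # full copy on the first run
    (snapshot / "000002.sst").write_text("second file")
    second = incremental_copy(snapshot, backup)   # only the new file is copied
```

With a B-Tree engine that updates pages in place, the same approach would not work, because an existing file can change between two backups.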

Conclusion

The storage engine ecosystem for MongoDB is advancing quickly, with at least 3 strong contenders: WiredTiger, RocksDB and PerconaFT. If you want to learn more about RocksDB, PerconaFT and Percona Server for MongoDB, please register for my free webinar on Wed Oct 28 at 11am Pacific Time.

Stephane Combaudon

Stéphane joined Percona in July 2012, after working as a MySQL DBA for leading French companies such as Dailymotion and France Telecom. In real life, he lives in Paris with his wife and their twin daughters. When not in front of a computer or not spending time with his family, he likes playing chess and hiking.

7 Comments

  • Stephane,
    How is the MongoRocks backup method described at the URL below different from triggering the API call?

    http://blog.parse.com/learn/engineering/strata-open-source-library-for-efficient-mongodb-backups/

  • Peter,
    rocks-strata tool leverages the MongoRocks backup API to build the full framework for managing backups and restores. The only thing that API does is that it creates a snapshot of a database in another directory. Rocks-strata then takes the snapshot and sends it to remote storage. It also performs incremental backup, so it only sends the new files to remote storage. With rocks-strata you can also restore data from backup and we even have a way to query backups while they’re still on remote storage.

  • Actually, https://github.com/MySQLOnRocksDB/mysql-5.6 is an old link. Go to https://github.com/facebook/mysql-5.6/ to get the latest MySQL on RocksDB

  • The latest MySQL on RocksDB is available at https://github.com/facebook/mysql-5.6/ . We migrated away from https://github.com/MySQLOnRocksDB/mysql-5.6 a few months ago. Sorry for the confusion.
