We recently released the GA version of Percona Server for MongoDB, which comes with a variety of storage engines: RocksDB, PerconaFT and WiredTiger.
Both RocksDB and PerconaFT are write-optimized engines, so I wanted to compare all engines in a workload oriented to data ingestions.
For a benchmark I used iiBench-mongo (https://github.com/mdcallag/iibench-mongodb), and I inserted one billion (bln) rows into a collection with three indexes. Inserts were done in ten parallel threads.
For memory limits, I used a 10GB as the cache size, with a total limit of 20GB available for the mongod process, limited with cgroups (so the extra 10GB of memory was available for engine memory allocation and OS cache).
For the storage I used a single Crucial M500 960GB SSD. This is a consumer grade SATA SSD. It does not provide the best performance, but it is a great option price/performance wise.
Every time I mention WiredTiger, someone in the comments asks about the LSM option for WiredTiger. Even though LSM is still not an official mode in MongoDB 3.2, I added WiredTiger-LSM from MongoDB 3.2 into the mix. It won’t have the optimal settings, as there is no documentation how to use LSM in WiredTiger.
And now, let’s zoom in on the individual engines.
RocksDB + PerconaFT:
UPDATE on 12/30/15
With an input from RocksDB developers at Facebook, after extra tuning of RocksDB (add
delayed_write_rate=12582912;soft_rate_limit=0;hard_rate_limit=0; to config) I am able to get much better result for RocksDB:
What conclusions can we make?
- WiredTiger’s memory (about the first one million (mln) rows) performed extremely well, achieving over 100,000 inserts/sec. As data grows and exceeds memory size, WiredTiger behaved as a traditional B-Tree engine (which is no surprise).
- PerconaFT and RocksDB showed closer to constant throughput, with RocksDB being overall better, However, with data growth both engines start to experience challenges. For PerconaFT, the throughput varies more with more data, and RocksDB shows more stalls (which I think is related to a compaction process).
- WiredTiger LSM didn’t show as much variance as a B-Tree, but it still had a decline related to data size, which in general should not be there (as we see with RocksDB, also LSM based).
Inserting data is only one part of the equation. Now we also need to retrieve data from the database (which we’ll cover in another blog post).
Configuration for PerconaFT:
numactl --interleave=all ./mongod --dbpath=/mnt/m500/perconaft --storageEngine=PerconaFT --PerconaFTEngineCacheSize=$(( 10*1024*1024*1024 )) --syncdelay=900 --PerconaFTIndexFanout=128 --PerconaFTCollectionFanout=128 --PerconaFTIndexCompression=quicklz --PerconaFTCollectionCompression=quicklz --PerconaFTIndexReadPageSize=16384 --PerconaFTCollectionReadPageSize=16384
Configuration for RocksDB:
Configuration for WiredTiger-3.2 LSM:
Load parameters for iibench:
TEST_RUN_ARGS_LOAD="1000000000 6000 1000 999999 10 256 3 0