I have always believed that TokuMX’s Fractal Tree indexes are an ideal fit for MongoDB’s sharding model, especially when it comes time for balancing to occur. At a very high level, balancing is needed when one shard contains more chunks than another. The actual formula is well described in the MongoDB documentation. Balancing shards impacts performance, so MongoDB implemented a scheduler to allow control of when the balancing process is allowed to occur, hopefully during a window of low activity.
Chunk migration is a six-step process, but the most expensive operations are as follows:
- Get all the documents for a given chunk from the donor shard.
- Insert all the documents from step 1 into the receiving shard.
- Delete all the documents from step 1 on the donor shard.
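The three expensive steps above can be sketched in mongo shell terms. This is illustrative only: a real migration runs through internal server commands (the moveChunk protocol), not shell code, and the chunk bounds and collection names here are placeholders, not taken from any real deployment.

```javascript
// Illustrative sketch of the expensive phases of a chunk migration.
// "lower"/"upper" are hypothetical shard-key bounds for one chunk.
var lower = 1, upper = 2;

// 1. Donor shard: range scan over the chunk's shard-key range.
var docs = donorDb.sbtest.find(
  { collectionId: { $gte: lower, $lt: upper } }
).toArray();

// 2. Recipient shard: insert every document from step 1.
docs.forEach(function (doc) {
  recipientDb.sbtest.insert(doc);
});

// 3. Donor shard: delete the migrated range.
donorDb.sbtest.remove({ collectionId: { $gte: lower, $lt: upper } });
```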
The above workload is ideal for TokuMX, especially if your working set fits in memory but your full data set is larger than RAM. Here is why…
1. Get all the documents for a given chunk from the donor shard.
TokuMX creates a clustering index on the shard key, so the documents in a chunk are stored together on disk; the range scan is therefore fast, and the IO is more sequential than random.
2. Insert all the documents from step 1 into the receiving shard.
TokuMX’s indexed insertion performance is unmatched; see the iiBench benchmark.
3. Delete all the documents from step 1 on the donor shard.
TokuMX’s deletes require very little IO, and TokuMX supports concurrent writers, so the running workload is not blocked; see the Sysbench benchmark.
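For reference, TokuMX declares a clustering index via the `clustering` option to `ensureIndex`. A minimal sketch, assuming a collection named `sbtest` and the benchmark's shard-key fields (names here mirror the schema described below, not any required convention):

```javascript
// TokuMX clustering index: the index leaves store the full documents,
// so a range scan over the key reads them off disk sequentially.
db.sbtest.ensureIndex(
  { collectionId: 1, documentId: 1 },
  { clustering: true }
);
```

When a collection is sharded, TokuMX builds this clustering index on the shard key automatically, which is what makes the chunk range scan in step 1 cheap.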
Benchmark Description and Environment
- Single Dell R710, (2) Xeon 5540, 16GB RAM, PERC 6/i (256MB, write-back), 8x10K SAS/RAID 10
- Ubuntu 13.10 (64-bit), XFS file system
- 3 mongod shards, 3 config servers, 1 mongos
- MongoDB v2.4.9
  - read-ahead set to 16K
- TokuMX v1.4.0
  - 4GB cache per mongod instance, directIO
- The schema is Sysbench, with one change. Rather than creating 12 tables with 10 million documents each, I added a new field “collectionId” and inserted 10 million documents for each collectionId between 1 and 12.
- Sharding is range based
- Shard key is collectionId + documentId
- Secondary index on collectionId + k
- Other benchmark specifics
  - Pre-split into 12 chunks, one for each collectionId from 1 through 12
  - Prior to loading, each shard contains 4 chunks
  - Balancer off for the entire benchmark
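The setup above can be sketched with standard mongos shell helpers. The database and collection names (`sbtest.sbtest`) are assumptions for illustration; the split points follow the pre-split scheme described in the list:

```javascript
// Hypothetical setup sketch: shard on collectionId + documentId and
// pre-split into 12 chunks, one per collectionId value.
sh.enableSharding("sbtest");
sh.shardCollection("sbtest.sbtest", { collectionId: 1, documentId: 1 });

// 11 split points yield 12 chunks covering collectionId 1..12.
for (var i = 2; i <= 12; i++) {
  sh.splitAt("sbtest.sbtest", { collectionId: i, documentId: MinKey });
}

sh.stopBalancer();  // balancer off for the entire benchmark
```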
Always worth noting is insert performance. I inserted 120 million documents into the collection, with 12 concurrent threads. Overall insert throughput for MongoDB was 4,596 inserts per second versus TokuMX’s 43,769 inserts per second.
Once the data was loaded I began the traditional Sysbench workload (point queries, range queries, aggregation, update, delete, insert), and gated the throughput for MongoDB at 15 Sysbench “transactions” per second. TokuMX was gated at 30 transactions per second, double that of MongoDB.
After ten minutes (600 seconds), I performed 3 sh.moveChunk() operations; moving a single chunk from shard0001 to shard0002, then one from shard0002 to shard0000, and finally one from shard0000 to shard0001. The impact of this operation is clearly shown in the following graph.
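Those three migrations look roughly like the following (again assuming the `sbtest.sbtest` namespace; the `find` documents are placeholders, since `sh.moveChunk()` only needs some document within the chunk being moved):

```javascript
// Move one chunk between each pair of shards, in sequence.
sh.moveChunk("sbtest.sbtest", { collectionId: 5, documentId: 1 }, "shard0002");
sh.moveChunk("sbtest.sbtest", { collectionId: 9, documentId: 1 }, "shard0000");
sh.moveChunk("sbtest.sbtest", { collectionId: 1, documentId: 1 }, "shard0001");
```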
The TokuMX cluster was running twice as fast as MongoDB, yet was unaffected by the chunk migrations. The MongoDB cluster’s performance was measurably impacted, sometimes dropping by more than 50% over a 10-second interval.
This benchmark specifically highlights a few of the performance advantages of TokuMX (clustering indexes, concurrent writers, low IO index maintenance) but there are others, plus TokuMX includes high compression and transactions. Try it on your workload today by visiting our download page.