268x Query Performance Increase for MongoDB with Fractal Tree Indexes, SAY WHAT?

August 30, 2012

Author

Tim.Callaghan

MySQL

Last week I wrote about our 10x insertion performance increase with MongoDB. We’ve continued our experimental integration of Fractal Tree® Indexes into MongoDB, adding support for clustered indexes. A clustered index stores all non-index fields as the “value” portion of the index, as opposed to a standard MongoDB index that stores a pointer to the document data. The benefit is that indexed lookups can immediately return any requested values instead of needing to do an additional lookup (and potential disk I/Os) for the requested fields.
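To make the difference concrete, here is a toy in-memory model of the two index styles. It is only a sketch: the sample documents, the `storageReads` counter, and the function names are all made up for illustration, and it says nothing about how MongoDB or Fractal Tree Indexes are actually implemented on disk.

```javascript
// Toy model only: hypothetical data and names, not MongoDB internals.
const documents = [
  { URI: "/a", name: "alpha", origin: "us" },
  { URI: "/b", name: "beta", origin: "eu" },
];

// Standard index: key -> pointer (here, an array position) into document storage.
const standardIndex = new Map(documents.map((doc, pos) => [doc.URI, pos]));

// Clustered index: key -> all non-index fields, stored next to the key.
const clusteredIndex = new Map(
  documents.map(({ URI, ...rest }) => [URI, rest])
);

// Counts how often we have to go back to document storage.
let storageReads = 0;
function readDocument(pos) {
  storageReads += 1;
  return documents[pos];
}

function findViaStandard(uri) {
  const pos = standardIndex.get(uri); // index lookup
  return pos === undefined ? null : readDocument(pos); // plus a second lookup
}

function findViaClustered(uri) {
  const rest = clusteredIndex.get(uri); // index lookup already holds the fields
  return rest === undefined ? null : { URI: uri, ...rest };
}

const viaStandard = findViaStandard("/b");
const readsAfterStandard = storageReads;
const viaClustered = findViaClustered("/b");
const readsAfterClustered = storageReads;

console.log(viaStandard.name, "storage reads so far:", readsAfterStandard);
console.log(viaClustered.name, "storage reads so far:", readsAfterClustered);
```

Both lookups return the same fields, but only the standard index has to touch document storage. On a real system that extra step is where the potential disk I/O comes from.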

To create a clustered index you just need to add “clustering:true” as in the following example (note that version 2 indexes are Fractal Tree Indexes):

db.tokubench.ensureIndex({URI : 1}, {v : 2, clustering : true})

In this benchmark I measured the performance of a single-threaded insertion workload combined with a range query that retrieves 1,000 documents whose URI is greater than or equal to a randomly chosen URI. The range query runs on a separate thread and sleeps for 60 seconds after each completed query.
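As a rough illustration of what that range query asks of the index, the sketch below runs the same kind of lookup against a sorted array of keys: find the first key greater than or equal to the start value, then read the next 1,000 entries in order. The key values and the function names (`lowerBound`, `rangeQuery`) are hypothetical, and this is not the benchmark client. In the mongo shell the equivalent would presumably be a `find` with `$gte` on URI plus `limit(1000)`, but the post does not show the exact query.

```javascript
// Hypothetical stand-in for the URI index: 5,000 zero-padded keys, so that
// string order matches numeric order.
const sortedUris = Array.from(
  { length: 5000 },
  (_, i) => "/page/" + String(i).padStart(5, "0")
);

// Binary search for the first position whose key is >= target.
function lowerBound(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid] < target) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Return up to `limit` keys starting at the first key >= startKey.
function rangeQuery(sortedKeys, startKey, limit = 1000) {
  const start = lowerBound(sortedKeys, startKey);
  return sortedKeys.slice(start, start + limit);
}

// Pick a random starting URI, as the benchmark description suggests.
const randomStart = sortedUris[Math.floor(Math.random() * sortedUris.length)];
const result = rangeQuery(sortedUris, randomStart);
console.log("start:", randomStart, "returned:", result.length);
```

Once the starting position is found, the rest of the query is a sequential read of adjacent index entries. With a clustered index those entries already carry the document fields, which is why this workload is a good fit for it.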

The inserted documents contained the following: URI (character), name (character), origin (character), creation date (timestamp), and expiration date (timestamp). We created a total of four secondary indexes: URI (clustered), name, origin, and creation date.
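For readers who want to picture the schema, here is a sketch of one such document and the four index definitions. Only the `URI` field name and the clustered index options come from this post. The other field names (`name`, `origin`, `creationDate`, `expirationDate`), the sample values, and the `{v : 2}` option on the three non-clustered indexes are assumptions made for illustration.

```javascript
// Build one sample document. Field names other than URI are assumed.
function makeDocument(i, now = new Date()) {
  return {
    URI: "/page/" + i,
    name: "name-" + i,
    origin: "origin-" + (i % 10),
    creationDate: now,
    expirationDate: new Date(now.getTime() + 24 * 60 * 60 * 1000),
  };
}

// [keys, options] pairs for the four secondary indexes.
const indexSpecs = [
  [{ URI: 1 }, { v: 2, clustering: true }], // as shown in the post
  [{ name: 1 }, { v: 2 }],                  // options assumed
  [{ origin: 1 }, { v: 2 }],                // options assumed
  [{ creationDate: 1 }, { v: 2 }],          // options assumed
];

// Works with any object that exposes ensureIndex(keys, options).
function createIndexes(collection) {
  for (const [keys, options] of indexSpecs) {
    collection.ensureIndex(keys, options);
  }
}

// Stand-in collection that just records the calls, so this runs without a server.
const recordedCalls = [];
createIndexes({
  ensureIndex: (keys, options) => recordedCalls.push({ keys, options }),
});
console.log(JSON.stringify(recordedCalls, null, 2));
console.log(makeDocument(1));
```

In a mongo shell connected to the experimental build, the same function could be handed `db.tokubench` instead of the recording stand-in. The `v : 2` and `clustering` options are specific to that build.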

We ran the benchmark with journaling disabled and with the default WriteConcern (disabled).

My benchmark client is available here.

Benchmark Environment

Sun x4150, (2) Xeon 5460, 8GB RAM, StorageTek Controller (256MB, write-back), 4x10K SAS/RAID 0

Ubuntu 10.04 Server (64-bit), ext4 filesystem

MongoDB v2.2.RC0

Benchmark Results

Standard MongoDB had an exit velocity of 1,092 inserts per second at 38 million document insertions, whereas MongoDB with Fractal Tree Indexes had an exit velocity of 12,241 inserts per second at 49 million document insertions: an improvement of 1,020%.

More interesting is the query performance. Note that this is a latency graph, where lower is better, and that the Y-axis is on a log scale to make comparison easier. Standard MongoDB exited with an average of 16,668 milliseconds per query, whereas MongoDB with Fractal Tree Indexes exited with an average of 62 milliseconds: a 26,816% improvement.
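The arithmetic behind these comparisons can be checked from the rounded figures quoted above. The snippet below is a small sketch with made-up variable names. Because it starts from the rounded values rather than the raw data, its percentages land close to, but not exactly on, the ones quoted in the post.

```javascript
// Rounded figures quoted in the post.
const insertsPerSecond = { standard: 1092, fractalTree: 12241 };
const queryLatencyMs = { standard: 16668, fractalTree: 62 };

// Throughput: higher is better, so divide the new rate by the old rate.
const insertFactor = insertsPerSecond.fractalTree / insertsPerSecond.standard;

// Latency: lower is better, so divide the old latency by the new latency.
const queryFactor = queryLatencyMs.standard / queryLatencyMs.fractalTree;

// Percent gain over the baseline for a given speedup factor.
function percentGain(factor) {
  return (factor - 1) * 100;
}

console.log("insert speedup:", insertFactor.toFixed(1) + "x",
            "(" + percentGain(insertFactor).toFixed(0) + "% gain)");
console.log("query speedup:", queryFactor.toFixed(1) + "x",
            "(" + percentGain(queryFactor).toFixed(0) + "% gain)");
```

From the rounded inputs the insert speedup comes out at a little over 11x and the query speedup at a little over 268x, in line with the 268x in the title.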

As I said in my last post, we’re not MongoDB experts by any stretch but we wanted to share these results with the community and get people’s thoughts on applications where this might help, suggestions for next steps, and any other feedback. Also, if you are interested in learning more about TokuDB, please stop by to hear us speak at StrangeLoop, MySQL Connect, Percona Live, or join our introductory webinar next week.

By the way, MongoDB also supports covered indexes, which I will talk about in my next post. Covered indexes can provide some of the benefits of a clustered index, but can have significant drawbacks as well.

9 Comments

Benjamin Abt

13 years ago

Hi,

great information!
Do you know if there is any way to get these improvements by using the MongoDB C# Driver?

Thanks!

Tim.Callaghan

13 years ago

Reply to Benjamin Abt

Benjamin,

Our indexing performance improvements are within MongoDB itself so they are available to all clients, regardless of the driver language.

-Tim

Fulano Tal

13 years ago

Hi,

thanks for sharing these benchmarks. Do you also have data for the standard deviation etc.?

Cheers

Tim.Callaghan

13 years ago

Reply to Fulano Tal

The raw data used for the graphs is available here.

Fulano Tal

13 years ago

Reply to Tim.Callaghan

Thank you. 🙂

Tyler

13 years ago

I’ve looked around on the MongoDB site and I’ve not found any documentation for clustering indexes. Is this something Tokutek has developed as a plugin/upgrade?

Tim.Callaghan

13 years ago

Reply to Tyler

Tyler,

MongoDB supports covered indexes, as discussed in their documentation at http://docs.mongodb.org/manual/tutorial/create-indexes-to-support-queries/. Clustering indexes are exclusive to our implementation. Please let us know if you’d like to evaluate it.

Bobo

11 years ago

The MongoDB documentation recommends that you size indexes to fit in memory. How does the performance drop off as your database exceeds your machine’s memory size? And what about the other collections in the same database? Their indexes will also be pushed out of memory.

Tim.Callaghan

11 years ago

Reply to Bobo

The point of this experiment was to show how the two products behave on a mixed workload (inserts plus queries). In addition, the secondary index on the TokuMX collection is created clustered as this type of index allows for optimal range query performance. Since the queries are totally random there is no way that the indexes will fit in memory as the data set is constantly growing.