The challenge of handling massive data processing workloads has spawned many innovations and techniques in the database world, from indexing advances like our Fractal Tree® technology to a myriad of “NoSQL” solutions (here is our Chief Scientist’s perspective). MongoDB is among the most popular and widely adopted NoSQL solutions, and we became curious whether our Fractal Tree indexing could offer some advantage when combined with it. The answer seems to be a strong “yes”.
Earlier in the summer we kicked off a small side project. Here’s what we did: we implemented a “version 2” IndexInterface as a Fractal Tree index and ran some benchmarks. Note that our integration only affects MongoDB’s secondary indexes; primary indexes continue to rely on MongoDB’s indexing code. All the changes we made to the MongoDB source are available here. Caveat: this was a quick and dirty project. The code is experimental grade, so none of it is supported or has gone through any careful design analysis.
For our initial benchmark we measured the performance of a single-threaded insertion workload. Each inserted document contained the following fields: URI (character), name (character), origin (character), creation date (timestamp), and expiration date (timestamp). We created a total of four secondary indexes: URI, name, origin, and creation date. The point of the benchmark is to insert enough documents that the indexes grow larger than main memory, showing how insertion performance changes from an empty database to one that is largely dependent on disk I/O. We ran the benchmark with journaling disabled, then again with journaling enabled.
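To make the workload shape concrete, here is a minimal Python sketch of a single-threaded insert loop over documents with those five fields. It is not the harness we ran: the field spellings, string lengths, date ranges, and the `make_document` and `run_insert_benchmark` helpers are all assumptions made for illustration, and `insert_one` stands in for whatever call actually writes to the database.

```python
import random
import string
import time
from datetime import datetime, timedelta, timezone

# The four secondary indexes named in the post (field spellings are assumed).
SECONDARY_INDEX_FIELDS = ["uri", "name", "origin", "creation_date"]


def random_text(rng, length):
    """Return a random lowercase string of the given length."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def make_document(rng):
    """Build one document with the five fields described above.

    Lengths and date ranges are arbitrary illustration values.
    """
    base = datetime(2000, 1, 1, tzinfo=timezone.utc)  # arbitrary base date
    created = base + timedelta(seconds=rng.randrange(10 * 365 * 24 * 3600))
    return {
        "uri": "http://example.com/" + random_text(rng, 24),
        "name": random_text(rng, 16),
        "origin": random_text(rng, 16),
        "creation_date": created,
        "expiration_date": created + timedelta(days=rng.randrange(1, 366)),
    }


def run_insert_benchmark(insert_one, total, interval, seed=0):
    """Insert `total` documents one at a time from a single thread.

    Returns the inserts-per-second rate for each block of `interval`
    documents, so the slowdown is visible as the data set grows.
    """
    rng = random.Random(seed)
    rates = []
    start = time.perf_counter()
    for i in range(1, total + 1):
        insert_one(make_document(rng))
        if i % interval == 0:
            elapsed = time.perf_counter() - start
            rates.append(interval / elapsed if elapsed > 0 else float("inf"))
            start = time.perf_counter()
    return rates
```

Passing a list's `append` method as `insert_one` exercises the loop without a database. Against a real server, one option would be to create an index on each entry in `SECONDARY_INDEX_FIELDS` with a driver such as pymongo (`collection.create_index(field)`) and pass `collection.insert_one` instead. When `total` is a multiple of `interval`, the last entry in the returned list is the rate over the final block of inserts.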
Without journaling, the exit velocity (the insertion rate at the end of the run) of standard MongoDB was 1,045 inserts per second at 54 million document insertions, versus 13,304 inserts per second at 198 million document insertions for MongoDB with Fractal Tree Indexes: an improvement of 1,173%. With journaling, standard MongoDB finished at 763 inserts per second and MongoDB with Fractal Tree Indexes at 6,951: an improvement of 811%.
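For readers who want to check the percentages, they follow from the usual relative-improvement formula, (new - old) / old. The short Python sketch below reproduces them from the four exit velocities quoted above; the `percent_improvement` helper name is ours, introduced only for illustration.

```python
def percent_improvement(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# Exit velocities quoted above, in inserts per second.
no_journal = percent_improvement(1045, 13304)
with_journal = percent_improvement(763, 6951)

print(round(no_journal))    # 1173
print(round(with_journal))  # 811
```

Put another way, the Fractal Tree runs finished at roughly 12.7 times and 9.1 times the insertion rate of standard MongoDB, respectively.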
At this point there are several technical directions we could take and here are some we’ll likely be looking at more closely:
We’re not MongoDB experts by any stretch but we wanted to share these results with the community and get people’s thoughts on applications where this might help, suggestions for next steps, and any other feedback.