At the recent OpenSQL Camp in Charlottesville, VA, Tokutek offered a challenge to the MySQL community – who can insert a billion rows into MySQL the fastest? We will post the results on our website and the winner gets a $100 Starbucks card, along with valuable bragging rights.
Tokutek’s technical founders (Michael A. Bender, Martin Farach-Colton, and I), in our academic roles (at Stony Brook, Rutgers, and MIT, respectively) have been investigating how to maintain indexes for large databases. Part of the challenge for this kind of research is to figure out what to measure.
Some other benchmarks, such as TPC-H and SSB, measure bulk load time rather than insertions. We are interested in the case where you must insert a small number of rows at a time at a high rate, and keep the index up-to-date. Indexed insertions are interesting in situations with high incoming data rates and a desire to concurrently query on new data without waiting for periodic batch loads. We wrote, with the help of Tokuteknologist Vincenzo Liberatore, a simple open source benchmark named iiBench, specifically designed to stress indexed insertion performance. Using iiBench, we tested InnoDB and MyISAM, and found that insertion rates for both storage engines drop off dramatically as the database grows.
This benchmark is a work in progress. It has problems, and we’re looking for feedback on how to improve it. In essence, the benchmark inserts a billion rows into a table while maintaining an interesting primary key plus two interesting indexes.
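To make the shape of an indexed-insertion test concrete, here is a minimal sketch of the idea, not iiBench itself. It assumes a hypothetical `purchases` table with made-up columns and two secondary indexes, and it uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs anywhere. The row counts are tiny compared with a billion, so the numbers it prints say nothing about InnoDB or MyISAM.

```python
import random
import sqlite3
import time


def run_indexed_inserts(total_rows=20_000, batch_size=1_000, seed=0):
    """Insert rows in small batches into an indexed table.

    Returns (row_count, rates), where rates is a list of
    (rows_inserted_so_far, rows_per_second_for_that_batch).
    """
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")  # stand-in engine, not MySQL

    # Hypothetical schema: a primary key plus two secondary indexes.
    conn.execute(
        "CREATE TABLE purchases ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL,"
        " product_id INTEGER NOT NULL,"
        " price INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX by_customer ON purchases (customer_id, price)")
    conn.execute("CREATE INDEX by_product ON purchases (product_id, price)")

    rates = []
    inserted = 0
    while inserted < total_rows:
        n = min(batch_size, total_rows - inserted)
        # Random key values, so index entries land all over each index.
        rows = [
            (rng.randrange(100_000), rng.randrange(10_000), rng.randrange(50_000))
            for _ in range(n)
        ]
        start = time.perf_counter()
        with conn:  # commit once per batch
            conn.executemany(
                "INSERT INTO purchases (customer_id, product_id, price)"
                " VALUES (?, ?, ?)",
                rows,
            )
        elapsed = time.perf_counter() - start
        inserted += n
        rates.append((inserted, n / elapsed if elapsed > 0 else float("inf")))

    count = conn.execute("SELECT COUNT(*) FROM purchases").fetchone()[0]
    conn.close()
    return count, rates


if __name__ == "__main__":
    count, rates = run_indexed_inserts()
    for so_far, rate in rates:
        print(f"{so_far:>8} rows inserted, {rate:,.0f} rows/sec in last batch")
```

Reporting the rate per batch, rather than one overall average, is what lets you see whether insertion speed falls off as the table grows.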
Although our research is on how to improve insertions, this contest isn’t about how much faster I can solve this problem with some other storage engine. The contest is more like a “peer review” to demonstrate that we’ve gotten as much out of MyISAM and InnoDB as we can. Admittedly, we may not have found the optimal MySQL parameters, so we are sponsoring a contest to see who can insert 1B rows into MySQL the fastest using iiBench. We’re hoping to learn just how fast MySQL can do indexed insertions given better tuning or via other innovative techniques. An overview of the contest with ground rules along with the iiBench source is available at http://www.tokutek.com/contest.php
We want to improve iiBench. Some day it may be good enough to be a useful benchmark.
Take a look and submit an entry by 31 Dec 2008. If you have any questions, please e-mail us at firstname.lastname@example.org.