Percona’s TPCC for MySQL toolset measures query performance for an OLTP workload on various MySQL storage engines. The toolset includes a program to load the database tables and a program to run queries and measure performance. We have found Percona’s TPCC toolset to be extremely useful for tuning our software. However, we wanted to take advantage of TokuDB’s bulk load capability when loading the database.
We created a new tool, a simple variant of the existing code, that generates CSV files for the TPCC database. These CSV files can be bulk loaded into TokuDB with a “LOAD DATA INFILE” statement. TokuDB’s bulk loader uses a parallel merge sort algorithm implemented in Cilk, an extension to the C language that makes it easy to exploit multicore machines. In contrast, Percona’s TPCC loader inserts one row at a time and typically uses one core.
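As a rough illustration of the CSV-then-bulk-load approach, here is a minimal Python sketch. It is not our tool (which is a variant of Percona’s C code); the function names `write_warehouse_csv` and `load_statement`, the four-column row layout, and the generated values are all hypothetical and chosen only for the example. Only the “LOAD DATA INFILE” statement itself is standard MySQL syntax.

```python
import csv
import io
import random


def write_warehouse_csv(out, num_warehouses, seed=0):
    """Write one CSV row per warehouse to the file-like object `out`.

    The columns (id, name, tax, ytd) are a simplified, hypothetical
    layout, not the full TPCC warehouse schema.
    """
    rng = random.Random(seed)
    writer = csv.writer(out, lineterminator="\n")
    for w_id in range(1, num_warehouses + 1):
        name = f"wh{w_id}"
        tax = round(rng.uniform(0.0, 0.2), 4)  # made-up tax rate
        ytd = "300000.00"                      # made-up starting balance
        writer.writerow([w_id, name, tax, ytd])
    return num_warehouses


def load_statement(csv_path, table):
    """Build a LOAD DATA INFILE statement matching the CSV written above."""
    return (
        f"LOAD DATA INFILE '{csv_path}' INTO TABLE {table} "
        "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n'"
    )


if __name__ == "__main__":
    buf = io.StringIO()
    write_warehouse_csv(buf, 3)
    print(buf.getvalue(), end="")
    print(load_statement("/tmp/warehouse.csv", "warehouse"))
```

The point of the split is that the whole file reaches the storage engine in a single statement, so the engine can sort and build its indexes in parallel instead of handling one inserted row at a time.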
We ran TPCC experiments on an Amazon EC2 machine (c1.xlarge, 8 cores, 7 GB of memory) with a single EBS volume for storage. All experiments used the latest version of TokuDB release 5.0. We observed an 8-fold speedup on this 8-core machine when loading the 200-warehouse TPCC database into TokuDB using the bulk loader, compared with Percona’s tpcc_load program. We expect further performance increases on machines with additional cores.