TokuDB’s loader uses the available multicore computing resources of the machine to presort and insert the data. In the last couple of posts (here and here), Rich and Dave presented performance results of TokuDB’s loader. Comparing load times with TokuDB 2.1.0, Rich found a 2.1x speedup on a 2-core machine and a 4.2x speedup on an 8-core machine. Comparing load times with TokuDB 3.1, Dave found an 8.2x speedup on an Amazon Web Services c1.large node with 8 cores while loading a table with 256-byte rows.
This raises two natural questions: how does one use the TokuDB loader, and under what scenarios is it used?
The loader has two purposes:
- to ease migration of data from other sources (e.g. other storage engines, data files) to TokuDB.
- to build newly defined indexes very quickly, after which TokuDB can maintain them in real time.
So, the loader is designed to operate on empty tables.
The loader is integrated into the TokuDB storage engine, so using it does not require any external binaries. Users invoke the loader by populating an empty TokuDB table, or by building a new index, with any of the following commands:
- insert into
- load data infile
- alter table … engine=TokuDB
- alter table … add index…
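As a rough sketch of what these four commands can look like, the statements below use hypothetical names throughout: the tables `src`, `dst`, and `legacy`, the column `a`, the index `idx_a`, and the file path are all assumptions made for illustration. The statement syntax is standard MySQL, and each numbered statement is meant as an independent alternative rather than a script to run in order.

```sql
-- Hypothetical names throughout: src, dst, legacy, idx_a, and the file path.
-- Each numbered statement is an independent alternative, not a script to run in order.

-- Empty TokuDB table used as the load target in (1) and (2).
CREATE TABLE dst (id INT PRIMARY KEY, a INT) ENGINE=TokuDB;

-- (1) insert into: copy rows from an existing table into the empty table.
INSERT INTO dst SELECT id, a FROM src;

-- (2) load data infile: read rows from a data file into the empty table.
LOAD DATA INFILE '/tmp/rows.csv' INTO TABLE dst
  FIELDS TERMINATED BY ',';

-- (3) alter table ... engine=TokuDB: migrate a table from another storage engine.
ALTER TABLE legacy ENGINE=TokuDB;

-- (4) alter table ... add index: build a newly defined index on a TokuDB table.
ALTER TABLE dst ADD INDEX idx_a (a);
```

In (1) and (2) the target table is empty when the statement starts, which matches the condition the loader is designed for.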
Scenarios that do not yet work are:
- replace into
- insert ignore
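For contrast, the two forms listed above have the following shape. This only illustrates the statement syntax; `dst` and `src` are hypothetical tables, and nothing here is meant to describe what TokuDB does instead when these statements are issued.

```sql
-- Hypothetical tables; per the list above, these forms do not yet work with the loader.
REPLACE INTO dst SELECT id, a FROM src;
INSERT IGNORE INTO dst SELECT id, a FROM src;
```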