Benchmarking Percona Server TokuDB vs InnoDB

May 7, 2013
Author
Vadim Tkachenko
Share this Post:

After compiling Percona Server with TokuDB, of course I wanted to compare InnoDB performance vs TokuDB.
I have a particular workload I’m interested in testing – it is an insert-intensive workload (which is TokuDB’s strong suit) with some roll-up aggregation, which should produce updates in-place (I will use INSERT .. ON DUPLICATE KEY UPDATE statements for that), so it will produce all good amount of reads.

A few words about the hardware: I am going to use new the Dell PowerEdge R420 with two Intel(R) Xeon(R) CPU E5-2450 0 @ 2.10GHz, 48GB of RAM and SATA SSD: Kingston HyperX 3K 240 GB.

Workload: I will use two different schemas. The first schema is from sysbench, and the table looks like:

CREATE TABLE sbtest$I (
id BIGINT UNSIGNED NOT NULL,
k INTEGER UNSIGNED DEFAULT '0' NOT NULL,
c CHAR(120) DEFAULT '' NOT NULL,
pad CHAR(60) DEFAULT '' NOT NULL,
PRIMARY KEY (id)
)
CREATE INDEX k on sbtest$I(k)

and tables sbtest$I_r10, sbtest$I_r100, sbtest$I_r1000, with roll-up sum for 10, 100, 1000 records in the main table.

and transactions for this workload are:

So this workload produces SEQUENTIAL inserts into Primary Key, that this is quite suitable for InnoDB, and in it we have random inserts into SECONDARY KEYS (k),
which is not so good for InnoDB.

So let’s see what results we have. The results are in TPS (more is better) and we start with empty tables.

Now, before looking at the graph, please do not jump to conclusions, as the graph is MISLEADING.

TokuDB-1

So we see that InnoDB performance steadily declines from 24000 tps to 18000 tps, but InnoDB can’t make 5h run. After 3h the disk is full, and InnoDB data size is about 210GB with 234.238.440 inserted records.
While TokuDB averages around 14000 tps mark with some periodical drops into 10000 tps area.
TokuDB datasize after 5h of run is about 50GB with 276.934.863 records.

So why do I say that the graph is misleading?
Obviously we can say that InnoDB is faster, but you should look into the steady decline of InnoDB throughput. Eventually it will drop to the level 14000 tps and below. I do not have enough space on this SSD to run this experiment that long. So there we see the strong side of TokuDB: it has more than 4x data compression on this dataset.
We can easily fill TokuDB tables with 1bln of rows on this SSD, and projected InnoDB performance on this size will be the same or worse, but will require 1TB in size.

Now to see this point of intersection, let’s review different workload (which actually is closer to what I need).

Tables looks like:

and transactions are:

That is, our PRIMARY KEY is not sequential anymore, which is bad for InnoDB, but this is what I need for my setup (I still can have synthetic auto_inc PK, but in that case I still will need SECONDARY KEY (hid,mid,id) ).

Also please note the transaction produces 4 INSERTs and workload is very write intensive.

So what are results in this case:

TokuDB-2

InnoDB gradually declines as data growth (which is expected) and by the end of 5 hours averages at 2700 tps.
With TokuDB we also see a drop, and by end of 5 hours the average throughput is 7800 tps.
Something to take into account: TokuDB results are not quite stable, that is why I also show 5-minute moving averages to TokuDB.

So TokuDB shows about 2.8x better throughput, and on data size:

  • InnoDB table: 58GB and 244.980.192 records
  • TokuDB table: 15GB and 232.927.460 records

So TokuDB looks better in this workload, however the sparse throughput is worrisome to me. Let’s zoom in to 10 min intervals and see throughput:
TokuDB-3
We can see periodical drops, which I believe are related to 60-sec checkpoint interval, as TokuDB does time-based checkpoints.
These drops are quite concerning, and it might be a problem for some users.

Now, I understand that my PRIMARY KEY (hid,mid,id) where id is sequential, and hid,mid is low selectivity is not good for fast inserts, but it is suitable for range selects by id. However it will interesting how both InnoDB and TokuDB performs if PK is (id,hid,mid). This also will affect select performance, so we will need to measure that also.

And, if you want to repeat this benchmark, the sysbench code is on Launchpad lp:~vadim-tk/sysbench/insert-roll-2,
command line to run:

and InnoDB options are:

TokuDB-related options are all defaults, as I understand from documentation TokuDB comes with good settings out-of-box, but I am ready to tune something if there are suggestions.

Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved