September 16, 2014

Considering TokuDB as an engine for timeseries data

TokuDBI am working on a customer’s system where the requirement is to store a lot of timeseries data from different sensors.

For performance reasons we are going to use SSD, and therefore there is a list of requirements for the architecture:

  • Provide high insertion rate
  • Provide a good compression rate to store more data on expensive SSDs
  • Engine should be SSD friendly (less writes per timeperiod to help with SSD wear)
  • Provide a reasonable response time (within ~50 ms) on SELECT queries on hot recently inserted data

Looking on these requirements I actually think that TokuDB might be a good fit for this task.

There are several aspects to consider. This time I want to compare TokuDB vs InnoDB on an initial load time and space consumption.

Let’s assume the schema is following

where sensor_id is in a range from 1 to about 1000 and ts is monotonically increasing timestamp.

This schema exploits both TokuDB and InnoDB clustering primary key, and all inserts are “almost” sequential, which guarantee that all inserts will not require disk access and work with data in memory.
The same for SELECTS – select queries on the most recent time periods will be executed only by a memory access.

I am doing this research on the Dell PowerEdge R420 box with 48GB of memory (40GB for InnoDB buffer pool size, and default memory allocation for TokuDB, which is 24GB for tokudb cache). The storage is a very fast PCI-e Flash card.

The test export CSV file, suitable for LOAD DATA INFILE is 40GB in size and contains over 1 bln records (1.238.201.948 exactly)

MySQL Versions:

  • For InnoDB tests I used Percona Server 5.6-RC2
  • For TokuDB tests I used mariadb-5.5.30-tokudb-7.0.4 from Tokutek website

So, first, let’s load data into InnoDB, again, I am using LOAD DATA INFILE statement

  • InnoDB, no compression. Load time is 1 hour 26 min 25.77 sec, final table size is 90GB
  • InnoDB, 8K compression. Load time 3 hours 26 min 17.06 sec, the table size is 45GB
  • InnoDB, 4K compression. Load time 17 hours 23 min 43.48 sec, the table size is 26GB

Now for TokuDB:

  • TokuDB, default compression. Load time 33 min 1.18 sec, the table size on disk is 10GB
  • TokuDB, tokudb_small table format. Load time 37 min 2.34 sec, the table size is 4.6GB

So TokuDB is the obvious leader in both load time and compression. Of course just these are not enough, and now we need to see the performance of further INSERTs and SELECTs queries. This is what I am running right now and will post the results when I have them.

About Vadim Tkachenko

Vadim leads Percona's development group, which produces Percona Clould Tools, the Percona Server, Percona XraDB Cluster and Percona XtraBackup. He is an expert in solid-state storage, and has helped many hardware and software providers succeed in the MySQL market.

Comments

  1. Vadim,

    Very interesting results ! It is interesting what for Innodb the overhead of compression (in terms of CPU probably) is so large – where enabling compression adds whole 2 hours for this data. You did not test TokuDB without compression but we can see it is able to compress all this data and get it loaded doing all file IO needed in around 30 minutes giving much better compression.

    I assume some of this relates to TokuDB being able to use multi cores for compression while Innodb does not and as such if we use some parallel load/insert process performance difference might be smaller ? Did you look at this by chance ?

  2. The cost of splitting nodes and re-compression because of key_block_size misses can be quite substantial, as the time it took to get 4K shows.

  3. I think there are several factors contributing to such big time difference.
    First, you are correct, LOAD DATA is highly optimized in TokuDB and able to load data in parallel.
    With simple INSERTS the difference will be smaller.
    Now also the data size is important. I am not sure how TokuDB handles compression, but for InnoDB both
    compressed and uncompressed pages are stored in buffer pool, which decrease available memory for InnoDB operations.

  4. sand says:

    awesome, I like posts like that with benchmarks/numbers and some real world usage example

  5. David Bainbridge says:

    Just curious if you have looked into some NoSQL solutions for storage and query of time series data. I have been using OpenTSDB (http://opentsdb.net/) which leverages HBase and Hadoop, and have seen very good performance. Elasticsearch also seems to be at least dipping their toe into the possibility of storage of time series data. It would be interesting to see how these solutions compare to TokuDB and InnoDB.

    Additional things that might be interesting to consider with respect to the query capability is downsampling as well as dealing with sensors that return counter values, where the value of the data is in the rate calculation as opposed to the returned value. The value of these capabilities depends highly on your use cases and sensors, but perhaps something to consider.

  6. Fernando Mattera says:

    How about online backups in TokuDB?

  7. @FernandoMattera, online backups are available in the TokuDB Enterprise Edition, more information and downloads are available at http://www.tokutek.com/products/downloads/enterprise-edition-downloads/

  8. @David,

    Yes, I considered OpenTSDB and Kairosdb, and Cassandra by itself.
    These may be viable solutions, but there more questions to consider from app and ops points of view.
    You saw there already a question on backup. so admins should be trained to perform backups of OpenTSDB.
    The same for monitoring and performance troubleshooting. From what I saw Cassandra is very far yet from being problem-free.
    And the same for app developers. There is still a big entry barrier to switch from SQL based apps to NoSQL.

    Additionally I think both OpenTSDB and Kairosdb are not quite flexible for different schemas. Not that it can’t be done, but I was not able to figure it out in reasonable time.

    Now I also saw a message Cassandra sends is “disk is cheap, denormalize everything, if the same data stored several times – no problem”, which is exactly contrary to what I am trying to achieve with SSD.

    So one thing to another, I am not saying that OpenTSDB / NoSQL are not suitable to store timeseries, but in this particular case I am working with, they seem less preferable.

  9. @FernandoMattera,

    Another backup solution we consider is LVM-snapshot backups.

    In future I may play with btrfs and zfs-for-linux filesystems and their snapshots.

  10. @David,

    Actually what else can be a good fit for this task is some column based storage like InfoBright or InfiniDB.
    The main concern with these is that they are analytic oriented, that means that SELECTs still will be fast,
    but probably not that fast as with InnoDB/TokuDB.

    But this is something to try, I may actually do it.

  11. gpfeng.cs says:

    Vadim

    Do you split the 40GB CSV data file and load them in parallel using multi connections?

    If there is only single connection to execute LOAD DATA INFILE, then InnoDB will not suffer from random IO because there is no secondary index on the table, and all inserts are “almost” sequential, as you mentioned. I want to know why TokuDB can run faster than InnoDB.

    BTY, from your coment:

    > LOAD DATA is highly optimized in TokuDB and able to load data in parallel.

    could you give more detail of LOAD DATA optimization in TokuDB against InnoDB.

    Thanks.

  12. TokuDB uses a bulk loader when bulk inserting into an empty table. The bulk loader gathers up all of the rows to be inserted in a bunch of temp files and then executes a multi-phase merge sort of the temp files. The data in the temp files are sorted by a in memory sort prior to writing the temp files. This in memory sort is fast since LOTS of rows are being sorted. The final merge sort phase writes out fractal tree files. All of the compression of the blocks in the fractal tree files is highly parallel.

    We originally intended a lot more parallelism in the TokuDB bulk loader than is there today. For example, the in memory sorts could be parallelized, and the merge sort could also be parallelized. It is nice to know that even without this level of parallelism, the bulk load speed is still good.

  13. Rich,

    Thanks for explanations! Indeed it would be interesting to see how performance changes if you do inserts (ie as per mysqldump) and how does the size and performance change in this case – I’d expect it can cause more overhead and fragmentation

  14. Alfie John says:

    Would you consider doing the same test with MyISAM (although I would bet TokuDB would clearly be the winner here)?

    I have an almost identical schema for finance data (i.e. code/at/open/high/low/close/volume) and found that INSERTs for MyISAM were a lot faster than InnoDB. However the frequency of table corruption with MyISAM (at least one a week) made me move to InnoDB.

  15. The real question that needs to be answered is when is TokuDB going to be integrated into Percona mainline 5.5?

  16. @Alfie,

    I have no plans to test MyISAM, as this is something I’d never use in production systems.

  17. @Brian,

    We have no plans to integrate TokuDB with Percona Server 5.5, but we may do this for Percona Server 5.6 and this may happen this year.

  18. Crap, given the performance issues the 5.6 branch has been having, I was going to delay using it until next year sometime.

  19. Michael Andrews says:

    Following Brian’s question, do you plan to incorporate Tokudb with Xtradb Cluster in the future?

  20. @Michael,

    Adjusting TokuDB to support clustering environment requires a lot of fixes inside TokuDB,
    and at this moment only Tokutek has enough expertise to do that.
    So this question is totally to Tokutek team what is their plan regarding clustering.

  21. Peter (Stig) Edwards says:

    I have used the ARCHIVE storage engine with *many* tables for non production critical timeseries data, a high compression ratio, high insertion rate and low memory footprint was required. The data was daily linux process accounting data which was in parallel bulk loaded once per day per host into a hostname_yyyymmdd specific table, typically read only a few times with full table scans to aggregate data before being dropped after X days. I expect the sensor data needs to be loaded more frequently than once per day and that there are several other reasons why the ARCHIVE storage engine would not be suitable (SELECT response time requirement for one), but the comments about optimizing bulk loading (in my case into empty tables) in parallel and compression operating on large chunks struck a chord. I found the space consumption and initial load time using *many* ARCHIVE tables in parallel (for my usage pattern and requirements) very compelling. At the time TokuDB was not an option for me, but I have tested it for other projects and have been very impressed.

    http://dev.mysql.com/doc/refman/5.6/en/archive-storage-engine.html

  22. My experiece with LOAD DATA on TokuDB shows that when table is already loaded it is *slower* to some extent than InnoDB. Still need to understand why.

    Parallelizing LOAD DATA for same InnoDB table is bad; I tried this recently with obvious results: a LOT longer than single threaded. I did not try the same on TokuDB as yet.

    I also used ARCHIVE tables twice (they were already existing in customers DBs). In both cases I very quickly ran into table corruption. ARCHIVE and MyISAM are a big NO in my dictionary for data you care about.

  23. Kaifeng Chen says:

    I also get the same conclusion that TokuDB slow down then load speed when the table is not empty. For example, the time cost to load 1,500,000 row from file to table is 41s, while the time is 1min 22s to load the same data to the same table.

    It dosn’t help to set the read_buffer_size to a much larger value, say 512m.

    From the TokuDB document, MySQL uses normal load data instead of parallel one when the table is not empty. I was wondering if there is any other setting to speed up loading to the not-empty table.

  24. @Kaifeng, TokuDB implements a bulk-loader when loading into an empty table. The bulk-loader is creating Fractal Tree indexes directly (outside of MySQL), and is therefore much faster than loading into a table containing data. I do not understand what you mean by your read_buffer_size comment, that appears to be a MyISAM variable.

    Lastly, the performance of inserting into non-empty TokuDB tables is driven by the schema of the table and the tuples themselves, meaning:

    Schema: if the table’s PK is auto-increment and there are no unique secondary indexes, insertions will be fast (and require no I/O except for logging and/or fsyncs). Batching several inserts per transaction or using multiple concurrent insert clients will go even faster.

    Tuples: (and this is where Vadim’s example schema is affecting the insertion performance of TokuDB). If the table has a declared primary key (and non-auto-increment) and/or a unique secondary index, then the tuples must arrive in a “right-most” pattern for TokuDB to avoid I/O and thus be fast. Vadim declared his primary key to be (sensor_id,ts), and since sensor data is arriving from all sensors simultaneously this becomes a “random” insertion pattern. Each arriving tuple is likely to end up anywhere in the bottom level of the tree. By switching the primary key order to be (ts,sensor_id) then all arriving tuples are arriving on the right side of the tree, which is easy to keep in memory and avoid I/O.

    I hope this helps describe the performance and behavior of TokuDB.

    -Tim

  25. gpfeng.cs says:

    I got the same results:

    1. load data into an empty TokuDB table is much faster than InnoDB tables

    2. when split the LARGE data file(500 million rows, auto_increment primary key, no unique keys) into several segments(1 million rows per segment) and load one by one(to avoid the “Multi-statement transaction required more than ‘max_binlog_cache_size’ bytes of storage” error), the load speed of the first segment is much faster than InnoDB, but the remained segments are loaded much slower than InnoDB, results: http://www.gpfeng.com/wp-content/uploads/2013/07/load_innodb_vs_tokudb.jpg

  26. @gpfeng.cs, I’m not sure I understand the use-case that your benchmark simulates. In #2, are you creating any secondary indexes at all? TokuDB’s real insertion performance advantage is seen when you have non-unique secondary indexes _and_ your working set of data is larger than RAM. Based on the information you provided I assume that was not present in your test.

    Also, in your test were you using InnoDB compression or not? If so, what was the difference in compression achieved? If not, I’d be interested in seeing the performance difference when InnoDB compression is enabled.

    -Tim

  27. gpfeng.cs says:

    @Tim Callaghan

    There is no secondary index at all in #2, only auto_increment primary key exists, and compression is not used. InnoDB will not suffer from random IO in this scenario, what really confused me is that TokuDB performs well at the beginning but falls behind InnoDB when loading the remaining segments, and now I know that it benefits from the “LOAD DATA optimization” in TokuDB and only the first file can be loaded more efficiently.

  28. @gpfeng.cs, Then my last question is whether or not you are using InnoDB compression in this example (and does compression matter to your use-case). If your not using compression and have no secondary indexes, then there are no advantages to using TokuDB for your use-case. Our Fractal Tree indexes are ideal for maintaining non-unique secondary indexes, allowing better query performance as more indexes exist.

    … However, you could to add the following to your benchmark …

    Let the data get large, and “alter table sensordata add column data7 double null;”. TokuDB’s hot schema change functionality will give you the column immediately. With InnoDB choose one of the following:
    (1) get a nice big system maintenance window, as the table is now read only during the rebuild
    (2) use an online schema change tool and watch the performance of your insertion workload take a nose-dive as a new table is created in the background
    (3) disconnect a slave, perform this operation on the slave (and wait for the table to be rebuilt), catch up to the master, switch slave to master and master to slave, …

    The bottom line is, TokuDB excels at Performance (maintaining secondary indexes, support for clustering secondary indexes, and more), Compression, and Agility (hot add/drop/expand columns, hot add indexes).

    -Tim

  29. gpfeng.cs says:

    @Tim Callaghan

    Thanks for your kind reply.

    I konw that TokuDB is a write-optimized storage engine, the b-tree with buffers, which I can summarize from the implementation of the Fractal Tree in TokuDB can help transform random IO to sequential IO(another saying: delay small write until lots of writes have been gathered) with the cost of read performance.

    I have used the iiBench tool to verify that, the insert performance(with 3 non-unique secondary indexes) remains pretty good for TokuDB when the table becomes huge, and hot schema change(add/delete columns) is really attractive.

    In my LOAD DATA test, data can’t be loaded once, and no secondary index is created, I just want to figure out the load performance between InnoDB(no compression) and TokuDB when I heard that TokuDB has done some optimizations on LOAD DATA.

  30. @gpfeng.cs

    Fair enough. TokuDB’s bulk-loader can only be used by the “LOAD DATA” command when the target table is empty. Our bulk-loader sorts the data and builds the Fractal Tree indexes outside of MySQL which cannot be done if the table already contains data.

  31. snicks1 says:

    And for the jointure?

Speak Your Mind

*