Buy Percona ServicesBuy Now!

Evaluating Database Compression Methods: Update

and  | April 13, 2016 |  Posted In: Benchmarks, MySQL


Database Compression MethodsThis blog post is an update to our last post discussing database compression methods, and how they stack up against each other. 

When Vadim and I wrote about Evaluating Database Compression Methods last month, we claimed that evaluating database compression algorithms was easy these days because there are ready-to-use benchmark suites such as lzbench.

As easy as it was to do an evaluation with this tool, it turned out it was also easy to make a mistake. Due to a bug in the benchmark we got incorrect results for the LZ4 compression algorithm, and as such made some incorrect claims and observations in the original article. A big thank you to Yann Collet for reporting the issue!

In this post, we will restate and correct the important observations and recommendations that were incorrect in the last post. You can view the fully updated results in this document.

Compression Method

As you can see above, there was little change in compression performance. LZ4 is still the fastest, though not as fast after correcting the issue.

Compression Ratio

The compression ratio is where our results changed substantially. We reported LZ4 achieving a compression ratio of only 1.89 — by far lowest among compression engines we compared. In fact, after our correction, the ratio is 3.89 — better than Snappy and on par with QuickLZ (while also having much better performance).  

LZ4 is a superior engine in terms of the compression ratio achieved versus the CPU spent.

Compression vs Decompression

The compression versus decompression graph now shows LZ4 has the highest ratio between compression and decompression performance of the compression engines we looked at.

Compression Speed vs Block Size

The compression speed was not significantly affected by the LZ4 block size, which makes it great for compressing both large and small objects. The highest compression speed achieved was with a block size of 64KB — not the highest size, but not the smallest either among the sizes tested.

Compression Speed vs Block Size

We saw some positive impact on the compression ratio by increasing the block size, However, increasing the block size over 64K did not substantially improve the compression ratio, making 64K an excellent block for LZ4, where it had the best compression speed and about as-good-as-it-gets compression. A 64K block size works great for other data as well, though we can’t say how universal it is.

Scatterplot with compression speed vs compression ratio



Updated Recommendations

Most of our recommendations still stand after reviewing the updated results, with one important change. If you’re looking for a fast compression algorithm that has decent compression, consider LZ4.  It offers better performance as well as a better compression ratio, at least on the data sets we tested.


Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Vadim Tkachenko

Vadim Tkachenko co-founded Percona in 2006 and serves as its Chief Technology Officer. Vadim leads Percona Labs, which focuses on technology research and performance evaluations of Percona’s and third-party products. Percona Labs designs no-gimmick tests of hardware, filesystems, storage engines, and databases that surpass the standard performance and functionality scenario benchmarks. Vadim’s expertise in LAMP performance and multi-threaded programming help optimize MySQL and InnoDB internals to take full advantage of modern hardware. Oracle Corporation and its predecessors have incorporated Vadim’s source code patches into the mainstream MySQL and InnoDB products. He also co-authored the book High Performance MySQL: Optimization, Backups, and Replication 3rd Edition.


Leave a Reply