Percona Live: Data Performance Conference 2016

April 18-21, 2016

Santa Clara, California

The Language of Compression Benchmarking

 19 April 11:30 AM - 12:20 PM @ Ballroom C
50 minutes conference
Operations and Management


Whether your data lives in MySQL, a NoSQL store, Hadoop, or somewhere in the cloud, you're likely paying decent money for storage and IOPS. With ever-growing data volumes, SSDs needed to cut latency, and replication needed to provide insurance, your storage footprint is an important place to look for savings. It makes sense, then, that so many storage vendors tout compression as a key metric and differentiator. Yet the language vendors and users employ to reason about storage footprint and compression is embarrassingly vague, if not meaningless or downright deceptive. We can do better, and we must. In this talk, we'll discuss each part of the durable storage stack, from the hardware on up, and how usage numbers can take on different meanings at each layer. We'll cover what's important to know at each layer, and how to think and talk about concepts like compression, fragmentation, write amplification, and wear leveling. Finally, we'll see the different ways benchmarketers present data to lie to you, and learn some techniques for identifying and cutting through those lies.


Leif Walsh

Engineer, Two Sigma


Leif Walsh worked on TokuMX at Tokutek. He also worked on performance-critical software at Google and Microsoft, and helped start RethinkDB. Leif studied math and computer science at Stony Brook University. In his spare time, he is an amateur lithography assistant.
