Identifying major sources of variance in a traditional DBMS: A Case for Predictable Databases

High Availability
14 April 4:50pm - 5:40pm @ Room 203

Duration: 50-minute conference session

Consistency and predictability of performance are often as important as the performance itself. Many time-sensitive applications rely on their underlying database to deliver a required level of performance in order to meet their SLA requirements. As a result, in many enterprise applications, sudden changes in performance are often directly linked to loss of revenue. While many researchers and developers have focused on improving performance (e.g., latency or throughput), the sustainability and variance of this performance have been surprisingly neglected. In our research group, we have uncovered the various sources of variance in MySQL’s performance by carefully profiling its execution under numerous workloads. Our results show that, even in the absence of any variation in the workload or the type of transaction executed in the system, MySQL exhibits extremely high variance in its transaction latencies, with a standard deviation that is 5x larger than the average latency and a 99th percentile that is more than 50x the average latency. In this talk, we will present a principled framework for tracing the various factors that cause MySQL’s performance to suffer from high variance. We identify the major sources of performance variability and unpredictability in the current implementation of MySQL and present our alternative proposal for a DBMS that can guarantee transaction latencies that are not only low on average but also low in variance.
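
To make the variance figures quoted above concrete, the following is a minimal sketch of how the standard deviation and 99th percentile of a latency sample can be compared against the mean. It is not the speakers' profiling framework: the function name `latency_variance_summary`, the nearest-rank percentile definition, and the sample data are all assumptions chosen for illustration.

```python
import statistics


def latency_variance_summary(latencies_ms):
    """Summarize how spread out a list of transaction latencies is.

    Hypothetical helper for illustration only. Returns the mean, the
    population standard deviation, the 99th percentile (nearest-rank
    definition), and the ratios of the last two to the mean.
    """
    if not latencies_ms:
        raise ValueError("need at least one latency sample")

    ordered = sorted(latencies_ms)
    n = len(ordered)

    mean = statistics.fmean(ordered)
    stdev = statistics.pstdev(ordered)

    # Nearest-rank percentile: rank = ceil(0.99 * n), computed with
    # integer arithmetic to avoid floating-point rounding surprises.
    rank = -(-n * 99 // 100)
    p99 = ordered[rank - 1]

    return {
        "mean": mean,
        "stdev": stdev,
        "p99": p99,
        "stdev_over_mean": stdev / mean,
        "p99_over_mean": p99 / mean,
    }


if __name__ == "__main__":
    # Synthetic sample (an assumption, not measured data): most
    # transactions are fast, while a few are far slower.
    sample = [1.0] * 98 + [101.0] * 2
    for name, value in latency_variance_summary(sample).items():
        print(f"{name}: {value:.2f}")
```

Reporting the two ratios rather than the raw values makes runs with different average latencies comparable, which is the form in which the abstract states its 5x and 50x figures.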


Speakers

Barzan Mozafari
Assistant Professor, University of Michigan, Ann Arbor
Biography: 
Barzan Mozafari is an Assistant Professor of Computer Science and Engineering at the University of Michigan (Ann Arbor), where he is a member of the Michigan Database Group and the Software Systems Lab. Prior to that, he was a Postdoctoral Associate at the Massachusetts Institute of Technology. He earned his Ph.D. in Computer Science from the University of California at Los Angeles. He is passionate about building large-scale data-intensive systems, with a particular interest in database-as-a-service clouds, distributed systems, and crowdsourcing. In his research, he draws on advanced mathematical models to deliver practical database solutions. He has won several awards and fellowships, including the SIGMOD 2012 and EuroSys 2013 best paper awards.