Database Stalls, From the Ordinary to the Obscure
26 April - 2:00 PM - 2:50 PM @ Room 210
50-minute conference talk
VividCortex monitors many production database servers, which means we see many different database problems. One type of problem we like to focus on is the database stall. We define a stall as a short period of time, typically about one second, during which work isn’t getting done. It’s easy to see when a database isn’t performing its work as usual, but finding the cause is much harder. I’ll talk about the metrics and instrumentation I rely on to diagnose obscure stalls, and how to develop a “work-centric” monitoring process to solve problems. I’ll also cover the basics of back pressure, and how applications should react to stalls to avoid query stampedes and cascading failures.
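To make the definition of a stall concrete, here is a minimal sketch of a work-centric check. It is not VividCortex's method. It assumes you already collect a per-second count of completed queries, and it flags any second whose count falls far below the recent norm. The function name `find_stalls` and the parameters `window` and `threshold` are hypothetical, chosen only for illustration.

```python
from statistics import median


def find_stalls(completed_per_second, window=10, threshold=0.1):
    """Return the indices of seconds that look like stalls.

    A second is flagged when its completed-query count is below
    `threshold` times the median of the preceding `window` seconds.
    Both parameters are illustrative assumptions, not tuned values.
    """
    stalls = []
    for i, count in enumerate(completed_per_second):
        history = completed_per_second[max(0, i - window):i]
        if len(history) < window:
            continue  # not enough history yet to judge this second
        baseline = median(history)
        # Skip idle periods: no work was expected, so nothing stalled.
        if baseline > 0 and count < threshold * baseline:
            stalls.append(i)
    return stalls


# One bad second in an otherwise steady workload.
series = [100] * 10 + [2] + [100] * 5
print(find_stalls(series))
```

Comparing against a trailing median rather than a fixed number keeps the check tied to the work the server normally does, so a quiet server is not reported as stalled.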
Preetam is an engineer at VividCortex, where he works on anomaly detection and back-end systems.