Shashank is a Principal Site Reliability Engineer at ThousandEyes working on Data Infrastructure. As a technical leader, he spends his time scaling MySQL, MongoDB and Kafka. Previously, he bootstrapped the Insights team at ThousandEyes, building a platform for all things observability.
Slow queries are a harsh reality of any database. No matter how you build the system, design the data, educate developers and control access, slow queries are very hard to prevent. They degrade the performance of the database and thus of every application using it, so it's crucial to monitor them. Otherwise, you will end up discovering them while troubleshooting a production database performance issue.
At ThousandEyes, we have taken a proactive and automated approach to monitoring slow queries in our MySQL fleet. We have built a fully automated pipeline from open source tools (percona-toolkit and Anemometer) and an in-house tool, Slow Query Notifier, to catch these slow queries and notify us. Slow Query Notifier identifies top offenders and poorly designed queries, focusing on the most impactful culprits. It notifies the author of each query through our JIRA ticketing system and can manage a complex JIRA workflow: creating issues, reopening them, and updating their priorities.
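To make the workflow above concrete, here is a minimal sketch of how a notifier of this kind might rank offenders and decide what to do with a ticket. It is not the actual Slow Query Notifier code: the `QueryStats` and `Ticket` types, the priority thresholds, and the action names are all hypothetical, and the JIRA calls themselves are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueryStats:
    fingerprint: str     # normalized query text (hypothetical field)
    total_time_s: float  # total execution time in the reporting window
    count: int           # number of executions in the window


@dataclass
class Ticket:
    status: str    # "open" or "closed" (hypothetical states)
    priority: str  # "low", "medium" or "high"


def priority_for(total_time_s: float) -> str:
    """Map total time to a priority; thresholds are assumptions."""
    if total_time_s >= 3600:
        return "high"
    if total_time_s >= 600:
        return "medium"
    return "low"


def top_offenders(stats: List[QueryStats], n: int) -> List[QueryStats]:
    """Return the n queries with the largest total execution time."""
    return sorted(stats, key=lambda s: s.total_time_s, reverse=True)[:n]


def decide_action(query: QueryStats, existing: Optional[Ticket]) -> str:
    """Pick one workflow step for a query given its existing ticket."""
    if existing is None:
        return "create"           # no ticket yet for this query
    if existing.status == "closed":
        return "reopen"           # the slow query has come back
    if existing.priority != priority_for(query.total_time_s):
        return "update_priority"  # impact changed since last report
    return "none"                 # ticket already reflects current state
```

In a sketch like this, ranking by total time rather than by the slowest single execution keeps the focus on the queries that cost the database the most overall.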
In this presentation, we go over the importance of proactively monitoring slow queries and share our design and the lessons we have learned. We also share our goals: to open source slow-query-notifier, integrate with PMM query analytics, and add support for MongoDB slow queries.