Andy Pavlo is an Assistant Professor of Databaseology in the Computer Science Department at Carnegie Mellon University.
In the last 20 years, researchers and vendors have built advisory tools to assist DBAs in tuning and physical design. Most of this previous work is incomplete because it requires humans to make the final decisions about any database changes and offers only reactive measures that fix problems after they occur. What a "self-driving" DBMS needs are components that are designed for autonomous operation. This will enable new optimizations that are not possible today because the complexity of managing these systems has surpassed the abilities of humans.
In this talk, I present the core principles of an autonomous DBMS based on reinforcement learning. These principles are necessary to support ample data collection, fast state changes, and accurate reward observations. I will discuss techniques for building a new autonomous DBMS or retrofitting an existing one. Our work draws on our experience at CMU developing an automatic tuning service (OtterTune) and our self-driving DBMS (Peloton).
Automatization of Postgres Administration
Cloud services like Amazon RDS or Google Cloud SQL help automate half of a DBA's tasks: launching database instances, provisioning replicas, and creating backups. But the other, very important half remains largely unautomated: database tuning and query optimization.
High Performance, Scalable, and Available MySQL Clustering System for the Cloud
Vitess is now used in production at multiple companies. Vitess shines in this area by providing query logs, transaction logs, information URLs, and status variables that can feed into a monitoring system like Prometheus.
Ghostferry: the Swiss Army Knife of Live Data Migrations with Minimum Downtime
Inspired by gh-ost, our tool is named Ghostferry and allows application developers at Shopify to migrate data without assistance from DBAs. We plan to open source Ghostferry at the conference so that anyone can migrate their own data with minimal hassle and downtime.
What is a Self-Driving Database Management System?
People are touting the rise of "self-driving" database management systems (DBMSs). But nobody has clearly defined what it means for a DBMS to be self-driving. Thus, in this keynote, Andy provides the history of autonomous databases and what is needed to make a true self-driving DBMS.