Database management systems (DBMSs) are notoriously difficult to deploy and administer because of their long list of functionalities. If a system could optimize itself automatically, then it would remove many of the complications and costs involved with its deployment. Most of the advisory tools built by researchers and vendors are incomplete because they require humans to make the final decisions about any database change and only fix problems after they occur. Recent work has proposed "self-driving" DBMSs that optimize the system for both the application's current workload and its expected future workload. These systems will support existing tuning techniques and capacity planning without requiring a human to determine the right way and proper time to deploy them.
The first step towards such an autonomous DBMS is the ability to model and predict the target application's workload. In this talk, I present a robust forecasting framework called "QueryBot 5000" that we designed for self-driving operations. The framework integrates with any DBMS to predict the expected arrival rate of queries in the future based on historical data. It then provides multiple prediction horizons (short- vs. long-term) with varying aggregation intervals. I also discuss our vision and progress on how a self-driving DBMS uses these forecast models to optimize its performance.
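To make the idea of arrival-rate forecasting over different aggregation intervals concrete, here is a minimal Python sketch. It is not the QueryBot 5000 implementation: the function names `arrival_counts` and `moving_average_forecast` are hypothetical, and a plain moving average stands in for whatever forecasting models the framework actually uses.

```python
from typing import Iterable, List


def arrival_counts(timestamps: Iterable[float], interval: float,
                   start: float, end: float) -> List[int]:
    """Count query arrivals per fixed-width interval in [start, end).

    Hypothetical helper: `interval` is the aggregation interval in the
    same unit as the timestamps (for example, seconds).
    """
    n_buckets = int((end - start) // interval)
    counts = [0] * n_buckets
    for ts in timestamps:
        idx = int((ts - start) // interval)
        if 0 <= idx < n_buckets:
            counts[idx] += 1
    return counts


def moving_average_forecast(counts: List[int], window: int) -> float:
    """Predict the next interval's arrival count as the mean of the
    last `window` observed intervals (a stand-in model, assumed here
    purely for illustration)."""
    if window < 1 or not counts:
        raise ValueError("need at least one observation and window >= 1")
    recent = counts[-window:]
    return sum(recent) / len(recent)


# Arrival times (seconds) of one query template, invented for the example.
history = [0, 10, 20, 65, 70, 130, 140, 150, 170]

# Fine-grained aggregation: one-minute buckets for a short-term horizon.
per_minute = arrival_counts(history, interval=60, start=0, end=180)
short_term = moving_average_forecast(per_minute, window=2)

# Coarse aggregation: a single three-minute bucket for a longer horizon.
per_three_minutes = arrival_counts(history, interval=180, start=0, end=180)
```

The same history yields different series depending on the aggregation interval: shorter intervals keep more detail for short-term predictions, while longer intervals smooth out noise for long-term ones.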
Infrastructure automation is not easy, especially for stateful services like MySQL (or any other database, for that matter). It goes way beyond the capabilities of Ansible, Chef, SaltStack or other similar tools. In this session I'm going to show you how we went from fully manual operations to a self-healing system in less than a year at Salesforce. Having done this at several companies already, I've seen the common mistakes that can break your system and turn your well-intentioned scheduler, scripts, or orchestrator into a ticking bomb. I will share how to avoid these problems and build a robust and scalable automation framework that's been battle-tested at companies such as Booking.com and Dropbox.
We will cover:
* Tool comparison
* Centralised vs decentralised system
* Concurrency handling
* Best practices and anti-patterns
At Square, we operate thousands of database instances to power a financial network, from payments to payroll. In a word: money. "Mission-critical" isn't critical enough. Come learn how we operate MySQL and Redis with billions of dollars at stake. We'll look at everything: configuration, management, monitoring, tooling, security, high-availability, replication, etc.
In this session we'll discuss how we use Ansible to manage the internal MySQL services at DigitalOcean, where it has worked well for us, and some of the issues we've experienced along the way.
We'll dive into how we use Ansible to manage MySQL, ProxySQL & Orchestrator and other related technologies in our environment, and will discuss topics such as static vs dynamic config management, Ansible performance tuning & anti-patterns, as well as testing strategies.
At the end of this session, you should have a good understanding of how Ansible could help in your environment, as well as the potential limitations you may need to think about.
Our world is moving fast, and data is piling up. There was a time when a DBA managed a few machines, then a few hundred, and now each DBA needs to handle thousands. To enable this, we are going to talk about how we used to monitor, then how SRE has changed the rules. We will also cover why your DBA is still needed even in SRE, but with a different function. Finally, we will get into migrating and enabling SRE functions with Machine Learning or AIOps. This will include a discussion of what AIOps for databases is not, and how you can get there.
* Classic Database Operations and Deployment
* SRE Concepts for the DBA
* Automating DBA tasks and investigations
* Monitoring of Yesterday
* Automating Alert Handling
* Machine Learning and how it helps SRE
* What not to expect from ML and AIOps
* Monasca as a good architecture template
Based on our experience deploying PMM in hundreds of different environments, in this session I'm going to show how to manage PMM in production using Ansible for automation.
Some of the automation tasks include:
- PMM Server Deployment
- PMM Client for multiple server types (MySQL, ProxySQL, ...)
- PMM managed for RDS
- Exporter monitoring