Jordan has been at Shopify for the past five years and has spent the last three of them working on MySQL high availability, automation, and performance tuning.
In this session, we will discuss our fully automated failover solution, which runs in containers on Kubernetes. We use Orchestrator for MySQL failovers, ProxySQL to route queries, and Taiji, a Zookeeper-backed application we wrote, for service discovery, so database failures and topology changes are handled without any human intervention. The system tolerates network partitions and connectivity issues, node failures, and even full region outages.
After extending Orchestrator with additional functionality, we deployed it with the Raft consensus protocol and automatic failovers enabled. ProxySQL is deployed alongside a Taiji container that watches for changes in Zookeeper. All topology changes are pushed to Zookeeper automatically, both by Orchestrator callback scripts and by a Taiji agent that performs health checks on the databases. These changes reach ProxySQL in less than a second, so our application seamlessly begins sending read and write queries to the correct database.
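
To make the propagation step more concrete, here is a minimal sketch of how a topology record read from Zookeeper might be translated into statements for the ProxySQL admin interface. It is an illustration under stated assumptions, not Taiji's actual implementation: the JSON layout, the hostgroup numbering, and the function name `proxysql_statements` are all hypothetical. Only the `mysql_servers` table and the `LOAD MYSQL SERVERS TO RUNTIME` command come from ProxySQL itself.

```python
import json
import re

# Assumed hostgroup numbering; a real deployment defines its own.
WRITER_HOSTGROUP = 0
READER_HOSTGROUP = 1

# Conservative hostname check, since values are interpolated into SQL text.
_HOSTNAME = re.compile(r"[A-Za-z0-9.-]+")


def proxysql_statements(topology_json: str) -> list[str]:
    """Build ProxySQL admin statements from a (hypothetical) topology record.

    Expected input shape, assumed for this sketch:
        {"writer": "host:port", "readers": ["host:port", ...]}
    """
    topology = json.loads(topology_json)

    # One writer row, then reader rows in a stable (sorted) order.
    rows = [(WRITER_HOSTGROUP, topology["writer"])]
    rows += [(READER_HOSTGROUP, host) for host in sorted(topology["readers"])]

    statements = ["DELETE FROM mysql_servers"]
    for hostgroup, address in rows:
        hostname, port = address.rsplit(":", 1)
        if not _HOSTNAME.fullmatch(hostname):
            raise ValueError(f"unexpected hostname: {hostname!r}")
        statements.append(
            "INSERT INTO mysql_servers (hostgroup_id, hostname, port) "
            f"VALUES ({hostgroup}, '{hostname}', {int(port)})"
        )

    # Changes to mysql_servers only take effect once loaded to runtime.
    statements.append("LOAD MYSQL SERVERS TO RUNTIME")
    return statements
```

In a setup like the one described, a watcher process would call such a function whenever the Zookeeper record changes and send the resulting statements to the ProxySQL admin port. Rebuilding the whole server list on every change keeps the sketch simple; a production system might instead apply only the difference.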