Shlomi is an engineer and a database geek. He is an active MySQL community member, authors gh-ost, orchestrator, common_schema and other open source tools, and blogs at http://openark.org. Shlomi works at GitHub on the database infrastructure team solving high availability, reliability, enablement, automation and testing. Previously he managed high availability of X,000 of MySQL servers at Booking.com, and solved infrastructure problems at Outbrain. He is a recipient of the Oracle ACE, Oracle Technologist of the Year, and MySQL Community Member of the Year awards.
Orchestrator uses Raft consensus as of version 3.x. This setup improves the high availability of both the orchestrator service itself and the managed topologies, and allows for easier operations.
This session will briefly introduce Raft consensus concepts, and elaborate on orchestrator's use of Raft: from leader election, through high availability, cross-DC deployments and DC fencing mitigation, to lightweight deployments with SQLite.
Of course, nothing comes for free, and we will discuss considerations for using Raft: expected impact, eventual consistency and time-based assumptions.
Orchestrator/Raft runs in production at GitHub, Wix and in other large and busy deployments.
Orchestrator is a MySQL topology manager and a failover solution, used in production on many large MySQL installations. It allows for detecting, querying and refactoring complex replication topologies, and provides reliable failure detection and intelligent recovery and promotion.
This practical tutorial focuses on and demonstrates Orchestrator's failure detection and recovery, and provides real-world examples and cookbooks for handling failovers.
- Brief introduction to Orchestrator
- Brief overview of basic configuration
- Reliable detection
- The complexity of successful failover
- Orchestrator's approach to failover
- Failover meta: anti-flapping, acknowledgments, auditing, downtime, promotion rules
- Master service discovery schemes: VIP, DNS, Proxy, Consul
- Cookbooks and considerations for master service discovery and for failover configuration
We will run demos in class. Time permitting, attendees may try hands-on operations.