The MySQL replication and HA suite of tools and technologies ensures efficient and safe replication at Facebook scale. Despite many kinds of process and system failures, MySQL continues to safely replicate trillions of transactions a year. The MySQL Replication/HA stack at Facebook that delivers this scale comprises the Facebook-enhanced Semi-Sync replication plugin, Binlog Server, and a suite of high-availability tools called DBStatus, Logtailer and FastFailover. Come learn about these technologies and how we handle the challenges of scale.
In the second portion of this two-part talk, we will focus on the automation we have been building on top of FB MySQL replication technologies. This automation maintains MySQL high availability by handling everything from day-to-day hiccups to large-scale disasters. The areas we will cover include automatic master failover, data consistency invariants, replication failure domains, power-loss failure recovery and continuous disaster drills.
Jeff Jiang is currently a Production Engineer at Facebook, where he leads the development of MySQL disaster recovery and high availability solutions and has delivered extensive work to improve Facebook's MySQL data consistency and availability. Prior to Facebook, Jeff worked at Google on database systems such as Spanner/Cloud Spanner and Vitess. There he contributed to reducing Golang GC latency and was one of the initial SREs to launch Cloud Spanner and its Golang library. Jeff earned a Master's degree in Computer Science at Cornell University.