While GitHub doesn't run the biggest database around in terms of the amount of data we hold in MySQL, it is among the top 50 busiest sites on the internet. Facing an immediate need to distribute load, we came up with creative ways to move a significant amount of traffic off our main MySQL cluster with no user impact. Moving five of our hottest tables required collaboration between engineers, DBAs, and SREs. This talk will describe when and how to move tables between clusters, and make the case that doing so is an efficient database scalability solution.
Moving tables required changes to our database infrastructure as well as our application. I'll explain the impetus for this work. We'll walk through the application-level changes that allowed us to change connections while still serving data. Then I'll discuss the ways we moved tables to different clusters: using MySQL replication or, in some cases, temporary sharding and copying billions of rows. Finally, I'll outline the orchestration of the actual cutovers.
Bryana Knight works on the platform-data team at GitHub. She is a tech lead for the Boston chapter of Women Who Code. She has worked both as a full-stack developer implementing user-facing features and as a back-end engineer, most recently focusing on scaling back-end services that touch GitHub's MySQL database and other data stores.