Global Transaction IDs (GTIDs) are one of my favorite features of MySQL 5.6. The main limitation is that you must stop all the servers at the same time to enable GTID replication. Not everyone can afford downtime, so this requirement has been a showstopper for many people. Starting with Percona Server 5.6.22-72.0, GTID replication can be enabled with almost no downtime. Let’s see how to do it.
Implementation of the Facebook patch
Finding a solution to migrate to GTIDs with no downtime is not a new idea, and several companies have already developed their own patch. The two best-known implementations are the one from Facebook and the one from Booking.com.
Both options have pros and cons, and we finally chose to port the Facebook patch and add a new setting (gtid_deployment_step).
Performing the migration
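Let’s assume we have a master-slaves setup with 4 servers A, B, C and D, where A is the master.
The 1st step is to set gtid_mode = ON and gtid_deployment_step = ON on the slaves (B, C and D), restarting them one at a time.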
gtid_deployment_step = ON means that a server will not generate GTIDs when it executes writes, but it will record a GTID in its binary log if it gets an event from the replication stream tagged with a GTID.
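As a quick sanity check, the statements below are a minimal sketch of how the state of a slave could be verified, assuming it has already been restarted with gtid_mode = ON and gtid_deployment_step = ON. In MySQL 5.6, gtid_mode = ON also requires binary logging, log_slave_updates and enforce_gtid_consistency, so those are checked too.

```sql
-- Run on each slave (B, C and D) after it has been restarted.
-- Both GTID settings are expected to report ON at this stage.
SHOW GLOBAL VARIABLES LIKE 'gtid_mode';
SHOW GLOBAL VARIABLES LIKE 'gtid_deployment_step';

-- Prerequisites of gtid_mode = ON in MySQL 5.6.
SHOW GLOBAL VARIABLES LIKE 'log_bin';
SHOW GLOBAL VARIABLES LIKE 'log_slave_updates';
SHOW GLOBAL VARIABLES LIKE 'enforce_gtid_consistency';
```

While the master still runs with gtid_mode = OFF, no GTIDs reach the slaves, so an empty gtid_executed on them is expected at this point.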
The 2nd step is to promote one of the slaves to become the new master (for instance C) and to disable gtid_deployment_step on it. It is a regular slave promotion so you should do it the same way you deal with planned slave promotions (for instance using MHA or your own scripts). Our patch doesn’t help you do this promotion.
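Once the promotion itself is done with your usual procedure, the GTID-specific part of this step could look like the sketch below. It assumes C is the server that was promoted; the promotion and the repointing of the other slaves are not shown because they depend on your own tooling.

```sql
-- On C, the newly promoted master.
-- With gtid_deployment_step disabled, C starts generating GTIDs for its own writes.
SET GLOBAL gtid_deployment_step = OFF;

-- After a few writes on C, the set of executed GTIDs should no longer be empty.
SELECT @@GLOBAL.gtid_executed;
```

Remember to also remove gtid_deployment_step from the my.cnf file of C, otherwise the setting comes back at the next restart.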
At this point replication will break on the old master as it has gtid_mode = OFF and gtid_deployment_step = OFF.
The 3rd step is to restart the old master to set gtid_mode = ON. Replication will resume automatically, but don’t forget to set MASTER_AUTO_POSITION = 1.
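Switching a replica to GTID-based positioning could be done as in the sketch below. It assumes the server is already configured as a slave of the new master and runs with gtid_mode = ON; the same statements apply to the other slaves.

```sql
-- On the old master (A), now replicating from the new master,
-- and likewise on the other slaves.
STOP SLAVE;
CHANGE MASTER TO MASTER_AUTO_POSITION = 1;
START SLAVE;

-- The Auto_Position field should now show 1, and both the IO and SQL
-- threads should be running.
SHOW SLAVE STATUS\G
```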
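Finally, disable gtid_deployment_step on every server where it is still enabled. This can be done dynamically:
mysql> SET GLOBAL gtid_deployment_step = OFF;
You should also remove the setting from the my.cnf file so that it is not set again when the server is restarted.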
Optionally, you can promote the old master back to its original role.
That’s it, GTID replication is now available without having restarted all servers at the same time!
At some point during the migration, a slave promotion is needed. And at this point, you are still using position-based replication. The patch will not help you with this promotion so use your regular failover scripts. If you have no scripts to deal with that kind of situation, make sure you know how to proceed.
Also be aware that this patch provides a way to migrate to GTIDs with no downtime, but not a way to migrate away from GTIDs with no downtime. So test carefully and make sure you understand all the new stuff that comes with GTIDs, like the new replication protocol, or how to skip transactions.
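To illustrate how skipping a transaction changes with GTIDs: sql_slave_skip_counter cannot be used when gtid_mode = ON, and the usual approach is to inject an empty transaction carrying the GTID you want to skip. The sketch below shows this; the GTID value is a made-up placeholder, so replace it with the GTID of the offending transaction taken from SHOW SLAVE STATUS or the binary log.

```sql
-- On the slave where replication is stopped by the failing transaction.
STOP SLAVE;

-- Hypothetical GTID, for illustration only: <source server UUID>:<transaction number>.
SET GTID_NEXT = '3E11FA47-71CA-11E1-9E33-C80AA9429562:23';

-- Commit an empty transaction under that GTID so the slave considers it executed.
BEGIN;
COMMIT;

-- Go back to automatic GTID assignment before resuming replication.
SET GTID_NEXT = 'AUTOMATIC';
START SLAVE;
```

Skipping a transaction leaves the slave's data different from the master's, so it is worth checking consistency afterwards.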
If you are using master-master replication or multiple tier replication, you can follow the same steps. With multiple tier replication, simply start by setting gtid_mode = ON and gtid_deployment_step = ON for the leaves first.
If you’re interested in the benefits of GTID replication but the required downtime has always scared you, you should definitely download the latest Percona Server 5.6 and give it a try!