Few months ago, I wrote about a faster way to do certain table modifications online. It works well when all you want is to remove auto_increment or change ENUM values. When it comes to changes that really require table to be rebuilt – adding/dropping columns or indexes, changing data type, converting data to different character set – MySQL master-master replication especially accompanied by MMM can be very handy to do the changes with virtually no downtime.
Couple of days ago I worked with one of our MySQL support customers as they were upgrading their application and mysql schema. We deployed and used MySQL Master-Master replication manager (MMM) ever since we started working so doing all the schema changes synchronously and with only couple seconds of “downtime” was really trivial. I’d like to share my experience.
The article does not cover MMM installation nor how to do this without MMM. It assumes you have one master-master pair named C1 with db1 (current active master) and db2 (standby master) in it.
Precautions: Before applying any changes, you should ensure that your application does not have inserts without columns named (bad e.g. “INSERT INTO table VALUES (id,value1,value2,..)”; good e.g. “INSERT INTO tables (id,col1,col2,..) VALUES (id,value1,value2,..)”), deletes or updates of rows that will not exist after schema changes and other things that would break logical replication if tables on master and replica are different. binary log should be a good place to look for these.
OK, the procedure:
1. on monitor: set db2 off-line
$ mmm_control @C1 set_offline db2
When schema changes will be applied to db2, we will have to cut the replication between db2 and db1. In that case MMM would switch db2 and db1 roles. This step ensures that when replication db2 => db1 is broken, db1 will be available as “writer” and roles stay consistent.
2. on db1: stop replication
mysql> STOP SLAVE;
we don’t want schema changes to be replicated to db1 now.
3. on db2: load the sql scripts
$ mysql -u root db < changes.sql
At this step we load sql commands from changes.sql though of course one can just enter commands from cli, execute some php/python/ruby script or anything else. Most importantly – this must actually be executed as SUPER user (remember, standby master is running with read-only=1) and they must be executed on db2 so better use mysql file socket, or real IP of the node, but not virtual MMM IP as all virtual IPs are assigned to db1 at this stage.
4. on db2: start replication
mysql> START SLAVE;
On step #1 MMM should have stopped db1 => db2 replication so at this stage we start it to ensure db2 accumulates the data changes from db1. If you did not follow the Precautions, there’s a chance replication will fail at this step. If this happens, you’ll have to clone db2 from current db1 (mmm_clone), fix the application and start from the beginning.
The essential requirement for this step is to ensure replication has caught up. If you have multiple pairs doing some changes in parallel and you want to ensure switch from old layout to new one happens on all of them at pretty much the same time, you should wait until all of them reach this point.
5. on monitor: switch masters
Once standby servers (only db2 in our case) have caught-up, we are ready to upgrade the application. The sequence is:
$ mmm_control @C1 set_offline db1
.. application changes are applied ..
$ mmm_control @C1 set_online db2
This is the “downtime” part. The longer it takes to apply application changes, the longer is the downtime. Our customer built his application so it automatically detects the new schema and so no changes had to be made during these two MMM steps – it’s a good practice if you want to minimize the downtime. Also, application can typically work with new schema without any changes rolled out, so if that’s the case – you can apply those changes independently.
This is the only downtime you would (or would not =) have. Further steps are only to ensure db1 also gets the schema updated (it’s now a standby master) and db1 is back on-line for a failover. There are two ways to do it:
6. Allow replication to catch up
This can be slower or faster method depending on your overall database size, scope of changes, and few other variables though if you can’t use mmm_clone i.e. you don’t have lvm or another way to create snapshots, this could be the only good way to go.
6.1 on db1: start replication
mysql> START SLAVE;
db1 will receive the updates that were executed by changes.sql or some other script and rebuild its tables.
When replication on db1 has caught-up:
6.2 on monitor: set db1 on-line
$ mmm_control @C1 set_online db1
Now you can check MMM status for the cluster – it should display db2 as active master (“writer”) and db1 as standby master (having only “reader” role) which means you’re all set.
If mmm_clone is configured to do mysql snapshots, this could be a better way to achieve the same:
Alternative 6. Clone db1 from db2
On db1 execute:
$ mmm_clone --host db2 --clone-mode master-master
There are several different methods to clone a master, also you can have your custom ones though if you’re using MMM, I suppose this is already configured.
Pretty much the same can be achieved with master-master replication even if you’re not using MMM though it takes few more steps i.e. you must change application to use different IP for MySQL or ensure some load balancer/proxy does this, set read-only manually on standby master so application can’t write to it etc. so MMM really helps here ensuring all is done in the background.