Few months ago, I wrote about a faster way to do certain table modifications online. It works well when all you want is to remove auto_increment or change ENUM values. When it comes to changes that really require table to be rebuilt – adding/dropping columns or indexes, changing data type, converting data to different character set – MySQL master-master replication especially accompanied by MMM can be very handy to do the changes with virtually no downtime.
Couple of days ago I worked with one of our MySQL support customers as they were upgrading their application and mysql schema. We deployed and used MySQL Master-Master replication manager (MMM) ever since we started working so doing all the schema changes synchronously and with only couple seconds of “downtime” was really trivial. I’d like to share my experience.
The article does not cover MMM installation nor how to do this without MMM. It assumes you have one master-master pair named C1 with db1 (current active master) and db2 (standby master) in it.
Precautions: Before applying any changes, you should ensure that your application does not have inserts without columns named (bad e.g. “INSERT INTO table VALUES (id,value1,value2,..)”; good e.g. “INSERT INTO tables (id,col1,col2,..) VALUES (id,value1,value2,..)”), deletes or updates of rows that will not exist after schema changes and other things that would break logical replication if tables on master and replica are different. binary log should be a good place to look for these.
OK, the procedure:
1. on monitor: set db2 off-line
| 1 | $ mmm_control @C1 set_offline db2 | 
When schema changes will be applied to db2, we will have to cut the replication between db2 and db1. In that case MMM would switch db2 and db1 roles. This step ensures that when replication db2 => db1 is broken, db1 will be available as “writer” and roles stay consistent.
2. on db1: stop replication
| 1 | mysql> STOP SLAVE; | 
we don’t want schema changes to be replicated to db1 now.
3. on db2: load the sql scripts
| 1 | $ mysql -u root db < changes.sql | 
At this step we load sql commands from changes.sql though of course one can just enter commands from cli, execute some php/python/ruby script or anything else. Most importantly – this must actually be executed as SUPER user (remember, standby master is running with read-only=1) and they must be executed on db2 so better use mysql file socket, or real IP of the node, but not virtual MMM IP as all virtual IPs are assigned to db1 at this stage.
If you’re altering gigabyte size tables, this step may take some time. Go get a pizza or read some cool articles on our blog. When new changes are loaded:
4. on db2: start replication
| 1 | mysql> START SLAVE; | 
On step #1 MMM should have stopped db1 => db2 replication so at this stage we start it to ensure db2 accumulates the data changes from db1. If you did not follow the Precautions, there’s a chance replication will fail at this step. If this happens, you’ll have to clone db2 from current db1 (mmm_clone), fix the application and start from the beginning.
The essential requirement for this step is to ensure replication has caught up. If you have multiple pairs doing some changes in parallel and you want to ensure switch from old layout to new one happens on all of them at pretty much the same time, you should wait until all of them reach this point.
5. on monitor: switch masters
Once standby servers (only db2 in our case) have caught-up, we are ready to upgrade the application. The sequence is:
| 1 2 3 | $ mmm_control @C1 set_offline db1 .. application changes are applied .. $ mmm_control @C1 set_online db2 | 
This is the “downtime” part. The longer it takes to apply application changes, the longer is the downtime. Our customer built his application so it automatically detects the new schema and so no changes had to be made during these two MMM steps – it’s a good practice if you want to minimize the downtime. Also, application can typically work with new schema without any changes rolled out, so if that’s the case – you can apply those changes independently.
This is the only downtime you would (or would not =) have. Further steps are only to ensure db1 also gets the schema updated (it’s now a standby master) and db1 is back on-line for a failover. There are two ways to do it:
6. Allow replication to catch up
This can be slower or faster method depending on your overall database size, scope of changes, and few other variables though if you can’t use mmm_clone i.e. you don’t have lvm or another way to create snapshots, this could be the only good way to go.
6.1 on db1: start replication
| 1 | mysql> START SLAVE; | 
db1 will receive the updates that were executed by changes.sql or some other script and rebuild its tables.
When replication on db1 has caught-up:
6.2 on monitor: set db1 on-line
| 1 | $ mmm_control @C1 set_online db1 | 
Now you can check MMM status for the cluster – it should display db2 as active master (“writer”) and db1 as standby master (having only “reader” role) which means you’re all set.
If mmm_clone is configured to do mysql snapshots, this could be a better way to achieve the same:
Alternative 6. Clone db1 from db2
On db1 execute:
| 1 | $ mmm_clone --host db2 --clone-mode master-master | 
There are several different methods to clone a master, also you can have your custom ones though if you’re using MMM, I suppose this is already configured.
Pretty much the same can be achieved with master-master replication even if you’re not using MMM though it takes few more steps i.e. you must change application to use different IP for MySQL or ensure some load balancer/proxy does this, set read-only manually on standby master so application can’t write to it etc. so MMM really helps here ensuring all is done in the background.
 
 
 
 
						 
						 
						 
						 
						 
						
If you’re just doing an ALTER, a simple way to do it is run:
SET SQL_LOG_BIN=0;
ALTER TABLE …
SET SQL_LOG_BIN=1;
First on one server then on the other, switching masters each time.
This is a great way to do it without having the complexity in the MySQL server from trying to do it online as Oracle might. I hope that MySQL doesn’t spend too much time trying to support that. The MMM approach works for most schema changes with far less complexity and risk. But if MySQL spends time on that, then this is an opportunity for MMM as it works right now. The problem gets more interesting as you increase the number of slaves on which the change must be done.
When can we expect the merger of MMM + Maatkit (== MMMaatkit)?
Nice article Peter. Thanks for the example!!
Great idea, but what do you do if you have to alter the structure due to better performance?
While rading this article i had an idea, i would be really happy if you could comment it:
At first you create in both Masters views which represent the new data structure (maybe you have to alter the relationship of tables). Then you will change application to use these new views instead of the old tables, because the views will later be tables.
Now you turn off one of the masters, and change the tables according to the views you have created. This will take a long long time…
When all changes are ready you can turn the master on, and it can load all changes from the master who was online. Now you have got one master with all data and the new structure. Now you can copy the data to the other master who was online all the time.
This is no different from if you use master-slave replication, surely? The only caveat there is that you need to promote one of the slaves to a master if you do this, but that shouldn’t be a problem. Wikipedia does essentially this to alter tables without downtime, and of course it uses slave replication, not master-master.
Keith,
This article is by Aurimas, lets give him a credit !
Tobias,
In theory you can use this VIEW approach for complex migrations but in practice there can be many performance gotchas you can run into.
Another note – besides ALTER this works for other similar type of statements. For example you can do online OPTIMIZE table this way.
another method to dump all the rows, reload them into new table with new structure, rename table, and drop the old table. This coupled with SET SQL_BINLOG = 0 (first comment) should work.
je vous remercie de partager, très gentil de votre part