How to upgrade your master server with minimal downtime

Master SlaveHere’s a step-by-step guide on how to invert roles for master and slave so you can perform a master server upgrade, and then switch roles back to the original setup.

* While following this guide consider server-A as your original master and server-B as your original slave. We will assume server-B already produces binlogs and that both nodes have log-slave-updates=1 in my.cnf


Check this following link for more details on log-slave-updates:
https://dev.mysql.com/doc/refman/5.6/en/replication-options-slave.html#option_mysqld_log-slave-updates

1. Prepare the original Slave to invert roles.

Make sure the slave is not in read-only mode. This should be set to 0, if not change it:


* For critical service, you might also want to make sure the slave is actually a perfect replica by running a checksum of the tables with pt-table-checksum.

 

Check SHOW SLAVE STATUS \G  on server-B until you see Seconds_Behind_Master is zero

 

When server-B is fully caught up, then issue  FLUSH TABLES WITH READ LOCK;  in a mysql client prompt in server-A

At this point your users will suffer a service request interruption and downtime starts ticking.

* DO NOT CLOSE this mysql client; otherwise the lock will be lost. We’ll use this same session to run the rest of the commands in server-A.

 

Get master coordinates:

 

Run  SHOW SLAVE STATUS \G  in server-B
Repeat until Relay_Master_Log_File  and Exec_Master_Log_Pos  matches file and position from the previous step.

 

Check if SHOW MASTER STATUS;  is not changing on server-B (to ensure that there are no queries local to server-B), then stop the slave

 

Run SHOW MASTER STATUS \G  in server-B and save this information in a safe place.

Also make sure binlogs from that position and onwards are kept until you bring server-A back online. Otherwise you’ll need to rebuild server-A.


2. Reverse roles for Master – Slave and upgrade original Master

Direct traffic to server-B (point VIP to server-B, or through DNS -not really recommended-, or by updating the configuration files of API services).

In a separate shell session in server-A, stop mysql

Now server-A can be shut down and serviced

* At this point you’re safe to upgrade server-A.

* After you’ve finished upgrading your server you can continue this guide to put server-A back as master.

 

Start MySQL in server-A (with skip-slave-start!) and run RESET SLAVE ALL;

 

Reconfigure slave in A with  CHANGE MASTER TO ...  and start it
Make sure to put the same
MASTER_LOG_FILE  and MASTER_LOG_POS  values as you previously got from running SHOW MASTER STATUS \G  on server-B.

 

 

Make sure replication is running

Repeat until you see Seconds_Behind_Master  is 0.

* Here is the moment where the roles are fully inverted, and A is an up-to-date slave of B.

 

3. Prepare the original Master to set back the original roles.

Prepare

When server-A is fully caught up, then issue FLUSH TABLES WITH READ LOCK;  in a mysql client prompt in server-B.

DO NOT CLOSE this mysql client on server-B; otherwise the lock will be lost. We’ll use this same session to run the rest of the commands in server-B.

* DO NOT CLOSE THIS CLIENT SESSION! *

 

Check master status

 

Run SHOW SLAVE STATUS \G  in server-A. Repeat until Relay_Master_Log_File  and Exec_Master_Log_Pos  matches the file and position from SHOW MASTER STATUS;

 

Get master coordinates:

4. Set back the original roles for Master – Slave

Direct traffic to server-A (point VIP to server-A, or through DNS -not really recommended-, or by updating the configuration files of API services).

In a separate shell session in server-B, restart mysql.

Make server-B again a slave of server-A:

 

Stop the slave threads in server-A:

 

* If you have a chained replication setup you should take into consideration the log_slave_updates variable, especially as it is used in this case and should be enabled.
i.e.: In a replication chain A > B > C, for C to receive updates from A, B will need to log them to the binary logs and that is what this option does.

For more information regarding this variable, please check the following link:
https://dev.mysql.com/doc/refman/5.6/en/replication-options-binary-log.html#sysvar_log_slave_updates

Share this post

Comments (2)

  • Moll Reply

    All those steps are being executed manually, which means long downtime (if no mistakes happened)!
    and the process will be even more complicate if you have many slaves for that master.
    MHA can simplify that process with less manual processes and less downtime (from 0.5 to 3 seconds in the online master switching).
    Check this blog for more details on how to use MHA to perform the manual maintenance on the master:
    http://mysqlstepbystep.com/2015/06/01/mysql-high-available-with-mha-2/

    December 2, 2015 at 1:55 am
  • Jörg Brühe Reply

    When you restart the MySQL server process on server-A, you should ensure “read-only = 1” from the very beginning.
    OTOH, I don’t see the reason for “skip-slave-start”:
    – If server-A was configured to replicate from server-B (aka “master-master” setup), that should still be valid, and there is no need to use “change master”.
    – If server-A was not configured to replicate from anywhere, “skip-slave-start” will not change anything.
    – If server-A was configured to replicate from somewhere else, all the description is incomplete anyways.
    So IMO on server-A the “read-only = 1” is both necessary and sufficient at restart.

    December 2, 2015 at 3:26 am

Leave a Reply