EmergencyEMERGENCY? Get 24/7 Help Now!

Beware of MySQL 5.6 server UUID when cloning slaves

 | January 21, 2014 |  Posted In: Insight for DBAs, MySQL, Percona MySQL Support

PREVIOUS POST
NEXT POST

The other day I was working on an issue where one of the slaves was showing unexpected lag. Interestingly with only the IO thread running the slave was doing significantly more IO as compared to the rate at which the IO thread was fetching the binary log events from the master.

I found this out by polling the SLAVE STATUS and monitoring the value of Read_Master_Log_Pos as it changed over time. Then compared it to the actual IO being done by the server using the pt-diskstats tool from the excellent Percona Toolkit. Note that, when doing this analysis, I had already stopped the slave SQL thread and made sure that there were no dirty InnoDB pages, otherwise my analysis would have been skewed.

A closer inspection showed that this had to do with the error log getting filled up with error messages apparently suggesting that the slave IO thread was disconnecting and reconnecting repeatedly.

Below is the type of error messages that were being generated and were filling up the error log:

These error messages are quite vague and do not suggest the actual problem at all. The error messages appear to point out to the disconnection and reconnection of the IO thread. The rate at which these error messages were being generated was close to 20MB/s which was unnecessarily producing IO load.

Importance of a Unique MySQL 5.6 Server UUID

But anyway coming back to the problem. Looking at the output of running SHOW SLAVE HOSTS on the master showed what the problem was:

The slave hosts for the UUID “6c27ed6d-7ee1-11e3-be39-6c626d957cff” were fluctuating between “SBslave1” and “SBslave2”. So the actual problem was that there were two slaves running with the same server UUID and that was confusing the master. This was caused by the new slave “SBslave2” being cloned from the slave “SBslave1”. The cloning process ended up copying the file “auto.cnf” which is present in the MySQL data directory. The auto.cnf file stores the server’s UUID which is used to uniquely identify a server. This is a new feature in MySQL 5.6 and is explained very well in the MySQL manual here.

So essentially when a server starts up, it reads the UUID value from the auto.cnf file. If the auto.cnf file is not present or the UUID value cannot be read, then the server generates a new UUID. When a new UUID is generated you should see a message similar to the following in the MySQL error log:

Conclusion

So the important lesson is to remember to not reuse the auto.cnf file when cloning a server, as the server UUID is an important identification of a particular server. This is used by the master for the purpose of server identification. I also believe that MySQL should show a different error message that is more pertinent to the actual error. The master could in theory send a different type of event to the slave when it notices one with a duplicate UUID.

PREVIOUS POST
NEXT POST

9 Comments

  • Ovais,

    So what does this mean in practice ? I take backup of Slave1 with Percona Xtrabackup and restore it on Slave2

    Does it mean “auto.cnf” should be removed from restored backup so we can set it up as a slave with no problem or is there different cause of action needed ?

  • Peter,

    It depends on how you are doing the backup. XtraBackup does not backup the auto.cnf file so no action is needed when you are setting up a slave using XtraBackup. However, some people prefer to cone slaves using data directory copy, in which case the auto.cnf file should be removed before starting up MySQL on the copied data directory.

  • yes i faced this issue before and resolved it by changing the server id to a different one, usually we run an instal script for mysql which modifies the server id as (server’s ipaddress without dots) this make sure no server id is cloned or duplicated.

  • Also weird to see that this still causes fluctuation … it has already been suggested on the previous bug reports that the master should reject the new slave connection in that case instead of disconnecting the existing one …

  • Sadly I found this article during my research after a similar problem. This killed replication for me for both slaves (the one I cloned from and the new one) after a few minutes by seemingly sending wrong replication data to the hosts. Now I have to restore both from daily backups (700GB DB – bah)

Leave a Reply

 
 

Percona’s widely read Percona Data Performance blog highlights our expertise in enterprise-class software, support, consulting and managed services solutions for both MySQL® and MongoDB® across traditional and cloud-based platforms. The decades of experience represented by our consultants is found daily in numerous and relevant blog posts.

Besides specific database help, the blog also provides notices on upcoming events and webinars.
Want to get weekly updates listing the latest blog posts? Subscribe to our blog now! Submit your email address below and we’ll send you an update every Friday at 1pm ET.

No, thank you. Please do not ask me again.