Until MySQL 5.5, the only variable available to detect a network connectivity problem between master and slave was slave_net_timeout. This variable specifies the number of seconds the slave waits for more binary log events from the master before it aborts the connection and establishes it again. With a default value of 3600, it has historically been a badly configured variable: stalled connections and high latency peaks went undetected for a long time, or were never detected at all. We needed an active master/slave connection check, and this is where replication’s heartbeat can help us.
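For reference, the timeout discussed above can be inspected and lowered on the slave. This is a minimal sketch: the value 60 is only an illustrative assumption, not a recommendation, and the statements need a user with the SUPER privilege.

```sql
-- Check the current timeout (in seconds) on the slave.
SHOW GLOBAL VARIABLES LIKE 'slave_net_timeout';

-- Lower it. 60 is an illustrative value chosen for this example.
SET GLOBAL slave_net_timeout = 60;

-- The new value applies once the slave threads are started again.
STOP SLAVE;
START SLAVE;
```

Lowering the timeout only shortens the wait for binary log events. It does not add an active check of the link, which is what the heartbeat described next provides.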
This feature was introduced in 5.5 as another parameter to the CHANGE MASTER TO command. After you enable it, the master sends “beat” packets (of 106 bytes) to the slave every X seconds, where X is a value you define, whenever it has no binary log events to send. If the network link goes down or the latency rises above that time threshold, the slave IO thread disconnects and tries to connect again. This means we now measure the connection time or latency, not the time without binary log events. We’re actively checking the communication.
How can I configure replication’s heartbeat?
It is very easy to set up and has negligible overhead:
mysql_slave > STOP SLAVE;
mysql_slave > CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD=1;
mysql_slave > START SLAVE;
MASTER_HEARTBEAT_PERIOD is a value in seconds in the range 0 to 4294967, with a resolution of milliseconds. After the loss of a beat the slave IO thread will disconnect and try to connect again. Here is the SHOW SLAVE STATUS output after an error:
mysql_slave > show slave status\G
Last_IO_Error: error reconnecting to master 'firstname.lastname@example.org:19972' - retry-time: 60 retries: 86400
It is interesting to note that a 5.5 slave with replication’s heartbeat enabled can connect to a 5.1 master without breaking replication. Of course, the heartbeat will not work in this case because the master doesn’t know what a beat is or how to send one.
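Because the period has millisecond resolution, sub-second intervals are also possible. The following is a minimal sketch of that variation, to be run on the slave. The 0.5 value is only illustrative, and according to the MySQL manual a period of 0 disables heartbeats altogether.

```sql
STOP SLAVE;

-- Fractional values are accepted: ask the master for a beat every 500 ms.
CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD = 0.5;

START SLAVE;

-- To turn heartbeats off again, set the period to 0:
-- CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD = 0;
```

It makes sense to keep the period well below slave_net_timeout. Otherwise the slave would give up on the connection before a beat is even due.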
What status variables do I have?
There are two: the heartbeat period and the number of beats received.
mysql_slave > SHOW STATUS LIKE '%heartbeat%';
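+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| Slave_heartbeat_period    | 1.000 |
| Slave_received_heartbeats | 1476  |
+---------------------------+-------+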
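If a monitoring script needs these values as an ordinary result set, the same counters can be read with a SELECT. This sketch assumes a 5.5 server, where status variables are exposed through INFORMATION_SCHEMA.GLOBAL_STATUS. Newer releases expose them through performance_schema instead.

```sql
-- Read the heartbeat period and the number of beats received so far.
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM INFORMATION_SCHEMA.GLOBAL_STATUS
WHERE VARIABLE_NAME IN ('SLAVE_HEARTBEAT_PERIOD',
                        'SLAVE_RECEIVED_HEARTBEATS');
```

Running the query twice, a few periods apart, shows whether Slave_received_heartbeats is still increasing. Note that the counter also stops growing while the master is busy sending binary log events, so a flat value alone does not prove that the link is down.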
If you need to know exactly when the connection between your master and slaves breaks, replication’s heartbeat is the easiest and fastest solution to implement.