
Actively monitoring replication connectivity with MySQL’s heartbeat

December 29, 2011 | Posted In: MySQL


Until MySQL 5.5, the only variable available to detect a network connectivity problem between master and slave was slave-net-timeout. This variable specifies the number of seconds the slave waits for more binary log events from the master before it aborts the connection and establishes it again. With a default value of 3600 seconds, it has historically been a poorly configured variable, so stalled connections or high latency peaks went undetected for a long time, or were never detected at all. Conversely, if the variable is set to a low value, let’s say 30 seconds, and the master has no events to send, the slave resets the connection after 30 seconds even if the connection is healthy.

Therefore, before this new heartbeat feature, we had no reliable way to check the connection status between the servers. We needed an active master/slave connection check, and this is where replication’s heartbeat can help us.

This feature was introduced in 5.5 as another parameter to the CHANGE MASTER TO command. Once it is enabled, the master sends “beat” packets (of 106 bytes) every X seconds whenever it is idle (no events to send to the slave), where X is a value you define in seconds.

Now, let’s say that slave-net-timeout=30. If the master is idle, with no events to send, it starts sending those beats. Provided the heartbeat period is shorter than the timeout, the connection reset is no longer triggered after those 30 seconds, because the slave now knows that the connection is still alive.
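
To make the interaction between the two settings concrete, here is a minimal sketch in Python. It is a simplified model of the behaviour described above, not MySQL's actual implementation; the function name and its parameters are made up for illustration, and the exact behaviour at the boundary (silence equal to the timeout) is an assumption.

```python
def slave_resets_idle_connection(idle_seconds, slave_net_timeout, heartbeat_period=0.0):
    """Return True if a healthy but idle connection would be reset by the slave.

    Simplified model: the slave resets the connection once it has received
    nothing from the master for longer than slave_net_timeout seconds.
    A heartbeat_period of 0 means heartbeats are disabled.
    """
    if heartbeat_period > 0:
        # With heartbeats enabled, a healthy connection is never silent
        # for longer than one heartbeat period.
        longest_silence = min(idle_seconds, heartbeat_period)
    else:
        # Without heartbeats, the silence lasts as long as the master is idle.
        longest_silence = idle_seconds
    return longest_silence > slave_net_timeout


# Master idle for 60 seconds, slave-net-timeout=30:
print(slave_resets_idle_connection(60, 30))                      # True: healthy connection reset
print(slave_resets_idle_connection(60, 30, heartbeat_period=1))  # False: beats keep it alive
```

The model also shows why a heartbeat period longer than slave-net-timeout defeats the purpose: the silence between two beats would already exceed the timeout.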

How can I configure replication’s heartbeat?

It is very easy to set up, with negligible overhead:

mysql_slave > STOP SLAVE;
mysql_slave > CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD=1;
mysql_slave > START SLAVE;

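MASTER_HEARTBEAT_PERIOD is a value in seconds, in the range 0 to 4294967, with a resolution of milliseconds. A value of 1 therefore means one beat per second, 0.5 means one beat every 500 ms, and 0 disables the heartbeat.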
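
If you generate the statement from a script, the range and resolution above can be checked before anything reaches the server. The following is only a sketch: the helper name heartbeat_statement is hypothetical, and rejecting non-zero values below one millisecond is a choice made here for safety, not something the article specifies.

```python
from decimal import Decimal, ROUND_HALF_UP

MAX_HEARTBEAT_PERIOD = Decimal("4294967")  # upper bound quoted in the article, in seconds
RESOLUTION = Decimal("0.001")              # millisecond resolution


def heartbeat_statement(period_seconds):
    """Build a CHANGE MASTER TO statement for a validated heartbeat period."""
    period = Decimal(str(period_seconds))
    if period < 0 or period > MAX_HEARTBEAT_PERIOD:
        raise ValueError("period must be between 0 and 4294967 seconds")
    if 0 < period < RESOLUTION:
        # Assumption of this sketch: refuse values that would round down to 0,
        # because 0 turns the heartbeat off.
        raise ValueError("non-zero period must be at least 0.001 seconds")
    # Round to whole milliseconds, the finest resolution available.
    period = period.quantize(RESOLUTION, rounding=ROUND_HALF_UP)
    return f"CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD = {period};"


print(heartbeat_statement(1))    # CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD = 1.000;
print(heartbeat_statement(0.5))  # CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD = 0.500;
```

The resulting string still has to be run between STOP SLAVE and START SLAVE, as in the example above.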

It is interesting to note that a 5.5 slave with replication’s heartbeat enabled can connect to a 5.1 master without breaking replication. Of course, the heartbeat will not work in this case because the master doesn’t know what a beat is or how to send it 🙂

What status variables do I have?

There are two: the heartbeat period, in seconds, and the number of beats received.

mysql_slave > SHOW STATUS LIKE '%heartbeat%';
| Variable_name | Value |
| Slave_heartbeat_period | 1.000 |
| Slave_received_heartbeats | 1476 |
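
As a rough way to read these two numbers together, the sketch below parses the rows shown above and multiplies them. It assumes that every beat stands for one full period in which the master had nothing to send, so the result is only an estimate of the idle time covered by heartbeats since the counter last started from zero. The function names are hypothetical.

```python
def parse_status_rows(text):
    """Turn '| name | value |' rows into a dict, skipping the header row."""
    status = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Variable_name":
            continue  # header, border or empty line
        status[cells[0]] = cells[1]
    return status


def idle_seconds_covered(status):
    """Estimate the idle time (in seconds) covered by the received beats."""
    period = float(status["Slave_heartbeat_period"])
    beats = int(status["Slave_received_heartbeats"])
    return period * beats


OUTPUT = """\
| Variable_name | Value |
| Slave_heartbeat_period | 1.000 |
| Slave_received_heartbeats | 1476 |
"""

status = parse_status_rows(OUTPUT)
print(idle_seconds_covered(status))  # 1476.0, i.e. roughly 24.6 minutes of idle time
```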

How can we check if the connection is down?

– If the master’s binary log position is greater than the slave’s but the slave is not receiving those new events, then the connection is down.
– If the master is idle but we see the number of received heartbeats increasing, then the connection is not down.
– If the master is idle but we don’t see heartbeats increasing, then the connection is down.
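
The three rules above can be sketched as a small function that compares two samples taken some time apart. This is only an illustration under several assumptions: the Sample fields are hypothetical names for values you would collect yourself (for example from SHOW MASTER STATUS, SHOW SLAVE STATUS and the status variables above), both positions refer to the same binary log file, and the two samples are taken more than one heartbeat period apart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    master_log_pos: int       # binary log position reported by the master
    slave_read_pos: int       # position the slave has read up to
    received_heartbeats: int  # Slave_received_heartbeats on the slave


def connection_is_up(prev, curr):
    """Apply the three rules to two consecutive samples."""
    if curr.master_log_pos > curr.slave_read_pos:
        # The master has events the slave lacks: the link is healthy only
        # if the slave kept reading between the two samples.
        return curr.slave_read_pos > prev.slave_read_pos
    # The master is idle (the slave has read everything), so the only
    # sign of life is the heartbeat counter.
    return curr.received_heartbeats > prev.received_heartbeats


idle_with_beats = connection_is_up(Sample(100, 100, 5), Sample(100, 100, 8))
idle_no_beats = connection_is_up(Sample(100, 100, 5), Sample(100, 100, 5))
master_ahead_stalled = connection_is_up(Sample(100, 100, 5), Sample(500, 100, 5))
print(idle_with_beats, idle_no_beats, master_ahead_stalled)  # True False False
```

A slave that is behind but still reading counts as connected in this sketch; replication lag is a separate problem from connectivity.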

Miguel Angel Nieto

Miguel joined Percona in October 2011. He has worked as a System Administrator for a Free Software consultant and in the support area of the biggest hosting company in Spain. His current focus is improving MySQL and helping the Free Software community to grow.


  • Hi Miguel,

    Very useful article, but I’m confused: is MASTER_HEARTBEAT_PERIOD in milliseconds or seconds? One part of the article states that you can configure the heartbeat in seconds, and the example shows the value ‘1’, which would be very aggressive if it were in milliseconds, yet you state that “MASTER_HEARTBEAT_PERIOD is a value in seconds in the range between 0 to 4294967 with resolution in milliseconds”.

    Just want to make sure we get this right.

    mysql> SHOW STATUS LIKE '%heartbeat%';
    | Variable_name | Value |
    | Slave_heartbeat_period | 15.000 |
    | Slave_received_heartbeats | 147 |
    2 rows in set (0.00 sec)

    Also, what is the significance of ‘Slave_received_heartbeats’? In our 5.5 environment it seems to be a static number (such as 147, as you see above) — is this the number of missed beats? It doesn’t seem to increment at all.


  • Hi Miguel,

    in my case it doesn’t work out of the box. I performed some tests simulating a network failure, and what I observed is that the TCP connection must fail before the first heartbeat is considered failed. For that to happen, the system first waits for the TCP keepalive time (/proc/sys/net/ipv4/tcp_keepalive_time) and then sends up to /proc/sys/net/ipv4/tcp_keepalive_probes probes, one every /proc/sys/net/ipv4/tcp_keepalive_intvl seconds. Since the default for the first TCP keepalive is 2 hours (7200s), this means that even if you set MASTER_HEARTBEAT_PERIOD=1, when the network connection fails the system won’t restart the I/O thread for at least 2 hours, which is misleading. I checked this by simulating the network failure on the master and on the slave, and the behaviour is the same.
    Just my contribution for all the other users that might experience an issue like this.



  • This worked out really well in Percona 5.6 with Slave_last_heartbeat implemented. It also really helps in cross-DC replication to restart replication automatically if the slave doesn’t get any heartbeat reply. It just saved me from network blips and the slave stopping without any error.

  • Hi Miguel,

    Very Good Article,

    I set up the Master-Master replication as you described in your tutorial, but when one server goes down the application does not connect to the secondary, which means failover is not working in my setup. I have one web application which is on & I am using two database servers for this application: Server 1 : Server 2 : Primarily the web application writes its data to but if MySQL server 1 goes down, it should automatically connect to Server 2 : without the web application user noticing. Please provide me with a solution. Waiting for your reply. Thanks, Suraj

  • in my case

    | Variable_name | Value |
    | Slave_heartbeat_period | 30.000 |
    | Slave_received_heartbeats | 0 |

    The slave’s received heartbeats are not updating/increasing. What is the root cause in my case? Kindly help me asap.

