The trouble with slave lag is that you often can’t see it coming. Especially if the slave’s load is pretty uniform, a slave that’s at 90% of its capacity to keep up with the master can be indistinguishable from one that’s at 5% of its capacity.
So how can you tell when your slave is nearing its capacity to keep up with the master? Here are three ways:
One: watch for spikes of lag. If you have Cacti (and these Cacti templates for MySQL) you can see this in the graphs. If the graphs start to get a little bumpy, you can assume that the iceberg is floating higher and higher in the water, so to speak. (Hopefully that’s not too strange a metaphor.) As the slave’s routine work gets closer and closer to its capacity, you’ll see these spikes get bigger and “wider”. The front-side of the spike will always be less than a 45-degree angle in ordinary operation but the back-side, when the slave is catching up after lagging behind, will become a gentler and gentler slope.
Two: deliberately make a slave fall behind, then see how fast it can catch up. This is sort of related to Method One. The goal here is to explicitly see how steep the backside of that slope is. If you stop a slave for an hour, then start it again and it catches up in one hour, it is running at 1/2 of its capacity. (In case that’s confusing: if you stop it at noon and restart it at 1:00, and it’s caught up again at 2:00, it played all statements from 12:00 to 2:00 in 1 hour, so it went at 2x speed.)
Three: measure it more scientifically. Use our patched server, which gives you a USER_STATISTICS table.
mysql> SELECT * FROM INFORMATION_SCHEMA.USER_STATISTICS WHERE USER='#mysql_system#'\G
*************************** 1. row ***************************
You can compare the BUSY_TIME to one-half the CONNECTED_TIME (because there are two replication threads on the slave) to see how much of the time the slave thread was actively processing statements. If the slave threads are always running, you can just use the server’s uptime instead.
 There are cases where this isn’t true, especially if you’re monitoring Seconds_behind_master instead of using mk-heartbeat, which is immune to this anomaly.
Percona’s widely read Percona Data Performance blog highlights our expertise in enterprise-class software, support, consulting and managed services solutions for both MySQL® and MongoDB® across traditional and cloud-based platforms. The decades of experience represented by our consultants is found daily in numerous and relevant blog posts.
Besides specific database help, the blog also provides notices on upcoming events and webinars.
Want to get weekly updates listing the latest blog posts? Subscribe to our blog now! Submit your email address below and we’ll send you an update every Friday at 1pm ET.