Where the open source community meets: Secure your spot for Percona Live Amsterdam! - Register

Downloads

Blog

Is Replication Slave ever going to catch up? When?!

August 29, 2012

Author

Aurimas Mikalauskas

Insight for DBAs

MySQL

Share this Post:

If you ever had a replication slave that is severely behind, you probably noticed that it’s not catching up with a busy master at a steady pace. Instead, the “Seconds behind master” is going up and down so you can’t really tell whether the replica is catching up or not by looking at just few samples, unless these are spread apart. And even then you can’t tell at a glance when it is going to catch up.

Normally, the “severely behind” thing should not happen, but it does often happen in our consulting practice:

- sometimes replication would break and then it needs to catch up after it is fixed,

- other times new replication slave is built from a backup which is normally hours behind,

- or, it could be that replication slave became too slow to catch up due to missing index

Whatever the case is, single question I am being asked by the customer every time this happens is this: When is the replica going to catch up?”

I used to tell them “I don’t know, it depends..” and indeed it is not an easy question to answer. There are few reasons catching up is so unstable:

1. If you have restarted the server, or started a new one, caches are cold and there’s a lot of IO happening,

1. Not all queries are created equal – some would run for seconds, while others can be instant,

1. Batch jobs: some sites would run nightly tasks like building statistics tables or table checksum – these are usually very intense and cause slave to backup slightly.

I didn’t like my own answer to The question, so I decided to do something about it. And because I love awk, I did that something in awk:

delay=60
cmd="mysql -e 'show slave statusG' | grep Seconds_Behind_Master | awk '{print $2}'"
while sleep $delay; do
  eval $cmd
done | awk -W interactive -v delay=$delay '
{
   passed += delay;
   if (count%10==0)   
      printf("s_behind d_behind   c_sec_s   eta_d | O_c_sec_s O_eta_d O_eta_hn");
   if (prev==NULL){
      prev = $1;
      start = $1;
   }
   speed = (delay-($1-prev))/delay;
   o_speed = (start-($1-passed))/passed
   if (speed == 0)    speed_d = 1;
     else             speed_d = speed;
   eta = $1/speed_d;
   if (eta<0)         eta = -86400;
   o_eta = $1/o_speed;
   printf("%8d %8.6f %9.3f %7.3f | %9.3f %7.3f %7.2fn",
      $1, $1/86400, speed, eta/86400, o_speed, o_eta/86400, o_eta/3600);
   prev=$1;
   count++;
}'

delay=60

cmd="mysql -e 'show slave statusG' | grep Seconds_Behind_Master | awk '{print $2}'"

while sleep $delay; do

eval $cmd

done | awk -W interactive -v delay=$delay '

{

passed += delay;

if (count%10==0)

printf("s_behind d_behind c_sec_s eta_d | O_c_sec_s O_eta_d O_eta_hn");

if (prev==NULL){

prev = $1;

start = $1;

}

speed = (delay-($1-prev))/delay;

o_speed = (start-($1-passed))/passed

if (speed == 0) speed_d = 1;

else speed_d = speed;

eta = $1/speed_d;

if (eta<0) eta = -86400;

o_eta = $1/o_speed;

printf("%8d %8.6f %9.3f %7.3f | %9.3f %7.3f %7.2fn",

$1, $1/86400, speed, eta/86400, o_speed, o_eta/86400, o_eta/3600);

prev=$1;

count++;

I don’t know if this is ever going to become a part of a Percona Toolkit, however since it’s pretty much a one-liner, I just keep it in my snippets pool for easy copy’n’paste.

Here’s a piece of an output from a server that was almost 27 days behind just yesterday:

// at the beginning:
s_behind  d_behind   c_sec_s   eta_d | O_c_sec_s O_eta_d O_eta_h
 2309941 26.735428     0.000  26.735 |     1.000  26.735  641.65
 2309764 26.733380     2.950   9.062 |     2.475  10.801  259.23
 2308946 26.723912    13.633   1.960 |     6.528   4.094   98.25
 2308962 26.724097    -0.267  -1.000 |     5.079   5.262  126.28
 2309022 26.724792    -1.000  -1.000 |     4.063   6.577  157.85
...
// after one hour:
s_behind  d_behind   c_sec_s   eta_d | O_c_sec_s O_eta_d O_eta_h
 2264490 26.209375    39.033   0.671 |    13.418   1.953   46.88
 2262422 26.185440    34.467   0.760 |    13.774   1.901   45.63
 2261702 26.177106    12.000   2.181 |    13.762   1.902   45.65
...
// after three hours:
s_behind  d_behind   c_sec_s   eta_d | O_c_sec_s O_eta_d O_eta_h
 2179124 25.221343    13.383   1.885 |    13.046   1.933   46.40
 2178937 25.219178     3.117   8.092 |    12.997   1.940   46.57
 2178472 25.213796     7.750   3.253 |    12.973   1.943   46.64
...
// after 12 hours:
s_behind  d_behind   c_sec_s   eta_d | O_c_sec_s O_eta_d O_eta_h
 1824590 21.117940    20.233   1.044 |    12.219   1.728   41.48
 1823867 21.109572    12.050   1.752 |    12.221   1.727   41.46
 1823089 21.100567    12.967   1.627 |    12.223   1.726   41.43
...
// after 21 hours:
s_behind  d_behind   c_sec_s   eta_d | O_c_sec_s O_eta_d O_eta_h
 1501659 17.380312    -0.533  -1.000 |    11.768   1.477   35.44
 1501664 17.380370    -0.083  -1.000 |    11.760   1.478   35.47
 1501689 17.380660    -0.417  -1.000 |    11.751   1.479   35.50
...

// at the beginning:

s_behind d_behind c_sec_s eta_d | O_c_sec_s O_eta_d O_eta_h

2309941 26.735428 0.000 26.735 | 1.000 26.735 641.65

2309764 26.733380 2.950 9.062 | 2.475 10.801 259.23

2308946 26.723912 13.633 1.960 | 6.528 4.094 98.25

2308962 26.724097 -0.267 -1.000 | 5.079 5.262 126.28

2309022 26.724792 -1.000 -1.000 | 4.063 6.577 157.85

...

// after one hour:

s_behind d_behind c_sec_s eta_d | O_c_sec_s O_eta_d O_eta_h

2264490 26.209375 39.033 0.671 | 13.418 1.953 46.88

2262422 26.185440 34.467 0.760 | 13.774 1.901 45.63

2261702 26.177106 12.000 2.181 | 13.762 1.902 45.65

...

// after three hours:

s_behind d_behind c_sec_s eta_d | O_c_sec_s O_eta_d O_eta_h

2179124 25.221343 13.383 1.885 | 13.046 1.933 46.40

2178937 25.219178 3.117 8.092 | 12.997 1.940 46.57

2178472 25.213796 7.750 3.253 | 12.973 1.943 46.64

...

// after 12 hours:

s_behind d_behind c_sec_s eta_d | O_c_sec_s O_eta_d O_eta_h

1824590 21.117940 20.233 1.044 | 12.219 1.728 41.48

1823867 21.109572 12.050 1.752 | 12.221 1.727 41.46

1823089 21.100567 12.967 1.627 | 12.223 1.726 41.43

...

// after 21 hours:

s_behind d_behind c_sec_s eta_d | O_c_sec_s O_eta_d O_eta_h

1501659 17.380312 -0.533 -1.000 | 11.768 1.477 35.44

1501664 17.380370 -0.083 -1.000 | 11.760 1.478 35.47

1501689 17.380660 -0.417 -1.000 | 11.751 1.479 35.50

...

Of course, it is still not perfectly accurate and it does not account for any potential changes in queries, workload, warm-up, nor the time it takes to run the mysql cli, but it does give you an idea and direction that replication slave is going. Note, negative values mean replication isn’t catching up, but values themselves are mostly meaningless.

Here’s what the weird acronyms stand for:

- s_behind – current Seconds_Behind_Master value

- d_behind – number of days behind based on current s_behind

- c_sec_s – how many seconds per second were caught up during last interval

- eta_d – this is ETA based on last interval

- O_c_sec_s – overall catch-up speed in seconds per second

- O_eta_d – ETA based on overall catch-up speed (in days)

- O_eta_h – same like previous but in hours

Let me know if you ever find this useful.

0 0 votes

Article Rating

21 Comments

Oldest

Newest Most Voted

Baron Schwartz

13 years ago

Table checksums should not add delay, because pt-table-checksum won’t do anything until replication is caught up.

Michael Wehrle

13 years ago

Can you trust “Seconds_Behind_Master” to be accurate? I have been rebuilding servers lately for a client, and typically by the time we get the new slave built, its between 24-36 hours behind, and something like a dozen binlogs “behind”. What I have found is that the “Seconds_Behind_Master” doesn’t even show non-zero until the slave is on the same binlog as the master. That doesn’t even account for the fact that its often very inaccurate.

Also… 27 days behind? Why not rebuild the slave with a hot backup from the master?

Michael Wehrle

13 years ago

Also, the code is quite nice actually. Succint and useful, given the intent.

Author

Aurimas Mikalauskas

13 years ago

Baron,

normally – indeed pt-table-checksum wouldn’t do that. Interestingly though, the slave in the example above was actually a [3 days] delayed slave (using pt-slave-delay) but pt-table-checksum was configured to not check its status, for a pretty obvious reason that it will always be at least 3 days behind. We still want checksums to run though, to check that our last-resort backup is consistent.

Author

Aurimas Mikalauskas

13 years ago

Michael, –

it’s not completely accurate, but you also have to be sure you know what the number means. It is NOT “how long until the slave will catch up”, but rather “what is the difference between NOW() and the time when master was writing the events [I’m currently executing] into the binary log”. In that sense, it is accurate to one second.

One problem with Seconds_Behind_Master is that if will show zero if it is IO thread, not SQL thread that is behind.

Bill Willits

13 years ago

Very nice post. And, super-useful. I’ve had the same need for this tool many times. Thanks for creating.

aftab

13 years ago

Would mk-slave-prefetch help in this situation?

Daniël van Eeden

13 years ago

Thanks for sharing the script.

The line with:
grep Seconds_Behind_Master | awk ‘{print $2}’

Could be rewritten as:
awk ‘/Seconds_Behind_Master/ { print $2 }’

It might be possible to use getline and make it more pure awk.
http://www.gnu.org/software/gawk/manual/html_node/Getline_002fVariable_002fPipe.html#Getline_002fVariable_002fPipe

AWK is a great tool. The only downside is that it’s probably not available by default on most Windows and other non-UNIX (like) operating systems.

Rewriting this script to run server side (stored procedure/view/etc) seems favorable, but is not possible:
http://code.openark.org/blog/mysql/reading-results-of-show-statements-on-server-side

Author

Aurimas Mikalauskas

13 years ago

aftab, it might, though I’d probably try Yoshinori’s Replication Booster or Domas’ Replication prefetcher first, if need be.

Author

Aurimas Mikalauskas

13 years ago

Daniël, –

thanks for your tips!

Jesse R. Adams

13 years ago

Thanks for sharing this script! It is quite useful!

Frank Van Damme

12 years ago

yes, pretty useful. A lot better than to keep running “show slave status” and making a guess about the time to catch up.

Ross Carver

12 years ago

Why not use statsD or some other data collector in a cronjob and just plot it on a graph? Lots of online tools that will capture/display this stat over time nicely for ya.

Author

Aurimas Mikalauskas

12 years ago

Thanks for your comment, Ross. In this particular case I’m not interested in having the data plotted, I assume that most of the time (like, virtually ALL the time), slave is not behind the master and I ask this question (and use this script) very rarely when I get a task like setting up a new replication slave (or fixing a broken one) from a backup which is, say one week behind – and then we’re curious to know when we will have the slave up to date. This fix sometimes is an additional index, RAM increase or anything of that sorts.

Note that as MySQL consultants (I’m not talking Remote DBA now) we work with variety of customers that would often come for a quick 16-32 hour gig and we don’t have any control nor time to have any graphing tools installed.

Bill Willits

12 years ago

Hey Aurimas.
I have found this code to be very useful. The prediction of catch-up time is remarkably accurate!
I created a script that add some bells and whistles – like notification via email during the catchup process and when fully caught up. The script relies on the heartbeat table for computing the lag so you need to have that.
I hope it is useful to others. (and , I hope this pastebin works).
http://pastebin.com/H2pxihrx

~Bill

Author

Aurimas Mikalauskas

12 years ago

Bill, thanks so much! I do see the the pastebin just fine. I’m happy to hear that you find the catch-up time to be accurate, I also find that after 20-30 minutes or more (depending on a number of factors) it stabilises and shows pretty close ETA.

Ross Carver

12 years ago

Aurimas – makes sense. I stumbled on this post when trying to diagnose a slave not catching up. For others and I believe it may be covered in other posts here – setting innodb_flush_log_at_trx_commit=0 helped a permanently lagging slave catch up quickly. I then set it back to 1 once it caught up and its fine and not lagging. HTH this helps someone else.

Joey Aldrin Cruz

11 years ago

Thanks Aurimas; I found this script very useful mostly when the times that the slave took so long to catch-up with the master. I also always answered IDK or it depends on what is the query is or how big is the query that is processing by the slave. When I used this script I can perfectly tell the ETA on when the slave can catch-up with the master. Thank you very much for this one! Kudos~!

sometimesomehow

10 years ago

The script is missing something.
I cannot reproduce the numbers in the output by hand.

Also, I do not get negative values for “speed” / “c_sec_s” when the replication falls farther behind as the example shows.

Ed Morley

9 years ago

There is a bug in the script – as noted in another comment, it doesn’t provide negative c_sec_s in some cases.

Notably the way speed and o_speed are calculated result in values where “1.0” means “just keeping up with new activity, not actually eating into the existing seconds behind”, which is contrary to the definition of c_sec_s on this page – and also means the ETAs are wrong.

I have an updated version which fixes the above, plus:
* Displays “Inf” if the current ETA is “will never catch up”.
* Switches the -W interactive to fflush(), since the former results in an error/warning on newer awk.
* Reduces the number of columns to improve readability.
* Adds a column separator between the first two columns and the rest.
* Uses hours rather than days for the ETA columns (if you’re that many days behind replicating that hours is too verbose, then you need to just perform another DB dump and start from scratch IMO).
* Replaces the grep invocation with awk matching, as suggested in a comment above.

See:
https://gist.github.com/edmorley/f8589f4d6fa84df4cce1a13cbac83d98