November 23, 2014

How SHOW SLAVE STATUS relates to CHANGE MASTER TO

As you probably know MySQL Replication (statement based) works by fetching statements from MASTERs binary log and executing them on the SLAVE. Since MySQL 4.0 this process is a bit more involved having events passing via relay logs on the Slave which also means there are two replication threads “IO Thread” and “SQL Thread” used in the process but idea remains the same.

For replication to work properly you need to have proper “snapshot” – consistent state of the database as it was on master and corresponding position in masters binary logs. There are multiple ways to get the “snapshot” – shutting down MySQL Server and copying data, using LVM, Using Innodb Hot Backup Tool, Using another Slave, using backup image etc but in all the cases you have to be sure the snapshot you’re dealing with corresponds to correct binary log position.

There are two sources of information for binary log position SHOW MASTER STATUS – this shows position as master writes its own binary and is helpful when you for example take backup on the Master to bring up first slave. This command output is pretty simple and leaves no room for mistakes:

If you’re backing up from the Slave or you’re cloning the Slave to make another one you need to use SHOW SLAVE STATUS which provides much more information.

As you can see in this output there are 3 log file names and log file positions which can get confusing, and indeed I often see people using wrong values to restore the replication using CHANGE MASTER TO statement.

It is very important to understand the difference between IO Thread position in Master logs (Master_Log_File:Read_Master_Log_Pos) and SQL Thread position in Master logs (Relay_Master_Log_File:Exec_Master_Log_Pos).

When you’re using CHANGE MASTER TO to set start position for the slave you’re specifying position for SQL thread and so you should use Relay_Master_Log_File:Exec_Master_Log_Pos. Otherwise you’re going to ruin your replication.

It does not help SHOW MASTER STATUS has Master_Log_File value and CHANGE MASTER TO accepts Master_Log_File parameter, while you actually want to use Relay_Master_Log_File instead. This naming was left from pre MySQL 4.0 era and it still causes confusion so many years later.

The mistake of using Master_Log_File is in fact VERY typical. I’ve seen multiple “internal instructions” for cloning the slave or slave recovery from backup using this value.

Infact Master_Log_File will work in most cases as well – you can note in the output about Master_Log_File and Relay_Master_Log_File are the same, and it will be the case in vast majority of cases.

If your replication does not fall behind the SQL Thread and IO thread will be running close, quite often being at same master position at all. Even if replication gets a bit behind in most cases you would still see SQL thread and IO threads close enough to share same master log file. Only in rare cases of switching master binary log file or replication being delayed very significantly you will see different log files for these variables which makes it very easy to make mistake and very hard to catch.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Alexis Guajardo says:

    Getting this error:
    Got fatal error 1236 from master when reading data from binary log: ‘Could not find first log file name in binary log index file’

    tried changing binlon file to position to full path, but no luck, any suggestions?

  2. Thanks for clarifying that – it is easy to be misled by the seemingly-matching terminology.

    Do you mean “It does not help SHOW SLAVE STATUS has …” rather than “It does not help SHOW MASTER STATUS has …” in the fourth-to-last paragraph?

  3. peter says:

    Paul,

    The point is Master_Log_File exists both in CHANGE MASTER TO as parameter and SHOW MASTER STATUS as a column while they mean different thing (you should not take one from SHOW MASTER STATUS and pass it to CHANGE MASTER TO to restart replication)

  4. Erik says:

    It’s one of those niggley things which – as you say – is so very easy to do, but you feel ever so silly once you realise because it’s really so obvious! A true “I need to wear shades and a baseball cap so no one can see it’s me for the next six months”-moment.

  5. Peter, I think you’re missing Paul’s point: not only SHOW MASTER STATUS has a Master_Log_File column. SHOW SLAVE STATUS does too. And the value in SHOW SLAVE STATUS is the one that people wrongly use.

  6. peter says:

    Baron,

    Actually SHOW MASTER STATUS has column named “File” and SHOW SLAVE STATUS has Master_Log_File – check the example printouts above:

    Master_Log_File: db01-bin.003814

  7. PaulM says:

    So the bottom line is
    1) When you are using the master as a consistent snapshot, use SHOW MASTER STATUS to determine the position.
    2) When you are using a slave as a consistent snapshot, use SHOW SLAVE STATUS and Exec_Master_Log_Pos.

  8. Jure Pecar says:

    When setting up automated failover and then failback in a circular A->B->C->D->A replication, which values does one use for master_log_file and master_log_pos in ‘change master to’ statement?

  9. Jure, don’t do that. Here’s just one problem: statements can get into an infinite loop if you remove a failed server from the setup. There are many more problems with it, too. Never use more than two servers in plain master-master.

  10. Jure Pecar says:
  11. Jure Pecar says:

    Looks like mysql 6 is introducing MASTER_IGNORE_SERVER_ID to fix this problem. See http://bugs.mysql.com/bug.php?id=25998

  12. Just to be sure. In case when slave stops to work (create table statement on master specified InnoDB storage but slave is MyISAM only), what options do I have to fix replication? – master executed SQL without any problem but slave didn’t.
    1) I suppose I can read the position from the master and write it to the slave with “change master to master_log_file= …” statement
    2) I suppose I can skip problematic statement with SQL_SLAVE_SKIP_COUNTER on slave only
    3) is there any other option like:
    slave stop;
    change master to master_log_file='(binlog name in relay_master_log_file)’, master_log_pos=(exec_master_log_pos number);
    slave start;
    … on slave of course, or I missed the point.

  13. martir says:

    I have problem and I don’t know how to solve this problem. When I do the command “SHOW MASTER STATUS;’ I get resoult “Empty set (0.00 sec)”. I don’t know what to do. Please if someone can halp me. Thanks a lot

  14. Mat says:

    Am Using Multi master Replication, and everything works great
    The problem is when a failure happens or connection is lost , replication does not recover automatic
    I think its because the slave Position changes
    and i have to stop slaves and show master positions again then retype the change master into and insert there new values again to make it up again
    how can i avoid that ?! isnt there a way to make the slave know the position automatic ?! or even fix the position and not let it change ?

    Thanks in advance

  15. Hi Peter .
    Thanx for such a quality information.
    Can u please tell me where i can learn mysql administration concepts like replication and things like this.
    Because the organizxation in which i am working using mysql for various tasks nad there are a lots of slave and circular replication.
    I will appreciate if you please help in getting the concepts related to memcache.

    thanx.

  16. Kyle says:

    Hi,

    We are trying to add an additional server to our replication using the Relay_Master_Log_File/Exec_Master_Log_Pos of an existing slave and copying the data. When we try and restart the new slave with those values we get the following error.

    Got fatal error 1236 from master when reading data from binary log: ‘error reading log entry’

    Lots of posts have recommended increasing the max_allowed_packet and keeping them the same on the master/slave. We have done this with no success. When I run mysqlbinlog on the Exec_Master_Log_Pos the position is invalid and I get:

    ERROR: Error in Log_event::read_log_event(): ‘Event too big’, data_len: 1415072078, event_type: 82

    Looking at the binlog the Exec position is not valid in the binlog. Why would that be?

  17. Kyle,

    It looks like you either have wrong position (for whatever reason) or binary log is corrupted for some reason. It is unlikely you really have 1.5GB packet.

  18. Getting this error:
    Got fatal error 1236 from master when reading data from binary log: ‘Could not find first log file name in binary log index file’

    tried changing binlon file to position to full path, but no luck, any suggestions?

  19. Hi Peter, are you sure you are correct here on Relay_Master_Log_File:Exec_Master_Log_Pos being the values to use?

    I have a master-master replication scenario (2 nodes). If replication breaks on one node (lets say due to duplicate key insert) and I want to kickstart replication then I stop slaves on both, refresh mysql data (dump with –master-data) on passive node from active node, “change master to” on passive node using the values in the dump. Then to deal with the active node I also use “change master to” but I use the values fetched from “show master status” on the passive slave. It would be incorrect to use the Relay_Master_Log_File:Exec_Master_Log_Pos from the passive slave as you seemed to imply. I’m sure it’s just a misunderstanding and you didn’t mean this :-)

    Thanks,
    Imran

  20. Robin says:

    Hello,everybody.
    Does mysql have a feature to customize the slave port? I see my slave port is random.
    Thank you!

  21. Abhishek Chordia says:

    Is there any option to save the result of show slave status in a table like

    insert into temp show slave status;

Speak Your Mind

*