Another reason why SQL_SLAVE_SKIP_COUNTER is bad in MySQL

It is everywhere in the world of MySQL that if your replication is broken because an event caused a duplicate key or a row was not found and it cannot be updated or deleted, then you can use ‘ STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE; ‘ and be done with it. In some cases this is fine and you can repair the offending row or statements later on. But what if the statement is part of a multi-statement transaction? Well, then it becomes more interesting, because skipping the offending statement will cause the whole transaction to be skipped. This is well documented in the manual by the way. So here’s a quick example.

3 rows on the master:

2 on the slave:

Execute a transaction on the master to break replication:

Broken slave:

An attempt to fix replication only caused bigger inconsistencies on slave:

This happens because the replication honors transaction boundaries, and is definitely something you should consider the next time you try to use this workaround on a broken slave. Of course, there is pt-table-checksum and pt-table-sync to rescue you when inconsistencies occur, however, prevention is always better than cure. Make sure to put safeguards in place to prevent your slaves from drifting.

Lastly, the example above is for ROW-based replication as my colleague pointed out, but can similarly happen with STATEMENT for example with a duplicate key error.  You can optionally fix the error above by temporarily setting slave_exec_mode to IDEMPOTENT so errors because of missing rows are skipped, but then again, it does not apply in all cases like an UPDATE statement that cannot be applied because the row on the slave is missing.

Here is a demonstration of the problem with STATEMENT-based replication:


Share this post

Comments (8)

  • vishnu rao

    An eye opener. Cool post.

    July 23, 2013 at 12:23 pm
  • quaker

    I wonder, how anyone can use in production MySQL database knowing, that every thing is made “partial”. We give you row based replication which is, quoting from manual: “(..) is the same as in most other database management systems; knowledge about other systems transfers to MySQL.”, but hey – it can make your slave drift, because we broken something. Of course you can report this to our bug system, and maybe in this decade we will fix it.

    Yes, this is my point of view and maybe I dramatize to much. But for example – PostgreSQL had make it replication “slave drift safe”. I wonder how they could done that and MySQL is unable to do it for a long time.

    And now technical question. What is the reason that such errors are likely to happen? I mean – database is running and some day you noticed that your slave is out if sync? Mysql is missing something like CRCs for records? I assume that queries sent to master are written to be “replication safe”.

    July 24, 2013 at 3:08 pm
  • Jervin Real


    There is a number of reasons your slave can get out of sync i.e. a crash on the master where sync_binlog = 0, writing directly on the slave, abuse of sql_log_bin, non-deterministic queries, etc. Many of these are likely to be user triggered choices or mistake if not bugs. On Percona Server 5.5 and MySQL 5.6 there are additional measures to prevent some of these like crash-safe-replication binary log checksums while some measures are left for the users to implement i.e. setting your slaves read_only and limiting users with SUPER privileges.

    July 24, 2013 at 8:49 pm
  • Shlomi Noach

    Eye opener indeed! I didn’t know it killed the entire transaction. Good stuff.

    July 25, 2013 at 7:01 am
  • ANKIT Khedulkar

    To fix these types of Replication issues, you need three things:
    1) MySQL Master Backup(FULL)
    2)Binary log in text file
    3)Position at binary log at which last backup completed.

    Thats all! The chances of replication break is reduced. 🙂

    December 18, 2014 at 1:07 am
  • Andres Gomez Aragoneses

    What if the error to skip is simply a propagation of a password? I imagine this password setting statement would not be part of a transaction so it would be fine to SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1, right?

    (I come from here: )

    September 16, 2016 at 3:28 pm
  • Jervin Real


    If its a single statement, I assume it would, then you should be fine to skip and correct directly on the slave if needed.

    September 16, 2016 at 7:31 pm
  • Gulab Pasha

    Hi Guys,

    I have enabled MySQL Master Slave replication, but unfortunately getting the following error.

    Last_SQL_Errno: 1032
    Last_SQL_Error: Could not execute Update_rows event on table XXXXXXX; Can’t find record in ‘txn_re’port_accrual_redemption, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event’s master log mysql-bin.000119, end_log_pos 12537772

    July 24, 2019 at 1:16 pm

Comments are closed.

Use Percona's Technical Forum to ask any follow-up questions on this blog topic.