Making replication a bit more reliable

Running MySQL slave is quite common and regular task which we do every day, taking backups from slave is often recommended solution. However the current state of MySQL replication makes restoring slave a bit tricky (if possible at all). The main problem is that InnoDB transaction state and replication state are not synchronized. If we speak about backup and you can execute SHOW SLAVE STATUS command you can get reliable information about current state, but some solutions does not allow that. Look for example Sun Storage 7410, which provides storage via NFS and where you can make ZFS snapshots without any info what kind of data you are storing there. What makes situation worse is that files with replication state (, are not synchronized on disk after each update, and even wrose – in case with NFS they are stored on client side OS/NFS cache for long time. As solution we can do patch to execute fsync() for these files after each write, but I can’t predict how much performance penalty we will see here, I expect it will be very significant.

Our idea is not new, it was taken from TransactionalReplication patch and bug report. Basically we want to store the state of replication in InnoDB transactional log file, in this case we will be able to see what position in replication the last executed transaction corresponds to. Of course it will work if you have writes exclusively to InnoDB storage engine, case with mix of storage engines is much more complex and I do not see easy way to solve it.

So we propose overwrite_relay_log_info extension for XtraDB storage engine, the name comes from fact that XtraDB will try to rewrite by replication position, or at least this info is available in error-log output, so you can repoint your slave to correct position by executing CHANGE MASTER command.

Currently patch is available in Launchpad lp:~percona-dev/percona-xtradb/overwrite-relay-log-info , and after testing it will go to main XtraDB tree. If there is interest the patch will be ported to 5.0 and 5.1 trees. Some info available at this documentation page.

Share this post

Comments (15)

  • Shlomi Noach

    Very nice!!

    March 4, 2009 at 10:51 pm
  • Domas Mituzas

    it is my favorite feature from google patch 😉

    March 5, 2009 at 4:45 am
  • Alex P.

    hi, i have a ‘Master’ DB replicating to 3 ‘Slaves’ DB (with InnoDB engine) and it’s a bit difficult to restore when some slave stops replicate… if im not wrong this “patch/feature” will help me to restore replication rigth?
    thanks in advance and sry for my ‘not very good’ english 🙂
    btw, great mysql blog.

    March 5, 2009 at 11:09 am
  • Vadim


    It may help, or may not. To be sure I need to know what exact problem you have with restoring replication.

    March 5, 2009 at 1:30 pm
  • Alex P.

    hi. Sometimes i got (in slave) the “Duplicate key entry” error (i see this in the ‘show slave status’ Last_error) always with an insert query. Dont know exactly why but just happened in 1 of 3 slave, and the others was fine and running normally. With Myisam just need “load data from master” but with InnoDB we have to download and run a backup of the Master DB and it takes some minutes….. i have this problem around 1 time every 3 months but im looking for a better solution..
    thanks again.

    March 5, 2009 at 6:06 pm
  • Gil

    Alex, I have run into similar duplicate key errors in the past. It turns out the slave I was using to take backups had slightly different data than the master. You may want to do a consistency check using Maatkit just to be sure.

    March 6, 2009 at 10:03 am
  • Vadim

    Alex P, Gil

    Actually this patch can fix that issue.

    The problem is when your slave crash or some another problem – contains old information, and when you do slave start – slave tries to repeat transactions which were already executed (and you getting Duplicate key error).

    Our patch tries to fix that and change to the real last transaction.

    However Dublicate key error may be related to different issues also, so I can’t say for sure our patch is solution.

    March 6, 2009 at 11:05 am
  • Alex P.

    yes i know the “duplicate key error” may be related to different issues (sometimes its isp problem, manual inserts, our slaves uses dyndns and the update of the service sometime make it crash, etc ), the way i resolve the crash is making a backup of the master, send this backup to the slave (via sftp) then i execute this backup, delete the ‘relay’ files and restart the mysql service… after that slave is runnning ok again. I know this “solution” is not very good (what do you think? ) so im trying to find another one, that’s why i think this patch/feature will help me.

    March 6, 2009 at 11:46 am
  • Ernesto Vargas

    Alex P,

    You could also do:


    So the replication will jump that Dup Key error and continue with the next statement of the replication.

    March 9, 2009 at 11:24 pm
  • Alex P.

    thank you. i will try it and find if can solve the problem.


    March 17, 2009 at 5:41 pm
  • Aric

    Also, in my.cnf you can skip replication errors.

    slave-skip-errors = all

    November 15, 2009 at 12:17 am
  • Baron Schwartz

    These “solutions” make replication LESS reliable and MORE likely to have further problems! FIX the problem, don’t hide it.

    November 15, 2009 at 8:21 am
  • Shlomi Noach


    Skipping errors by specifying “slave-skip-errors = all” in my.cnf is quite dangerous, since you will not even be aware of replication issues; with Ernesto’s solution, you are at least forced into manual intervention, which may supply you with enlightenment about the problem’s origins.

    Instead, your replicated data will keep on moving away from master’s data.

    November 16, 2009 at 3:53 am
  • Robin

    Does mysql have a feature to customize the slave port? I see my slave port is random.
    Thank you!

    November 26, 2011 at 9:50 pm
  • James Parks

    Your slave port is random because you’re looking at it’s ephemeral port. The slave does not listen for incoming connections from the master, but is rather a client of the master.

    I imagine you were asking this question because you were trying to figure out which ports to allow through your firewall. If you’ll allow all established TCP connections, you’ll solve this problem. In IPtables, it would look something like this:


    I know it’s been a long time since you posted, but thought someone else might find this useful.

    August 18, 2013 at 8:35 am

Comments are closed.

Use Percona's Technical Forum to ask any follow-up questions on this blog topic.