Buy Percona ServicesBuy Now!

Making replication a bit more reliable

 | March 4, 2009 |  Posted In: Percona Software


Running MySQL slave is quite common and regular task which we do every day, taking backups from slave is often recommended solution. However the current state of MySQL replication makes restoring slave a bit tricky (if possible at all). The main problem is that InnoDB transaction state and replication state are not synchronized. If we speak about backup and you can execute SHOW SLAVE STATUS command you can get reliable information about current state, but some solutions does not allow that. Look for example Sun Storage 7410, which provides storage via NFS and where you can make ZFS snapshots without any info what kind of data you are storing there. What makes situation worse is that files with replication state (, are not synchronized on disk after each update, and even wrose – in case with NFS they are stored on client side OS/NFS cache for long time. As solution we can do patch to execute fsync() for these files after each write, but I can’t predict how much performance penalty we will see here, I expect it will be very significant.

Our idea is not new, it was taken from TransactionalReplication patch and bug report. Basically we want to store the state of replication in InnoDB transactional log file, in this case we will be able to see what position in replication the last executed transaction corresponds to. Of course it will work if you have writes exclusively to InnoDB storage engine, case with mix of storage engines is much more complex and I do not see easy way to solve it.

So we propose overwrite_relay_log_info extension for XtraDB storage engine, the name comes from fact that XtraDB will try to rewrite by replication position, or at least this info is available in error-log output, so you can repoint your slave to correct position by executing CHANGE MASTER command.

Currently patch is available in Launchpad lp:~percona-dev/percona-xtradb/overwrite-relay-log-info , and after testing it will go to main XtraDB tree. If there is interest the patch will be ported to 5.0 and 5.1 trees. Some info available at this documentation page.

Vadim Tkachenko

Vadim Tkachenko co-founded Percona in 2006 and serves as its Chief Technology Officer. Vadim leads Percona Labs, which focuses on technology research and performance evaluations of Percona’s and third-party products. Percona Labs designs no-gimmick tests of hardware, filesystems, storage engines, and databases that surpass the standard performance and functionality scenario benchmarks. Vadim’s expertise in LAMP performance and multi-threaded programming help optimize MySQL and InnoDB internals to take full advantage of modern hardware. Oracle Corporation and its predecessors have incorporated Vadim’s source code patches into the mainstream MySQL and InnoDB products. He also co-authored the book High Performance MySQL: Optimization, Backups, and Replication 3rd Edition.


  • hi, i have a ‘Master’ DB replicating to 3 ‘Slaves’ DB (with InnoDB engine) and it’s a bit difficult to restore when some slave stops replicate… if im not wrong this “patch/feature” will help me to restore replication rigth?
    thanks in advance and sry for my ‘not very good’ english 🙂
    btw, great mysql blog.

  • Alex,

    It may help, or may not. To be sure I need to know what exact problem you have with restoring replication.

  • hi. Sometimes i got (in slave) the “Duplicate key entry” error (i see this in the ‘show slave status’ Last_error) always with an insert query. Dont know exactly why but just happened in 1 of 3 slave, and the others was fine and running normally. With Myisam just need “load data from master” but with InnoDB we have to download and run a backup of the Master DB and it takes some minutes….. i have this problem around 1 time every 3 months but im looking for a better solution..
    thanks again.

  • Alex, I have run into similar duplicate key errors in the past. It turns out the slave I was using to take backups had slightly different data than the master. You may want to do a consistency check using Maatkit just to be sure.

  • Alex P, Gil

    Actually this patch can fix that issue.

    The problem is when your slave crash or some another problem – contains old information, and when you do slave start – slave tries to repeat transactions which were already executed (and you getting Duplicate key error).

    Our patch tries to fix that and change to the real last transaction.

    However Dublicate key error may be related to different issues also, so I can’t say for sure our patch is solution.

  • yes i know the “duplicate key error” may be related to different issues (sometimes its isp problem, manual inserts, our slaves uses dyndns and the update of the service sometime make it crash, etc ), the way i resolve the crash is making a backup of the master, send this backup to the slave (via sftp) then i execute this backup, delete the ‘relay’ files and restart the mysql service… after that slave is runnning ok again. I know this “solution” is not very good (what do you think? ) so im trying to find another one, that’s why i think this patch/feature will help me.

  • Alex P,

    You could also do:


    So the replication will jump that Dup Key error and continue with the next statement of the replication.

  • @Aric,

    Skipping errors by specifying “slave-skip-errors = all” in my.cnf is quite dangerous, since you will not even be aware of replication issues; with Ernesto’s solution, you are at least forced into manual intervention, which may supply you with enlightenment about the problem’s origins.

    Instead, your replicated data will keep on moving away from master’s data.

  • Your slave port is random because you’re looking at it’s ephemeral port. The slave does not listen for incoming connections from the master, but is rather a client of the master.

    I imagine you were asking this question because you were trying to figure out which ports to allow through your firewall. If you’ll allow all established TCP connections, you’ll solve this problem. In IPtables, it would look something like this:


    I know it’s been a long time since you posted, but thought someone else might find this useful.

Comments are closed