EmergencyEMERGENCY? Get 24/7 Help Now!

Syncing MySQL slave table with pt-online-schema-change

 | September 17, 2014 |  Posted In: Insight for DBAs, MySQL, Percona Toolkit

PREVIOUS POST
NEXT POST

I recently encountered a situation in which after running Percona Toolkit’s pt-table-checksum on a customer system, 95% of the table on the MySQL master was different on the MySQL slave. Although this table was not a critical part of the infrastructure, from time to time, writes to the table from the master would break replication. Additionally, this table has about 6 million rows, and running pt-table-sync would take sometime. Rebuilding the slave from backup of the master would not be an easy option as well since the slave acts as an archive where it has a lot more data than the master.

So how did we solve it? With pt-online-schema-change and a NOOP ALTER.

How is it possible? pt-online-schema-change works by creating a shadow copy of the original table and start copying the rows from the latter to the former. An additional set of TRIGGERs also ensures that any additional changes to existing rows after they have been copied to the shadow version will propagate.

There is little one caveat though, the  binlog_format  on the master would have to be ROW so the actual ROW images from the master would be copied to the slave. If your binlog_format is set to STATEMENT, you’d still end up with the same inconsistency. Since change statements will be logged as STATEMENT, the NOOP ALTER will operate on the slave table instead of copying from the master. You can configure the binlog_format for the alter process via the –set-vars option i.e.  --set-vars 'binlog_format=ROW'  to make it more easier!

PREVIOUS POST
NEXT POST
Jervin Real

As Senior Consultant, Jervin partners with Percona's customers on building reliable and highly performant MySQL infrastructures while also doing other fun stuff like watching cat videos on the internet. Jervin joined Percona in Apr 2010.

4 Comments

  • @Alex,

    pt-table-sync relies on the same code as pt-table-checksum, as such for any chunks that are calculated as different, each row is inspected for possible change correction. Not in all case pt-online-schema-change is faster, for example, if you only have a few rows difference then pt-table-sync is still the way to go, however if its a large table and simply a large percentage has drifted, the method described could be faster.

    @Karl,

    It depends really, I’d say go for ROW if you can, but if you have certain dependencies on STATEMENT then choose MIXED or STATEMENT altogether.

Leave a Reply