EmergencyEMERGENCY? Get 24/7 Help Now!

Multi-threaded replication with MySQL 5.6: Use GTIDs!

 | January 29, 2015 |  Posted In: Insight for DBAs, MySQL

PREVIOUS POST
NEXT POST

MySQL 5.6 allows you to execute replicated events in parallel as long as data is split across several databases. This feature is named “Multi-Threaded Slave” (MTS) and it is easy to enable by setting slave_parallel_workers to a > 1 value. However if you decide to use MTS without GTIDs, you may run into annoying issues. Let’s look at two of them.

Skipping replication errors

When replication stops with an error, a frequent approach is to “ignore now and fix later.” This means you will run SET GLOBAL sql_slave_skip_counter=1 to be able to restart replication as quickly as possible and later use pt-table-checksum/pt-table-sync to resync data on the slave.

Then the day when I hit:

I tried to use the trick:

But:

Note that the position reported with Exec_Master_Log_Pos has moved forward, but I still have my duplicate key error. What’s wrong?

The issue is that the positions reported by SHOW SLAVE STATUS are misleading when using MTS. Quoting the documentation about Exec_Master_Log_Pos:

When using a multi-threaded slave (by setting slave_parallel_workers to a nonzero value in MySQL 5.6.3 and later), the value in this column actually represents a “low-water” mark, before which no uncommitted transactions remain. Because the current implementation allows execution of transactions on different databases in a different order on the slave than on the master, this is not necessarily the position of the most recently executed transaction.

So the solution to my problem is first to make sure that there is no execution gap, and only then to skip the offending event. There is a specific statement for the first part:

And now I can finally skip the error and restart replication:

The last thing to do is of course to resync the slave.

Backups

If you cannot trust the output of SHOW SLAVE STATUS to get the current binlog position, it means that taking a backup from a slave with parallel replication is tricky.

For instance, if you run mysqldump --dump-slave=2 to get the binlog position of the master, mysqldump will first run STOP SLAVE and then SHOW SLAVE STATUS. Is stopping the slave sufficient to avoid execution gaps? Actually, no.

The only option then seems to be: run STOP SLAVE followed by START SLAVE UNTIL SQL_AFTER_MTS_GAPS, followed by mysqldump while replication is stopped. Not very handy!

GTIDs to the rescue!

The solution for both issues is to use GTIDs.

They help when you want to skip an event because when using GTIDs, you must explicitly specify the transaction you will be skipping. It doesn’t matter whether there are execution holes.

They also help for backups because mysqldump takes the position from gtid_executed which is updated at each transaction commit (XtraBackup does that too).

Conclusion

If your application uses several databases and if you’re fighting with replication lag, MTS can be a great feature for you. But although GTIDs are not technically necessary, you’ll be exposed to tricky situations if you don’t use them.

Is everything rosy when using both GTIDs and MTS? Not exactly… But that will be the topic for a separate post!

By the way, if you are in the Brussels area this weekend, come see me and other great speakers at the MySQL and friends devroom at FOSDEM!

PREVIOUS POST
NEXT POST
Stephane Combaudon

Stéphane joined Percona in July 2012, after working as a MySQL DBA for leading French companies such as Dailymotion and France Telecom. In real life, he lives in Paris with his wife and their twin daughters. When not in front of a computer or not spending time with his family, he likes playing chess and hiking.

7 Comments

  • Voja,

    MTS is only used by MySQL replication, not by Galera replication, so parallel apply in XtraDB Cluster and MTS are independent features.

    If you have a cluster and if one of the nodes has an async replica, it can make sense to enable the MTS feature on the async replica if you have several databases in your schema. But it won’t have any impact on what happens inside the cluster.

  • Thanks for the use full info,

    Is it suggestable ? , to usage MTS without GTID while having mixed database engines, i.e. some of our databases are in MyISAM, and the remaining are in InnoDB.

  • We’ve enabled parallel slave workers on a DB and are seeing some log messages, which, I don’t think are bad but can’t find anything in the documentation about.

    2015-05-14 12:30:07 23709 [Note] Multi-threaded slave statistics: seconds elapsed = 123; events assigned = 646145; worker queues filled over overrun level = 0; waited due a Worker queue full = 0; waited due the total size = 0; slept when Workers occupied = 150007

    Could you shed some light / point me at documentation that explains these?

  • Q Stephane,

    is it recommended to use Multi-threaded replication if we have 1 database in a M-M scenario. Note that we are only writing to active. Or will this impact -vely?

Leave a Reply