MySQL 5.6 allows you to execute replicated events in parallel as long as data is split across several databases. This feature is named “Multi-Threaded Slave” (MTS) and it is easy to enable by setting slave_parallel_workers to a > 1 value. However if you decide to use MTS without GTIDs, you may run into annoying issues. Let’s look at two of them.

Skipping replication errors

When replication stops with an error, a frequent approach is to “ignore now and fix later.” This means you will run SET GLOBAL sql_slave_skip_counter=1 to be able to restart replication as quickly as possible and later use pt-table-checksum/pt-table-sync to resync data on the slave.

Then the day when I hit:

I tried to use the trick:

But:

Note that the position reported with Exec_Master_Log_Pos has moved forward, but I still have my duplicate key error. What’s wrong?

The issue is that the positions reported by SHOW SLAVE STATUS are misleading when using MTS. Quoting the documentation about Exec_Master_Log_Pos:

When using a multi-threaded slave (by setting slave_parallel_workers to a nonzero value in MySQL 5.6.3 and later), the value in this column actually represents a “low-water” mark, before which no uncommitted transactions remain. Because the current implementation allows execution of transactions on different databases in a different order on the slave than on the master, this is not necessarily the position of the most recently executed transaction.

So the solution to my problem is first to make sure that there is no execution gap, and only then to skip the offending event. There is a specific statement for the first part:

And now I can finally skip the error and restart replication:

The last thing to do is of course to resync the slave.

Backups

If you cannot trust the output of SHOW SLAVE STATUS to get the current binlog position, it means that taking a backup from a slave with parallel replication is tricky.

For instance, if you run mysqldump --dump-slave=2 to get the binlog position of the master, mysqldump will first run STOP SLAVE and then SHOW SLAVE STATUS. Is stopping the slave sufficient to avoid execution gaps? Actually, no.

The only option then seems to be: run STOP SLAVE followed by START SLAVE UNTIL SQL_AFTER_MTS_GAPS, followed by mysqldump while replication is stopped. Not very handy!

GTIDs to the rescue!

The solution for both issues is to use GTIDs.

They help when you want to skip an event because when using GTIDs, you must explicitly specify the transaction you will be skipping. It doesn’t matter whether there are execution holes.

They also help for backups because mysqldump takes the position from gtid_executed which is updated at each transaction commit (XtraBackup does that too).

Conclusion

If your application uses several databases and if you’re fighting with replication lag, MTS can be a great feature for you. But although GTIDs are not technically necessary, you’ll be exposed to tricky situations if you don’t use them.

Is everything rosy when using both GTIDs and MTS? Not exactly… But that will be the topic for a separate post!

By the way, if you are in the Brussels area this weekend, come see me and other great speakers at the MySQL and friends devroom at FOSDEM!

9 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
tekslate

Thank you, its very useful

voja

How does the MTS feature fit with Percona XtraDB Cluster and parallel apply? Complementary/conflicting/not supported?

Jayaram

Thanks for the use full info,

Is it suggestable ? , to usage MTS without GTID while having mixed database engines, i.e. some of our databases are in MyISAM, and the remaining are in InnoDB.

Ian W

We’ve enabled parallel slave workers on a DB and are seeing some log messages, which, I don’t think are bad but can’t find anything in the documentation about.

2015-05-14 12:30:07 23709 [Note] Multi-threaded slave statistics: seconds elapsed = 123; events assigned = 646145; worker queues filled over overrun level = 0; waited due a Worker queue full = 0; waited due the total size = 0; slept when Workers occupied = 150007

Could you shed some light / point me at documentation that explains these?

Khan

Q Stephane,

is it recommended to use Multi-threaded replication if we have 1 database in a M-M scenario. Note that we are only writing to active. Or will this impact -vely?

Flavian

Current MTS setup is per database connection so if you have only 1 database MTS wont be of much use.

Bala Guntipalli

It’s A Great Pleasure reading your Article Bala Guntipalli Thanks for posting.

Bala Guntipalli

Excellent article, Cool, Looking ahead to reading a lot. Sensible article Bala Guntipalli Thanks for posting.