Where the open source community meets: Secure your spot for Percona Live Amsterdam! - Register

Downloads

Blog

Multi-threaded replication with MySQL 5.6: Use GTIDs!

January 29, 2015

Author

Stephane Combaudon

Insight for DBAs

MySQL

Share this Post:

MySQL 5.6 allows you to execute replicated events in parallel as long as data is split across several databases. This feature is named “Multi-Threaded Slave” (MTS) and it is easy to enable by setting slave_parallel_workers to a > 1 value. However if you decide to use MTS without GTIDs, you may run into annoying issues. Let’s look at two of them.

Skipping replication errors

When replication stops with an error, a frequent approach is to “ignore now and fix later.” This means you will run SET GLOBAL sql_slave_skip_counter=1 to be able to restart replication as quickly as possible and later use pt-table-checksum/pt-table-sync to resync data on the slave.

Then the day when I hit:

mysql> show slave status;
[...]
Last_SQL_Error: Worker 0 failed executing transaction '' at master log mysql-bin.000017, end_log_pos 1216451; Error 'Duplicate entry '1001' for key 'PRIMARY'' on query. Default database: 'db1'. Query: 'INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 5320, '49123511666-22272014664-85739796464-62261637750-57593947547-00947134109-73607171516-11063345053-55659776318-82888369235', '11400300639-05875856680-20973514928-29618434959-69429576205')'
Exec_Master_Log_Pos: 1005432

mysql> show slave status;

[...]

Last_SQL_Error: Worker 0 failed executing transaction '' at master log mysql-bin.000017, end_log_pos 1216451; Error 'Duplicate entry '1001' for key 'PRIMARY'' on query. Default database: 'db1'. Query: 'INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 5320, '49123511666-22272014664-85739796464-62261637750-57593947547-00947134109-73607171516-11063345053-55659776318-82888369235', '11400300639-05875856680-20973514928-29618434959-69429576205')'

Exec_Master_Log_Pos: 1005432

I tried to use the trick:

mysql> set global sql_slave_skip_counter=1;
mysql> start slave;

1 2	mysql> set global sql_slave_skip_counter=1; mysql> start slave;

But:

mysql> show slave status;
[...]
Last_SQL_Error: Worker 0 failed executing transaction '' at master log mysql-bin.000017, end_log_pos 1216451; Error 'Duplicate entry '1001' for key 'PRIMARY'' on query. Default database: 'db1'. Query: 'INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 5320, '49123511666-22272014664-85739796464-62261637750-57593947547-00947134109-73607171516-11063345053-55659776318-82888369235', '11400300639-05875856680-20973514928-29618434959-69429576205')'
Exec_Master_Log_Pos: 1005882

mysql> show slave status;

[...]

Exec_Master_Log_Pos: 1005882

Note that the position reported with Exec_Master_Log_Pos has moved forward, but I still have my duplicate key error. What’s wrong?

The issue is that the positions reported by SHOW SLAVE STATUS are misleading when using MTS. Quoting the documentation about Exec_Master_Log_Pos:

When using a multi-threaded slave (by setting slave_parallel_workers to a nonzero value in MySQL 5.6.3 and later), the value in this column actually represents a “low-water” mark, before which no uncommitted transactions remain. Because the current implementation allows execution of transactions on different databases in a different order on the slave than on the master, this is not necessarily the position of the most recently executed transaction.

So the solution to my problem is first to make sure that there is no execution gap, and only then to skip the offending event. There is a specific statement for the first part:

mysql> start slave until sql_after_mts_gaps;

1	mysql> start slave until sql_after_mts_gaps;

And now I can finally skip the error and restart replication:

mysql> set global sql_slave_skip_counter=1;
mysql> start slave;
mysql> show slave statusG 
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

mysql> set global sql_slave_skip_counter=1;

mysql> start slave;

mysql> show slave statusG

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

The last thing to do is of course to resync the slave.

Backups

If you cannot trust the output of SHOW SLAVE STATUS to get the current binlog position, it means that taking a backup from a slave with parallel replication is tricky.

For instance, if you run mysqldump --dump-slave=2 to get the binlog position of the master, mysqldump will first run STOP SLAVE and then SHOW SLAVE STATUS. Is stopping the slave sufficient to avoid execution gaps? Actually, no.

The only option then seems to be: run STOP SLAVE followed by START SLAVE UNTIL SQL_AFTER_MTS_GAPS, followed by mysqldump while replication is stopped. Not very handy!

GTIDs to the rescue!

The solution for both issues is to use GTIDs.

They help when you want to skip an event because when using GTIDs, you must explicitly specify the transaction you will be skipping. It doesn’t matter whether there are execution holes.

They also help for backups because mysqldump takes the position from gtid_executed which is updated at each transaction commit (XtraBackup does that too).

Conclusion

If your application uses several databases and if you’re fighting with replication lag, MTS can be a great feature for you. But although GTIDs are not technically necessary, you’ll be exposed to tricky situations if you don’t use them.

Is everything rosy when using both GTIDs and MTS? Not exactly… But that will be the topic for a separate post!

By the way, if you are in the Brussels area this weekend, come see me and other great speakers at the MySQL and friends devroom at FOSDEM!

0 0 votes

Article Rating

9 Comments

Oldest

Newest Most Voted

tekslate

11 years ago

Thank you, its very useful

voja

11 years ago

How does the MTS feature fit with Percona XtraDB Cluster and parallel apply? Complementary/conflicting/not supported?

Author

Stephane Combaudon

11 years ago

Voja,

MTS is only used by MySQL replication, not by Galera replication, so parallel apply in XtraDB Cluster and MTS are independent features.

If you have a cluster and if one of the nodes has an async replica, it can make sense to enable the MTS feature on the async replica if you have several databases in your schema. But it won’t have any impact on what happens inside the cluster.

Jayaram

11 years ago

Thanks for the use full info,

Is it suggestable ? , to usage MTS without GTID while having mixed database engines, i.e. some of our databases are in MyISAM, and the remaining are in InnoDB.

Ian W

11 years ago

We’ve enabled parallel slave workers on a DB and are seeing some log messages, which, I don’t think are bad but can’t find anything in the documentation about.

2015-05-14 12:30:07 23709 [Note] Multi-threaded slave statistics: seconds elapsed = 123; events assigned = 646145; worker queues filled over overrun level = 0; waited due a Worker queue full = 0; waited due the total size = 0; slept when Workers occupied = 150007

Could you shed some light / point me at documentation that explains these?

Khan

10 years ago

Q Stephane,

is it recommended to use Multi-threaded replication if we have 1 database in a M-M scenario. Note that we are only writing to active. Or will this impact -vely?