EmergencyEMERGENCY? Get 24/7 Help Now!

Estimating potential for MySQL 5.7 parallel replication

 | February 10, 2016 |  Posted In: MySQL

PREVIOUS POST
NEXT POST

parallel replicationUnlike MySQL 5.6, where parallel replication can only be used when replicas have several schemas, MySQL 5.7 replicas can read binlog group commit information coming from the master to replicate transactions in parallel even when a single schema is used. Now the question is: how many replication threads should you use?

A simple benchmark

Let’s assume we have one master and three slaves, all running MySQL 5.7.

One slave is using regular single-threaded replication (the control slave in the graph below), one is using 20 parallel workers (MTS 20 workers) and the last one is using 100 parallel workers (MTS 100 workers).

As a reminder, here is the settings that need to be adjusted for 5.7 parallel replication:

GTID replication is also highly recommended, if you don’t want to face annoying issues.

Now let’s run a simple sysbench workload, inserting records in 25 separate tables in the same database with 100 concurrent threads:

Because we’re using 100 concurrent threads on the master, we can expect that some parallelization is possible. This means that if we see replication lag with the control slave, we’ll probably see less lag with the 20-worker slave and even less with the 100-worker slave.

This is not exactly what we get:

lag_all

Parallel replication is indeed useful, but the 100-worker slave doesn’t provide any benefits compared to the 20-worker slave. Replication lag is even slightly worse.

What happened?

Some instrumentation with performance_schema

To have a better understanding of how efficiently the parallel replication threads are used, let’s enable some instrumentation on slaves (in other words, recording executed transactions):

For better readability of the results, let’s create a new view (tracking how many transactions are executed by each replication thread):

Now, after running sysbench again, let’s do some math to see how often each replication thread was run:

We can see that the workload has a limited potential for parallelism – therefore, it’s not worth configuring more than 3-4 replication threads.

The slight performance degradation with 100 replication threads is probably due to the overhead of the coordinator thread.

Conclusion

Estimating the optimal number of replication threads with MySQL 5.7 parallel replication is quite difficult if your just guessing. The performance_schema provides a simple way to understand how the workload is handled by the replication threads.

It also allows you to see if tuning binlog_group_commit_sync_delay provides more throughput on slaves without too much impact on the master’s performance.

PREVIOUS POST
NEXT POST
Stephane Combaudon

Stéphane joined Percona in July 2012, after working as a MySQL DBA for leading French companies such as Dailymotion and France Telecom. In real life, he lives in Paris with his wife and their twin daughters. When not in front of a computer or not spending time with his family, he likes playing chess and hiking.

6 Comments

  • Vadim,

    I was surprised by the result too and I don’t have a good explanation. The workload may be too simple with INSERTs running too fast to get any chance to get committed in parallel on the master.

    If I have time, I’ll try to run the same test but with a different workload (like UPDATEs with a secondary key).

  • Hi Stephane,

    Maybe some more tuning is possible to improve performance.

    – Tuning binlog_group_commit_sync_delay and binlog_group_commit_sync_no_delay_count (and keeping sync_binlog=1) on the master could increase group commits and increase the potential to execute queries in parallel. In MariaDB you can actually see status of how good much group commit is happening by looking at Binlog_commits and Binlog_group_commits (don’t know yet how to do that in MySQL 5.7).
    – Having a larger slave_pending_jobs_size_max on the slave might increase performance as well, allthough I think that’s not the case in your test.

  • Stéphan, very nice blog post, congrats!

    Can you do the same test but having master configured with:

    [mysqld]
    binlog_group_commit_sync_delay=1000000 (default value)
    binlog_group_commit_sync_no_delay_count=20

    and then, slaves with:

    slave_parallel_type = LOGICAL_CLOCK
    slave_parallel_workers = 20

    I think the big improvement as well as tested by Vadim before is the writeback cache
    https://www.percona.com/blog/2011/07/13/testing-the-group-commit-fix/

    Thanks.

Leave a Reply