Where the open source database community meets: Use code PERCONA75 and secure your spot for Percona Live. Register

Downloads

Blog

When MySQL Lies: Wrong seconds_behind_master with slave_parallel_workers > 0

January 27, 2017

Author

Marcelo Altmann

Insight for DBAs

MySQL

Share this Post:

In today’s blog, I will show an issue with seconds_behind_master that one of our clients faced when running slave_parallel_workers > 0. We found out that the reported seconds_behind_master from SHOW SLAVE STATUS was lying. To be more specific, I’m talking about bugs #84415 and #1654091.

The Issue

MySQL will not report the correct slave lag if you have slave_parallel_workers > 0. Let’s show it in practice.

I’ll use MySQL Sandbox to speed up one master and two slaves on MySQL version 5.7.17, and sysbench to populate the database:

# Create sandboxes
make_replication_sandbox /path/to/mysql/5.7.17
# Create table with 1.5M rows on it
sysbench --test=/usr/share/sysbench/tests/db/oltp.lua --mysql-host=localhost --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox20192.sock --mysql-db=test --oltp-table-size=1500000 prepare
# Add slave_parallel_workers=5 and slave_pending_jobs_size_max=1G" on node1
echo "slave_parallel_workers=5" >> node1/my.sandbox.cnf
echo "slave_pending_jobs_size_max=1G" >> node1/my.sandbox.cnf
node1/restart

# Create sandboxes

make_replication_sandbox /path/to/mysql/5.7.17

# Create table with 1.5M rows on it

sysbench --test=/usr/share/sysbench/tests/db/oltp.lua --mysql-host=localhost --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox20192.sock --mysql-db=test --oltp-table-size=1500000 prepare

# Add slave_parallel_workers=5 and slave_pending_jobs_size_max=1G" on node1

echo "slave_parallel_workers=5" >> node1/my.sandbox.cnf

echo "slave_pending_jobs_size_max=1G" >> node1/my.sandbox.cnf

node1/restart

Monitor Replication lag via SHOW SLAVE STATUS:

for i in {1..1000}; 
do 
    (
        node1/use -e "SHOW SLAVE STATUSG" | grep "Seconds_Behind_Master" | awk '{print "Node1: " $2}' & 
        sleep 0.1 ; 
        node2/use -e "SHOW SLAVE STATUSG" | grep "Seconds_Behind_Master" | awk '{print "Node2: " $2}' &
    ); 
    sleep 1; 
done

for i in {1..1000};

(

node1/use -e "SHOW SLAVE STATUSG" | grep "Seconds_Behind_Master" | awk '{print "Node1: " $2}' &

sleep 0.1 ;

node2/use -e "SHOW SLAVE STATUSG" | grep "Seconds_Behind_Master" | awk '{print "Node2: " $2}' &

);

sleep 1;

done

On a separate terminal, DELETE some rows in the test.sbtest1 table on the master, and monitor the above output once the master completes the delete command:

DELETE FROM test.sbtest1 WHERE id > 100;

1	DELETE FROM test.sbtest1 WHERE id > 100;

Here is a sample output:

master [localhost] {msandbox} (test) > DELETE FROM test.sbtest1 WHERE id > 100;
Query OK, 1499900 rows affected (46.42 sec)

. . .
Node1: 0
Node2: 0
Node1: 0
Node2: 48
Node1: 0
Node2: 48
Node1: 0
Node2: 49
Node1: 0
Node2: 50
. . .
Node1: 0
Node2: 90
Node1: 0
Node2: 91
Node1: 0
Node2: 0
Node1: 0
Node2: 0

master [localhost] {msandbox} (test) > DELETE FROM test.sbtest1 WHERE id > 100;

Query OK, 1499900 rows affected (46.42 sec)

. . .

Node1: 0

Node2: 0

Node1: 0

Node2: 48

Node1: 0

Node2: 48

Node1: 0

Node2: 49

Node1: 0

Node2: 50

. . .

Node1: 0

Node2: 90

Node1: 0

Node2: 91

Node1: 0

Node2: 0

Node1: 0

Node2: 0

As you can see, node1 (which is running with slave_parallel_workers = 5) doesn’t report any lag.

The Workaround

We can workaround this issue by querying performance_schema.threads:

SELECT PROCESSLIST_TIME FROM performance_schema.threads WHERE NAME = 'thread/sql/slave_worker' AND (PROCESSLIST_STATE IS NULL  or PROCESSLIST_STATE != 'Waiting for an event from Coordinator') ORDER BY PROCESSLIST_TIME DESC LIMIT 1;

1	SELECT PROCESSLIST_TIME FROM performance_schema.threads WHERE NAME = 'thread/sql/slave_worker' AND (PROCESSLIST_STATE IS NULL or PROCESSLIST_STATE != 'Waiting for an event from Coordinator') ORDER BY PROCESSLIST_TIME DESC LIMIT 1;

Let’s modify our monitoring script, and use the above query to monitor the lag on node1:

for i in {1..1000}; 
do 
    (
        node1/use -BNe "SELECT PROCESSLIST_TIME FROM performance_schema.threads WHERE NAME = 'thread/sql/slave_worker' AND (PROCESSLIST_STATE IS NULL  or PROCESSLIST_STATE != 'Waiting for an event from Coordinator') ORDER BY PROCESSLIST_TIME DESC LIMIT 1 INTO @delay; SELECT IFNULL(@delay, 0) AS 'lag';" | awk '{print "Node1: " $1}' & 
        sleep 0.1 ; 
        node2/use -e "SHOW SLAVE STATUSG" | grep "Seconds_Behind_Master" | awk '{print "Node2: " $2}' &
    ); 
    sleep 1; 
done

for i in {1..1000};

(

node1/use -BNe "SELECT PROCESSLIST_TIME FROM performance_schema.threads WHERE NAME = 'thread/sql/slave_worker' AND (PROCESSLIST_STATE IS NULL or PROCESSLIST_STATE != 'Waiting for an event from Coordinator') ORDER BY PROCESSLIST_TIME DESC LIMIT 1 INTO @delay; SELECT IFNULL(@delay, 0) AS 'lag';" | awk '{print "Node1: " $1}' &

sleep 0.1 ;

node2/use -e "SHOW SLAVE STATUSG" | grep "Seconds_Behind_Master" | awk '{print "Node2: " $2}' &

);

sleep 1;

done

master [localhost] {msandbox} (test) > DELETE FROM test.sbtest1 WHERE id > 100;
Query OK, 1499900 rows affected (45.21 sec)

Node1: 0
Node2: 0
Node1: 0
Node2: 0
Node1: 45
Node2: 45
Node1: 46
Node2: 46
. . .
Node1: 77
Node2: 77
Node1: 78
Node2: 79
Node1: 0
Node2: 80
Node1: 0
Node2: 81
Node1: 0
Node2: 0
Node1: 0
Node2: 0

master [localhost] {msandbox} (test) > DELETE FROM test.sbtest1 WHERE id > 100;

Query OK, 1499900 rows affected (45.21 sec)

Node1: 0

Node2: 0

Node1: 0

Node2: 0

Node1: 45

Node2: 45

Node1: 46

Node2: 46

. . .

Node1: 77

Node2: 77

Node1: 78

Node2: 79

Node1: 0

Node2: 80

Node1: 0

Node2: 81

Node1: 0

Node2: 0

Node1: 0

Node2: 0

Please note that in our query to performance_schema.threads, we are filtering PROCESSLIST_STATE "NULL" and "!= Waiting for an event from Coordinator". The correct state is "Executing Event", but it seems like it doesn’t correctly report that state (#84655).

Summary

MySQL parallel replication is a nice feature, but we still need to make sure we are aware of any potential issues it might bring. Most monitoring systems use the output of SHOW SLAVE STATUS to verify whether or not the slave is lagging behind the master. As shown above, it has its caveats.

As always, we should test, test and test again before implementing any change like this in production!

0 0 votes

Article Rating

4 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Moll

9 years ago

Nice blog, Marcelo.
For sure, parallel replication is better then single-threaded in terms of slave lagging but I think – according to the bug – most of the benchmarking results and comparisons between Single-Threaded and Multi-Threaded-Slave which depend on the SHOW SLAVE STATUS output are giving fake and non-accurate results!

Andrew Gebo

9 years ago

For anyone who wants to get away from using replication check out ClustrixDB. It is a MySQL compatible database that offers far superior performance and scalability without sharding or in this case, replication.