Performance Schema
In many types of database workloads, using a multi-threaded slave from 5.6+ helps improve replication performance. I’ve had a number of users enable this feature, but have not seen anyone ask how each thread is performing. Here’s a quick way with Performance_Schema to measure the amount of multi-threaded slave activity on each thread (after you have already configured MTS on your slave of course ).
First, we need to enable the statements instruments:
1 2 3 |
slave1> UPDATE setup_consumers SET ENABLED = 'YES' WHERE NAME LIKE 'events_statements_%'; Query OK, 2 rows affected (0.00 sec) Rows matched: 3 Changed: 2 Warnings: 0 |
Next, let’s find the THREAD_ID for our slave workers:
1 2 3 4 5 6 7 8 9 10 |
slave1> SELECT THREAD_ID, NAME FROM threads WHERE NAME LIKE '%worker'; +-----------+-------------------------+ | THREAD_ID | NAME | +-----------+-------------------------+ | 28 | thread/sql/slave_worker | | 29 | thread/sql/slave_worker | | 30 | thread/sql/slave_worker | | 31 | thread/sql/slave_worker | +-----------+-------------------------+ 4 rows in set (0.00 sec) |
And then we capture measurements:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
slave1> SELECT THREAD_ID, EVENT_NAME, COUNT_STAR FROM events_statements_summary_by_thread_by_event_name WHERE THREAD_ID IN (28,29,30,31) AND COUNT_STAR > 0; +-----------+----------------------+------------+ | THREAD_ID | EVENT_NAME | COUNT_STAR | +-----------+----------------------+------------+ | 28 | statement/sql/update | 48 | | 28 | statement/sql/insert | 24 | | 28 | statement/sql/delete | 24 | | 28 | statement/sql/begin | 24 | | 29 | statement/sql/update | 68 | | 29 | statement/sql/insert | 34 | | 29 | statement/sql/delete | 34 | | 29 | statement/sql/begin | 34 | | 30 | statement/sql/update | 1864 | | 30 | statement/sql/insert | 932 | | 30 | statement/sql/delete | 932 | | 30 | statement/sql/begin | 932 | | 31 | statement/sql/update | 40320 | | 31 | statement/sql/insert | 20160 | | 31 | statement/sql/delete | 20160 | | 31 | statement/sql/begin | 20160 | +-----------+----------------------+------------+ 16 rows in set (0.01 sec) |
As you can see above, one of the multi-threadedthreads is doing more work compared to the other three. This is a sysbench on the master in the database doing --trx-rate=10, compared to the other three which are only doing --trx-rate=2|5.
Below is another example, this time including the event times. With a slightly modified sysbench test, I’ve designed the tests to send large updates but one with a low transaction rate, another with a high transaction rate, and the rest in between. You can see specifically in threads 32 and 34, that the former is performing a lower amount of UPDATES, but spending more time per event.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
slave1> SELECT THREAD_ID, EVENT_NAME, COUNT_STAR, -> SUM_TIMER_WAIT/1000000 AS SUM_TIMER_WAIT_MS, -> (SUM_TIMER_WAIT/COUNT_STAR)/1000000 AS AVG_TIME_MS -> FROM events_statements_summary_by_thread_by_event_name -> WHERE THREAD_ID IN (31,32,33,34) AND COUNT_STAR > 0; +-----------+----------------------+------------+-------------------+--------------+ | THREAD_ID | EVENT_NAME | COUNT_STAR | SUM_TIMER_WAIT_MS | AVG_TIME_MS | +-----------+----------------------+------------+-------------------+--------------+ | 31 | statement/sql/update | 3 | 403.9690 | 134.65633333 | | 31 | statement/sql/insert | 1 | 91.5340 | 91.53400000 | | 31 | statement/sql/delete | 1 | 75.9540 | 75.95400000 | | 31 | statement/sql/begin | 1 | 11.6800 | 11.68000000 | | 32 | statement/sql/update | 75 | 25298.5090 | 337.31345333 | | 32 | statement/sql/insert | 25 | 2221.1410 | 88.84564000 | | 32 | statement/sql/delete | 25 | 2187.8970 | 87.51588000 | | 32 | statement/sql/begin | 25 | 321.0220 | 12.84088000 | | 33 | statement/sql/update | 4728 | 1008736.6000 | 213.35376481 | | 33 | statement/sql/insert | 1576 | 138815.0570 | 88.08061992 | | 33 | statement/sql/delete | 1576 | 136161.9060 | 86.39714848 | | 33 | statement/sql/begin | 1576 | 20498.2300 | 13.00649112 | | 34 | statement/sql/update | 70668 | 12304784.2380 | 174.12101995 | | 34 | statement/sql/insert | 23556 | 2083454.5350 | 88.44687277 | | 34 | statement/sql/delete | 23556 | 2041202.7690 | 86.65319957 | | 34 | statement/sql/begin | 23556 | 303710.4860 | 12.89312642 | +-----------+----------------------+------------+-------------------+--------------+ 16 rows in set (0.00 sec) |
Exploring a bit further, one thing I noticed is that when using binlog_format=ROW for replication, you will only see the BEGIN events instrumented from performance_schema. I think this is a bug so I reported it here.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
slave1> SELECT THREAD_ID, EVENT_NAME, COUNT_STAR, -> SUM_TIMER_WAIT/1000000 AS SUM_TIMER_WAIT_MS, -> (SUM_TIMER_WAIT/COUNT_STAR)/1000000 AS AVG_TIME_MS -> FROM events_statements_summary_by_thread_by_event_name -> WHERE THREAD_ID IN (40,41,42,43) AND COUNT_STAR > 0; +-----------+---------------------+------------+-------------------+-------------+ | THREAD_ID | EVENT_NAME | COUNT_STAR | SUM_TIMER_WAIT_MS | AVG_TIME_MS | +-----------+---------------------+------------+-------------------+-------------+ | 40 | statement/sql/begin | 16 | 258.6500 | 16.16562500 | | 41 | statement/sql/begin | 91 | 1526.4320 | 16.77397802 | | 42 | statement/sql/begin | 589 | 10131.4500 | 17.20110357 | |