Figuring out what limits MySQL Replication

Today I was cloning the master using LVM Snapshot and found it was taking quite a while to catch up, which highlighted replication could be the limiting factor for this system quite soon, so I decided to check what is limiting MySQL Replication speed.

My first idea was to check it based on slow query log, happily the server was running MySQL with slow query log with microsecond resolution so I could check exactly which update queries take most time to execute. Unfortunately it did not work because Slave thread seems to have problems picking up long_query_time when it is set online, meaning changing it via
set global long_query_time=0 and restarting slave does not pick up the change.

I of course could just restart MySQL but this would make replication to be delayed few more hours because of the time Innodb takes to open tables first time after start.

I knew in my case most of the queries are rather simple being updates by primary key or other keys so knowing count was enough for me so I turned my attention to mysqlsla which I remembered parses all kinds of MySQL Logs. Too bad MySQL Binary log is the only one which it does not support natively. Happily it supports “raw” query log file which is basically queries one query per line.

mysqlbinlog almost provides what we’re looking for, if you strip out comments, USE statements and SET statements, which was done by running: mysqlbinlog host-relay.000005 | grep -v -i -P “^(SET|use|#)” > q1.log I’m not sure if this would clear everything but in my case it did exactly what was needed.

Now running mysqlsla to get most common queries: ./mysqlsla-1.4 –top 100 –raw q1.log > report.txt

We get report something like:

Which is good to start reviewing queries and finding if they can be merged or optimized any other way.

Still looking at queries which took took the most time to execute would be better but this would need to wait for the next time.

Share this post