Faster MySQL failover with SELECT mirroring

February 1, 2009
Author
Baron Schwartz
Share this Post:

One of my favorite MySQL configurations for high availability is master-master replication, which is just like normal master-slave replication except that you can fail over in both directions. Aside from MySQL Cluster, which is more special-purpose, this is probably the best general-purpose way to get fast failover and a bunch of other benefits (non-blocking ALTER TABLE, for example).

The benefit is that you have another server with all the same data, up and running, ready to serve queries. In theory, it’s a truly hot standby (stay with me — that’s not really guaranteed). You don’t get this with shared storage or DRBD, although those provide stronger guarantees against data loss if mysqld crashes. And you can use the standby (passive) master for serving some SELECT queries, taking backups, etc as usual. However, if you do this you actually compromise your high-availability plan a little, because you can mask the lack of capacity that will result when one of the servers is down and you have to rely on just one server to keep everything on its feet.

If you need really high availability, you can’t load the pair of servers more than a single server can handle. (You can always use the passive server for non-essential needs — it doesn’t have to be completely dead weight.) As a result, some people choose to make the passive server truly passive, handling none of the application’s queries. It just sits there replicating and doing nothing else.

The problem is that the passive server’s caches start to get skewed to handle the write workload from replication, and not the read workload it will have to handle if there’s a planned or unplanned failover. This isn’t a big problem on small systems, but with buffer pools in the dozens of gigabytes (which is arguably “small” these days), it starts to matter a lot. Warming up a system so it’s actually responsive can take hours. As a result, the passive master isn’t truly hot anymore. It needs to handle the workload it’s supposed to be ready to take over. If you fail over to it, it might perform very badly — get unresponsive, cause tons of I/O, etc. In reality, it can be completely unusable for a long time.

To measure how much this really matters, I did some tests for a customer who was having troubles with this type of scenario. I used mk-query-digest (with some new features) to watch the traffic on the active master and replay SELECT queries against the passive one. I timed the results and ran them through the analysis part of mk-query-digest. A simple key lookup ran in tens of milliseconds on the active master, but executed for up to dozens of seconds on the passive one.

After a couple of hours of handling SELECT traffic, these same queries were responding nicely on the passive master, too.

Is that all? “Buffer pool warmed up, performance is better, case closed!” No. This isn’t as simple as it sounds on the surface. There are two things happening and both are important to understand.

The first, most obvious phenomenon is that the buffer pool gets skewed to handle the write workload. Since we’re running Percona’s patched server, we can actually measure what’s in the buffer pool. I measured the active master’s buffer pool with the following query:

I loaded this file into a table on my laptop with LOAD DATA INFILE and kept it for later. I did the same on the slave. Then I used mk-query-digest to watch the traffic on the active master:

After a bit I CTRL-C’ed it and it printed out the analysis of the time taken to run the queries against the passive master. I restarted it and after a few hours of this I did the same thing; the query timings were dramatically better now. Then I just let it keep running without any aggregation options to avoid any overhead of storing and analyzing queries. (I added –mirror and –daemonize options so it can run in the background and follow along when the passive/active roles switch.)

After a day or so of doing this, I re-sampled the buffer pool contents on the passive server. With all three samples stored in tables on my laptop, I wrote a query against these three sets of stats to find the top tables on the active server and left-join those against the tables on the passive server, with both a mixed workload from my mirrored SELECT statements and with the “pure” replication workload. I totaled the pages up into gigabytes. Here’s the result:

db_table active passive + SELECT passive
site.benefits 8.30 5.73 1.32
. 3.13 0.94 0.50
site.user_actions 2.55 4.09 6.29
site.user_achievements 1.36 1.20 0.35
site.clicks 1.26 3.05 5.13
site.actions_finished 1.14 0.46 0.74
site.ratings 0.91 0.89 0.48

The difference is clear. The buffer pool contains over 8G of data for the site.benefits table on the active master, but if you just put a replication workload on the server, that falls to 1.32G. Other tables are similar. The mixed workload with some SELECT queries mirrored is somewhere between the two.

One thing we don’t know is which pages are in the pool. Same table, same size of data doesn’t mean same buffer pool contents. An insert-only workload will probably fill the buffer pool with the most recent data; a mixed workload will usually have some different hot spot or mixture of hot spots, so it’ll bring different parts of the table into memory.

So that’s the first thing that’s happening. The second is the insert buffer. Notice the pages with no database or table name — the second row in the table above. Those are a mixture of things, but it’s overwhelmingly the insert buffer.

As Peter explained in his recent post on the insert buffer, the other thing the SELECTs do is keep the insert buffer in a production steady-state. The buffered records are forced to be merged by the SELECTs, and a lot more of the pages from the insert buffer are in the buffer pool, not on disk. So it’s not just the buffer pool that gets skewed with a write-only workload! The insert buffer can also cause terrible performance. There are some subtleties about exactly what’s happening that I’m still investigating and may write more about later, in this particular case.

So what can we conclude from this? Simply this: if you have a standby server that’s not under a realistic workload, you won’t be able to get good performance after a failover. You need to use some technique to mirror the read-only workload to the passive server. It doesn’t have to be the tools I used — it could be MySQL Proxy or a TCP sniffer or anything else. But if you need fast failover, you need some way to at least partially emulate a production workload on the standby machine.

PS: I see Robert Hodges just published an article on warm standby for PostgreSQL. Link love for interested readers.

Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved