Emulating global transaction ID with pt-heartbeat

Global transaction IDs are being considered for a future version of MySQL. A global transaction ID lets you determine a server’s replication position reliably, among other benefits. This is great when you need to switch a replica to another master, or any number of other needs.

Sometimes you can’t wait for the real thing, but an imitation is good enough for your needs. That’s why we built a pseudo global transaction ID into pt-heartbeat, the replication heartbeat tool included with Percona Toolkit. It has some of the benefits of a global transaction ID, but it works on any version of MySQL.

Now why would we want this, and who cares anyway? Good question. Let me tell you a tale of two datacenters. Once upon a time, one of our customers had a very “tall” replication topology, with servers replicating back and forth across datacenters so that if something went really wrong with one of them, there would be a backup on the other. The worst-case scenario happened, and a datacenter was offline for a very long time.

As a result, we had a large number of replicas that we needed to switch to new masters. We had, for example, A-B-C replication, and we needed to start at C and work up the chain to figure out its replication position relative to A. But B was gone, so we couldn’t simply do the usual process of matching up binary log filenames and byte offsets. We had to look in A’s binary logs, find some distinctive pattern of updates, and look in C’s binary logs to find the same thing — and match the two up to figure out C’s position relative to A.

It turned out to be tedious, but not impossible, because pt-heartbeat updates in server A were the most recognizable thing to look for in C’s binary log. That was still very painful, though. We had to find the latest heartbeat in C, then go looking for where it originated in A, in many many binary logs. This was a lot of work, even when automated. Wouldn’t it be nice if the replication heartbeat record on C actually said approximately which binary log and byte offset to look in? That would have saved hours and hours of downtime grepping through binary logs.

That’s what we decided to build into pt-heartbeat. But not only that, we decided that this tool should be usable on any server in the replication topology, so that if the situation happened again, no matter how crazy the topology or how many servers were missing, we’d have some guidepost for where everything is relative to everything else upstream from it.

So how does the new pt-heartbeat work? In a nutshell, this feature enables you to have multiple heartbeat rows in your heartbeat table, and multiple instances of pt-heartbeat, one per server in your replication topology. The heartbeat table is also extended to allow extra information to be maintained. This includes the server’s current binary log position, and its position in replication relative to its master.

In contrast, the old version of this tool supported only a single row in the heartbeat table, and it was just a primary key and a timestamp.

Here’s a sample of what you might see in a three-server replication chain. I’m using sandbox servers on my laptop to illustrate:

The servers represent our A-B-C scenario. On the second server we can see that its position relative to its master was ginger-bin.000033:7936, and we can see that the third instance is replicating from ginger-bin.000020:21198 on the second instance.

Now if the second server disappears, and I want to start the third one replicating from the first one, I can deduce that I should look in the first server’s binlogs somewhere near ginger-bin.000033:7936 to find the right location. That saves me a huge amount of time in emergency situations.

To enable this functionality, just drop your heartbeat table (if you already have one), let the tool create a new one for you if you want (or look in the documentation to see how to do it yourself), and start an instance of pt-heartbeat per server. It just works.

Everyone who runs replication should be running pt-heartbeat already to get a reliable indicator of actual replication delay. This avoids the many problems with Seconds_behind_master, which pop up continually. But now it’s time to update that advice: everyone should not only be running pt-heartbeat, but they should be running it with the new version, and making it update every server in their replication topology so that it’s easier to figure out replication positions when things go wrong.

It’s not a real global transaction ID, but it is good enough sometimes!

Share this post

Comments (6)

  • Mat Keep Reply

    Hi Baron
    more than just “being considered for a future version of MySQL”.

    A preview version has been released and is discussed here:

    The preview cab be downloaded today from labs.mysql.com in the mysql-5.6-labs-global-trans-ids build

    November 4, 2011 at 5:55 am
  • Dave Juntgen Reply

    Baron – can not achieve the same with MHA?

    November 4, 2011 at 10:46 am
  • Baron Schwartz Reply

    Not if servers are missing. There is no reliable way to correlate positions between servers when there is a server in between that is simply missing. We need TRUE global transaction IDs to do this programmatically and reliably.

    November 4, 2011 at 11:02 am
  • Baron Schwartz Reply

    Mat, my understanding is that features that are in lab previews aren’t guaranteed to go into the next release, but features that are in development milestone releases are definitely targeted for the next release. Is this incorrect?

    November 7, 2011 at 6:15 am
  • Robert Hodges Reply

    Why wouldn’t you just recommend Tungsten Replicator in this case? It has global IDs and covers this case as well as normal slave-to-master promotion very well. Even when you know roughly where to look, matching logs is very tedious and could result in prolonged downtime for busy systems.

    November 14, 2011 at 8:20 am
  • vishal saxena Reply

    Whenever i hear the talk about failover from intermediate slaves to master and knowing that mysql does not have a way match the transactions as ‘global transactions id’.I agree that matching logs is very tedious but how about ‘row based replication’ ? i think it can help.Or may be its a dumb idea but still throwing.

    if you have A——>B——->C and B crashed and captured information is telling that B ( relay_master_log_file: test_file , exec_master_log_pos: 2399 ) was replicating from these positions before crashing then we can tell C to start replicating from A with “relay_master_log_file: test_file , exec_master_log_pos: 2399” and ignore the duplicate key errors for some time untill replication catches up.This should work fine in ‘row based replication’ correct ? because even pointing to previous position and by ignoring duplicate key errors we are bringing the rows up to the same level.

    that way we do not have to match the logs.

    April 24, 2012 at 12:49 pm

Leave a Reply