How To Inject an Empty XA Transaction in MySQL

Inject an Empty XA Transaction in MySQLIf you are using XA transactions, then you’ve likely run into a few replication issues with the 2PCs (2 Phase Commits). Here is a common error we see in Percona’s Managed Services and a few ways to handle it, including injecting an empty XA transaction.

What Does it Mean?

It means that replication has tried to commit an XID (XA transaction ID) that does not exist on the server. We can verify that it does not exist by checking:

In this case, there is a prepared XA transaction on the server but it is XID X’2B’,X’B2’,1′ not X’1a’,X’a1’,1′. So indeed, the XID does not exist.

How Do We Fix It?

A few ways… When using anonymous replication, it can be skipped like any other error:

When using GTIDs, it can be skipped by the typical injecting an empty GTID:

Another option is that we can inject an empty XA transaction, much like we do with GTID. Then we can resume replication so it can naturally commit that XID.

To prepare an empty XA, first copy the SQL + XID from the error. In this case “XA COMMIT X’1a’,X’a1′,1”.

Now transform it into three statements, and run them on the erred replica.

This will have created a prepared XA transaction on the server. We can verify by running:

So, let’s start replication:

Uh oh, now what? When you prepare an XA transaction on a server, your session cannot execute any other SQL. You must disconnect from MySQL, reconnect, then start replication.

Regardless of how you handled the error, it is recommended to run a checksum to validate data consistency.

How Does This Happen?

2PCs write to the binlogs in two…phases 🙂 The first phase contains the {XA START/transaction SQL/XA END/XA PREPARE}. Think of all those statements as a single GTID. Once the XA PREPARE command has run, that whole transaction is written to the binary log so these statements will always be written together. Example:

Now we have a prepared XA on the source and replicas (which can hold row locks and block other transactions). The transaction can now be committed or rolled back, this depends on the second phase and this is where the problems come in.

The second phase commit/rollback can come seconds later, minutes later, days later, or even never. It all depends on when/if the Transaction Manager issues the command. In this case, it was 4 minutes later:

There could be hundreds or thousands of other transactions written to the binary log in between the first and second phases. They could even be written to different binlogs.

This explanation is just to show how 2PCs work to understand the separate parts of an XA transaction.

Summary

Now to try and give some answer to “how does this happen?”…it could be from restoring a backup but MySQL did not get told to prepare some XID on the server. Now replication starts and it is reading events from the source’s binlog. Then it comes across an XA COMMIT but that XID was not prepared so it errors. Ultimately, these issues usually come down to some bug (here is one for example).

Do you have XAs blocking other transactions? Check out Planes, Trains, and Automobiles: MySQL XA Transactions.

Here is another post on how to troubleshoot XA recovery.

Percona Server for MySQL is also working on making XA RECOVER CONVERT XID more helpful!

Share this post

Leave a Reply