Feature in detail: Incremental state transfer after a node crash in Percona XtraDB Cluster

January 31, 2013
Author
Vadim Tkachenko

With our newest release of Percona XtraDB Cluster, I would like to highlight a very useful ability: recovering a node after a crash and bringing it back into the cluster with an incremental state transfer.
This feature was available in the previous release too, but now I want to give some details.

MySQL crashes from time to time; this is a fact of life. An HA solution is needed precisely to deal with a single-node failure while allowing the whole cluster to continue working.

The idea is that if a node crashes, then after it recovers we transfer only the changes that happened in the cluster while the node was down. It sounds easy in words but proved hard to implement. It all comes down to one question: if mysqld crashes, how do we know which transaction was the last one executed? For a single InnoDB instance it is easy: there is always the LSN, which is used for recovery. In a cluster, however, each node has its own LSN. Instead, the cluster uses a Global Transaction ID (GTID), in the form
50176f05-69b5-11e2-0800-930817fe924a:8549230.
So, how can we store the GTID so that it is available after a crash? Of course there is always the good old way of storing it in a separate file, but that requires an additional fsync call for each transaction, and that is a known performance killer.
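
To make the format concrete, here is a minimal sketch that splits such a GTID into the part before the colon (a UUID) and the part after it (a sequence number). The function names `gtid_uuid` and `gtid_seqno` are made up for this illustration and are not part of Percona XtraDB Cluster.

```shell
#!/bin/sh
# Split a GTID of the form <uuid>:<seqno> into its two parts.
# gtid_uuid and gtid_seqno are hypothetical helper names for this sketch.

gtid_uuid() {
    # Drop the shortest suffix that starts with ':' (the sequence number).
    printf '%s\n' "${1%:*}"
}

gtid_seqno() {
    # Drop everything up to and including the last ':'.
    printf '%s\n' "${1##*:}"
}

gtid="50176f05-69b5-11e2-0800-930817fe924a:8549230"
echo "uuid:  $(gtid_uuid "$gtid")"
echo "seqno: $(gtid_seqno "$gtid")"
```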

Instead, we store the GTID in the InnoDB system area, which is updated for each transaction. So even if the system crashes, we can access information about the last committed transaction.
In XtraDB Cluster you can access this information by calling mysqld with the option:
mysqld --wsrep-recover
With this information, we can force the node to start from that GTID, i.e.
mysqld --wsrep_start_position=50176f05-69b5-11e2-0800-930817fe924a:8549230

In fact, the same method can be used to restore nodes from a backup.
We can start all nodes from an identical starting position, so they all assume they are starting on identical data. You can do this even on non-identical data, but then you know that you do not have a consistent cluster.

All of this may sound complicated, but the
--wsrep-recover / --wsrep_start_position= logic is implemented in the mysqld_safe script, so you get it out of the box.
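
The two-step logic can be sketched roughly as follows. This is not the actual mysqld_safe code. It assumes that the recovery run writes a line containing `WSREP: Recovered position: <uuid>:<seqno>` to its log, and the sample log line below is a hypothetical excerpt, not output from a real server. The helper name `recovered_position` is also invented for the example.

```shell
#!/bin/sh
# Sketch: pull the recovered GTID out of a log and build the start command.
# ASSUMPTION: the log contains "WSREP: Recovered position: <uuid>:<seqno>".

recovered_position() {
    # $1: path to the log written during the --wsrep-recover run.
    # Print only the text after the marker; keep the last match if several.
    sed -n 's/.*WSREP: Recovered position:[[:space:]]*//p' "$1" | tail -n 1
}

# Hypothetical log excerpt standing in for a real recovery log.
log=$(mktemp)
cat > "$log" <<'EOF'
[Note] WSREP: Recovered position: 50176f05-69b5-11e2-0800-930817fe924a:8549230
EOF

pos=$(recovered_position "$log")
rm -f "$log"

if [ -n "$pos" ]; then
    # Print the command instead of running it, so the sketch has no side effects.
    echo "mysqld --wsrep_start_position=$pos"
else
    echo "no recovered position found" >&2
fi
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without a cluster node.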

Here is what the start process looks like in error.log:

So what happens there? Basically, at start the node detects that the last transaction had GTID
50176f05-69b5-11e2-0800-930817fe924a:8549230, but when joining the cluster it figures out
that the cluster is already at position
50176f05-69b5-11e2-0800-930817fe924a:8549242,
and to catch up, the node has to receive 12 events, which happens successfully here:
130129 23:02:01 [Note] WSREP: Receiving IST: 12 writesets, seqnos 8549230-8549242

After applying the 12 events locally, the node is ready and successfully joins the cluster.
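
The count of 12 is simply the difference between the two sequence numbers. A small sketch of that arithmetic, using a hypothetical helper named `writesets_behind`, is below. It treats GTIDs with different UUIDs as not comparable, which is a simplifying assumption of this example.

```shell
#!/bin/sh
# Print how many writesets a node at GTID $1 is behind a cluster at GTID $2.
# writesets_behind is a hypothetical helper name for this sketch.

writesets_behind() {
    node_uuid=${1%:*};    node_seqno=${1##*:}
    cluster_uuid=${2%:*}; cluster_seqno=${2##*:}
    # ASSUMPTION: sequence numbers are only comparable under the same UUID.
    if [ "$node_uuid" != "$cluster_uuid" ]; then
        echo "UUID mismatch: positions are not comparable" >&2
        return 1
    fi
    echo $(( $cluster_seqno - $node_seqno ))
}

writesets_behind \
    "50176f05-69b5-11e2-0800-930817fe924a:8549230" \
    "50176f05-69b5-11e2-0800-930817fe924a:8549242"
# prints 12
```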

You can try it with the latest release, Percona XtraDB Cluster 5.5.29.
