September 17, 2014

Avoiding SST when adding new Percona XtraDB Cluster node

Some people want to use a backup to prepare a new Percona XtraDB Cluster node. They want this to avoid a State Snapshot Transfer (SST), which could slow down the donor (depending on the SST method you are using, the donor can even be blocked; I will cover this in a future blog post). As backups are generally performed during off-peak hours, the impact is reduced, and this avoids the need to perform two backups: the usual backup and the SST.

To be able to use a backup for this purpose, we have 4 prerequisites:

  • use XtraBackup >= 2.0.1
  • the backup needs to be performed with --galera-info (an option of innobackupex)
  • have a gcache big enough to store all the changes from the time of the backup until the restore, so that an Incremental State Transfer (IST) is possible. gcache.size cannot be changed at runtime; it must be defined in my.cnf, and the change requires a restart of mysql.
  • provide the ist.recv_addr (ex: wsrep_provider_options = "ist.recv_addr=192.168.70.2;") if you don't yet use the magic wsrep_node_address variable (see below)
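How big is "big enough" for the gcache? The short Python sketch below gives a rough estimate. It assumes, as a heuristic not stated in this post, that the write-set volume can be approximated by the sum of the wsrep_replicated_bytes and wsrep_received_bytes status counters, sampled twice on a running node. The function name, its parameters and the safety factor are hypothetical and only serve the illustration.

```python
def gcache_size_needed(bytes_at_start, bytes_at_end, sample_seconds,
                       window_seconds, safety_factor=1.5):
    """Estimate the gcache size (in bytes) needed to cover a time window.

    bytes_at_start / bytes_at_end: sum of wsrep_replicated_bytes and
        wsrep_received_bytes at the start and end of the sampling period
        (assumption: this sum approximates the write-set volume).
    sample_seconds: length of the sampling period.
    window_seconds: expected time between the backup and the node restart.
    safety_factor: hypothetical margin for write bursts.
    """
    if sample_seconds <= 0:
        raise ValueError("sample_seconds must be positive")
    if bytes_at_end < bytes_at_start:
        raise ValueError("counters went backwards (node restarted?)")
    bytes_per_second = (bytes_at_end - bytes_at_start) / sample_seconds
    return int(bytes_per_second * window_seconds * safety_factor)


if __name__ == "__main__":
    # Hypothetical figures: 60 MB written in one minute, 12 hours to cover.
    needed = gcache_size_needed(0, 60_000_000, 60, 12 * 3600)
    print(f"gcache.size should be at least about {needed / 1024**3:.1f} GiB")
```

Sampling during peak traffic gives a more conservative figure than sampling during the backup window itself.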

Once the backup completes, it should contain a file called xtrabackup_galera_info. This file holds the local node state at the time of the backup.

Once you have restored the backup, you will notice that there is no grastate.dat file in the datadir (or there is an old one if this is not a fresh node).
The trick is to create or modify this file with the information captured during the backup.

For example, if xtrabackup_galera_info contains the following:

5f22b204-dc6b-11e1-0800-7a9c9624dd66:23

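We need to edit grastate.dat so that its uuid field holds the part before the colon (5f22b204-dc6b-11e1-0800-7a9c9624dd66) and its seqno field holds the part after it (23).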

The version field in grastate.dat comes from the wsrep_provider_version status variable, which you can read with SHOW GLOBAL STATUS LIKE 'wsrep_provider_version'.
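The grastate.dat edit can be sketched in Python as follows. This is only an illustration: the exact file layout (the comment header, the column alignment and the trailing cert_index line) is an assumption about what Galera writes, the version value "2.1" is a placeholder for whatever your own provider reports, and both function names are hypothetical.

```python
def parse_galera_info(text):
    """Split the 'uuid:seqno' line from xtrabackup_galera_info."""
    uuid, sep, seqno = text.strip().rpartition(":")
    if not sep or not uuid:
        raise ValueError("expected '<uuid>:<seqno>', got %r" % text)
    return uuid, int(seqno)


def render_grastate(uuid, seqno, version):
    """Build the content of grastate.dat (layout is an assumption)."""
    return (
        "# GALERA saved state\n"
        f"version: {version}\n"
        f"uuid:    {uuid}\n"
        f"seqno:   {seqno}\n"
        "cert_index:\n"
    )


if __name__ == "__main__":
    # Value taken from the example above; "2.1" is a placeholder version.
    uuid, seqno = parse_galera_info("5f22b204-dc6b-11e1-0800-7a9c9624dd66:23")
    print(render_grastate(uuid, seqno, "2.1"), end="")
```

The result would be written to grastate.dat in the datadir before starting the node. As one of the comments below points out, the file also has to be writable by the mysql user.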

After that, you can start the node. The logs on the donor and on the new node should show that IST, not SST, is used to populate the new node.

PS1: you can change the size of gcache in my.cnf using the following syntax:

wsrep_provider_options="gcache.size=4G;"

PS2: using wsrep_node_address is the recommended way to define the address on which a PXC node lives.
You can then avoid specifying wsrep_sst_receive_address, wsrep_node_incoming_address and ist.recv_addr, which are very common in PXC configurations.

About Frederic Descamps

Frédéric joined Percona in June 2011. He is an experienced Open Source consultant with expertise in infrastructure projects as well as in development tracks and database administration.

Frédéric is a believer in devops culture.

Comments

  1. tenx for information

  2. gphilip says:

    Thanks, very useful info. Is it possible that xtrabackup_galera_info is not generated in case of a partial backup (using --include)?

  3. Thank you for the information, I’m looking for

  4. Morten Isaksen says:

    Thank you for this article.

    One hint. Remember when you have created the grastate.dat file to make it writeable for the mysql user. I think this error was the reason my first attempt did not work.

  5. Spin0us says:

    Thank you for this article.

    I just had a kernel panic on one node of my 3-node cluster and tried your method to restore the crashed node.
    But it still does a State Snapshot Transfer, which slows down the donor.
    What's wrong?
