Improving Percona XtraDB Cluster SST startup with Google Compute Engine snapshots

As the need for information grows, so does the size of the data we need to keep in our databases. SST is unavoidable when spinning up new nodes in a PXC cluster, and when datasets reach the terabyte range this becomes ever more cumbersome, requiring many hours for a new node to synchronize.

More often than not, it is necessary to implement custom “wsrep_sst” scripts or resort to manual synchronization processes. Luckily, cloud providers offer convenient disk snapshot facilities that can be used to quickly transfer data between nodes.

This article deals with the actions needed to perform a snapshot on Google Compute Engine (GCE) infrastructure. A similar method can be used on AWS EC2 instances with EBS snapshots, or with any other form of snapshot such as LVM, ZFS or SAN. The steps described can be used to add a new node to a PXC cluster while avoiding SST, taking advantage of the performance benefit of GCE snapshots. A similar procedure can be used to add a regular slave, provided the binary log coordinates have been captured. This article assumes your “datadir” is on a separate disk from your operating system partition, using the “ext4” filesystem:
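Before starting, it is worth confirming that assumption. A minimal check, assuming the datadir is the conventional “/var/lib/mysql” (adjust the path to your configuration):

```shell
# Sanity check for the stated assumption: the MySQL datadir lives on its own
# ext4 filesystem, separate from the OS partition.
# "/var/lib/mysql" is an assumed datadir path; adjust to your setup.
DATADIR="/var/lib/mysql"

if [ -d "$DATADIR" ]; then
    # FSTYPE should report ext4, and SOURCE should not be the root device
    findmnt -n -o SOURCE,FSTYPE --target "$DATADIR"
else
    echo "datadir $DATADIR not present on this host"
fi
```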

  1. Select a suitable “donor” node; we will use “node1” for this example.
  2. Stop the MySQL service on “node1”, or perform a FLUSH TABLES WITH READ LOCK (FTWRL) with the MySQL service running on a node that is in “desync/donor” mode.
  3. While the MySQL service is down on “node1”, or while the FTWRL is held, create a snapshot of the disk in the Google Developer Console or using the GCE API (this assumes the datadir is located on a separate standalone disk). This part of the process takes around 15 minutes for a 3.5 TB disk.
  4. As soon as the snapshot has completed, start the MySQL service on “node1” (verifying the node has successfully rejoined the cluster) or release the FTWRL.

    IMPORTANT NOTE: If “node1” is unable to rejoin the cluster or requires an SST, you will need to re-create the snapshot from another node, or after the SST completes.
  5. Now connect to the “joiner” node; we will use “node2” for this example.
  6. Unmount the existing disk on “node2” (assuming the MySQL service is not running; otherwise stop the MySQL service first).
  7. Detach and delete the disk containing the MySQL datadir from the “node2” instance in the Google Developer Console or using the GCE API.
  8. Create a new disk from the snapshot you created in step 3, and attach it to the “node2” instance in the Google Developer Console or using the GCE API. This part of the process takes around 10 minutes for a 3.5 TB disk.
  9. [ LVM only ]: If you are using LVM, the device will not show up in the operating system’s device list until you have activated the volume group (“vg_mysql_data” in this example).
  10. After the device has been added, it should show up in the “node2” operating system, and you can retrieve the new disk’s UUID (for example with “blkid”). If you have mounted using “/dev/disk/by-name” and the name of the new disk is the same as the previous one, you do not need to update “/etc/fstab”; this holds true, for example, for VM instances created using the Percona XtraDB click-to-deploy installer.
  11. You can now add the new UUID you retrieved in step 10 to “/etc/fstab” (unless you are using “/dev/disk/by-name” with the same disk name) and mount the new disk.
  12. Verify that the data is mounted correctly and that the ownership of the data directory and its contents uses the correct UID/GID for the MySQL user on the destination system (this is usually fine, but a quick check is worthwhile).
  13. You are now ready to start MySQL and verify that the node has in fact initialised with IST (provided you have sufficient “gcache” available, there shouldn’t be any other issues).
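The FTWRL alternative in step 2 can be sketched as follows. This is only an illustration of the SQL involved; the statements would be run in an interactive mysql session on “node1” that stays open for the duration of the snapshot:

```shell
# A sketch of step 2: desync the donor and hold FTWRL while the snapshot is
# taken. Run the "hold" SQL in an interactive mysql session on node1 and keep
# that session open until the snapshot completes (step 4).
SQL_HOLD='SET GLOBAL wsrep_desync = ON;
FLUSH TABLES WITH READ LOCK;'
SQL_RELEASE='UNLOCK TABLES;
SET GLOBAL wsrep_desync = OFF;'

printf '%s\n' "$SQL_HOLD"      # run first, keeping the session open
printf '%s\n' "$SQL_RELEASE"   # run after the snapshot completes (step 4)
```

Note that FTWRL is released as soon as the session closes, which is why a one-shot `mysql -e` invocation is not suitable here.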
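For steps 3, 7 and 8, the GCE API route can be sketched with the gcloud CLI instead of the Developer Console. The disk and instance names, the snapshot name and the zone below are all hypothetical; substitute the values from your project:

```shell
# A sketch of steps 3, 7 and 8 using the gcloud CLI. All names and the zone
# are hypothetical; adjust to your environment.
ZONE="us-central1-a"
DONOR_DISK="node1-mysql-data"     # datadir disk attached to node1
JOINER_INSTANCE="node2"
JOINER_DISK="node2-mysql-data"    # node2's old datadir disk, to be replaced
SNAPSHOT="pxc-datadir-snapshot"

if command -v gcloud >/dev/null 2>&1; then
    # Step 3: snapshot the donor's datadir disk (MySQL stopped or FTWRL held)
    gcloud compute disks snapshot "$DONOR_DISK" --zone "$ZONE" \
        --snapshot-names "$SNAPSHOT"

    # Step 7: detach and delete the joiner's old datadir disk
    gcloud compute instances detach-disk "$JOINER_INSTANCE" \
        --disk "$JOINER_DISK" --zone "$ZONE"
    gcloud compute disks delete "$JOINER_DISK" --zone "$ZONE" --quiet

    # Step 8: create a new disk from the snapshot and attach it to the joiner
    gcloud compute disks create "$JOINER_DISK" \
        --source-snapshot "$SNAPSHOT" --zone "$ZONE"
    gcloud compute instances attach-disk "$JOINER_INSTANCE" \
        --disk "$JOINER_DISK" --zone "$ZONE"
else
    echo "gcloud not installed; commands shown for reference only"
fi
```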
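Steps 10 through 12 on the joiner can be sketched as below. The device name and mount point are assumptions for illustration; on a real instance the new disk may appear under a different device node:

```shell
# A sketch of steps 10-12: find the new disk's UUID, add it to /etc/fstab and
# mount it. The device name and mount point are assumptions.
DEVICE="/dev/sdb"               # hypothetical device node for the new disk
MOUNT_POINT="/var/lib/mysql"

if [ -b "$DEVICE" ]; then
    UUID=$(blkid -s UUID -o value "$DEVICE")   # step 10: retrieve the UUID
else
    UUID="<uuid-of-new-disk>"                  # placeholder off-instance
fi

# Step 11: the fstab entry to append, then mount the disk
FSTAB_LINE="UUID=${UUID} ${MOUNT_POINT} ext4 defaults,noatime 0 0"
printf '%s\n' "$FSTAB_LINE"
# echo "$FSTAB_LINE" | sudo tee -a /etc/fstab && sudo mount "$MOUNT_POINT"

# Step 12: verify ownership before starting MySQL
# sudo chown -R mysql:mysql "$MOUNT_POINT"    # only if the UID/GID differ
```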

The Percona XtraDB Click-to-deploy tool can be used for automated deployments and further details on creating a cluster on Google Compute Engine using this method can be found in Jay Janssen’s post, “Google Compute Engine adds Percona XtraDB Cluster to click-to-deploy process.”

