Improving Percona XtraDB Cluster SST startup with Google Compute Engine snapshots

Nik Vyzas
As the need for information grows, so does the size of the data we need to keep in our databases. SST is unavoidable when spinning up new nodes in a PXC cluster, and when datasets reach the terabyte range this becomes ever more cumbersome, requiring many hours for a new node to synchronize.
More often than not, it becomes necessary to implement custom "wsrep_sst" scripts or to resort to manual synchronization processes. Luckily, cloud providers offer convenient disk-snapshot facilities that can be used to quickly transfer data between nodes.
This article covers the actions needed to take a snapshot on Google Compute Engine (GCE) infrastructure. A similar method can be used on AWS EC2 instances with EBS snapshots, or with any other form of snapshot such as LVM, ZFS or SAN. The steps described can be used to add a new node to a PXC cluster or to avoid SST, while taking advantage of the performance benefit of GCE snapshots. A similar procedure can also be used to add a regular asynchronous slave, provided the binary log co-ordinates have been captured (see the sketch after the first code block below). This article assumes your "datadir" is on a separate disk from your operating system partition and uses the "ext4" filesystem:
- Select a suitable "donor" node; we will use "node1" for this example.
- Stop the MySQL service on "node1", or perform an FTWRL (FLUSH TABLES WITH READ LOCK) with the MySQL service running on a node that is in "desync/donor" mode
# Take the snapshot from a stopped instance
[root@node1 /] service mysql stop & tail -f /var/log/mysql/error.log

# OR alternatively take the snapshot from a 'desynced' node
### desync from cluster replication
mysql> set global wsrep_desync=ON;
### get FTWRL
mysql> flush tables with read lock;
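If you also intend to use the snapshot to seed a regular asynchronous slave, as mentioned in the introduction, this is the moment to record the binary log co-ordinates while MySQL is stopped or the FTWRL is held. A minimal sketch, assuming binary logging is enabled on "node1":

### optional: capture binary log co-ordinates for a future async slave
### (run while the FTWRL is still held)
mysql> show master status;
### the File and Position values reported here are the co-ordinates
### to use later in CHANGE MASTER TO on the new slave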
- While the MySQL service is down on "node1" or the FTWRL is held, create a snapshot of the disk in the Google Developer Console or using the GCE API (this assumes that the datadir is located on a separate, standalone disk). This part of the process takes around 15 minutes for a 3.5 TB disk.
gcloud compute disks snapshot node1-datadir-disk --snapshot-name node1-datadir-disk-snapshot-1
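Before restarting MySQL or releasing the lock, you may want to confirm that the snapshot has actually completed. One way to check it, assuming the snapshot name used above (the exact output format may vary between gcloud versions):

# confirm the snapshot has finished; the status should read READY
gcloud compute snapshots describe node1-datadir-disk-snapshot-1 | grep status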
- As soon as the snapshot has completed, start the MySQL service on "node1" (verifying that the node has successfully rejoined the cluster) or release the FTWRL
# Depending on the steps followed in step 1 either start MySQL on node1
[root@node1 /] service mysql start & tail -f /var/log/mysql/error.log

# OR alternatively release the FTWRL and "sync" the node
### release FTWRL
mysql> unlock tables;
### if there is high load on the cluster monitor wsrep_local_recv_queue
### until it reaches 0 before running the following command to rejoin
### the cluster replication (otherwise it can be run immediately after
### releasing the FTWRL):
mysql> set global wsrep_desync=OFF;
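The "wsrep_local_recv_queue" counter mentioned in the comments above can be watched with a simple status query; a sketch:

### optional: watch the receive queue drain before setting wsrep_desync=OFF
mysql> show global status like 'wsrep_local_recv_queue';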
IMPORTANT NOTE: If "node1" is unable to rejoin the cluster or requires an SST, you will need to re-create the snapshot from another node or after the SST completes.
- Now connect to the "joiner" node; we will use "node2" for this example.
- Unmount the existing datadir disk from "node2" (assuming the MySQL service is not running; if it is, stop the MySQL service first)
[root@node2 /] umount /var/lib/mysql
- Detach and delete the disk containing the MySQL datadir from the “node2” instance in the Google Developer Console or using the GCE API
gcloud compute instances detach-disk node2 --disk node2-datadir-disk
gcloud compute disks delete node2-datadir-disk
- Create and attach a new disk to the "node2" instance, based on the snapshot you created in step 3, in the Google Developer Console or using the GCE API. This part of the process takes around 10 minutes for a 3.5 TB disk.
gcloud compute disks create node2-datadir-disk --source-snapshot node1-datadir-disk-snapshot-1
gcloud compute instances attach-disk node2 --disk node2-datadir-disk
- [LVM-only step]: If you are using LVM, the device will not show up in the device listing in the next step until you have activated the volume group ("vg_mysql_data" in this example)
# this command will report the available volume groups
[root@node2 /] vgscan
Reading all physical volumes. This may take a while...
Found volume group "vg_mysql_data" using metadata type lvm2

# this command will report the available logical volumes, you should see the LV INACTIVE now
[root@node2 /] lvscan
INACTIVE '/dev/vg_mysql_data/lv_mysql' [20.00 TiB] inherit

# this command will activate all logical volumes within the volume group
[root@node2 /] vgchange -ay vg_mysql_data

# this command will report the available logical volumes, you should see the LV ACTIVE now
[root@node2 /] lvscan
ACTIVE '/dev/vg_mysql_data/lv_mysql' [20.00 TiB]
- After the device has been added it should show up in the "node2" operating system, and you can retrieve the new UUID using the following command. (If you mount using "/dev/disk/by-name" and the name of the new disk is the same as the previous one, you do not need to update "/etc/fstab"; this holds true, for example, for VM instances created using the Percona XtraDB click-to-deploy installer.)
[root@node2 /] ls -l /dev/disk/by-uuid/
total 0
lrwxrwxrwx 1 root root 10 Feb 14 15:56 4ad2d22b-500a-4ad2-b929-12f38347659c -> ../../sda1
lrwxrwxrwx 1 root root 10 Feb 19 03:12 9e48fefc-960c-456f-95c9-9d893bcafc62 -> ../../dm-0 # This is the 'new' disk
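Alternatively, if you already know the device name, "blkid" reports the filesystem UUID directly; a sketch reusing the device and UUID from the listing above (the output shown is purely illustrative):

[root@node2 /] blkid /dev/dm-0
/dev/dm-0: UUID="9e48fefc-960c-456f-95c9-9d893bcafc62" TYPE="ext4"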
- You can now proceed to add the new UUID you retrieved in step 9 to "/etc/fstab" (unless you are using "/dev/disk/by-name" with the same disk name) and mount the new disk
[root@node2 /] vi /etc/fstab
...
UUID=9e48fefc-960c-456f-95c9-9d893bcafc62 /var/lib/mysql ext4 defaults,noatime 0 0
...
[root@node2 /] mount -a
- Verify that the data is mounted correctly and that the ownership of the data directory and its contents uses the correct UID/GID for the MySQL user on the destination system (this is usually fine, but a quick check is worthwhile)
[root@node2 /] ls -lhtR /var/lib/mysql/
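If the UID/GID mapping does differ between the two instances, ownership can be corrected before starting MySQL; a minimal sketch, assuming the MySQL user and group are both named "mysql":

# re-own the datadir recursively for the local mysql user and group
[root@node2 /] chown -R mysql:mysql /var/lib/mysql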
- You are now ready to start MySQL and verify that the node has in fact initialised with IST (provided you have sufficient "gcache" available, there shouldn't be any other issues); a verification sketch follows the start command
[root@node2 /] service mysql start & tail -f /var/log/mysql/error.log
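One way to confirm that the node joined via IST rather than SST is to check the error log and the wsrep status; a sketch (the exact log wording varies between PXC versions):

# look for an IST message (rather than SST) in the error log
[root@node2 /] grep -i ist /var/log/mysql/error.log
# confirm the node has reached the Synced state
mysql> show global status like 'wsrep_local_state_comment';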
The Percona XtraDB Click-to-deploy tool can be used for automated deployments and further details on creating a cluster on Google Compute Engine using this method can be found in Jay Janssen’s post, “Google Compute Engine adds Percona XtraDB Cluster to click-to-deploy process.”