Percona Backup for MongoDB (PBM) supports snapshot-based physical backups. This is made possible by the backup cursor functionality present in Percona Server for MongoDB.
In a previous post, we discussed Percona Backup for MongoDB and Disk Snapshots in Google Cloud Platform (part 1) and showed how to implement snapshot-based backups. Now, let’s see how to restore a snapshot-based backup in GCP.
For this demo, I have created a 2-shard MongoDB cluster (each shard consisting of a 3-node PSA replica set) deployed on Google Cloud Platform instances. Each instance has an extra persistent disk attached for storing the MongoDB data, and the PBM agent is installed as per the documentation.
Let’s start by checking the details of the backup we are going to restore. Remember that we can also get the complete list of available backups by running pbm list.
# pbm describe-backup 2024-10-03T15:23:51Z
name: "2024-10-03T15:23:51Z"
opid: 66feb70725398000d9398e35
type: external
last_write_time: "2024-10-03T15:23:54Z"
last_transition_time: "2024-10-03T15:24:09Z"
mongodb_version: 7.0.14-8
fcv: "7.0"
pbm_version: 2.6.0
status: done
size_h: 0 B
replsets:
- name: shard1
  status: done
  node: gcp-test-mongodb-shard01svr1:27018
  last_write_time: "2024-10-03T15:23:53Z"
  last_transition_time: "2024-10-03T15:24:08Z"
  security: {}
- name: shard0
  status: done
  node: gcp-test-mongodb-shard00svr1:27018
  last_write_time: "2024-10-03T15:23:46Z"
  last_transition_time: "2024-10-03T15:24:09Z"
  security: {}
- name: mongo-cfg
  status: done
  node: gcp-test-mongodb-cfg02:27019
  last_write_time: "2024-10-03T15:23:54Z"
  last_transition_time: "2024-10-03T15:24:08Z"
  configsvr: true
  security: {}
Here we can see the nodes that PBM had selected (one per replica set) to be snapshotted at the time of the backup.
The first step of the restore is to shut down all mongos routers and arbiter nodes. The PBM agent is not meant to run on those node types, so PBM cannot stop them for you automatically.
# systemctl stop mongos
# systemctl stop mongod
Now we need to start the restore from any node that has the pbm CLI installed:
# pbm restore --external
Starting restore 2024-10-03T15:23:51Z from [external].....................................................Ready to copy data to the nodes data directory.
After the copy is done, run: pbm restore-finish 2024-10-03T15:23:51Z -c </path/to/pbm.conf.yaml>
Check restore status with: pbm describe-restore 2024-10-03T15:23:51Z -c </path/to/pbm.conf.yaml>
No other pbm command is available while the restore is running!
This step takes a few minutes while Percona Backup for MongoDB stops the database, cleans up data directories on all nodes, provides the restore name, and prompts you to copy the data.
Next, we use the snapshots to re-create the volumes for each member of the cluster. Let’s start with the config servers.
We need to get the ID of the snapshot to restore. Here, we can search through available snapshots using the snapshot name and date as we saved it in the “Description” field. For example:
# gcloud compute snapshots list \
    --filter="name:gcp-test-mongodb-cfg01-data AND description:*2024-10-01-11-52*"
NAME                                           DISK_SIZE_GB  SRC_DISK                                                STATUS
gcp-test-mongodb-cfg01-data-2024-10-01-11-52z  20            northamerica-northeast1-b/disks/gcp-test-mongodb-cfg01  READY
Now, we need to follow these steps for all three Config Servers:
1. Unmount and detach the old volume
# umount /var/lib/mongo
# gcloud compute disks list \
    --filter="labels.Name=gcp-test-mongodb-cfg01-data" \
    --format="value(name,zone)"
# gcloud compute instances detach-disk gcp-test-mongodb-cfg01 --disk=gcp-test-mongodb-cfg01-data
2. Create a new volume based on the snapshot
# gcloud compute disks create gcp-test-mongodb-cfg01-data-new \
    --source-snapshot=gcp-test-mongodb-cfg01-data-2024-10-01-11-52z \
    --zone=us-west1-a \
    --type=pd-balanced
3. Attach the new volume
# INSTANCE_ID=$(gcloud compute instances list --filter="name='gcp-test-mongodb-cfg01'" --format="get(name)")

# gcloud compute instances attach-disk $INSTANCE_ID \
    --disk=gcp-test-mongodb-cfg01-data-new \
    --device-name=persistent-disk-1 \
    --zone=us-west1-a
4. Mount the volume
# lsblk
NAME     MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda        8:0    0   20G  0 disk
├─sda1     8:1    0  200M  0 part /boot/efi
└─sda2     8:2    0 19.8G  0 part /
sdb        8:16   0   20G  0 disk

# mount /dev/sdb /var/lib/mongo
We need to repeat the process for all shard0 and shard1 replica set members, using the proper snapshot in each case.
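Since the detach/create/attach cycle is identical for every member, it can be sketched as a single loop. The hostnames, zone, and snapshot suffix below are assumptions taken from the earlier examples; the commands are collected and printed rather than executed so you can review them first (the per-host umount/mount steps still happen on each node):

```shell
#!/bin/bash
# Sketch: the detach/create/attach cycle for every shard member in one loop.
# Hostnames, zone, and snapshot suffix are assumptions for illustration.
ZONE="us-west1-a"
SNAP_SUFFIX="2024-10-01-11-52z"
HOSTS=(gcp-test-mongodb-shard00svr0 gcp-test-mongodb-shard00svr1
       gcp-test-mongodb-shard01svr0 gcp-test-mongodb-shard01svr1)

cmds=()
for host in "${HOSTS[@]}"; do
  cmds+=("gcloud compute instances detach-disk $host --disk=${host}-data --zone=$ZONE")
  cmds+=("gcloud compute disks create ${host}-data-new --source-snapshot=${host}-data-${SNAP_SUFFIX} --zone=$ZONE --type=pd-balanced")
  cmds+=("gcloud compute instances attach-disk $host --disk=${host}-data-new --zone=$ZONE")
done

# Print the commands for review; pipe the output to bash when ready.
printf '%s\n' "${cmds[@]}"
```

Printing first is a cheap safety net: detaching the wrong disk from a live node is much harder to undo than re-running the loop.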
Once that is done, the last step is to finish the restore with PBM:
# pbm restore-finish 2024-10-03T15:23:51Z -c /etc/pbm-storage.conf
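As the restore output suggested, we can track progress with pbm describe-restore. A minimal polling sketch is shown below; the restore name and config path mirror the ones used above, and when the pbm binary is not on PATH (e.g. when trying the sketch elsewhere) a stub answer of "done" is returned so the script stays runnable:

```shell
#!/bin/bash
# Sketch: poll pbm describe-restore until the restore leaves the "running"
# state. Restore name and config path mirror the examples above.
RESTORE_NAME="2024-10-03T15:23:51Z"
PBM_CONF="/etc/pbm-storage.conf"

pbm_status() {
  if command -v pbm >/dev/null 2>&1; then
    # describe-restore prints YAML; pull out the top-level status field
    pbm describe-restore "$RESTORE_NAME" -c "$PBM_CONF" | awk '/^status:/ {print $2}'
  else
    echo "done"   # stub so the sketch is runnable without PBM installed
  fi
}

status=$(pbm_status)
while [ "$status" = "running" ]; do
  sleep 10
  status=$(pbm_status)
done
echo "restore status: $status"
```

Once the status reaches done, the cluster processes can be started again.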
We have covered the manual approach; now let's see how we can automate the steps above.
The idea is to provide the script with the backup name we want to restore and the cluster’s topology.
The script should:
- Start the restore with pbm restore --external
- Detach the current data disks from each node
- Create new disks from the corresponding snapshots
- Attach and mount the new disks
- Finish the restore with pbm restore-finish
Note: In order to manipulate instances, volumes, and snapshots, we need to have the instanceAdmin and storageAdmin IAM roles assigned to our user (or service account).
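Granting those roles can be done with gcloud. The project ID and service account name below are placeholders, and the commands are printed rather than executed so the sketch is safe to run as-is:

```shell
#!/bin/bash
# Sketch: grant the IAM roles needed to manipulate instances, disks, and
# snapshots. PROJECT_ID and the service account name are placeholders.
PROJECT_ID="my-gcp-project"
SA="pbm-restore@${PROJECT_ID}.iam.gserviceaccount.com"

CMD_INSTANCE="gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member=serviceAccount:$SA --role=roles/compute.instanceAdmin.v1"
CMD_STORAGE="gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member=serviceAccount:$SA --role=roles/compute.storageAdmin"

# Print the commands for review; run them directly once the placeholders
# are replaced with real values.
echo "$CMD_INSTANCE"
echo "$CMD_STORAGE"
```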
The example script is available on GitHub. It requires the gcloud CLI to be installed. Keep in mind this is just a proof of concept with only basic error checking, so don't use it in production environments.
We call the script specifying:
- The name of the backup we want to restore
- The hostnames of the config servers, shard members, arbiters, and mongos routers
For example:
# ./test_restore.sh 2024-10-03T15:23:51Z \
    --config-servers=gcp-test-mongodb-cfg00,gcp-test-mongodb-cfg01,gcp-test-mongodb-cfg02 \
    --shard0=gcp-test-mongodb-shard00svr0,gcp-test-mongodb-shard00svr1 \
    --shard1=gcp-test-mongodb-shard01svr0,gcp-test-mongodb-shard01svr1 \
    --arbiters=gcp-test-mongodb-shard01arb0,gcp-test-mongodb-shard00arb0 \
    --mongos=gcp-test-mongodb-mongos00
Percona Backup for MongoDB provides the interface for making snapshot-based physical backups and restores. We have covered the complete restore process using snapshots in GCP and provided a bash script example. In a real production environment, automating the restore process using Ansible or similar tooling could also be a good idea.
If you have any suggestions for feature requests or bug reports, make sure to let us know by creating a ticket in our public issue tracker. Pull requests are also more than welcome!