Running Percona Backup for MongoDB
Please see Authentication if you have not already. This will explain the MongoDB user that needs to be created, and the connection method used by Percona Backup for MongoDB.
- Determine the right MongoDB connection string for the pbm CLI. (See MongoDB connection strings - A Reminder (or Primer))
- Use the pbm CLI to insert the config (especially the Remote Storage location and credentials information). See Insert the whole Percona Backup for MongoDB config from a YAML file
- Start (or restart) the pbm-agent processes for all mongod nodes.
After installing pbm-agent on all the servers that have mongod nodes, make sure one instance of it is started for each mongod node.
For example, imagine you put configsvr nodes (listen port 27019) colocated on the same servers as the first shard’s mongod nodes (listen port 27018, replica set name “sh1rs”). On such a server there should be two pbm-agent processes, one connected to the shard (e.g. “mongodb://username:password@localhost:27018/”) and one to the configsvr node (e.g. “mongodb://username:password@localhost:27019/”).
It is best to use the packaged service scripts to run pbm-agent. After adding the database connection configuration for them (see Configuring service init scripts), you can start the pbm-agent service as below:
$ sudo systemctl start pbm-agent
$ sudo systemctl status pbm-agent
For reference an example of starting pbm-agent manually is shown below. The output is redirected to a file and the process is backgrounded. Alternatively you can run it on a shell terminal temporarily if you want to observe and/or debug the startup from the log messages.
$ nohup pbm-agent --mongodb-uri "mongodb://username:password@localhost:27018/" > /data/mdb_node_xyz/pbm-agent.$(hostname -s).27018.log 2>&1 &
Running pbm-agent as the mongod user is the most intuitive and convenient option, but it can run as another user if you prefer.
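For the colocated example above, one agent is needed per local mongod port. The following is a minimal sketch that only prints the start commands for review rather than running them; the `agent_start_cmd` helper, the log file names, and the credentials are hypothetical placeholders, not part of Percona Backup for MongoDB.

```shell
#!/bin/sh
# Hypothetical helper: build (but do not run) the pbm-agent start command
# for the mongod node listening on the given local port.
agent_start_cmd() {
  port="$1"
  printf 'nohup pbm-agent --mongodb-uri "mongodb://username:password@localhost:%s/" > pbm-agent.%s.log 2>&1 &\n' "$port" "$port"
}

# Ports from the example: 27018 (shard "sh1rs") and 27019 (configsvr).
for port in 27018 27019; do
  agent_start_cmd "$port"
done
```

Printing the commands first makes it easy to confirm that every mongod node on the server gets exactly one agent before anything is started.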
When the message “pbm agent is listening for the commands” is printed to the pbm-agent log file, it confirms that the agent connected to its mongod successfully.
With the packaged systemd service, the log output to stdout is captured by systemd’s default redirection to systemd-journald. You can view it with the command below. See man journalctl for useful options such as ‘--lines’, ‘--follow’, etc.
~$ journalctl -u pbm-agent.service
-- Logs begin at Tue 2019-10-22 09:31:34 JST. --
Jan 22 15:59:14 akira-x1 systemd: Started pbm-agent.
Jan 22 15:59:14 akira-x1 pbm-agent: pbm agent is listening for the commands
...
...
If you started pbm-agent manually, see the file you redirected stdout and stderr to.
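For a manually started agent, the check for the confirmation message can be scripted. This is a minimal sketch; the `agent_is_ready` helper is hypothetical, and the temporary file only stands in for a real agent log.

```shell
#!/bin/sh
# Hypothetical helper: succeed once the agent's log file contains the
# startup confirmation message quoted in the text above.
agent_is_ready() {
  grep -q 'pbm agent is listening for the commands' "$1"
}

# Demonstration against a throwaway file standing in for a real agent log.
log_file="$(mktemp)"
echo 'pbm agent is listening for the commands' > "$log_file"

if agent_is_ready "$log_file"; then
  echo "agent connected"
else
  echo "agent not ready yet"
fi
```

In practice the argument would be the file that stdout and stderr were redirected to when the agent was started.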
Provide the MongoDB URI connection string for pbm. This allows you to call pbm commands without the --mongodb-uri flag.
Use the following command:
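A minimal sketch, assuming the `PBM_MONGODB_URI` environment variable that pbm reads; the credentials, host, and port are placeholders to replace with your own.

```shell
# Placeholder credentials, host and port; substitute your own values.
export PBM_MONGODB_URI="mongodb://username:password@localhost:27018/"

# pbm commands run later in this shell session (for example: pbm list)
# pick the connection string up from the environment.
echo "PBM_MONGODB_URI is set to: $PBM_MONGODB_URI"
```

The variable only lasts for the current shell session unless it is also added to the shell profile.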
For more information about which connection string to specify, refer to The pbm connection string section.
pbm is the command line utility to control the backup system.
The Percona Backup for MongoDB config must be inserted once, at installation or re-installation time, before backups can be listed, made, or restored. Please see Percona Backup for MongoDB config in a Cluster (or Non-sharded Replica set).
$ pbm list
2020-07-10T07:04:14Z
2020-07-09T07:03:50Z
2020-07-08T07:04:21Z
2020-07-07T07:04:18Z
$ pbm backup
Starting a backup with compression
$ pbm backup --compression=s2
s2 is the default compression type. Other supported compression types are:
The none value means no compression is done during backup.
For PBM v1.0 (only), before running pbm backup on a cluster, stop the balancer.
Run the pbm list command and you will see the running backup listed with an ‘In progress’ label. When that label is absent, the backup is complete.
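The check for the ‘In progress’ label can be scripted. This is a minimal sketch; the `backup_in_progress` helper is hypothetical, and the sample text is illustrative rather than verbatim pbm list output.

```shell
#!/bin/sh
# Hypothetical helper: read `pbm list` output on stdin and succeed while
# any backup line still carries the 'In progress' label.
backup_in_progress() {
  grep -q 'In progress'
}

# Sample text standing in for real `pbm list` output (format is illustrative).
sample='2020-07-09T07:03:50Z
2020-07-10T07:04:14Z In progress'

if printf '%s\n' "$sample" | backup_in_progress; then
  echo "backup still running"
else
  echo "backup complete"
fi
```

Against a live system the helper would be fed with `pbm list | backup_in_progress`, typically inside a polling loop with a sleep between checks.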
To restore a backup that you made using pbm backup, use the pbm restore command, supplying the timestamp of the backup that you intend to restore.
For PBM v1.0 (only), before running pbm restore on a cluster, stop the balancer.
If you enabled Point-in-Time Recovery, disable it before running
pbm restore. This is because Point-in-Time Recovery incremental backups and restore are incompatible operations and cannot be run together.
Whilst the restore is running, clients should be stopped from accessing the database. The data will naturally be incomplete whilst the restore is in progress, and any writes that clients make will cause the final restored data to differ from the backed-up data. For a cluster restore, the simplest way to achieve this is to shut down all mongos nodes.
Percona Backup for MongoDB is designed to be a full-database restore tool. Up to and including versions 1.x, it performs a full restore of all databases and all collections and does not offer an option to restore only a subset of the collections in the backup, as MongoDB’s mongodump tool does. However, to avoid surprising mongodump users, Percona Backup for MongoDB 1.x replicates mongodump’s behavior of dropping only the collections that are in the backup. It does not drop collections that were created after the backup was taken and before the restore. If you want to guarantee that the post-restore database includes only the collections that are in the backup, run db.dropDatabase() manually in all non-system databases (i.e. all databases except “local”, “config” and “admin”) before running pbm restore.
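Picking the non-system databases out of a list of database names can be sketched as below. The `non_system_dbs` helper and the sample names are hypothetical; the filter only selects names and does not drop anything.

```shell
#!/bin/sh
# Hypothetical helper: read database names on stdin (one per line) and keep
# only those that are not the system databases named in the text above.
# -x matches whole lines only, so a name such as "localdb" is kept.
non_system_dbs() {
  grep -v -x -E 'admin|config|local'
}

# Sample names standing in for the output of a real database listing.
printf '%s\n' admin config local shop analytics | non_system_dbs
```

Each name that remains is a database in which db.dropDatabase() would be run before the restore.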
$ pbm restore 2019-06-09T07:03:50Z
After a cluster’s restore is complete, restart all mongos nodes to reload the sharding metadata.
Starting from v1.3.2, the Percona Backup for MongoDB config includes restore options that adjust the memory consumption of pbm-agent in environments with tight memory bounds. This helps prevent out-of-memory errors during the restore operation.
restore:
  batchSize: 500
  numInsertionWorkers: 10
batchSize
Default Value: 500
The number of documents to buffer.
numInsertionWorkers
Default Value: 10
The number of workers that add the documents to the buffer.
The default values are tuned for setups where 1GB of memory or less is allocated to the agent.
The lower the values, the less memory is allocated for the restore. However, restore performance decreases as well.
You can cancel a running backup if, for example, you want to perform other maintenance and don’t want to wait for a large backup to finish first.
To cancel the backup, use the pbm cancel-backup command.
$ pbm cancel-backup
Backup cancellation has started
After the command execution, the backup is marked as canceled in the pbm list output:
$ pbm list
...
2020-04-30T18:05:26Z Canceled at 2020-04-30T18:05:37Z
Use the pbm delete-backup command to delete a specified backup or all backups older than the specified time.
The command deletes the backup regardless of the remote storage used: either S3-compatible or a filesystem-type remote storage.
You can only delete a backup that is not running (has the “done” or the “error” state).
To delete a backup, specify the <backup_name> from the pbm list output as an argument.
$ #Get the backup name
$ pbm list
Backup history:
2020-04-20T10:55:42Z
2020-04-20T13:07:34Z
2020-04-20T13:13:20Z
2020-04-20T13:45:59Z
$ #Delete a backup
$ pbm delete-backup 2020-04-20T13:45:59Z
By default, the pbm delete-backup command asks for your confirmation to proceed with the deletion. To bypass it, add the --force (or -f) flag.
$ pbm delete-backup --force 2020-04-20T13:45:59Z
To delete backups that were created before the specified time, pass the --older-than flag to the pbm delete-backup command. Specify the timestamp as the flag’s argument in one of the following formats:
%Y-%m-%dT%H:%M:%S (e.g. 2020-04-20T13:13:20) or
%Y-%m-%d (e.g. 2020-04-21)
$ #Get the backup name
$ pbm list
Backup history:
2020-04-20T20:55:42Z
2020-04-20T23:47:34Z
2020-04-20T23:53:20Z
2020-04-21T02:16:33Z
$ #Delete backups created before the specified timestamp
$ pbm delete-backup -f --older-than 2020-04-21
Backup history:
2020-04-21T02:16:33Z
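For a recurring cleanup, the cutoff passed to --older-than can be computed instead of typed by hand. This is a minimal sketch, assuming GNU date (for the -d option) and a hypothetical seven-day retention period; it prints the resulting command for review rather than running it.

```shell
#!/bin/sh
# Assumes GNU date. The cutoff is seven days ago, in UTC, in the date-only
# form that the example above passes to --older-than.
cutoff="$(date -u -d '7 days ago' +%Y-%m-%d)"

# Print the command for review instead of deleting anything.
echo "pbm delete-backup --older-than $cutoff"
```

Adding -f to the printed command would skip the confirmation prompt, which a scheduled job would need.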