This blog was originally published in September 2020 and was updated in April 2025.

As a MongoDB user, ensuring your data is safe and secure in the event of a disaster or system failure is crucial. That’s why it’s essential to implement effective MongoDB backup best practices and strategies. Regular database backups are the cornerstone of data protection.

Why are MongoDB database backups important? (A core best practice)

Implementing regular database backups is a fundamental MongoDB backup best practice. It’s essential to protect against data loss caused by system failures, human errors, natural disasters, or cyberattacks. Without a proper backup strategy, data can be lost forever, leading to significant financial and reputational damage.

For organizations that rely on data to operate, database backups are critical for business continuity. With a robust backup and recovery plan in place – another key best practice – companies can restore their systems and data quickly and minimize downtime. This is essential to maintain customer trust and avoid business disruption.

In this blog, we will discuss different MongoDB database backup strategies and their use cases, highlight MongoDB backup best practices, pros and cons, and provide a few other useful tips.

Understanding MongoDB backup types: Logical vs. physical

Generally, there are two primary types of backups used with database technologies like MongoDB, each with its own set of best practices:

  • Logical Backups: Capture data by reading it from the database and writing it to a file, typically in a format like BSON, JSON, or CSV.

  • Physical Backups: Involve copying the actual data files from the filesystem.

Additionally, when working with logical backups, incremental backups (capturing changes since the last full backup using oplog entries) are a common best practice to minimize data loss.

We will discuss these backup options, how to implement them, and which is better based on requirements and environment, including our open-source utility, Percona Backup for MongoDB (PBM). PBM is a fully supported community backup tool for consistent backups in MongoDB replica sets and sharded clusters.

Logical backups for MongoDB: mongodump

Logical backups involve dumping data from databases into backup files. With MongoDB, this means creating BSON-formatted files using the mongodump utility. mongodump reads data via the client API, serializes it, and writes it to disk.
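A minimal invocation looks like the following sketch; the host, port, credentials, database, and output path are placeholders for illustration, not values from this article:

```shell
# Dump a single collection from one database. All values below are
# illustrative; adjust host/port/credentials to your environment.
mongodump --host 127.0.0.1 --port 27017 \
  --username backupUser --password 'secret' \
  --authenticationDatabase admin \
  --db myDatabase --collection myCollection \
  --out /backups/mongodump-$(date +%F)
```

Dropping `--db` or `--collection` widens the scope of the dump accordingly.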

Note: Omitting --db or --collection backs up all databases or all collections in the database, respectively. --authenticationDatabase is required if authorization is enabled.

A key MongoDB backup best practice when using mongodump for point-in-time recovery capability is to include the --oplog option. This captures incremental changes while the backup is running. (Note: --oplog works only with full-instance dumps, not with specific databases or collections.)
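A full-instance dump with oplog capture might look like this sketch (host and output path are illustrative):

```shell
# Full-instance dump with oplog capture, enabling point-in-time restore.
# --oplog cannot be combined with --db or --collection.
mongodump --host 127.0.0.1 --port 27017 \
  --oplog \
  --out /backups/full-$(date +%F)
```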

Pros of logical backups

  1. Granular: Can back up specific databases or collections.

  2. Online: Does not require halting writes on the node where the backup runs (though performance impact is possible).

Cons of logical backups

  1. Slow: Can be slow for large databases as it reads all data, increasing WiredTiger cache pressure.

  2. Index Rebuilds: Does not back up index data directly; indexes must be rebuilt during restore, which is time-consuming.

  3. I/O Intensive: Involves significant read/write activity.

Best Practice Tip: Always run logical backups (like mongodump) against secondary nodes in a replica set to avoid impacting the PRIMARY’s performance.

Logical backup best practices for different setups:

  • Replica Set: Run mongodump on a secondary.

  • Sharded Clusters: Back up the config server replica set and each shard replica set (using their secondaries) individually. Achieving point-in-time consistency across a sharded cluster with mongodump alone can be challenging due to varying backup completion times for each shard.

Restoring logical backups with mongorestore

To restore an incremental dump (taken with --oplog), use the --oplogReplay option with mongorestore.

Best Practice Tip: The --oplogReplay option is generally used when restoring a full instance dump taken with --oplog.
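A restore of such a dump might look like the following sketch (host and dump path are illustrative):

```shell
# Restore a full-instance dump and replay the oplog captured during the
# backup, bringing the data to a consistent point in time.
mongorestore --host 127.0.0.1 --port 27017 \
  --oplogReplay \
  /backups/full-2025-04-01
```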


Percona Backup for MongoDB: A best practice tool

Percona Backup for MongoDB (PBM) is a distributed, low-impact solution designed for consistent backups of MongoDB sharded clusters and replica sets, aligning with many MongoDB backup best practices. It addresses consistency challenges in sharded cluster backups and is well-suited for large datasets.

Key advantages & best practices with PBM:

  • Cluster Consistency: Achieves replica set and sharded cluster consistency via oplog capture. Supports distributed transaction consistency (MongoDB 4.2+).

  • Flexible Storage: Back up to cloud (S3-compatible) or on-premise (locally mounted remote filesystem).

  • Efficient Compression: Choice of compression algorithms (e.g., s2 with snappy for speed if resources allow).

  • Progress Logging: Track backup progress, especially for large datasets.

  • Point-in-Time Recovery (PITR): Enables PITR by restoring from a backup and replaying oplog slices up to a specific moment. This is a critical best practice for minimizing data loss.

  • Low Production Impact: Optimized for minimal performance impact on production systems.
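PITR with PBM is typically a two-step workflow: enable continuous oplog slicing, then restore to a timestamp. A sketch (the timestamp is illustrative):

```shell
# Enable continuous oplog capture for point-in-time recovery.
pbm config --set pitr.enabled=true

# Later, restore the cluster to a specific moment between snapshots.
pbm restore --time="2025-04-01T10:30:00"
```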

Best Practice Tip: Use PBM to accurately time large backup and restore operations. Restores, especially to/from throttled storage, can take longer than anticipated.

Best Practice Tip: When scripting PBM, use a replica set connection string to avoid failures if a specific mongod host is temporarily down.

PBM uses pbm-agent processes on mongod nodes for backup and restore. The pbm list command shows backup snapshots and PITR-enabled oplog ranges.
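A typical scripted sequence, using a replica set connection string as recommended above (hosts, user, and replica set name are illustrative assumptions):

```shell
# Point PBM at the replica set rather than a single host, so one down
# node does not break scripted backups. All values are illustrative.
export PBM_MONGODB_URI="mongodb://pbmUser:secret@mongo1.example.com:27017,mongo2.example.com:27017/?replicaSet=rs0&authSource=admin"

pbm backup   # start a snapshot backup
pbm list     # show completed snapshots and PITR-enabled oplog ranges
```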

If you have a large backup, you can track its progress in the pbm-agent logs. While a backup runs, the agent logs when the full snapshot completes and when incremental oplog capture begins, along with its sleep interval (10 minutes in this example). This is the Backup Progress Logging mentioned above.

Physical/filesystem backups for MongoDB: Speed and simplicity

Physical backups involve snapshotting or copying the underlying MongoDB data files (the dbPath directory) at a specific point in time. These are generally faster for large databases.

Methods for physical backups:

  1. Manual File Copy (e.g., rsync): Requires careful handling of consistency (e.g., stopping writes or using fsyncLock()).

  2. LVM Snapshots: Filesystem-level snapshots providing a point-in-time view.

  3. Cloud Disk Snapshots (AWS, GCP, Azure): Convenient for cloud-hosted MongoDB.

  4. Percona Server for MongoDB Hot Backup: An integrated open-source feature creating physical backups on a running server with minimal performance degradation. 
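For the manual-copy and LVM approaches, consistency is usually handled by flushing and locking writes around the snapshot. A sketch, assuming an LVM logical volume mongo-lv in volume group vg0 (all names are illustrative):

```shell
# Flush writes to disk and block new ones on this node
# (run against a secondary to avoid impacting the primary).
mongosh --quiet --eval 'db.fsyncLock()'

# Take a point-in-time LVM snapshot of the data volume.
lvcreate --size 10G --snapshot --name mongo-snap /dev/vg0/mongo-lv

# Unlock writes as soon as the snapshot exists.
mongosh --quiet --eval 'db.fsyncUnlock()'
```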

Pros of physical backups

  • Fast: Usually faster than logical backups, especially for large datasets.

  • Easy to Copy/Share: Backup files can be easily moved.

  • Good for Node Rebuilds: Convenient for quickly spinning up new nodes.

Cons of physical backups

  • Less Granular Restore: Typically restores the entire dataset; specific DB/collection restores are harder.

  • No Native Incremental (Generally): Standard filesystem copies don’t offer easy incremental options without additional tooling.

  • Consistency Management: Requires stopping writes (e.g., db.fsyncLock()) or shutting down mongod cleanly on the node being snapshotted to ensure data consistency, unless using a tool designed for hot physical backups. Running the backup on a dedicated (possibly hidden) secondary is a best practice here.

Backup time comparison (Illustrative)

Below is the backup time consumption comparison for the same dataset:

DB Size: 267.6GB
Index Size: <1MB (since it was only on _id for testing)

=============================

  1. Percona Server for MongoDB’s hot backup:

Syntax:
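The hot backup is triggered with an admin command on the running server; a sketch via mongosh (the backup directory is an illustrative absolute path):

```shell
# Percona Server for MongoDB hot backup: create a physical backup of a
# running server into an absolute backupDir path.
mongosh --quiet --eval 'db.getSiblingDB("admin").runCommand({createBackup: 1, backupDir: "/backups/hot-backup"})'
```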

Best Practice Tip (Percona Hot Backup): The backup path (backupDir) should be absolute. It supports filesystem and AWS S3. It is recommended to run hot backups against secondary nodes.

Notice that the time taken by Percona Hot Backup was only approximately four minutes.

This is very helpful when rebuilding a node or spinning up new instances/clusters with the same dataset. The best part is it doesn’t compromise performance with locking of writes or other performance hits. 


  2. Filesystem snapshot:

The snapshot completed in approximately four minutes.

  3. mongodump:

Results: As this quick example with the same dataset shows, both the filesystem-level snapshot and the Percona Server for MongoDB Hot Backup methods completed in roughly 3-5 minutes, while mongodump took almost 15 minutes to complete just 20% of the dump. Backing up data with mongodump is therefore much slower than the other two options discussed. That is where the s2 compression and parallelized threads of Percona Backup for MongoDB can help.

Learn more about physical backup support in Percona Backup for MongoDB

Key factors & best practices when choosing a MongoDB backup solution

Selecting the right MongoDB backup solution requires considering several factors, incorporating best practices:

Scalability

To ensure the longevity of a MongoDB database, a backup solution must be created with the database’s growth in mind. MongoDB is a flexible NoSQL database that can expand horizontally by incorporating additional servers or shards and vertically by increasing the resources available on existing servers.

Furthermore, an effective MongoDB backup solution should incorporate scalable storage alternatives, such as cloud storage or distributed file systems. These solutions allow you to expand storage capacity without requiring significant alterations to your existing backup infrastructure.

Performance

MongoDB backup solutions can have a significant impact on database performance, particularly when you are backing up large databases or using them during peak usage hours. Here are some of the things to consider when choosing a backup solution to minimize its impact on your MongoDB database performance:

  • The type of backup solution: Full backups are time-consuming and resource-intensive. In contrast, incremental backups only save changes since the last backup and are typically faster and less resource-intensive.
  • Storage destination: Backups stored on the same disk as the database can impact read and write operations, while backups stored on a remote server can increase network traffic and cause latency.
  • Database size: The larger the database, the longer it will take to back up and restore. 
  • Frequency of backups: Frequent backups consume more resources, while infrequent backups increase the risk of data loss. Balancing data protection and database performance is important to achieve optimal results.
  • Backup schedule: To minimize any effect on database users, schedule backups during off-peak hours.
  • Compression and security: Although compression and encryption can reduce the backup size and improve security, they may also impact database performance. Compression necessitates additional CPU resources, while encryption requires additional I/O resources, both of which can potentially affect database performance.

Security

Backing up your MongoDB database is critical to safeguard your data from unauthorized access, damage, or theft. Here are some ways in which a MongoDB backup solution can help:

  • Disaster recovery: A backup solution helps you recover your data after a natural disaster or a cyberattack. Regularly backing up your MongoDB database ensures that you can restore your data to a previous state if it gets lost or corrupted.
  • Data encryption: Sensitive data can be kept secure with data encryption at rest and in transit via a backup solution.
  • Access control: A good backup solution lets you regulate data access and set up encryption and authentication protocols to ensure only authorized users have access.
  • Version control: A backup solution makes it easier to track different versions of your data, enabling you to roll back to a previous version (or compare versions over time).
  • Offsite backup: Offsite backups protect data from physical theft or damage. It can also help you comply with any regulations requiring off-site backup storage.

Recommendations: Choosing your MongoDB backup method

The optimal MongoDB backup strategy depends on infrastructure, environment, resources, dataset size, and load. Consistency and managing complexity are paramount for distributed systems.

  • Small Instances: Simple logical backups via mongodump (following best practices like running on secondaries and using --oplog) might suffice.

  • Medium to Large Databases (~100GB+): Utilize tools like Percona Backup for MongoDB (PBM). Its support for incremental backups, consistent oplog capture for sharded clusters, and Point-in-Time Recovery (PITR) capabilities are essential best practices for minimizing potential data loss and ensuring robust recovery. PBM’s features like cloud integration, efficient compression, and low production impact make it a strong choice.

  • Very Large Systems (1TB+): Physical file system-level snapshot backups often become necessary for speed. Tools like Percona Server for MongoDB’s Hot Backup feature offer a reliable open-source solution for taking consistent physical backups with minimal performance impact.

Download Percona Backup for MongoDB

FAQs: MongoDB backup best practices

1. What is the best way to back up MongoDB?
A: The “best” way depends on size and requirements. Key MongoDB backup best practices include: using mongodump with --oplog on secondaries for smaller setups; leveraging tools like Percona Backup for MongoDB (PBM) for larger, sharded environments needing PITR and consistency; or using physical snapshots (like Percona Hot Backup) for very large databases where speed is critical. Regularly testing restores is also a vital best practice.

2. How often should MongoDB be backed up?
A: Backup frequency should be determined by your Recovery Point Objective (RPO) – how much data you can afford to lose. For critical systems, a MongoDB backup best practice is to take frequent full or incremental backups (e.g., daily) combined with continuous Point-in-Time Recovery (PITR) oplog capture (e.g., every few minutes via tools like PBM).
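A common way to implement such a schedule is a cron entry driving the backup tool; a sketch, assuming PBM with PITR already enabled (the time and log path are illustrative):

```shell
# Illustrative crontab entry: nightly PBM snapshot at 02:00, with
# continuous PITR oplog slicing configured separately via `pbm config`.
0 2 * * * pbm backup >> /var/log/pbm-backup.log 2>&1
```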

3. What is Point-in-Time Recovery (PITR) for MongoDB and why is it a best practice?
A: PITR allows you to restore your MongoDB database to a specific moment in time, rather than just to the time of the last full backup. It’s a crucial MongoDB backup best practice as it combines a full backup with continuously archived oplog (transaction log) entries. This minimizes data loss in case of corruption or accidental deletion. Tools like Percona Backup for MongoDB facilitate PITR.

4. Should I run MongoDB backups on the primary or secondary nodes?
A: A widely recommended practice is to run backups (both logical like mongodump and physical like snapshots or Hot Backup) on secondary nodes of a replica set. This minimizes the performance impact on your primary node, which is serving live application traffic.

5. How important is testing MongoDB backups?
A: A backup is useless if it cannot be successfully restored. Testing validates your backup integrity, your restore procedure, and helps estimate your Recovery Time Objective (RTO).
