On behalf of the entire Percona product team for MongoDB, I’m excited to announce a significant enhancement to Percona Server for MongoDB: File Copy-Based Initial Sync (FCBIS). It is designed to accelerate your large-scale database deployment with a more efficient method for initial data synchronization. FCBIS reduces the time and resources required by the initial sync process.
The challenge
Initial sync, a critical step in deploying MongoDB Replica Set members, relies on data replication techniques that are time-consuming for large datasets. The default logical method of initializing new Percona Server for MongoDB members is too slow to meet the fast, dynamic scalability needs of the digital world. Moreover, when a server has to be recovered, it is especially critical to bring it back as fast as possible. Until now, Percona and MongoDB Community users had only this approach to set up new Replica Set members or to recover members that had fallen out of the available oplog window. This challenge led us to close the gap with MongoDB Enterprise Advanced and develop a source-available solution that accelerates the initial sync process without compromising on reliability or introducing vendor lock-in.
The solution: File Copy-Based Initial Sync
File Copy-Based Initial Sync represents a leap forward in Percona Server for MongoDB deployment efficiency. With this feature, Percona Server for MongoDB users can significantly reduce the time and resources required to bootstrap new clusters or add new nodes to existing deployments. This is achieved by copying data files directly from a fully synced MongoDB instance to the new node, which avoids the lengthy logical replication of documents and the index rebuilds that come with it.
How it works
When a new MongoDB node is added to an existing Replica Set, it picks a source node with a compatible setup. It then opens a backup cursor on the source to get a list of files to copy and a timestamp that marks the point at which the data is consistent. Next, the file copy begins while the source node continues to operate normally. To prevent the data gap between source and target from getting too big, the target node executes $backupCursorExtend to pull in the latest changes that happened while the files were being copied. Depending on how long the file copy takes, this “catch-up” step might happen a few times. Once the files are copied and the gap between the source and target nodes’ oplogs is acceptable, the target node closes the backup cursor. Finally, it moves the copied files to its dbPath and adjusts timestamps to ensure data consistency. At this point, the node is ready to join the set and start applying the oplog normally.
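The “catch-up” loop described above can be illustrated with a small model. This is only a sketch under simplifying assumptions, not the server’s implementation: the function name, its parameters, and the constant copy and write rates are all hypothetical, and each round stands for one file copy (the first one after opening the backup cursor, later ones after a $backupCursorExtend call).

```python
from typing import List, Tuple


def simulate_catch_up(
    initial_gb: float,
    copy_gb_per_s: float,
    write_gb_per_s: float,
    max_lag_gb: float,
    max_rounds: int = 10,
) -> List[Tuple[float, float]]:
    """Model the FCBIS copy rounds as (copied_gb, lag_gb_after_round) pairs.

    Assumption: the source produces new data at a constant rate and the
    target copies files at a constant rate. Real behavior depends on
    network, disk I/O, and workload.
    """
    if write_gb_per_s >= copy_gb_per_s:
        raise ValueError("source writes faster than the target copies; lag never shrinks")

    rounds = []
    to_copy = initial_gb  # first round: files listed by the backup cursor
    for _ in range(max_rounds):
        seconds = to_copy / copy_gb_per_s
        lag = seconds * write_gb_per_s  # data written on the source meanwhile
        rounds.append((to_copy, lag))
        if lag <= max_lag_gb:
            return rounds  # gap is acceptable: the cursor can be closed
        to_copy = lag  # next round: copy what the extended cursor reports
    raise RuntimeError("lag still too large after max_rounds")


# Hypothetical numbers: 1024 GB dataset, 1 GB/s copy, 0.125 GB/s of new writes.
rounds = simulate_catch_up(1024.0, 1.0, 0.125, max_lag_gb=4.0)
for number, (copied, lag) in enumerate(rounds, start=1):
    print(f"round {number}: copied {copied} GB, remaining lag {lag} GB")
```

With these made-up numbers the loop finishes after three rounds, because each round has less data to copy than the one before it. When the source writes as fast as the target can copy, the gap never closes, which is why the model rejects that case up front.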
Comparison of performance: Logical vs. File Copy-Based Initial Sync
By default, Percona Server for MongoDB and MongoDB Community users have relied on the logical initial sync method, in which the database instance copies data by reading and writing BSON documents and then rebuilds the indexes, which is time-consuming. While effective, this process can be resource-intensive, particularly with large datasets. In contrast, FCBIS optimizes this procedure by directly copying database files from a source instance to the target, significantly reducing sync times and server overhead. This method ensures a faster and more efficient initialization of DB instances, empowering users to deploy and scale their databases with enhanced agility.
During our performance tests, we’ve observed that file copy-based initial sync performs predictably and that initial sync times are similar despite differences in data distribution or number of indexes. The overall initial sync time depends mainly on external factors such as network, disk I/O utilization, and the size of the files to copy. We plan to publish another blog post with detailed benchmark testing soon.
Getting started with FCBIS on Percona Server for MongoDB
To start benefiting from FCBIS on Percona Server for MongoDB, follow these simple steps:
- Update to the latest version of PSMDB that includes support for FCBIS (7.0.22-12 and the upcoming 8.0 release):
  - If subscribed to Percona Services, use one of our ProBuilds as a turnkey solution.
  - Alternatively, build from sources for free.
- Configure the new Replica Set member: To enable file copy-based initial sync, set the initialSyncMethod parameter to fileCopyBased on the member that will receive the initial sync.
- Adjust initial sync: Configuration parameters allow you to customize the number of sync attempts and the maximum lag size for the FCBIS process.
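As an illustration of the configuration step, here is a minimal sketch that builds a mongod command line with initialSyncMethod set to fileCopyBased. The helper name, the replica set name rs0, and the dbPath are hypothetical; the tuning parameters mentioned above are not named here, so the helper simply accepts any extra parameters you take from the documentation.

```python
from typing import Dict, List, Optional


def fcbis_mongod_args(
    repl_set: str,
    dbpath: str,
    extra_params: Optional[Dict[str, object]] = None,
) -> List[str]:
    """Build mongod arguments for a new member that should use FCBIS.

    extra_params holds any additional server parameters (for example the
    FCBIS tuning ones); look up their exact names in the documentation.
    """
    params: Dict[str, object] = {"initialSyncMethod": "fileCopyBased"}
    params.update(extra_params or {})

    args = ["mongod", "--replSet", repl_set, "--dbpath", dbpath]
    for name, value in params.items():
        # Server parameters are passed as --setParameter name=value.
        args += ["--setParameter", f"{name}={value}"]
    return args


# Hypothetical replica set name and data directory.
args = fcbis_mongod_args("rs0", "/var/lib/mongo")
print(" ".join(args))
```

The same parameter can also be placed in the setParameter section of the mongod configuration file instead of being passed on the command line.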
By adopting FCBIS, Percona Server for MongoDB users can streamline their database deployment workflows, reduce operational complexity, and achieve faster time-to-market for their applications. This is an important innovation that not only boosts scalability but also speeds up:
- Recovery from a failure, when a full re-sync is required
- Migration between private, hybrid, or cloud data centers
Stay tuned for more updates and insights as we continue to enhance Percona Server for MongoDB to meet the evolving needs of our community and customers.
For more information on how FCBIS can benefit your MongoDB deployment strategy, visit our documentation or reach out to our support team.