MongoDB high availability is essential to ensure reliability, customer satisfaction, and business resilience in an increasingly interconnected and always-on digital environment. Ensuring high availability for database systems introduces complexity, as databases are stateful applications. Adding a new operational node to a cluster can take hours or even days, depending on the dataset size. This is why having the correct topology is crucial.

In this blog post, we review the minimal topology design needed to get as close as possible to the five-nines rule. We will also look at how to deploy a multi-region MongoDB replica set for HA purposes in case of a total outage in one region. This means distributing the nodes so that the cluster can maintain availability even if an entire region is lost.

Two-region topology

Let us start with a 3-node replica set with the default configuration.
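For reference, a minimal local setup could look like the following sketch, assuming three mongod processes are already running on ports 27000, 27001, and 27002, and assuming a replica set name of rs0 (the name is an assumption, not something mandated here):

    // Connect to one of the nodes, e.g.: mongosh --port 27000
    // Initiate the replica set with the three local nodes and otherwise default settings
    rs.initiate({
      _id: "rs0",
      members: [
        { _id: 0, host: "localhost:27000" },
        { _id: 1, host: "localhost:27001" },
        { _id: 2, host: "localhost:27002" }
      ]
    })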

The nodes will be distributed as follows:

Region A
localhost:27000
localhost:27001

Region B
localhost:27002

At this point, the status of the cluster should show all three members healthy, with one PRIMARY and two SECONDARY nodes, each holding one vote.
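As a quick check from mongosh (the exact output depends on your environment), the member states can be summarized like this:

    // Summarize the name, state, and health of each replica set member
    rs.status().members.map(m => ({ name: m.name, state: m.stateStr, health: m.health }))
    // At this point we expect one PRIMARY and two SECONDARY members, all with health 1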

Now, let’s simulate Region A becoming unavailable. This can be done either by killing the processes or by rejecting network traffic to the nodes; the outcome is the same, and both options are sketched after the list below.

  • Killing the processes

  • Rejecting network traffic
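On a local test setup, either approach could look roughly like this; the use of lsof and iptables is an assumption and will depend on your platform and on how the nodes were started:

    # Option 1: kill the mongod processes listening on the Region A ports
    kill $(lsof -t -i :27000) $(lsof -t -i :27001)

    # Option 2: reject incoming traffic to those ports instead (requires root)
    iptables -A INPUT -p tcp --dport 27000 -j REJECT
    iptables -A INPUT -p tcp --dport 27001 -j REJECT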

Our cluster now has two unreachable nodes.
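Connected to the surviving node (localhost:27002), this can be confirmed with something like:

    // List the members that the surviving node can no longer reach (health 0)
    rs.status().members.filter(m => m.health === 0).map(m => m.name)
    // Expected here: the two Region A nodes, localhost:27000 and localhost:27001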

At this point, only one node is available, and it cannot become primary because the majorityVoteCount is not met. This leaves the cluster unable to accept writes, effectively putting it into a read-only state.

Messages about the failed heartbeats to the unreachable members should appear in the logs of the remaining node.
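If the log file is not directly at hand, the most recent log lines can also be pulled from the shell:

    // Fetch the recent in-memory log entries from the surviving node
    const res = db.adminCommand({ getLog: "global" })
    // Print the most recent lines
    res.log.slice(-20).forEach(line => print(line))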

In this case, we cannot bring the other nodes up again, so until Region A becomes available again we must either remove the unreachable nodes or change their priority and votes properties in the replica set configuration.

Reconfiguring the replica set

Removing the nodes with rs.remove() is not possible, as a PRIMARY node is needed to execute that command. Instead, we must modify the replica set configuration object and force a reconfiguration of the replica set, either by removing the down nodes or by setting their priority and votes to 0.

  • Removing the nodes

1. First, save the output of rs.config() to a variable.

2. Print the members field to identify the unreachable nodes (output suppressed for better readability).

3. Remove the unreachable nodes from the configuration, leaving only the healthy node.

4. Then, reconfigure the replica set with the force option, as shown in the sketch after these steps.
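A sketch of these four steps from mongosh, connected to the surviving node, assuming the topology above where localhost:27002 is the healthy node:

    // 1. Save the current replica set configuration to a variable
    cfg = rs.config()

    // 2. Print the members field to identify the unreachable nodes
    cfg.members

    // 3. Keep only the healthy node in the configuration
    cfg.members = cfg.members.filter(m => m.host === "localhost:27002")

    // 4. Force the reconfiguration; force is required because there is no PRIMARY
    rs.reconfig(cfg, { force: true })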

  • Changing the priority and votes settings. This method is less aggressive and makes it easier to bring the nodes back once the outage is over.

1. First, save the output of rs.config() to a variable.

2. Modify the priority and votes values in the cfg variable for the unreachable nodes.

3. Then, reconfigure the replica set with the force option, as shown in the sketch after these steps.
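Again connected to the surviving node, a sketch of this method, assuming member indexes 0 and 1 correspond to the two Region A nodes:

    // 1. Save the current replica set configuration to a variable
    cfg = rs.config()

    // 2. Set priority and votes to 0 for the two unreachable Region A members
    cfg.members[0].priority = 0
    cfg.members[0].votes = 0
    cfg.members[1].priority = 0
    cfg.members[1].votes = 0

    // 3. Force the reconfiguration; force is required because there is no PRIMARY
    rs.reconfig(cfg, { force: true })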

If we now check the status of the cluster, we will see that the number of nodes needed to elect a PRIMARY is now one, and our remaining node will change its state to PRIMARY.
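This can be verified directly from the shell (the majorityVoteCount field is reported by rs.status() in recent MongoDB versions):

    // The vote majority should now be 1...
    rs.status().majorityVoteCount
    // ...and the surviving node should now report itself as a writable primary
    db.hello().isWritablePrimary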

Once Region A becomes available again, we must revert the changes to get our nodes back into the replica set. This can be done using the rs.add() helper, or by manually editing the replica set configuration object as we did before and reconfiguring the replica set with rs.reconfig(). The options are sketched after the following list.

  • Using rs.add() to add the nodes back

  • Reconfiguring the replica set by setting the priority and votes

  • Reconfiguring the replica set by adding the configuration for the missing nodes
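A sketch of these options, depending on which method was used earlier. The host names come from the topology above; the member indexes and the _id values for the re-added members are assumptions based on the original configuration, and these commands are run on the current PRIMARY:

    // Option 1: if the nodes were removed, add them back with rs.add()
    rs.add("localhost:27000")
    rs.add("localhost:27001")

    // Option 2: if only priority and votes were changed, restore them.
    // Recent MongoDB versions allow only one voting-member change per reconfig,
    // so the members are restored one at a time.
    cfg = rs.config()
    cfg.members[0].priority = 1
    cfg.members[0].votes = 1
    rs.reconfig(cfg)
    cfg = rs.config()
    cfg.members[1].priority = 1
    cfg.members[1].votes = 1
    rs.reconfig(cfg)

    // Option 3: if the nodes were removed, re-add their configuration manually,
    // again one voting member per reconfig
    cfg = rs.config()
    cfg.members.push({ _id: 3, host: "localhost:27000" })
    rs.reconfig(cfg)
    cfg = rs.config()
    cfg.members.push({ _id: 4, host: "localhost:27001" })
    rs.reconfig(cfg)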

Three-region topology

So, how do we achieve high availability if a whole region goes down? We need at least three regions, with the nodes spread across them. This way, if a whole region becomes unavailable, we only lose a minority of nodes, which does not impact the cluster’s availability.

In our new scenario, we have three regions: Region A, Region B, and Region C. Our nodes are distributed as follows:

Region A
localhost:27000

Region B
localhost:27001

Region C
localhost:27002

We will simulate a full region outage for Region C.
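On the same local test setup, that could be as simple as stopping the node listening on port 27002 (again, how the process is stopped depends on your environment):

    # Stop the Region C node
    kill $(lsof -t -i :27002)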

We can see that even though one node is down, we can still meet the required majorityVoteCount, which remains 2.
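From one of the two remaining nodes:

    // A PRIMARY is still available among the surviving members
    db.hello().primary
    // The required vote majority is still met
    rs.status().majorityVoteCount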

In this case, we don’t need to change the topology or the configuration and can wait until Region C becomes available again.

Conclusion

To achieve a MongoDB high-availability cluster that can withstand a full regional outage, we must distribute our replica set members across at least three regions. This ensures that we have enough members to maintain a quorum in the event of a full region outage. Automated failover mechanisms, monitoring, regular testing of the system, and regular backups are also important to further safeguard against data loss, minimize recovery time, and ensure the cluster performs reliably under various failure scenarios.

Considerations

It is important to remember that the oplog and flow control also play an important role in this scenario, although a detailed discussion is out of the scope of this article. It is worth mentioning that a sufficiently large oplog window is needed so that the disconnected nodes do not fall so far behind that a full resync is required; as a baseline, an oplog window of 24 hours is a good starting point. Alongside this, flow control should also be monitored to ensure there is no performance degradation on write operations as the replication lag keeps increasing.
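A quick way to keep an eye on both from the shell:

    // Oplog size and the time range it currently covers ("log length start to end")
    rs.printReplicationInfo()

    // Flow control metrics; isLagged indicates whether flow control is currently engaged
    db.serverStatus().flowControl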

Further Reading

Reasons to switch from MongoDB to Percona for MongoDB
