MongoDB provides scalability and high availability with ease. If you already run a sharded cluster, you certainly know what the Balancer does. If you are new to MongoDB: the Balancer is one of the key components of a sharded cluster, and its main goal is to keep the cluster balanced by moving chunks of data between shard pairs in order to achieve an even distribution of the data. A sharded cluster is effective only as long as the data is evenly distributed across the shards; only then can you get the best possible performance.

Along with the data, the workload must also be evenly distributed. This way, you can easily manage the provisioning of your physical resources in terms of CPUs, memory, and disks.

Some changes to the Balancer have been deployed in recent versions of MongoDB. In this article, I’ll show how the Balancer works, how its policy has recently changed, and introduce you to the new Automerger.

How a sharded cluster works in a nutshell

In a sharded cluster, you have shards; each shard is a replica set (at least three nodes are recommended), providing high availability and read scalability. The partitioned collections are split into chunks, and the chunks are distributed across the shards. MongoDB uses the shard key defined on a collection to partition the data into chunks. Each chunk has an inclusive lower bound and an exclusive upper bound based on the shard key. All the documents whose shard key values fall within the boundaries of a chunk are stored in that chunk.

As you insert documents into the collections, new chunks can be created, since chunks have a maximum size. When a chunk reaches its maximum size, it is split into two new chunks, and the boundaries are recalculated for each of them.
Chunk splits happen automatically, but you can also run manual splits using the sh.splitAt or sh.splitFind functions (https://www.mongodb.com/docs/manual/tutorial/split-chunks-in-sharded-cluster/).
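As a rough sketch of what these commands look like in mongosh against a live sharded cluster (the namespace and shard key values below are hypothetical examples):

```javascript
// Shard a collection on a hypothetical { userId: 1 } shard key
sh.shardCollection("mydb.orders", { userId: 1 })

// Split manually at an exact shard key value...
sh.splitAt("mydb.orders", { userId: 1000 })

// ...or split the chunk containing this value at its median point
sh.splitFind("mydb.orders", { userId: 500 })
```

These helpers must be run from a mongos, not directly on a shard.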

In more recent versions, the default maximum chunk size changed from 64 MB to 128 MB. However, you can configure the maximum chunk size yourself using the config database (https://www.mongodb.com/docs/manual/tutorial/modify-chunk-size-in-sharded-cluster/).
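For example, from mongosh connected to a mongos, the cluster-wide chunk size can be changed via the config database; the 256 MB value here is just an illustration:

```javascript
use config
db.settings.updateOne(
   { _id: "chunksize" },
   { $set: { value: 256 } },   // size expressed in MB
   { upsert: true }
)
```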

If you are not familiar with sharding, you can have a look at the documentation: https://www.mongodb.com/docs/manual/sharding/

The goal of the Balancer

The Balancer is a background thread that runs in the PRIMARY node of the Config Server Replica Set. It periodically checks how the chunks and data are distributed in the shards. If some migration thresholds are reached, the Balancer decides to migrate chunks from one shard to another. The main goal is to have roughly the same amount of data in all the shards.

Migrations can be expensive in a busy cluster: every migration requires reading many documents from the source shard and writing them to the target shard. For this reason, there is a limitation: given a shard pair, only one chunk at a time can be moved. Consequently, in an N-shard cluster, at most N/2 chunks can be migrated simultaneously.

If all the shard keys you defined on your collections are optimal, the migrations managed by the Balancer should be minimal, or even close to zero. Suboptimal shard keys or a poor collection design can instead lead to a lot of expensive migrations that slow down the entire cluster. Even worse, the Balancer may be unable to run all the migrations that are needed, leaving you, at some point, with an unbalanced and unmanageable cluster.

If you got the shard key wrong, you can change your mind and create a new one. Unfortunately, re-sharding a large collection is expensive and should be avoided. Have a look at the following article for more details about re-sharding: https://www.percona.com/blog/resharding-in-mongodb-5-0/
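Since version 5.0, resharding can be started with the reshardCollection admin command; the namespace and the new compound key below are hypothetical:

```javascript
// Reshard the collection on a new shard key (a long-running,
// resource-intensive operation on large collections)
db.adminCommand({
  reshardCollection: "mydb.orders",
  key: { customerId: 1, orderDate: 1 }   // the new shard key
})
```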

Balancer policy

From the beginning, the Balancer was designed to migrate chunks between shard pairs when specific migration thresholds were reached. The thresholds apply to the difference in the number of chunks between the shard with the most chunks for a collection and the shard with the fewest chunks for that collection.

The Balancer always moves chunks from the source shard to the target shard with fewer chunks, while keeping the data available for reads and writes. This simple logic was enough to maintain roughly the same number of chunks on all the shards.

The following table shows the migration thresholds.

Number of chunks    Migration threshold
Fewer than 20       2
20–79               4
80 or more          8

Unfortunately, this logic has shown some limitations. In many use cases, chunks can hold very different amounts of data: some can be full, close to the maximum permitted chunk size, while others can be close to empty. This mostly depends on the shard keys you defined on the collections. If you have suboptimal shard keys that don’t provide an even distribution of values, you can easily face this situation.

The migrations done by the Balancer can help, but remember that migrations are costly, and in some cases they cannot be executed fast enough to ensure an even distribution of data. The differences between the shards in terms of data size may then keep diverging.

Starting from MongoDB version 6.0, the Balancer logic has undergone an important change. The policy no longer counts the number of chunks but the real size of the data in the shards. Migration thresholds are still in place, but they are now based on the real size of the data, no matter how many chunks there are.

A collection is deemed balanced if the data size difference between its shards is less than three times the configured range size for that collection. For the default range size of 128 MB, a migration will only occur if the data size difference between two shards of the same collection reaches at least 384 MB. The threshold is always calculated as three times the maximum chunk size, so configuring a different chunk size changes the threshold accordingly.
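The arithmetic is straightforward; this small sketch (plain JavaScript, not a MongoDB API) just illustrates how the threshold follows the configured chunk size:

```javascript
// The 6.0+ balancing threshold is three times the configured
// maximum chunk (range) size, whatever that size is.
function migrationThresholdMB(maxChunkSizeMB) {
  return 3 * maxChunkSizeMB;
}

console.log(migrationThresholdMB(128)); // default chunk size -> 384
console.log(migrationThresholdMB(256)); // custom chunk size  -> 768
```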

The new policy helps in getting better-balanced clusters, even in cases where the shard keys are not optimal. However, keep in mind that very bad shard keys can lead to a very unstable cluster. The new balancer policy is not magic and cannot help in terrible situations.

Manual migrations are still doable, and you can use the moveChunk command.
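A manual migration from mongosh might look like the following; the namespace, key value, and destination shard name are hypothetical:

```javascript
// Move the chunk containing { userId: 12345 } to the shard named "shard02"
sh.moveChunk("mydb.orders", { userId: 12345 }, "shard02")
```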

Further details here:
https://www.mongodb.com/docs/v6.0/core/sharding-balancer-administration/

The new Automerger feature

Merging chunks has always been possible manually: with the mergeChunks admin command, you can combine contiguous chunks residing on the same shard. Merging is useful to avoid having too many chunks with very little data; a large number of chunks also makes the config database larger and less efficient. Since version 7.0, a new Automerger feature has been deployed inside the Balancer, while manual merges with the mergeChunks command remain possible.
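A manual merge with the mergeChunks admin command might look like this; the bounds must be the minimum and maximum shard key values covering contiguous chunks on the same shard (the values here are hypothetical):

```javascript
// Merge the contiguous chunks between MinKey and { userId: 1000 }
db.adminCommand({
  mergeChunks: "mydb.orders",
  bounds: [ { userId: MinKey }, { userId: 1000 } ]
})
```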

Automerger runs as part of the regular Balancer thread in the Primary node of the Config Server replica set.

At every execution, the Automerger checks for chunks of the same collection that meet specific mergeability requirements and automatically merges them. Two or more chunks can be merged at a time. The mergeability requirements are listed here; all of them must be met:

  • The chunks belong to the same shard
  • They are not jumbo chunks
  • Their history can be purged safely without breaking transactions and snapshot reads (more details about that in the documentation)

If you like, the Automerger can be disabled or configured to run only during the Balancer window. The Automerger can be disabled for a collection using the sh.disableAutoMerger( <namespace> ) helper from the mongosh client, and enabled again with sh.enableAutoMerger( <namespace> ).

You can also use the equivalent admin command.
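To my understanding, the per-collection helpers wrap the configureCollectionBalancing admin command, which can also be run directly (the namespace here is hypothetical):

```javascript
// Disable the Automerger for a single collection
db.adminCommand({
  configureCollectionBalancing: "mydb.orders",
  enableAutoMerger: false
})
```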

The Automerger is an interesting new feature that can optimize your cluster automatically and make your life a little easier: you no longer need to think about manual chunk merges. The feature is as simple as it is useful.

In the documentation, you can get further details: https://www.mongodb.com/docs/manual/core/automerger-concept/

Conclusions

The new Automerger introduced in 7.0 and the new Balancer policy from 6.0 simplify administration tasks and help you get a more stable and reliable sharded cluster.

Remember, migrations are expensive. We, as DBAs or developers, have to help MongoDB avoid these costs. Choosing optimal shard keys remains one of the most important things you have to do. Don’t fail at that, and don’t simply trust amazing new features to save you.

MongoDB develops very quickly: a new major version is released every year, and old versions become unsupported in little more than a couple of years. If you’re still using an older version, I strongly recommend upgrading to 7.0 as soon as you can. Version 5.0 reaches end-of-life in October 2024, so in a few days; version 6.0 will reach end-of-life in July 2025. I recommend planning regular upgrades every year, not least because you can benefit from new features like the Automerger, which could be really useful.

