MongoDB Sharding: Are Chunks Balanced (Part 1)?

In this blog post, we will look at how chunks split and migrate in MongoDB sharding.

Sometimes even a good shard key can result in imbalanced chunks. Below, we’ll look at how jumbo chunks affect MongoDB sharding, how they are created, and why they need to be split into smaller chunks. We will also introduce some tools you can use to find and split those chunks, so the balancer can migrate them across shards.

Before we start, please check that the balancer is running:
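For example, in the mongos shell the balancer state can be checked with the built-in helpers (a quick sketch; run it through mongos):

```javascript
// Check whether the balancer is enabled for the cluster
sh.getBalancerState()     // true if enabled

// Check whether a balancing round is currently in progress
sh.isBalancerRunning()

// If the balancer is disabled, enable it
sh.startBalancer()
```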

The most important consideration for sharding collections and distributing data efficiently across shards is the selection of a shard key. Or, to be more specific, the selection of a good shard key: MongoDB partitions a collection’s data into ranges of shard key values, and each range is associated with a chunk.

What is a good shard key?

A good shard key enables MongoDB to distribute documents evenly throughout shards. A key with high cardinality (for better horizontal scaling), low frequency (to prevent uneven document distribution), and values that do not increase or decrease monotonically is considered a good shard key.
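As a sketch, such a key might be set up like this (the “shop.orders” namespace and the { custId: 1, prodId: 1 } key are hypothetical examples, not from any particular deployment):

```javascript
sh.enableSharding("shop")

// A supporting index must exist for the shard key
db.orders.createIndex({ custId: 1, prodId: 1 })

// Compound key: custId alone may repeat often (low frequency for hot
// customers), but combined with prodId it spreads documents across
// many distinct, non-monotonic ranges
sh.shardCollection("shop.orders", { custId: 1, prodId: 1 })
```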

Ok, I have a good shard key, but my chunks are still not balanced

If your chunks are not balanced despite using a good shard key, check for jumbo chunks. They are one possible reason the balancer cannot migrate chunks, which leads to an uneven distribution of chunks across the shards — and ultimately to performance issues.

What are jumbo chunks?

MongoDB splits chunks when they grow beyond the configured chunk size (64 MB by default) or exceed 250,000 documents. The chunk size is configurable, and you can change it per your requirements. The balancer migrates these split chunks between shards to achieve an equal distribution per shard.
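The chunk size can be changed cluster-wide by writing to the config database through mongos (a sketch; the value is in megabytes):

```javascript
use config
db.settings.save({ _id: "chunksize", value: 32 })  // lower the chunk size to 32 MB
```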

Sometimes a chunk cannot be split and continues to grow beyond the configured size. The balancer cannot move it. Such chunks remain on their particular shard and are called jumbo chunks.

How can I figure out if my shard has jumbo chunks or not?

If MongoDB is not able to split a chunk that exceeds the maximum chunk size or the maximum number of documents, that chunk is marked as “jumbo”.

These can be found by checking the shard status:
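Passing true to sh.status() prints per-chunk details, where a jumbo chunk appears with a “jumbo” tag once mongos knows about it:

```javascript
sh.status(true)
```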

The chunk should carry a flag, but we can’t see one yet because mongos has not tried to move the chunk and is therefore unaware that it is a jumbo chunk.

Let’s try to remove a shard in this same example. In this process, mongos has to move the chunks. Let’s see what happens:
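Removing a shard is initiated with the removeShard admin command (the shard name below is a placeholder), and re-running the same command reports the draining progress:

```javascript
db.adminCommand({ removeShard: "shard0000" })

// Re-run later to check how many chunks remain to drain
db.adminCommand({ removeShard: "shard0000" })
```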

It started moving the chunks, as it is the primary shard. I have already moved the database “jumbo_t” to another shard, and the draining of chunks is taking a long time. We have to figure out what is wrong:
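One way to check is to query the config database directly for chunks flagged as jumbo (the shard name is a placeholder):

```javascript
use config
db.chunks.find({ shard: "shard0000", jumbo: true }).pretty()
```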

Jumbo found! Now at this time, mongos is aware that it has a jumbo chunk.

Let’s see what the output of same sh.status() is now:

Please note that only one chunk moved to the other shard, while the other did not. The balancer can’t move it, and it is now flagged as “jumbo”. So it is sometimes the attempt to move a chunk that triggers mongos to mark a large chunk as “jumbo”.

Ok, I have jumbo chunks in my shard, but why should I bother if all my chunks are distributed fairly?

These jumbo chunks are inconvenient to deal with, and they are a possible cause of performance degradation in a sharded environment.

Let’s look at an example. There are two shards: shard1 and shard2. shard1 has jumbo chunks, and all the writes are routed to shard1. mongos will try to balance the number of chunks evenly between the shards, but the only chunks the balancer can migrate are non-jumbo chunks.

A common misconception about the balancer is that it balances chunks by data size. This isn’t true. It simply balances the number of chunks, migrating once the difference between shards exceeds the migration threshold (that’s why you see chunks equally distributed). Hence a chunk with 0 documents in it counts just the same as one with 500k documents.

Now, let’s get back to the example. If jumbo chunks are created on shard1, those chunks are larger than 64 MB. shard1 fills up faster than shard2, even though the number of chunks is balanced.

More data on shard1 means more queries are routed to shard1 than to shard2. One shard ends up with a higher load than the other, which leads to performance issues. The usual load-balancing assumptions no longer hold in the presence of jumbo chunks.

Let’s consider one more example. Chunks are distributed among the shards equally, but we can check the chunk size and document details specific to the collections:
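The per-shard data size, document count, and chunk count for a collection can be inspected with getShardDistribution():

```javascript
db.col.getShardDistribution()
```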

In the above example, you can see the shard distribution for the collection “col”. It shows two chunks on each shard (shard0000 and shard0001); shard0001 has no data, while shard0000 holds all the data, with each chunk at 129 MB. There are two chunks on each shard because range-based sharding is being used: MongoDB allocated two chunks to each shard initially, and documents were then allocated to those chunks.

How are jumbos created?

Let’s figure out how jumbos are created, and what makes them turn into non-splitting chunks that grow beyond a reasonable size.

  • The main reason for jumbos is running multiple mongos instances, or restarting mongos regularly. This causes splitIfShould not to be called often enough, which prevents chunks from splitting, and the balancer won’t be able to move them.
  • Each mongos tracks how much data it has seen inserted or updated for each chunk. On each write, a call to splitIfShould is made, which sends the internal “splitVector” command to the primary shard that owns the chunk. If mongos is restarted, it loses this in-memory accounting. Frequent mongos restarts therefore leave many chunks that could split, but don’t. This causes chunk imbalance across the shards.

Are these jumbos curable?

Yes, these can be fixed by performing a manual split using the “split” command. The chunks can be split into smaller pieces and then easily moved by the balancer. For more specific information on how to manually use the splitAt() and splitFind() commands, please refer to this blog post written by Miguel Angel Nieto.
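As a sketch (namespace and key values are placeholders), splitFind() splits the chunk containing a matching document at its median point, while splitAt() splits at the exact key you supply:

```javascript
// Split the chunk containing { custId: 100 } at its median
sh.splitFind("shop.orders", { custId: 100 })

// Split the chunk containing this key, using the key itself as the split point
sh.splitAt("shop.orders", { custId: 200 })
```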

I cannot see any jumbo chunks, and chunks are distributed evenly in each shard but of different sizes

Chunk distribution is fairly equal among the shards, but the number of documents per chunk varies, and that does not necessarily cause performance issues. The data sizes differ for two possible reasons: differences in the sizes of the documents themselves, and the shard key range that determines which chunk a document resides in.

As we discussed, the balancer just balances the number of chunks (not based on the size).

Let’s consider a test case with two shards, a shard key on (custId and prodId), and range-based sharding. Chunks are split based on the ranges used, so the number of documents per chunk can vary depending on the insertion pattern. This can lead to differences in the chunks’ sizes:

shard1: The number of documents against the same custId can be different, as well as the size of the documents:

shard2: Here there are more documents against the same custId, and the size of the documents also might vary:
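A hypothetical skewed workload illustrates this: under a range-based key on (custId, prodId), one busy customer can inflate the chunks covering their range while other ranges stay almost empty:

```javascript
// One "hot" customer with many large documents
for (let i = 0; i < 100000; i++) {
  db.orders.insertOne({ custId: 1, prodId: i, payload: "x".repeat(1024) })
}

// Many customers with only one small document each
for (let c = 1000; c < 1100; c++) {
  db.orders.insertOne({ custId: c, prodId: 0 })
}

// The chunks covering custId 1 grow far faster than those covering custId 1000+
```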

Hashed sharding is considered good for shard keys on fields that change monotonically. If you need the data to be split evenly among shards, then a hashed index must be used. For details on range-based and hashed-based sharding, please check here under the heading “Hashed vs. Ranged Sharding”.
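For a monotonically increasing field such as _id, a hashed shard key can be declared like this (the namespace is a placeholder):

```javascript
sh.shardCollection("shop.events", { _id: "hashed" })
```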

I hope this blog helps you understand the possible causes of uneven chunk distribution in MongoDB sharding, and how and when chunks become eligible for splitting and migration. Jumbo chunks need to be further split and migrated, and they can be sorted out with some chunk management tools. You also need to understand how range-based sharding makes chunk sizes differ, when to use hashed versus range-based sharding for your requirements, and how to choose the shard key correctly.
