MongoDB 101: How to Tune Your MongoDB Configuration After Upgrading to More Memory

In this post, we will discuss what to do when you add more memory to your MongoDB deployment, a common practice when scaling resources.

Why Might You Need to Add More Memory?

Scaling is how you add capacity to your environment as demand grows. There are two main ways this can be accomplished: vertical scaling and horizontal scaling.

  • Vertical scaling is increasing the hardware capacity of a given instance, giving you a more powerful server.
  • Horizontal scaling is adding more servers to your architecture. A standard approach for horizontal scaling, especially for databases, is load balancing and sharding.

As your application grows, its working set grows with it, and bottlenecks appear when data that doesn’t fit into memory has to be retrieved from disk. Reading from disk is a costly operation, even with modern NVMe drives, so at some point you will need to apply one of the scaling approaches mentioned above.

In this case, we will discuss adding more RAM, which is usually the fastest and easiest way to scale hardware vertically, and how having more memory can be a major help for MongoDB performance.

How to Calculate Memory Utilization in MongoDB

Before we add memory to our MongoDB deployment, we need to understand our current memory utilization. This is best done by querying serverStatus and looking at the data it reports for the WiredTiger cache.

Since MongoDB 3.2, MongoDB has used WiredTiger as its default storage engine, and by default MongoDB reserves for the WiredTiger cache whichever is larger: 50% of (total RAM minus 1 GB), or 256 MB.

For example, a system with 16 GB of RAM would have a WiredTiger cache size of 7.5 GB ((16 GB - 1 GB) × 0.5).
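
The following snippet is just a quick sketch of that same default-size rule in shell-style JavaScript; the helper name defaultCacheGB is ours for illustration, not a MongoDB API:

```javascript
// Sketch: the default WiredTiger cache-size rule,
// i.e. the larger of 50% of (RAM - 1 GB) and 256 MB.
function defaultCacheGB(totalRamGB) {
  const half = 0.5 * (totalRamGB - 1); // 50% of (RAM - 1 GB)
  const floorGB = 256 / 1024;          // 256 MB expressed in GB
  return Math.max(half, floorGB);
}

defaultCacheGB(16); // 7.5 (GB), matching the example above
```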

The size of this cache is important for WiredTiger performance, so it is worth checking whether you should change it from the default. A good rule of thumb is that the cache should be large enough to hold your application's entire working set.
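
If you do decide to change the cache size, it can be set permanently in mongod.conf (storage.wiredTiger.engineConfig.cacheSizeGB) or adjusted at runtime through the wiredTigerEngineRuntimeConfig parameter. The snippet below is a minimal sketch of the runtime approach; the 8 GB value is purely illustrative:

```javascript
// Sketch: resize the WiredTiger cache at runtime (the 8G value is illustrative).
// Note: this does not persist across restarts; set
// storage.wiredTiger.engineConfig.cacheSizeGB in mongod.conf for a permanent change.
db.adminCommand({
  setParameter: 1,
  wiredTigerEngineRuntimeConfig: "cache_size=8G"
});
```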

How do we know whether to alter it? Let’s look at the cache usage statistics:
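
In the mongo shell, one quick way to pull these statistics is:

```javascript
// Show the WiredTiger cache section of serverStatus
db.serverStatus().wiredTiger.cache
```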

There’s a lot of data here about WiredTiger’s cache, but we can focus on the following fields:

  • wiredTiger.cache.maximum bytes configured – This is the current maximum cache size, i.e., how large the WiredTiger cache is allowed to grow.
  • wiredTiger.cache.bytes currently in the cache – This is the size of the data currently in the cache. It is typically around 80% of your cache size, plus the amount of “dirty” data that has not yet been written to disk, and it should not be greater than the maximum bytes configured. A value equal to or greater than the maximum bytes configured is a strong indicator that you should already have scaled up.
  • wiredTiger.cache.tracked dirty bytes in the cache – This is the size of the dirty data in the cache. It should stay below five percent of your cache size and can be another indicator that you need to scale. Once it exceeds five percent of the cache size, WiredTiger becomes more aggressive about evicting data from the cache, and in some cases application threads are forced to help with eviction before their writes can complete (see the sketch after this list for how to compute both of these ratios).
  • wiredTiger.cache.pages read into cache – This is the cumulative number of pages read into the cache; watching its per-second rate tells you how much data is flowing into your cache.
  • wiredTiger.cache.pages written from cache – This is the cumulative number of pages written from the cache to disk. It is especially heavy just before a checkpoint, and if its rate keeps increasing, your checkpoints will take longer and longer.
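
As a rough sketch, you can turn these raw counters into the two percentages discussed above directly in the mongo shell (the field names below are the exact serverStatus keys quoted in the list):

```javascript
// Sketch: compute cache-fill and dirty-cache ratios from serverStatus.
const cache = db.serverStatus().wiredTiger.cache;

const maxBytes   = cache["maximum bytes configured"];
const usedBytes  = cache["bytes currently in the cache"];
const dirtyBytes = cache["tracked dirty bytes in the cache"];

print("cache used : " + (100 * usedBytes / maxBytes).toFixed(1) + "%");   // worrying as it approaches 100%
print("cache dirty: " + (100 * dirtyBytes / maxBytes).toFixed(1) + "%");  // worrying above ~5%
```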

Looking at the above values, we can determine whether we need to increase the size of the WiredTiger cache for our instance. We might also look at the WiredTiger Concurrency