MongoDB: Impact-free Index Builds using Detached ReplicaSet Nodes

MongoDB Impact-free Index BuildsCreating an index on a MongoDB collection is simple; just run the command CreateIndex and that’s all there is to it. There are several index types available, and in a previous post, you can find the more important index types: MongoDB index types and explain().

The command is quite simple, but for MongoDB, building an index is probably the most complicated task. I’m going to explain what the potential issues are and the best way to create any kind of index on a Replica Set environment.

A Replica Set is a cluster of mongod servers, at least 3, where the complete database is replicated. Some of the main advantages of this kind of structure are automatic failover and read scalability. If you need more familiarity with Replica Set, you may take a look at the following posts:

Deploy a MongoDB Replica Set with Transport Encryption (Part 1)

Deploy a MongoDB Replica Set with Transport Encryption (Part 2)

Deploy a MongoDB Replica Set with Transport Encryption (Part 3)

Create Index Impact

As said, creating an index for MongoDB has really a severe impact. A simple index creation on a field like the following blocks all other operations on the database:

This could be ok for a very small collection, let’s say where the building will take a few milliseconds. But for larger collections, this is absolutely forbidden.

We call this way of building an index the “foreground” mode.

The foreground index creation is the fastest way, but since it is blocking we have to use something different in the production environments. Fortunately, we can also create an index in “background” mode. We may use the following command:

The index will be built in the background by mongod using a different incremental approach. The advantage is that the database can continue to operate normally without any lock. Unfortunately, background creation takes much longer than the foreground build.

The first hint is then to create the indexes using the background option. This is OK, but not in all the cases. More on that later.

Another impact when building an index is memory usage. MongoDB uses, by default,  up to 500MB of memory for building the index, but you can override it if the index is larger. The larger the index, the higher will be the impact if you don’t have the capability to assign more memory for the task.

To increase the amount of memory for index builds, set the following in the configuration file:

maxIndexBuildMemoryUsageMegabytes: 1024

Example: set it for 1 GB.

Create Index on a Replica Set

As long as the index creation command is replicated on all the nodes of the cluster in the same way all the other write commands are replicated. the index creation is replicated on a Replica Set cluster. A foreground creation on the PRIMARY is replicated as foreground on SECONDARY nodes. A background creation is replicated as background on SECONDARY nodes as well.

The same limitation applies for the Replica Set as the standalone server. The foreground build is fast but blocking and the background build is not blocking, but it is significantly slower for very large collections.

So, what can we do?

If you need to create a small index, let’s say the size is less than the available memory, you can rely on the background creation on the PRIMARY node. The operation will be correctly replicated to the SECONDARY nodes and the overall impact won’t be too bad.

But if you need to create an index larger than the memory, on a huge collection, then even the background build is bad. The creation will have a significant impact on the server resources and you can get overall performance problems on all the nodes. In this case, we have to follow another procedure. The procedure requires more manual steps, but it’s the only way to properly build such a large index.

The idea is to detach from the Replica Set one node at the time, create the index, and rejoin the node to the cluster. But first, you need to take care of the oplog size. The oplog window should be large enough to give you the time for the index build when a node is detached.

Note: the oplog window is the timestamp difference between the first entry in the oplog and the more recent one. It represents the maximum amount of time you can have a node detached from the cluster for any kind of task (software upgrades, index builds, backups). If you rejoin the node inside the window, the node will be able to catch up with the PRIMARY just executing the missing operations from the oplog. If you rejoin the node after the window, it will have to copy completely all the collections. This will take a long time and an impressive bandwidth usage for large deployments. 

The following is the step by step guide:

  • choose one of the SECONDARY nodes
  • detach the node from the Replica Set
    • comment into the configuration file the replSetName and the port options
    • set a different port number
    • set the parameter disableLogicalSessionCacheRefresh to true

    • restart mongod
    • now the server is running as standalone; any query won’t be replicated
  • connect using the alternative port and build the index in foreground mode

  • connect the node to the Replica Set
    • uncomment the options in the configuration file
    • remove the disableLogicalSessionCacheRefresh option
    • restart mongod
    • now the node is a member of the Replica Set
    • wait some time for the node to catch up with the PRIMARY
  • repeat the previous steps for all the remaining SECONDARY nodes
  • stepdown the PRIMARY node to force an election
    • run rs.stepDown() command. This forces an election. Wait for some time for the PRIMARY to become a SECONDARY node.
  • restart it as standalone
    • use the same procedure we saw before
  • build the index in foreground mode
  • restart the node and connect it to the Replica Set

That’s all. We have created the index on all the nodes without any impact for the cluster and for the production applications.

Note: when restarting a node as standalone, the node could be exposed to mistake writes. For the sake of security, a good practice could be to disable TCP connections, allowing only local connections using the socket. Then you can put into the configuration file for example:
bindIp: /tmp/mongod.sock

Conclusion

This procedure is definitely more complicated than running a single command. It will require some time, but we hope you don’t have to create such large indexes every single day. 😉

Share this post

Comment (1)

  • EDUARDO LEGATTI Reply

    Hello,

    What is the purpose of the disableLogicalSessionCacheRefresh parameter? Is this really nessary to set it to true?

    Thanks

    August 6, 2019 at 7:55 am

Leave a Reply