This post explains how to perform a Rolling Index Build on a Kubernetes environment running Percona Operator for MongoDB.
Why and when to perform a Rolling Index Build?
Building an index requires:
- CPU and I/O resources
- Database locks (even if brief)
- Network bandwidth
If you have very tight SLAs or systems that are already operating close to their peak capacity, building an index the traditional way could lead to an outage.
A Rolling Index Build approach takes advantage of the replica set architecture by building the index on a single non-primary member at a time. This reduces the impact while maintaining the availability of the system.
If you have been managing MongoDB for some time, you likely know how to perform a Rolling Index Build. In short: for each secondary in turn, you stop the member, restart it as a standalone on a different port, build the index, and let it rejoin the replica set and catch up; finally, you step down the primary and repeat the same steps on the former primary.
However, things get more complicated in the Kubernetes world, as you cannot simply stop the mongod process in a pod. If you were to do so, the container's health checks would fail and Kubernetes would immediately restart it.
This prevents a straightforward approach to performing a rolling index build. Fortunately, Percona Operator for MongoDB addresses this challenge.
Step-by-step: Rolling Index Build
Let’s see the procedure for a 3-node replica set. In case you want to replicate this, here is the cr.yaml I’ve used after installing Percona Operator for MongoDB on my Kubernetes cluster:
tee testrs.yml <<EOF
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: testrs
spec:
  crVersion: 1.20.1
  image: percona/percona-server-mongodb:7.0.18-11
  unsafeFlags:
    replsetSize: true
    mongosSize: true
  upgradeOptions:
    apply: disabled
    schedule: "0 2 * * *"
  secrets:
    users: testrs
  replsets:
  - name: rs0
    size: 3
    affinity:
      antiAffinityTopologyKey: "none"
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 3Gi
EOF
You can deploy it in the current namespace by running:
kubectl apply -f testrs.yml
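The Operator creates a PerconaServerMongoDB custom resource for this deployment. If you want to confirm everything is up before moving on, a quick check (assuming the current namespace used above) is to wait until the resource reports a ready status:

kubectl get psmdb testrs
# proceed once the STATUS column shows "ready"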
1. Verify the topology
After the deployment is complete, find the MongoDB pods:
$ kubectl get pods
NAME                                    READY   STATUS    RESTARTS   AGE
my-op-psmdb-operator-67b6686ffd-dmp8z   1/1     Running   0          24h
testrs-rs0-0                            1/1     Running   0          62m
testrs-rs0-1                            1/1     Running   0          4m47s
testrs-rs0-2                            1/1     Running   0          62m
Pick one pod and start a shell against it:
kubectl exec -it testrs-rs0-0 -- /bin/bash
Verify the topology to see the current primary and secondary members:
$ mongosh -u clusterAdmin -p password
rs0 [direct: primary] test> rs.status().members.forEach(m => print(m.name + " - " + m.stateStr))
testrs-rs0-0.testrs-rs0.percona-operator-testing.svc.cluster.local:27017 - PRIMARY
testrs-rs0-1.testrs-rs0.percona-operator-testing.svc.cluster.local:27017 - SECONDARY
testrs-rs0-2.testrs-rs0.percona-operator-testing.svc.cluster.local:27017 - SECONDARY
2. Stop one Secondary
Here we can take advantage of an Operator feature that avoids the restart-on-fail loop for Percona Server for MongoDB containers.
Let's start by doing this on one of our secondary nodes, testrs-rs0-1:
kubectl exec -it testrs-rs0-1 -c mongod -- sh -c 'touch /data/db/sleep-forever'
Now we can safely stop the mongod process:
kubectl exec -it testrs-rs0-1 -c mongod -- sh -c 'mongod --shutdown'
This causes the pod to be restarted in “infinite sleep” mode, without starting mongod. We can connect to a different member and verify the status of the replica set:
kubectl exec -it testrs-rs0-0 -- /bin/bash
$ mongosh -u clusterAdmin -p password
rs0 [direct: primary] test> rs.status().members.forEach(m => print(m.name + " - " + m.stateStr))
testrs-rs0-0.testrs-rs0.percona-operator-testing.svc.cluster.local:27017 - PRIMARY
testrs-rs0-1.testrs-rs0.percona-operator-testing.svc.cluster.local:27017 - (not reachable/healthy)
testrs-rs0-2.testrs-rs0.percona-operator-testing.svc.cluster.local:27017 - SECONDARY
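If you want to double-check the effect, the pod should still show as Running (the container is only sleeping), while no mongod process is left inside it. A quick sanity check along these lines:

kubectl get pod testrs-rs0-1
# mongod was shut down, so this should print nothing
kubectl exec -it testrs-rs0-1 -c mongod -- sh -c 'ps -ef | grep mongod | grep -v grep'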
3. Start in standalone mode
We’ll be starting mongod manually, so let’s write down the configuration options from one of the remaining nodes:
$ kubectl exec -it testrs-rs0-2 -c mongod -- /bin/bash
$ ps -ef | grep mongo
mongodb 1 0 1 14:45 ? 00:00:01 mongod --bind_ip_all --auth --dbpath=/data/db --port=27017 --replSet=rs0 --storageEngine=wiredTiger --relaxPermChecks --clusterAuthMode=x509 --enableEncryption --encryptionKeyFile=/etc/mongodb-encryption/encryption-key --wiredTigerIndexPrefixCompression=true --quiet --tlsMode preferTLS --sslPEMKeyFile /tmp/tls.pem --tlsAllowInvalidCertificates --tlsClusterFile /tmp/tls-internal.pem --tlsCAFile /etc/mongodb-ssl/ca.crt --tlsClusterCAFile /etc/mongodb-ssl-internal/ca.crt
Now we are ready to start our stopped node in standalone mode. Since our pod is in "infinite sleep" mode, we need to assemble the TLS certificate bundles ourselves; in "normal" mode, this happens automatically:
kubectl exec -it testrs-rs0-1 -c mongod -- /bin/bash
cat /etc/mongodb-ssl/tls.key /etc/mongodb-ssl/tls.crt > /tmp/tls.pem
cat /etc/mongodb-ssl-internal/tls.key /etc/mongodb-ssl-internal/tls.crt > /tmp/tls-internal.pem
Now, based on the configuration options we wrote down, we can prepare the command to start mongod in standalone mode. Remove the --replSet parameter, change the default port, and drop --bind_ip_all so mongod binds only to localhost (for extra security). Optionally, remove the --auth parameter for convenience. You should end up with something like this:
mongod --dbpath=/data/db --port=27117 --storageEngine=wiredTiger --relaxPermChecks --enableEncryption --encryptionKeyFile=/etc/mongodb-encryption/encryption-key --wiredTigerIndexPrefixCompression=true --quiet --tlsMode preferTLS --sslPEMKeyFile /tmp/tls.pem --tlsAllowInvalidCertificates --tlsClusterFile /tmp/tls-internal.pem --tlsCAFile /etc/mongodb-ssl/ca.crt --tlsClusterCAFile /etc/mongodb-ssl-internal/ca.crt
Run that command, and mongod will start. Logs are printed to stdout, so leave this shell session alone for now.
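If you would rather not keep this shell session occupied, mongod can also run detached; --fork requires a --logpath, so a possible variation of the command above (the /tmp/standalone.log path is just an example) is:

mongod --dbpath=/data/db --port=27117 --storageEngine=wiredTiger --relaxPermChecks --enableEncryption --encryptionKeyFile=/etc/mongodb-encryption/encryption-key --wiredTigerIndexPrefixCompression=true --quiet --tlsMode preferTLS --sslPEMKeyFile /tmp/tls.pem --tlsAllowInvalidCertificates --tlsClusterFile /tmp/tls-internal.pem --tlsCAFile /etc/mongodb-ssl/ca.crt --tlsClusterCAFile /etc/mongodb-ssl-internal/ca.crt --fork --logpath /tmp/standalone.log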
4. Build the Index
Start a new shell session, connect to the pod we are working with, and build the index:
$ kubectl exec -it testrs-rs0-1 -c mongod -- /bin/bash
$ mongosh --port 27117
use mydb
db.mycollection.createIndex({ myfield: 1 }, { name: "my_index_name" })
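Note that createIndex() blocks the mongosh session until the build completes. If you want to keep an eye on progress, one option (a rough sketch; the exact message text varies by MongoDB version) is to open a second mongosh session against port 27117 and filter db.currentOp() for in-progress index builds:

// in a second mongosh session on port 27117
db.currentOp({ "command.createIndexes": { $exists: true } }).inprog.forEach(
  op => print(op.msg)  // progress message, e.g. "Index Build: scanning collection ..."
)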
When the index build finishes, shut down mongod from the shell:
db.shutdownServer()
5. Resume normal operation
After the shutdown is complete, we can delete the “sleep-forever” file to go back to “normal” pod behavior:
kubectl exec -it testrs-rs0-1 -c mongod -- sh -c 'rm /data/db/sleep-forever'
As soon as this file is removed, the mongod process on the pod will automatically start with the usual arguments. The node should catch up by applying the oplog and return to the SECONDARY state.
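Before moving on, it's worth confirming the member has fully caught up. A simple check using the standard replica set helpers, run from a mongosh session on the primary:

// all members should report PRIMARY or SECONDARY again
rs.status().members.forEach(m => print(m.name + " - " + m.stateStr))
// shows how far each secondary is behind the primary
rs.printSecondaryReplicationInfo()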
6. Repeat on other Secondaries
Now simply repeat steps one through five on each remaining secondary node. This process can be scripted, but it’s safer to proceed manually unless you’re very sure of your automation.
7. Execute on the Primary
After all secondaries have the new index, you can perform a controlled failover:
kubectl exec -it testrs-rs0-0 -- /bin/bash
$ mongosh -u clusterAdmin -p password
rs0 [direct: primary] test> rs.stepDown()
This step will obviously have some impact, so ideally perform it during off-hours. Once the new primary is elected, all that remains is to repeat steps 1-5 on the former primary.
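If you want to confirm the election has completed before continuing, you can check which member is now primary; for example, from the same mongosh session (which reconnects automatically after the step-down):

// host:port of the newly elected primary
rs.hello().primary
rs.status().members.forEach(m => print(m.name + " - " + m.stateStr))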
Summary
When managing a production-grade MongoDB deployment, schema changes, like adding indexes, must be carefully planned to avoid performance degradation or downtime. While MongoDB has improved the index build process over time, in some cases it is still impossible to create an index directly on a primary without affecting the system.
With a rolling approach, you can safely add indexes across your MongoDB replica set with minimal disruption to production workloads, even with the added complexity of Kubernetes.
One caveat to keep in mind is that the oplog window has to be large enough. If your index takes two hours to build, you should have at least a two-hour oplog window (probably a bit more, to be safe).
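You can check your current oplog window at any time with the standard replication helper; the "log length start to end" value in its output is the window you have to work with:

// from a mongosh session on any data-bearing member
rs.printReplicationInfo()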