Identifying Unused Indexes in MongoDB

Identifying Unused Indexes in MongoDB

PREVIOUS POST
NEXT POST

Like MySQL, having too many indexes on a MongoDB collection not only affects overall write performance, but disk and memory resources as well. While MongoDB holds predictably well in scaling both reads and writes options, maintaining a heathly schema design should always remain a core character of a good application stack.

Aside from knowing when to add an index to improve query performance, and how to modify indexes to satisfy changing query complexities, we also need to know how to identify unused indexes and cut their unnecessary overhead.

First of all, you can already identify access operation counters from each collection using the $indexStats  ( indexStats  command before 3.0) aggregation command. This command provides two important pieces of information: the ops  counter value, and since , which is when the ops counter first iterated to one. It is reset when the mongod  instance is restarted.

From this information, if the ops counter is zero for any index, then we can assume it has not been used either since the index was added or since the server was restarted, with a few exceptions. An index might be unique and not used at all (a uniqueness check on INSERT does not increment the ops counter). The documentation also indicates that index stats counter does not get updated by TTL indexes expiration or chunk split and migration operations.

Be aware of occasional index use

One golden rule, however, is that this type of observation based on type is subjective – before you decide to drop the index, make sure that the counter has collected for a considerable amount of time. Dropping an index that is only used once a month for some heavy reporting can be problematic.

The same information from $indexStats  can also be made available to PMM. By default, the mongo_exporter  does not include this this information but it can be enabled as an additional collection parameter.

Once enabled, we can create a custom graph for this information from any PMM dashboard, as shown below. As mentioned above, any index(es) that has zero values will not have been used for the current time range in the graph. One minor issue with the collector is that each metric does not come with the database and collection information. Consequently, we cannot filter to the collection level yet, we have an improvement request open for that.

MongoDB index usage dashboard report from percona monitoring and management

An alternative view to this information from Grafana PMM is available from the Time Series to Aggregation table panel, shown below. One advantage of having these metrics in PMM is that the data survives an instance restart. Of course, to be useful for identifying unused indexes, the retention period has to match or exceed your complete application “cycle” period.MongoDB index usage stats from PMM

Given that in a MongoDB replicaset, you can delegate data bearing member nodes to different roles, perhaps with tags and priorities. You can also have nodes with different sets of indexes. Being able to identify the sets of indexes needed at the node level allows you to optimize replication, queries, and resource usage.

More Resources

We have an introductory series of posts on MongoDB indexes available on this blog. Read Part 1 here.

You can download Percona Server for MongoDB – all Percona software is open source and free.

PREVIOUS POST
NEXT POST

Share this post

Comments (2)

  • Seth Centerbar Reply

    Jervin, great post! I have a quick question for you.

    “By default, the mongo_exporter does not include this this information but it can be enabled as an additional collection parameter.”

    I actually just installed pmm for our mongo cluster yesterday. Any reason why index usage isn’t already included? Also, where can I find the list of parameters such as -collect.indexusage? I followed this guide but saw no mention of additional params.

    https://www.percona.com/doc/percona-monitoring-and-management/conf-mongodb.html

    Once again, thanks for the great post!

    November 21, 2018 at 10:16 am
  • Jervin Real Reply

    @Seth

    Index usage stats are not collected by default since under the hood, this is another query (an aggregate in fact) and may or may not have an impact on performance depending on the source instance. You can see all options supported by the exporter by doing “mongodb_exporter –help”. The options shown in the help can be passed on to pmm-admin as I shown above – https://www.percona.com/doc/percona-monitoring-and-management/pmm-admin.html#pmm-pmm-admin-monitoring-service-pass-parameter

    November 21, 2018 at 11:05 am

Leave a Reply