Don’t Use MongoDB Profiling Level 1

MongoDB ProfilingTLDR: It is not profile level 1 that is the problem; it’s a gotcha with the optional ‘slowms’ argument that causes users to accidentally set verbose logging and fill their disk with log files.

In MongoDB, there are two ways to see, with individual detail, which operations were executed and how long they took.

  • Profiling. Saves the operation details to a capped collection system.profile. You access this information through a mongodb connection.
  • Log lines of “COMMAND” component type in the mongod log files. (Also mongos v4.0+ log files). You have to access the files as Unix or Windows user and work with them as text.

Profiling is low-cost, but it is limited to keeping only a small snapshot. It has levels: 0 (off), 1 (slowOp(s only)), 2 (all).

The (log file) logger also has levels of its own, but there is no ‘off’. Even at level 0, it prints any slow operation in a “COMMAND” log line. ‘Slow’ is defined by the configured slowOpThresholdMs option (originally “slowMS”). That is 100ms by default, which is a good default i.m.o.

Usability problem

The log-or-not code and profile-or-not are unrelated systems to the user, but they share the same global variable for ‘slow’: serverGlobalParams.slowMS

The basic post-mortem will go:

  • Someone used db.setProfilingLevel(1/*slowOp*/, 0/*ms*/) to start profiling all operations (DON’T DO THIS – use db.setProfilingLevel(2/*all*/) instead.)
  • The logger starts writing every command to the log files.
  • They executed db.setProfilingLevel(0/*off*/) to turn the profiler off.
  • The logger continues writing every command to the log files because slowMS is still 0ms.
  • The DBA gets paged after hours because the disk filled up with log file and thought ‘oh, I should have set up log rotation on that server; I’ll do it in the morning’.
  • The DBA gets woken up in the small hours of the morning because the new primary node has also crashed due to a full disk.

So that’s the advice: Until MongoDB is enhanced to have separate slow-op threshold options for the profiler, never use db.setProfilingLevel(1, …). Even if you know the gotcha, someone learning over your shoulder won’t see it.

What to do instead:

  • Use only db.setProfilingLevel(0 /*off*/) <-> db.setProfilingLevel(2 /*all*/) and don’t touch slowms
    • There is still a valid use case for using profiler at level 1, but if you are not taking on the responsibility of looking after the log file’s code interdependence on the slowMS value, don’t go there.
  • If you want the logger to print “COMMAND” lines for every command, including fast ones, use db.setLogLevel(1, "command"). And run db.setLogLevel(-1, "command") to stop it again. ‘Use log level 1, not profile level 1’ could almost be the catchphrase of this article.
    (db.setLogLevel(1) + db.setLogLevel(0) is an alternative to the above, but is an older, blunter method.)
  • If you want to set the sampleRate you can do that without changing level (or ‘slowms’) with the following command: db.setProfilingLevel(db.getProfilingStatus().was, {“sampleRate”: <new float value>})
  • If you want a different threshold for slowMS permanently, use the slowOpThresholdMs option in the config file, but you can also do it dynamically as for the sampleRate instructions above.

Some other gotchas to do with profiling and logging:

  • The logger level is global; the slowOpThresholdMs a.k.a. slowMS value is global, but profiling level is per db namespace.
  • All of the above are local only to the mongod (or mongos) the commands are run on / the config file is set for. If you run it on a primary it does not change the secondaries, and if you run it on a mongos it does not change the shard or the configsvr replicaset nodes.
  • mongos nodes only provide a subset of these diagnostic features. They have no collections of their own, so for starters, they cannot make a system.profile collection.


Share this post

Comments (4)

  • Mark Callaghan Reply

    Please file a feature request and list the URL in this post

    November 15, 2019 at 12:40 pm
    • Akira Kurogane Reply (from 2015), with related SERVER-4785 (2012).

      We can hope that it might be fixed in v4.4 I guess. (I can’t see any patch as being backward compatible due to it probably having small user interface changes).

      November 15, 2019 at 11:57 pm
  • asaSAS Reply

    Wait a minute. System.profile can be set as a capped collection, did you try it?

    December 9, 2019 at 2:30 am

Leave a Reply