TLDR: It is not profile level 1 that is the problem; it’s a gotcha with the optional ‘slowms’ argument that causes users to accidentally set verbose logging and fill their disk with log files.
In MongoDB, there are two ways to see, with individual detail, which operations were executed and how long they took.
Profiling is low-cost, but it is limited to keeping only a small snapshot. It has levels: 0 (off), 1 (slowOp(s only)), 2 (all).
The (log file) logger also has levels of its own, but there is no ‘off’. Even at level 0, it prints any slow operation in a “COMMAND” log line. ‘Slow’ is defined by the configured slowOpThresholdMs option (originally “slowMS”). That is 100ms by default, which is a good default i.m.o.
The log-or-not code and profile-or-not are unrelated systems to the user, but they share the same global variable for ‘slow’: serverGlobalParams.slowMS
The basic post-mortem will go:
So that’s the advice: Until MongoDB is enhanced to have separate slow-op threshold options for the profiler, never use db.setProfilingLevel(1, …). Even if you know the gotcha, someone learning over your shoulder won’t see it.
db.setProfilingLevel(0 /*off*/) <-> db.setProfilingLevel(2 /*all*/) and don’t touch slowms
db.setLogLevel(1, "command"). And run db.setLogLevel(-1, "command") to stop it again. ‘Use log level 1, not profile level 1’ could almost be the catchphrase of this article.
Some other gotchas to do with profiling and logging:
|
1 |
bool shouldDBProfile(bool shouldSample = true) {<br> // Profile level 2 should override any sample rate or slowms settings.<br> if (_dbprofile >= 2)<br> return true;<br><br> if (!shouldSample || _dbprofile <= 0)<br> return false;<br><br> /* Blog: and by elimination if _dbprofile == 1: */<br> return elapsedTimeExcludingPauses() >= Milliseconds{serverGlobalParams.slowMS};<br>}<br><br>bool CurOp::completeAndLogOperation(OperationContext* opCtx,<br> logger::LogComponent component,<br> boost::optional<size_t> responseLength,<br> boost::optional<long long> slowMsOverride,<br> bool forceLog) {<br> // Log the operation if it is eligible according to the current slowMS and sampleRate settings.<br> const bool shouldLogOp = (forceLog || shouldLog(component, logger::LogSeverity::Debug(1)));<br> const long long slowMs = slowMsOverride.value_or(serverGlobalParams.slowMS);<br> ...<br> ...<br> const bool shouldSample =<br> client->getPrng().nextCanonicalDouble() < serverGlobalParams.sampleRate;<br><br> if (shouldLogOp || (shouldSample && _debug.executionTimeMicros > slowMs * 1000LL)) {<br> auto lockerInfo = opCtx->lockState()->getLockerInfo(_lockStatsBase);<br><br> ...<br> // Return 'true' if this operation should also be added to the profiler.<br> return shouldDBProfile(shouldSample);<br>}<br> |
Resources
RELATED POSTS