Editor’s Note: The first version of this post contained a section criticizing what appeared to be a major regression concerning the dropDatabase and movePrimary commands. It turned out to be merely a documentation error in the MongoDB 4.2 release notes, which is now fixed: https://jira.mongodb.org/browse/DOCS-12474. The “(In)Stability” section has now been removed.
At Percona we’ve been tracking the upstream releases from MongoDB Inc. closely and, like many of you, are happy that MongoDB is finally available in its General Availability 4.2.0 version.
It is time for MongoDB Community users to start using 4.2 in their testing, QA, and pre-production staging environments. As with many products, the first few minor versions of a new major version are the ones that have the quickest-landing and most important fixes. Historically this has also been true for MongoDB. In short, I wouldn’t start using 4.2.0 in production today, but by the time you’ve finished trialing 4.2 in your pre-production environments, it will already be a past version.
For users of Percona Server for MongoDB, which includes open-source replacements for MongoDB Enterprise add-on modules (feature comparison here), you’ll be pleased to know that we’ve already merged the 4.2.0 code, and testing has begun. We expect to release Percona Server for MongoDB 4.2 in the next few weeks.
What’s new in MongoDB 4.2?
We covered the keynote new features of 4.2 in our recent blog post Percona’s View on MongoDB’s 4.2 Release – The Good, the Bad, and the Ugly… This looked at:
- Distributed transactions
- Server-side updates (both with the classic CRUD update op and aggregation pipeline $merge stage)
- Field-level encryption (MongoDB Enterprise only so far)
I also discussed some internal changes of interest in Diving into the MongoDB 4.2 Release Small Print. This covered:
- MMAPv1 storage engine removed
- queryHash added in diagnostic fields
- Auto-splitting thread removed from mongos nodes
- Modularization of config files through --configExpand
- Improved WiredTiger data file repair
- Listing open cursors
But of course, there is still more to discuss given the size and long ‘cooking time’ of this version! Some additional thoughts on the new and enhanced features are below.
I think many organizations decided 4.0 was not the right time to start using transactions, even though they were supported for non-sharded replica sets. A company usually has multiple databases; some are sharded and some are not. It was easier to wait until the feature was ready for both. But now is the time, if you need them. Do not use transactions unless you have a compelling reason (see performance sub-section below).
The syntax for using transactions hasn’t changed from 4.0. You must create a session and then, with that session as scope, run a startTransaction command, then the normal CRUD commands within the transaction, and finally a commitTransaction (or abortTransaction).
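As a rough illustration of that sequence, here is a minimal sketch in Python. It assumes `client` is an already-connected `pymongo.MongoClient`; the `bank` database, `accounts` collection and `balance` field are made-up names for the example. With PyMongo, using `start_transaction()` as a context manager commits on a clean exit and aborts if an exception escapes the block.

```python
def transfer(client, from_id, to_id, amount):
    """Move `amount` between two account documents inside one transaction.

    `client` is assumed to be a connected pymongo.MongoClient; the "bank"
    database, "accounts" collection and "balance" field are hypothetical.
    """
    accounts = client["bank"]["accounts"]
    # 1. Create a session; every operation in the transaction is scoped to it.
    with client.start_session() as session:
        # 2. Start the transaction. The context manager commits on a clean
        #    exit and aborts if an exception propagates out of the block.
        with session.start_transaction():
            # 3. Normal CRUD commands, each one passed the session.
            accounts.update_one(
                {"_id": from_id}, {"$inc": {"balance": -amount}}, session=session
            )
            accounts.update_one(
                {"_id": to_id}, {"$inc": {"balance": amount}}, session=session
            )
```

Note that both updates touch only one document each and are keyed on `_id`, which keeps the transaction short.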
Although the syntax for using transactions hasn’t changed, the client driver specs did change a little. To use transactions in 4.2 you must upgrade your drivers to the 4.2 server’s compatible versions. These are the twelve officially-supported drivers:
- C 1.15.0
- C# 2.9.0
- C++ 3.5
- Go 1.1
- Java 3.11.0
- Node 3.3.0
- Perl 2.2.0
- PHP Extension 1.6 (or 1.5 for the library)
- Python 3.9.0
- Motor (i.e. async Python) 4.2 compatibility not available. Last supported version is 3.6
- Ruby 2.10.0
- Scala 2.7.0
The fundamentals in this next section are true for any transaction-supporting database, but I’ll use examples in the context of MongoDB to make it clear it applies here too.
Using transactions is going to hurt your performance if you use them too freely. Reads and writes as you’ve been doing them before transactions were supported should still make up the great majority of the operations being run in your MongoDB cluster or non-sharded replica set.
Let’s say a read in your MongoDB server takes a double-digit number of microseconds on average, and a write takes about ten times as long. The chance of there being conflicts for the same document as those single ops take place is limited to those small windows of time. It can happen, but you have to be really pushing it.
If a conflict of a normal, single-doc write happens, it will be retried automatically by the storage engine until it gets its turn and creates the new document version. The update will be completed, following data consistency rules, so it might not seem so bad. But the processing cost grows and the latency grows.
Transactions stretch out the time window of operations as logical units. Conflicts in transactions vary, but the transaction will possibly have to walk through old, rather than the latest, versions of a doc. A transaction will abort if the docs in a write (or reads in the same transaction preceding the write) are found to have a conflicting update that arrived during its lifetime. So the internal ‘housekeeping’ work for the storage engine increases with the number of documents affected by the transaction, and the chance of getting through without a conflict falls off exponentially with the length of the transaction (assuming conflicting writes are randomly spread, at least).
Both of these things can make transactions much slower than normal operations unless you:
- design transactions to use as few documents as possible
- never have a slow op such as an unindexed query within a transaction
This is in addition to only having transactions as a minor part of the database traffic.
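To put rough numbers on the ‘random spread’ reasoning, here is a back-of-envelope sketch. It is a toy model, not a measurement of MongoDB: it assumes competing writes land on each document independently at a constant average rate (a Poisson process), and the rate, document counts and durations are all invented for illustration.

```python
import math


def conflict_probability(writes_per_sec_per_doc, docs_touched, duration_sec):
    """Chance that at least one competing write lands on a touched document.

    Assumes competing writes hit each document independently as a Poisson
    process (the "random spread" assumption). All inputs are made-up figures.
    """
    expected_conflicts = writes_per_sec_per_doc * docs_touched * duration_sec
    # P(no conflict) = exp(-expected_conflicts), so it decays exponentially
    # as the transaction gets longer or touches more documents.
    return 1.0 - math.exp(-expected_conflicts)


# A single-document write lasting 500 microseconds.
single_write = conflict_probability(5.0, docs_touched=1, duration_sec=0.0005)
# A transaction holding 20 documents for 50 milliseconds.
transaction = conflict_probability(5.0, docs_touched=20, duration_sec=0.05)

print(f"single write: {single_write:.4%}")
print(f"transaction:  {transaction:.4%}")
```

Under these assumptions the single write almost never collides, while the long, wide transaction almost always does. Both levers in the list above (fewer documents, shorter duration) shrink the same product.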
Also, don’t forget a point promoted well in MongoDB’s early days: if you put fields in the same collection document (i.e. document database style rather than relational database table style) you avoid the need for transactions in the first place.
New client locking/blocking throttle mechanism on Primaries
Marketing name: “Flow control”
I am going to cover this feature in detail in an upcoming blog post. But, it also belongs on any review of 4.2 major new features, so here is a quick summary.
Version 4.2 primary nodes will keep clients waiting, for as long as they need to, to ensure that they are not getting more than 10 seconds ahead of the secondaries.
Benefit: high replication lag will be a much less likely event in the future. (Secondary reads still risk being up to 10 seconds stale at any normal time though.)
Major benefit: I believe this limits exponentially-increasing storage engine costs that must be paid until writes are confirmed written to secondaries.
Nuisance: you’ll have to disable this feature (it’s on by default) if you want best latencies on the primary during sudden heavy load events.
I like this feature. But please call it what it is: a Primary Throttle.
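If you do decide to switch the throttle off, a minimal sketch of doing so at runtime could look like the following. It assumes `client` is a `pymongo.MongoClient` connected to a 4.2 primary with privileges to run `setParameter`, and that the `enableFlowControl` and `flowControlTargetLagSeconds` server parameters behave as I understand them from the 4.2 documentation; check both against your exact version first.

```python
def set_flow_control(client, enabled, target_lag_seconds=None):
    """Toggle flow control on the node `client` is connected to.

    `client` is assumed to be a pymongo.MongoClient connected to a 4.2
    primary as a user allowed to run setParameter.
    """
    params = {"enableFlowControl": enabled}
    if target_lag_seconds is not None:
        # Optionally move the lag target away from its 10-second default.
        params["flowControlTargetLagSeconds"] = target_lag_seconds
    # Equivalent to db.adminCommand({setParameter: 1, ...}) in the shell.
    return client.admin.command("setParameter", 1, **params)
```

Calling `set_flow_control(client, False)` before a planned bulk load, and `set_flow_control(client, True)` afterwards, would be one way to use it.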
Retryable writes will become the default for 4.2-compatible drivers. If you decide you don’t want automatic retries you will now have to explicitly disable them.
I don’t see a reason why you wouldn’t use retryable writes. The result for the client is the same as if there had been no primary switch at that time, i.e. just as if things were normal.
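For completeness, opting out is done with the `retryWrites` connection string option. The helper below is only a sketch that assembles such a string; the host names and replica set name are placeholders.

```python
from urllib.parse import urlencode


def build_uri(hosts, replica_set, retry_writes=None):
    """Build a mongodb:// connection string.

    Leave retry_writes as None to accept the driver default (on, for
    4.2-compatible drivers); pass False to opt out explicitly.
    """
    options = {"replicaSet": replica_set}
    if retry_writes is not None:
        options["retryWrites"] = "true" if retry_writes else "false"
    return f"mongodb://{','.join(hosts)}/?{urlencode(options)}"


# Hypothetical hosts and replica set name.
uri = build_uri(
    ["db1.example.net:27017", "db2.example.net:27017"], "rs0", retry_writes=False
)
print(uri)
```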
New workaround needed after dropDatabase and movePrimary
Since 3.4 there has been a bug where using dropDatabase and then creating a new database of the *same name* makes some edge-case shard routing errors possible. A movePrimary command can also have the same issue.
This is not such a terrible bug – reusing the same object name means separate parts of the code need to determine if it is the old object or the new object of the same name. Doing this without misunderstanding in an asynchronous message situation is a difficult issue to solve.
What’s changed in 4.2 is that the workaround for this bug has become more onerous. Before, you only needed to run the flushRouterConfig command on the mongos node. Now it seems to require a full restart of all shard replica set nodes (in a rolling fashion as normal, please) and a restart of the mongos nodes.
Starting in MongoDB 4.2:
If you drop a database and create a new database with the same name, you must either:
* Restart all mongos instances and all mongod shard members (including the secondary members); or
* Use the flushRouterConfig command on all mongos instances and all mongod shard members (including the secondary members) before reading or writing to that database.
This ensures that the mongos and shard instances refresh their metadata cache, including the location of the primary shard for the new database.
Otherwise, you may miss data on reads, and may not write data to the correct shard. To recover, you must manually intervene.
This is a major revamp of how indexes are built. Background index builds are gone. The new index build process takes a lock briefly at the start, then does something like a background index build as it scans the entire collection to build the index structure, then locks again only whilst finishing the process.
I believe wildcard indexes solve a problem for users who are temporarily forced to use MongoDB with data structured as it was from a legacy key-value store.
Please be aware that using wildcard indexes does not mean a new index is created every time a new, as-of-yet unindexed, field name is encountered during an insert or update. Instead, it creates one physical index. (You can confirm this by looking at db.collection.getIndexes().)
As it is not made of normal indexes, limitations apply: null or existence comparisons can’t be performed; field-to-object (or array) comparisons can’t be performed (including the pseudo $min and $max objects); a sort cannot be a composite spec such as ‘wildcard-indexed field + other non-wildcard field’ or ‘wildcard-indexed field + another wildcard-indexed field’; a wildcard index cannot also be a unique, text, geo or hashed index; and it cannot be used as a TTL index.
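A minimal sketch of creating one, assuming `collection` is a PyMongo `Collection` object on a 4.2 server. The helper name and the optional `subtree` argument are my own, for illustration only.

```python
def create_wildcard_index(collection, subtree=None):
    """Create a single wildcard index over all fields, or over one subtree.

    `collection` is assumed to be a pymongo Collection on a 4.2 server.
    """
    # "$**" covers every field; "<path>.$**" covers only fields under <path>.
    key = "$**" if subtree is None else f"{subtree}.$**"
    return collection.create_index([(key, 1)])
```

For example, `create_wildcard_index(db.products, "attributes")` (a hypothetical collection and field) would index everything under `attributes`. Listing the indexes afterwards, e.g. with `index_information()`, should show one new entry however many distinct field names the documents contain.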
The MongoDB tools will start using a stricter Extended JSON. No longer is the ambiguity of numeric literals in JSON being tolerated (e.g. ‘Is this “6” going to be an integer? Or float value 6.0?’). Dates will also be serialized as a numeric type. It appears the ISO 8601 date format string (YYYY-MM-DDTHH:MM:SS.nnn(TZ)) is no longer OK because it could not be used for dates before the start of the Unix epoch.
The new format might take up to five times as much space if you happen to be dumping your big data with mongodump (unlikely, admittedly). Dates are basically the same size, but a number such as 567.9 will now take 28 bytes to serialize instead of 5.
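To see where the extra bytes come from, here is a small hand-rolled approximation of the canonical form of Extended JSON v2 for a few scalar types. It is a sketch for illustration, not the tools’ own serializer: the function names are mine, it ignores most BSON types, and exact byte counts depend on how much whitespace a real serializer emits.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def to_canonical(value):
    """Approximate Extended JSON v2 canonical mode for a few scalar types.

    Datetimes must be timezone-aware. Anything unrecognized passes through.
    """
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return value
    if isinstance(value, int):
        key = "$numberInt" if INT32_MIN <= value <= INT32_MAX else "$numberLong"
        return {key: str(value)}
    if isinstance(value, float):
        return {"$numberDouble": repr(value)}
    if isinstance(value, datetime):
        # Signed milliseconds since the epoch; negative for earlier dates.
        millis = (value - EPOCH) // timedelta(milliseconds=1)
        return {"$date": {"$numberLong": str(millis)}}
    if isinstance(value, dict):
        return {k: to_canonical(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_canonical(v) for v in value]
    return value


def compact(obj):
    """Serialize without any optional whitespace."""
    return json.dumps(obj, separators=(",", ":"))


plain = compact(567.9)
typed = compact(to_canonical(567.9))
print(plain, len(plain))
print(typed, len(typed))
print(compact(to_canonical(datetime(1969, 7, 20, tzinfo=timezone.utc))))
```

Even with all optional whitespace stripped, the type wrapper makes the double several times longer than the bare literal, and a serializer that pads with spaces adds a few more bytes. The last line shows a pre-epoch date coming out as a negative millisecond count.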
“ssl*” option names are changing to “tls*” ones
One for the protocol name pedants. And fair enough.
There is a new dropConnections command. It will only kill outgoing connections. In other words, if you want to log in as the admin user to a shard node and kill all client ops, this is not going to do that.
currentOp output now includes idle ops and sessions, not just active ops. To distinguish between them, there is a new “type” field.
Other new fields of interest to me are: effectiveUsers, runBy, writeConflicts, and prepareReadConflicts.
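As a sketch of how the new “type” field might be used, the helper below tallies currentOp entries by type and sums their write conflicts. The sample documents are invented and only shaped like currentOp output; the type values shown are the ones I would expect to see, not an exhaustive list.

```python
from collections import Counter


def summarize_ops(ops):
    """Count currentOp entries by "type" and total up their writeConflicts."""
    by_type = Counter(op.get("type", "unknown") for op in ops)
    write_conflicts = sum(op.get("writeConflicts", 0) for op in ops)
    return by_type, write_conflicts


# Made-up sample documents, shaped like currentOp output.
sample = [
    {"type": "op", "op": "update", "writeConflicts": 3},
    {"type": "op", "op": "query", "writeConflicts": 0},
    {"type": "idleSession"},
    {"type": "idleCursor"},
]

by_type, write_conflicts = summarize_ops(sample)
print(dict(by_type), write_conflicts)
```

With PyMongo, the input could come from something like `client.admin.aggregate([{"$currentOp": {"allUsers": True, "idleSessions": True, "idleCursors": True}}])`, assuming a sufficiently privileged user.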
Storage watchdog is a previously Enterprise-only feature that is being shifted into the Community edition. It is an aid to make sure a mongod node dies within a certain time limit if the disk interface goes completely silent to the kernel. Without storage ‘watchdog’ the mongod won’t react here because even the kernel won’t react.
I don’t recommend using storage watchdog. Disk-full errors, or even complete disk death, will be detected by the kernel and signaled through to mongod, which will then react appropriately. It is only a really evil design of SCSI or RAID card etc. that can create the kernel-silence situation that this feature addresses. I think we should exorcise, not accommodate, hardware like that.
In conclusion, it’s great to see that MongoDB Community 4.2 has squirmed out from its release-candidate cocoon and become the beautiful butterfly we’ve been waiting for (my complaints about the new dropDatabase sensitivities aside).
We look forward to bringing you Percona Server for MongoDB 4.2 in the near future, which incorporates the key elements of 4.2 with additional Enterprise features.
Please contact us if you would like any assistance with your MongoDB database set-up, or if you would like to discuss any of the 4.2 features in more detail.