MongoDB 3.6 Community Is Here!
David Murphy
By now you surely know MongoDB 3.6 Community became generally available on Dec 5, 2017. Of course, this is great news: it has some big ticket items that we are all excited about! But I want to also talk about my general thoughts on this release.
It is always a good idea for your internal teams to study and consider new versions. This is crucial for understanding if and when you should start using them. Once you decide to upgrade, the next question is whether you want your developers using the new features, or whether those features are not yet mature enough to rely on.
So what is in MongoDB 3.6 Community? Check it out:
- Change Streams
- Retryable Writes
- Security Improvement
- Major love for Arrays in Aggregation
- A better balancer
- JSON Schema Validation
- Better Time management
- Compass is Community
- Major WiredTiger internal overhaul
As you can see, this is an extensive list. On top of that, there are 1400+ implemented Jira tickets just on the server itself (not even counting the WiredTiger project).
To that end, I thought we should break my review into a few areas. We will have blog posts out soon covering these areas in more depth. This blog is more about my thoughts on the topics above.
Expected blogs (we will link to them as they become available):
- Change Streams – Nov 11 2017
- Transactions and new functions
- Aggregation improvements
- Security Controls to use ASAP
- Other changes from diagnostics to Validation
Today let’s quickly recap the above areas.
Sessions
We will have a blog on this (it has some history). This move has been long-awaited by anyone using MongoDB before 2.4. There were connection changes in that release that made things complicated for load balancers, due to the inability to “re-attach” to the same session. If you were not careful in 2.4+, you could easily put a load balancer in front of MongoDB and get very odd behavior, ranging from broken to invisibly incomplete getMores (big queries).
Sessions aim to change this. Now the client drivers know about the internal session to the database used for reads and writes. Better yet, MongoDB tracks these sessions, so even if an election occurs, when your driver fails over so will the session. For anyone whose applications handled failovers badly, this is an amazing improvement. Some of the other new features that make 3.6 a solid release require this new behavior.
Does this mean this solution is perfect and works everywhere? No! Like other newer features we have seen, it leaves MMAPv1 out in the cold: without major rework, MMAPv1 cannot support logic that is so native to WiredTiger. Talking with engine engineers, it’s clear that some of the logic behind the underlying database snapshots and rollbacks added here can cause issues (which we will talk about more in the transactions blog).
Change Streams
This is one of the most talked about (but most specialized) features. I can see its appeal in a very specific use case, but it is rather limited. However, I am getting ahead of myself! Let’s talk about what it is first and where it came from.
Before this feature, people streamed data out of MySQL and MongoDB into Elastic and Hadoop stacks. I mention MySQL because it was the model for the initial method used with MongoDB. The tools would read the MySQL binlogs (typically saved off somewhere) and then apply those operations to some other technology. When they went to do the same thing in MongoDB by reading the oplog, there was a big issue: a write that has not reached a majority of the nodes can be rolled back. In fact, such rollbacks were not uncommon. The default was w:1 (meaning only the primary needed to have the write), which resulted in data existing in Elastic that had been removed from MongoDB. Hopefully, everyone can see the issue here, and why a better solution was needed than just reading the oplog.
Enter $changeStream, which in the shell has a helper called .watch(). It uses a multi-node consistent read to ensure the data is on a majority of nodes before the command returns it in a tailable cursor. For this use case, this is amazing, as it gives the data replicating tool much more assurance that data is not going to vanish. $changeStream is not without limits: if we have 10k collections and we want to watch them all, this is 10k separate operations and cursors. This puts a good deal of strain on the systems, so MongoDB Inc. suggests you do this on no more than 1000 namespaces at a time to prevent overhead issues.
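To make the majority-read idea concrete, here is a toy model in Python. It is not how MongoDB implements change streams: the function names and the integer "positions" are assumptions made purely for illustration. Each replica set member reports how far into the oplog it has applied, and an entry is only handed to the downstream tool once a majority of members hold it.

```python
def majority_commit_point(applied_positions):
    """Return the highest oplog position that a majority of members have applied.

    applied_positions holds one integer per replica set member
    (a hypothetical stand-in for real oplog timestamps).
    """
    majority = len(applied_positions) // 2 + 1
    # Sort descending: the value at index majority-1 is held by at least
    # `majority` members, and no higher value is.
    return sorted(applied_positions, reverse=True)[majority - 1]


def safe_to_stream(oplog, applied_positions):
    """Return only the oplog entries that a majority of members already hold."""
    commit_point = majority_commit_point(applied_positions)
    return [entry for entry in oplog if entry["position"] <= commit_point]


oplog = [
    {"position": 1, "op": "insert", "doc": {"_id": "a"}},
    {"position": 2, "op": "insert", "doc": {"_id": "b"}},
    {"position": 3, "op": "insert", "doc": {"_id": "c"}},
]

# The primary has applied all three writes; the two secondaries lag behind.
streamed = safe_to_stream(oplog, applied_positions=[3, 2, 1])
```

In this model only the first two inserts are streamed. A tool tailing the raw oplog with w:1 writes would have shipped the third insert too, which is exactly the entry at risk if the primary steps down before replicating it.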
Sadly, it is not currently possible to take a mongod-wide snapshot to support this under the hood, because the snapshot is implemented per namespace, directly inside the WiredTiger engine. So for anyone with a very large collection count, this will not be your silver bullet yet. This also means streams between collections and databases are not guaranteed to be in sync. This could be an issue even for someone with a smaller number of namespaces who expects that. Please don’t get me wrong: it’s a step in the correct direction, but it falls short.
I had very high hopes that this would simplify backups in MongoDB. The Percona Labs GitHub has a tool called MongoDB-Consistent-Backup, which tails multiple oplogs to get a consistent sharded backup without the need to pay for MongoDB’s backup service or use the complicated design that is Ops Manager (when you host it yourself). Due to the inability to do a system-wide change stream, this type of tool still needs to use the oplog. If you are not using w:majority, an election or a rollback could cause the backup to fail. Still, I have hopes this will be something that can be considered in the future to make things better for everyone.
Retryable Writes
Unlike change streams, this feature is much more helpful to the general MongoDB audience. I am very excited for it. If you have not watched this video, please do right now! Samantha does a good job explaining the issue and solution in detail. However, for now just know there has been a problem: when a write hit an issue (network, app shutdown, DB shutdown, election), you had no way to know whether the write failed or not. This unknown situation was terrible for developers, who could not know whether they needed to run the command again. This is especially true if you have an ordering system and you’re trying to remove stock from your inventory system. Sessions, as discussed before, allow the client to reconnect after a broken connection and keep getting results to know what happened or didn’t. To me, this is the second best feature of 3.6. Only Security is more important to me personally.
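The inventory example can be sketched as a toy model in Python. This is an illustration, not MongoDB’s code: ToyServer, decrement_stock and the stock data are all made up, and the sketch assumes the server recognizes a retry by the pair of session id and a per-session write number.

```python
import uuid


class ToyServer:
    """Toy stand-in for a primary that remembers writes per session."""

    def __init__(self, stock):
        self.stock = stock
        # (session_id, txn_number) -> result of the write already applied
        self.applied = {}

    def decrement_stock(self, session_id, txn_number, item, qty):
        key = (session_id, txn_number)
        if key in self.applied:
            # A retry of a write we already applied: return the saved result
            # instead of removing stock a second time.
            return self.applied[key]
        self.stock[item] -= qty
        result = {"item": item, "remaining": self.stock[item]}
        self.applied[key] = result
        return result


server = ToyServer({"widget": 10})
session_id = uuid.uuid4()

first = server.decrement_stock(session_id, 1, "widget", 3)
# Pretend the reply was lost on the network, so the driver retries with the
# same session id and write number.
retry = server.decrement_stock(session_id, 1, "widget", 3)
# A genuinely new write uses the next write number.
second = server.decrement_stock(session_id, 2, "widget", 3)
```

The retry gets the same answer as the first attempt and the stock is only reduced once. In a real deployment the driver does this bookkeeping for you; with a 3.6-compatible driver it is switched on with the retryWrites=true connection string option.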
Security Improvement
Speaking of security, there is one change that the security community wanted (which I don’t think is that big of a deal). For years now, the MongoDB packaging for all OSs (and even the Percona Server for MongoDB packaging) has by default limited the bindIp setting to localhost. This was to prevent unintended situations where you had a database open to the public. With 3.6, the binaries now default to this as well. So, yes, it will help some. But I would argue (and have argued) that when you install a database from binaries or source, you are taking more ownership of its setup compared to using Docker, Yum or Apt.
The other major move forward, however, is something I have been waiting for since 2010. Seriously, I am not joking! It offers the ability to limit users to specific CIDR or IP address ranges. Please note MySQL has had this since at least 1998. I can’t recall if it’s just always been there, so let’s say two decades.
This is also known as “authenticationRestrictions”, and it’s an array you can put into the user document when creating a user. The manual describes it as:
The authentication restrictions the server enforces on the created user. Specifies a list of IP addresses and CIDR ranges from which the user is allowed to connect to the server or from which the server can accept users.
I cannot overstate how huge this is. MongoDB Inc. did a fantastic job on it. Not only does it support the classic client address matching, it supports an array of these with matching on the server IP/host also. This means supporting multiple IP segments with a single user is very easy. By extension, I could see a future where you could even limit some actions by range: dev/load test would be allowed to drop collections, but production apps would not. While they should have separate users, I regularly see clients who have one password everywhere. That extension would save them from unplanned outages caused by someone connecting to the wrong thing.
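As a rough sketch of what this looks like, here is a createUser command written out as a Python dictionary, plus a small local check that mimics the address matching. The user name, password, role and address ranges are invented, and the two helper functions are my own approximation for illustration, not the server’s logic.

```python
import ipaddress

# Shape of a createUser command with authentication restrictions.
# The user name, password, role, and address ranges are made up.
create_user_command = {
    "createUser": "app_user",
    "pwd": "change-me",
    "roles": [{"role": "readWrite", "db": "inventory"}],
    "authenticationRestrictions": [
        {
            # Where the client may connect from.
            "clientSource": ["10.0.0.0/24", "192.168.1.15"],
            # Which server addresses the client may connect to.
            "serverAddress": ["10.0.1.10"],
        }
    ],
}


def address_matches(address, allowed):
    """Return True if address equals, or falls inside, any entry in allowed."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(entry) for entry in allowed)


def restriction_permits(restriction, client_ip, server_ip):
    """Check one restriction document; every field present must match."""
    checks = []
    if "clientSource" in restriction:
        checks.append(address_matches(client_ip, restriction["clientSource"]))
    if "serverAddress" in restriction:
        checks.append(address_matches(server_ip, restriction["serverAddress"]))
    return all(checks)


restriction = create_user_command["authenticationRestrictions"][0]
```

With this restriction, a client at 10.0.0.42 talking to 10.0.1.10 passes the check, while the same client talking to 10.0.1.11, or a client at 172.16.0.9, does not.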
We will have a whole blog talking about these changes, their importance and using them in practice. Yes, security is that important!
Major love for Arrays and more in Aggregation
This one is a bit easier to summarize. Arrays and dates in particular get some much-needed aggregation love. I could list all the new operators here, but I feel it’s better served in a follow-up blog where we talk about each operator and how to get the most out of it. I will say my favorite new option is hint. Finally, I can try to control the query plan a bit if the planner is making bad decisions, which sometimes happens in any technology.
A better balancer
Like many other areas, there was a good deal that went into balancer improvements. However, there are a few key things that continue the work of 3.4’s parallel balancing improvements.
Some of it makes a good deal of sense for anyone in a support role, such as FTDC now also existing in mongos. If you do not know what this is, basically MongoDB collects some core metrics and state data and puts it into binary files in dbPath for engineers at companies like Percona and MongoDB Inc. to review. That is not to say you can’t use this data also. However, think of it as a package of performance information to have on hand if a bug happens. Other diagnostic-type improvements include moveChunk, which now reports in its output what happened when it ran. Previously you could get this data from the config.changelog or config.actionlog collections in the config servers. Obviously, more and more people are learning MongoDB’s internals, and this should be made more available to them.
Having talked about diagnostic items, let’s move more into the operations wheelhouse. The single biggest frustration about sharding and replica-sets is the sensitivity to time variations that cause odd issues, even when using ntpd. To this point, as of 3.6 there is now a logical clock in MongoDB. For the geekier in the crowd, this was implemented using a Lamport Clock (great video of them). For the less geeky, think of it as a cluster-wide clock preventing some of the typical issues related to varying system times. In fact, if you look closer at the oplog record format in 3.6 there is a new wt field for tracking this. Having done that, the team at MongoDB Inc. considered what other things were an issue. At times like elections of the config servers, meta refreshes did not try enough times and could cause a mongos to stop working or fail. Those days are gone! Now it will check three times as much, for a total of ten attempts before giving up. This makes the system much more resilient.
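For the geekier in the crowd, the textbook Lamport clock fits in a few lines of Python. This is the classic algorithm only, not MongoDB’s implementation, and the class and variable names are mine.

```python
class LamportClock:
    """Minimal Lamport logical clock for one node."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event and return the new time."""
        self.time += 1
        return self.time

    def send(self):
        """Stamp an outgoing message."""
        return self.tick()

    def receive(self, message_time):
        """Merge the sender's time so the receive is ordered after the send."""
        self.time = max(self.time, message_time) + 1
        return self.time


shard_a = LamportClock()
shard_b = LamportClock()

shard_a.tick()                     # local event on A, clock is now 1
shard_a.tick()                     # local event on A, clock is now 2
stamp = shard_a.send()             # A sends a message stamped 3
received = shard_b.receive(stamp)  # B jumps from 0 to 4
```

The point is that B never looks at a wall clock: it orders its events after A’s purely from the number carried in the message, which is why drifting system times stop mattering for ordering.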
A final change, still somewhat invisible to the user, makes dropping collections more stable: it removes the problems MongoDB had with dropping and recreating sharded collections. Your namespaces look as they always have. Behind the scenes, however, they have UUIDs attached to them, so that if foo.bar is dropped and recreated, the new collection gets a different UUID. This allows for less-blocking drops. Additionally, it prevents confusion in a distributed system about whether we are talking about the current or the old collection.
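The drop-and-recreate case can be pictured with a toy catalog in Python. ToyCatalog and its methods are hypothetical names for illustration and say nothing about how the real catalog is stored.

```python
import uuid


class ToyCatalog:
    """Toy catalog that gives every collection incarnation its own UUID."""

    def __init__(self):
        self.by_namespace = {}

    def create(self, namespace):
        self.by_namespace[namespace] = uuid.uuid4()
        return self.by_namespace[namespace]

    def drop(self, namespace):
        return self.by_namespace.pop(namespace)

    def is_current(self, namespace, collection_uuid):
        """True only if the UUID refers to the collection that exists now."""
        return self.by_namespace.get(namespace) == collection_uuid


catalog = ToyCatalog()
old_uuid = catalog.create("foo.bar")
catalog.drop("foo.bar")
new_uuid = catalog.create("foo.bar")
```

The namespace string is identical before and after, but a node still holding old_uuid can tell that it is referring to a collection that no longer exists.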
JSON Schema validation
Some non-developers might not know much about something called JSON Schema. It allows you to set rules on schema design more efficiently and strictly than MongoDB’s internal rules. With 3.6, you can use this directly. Read more about it here. Some key points:
- Describes your existing data format
- Clear, human- and machine-readable documentation
- Complete structural validation, useful for:
  - Automated testing
  - Validating client-submitted data
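As a small illustration, here is what a $jsonSchema validator can look like when written as a Python dictionary. The “orders” shape and its field names are invented for this example, and missing_required is only a local helper that checks one rule; the server enforces the full schema.

```python
# Options for creating a collection with a $jsonSchema validator.
# The "orders" shape and its field names are invented for this example.
order_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["sku", "quantity"],
        "properties": {
            "sku": {"bsonType": "string"},
            "quantity": {"bsonType": "int", "minimum": 1},
        },
    }
}


def missing_required(document, validator):
    """Return the required fields that the document lacks."""
    required = validator["$jsonSchema"].get("required", [])
    return [field for field in required if field not in document]
```

Assuming a PyMongo database handle named db, a validator like this could be attached when the collection is created, for example with db.create_collection("orders", validator=order_validator).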
Compass is Community
If you have never used Compass, you might think this isn’t that great. You could use things like RoboMongo and such. You absolutely could, but Compass can do visualization as well as CRUD operations. It’s also a very fluid experience that everyone should know is available for use. This is especially true for QA teams who want to review how often some types of data are present, or a business analyst who needs to understand in two seconds what your demographics are.
Major WiredTiger internal overhaul
There is so much here that I recommend any engineer-minded person take a look at this deck, presented by one of the great minds behind WiredTiger. It does a fantastic job explaining all the reasons behind some of the 3.2 and 3.4 scaling issues MongoDB had on WiredTiger. Of particular interest is why it tended to have worse and worse performance as you added more collections and indexes. It then goes into how they fixed those issues:
Some key points on what they did:
- Made evictions smarter, as collections are not uniform
- Improved assumptions around the handle cache
- Made checkpoints better in all ways
- Aggressively cleaned up old handles
I hope this provides a peek into the good, bad, and ugly in MongoDB 3.6 Community! Please check back as we publish more in-depth blogs on how these features work in practice, and how to best use them.