Using the MongoDB 3.6 Expression Query Operator $expr

March 9, 2018

Author

Tim Vaillancourt

In this blog post, we will discuss the MongoDB 3.6 expression query operator $expr. To show the power of this functionality, I will walk through a simple example.

The $expr Query Operator

With the exception of a few basic operators ($and, $or, $lt, $gt, etc.), before MongoDB 3.6 you could use many of the more powerful expression operators on query results only via the aggregation pipeline. In practice, this meant that MongoDB .find() queries could not take advantage of a lot of powerful server features.

In 3.6 and above, support for a new query operator named “$expr” was added to the MongoDB .find() operation. This allows queries to take advantage of operators that were previously available only in aggregations.

Users who are familiar with the aggregation framework will remember that expressions/conditions in aggregation are evaluated on a per-document basis. Aggregations allow document fields to be used as variables in conditionals (by prefixing the field name with a dollar sign). The new $expr operator adds that same flexibility and power to .find() and, perhaps more importantly for this article, to .findAndModify() commands!

I hope to show how this new functionality creates some very powerful and efficient application workflows in MongoDB.

Our Example Application

In this example, let’s pretend we are designing a store inventory application based on MongoDB. One of the major functions of the store inventory application is to update items when they’re sold. In this article, we will focus on this action only.

Each item in the inventory system is stored in a single MongoDB document containing:

1. A numeric “itemId”

2. A “name” string

3. The number of times the item has been sold (“sold”)

4. The total inventory available (“total”). Importantly, each item may have a different total inventory available.

An example “items” collection document:

> db.items.findOne()
{
	"_id" : ObjectId("5a85d32f8a734d82e8bcb5b5"),
	"itemId" : 123456,
	"name" : "a really cool item",
	"total" : 10,
	"sold" : 0
}

Some additional expectations are:

1. When the application calls the “sold item” workflow and all items are sold, an empty document (or “null”) should be returned to the application.

2. If the item is still available, the number of sold items should be incremented and the updated document returned to the application.

Pre-3.6: Example #1

Before 3.6, a common way to tackle our example application’s “sold item” workflow was by using a .findAndModify() operation. A .findAndModify() does exactly what it suggests: it finds a single document and modifies it however you specify.

In this example, our .findAndModify() operation should contain a query for the exact “itemId” we are selling and an update document that tells MongoDB to increment the number of sold items as part of the same operation.

By default, .findAndModify() returns a document BEFORE the modification/update took place. In our example, we want the result from AFTER the update, so we will also add the boolean option “new” (set to true) to cause the updated document to be returned.

In the MongoDB shell this .findAndModify() operation for “itemId” 123456 would look like this:

db.items.findAndModify({
  query: { itemId: 123456 },
  update: { $inc: { sold: 1 } },
  new: true
})

But there’s a problem: “itemId” 123456 only has ten items available:

> db.items.find(
    { itemId: 123456 },
    { _id: 0, total: 1 }
  )
{ "total" : 10 }

What if we run this .findAndModify() more than 10 times? Our query does not check whether we have exceeded the “total” items, so this won’t work!

Here we can see after running the query 11 times, our “sold” count of 11 is incorrect:

> db.items.find(
    { itemId: 123456 },
    { _id: 0, sold: 1 }
  )
{ "sold" : 11 }

If ALL items in the inventory system had a total of 10 items, this would be quite simple; the .findAndModify() operation could just be modified to consider that “sold” should be less than ($lt) 10:

db.items.findAndModify({
  query: {
    itemId: 123456,
    sold: { $lt: 10 }
  },
  update: {
    $inc: { sold: 1 }
  },
  new: true
})

But this isn’t good enough either.

In our example, each document has its own “total” and “sold” counts. We can’t rely on every item having ten items available; every item may have a different “total”.
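To make the limitation concrete, here is a small plain-JavaScript sketch (not a MongoDB query) that applies the hard-coded "sold less than 10" check to two documents. The second item, with a total of 3, and all function and variable names are assumptions added purely for illustration.

```javascript
// Plain JavaScript model of the fixed-limit filter { sold: { $lt: 10 } }.
// The second item is hypothetical, added only to show the failure case.
const sampleItems = [
  { itemId: 123456, total: 10, sold: 10 }, // sold out
  { itemId: 999999, total: 3, sold: 3 },   // also sold out, but total is 3
];

// Mirrors the hard-coded check: it never looks at the item's own "total".
function passesFixedLimit(item) {
  return item.sold < 10;
}

const fixedLimitResults = sampleItems.map((item) => ({
  itemId: item.itemId,
  stillSellable: passesFixedLimit(item),
}));

console.log(fixedLimitResults);
```

The first item is correctly rejected, but the second still passes the check even though all three of its units are gone, so it would be oversold.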

Pre-3.6: Example #2

In the pre-3.6 world, there weren’t too many ways to address the problem we found in our first approach, other than breaking the database logic into two different calls and having the application run some logic on the result in the middle. Let’s try that now.

Here’s what this new “sold item” workflow would look like:

1. A .find() query to fetch the document needed, based on “itemId”:

  > db.items.find(
      { itemId: 123456 }
    )
  { ... }

2. The application analyses the result document and checks whether the “sold” field is still less than “total”.

3. If there are items available, an .update() operation is run to increment the “sold” number for the item, and the application returns the updated document:

  > db.items.update(
      { itemId: 123456 },
      { $inc: { sold: 1 } }
    )
  { ... }

Aside from increased code complexity, there are several problems with this approach that might not be obvious:

1. Atomicity: A single MongoDB operation is atomic at a single document level. In other words, if two operations update a document, one of the operations must wait. In the situation in “Example #1”, where our query AND document increment occur in the same operation, we can feel safe knowing that the match and the “sold” increment happened atomically. Unfortunately, in our new approach, we’ve broken our query and update into two separate database calls. This means that the workflow is no longer atomic and is prone to race conditions. If many sessions run this logic at the same time, it’s possible for another session to increment the counter between your first and second database operation!

2. Several Operations: In storage engines like WiredTiger and RocksDB, operations wait in a queue when the system is busy. In our new approach, we must wait to enter the storage engine twice. Under serious load, this could create a cascading bottleneck in your architecture. It could cause application servers to stall and backlog operations in lockstep with the overwhelmed database. The most efficient approach is to perform the query and increment in a single operation.

3. Network Inefficiency: Performing two database commands requires double the serialization overhead and network round trip time. There is also a minor increase in the bandwidth required.
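The race condition described under “Atomicity” can be sketched with a toy in-memory model. This is plain JavaScript, not MongoDB code: the object stands in for one document, and the helper names (readItem, incrementSold) are hypothetical. The starting value of 9 sold is an assumption chosen so that only one item remains.

```javascript
// Toy in-memory stand-in for one document in the "items" collection.
const raceItem = { itemId: 123456, total: 10, sold: 9 };

// First call of the two-call workflow: read a snapshot of the document.
function readItem() {
  return { ...raceItem };
}

// Second call of the two-call workflow: an unconditional increment, like
// update({ itemId: 123456 }, { $inc: { sold: 1 } }).
function incrementSold() {
  raceItem.sold += 1;
}

// Two sessions both read before either one writes.
const sessionA = readItem();
const sessionB = readItem();

// Each session checks availability against its own (now stale) snapshot.
if (sessionA.sold < sessionA.total) incrementSold();
if (sessionB.sold < sessionB.total) incrementSold();

console.log(raceItem.sold, "sold of", raceItem.total); // 11 sold of 10
```

Both sessions saw one item left, so both incremented, and the item ends up oversold even though each session performed its own check correctly.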

Post-3.6: Example #3

In this third example, let’s utilize the new Expression Query Operator ($expr) that was added in 3.6 to make this workflow as efficient as possible. Note that the approach in this example only works on MongoDB 3.6 or the upcoming Percona Server for MongoDB 3.6 (or greater)!

The Expression Query Operator allows powerful operators to be used on the result document(s) of regular .find() queries. How? An $expr query is actually running an aggregation after translating your find filter/condition to a $match aggregation pipeline stage.

In our new approach, we will only need to use one basic operator in our expression: $lt (i.e., less-than). The $lt operator is used in our expression to check that our “sold” field is less than the “total” field. Using $lt under the new $expr operator, we can make the MongoDB server compare the “sold” vs. “total” count of the item we are querying and only increment and return the document if the expression is true, all in a single, atomic, server-side operation!

Here is what our improved query looks like:

db.items.findAndModify({
  query: {
    itemId: 123456,
    $expr: {
      $lt: [
        "$sold",
        "$total"
      ]
    }
  },
  update: {
    $inc: { sold: 1 }
  },
  new: true
})

Notice the $lt fields “sold” and “total” are prefixed with a dollar sign ($). This tells the aggregation to use the real value of each matched document in the less-than comparison dynamically. This resolves the problem we encountered earlier: this query now only succeeds if there are items available (“sold” is less than “total”)! This efficiently pushes logic down to the database.
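As a rough illustration of what the dollar-sign prefix means, here is a toy evaluator in plain JavaScript. It is only a sketch of the idea, not MongoDB’s implementation: the function names (resolveOperand, evalLt) are hypothetical, and it handles only top-level fields and a two-operand $lt.

```javascript
// Toy model of how { $expr: { $lt: ["$sold", "$total"] } } is evaluated
// against one document. This is NOT MongoDB's implementation.

// A string starting with "$" is treated as a field path and replaced by
// that field's value in the current document; anything else is a literal.
function resolveOperand(operand, doc) {
  if (typeof operand === "string" && operand.startsWith("$")) {
    return doc[operand.slice(1)];
  }
  return operand;
}

// Evaluates a two-operand $lt expression for a single document.
function evalLt(operands, doc) {
  const [left, right] = operands.map((operand) => resolveOperand(operand, doc));
  return left < right;
}

const availableDoc = { itemId: 123456, total: 10, sold: 9 };
const soldOutDoc = { itemId: 123456, total: 10, sold: 10 };

console.log(evalLt(["$sold", "$total"], availableDoc)); // true
console.log(evalLt(["$sold", "$total"], soldOutDoc));   // false
```

Because the operands are resolved per document, the same expression works for every item regardless of its individual “total”.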

The itemId: 123456 has a “total” value of 10. If we run the .findAndModify() 10 times, we get this result:

> db.items.findAndModify({
   query: {
     itemId: 123456,
     $expr: {
       $lt: [
         "$sold",
         "$total"
       ]
     }
   },
   update: {
     $inc: { sold: 1 }
   },
   new: true
 })
{
	"_id" : ObjectId("5a85d32f8a734d82e8bcb5b5"),
	"itemId" : 123456,
	"name" : "a really cool item",
	"total" : 10,
	"sold" : 10
}

Ten sold, ten total. Great!

If we run the same query one more time we receive a “null”:

> db.items.findAndModify({
   query: {
     itemId: 123456,
     $expr: {
       $lt: [
         "$sold",
         "$total"
       ]
     }
   },
   update: {
     $inc: { sold: 1 }
   },
   new: true
 })
null

Perfect! A null is returned because our $lt condition in our $expr operator did not succeed, just like we wanted.

Let’s make sure there was NOT an 11th increment, an issue we had in our first example:

> db.items.find(
    { itemId: 123456 },
    { _id: 0, sold: 1 }
  )
{ "sold" : 10 }

Here we can see the 11th increment did not run because the $expr failed to pass. 10 of 10 items are sold. This is very cool!
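The whole sequence above (ten successful sales, then a null) can be reproduced with a toy in-memory model in plain JavaScript. This is a sketch under stated assumptions rather than real MongoDB code: the sellOne function is hypothetical, and performing the check and the increment inside one function call merely stands in for the single atomic server-side operation.

```javascript
// Toy in-memory model of the Example #3 findAndModify. The check and the
// increment happen inside one function call, standing in for the single
// server-side operation; this is not how MongoDB is implemented.
const inventoryItem = {
  itemId: 123456,
  name: "a really cool item",
  total: 10,
  sold: 0,
};

function sellOne(item) {
  // Stand-in for $expr: { $lt: ["$sold", "$total"] }
  if (!(item.sold < item.total)) {
    return null; // no match: nothing is modified
  }
  item.sold += 1;     // stand-in for $inc: { sold: 1 }
  return { ...item }; // stand-in for new: true (the post-update document)
}

// Run the "sold item" workflow 11 times against an item with total 10.
const sellResults = [];
for (let attempt = 1; attempt <= 11; attempt++) {
  sellResults.push(sellOne(inventoryItem));
}

console.log(sellResults[9].sold); // 10
console.log(sellResults[10]);     // null
console.log(inventoryItem.sold);  // 10
```

The tenth call returns the document with “sold” at 10, the eleventh returns null, and the stored counter never passes “total”.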

Conclusion

Here we can see that combining two existing server features (.findAndModify() and document fields as variables in aggregations) with the new MongoDB 3.6 expression query operator $expr has solved many problems with a previously inefficient and potentially dangerous data workflow.