MongoDB Transactions? Yes

People claim that MongoDB is not transactional. It actually is, and that’s a good thing.

In MongoDB 2.2, individual operations are Atomic. Because per-database locks control reads and writes to collections, write operations on collections are Consistent and Isolated. With journaling on, operations can be made Durable. Put these properties together, and you have the basic ACID properties of transactions.

The shortcoming of MongoDB’s implementation is that these semantics apply only to individual write operations, such as the insert or update of a single document. If a MongoDB statement updates 10 rows, and something goes wrong with the fifth row, then the statement will finish execution with four rows updated and six rows not updated.
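To make the partial-update case concrete, here is a toy model in plain Python. It is only a sketch: it does not call MongoDB, and the names `update_many_non_atomic` and `increment_qty`, along with the rule that document 5 fails, are hypothetical choices made for illustration.

```python
def update_many_non_atomic(docs, apply_change):
    """Apply apply_change to each document in place, stopping at the first error."""
    updated = 0
    try:
        for doc in docs:
            apply_change(doc)
            updated += 1
    except ValueError:
        # Changes made before the failure stay applied; nothing is rolled back.
        pass
    return updated


def increment_qty(doc):
    # Hypothetical failure: the fifth document is rejected.
    if doc["_id"] == 5:
        raise ValueError("cannot update document 5")
    doc["qty"] += 1


docs = [{"_id": i, "qty": 0} for i in range(1, 11)]
updated = update_many_non_atomic(docs, increment_qty)
qtys = [d["qty"] for d in docs]
```

After the run, `updated` counts the documents changed before the failure, and `qtys` shows a data set that is neither the old state nor the intended new one.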

MongoDB running with Fractal Tree Indexes (used today in the MySQL storage engine TokuDB) is fully transactional. Each statement is transactional: if an update is to modify ten rows, then either all rows are modified, or none are. Queries use multi-version concurrency control (MVCC) to return results from a snapshot of the system, so they are not affected by write operations that happen concurrently.
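The all-or-nothing behavior of a statement can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how fractal tree indexes implement it: the changes are staged on a private copy and published only if every one of them succeeds. The names `update_many_all_or_nothing`, `reject_fifth`, and `increment` are hypothetical.

```python
import copy


def update_many_all_or_nothing(docs, apply_change):
    """Apply apply_change to every document, or to none of them."""
    staged = copy.deepcopy(docs)      # work on a private copy
    for doc in staged:
        apply_change(doc)             # an exception here aborts before commit
    docs[:] = staged                  # commit: publish all changes at once
    return len(staged)


def reject_fifth(doc):
    # Hypothetical failure: the fifth document is rejected.
    if doc["_id"] == 5:
        raise ValueError("cannot update document 5")
    doc["qty"] += 1


def increment(doc):
    doc["qty"] += 1


docs = [{"_id": i, "qty": 0} for i in range(1, 11)]

try:
    update_many_all_or_nothing(docs, reject_fifth)
    failed = False
except ValueError:
    failed = True

after_failure = [d["qty"] for d in docs]      # nothing was applied

update_many_all_or_nothing(docs, increment)
after_success = [d["qty"] for d in docs]      # every document was updated
```

Copying the whole data set is far too expensive for a real storage engine, which would rely on logging or versioning instead. The copy is used here only because it makes the commit point easy to see.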

Here are some benefits:

  • The state of the system after a failed command is well defined: nothing is applied.
  • Users who run queries requiring calls to getMore get their results from a consistent snapshot.
  • The clone command clones a consistent snapshot of the data.
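The snapshot benefit for batched reads can be illustrated with a toy multi-version store in Python. This is a sketch under stated assumptions, not the TokuDB or MongoDB implementation: `VersionedStore` and its methods are hypothetical names, and each batch returned by the cursor stands in for one getMore call.

```python
class VersionedStore:
    """Toy multi-version store: every write appends a new timestamped version."""

    def __init__(self):
        self._versions = {}   # key -> list of (commit_ts, value), oldest first
        self._clock = 0       # logical commit timestamp

    def write(self, key, value):
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))

    def snapshot(self):
        return self._clock

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        visible = [v for ts, v in self._versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None

    def cursor(self, batch_size):
        # Fix the snapshot and the key set when the cursor is opened.
        snapshot_ts = self._clock
        keys = sorted(k for k, vs in self._versions.items() if vs[0][0] <= snapshot_ts)

        def batches():
            for start in range(0, len(keys), batch_size):
                chunk = keys[start:start + batch_size]
                yield [(k, self.read(k, snapshot_ts)) for k in chunk]

        return batches()


store = VersionedStore()
for key in ("a", "b", "c", "d"):
    store.write(key, 0)

batches = store.cursor(batch_size=2)
first = next(batches)
store.write("c", 99)     # concurrent update between batches
store.write("e", 1)      # concurrent insert between batches
second = next(batches)
latest_c = store.read("c", store.snapshot())
remaining = list(batches)
```

The second batch still reports the old value of `c` and never sees `e`, because the cursor reads as of the moment it was opened. A fresh read at the latest snapshot sees the new value.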

From what we can tell, users want this.

Do you want to participate in the process of bringing full transactions to MongoDB? We’re looking for MongoDB experts to test our build on your real-world workloads. Evaluator feedback will be used in creating the product road map. Please email me at zardosht@tokutek.com if interested.

Later, I will write about multi-statement transactions, and our plans to introduce those.

Comments (13)

  • Mark Callaghan

    So they use the “atomic operations” model made famous by MyISAM?
    http://dev.mysql.com/doc/refman/5.6/en/ansi-diff-transactions.html

    April 2, 2013 at 5:01 pm
    • zardosht

      Mark,

      A lot of MongoDB’s storage algorithms remind me of MyISAM. In addition to atomic individual operations, they have database-level locking for writes, just as MyISAM has table-level locking, and their primary key, the “_id” index, is non-clustering. That said, it’s also important to note that MongoDB does have crash recovery.

      April 2, 2013 at 5:28 pm
      • Mark Callaghan

        They also reproduced the excellent community building done by MySQL. Too bad MyISAM was never made crash safe.

        April 3, 2013 at 2:42 pm
        • Michael Carney

          Just a lowly sales guy commenting late, but as far as I’m aware, the Aria engine in MariaDB is a crash-safe MyISAM. It just needs some friends to play with it.

          June 25, 2013 at 12:21 pm
  • Robert Hodges

    How exactly does Tokutek enable multi-statement ACID transactions for MongoDB? Is Tokutek a replacement for the MongoDB storage layer?

    April 3, 2013 at 2:52 am
    • zardosht

      Yes, we completely replace the MongoDB storage layer with fractal tree indexing.

      April 3, 2013 at 3:09 am
  • Ilya

    Will this be open-sourced?

    April 4, 2013 at 1:29 pm
  • Benjamin Darfler

    Does this apply to sharded setups? If not, it seems to be of limited use, since the point of choosing MongoDB over MySQL is often its easy sharding ability.

    April 6, 2013 at 7:51 pm
    • zardosht

      We realize that sharded setups are an important use case for MongoDB users and are currently digging into how sharding will work with fractal trees. We can’t yet comment on how transactions will work with sharded setups. This currently applies to non-sharded setups.

      April 8, 2013 at 2:46 pm
      • James

        To be honest, this is a big improvement even with the sharding proviso. I bet many, if not most, workloads have inserts hitting a single shard (much like the recommendation for queries to hit one shard, for latency reasons). Certainly for our workloads, the shard key is nicely orthogonal to the data, so this in itself would be a great improvement.

        April 9, 2013 at 10:26 am
  • Nikhilesh Reddy

    Good post…

    July 2, 2013 at 3:12 pm
  • Ban Ăn Chơi

    Thanks, nice post

    May 26, 2016 at 11:25 am
