September 1, 2014

MVCC: Transaction IDs, Log Sequence numbers and Snapshots

MySQL Storage Engines implementing Multi Version Concurrency Control have several internal identifiers related to MVCC. I see a lot of people being confused what they are and why they are needed so I decided to take a time to explain it a bit. This is general explanation, it does not corresponds to Innodb in particular and some implementation can be different but I hope this will let you to understand MVCC a bit better.

Transaction ID As the name says this is transaction identifier. It can be used by the engine for many things – for lock handling to see which transaction holds the lock and possibly kill it in case of deadlock, for proper isolation mode handling – transaction should see its own uncommitted changes but other transactions typically do not see them as well as MVCC implementation. It also can be used for recovery so you can see to which transaction given change corresponds so you can roll back or redo changes for transaction depending if it was committed or not.

Log Sequence Number (LSN) Log Sequence Numbers correspond to given position in the log files and typically incremented for each log record. Innodb uses number of bytes ever written to the log files however it could be something different. LSNs are often extensively used in recovery check pointing and buffer management operations. When checkpoint (both fuzzy and not) happens you get something like “all changes up to LSN=X are now flushed to the data space” this means you can discard or archive logs for LSN earlier than that. When doing log recovery checking LSN in the log record can tell if you this change needs to be applied or it already was applied (when doing recovery you do not know which dirty pages were flushed from the buffer pool).

The LSN do not relate much to transactions – changes from different transactions are intermixed in the log files and many LSNs can correspond to changes from the same transaction.

Snapshot Versions This is the term which is often used to to identify what will be visible for given transaction/query, however MVCC is usually implemented without having something like snapshot number.

Indeed if you look at it in details visibility for transaction is more complicated when some given “snapshot” – transaction tends to see its own changes, even if they are not committed.

What happens instead is – when transaction is started, the list of concurrently running transactions (not committed) is memorized.

For each row more than one version can be stored, each tagged with transaction IDs which added/modified and deleted the given version. This data can be compressed/optimized to keep less information if possible, ie you do not really need to store transaction ID for the row which is so old it is visible to old running transaction anyway.

So what you can get for given row is something like:

Value 5 created by transaction 100
Value 4 created by transaction 50
Value 3 created by transaction 10, deleted by transaction 20.

This tracks most recent object history which tells you it was created as value 3 by some transaction when was deleted and there were no such row (ie with given PK) for some time and when couple of new versions were created.

When looking at the row in MVCC environment system has to determinate which of the current row versions is visible. This can be done as simple as traversing the list of version down or something more complicated.

In the described case lets consider transaction 101 which is running in REPEATABLE-READ isolation mode and which was started before transaction 100 have committed.

In such case transaction 101 should not see changes done by transaction 100 and will read value 4 created by transaction 50.

We also need to track when given transaction has committed, in the list of transactions which were active at the time current transaction was started. This is needed for example to handle READ-COMMITTED isolation mode.

Lets look at transaction 102 which was started in READ-COMMITTED isolation mode before 100 was committed. Doing first read of this row it will read 50, now if it reads the row second time, after transaction 100 was committed it will see this row because it will know transaction 100 was committed when given read operation was started.

Note: These are far from all functions which MVCC storage engines may place on these identifiers. This is not extensive list but just to give you some examples to understand why do you need all of them.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. sakib says:

    How do you in MY Sql increment the number itself from like say 6000???

  2. peter says:

    What number and why would you like to increment it ?

  3. Andrej says:

    nice blog-entry!!! i’ve read it and i love it! i’m interesting in high scalable architectures and your blog entry helps to understand the mvcc-koncept in this context. thank you!

Speak Your Mind

*