October 21, 2014

MySQL should have dynamic durability settings

If you’re using Innodb tables MySQL has innodb_flush_log_at_trx_commit variable which defines how durable your transactions are. If you have high durability requirements you set it to 1 and log records are pushed directly to the disk on transaction commit. If you do not bother loosing come committed transactions you can set it to 0 and Innodb will only flush log approximately once per second. Finally you can set it to 2 which is flushes data to operation system cache (so if MySQL crashes transaction is not lost) but does not save from OS crashes or power failures.

So far so good. The only problem is many applications have different kind of transactions in the mix.

Indeed if you think about most Web applications different data (often updated in different transactions) have different durability requirements. For some of updates you really would hate to lose them, though they may be rare. For example if user registers and you lost his account he might not be happy. On other hand if you’re doing update of the fact that user viewed certain item to implement recommendations it does not matter if this update is lost in crash conditions. You also probably do not want to loose orders but less picky about product rating votes.

What Innodb could do is at least to make innodb_flush_log_at_trx_commit a session variable, this is really where it belongs as decision how to commit transaction can be decided for each transaction separately in session context.

What MySQL Could do to take it further is to implement couple of different types of commit. For example we may have normal COMMIT which behaves default way set for the session, COMMIT DELAYED to flush transaction log some time in the future and COMMIT FORCE to force log flush for some important update.

Besides having such feature used directly in the application it can be rather helpful for replication thread handling. As replication is asynchronous anyway you do not have to flush logs for each transaction. Replication option could be implemented to utilize delayed commit for the replication thread.

This is surely not standard at all but could be quite handy.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Paul McCullagh says:

    Hi Peter,

    You’ve got my vote! I think this is a great idea. Very useful and not difficult to implement.

  2. QQLinux says:

    I think 2 is good

  3. peter says:

    Thanks Paul,

    BTW you could at least implement session based durability settings in PBXT :)

Speak Your Mind

*