October 25, 2014

Innodb usability and ease of use.

It has always surprised me how little the Innodb team seems to think about product usability and ease of use when it comes to settings, performance management, etc.

I could understand many things 5 years ago, like a lot of information being available only in hard-to-parse SHOW INNODB STATUS output, or even uglier hacks such as creating tables like innodb_lock_monitor to get more detailed information, or free space being reported in table comments (which need to be parsed), etc. 5 years ago Heikki was alone and he had a lot to do to make things work well, so a lot of these things were just done the quick and dirty way.

It is however hard for me to understand why, so many years later and with a significantly larger team, not only do many of these things remain unfixed but new things are still done in a similar way.

Over the years variables like innodb_thread_concurrency were added, with a rather complicated history of changes in the meaning of their values, which now seems to have settled into something more or less understandable, with value 0 disabling Innodb internal thread queuing.

Another one is innodb_flush_log_at_trx_commit. Initially it had only values 0 and 1, which was more or less obvious, as 0 typically means “No” and 1 “Yes” in many MySQL options. Then value 2 was added, which actually has a meaning in between what 0 and 1 mean, and that is very hard to understand.

The proper way would of course be to use string values which are more self-explanatory, so at least you can’t mix up what you currently have set in your my.cnf file. For example, using values “none”, “disk”, “os” instead of 0, 1, 2 would be more explanatory.

Furthermore, MySQL even has the infrastructure to support both string and integer values, which would allow compatibility to be preserved. Look for example at the query_cache_type variable, which can be set to OFF/ON/DEMAND as well as to 0, 1 and 2.
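As a rough illustration of that idea, here is a minimal Python sketch of a setting that accepts either its legacy number or a name. It is not MySQL's actual option-parsing code. The names “none”, “disk” and “os” are the hypothetical ones suggested above, and `parse_enum_setting` is a made-up helper.

```python
# Hypothetical names for innodb_flush_log_at_trx_commit, following the
# suggestion in this post; the real server only accepts 0, 1 and 2.
FLUSH_LOG_NAMES = ("none", "disk", "os")  # position == legacy numeric value


def parse_enum_setting(raw, names):
    """Return the numeric value for a setting given as a number or a name."""
    text = str(raw).strip()
    try:
        value = int(text)
    except ValueError:
        value = None

    if value is not None:
        # Numeric form: keep old my.cnf files working.
        if 0 <= value < len(names):
            return value
        raise ValueError(f"value {value} out of range 0..{len(names) - 1}")

    # String form: case-insensitive match against the known names.
    lowered = text.lower()
    if lowered in names:
        return names.index(lowered)
    raise ValueError(f"unknown value {raw!r}; expected a number or one of {names}")
```

With this, `parse_enum_setting("os", FLUSH_LOG_NAMES)` and `parse_enum_setting("2", FLUSH_LOG_NAMES)` resolve to the same internal value, so old configuration files keep working while new ones can be self-describing.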

Interestingly enough, in some cases the Innodb team does get things right. The innodb_flush_method variable does not use values like 0, 1, 2, 3, 4, 5 which you have to look up in the manual each time; it uses more understandable values such as O_DIRECT, O_DSYNC, fdatasync etc.

However, look at the new variable innodb_autoinc_lock_mode, freshly added in 5.1.22 (which is what motivated me to write this post). Looking at the manual, we again get values 0, 1, 2 which have to be translated to “traditional”, “consecutive” and “interleaved”.

Using string values would make things much more friendly, with my.cnf and “SHOW VARIABLES” output being more obvious and self-documenting.
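To give a feel for the difference, here is a small Python sketch that annotates numeric settings with names, roughly what self-documenting output could look like. The names for innodb_autoinc_lock_mode are the manual's terms quoted above; the names for innodb_flush_log_at_trx_commit are the hypothetical ones proposed earlier in this post, and the `annotate` helper and sample values are invented for the example.

```python
# Names for numeric settings. The autoinc names are the manual's terms;
# the flush names are hypothetical and are not accepted by MySQL.
VALUE_NAMES = {
    "innodb_autoinc_lock_mode": {0: "traditional", 1: "consecutive", 2: "interleaved"},
    "innodb_flush_log_at_trx_commit": {0: "none", 1: "disk", 2: "os"},
}


def annotate(variables):
    """Format name/value pairs, appending a readable label where one is known."""
    lines = []
    for name, value in sorted(variables.items()):
        label = VALUE_NAMES.get(name, {}).get(value)
        suffix = f"  # {label}" if label is not None else ""
        lines.append(f"{name} = {value}{suffix}")
    return lines


if __name__ == "__main__":
    # Sample values, as they might be read from SHOW VARIABLES or my.cnf.
    sample = {"innodb_autoinc_lock_mode": 1, "innodb_thread_concurrency": 0}
    print("\n".join(annotate(sample)))
```

Settings without a known mapping are printed unchanged, so the helper can be run over a full variable list.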

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. rockerBOO says:

    I agree with you on these. It has been really difficult to look over my innodb parameters, and get a good idea of what is happening in there.

  2. Denis says:

    Didn’t get u a bit. “not only many of these things are fixed” did u mean “not only many of these things are _not_ fixed” ?
    like they remain unfixed or what?

  3. peter says:

    Thanks Denis, Fixed.

  4. Jay Janssen says:

    I disagree that innodb_flush_method is simple. The deficiency in my mind is that it can change the open file options for either the innodb data files, *OR* the trx log files, but not both at once. I’d like it better if I could use O_DIRECT for my data files to prevent FS caching, and O_SYNC for my trx logs.

  5. peter says:

    Jay,

    I knew someone would point that out :)

    It is simple in the sense of settings which are more or less self descriptive. The logic behind it can be not so simple, you’re right.

    Indeed, setting the log and data flush methods separately would be better, especially as even the allowed modes differ between operating systems. For example, O_DIRECT on Linux requires buffer alignment while on Solaris it is not required, and so it should be possible to use it for logs as well.

  6. Hm… what’s the downside of using O_DIRECT wrt the write ahead log?

    We’re considering the switch so that innodb doesn’t have to worry about the kernel pushing the buffer pool into the page file.

  7. Jay Janssen says:

    O_DIRECT doesn’t seem to affect the trx log, it applies to how inno opens the data files.

  8. peter says:

    Indeed. O_DIRECT can’t be used with the log (on Linux) because it requires aligned IO, while the log can get writes of various sizes and alignments.

  9. Peter. Good point.

    So memlock + O_DIRECT seem to be required on Linux if you want to use all your memory and not page.

    I still have to track down whether this is a bug in 2.6.18 or whether this is expected behavior.

    Kevin

  10. As far as I have understood it, O_DIRECT makes Linux not cache the data read from disk. This is a very small performance improvement and only holds true if you either don’t have any free memory to cache it in, or are sure you won’t read it again.

    Here is an article with Linus Torvalds thoughts on O_DIRECT:

    http://kerneltrap.org/node/7563

  11. Alexander,

    O_DIRECT can provide a huge performance boost since it can trick Linux into NOT moving the buffer pool into the page file.

    Here’s my recent battle with innodb+linux

    http://feedblog.org/2007/09/29/using-o_direct-on-linux-and-innodb-to-fix-swap-insanity/

  12. peter says:

    Kevin, Alexander

    Indeed – O_DIRECT removes IO pressure and so can reduce swapping. But even if your OS VM is smart enough to keep cache size in control there is still the reason to use it, and you’re actually stating it.

    With databases you’re indeed sure you will not ever read the same block again – because it is in database buffer pool cache, which is typically larger than OS cache and so it will be most likely first removed from OS cache.

    Not to mention you likely want to keep your OS cache for things which really need it – like MyISAM tables, sort files etc.

  13. Antony Curtis says:

    The plug-in system available in MySQL 5.1+ permits the declaration of system variables as an ENUM, which would easily allow defining variables such as innodb_flush_log_at_trx_commit in a more human-friendly way instead of just a number.

  14. Kevin

    Thank you for clearing that up, it was a good read.
