November 27, 2014

MySQL 5.7.3 milestone release fixes some of my pet peeves

It is wonderful to see some of my original pet peeves fixed in MySQL 5.7.3! It has not even taken 10 years :)

I remember when starting my work at MySQL Support that I would recommend using UNION ALL instead of plain UNION because it would not require duplicate elimination, and as such, would not require the creation of a temporary table in basic cases. Guess what? MySQL actually did it anyway. The learning experience from this incident was to never use common sense alone when thinking about MySQL – test and validate your assumptions because for one reason or another things may work differently from what “obviously makes sense.” I do not want to single out MySQL – this approach has helped me to be successful with problem analyses of all kinds of software.

The Multi Column IN optimization is another one that was long overdue for fixing. It was a wonderful feature when it was first introduced allowing Multi-key lookup queries from applications written much easier… so I was quite surprised to learn using this works differently from equivalent form written with OR:
SELECT * FROM T WHERE ((A,B) in (1,1), (2,2)) would work differently from SELECT * FROM T WHERE (A=1 and B=1) OR (A=2 and B=2)
This is essentially the different syntax to express same logical condition and it is very strange to see MySQL being unable to optimize the first form of it. It was clearly a half-baked feature in MySQL and I’m glad MySQL team is getting to fixing it for MySQL 5.7.

There are other impressive things in this Milestone as well:

Support for persistent connections and quick reconnects: I’m really glad work is going on supporting both application development patterns – of supporting a high number of connections and alternative styles when you just connect and disconnect all the time – and so the number of connections maintained is low. 50K connects/disconnects a second is very impressive number and when it comes to the number of connections – you can get to tens of thousands of connections with Thread Pool (part of MySQL Enterprise edition and also available in Percona Server as Open Source Alternative).

MetaData Lock Instrumentation is another very important feature. As of right now in MySQL 5.6 the Meta Data Locking patterns can be pretty complicated while there is very limited visibility into what is really happening – why threads are waiting for locks, they are waiting for and who is holding them. The small concern I have here is – it is done as part of Performance Schema which means additional overhead, even though the locks wait information is already always available internally and probably the current lock waits could be displayed without need for extra overhead in the similar way as INNODB_LOCKS table exposes internal Innodb locks.

MySQL 5.6 brought a lot of impressive changes and there is a lot of exciting stuff comes with the MySQL 5.7 milestones releases we have seen so far.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. KepPro says:

    “never use common sense alone when thinking about MySQL” – priceless :)

  2. Rick James says:

    UNION ALL is still worth recommending — it avoids the de-dup pass.

  3. Rick,

    For sure. My point is UNION ALL just got better avoiding creating temporary table at all. I have tested it a while back and UNION ALL was always better than UNION in MySQL even before this optimization was done
    http://www.mysqlperformanceblog.com/2007/10/05/union-vs-union-all-performance/

  4. Sinisa says:

    Peter, my dear friend,

    UNION ALL / DISTINCT can not be judged on whether one is better then the other. You only have to answer a simple question of whether you require distinct results or not. But, even if you require distinct result, ALL can be used in so many cases simply for the fact that, with some queries and entities, you will get distinct results even with ALL clause.

Speak Your Mind

*