My Innodb Feature wishes

PREVIOUS POST
NEXT POST

At Users Conference Heikki did good presentation about Innodb planned features. I did not see some of big and tiny wishes listed so I was making notes. Here is what I’d like to see

  1. Packed indexes. In many cases then difference in Performace with MyISAM and Innodb is huge for read only workload it is caused by the fact Innodb does not have prefix compression for its indexes, as MyISAM has which in certain cases makes them 10 times + larger. Heikki plans to have gzip like page compression but it would only give up to 2 times space compression.
  2. Insert buffer for delete. For Insert operation Innodb uses nice tecnique called Insert Buffer which speeds up inserts for non-unique indexes a lot. However deletes are not handled so they can be 10 times+ slower, which is especially bad for rollback of failed inserts which may take too long to complete
  3. Index build by sort. This is in ToDo but so important I’ll still list it. It would allow data load to be much faster as well as ALTER TABLE (expecially together with ability to create index without table rebuilding)
  4. Buffer pool improvements. I would like to see couple of things here. To have kind of middle point insertion so full table scan do not wipe it off as well as different priorities for different kind of data, for example flushing of undo pages might be avoided if possible as we know they are going to be purged.
  5. Long Rows. One ugly thing Innodb currently does is storing first 700 bytes or something like that from any blob/string in the page. Which gives you the nice surprise of row being unable to fit to the page if many columns are long strings. I would like to see it at least doing it for indexed columns only or even better have option similar to PostgreSQL to specify where this column should be stored – with the row in the page or outside of it. The other item with blobs – currently if I’m not mistaken blobs are stored each in its own “segment” which means having many blobs in the same row is going to be slow. For medium sized blobs it would be much better to merge them and store on the same segment and ie only store blobs over 1M in their own segments.
  6. Clustering. I love Innodb’s ability to cluster data by primary key it is so helpful in many cases, but in other cases you would like to cluster data by index which is not unique – for example cluster data by customer or by timestamp. Creating artificial second column is ugly and inconvenient. Having independent CLUSTER=key option would be great, it also would help to avoid penalty of very long primary key – you can simply select not to cluster by it instead of creating fake primary key and promoting real one to unique.
  7. Tablespaces. To call “innodb_file_per_table” option tablespaces is a joke. It is better than nothing but needs to be improved allowing many to many map of objects to the tablespaces. Especially being able to map different indexes to different tablespaces is important. This could be helpful in certain workloads. Yes you could simply use RAID and forget about tablespaces by spreading the load, but you often can do it better manually :)
  8. I/O On Linux you’re currently limited to innodb_file_io_threads=4. Some people do not know it but setting it to different values does not have any effect :) In certain cases multiple threads for some tasks would be helpful, especially for flushing dirty buffers. I would actually like to see it separated to different options – for example it does not make much sense to have more than one log flush thread. The other big lacking (especially as patch for it existed for years) is Async IO support in Linux. It should be very helpful with direct IO as well as with innodb_file_per_table=1 allowing to flush all of modified tablespaces in the parallel.
  9. Multiple page sizes. One size does not fit all and for certain workloads 16K is too much while for some others too little. Being able to specify page size per object or at least per tablespace would be great. This will complicate buffer management a bit but Innodb is already to deal with 2 page sizes by supporting compressed pages so it might be not that bad. Also other guys are able to handle it – even MyISAM has 1K-4KB key blocks so it is well possible
  10. Scalability It is actually bug but it is so bad I will list it. Basically Innodb has problems scaling both with large number of active threads and with large number of CPUs (in this bug you can see it is too bad even with 4 concurrent queries). The problem seems to be at least partially due to global mutexes in many places so it is not easy to fix. But I guess Heikki will have to do it otherwise with raise of MultiCores Innodb will become unusable for more and more types of workloads

What would your Innodb feature wishes be ?

PREVIOUS POST
NEXT POST

Comments

  1. pudo says

    (I have to use MySQL 4.0 at work and I’m not sure this is not already there in a later version, but): I’d love to see FULLTEXT indices in InnoDB. MyISAM has it. InnoDB needs it.

    Otherwise, that’s a neat list.

  2. peter says

    Yes… FullText Search would be nice. I however would see MySQL to implement full text search on the top level and make interface all storage engines could implement to support it. If everyone implements their own solution it would be inconvenient.

    I’m to aware of this and I did not mention it as it is not exactly performance related feature :)

    Also if you’re looking to do full text search with Innodb you may give a try to http://www.sphinxsearch.com/ It works with any table types and is normally much faster than build in FullText search. The integration is not as seemless as with build in FullText search but still pretty simple compared to other tools I’ve played with.

  3. n says

    Why bother fixing a inherently broken database-system when PostgreSQL and Firebird are freely available?

  4. Francesco Simi says

    We need Unicode (utf8) support in MyODBC driver for Windows.

    Broken for 4 years now… killing the windows platform for MySQL.

  5. peter says

    About PostgreSQL, Firebird etc – Belive me everyone has his own can of worms, no software is perfect. And you can say the same about close source databases as well. But having workarounds for problems you know is much better than facing unknown problems in new product

  6. peter says

    MyODBC – I will pass this to our MyODBC Developers. As far as I remember it is in works but it in works for too long.

  7. peter says

    Free hot backup is out there for all disk based storage engines…. It is called LVM (Linux Volume Manager). But yes, Innodb hot backup would be good to have free.

    And this actually may very well happen.

  8. peter says

    Just run into quite curious discussion about this article:

    http://www.phpbuilder.com/board/showthread.php?t=10322666

    I do not want to get into details comparing MySQL vs PostgreSQL just should say I can’t agree with poster at many counts.

    In general I will be often bringing out MySQL problems at this side as well as will advice on their avoidance and workarounds. MySQL Works great for millions of users but it is just wrong way to search for such stories. You can go to MySQL web site and download case studies and download some.

  9. says

    1. MySQL to open interface fulltext search and innoDB implement it.
    2. Descending indexes.
    3. Better index usage on range queries.

  10. peter says

    Couple of comments

    1) “Desceding indexes” is feature MySQL needs to implement on the top level. Some support might be needed from storage engine but it is less of the deal.

    2) What do you mean by better index usage for range queries. Can you give an example ?

  11. says

    select count(*)
    from Statement T
    join CurrDailyVal Q on Q.cur1 = T.cur1
    and Q.cur2 = ‘USD’
    and Q.xdate = ( select max(xdate) from CurrDailyVal
    where cur1 = Q.cur1
    and cur2 = ‘USD’
    and xdate

  12. says

    < = T.xdate )

    This is a select from an income statement table to report incomes in a different currency. CurrDailyVal table holds the exchange values. Its primary key is cur1, cur2, xdate.

    This select is an example to my previous post’s item 2 and 3.

    Item 2: I believe descending indexes would be helpful here since in the inner select i find a max date when approaching the nearest exchange value.

    Item 3: MySQL does not use xdate in the inner select. It just uses cur1 and cur2 and narrows the potential result set and then make a scan thru the remaining rows to find the max date.

  13. peter says

    Gokhan, I’d like to take a look at your table structure and explain.
    For corelated subquery you have cur1=const and cur2=const and xdate>=const which should use full index (cur1,cur2,xdate) furthermore max(xdate) should be retrieved from index directly

  14. says

    According to the documentation at MySQL site, Peter is right. MySQL should use all three columns in the inner select. However, I have faced performance problems in the real world which made me think MySQL does not use the range indexes efficiently.

    For the example query I previously attached here, I have prepared a test case, where you can see what I mean. To better illustrate what I mean, I have put rather big amounts of data to both tables. Statement table has 172806 rows, and CurrDailyVal has 638 rows. You can download the test case at http://igonline.biz/gdemir/data.tar.bz2 and try for yourself. The test case file itself is 1367455 bytes (~1.3 MB).

  15. peter says

    Gokhan,

    I finally had a time to take a look at your case. I’m not sure what it has to do with Range usage. In your case join can be only performed using two key parts. You can see it in explain as ref: “test.T.cur,const” with “ref” access type. At the stage when Join happens the value for third keypart is not yet avaiable as subselect is not executed.

    Now if you look at the number of rows query would return without subselect:

    mysql> select count(*) from Statement T join CurrDailyVal Q on Q.cur1 = T.cur and Q.cur2 = “EUR”;
    +———-+
    | count(*) |
    +———-+
    | 55125114 |
    +———-+
    1 row in set (1 min 17.06 sec)

    This subquery will need to be executed 55mi times so do not expect fast answer even if subquery is almost instant :)

  16. says

    This same test case takes 30 seconds to complete with Sybase ASE 15. Showplan output clearly states the usage of all three columns in the correlated subquery. Now, I have a production system with this kind of select and the data is growing day by day, I don’t know what to do.

  17. peter says

    Gokhan,

    In this case it could be Optimizer bug with subselect execution or with explain. Please report it at http://bugs.mysql.com This query should be using all 3 keyparts but explain shows it does not.

    Note there are VERY many missing optimizations with Subselects at this point. You should be very careful migrating Subselects from other Database servers as they could run much slower.

    In your case the best approach perhaps is to think how you can rewrite this query to perform well on MySQL.

  18. says

    Hi Peter,

    Thanks for the suggestion. Before submitting this case as a bug, I made a search through the bug system and found that bug, most probably very similar to this.

    http://bugs.mysql.com/bug.php?id=18465

    Therefore, I decided not to submit this case as a bug.

    By the way, especially nowadays, it is getting too much time for mysql developers to fix the bugs. There are even some severity 1 bugs open for six months or so, and no response from MySQL at all.

    Best Regards,
    Gokhan

  19. peter says

    Gokhan,

    This bug is actually a bit different even though it also correspond to poor
    subselect optimization and they have a lot in common.

    In general it might be better to report similar looking bug as new bug as if it is the same bug it will be quickly checked as soon as bug fixed, if it is not both of them can be fixed and so you will not find out the bug fixed does not apply to your case.

    Speaking about long time to fix the bug. Yes there is large backlog formed due to the rush with 5.0 development cycle. I hope back backlog will be reduced soon and we’ll get back to prompt bug resolution.

  20. Dmitry says

    simple “SELECT COUNT(*) FROM table” takes minutes/hours to run on tables with xxM records, can’t InnoDB calculate it based on Primary Key instantly?

  21. peter says

    Dmitry,

    This is pretty common request which is of high priority as soon as I know. It is however not as trivial as it sounds as Innodb is multi versioning engine so each transaction could see different amount of rows in the table which needs extra handling. Not what it is impossible just not as trivial as you’re saying.

    At this poing if you have no holes in your primary key and it goes from one select max(id) is frequently good and fast way to get the count. You also could use counter table especially now with Triggers in MySQL 5.0

    Yes I know it is ugly but we have to live with what we have :)

Leave a Reply

Your email address will not be published. Required fields are marked *