My Innodb Feature wishesPeter Zaitsev
At Users Conference Heikki did good presentation about Innodb planned features. I did not see some of big and tiny wishes listed so I was making notes. Here is what I’d like to see
- Packed indexes. In many cases then difference in Performace with MyISAM and Innodb is huge for read only workload it is caused by the fact Innodb does not have prefix compression for its indexes, as MyISAM has which in certain cases makes them 10 times + larger. Heikki plans to have gzip like page compression but it would only give up to 2 times space compression.
- Insert buffer for delete. For Insert operation Innodb uses nice tecnique called Insert Buffer which speeds up inserts for non-unique indexes a lot. However deletes are not handled so they can be 10 times+ slower, which is especially bad for rollback of failed inserts which may take too long to complete
- Index build by sort. This is in ToDo but so important I’ll still list it. It would allow data load to be much faster as well as ALTER TABLE (expecially together with ability to create index without table rebuilding)
- Buffer pool improvements. I would like to see couple of things here. To have kind of middle point insertion so full table scan do not wipe it off as well as different priorities for different kind of data, for example flushing of undo pages might be avoided if possible as we know they are going to be purged.
- Long Rows. One ugly thing Innodb currently does is storing first 700 bytes or something like that from any blob/string in the page. Which gives you the nice surprise of row being unable to fit to the page if many columns are long strings. I would like to see it at least doing it for indexed columns only or even better have option similar to PostgreSQL to specify where this column should be stored – with the row in the page or outside of it. The other item with blobs – currently if I’m not mistaken blobs are stored each in its own “segment” which means having many blobs in the same row is going to be slow. For medium sized blobs it would be much better to merge them and store on the same segment and ie only store blobs over 1M in their own segments.
- Clustering. I love Innodb’s ability to cluster data by primary key it is so helpful in many cases, but in other cases you would like to cluster data by index which is not unique – for example cluster data by customer or by timestamp. Creating artificial second column is ugly and inconvenient. Having independent CLUSTER=key option would be great, it also would help to avoid penalty of very long primary key – you can simply select not to cluster by it instead of creating fake primary key and promoting real one to unique.
- Tablespaces. To call “innodb_file_per_table” option tablespaces is a joke. It is better than nothing but needs to be improved allowing many to many map of objects to the tablespaces. Especially being able to map different indexes to different tablespaces is important. This could be helpful in certain workloads. Yes you could simply use RAID and forget about tablespaces by spreading the load, but you often can do it better manually 🙂
- I/O On Linux you’re currently limited to innodb_file_io_threads=4. Some people do not know it but setting it to different values does not have any effect 🙂 In certain cases multiple threads for some tasks would be helpful, especially for flushing dirty buffers. I would actually like to see it separated to different options – for example it does not make much sense to have more than one log flush thread. The other big lacking (especially as patch for it existed for years) is Async IO support in Linux. It should be very helpful with direct IO as well as with innodb_file_per_table=1 allowing to flush all of modified tablespaces in the parallel.
- Multiple page sizes. One size does not fit all and for certain workloads 16K is too much while for some others too little. Being able to specify page size per object or at least per tablespace would be great. This will complicate buffer management a bit but Innodb is already to deal with 2 page sizes by supporting compressed pages so it might be not that bad. Also other guys are able to handle it – even MyISAM has 1K-4KB key blocks so it is well possible
- Scalability It is actually bug but it is so bad I will list it. Basically Innodb has problems scaling both with large number of active threads and with large number of CPUs (in this bug you can see it is too bad even with 4 concurrent queries). The problem seems to be at least partially due to global mutexes in many places so it is not easy to fix. But I guess Heikki will have to do it otherwise with raise of MultiCores Innodb will become unusable for more and more types of workloads
What would your Innodb feature wishes be ?