November 22, 2014

Countless storage engines

Today everybody is writing about the MySQL Conference & Expo, and I am no exception. I was impressed by the number of storage engines presented. Back in the good old days when Oracle bought InnoDB, MySQL took one step in response – it announced the Pluggable Storage Engine Architecture. At that time nobody could predict the outcome of such a feature, but now we can all see it.
Let me list the transactional storage engines I heard about at the conference, along with my main impressions.
InnoDB At the conference, InnoDB presented two interesting new features: transparent compression (ZIP) of pages stored on disk, which can potentially improve the performance of I/O-intensive tasks a great deal, and fast index builds. Both should be available in MySQL 5.2.
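To give a feeling for what transparent page compression buys, here is a minimal sketch (illustrative Python only – InnoDB of course implements this in C++ inside the engine; the point is just that a page is compressed on the way to disk and restored on the way back, invisibly to the reader):

```python
import zlib

PAGE_SIZE = 16 * 1024  # InnoDB's default page size is 16KB

def compress_page(page: bytes) -> bytes:
    """Compress a page before writing it to disk."""
    return zlib.compress(page)

def decompress_page(blob: bytes) -> bytes:
    """Restore the original page after reading it from disk."""
    return zlib.decompress(blob)

# A page full of repetitive row data compresses very well.
page = (b"user_id,status,country=US;" * 700)[:PAGE_SIZE]
stored = compress_page(page)

print(f"on-disk size: {len(stored)} bytes "
      f"({len(page) / len(stored):.0f}:1 ratio)")
assert decompress_page(stored) == page  # transparent to the reader
```

For I/O-bound workloads, fewer bytes on disk means fewer and smaller reads, which is where the potential speedup comes from – at the cost of CPU time spent compressing and decompressing.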
PBXT An interesting engine, mainly because of the way it stores new records: it "never updates" rows and just writes to the end of the file. Surely this has benefits as well as drawbacks. We are going to test PBXT under a write workload, though currently a comparison with InnoDB would not be fair, as PBXT does not perform fsync() after commit and relies fully on the OS. The developers promise to fix this soon.
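The "never update" idea is easy to sketch: every change is appended as a new row version, and a small in-memory map points at the latest one. This is a hypothetical illustration, not PBXT's actual on-disk format:

```python
# A minimal sketch of "never update in place": every change appends a
# new row version to the end of the log, and old versions stay where
# they are. (Hypothetical code for illustration only.)

class AppendOnlyStore:
    def __init__(self):
        self.log = []        # sequential file: appends only
        self.latest = {}     # key -> offset of the newest version

    def put(self, key, value):
        self.log.append((key, value))          # sequential write
        self.latest[key] = len(self.log) - 1   # old version remains in the file

    def get(self, key):
        return self.log[self.latest[key]][1]

store = AppendOnlyStore()
store.put("row1", "v1")
store.put("row1", "v2")   # an "update" is just another append
print(store.get("row1"))  # -> v2
print(len(store.log))     # -> 2: both versions still in the file
```

The benefit is that all writes are sequential; the drawback is that dead versions accumulate and have to be garbage-collected eventually.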
Falcon The storage engine being developed inside MySQL. The performance we saw in our previous benchmarks was not good enough, but there have been some fixes in the 5.2 tree. Also, this engine will not support the index coverage (covering index) optimization, which makes Falcon less attractive for performance reasons.
Solid An engine with both pessimistic and optimistic locking at the table level. An interesting note: Solid proposes a High Availability solution for the Solid engine, which could be a replacement for MySQL replication (the solution was only announced and will be available later).

And three engines that were only presented, with details available only from marketing materials:
NitroEDB This engine promises to handle aggregate functions over massive data sets very quickly. As I understood it, they use a special index to store pre-calculated values for a limited number of functions, so the engine can retrieve results directly from the index. The interesting question is whether the engine can calculate a combination of functions, or a user-defined function, over terabytes of data. I would like to test it, but most likely that will not be possible in the near future, as the license price is $20,000 per server (a price I heard somewhere; I am not sure it is 100% correct).
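If my understanding is right, the idea can be sketched like this – a hypothetical illustration only, since the actual mechanism is known just from marketing materials:

```python
# Sketch of a pre-aggregated index: maintain per-group aggregates as
# rows arrive, so SUM/COUNT on a huge table becomes a lookup instead of
# a scan. (Hypothetical illustration of the technique, not NitroEDB code.)

from collections import defaultdict

agg = defaultdict(lambda: {"count": 0, "sum": 0.0})  # the "special index"

def insert(region, amount):
    # Aggregates are updated at write time, once per inserted row.
    a = agg[region]
    a["count"] += 1
    a["sum"] += amount

for region, amount in [("EU", 10.0), ("EU", 5.0), ("US", 7.5)]:
    insert(region, amount)

# "SELECT SUM(amount) ... WHERE region = 'EU'" -- no table scan needed:
print(agg["EU"]["sum"])    # -> 15.0
print(agg["US"]["count"])  # -> 1
```

This also shows the limitation I wondered about: only functions chosen at index-build time are pre-computed (AVG is derivable from SUM and COUNT, but an arbitrary user-defined function is not), so combinations outside that set would presumably still require a scan.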
Infobright Infobright is a non-transactional, read-only engine designed for data warehouse applications, with compression rates from 10:1 to 30:1.
ScaleDB An engine based on a special index, the so-called "trie", an extension of the Patricia tree data structure. The developers promise ultimate speed and scalability for index access operations. It would be interesting to test it in real situations, because theoretical ideas can be very far from real life. Also, nothing prevents adding such an index to another storage engine if it shows good performance results.
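For reference, the basic trie idea looks like this (a toy sketch; a real Patricia trie also collapses single-child chains, which this simplified version does not):

```python
# Minimal trie sketch: lookup cost depends on key length, not on the
# number of keys stored, which is where the scalability claim comes from.

class Trie:
    def __init__(self):
        self.root = {}

    def insert(self, key, value):
        node = self.root
        for ch in key:
            node = node.setdefault(ch, {})
        node["$"] = value  # "$" marks end-of-key

    def get(self, key):
        node = self.root
        for ch in key:      # O(len(key)) steps regardless of table size
            if ch not in node:
                return None
            node = node[ch]
        return node.get("$")

t = Trie()
t.insert("romane", 1)
t.insert("romanus", 2)     # shares the "roman" prefix with the first key
print(t.get("romanus"))    # -> 2
print(t.get("roman"))      # -> None: a prefix only, not a stored key
```

Whether the theoretical O(key length) behavior survives contact with disk layouts and concurrency is exactly the kind of thing that needs testing in real situations.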

A wide choice is always good, but I'm afraid that making the right decision about a storage engine will become much harder.

About Vadim Tkachenko

Vadim leads Percona's development group, which produces Percona Cloud Tools, Percona Server, Percona XtraDB Cluster and Percona XtraBackup. He is an expert in solid-state storage, and has helped many hardware and software providers succeed in the MySQL market.

Comments

  2. Ann Harrison says:

    “Index coverage optimization” means that queries can be resolved from the index without reference to the data. Falcon doesn’t carry transaction information in indexes for two reasons: the information is big relative to an index entry (up to two transaction ids, at about 6 bytes each), and the transaction ids are volatile – sometimes you need one, sometimes you need two, and sometimes you don’t need any. Those factors equate to bloating the indexes and making them more expensive to maintain.

    So, queries like:

    select name from people where name = 'Ann'

    require actually reading the records in people that are indexed under ‘Ann’ to see if I might have changed my name to ‘Her Most Exalted Majesty’. There are real cases where resolving from the index is a good thing – joins with junction tables, for one – e.g. students / registrations / courses, where the registration record consists only of the student_id and the course_id. Not reading those records is a good thing.

    However, the choice that Falcon made was to have smaller, faster, lower-maintenance indexes. That’s also the choice that InterBase, Firebird, and Postgres made. I’ve seen benchmarks where they do well, and others where they don’t. Personally, I prefer that indexes work better in the general case, even if it costs in this case, but YMMV.

  3. peter says:

    Ann,

    Yes, thank you for your comment. I understand well the implications and the design choices you made.

    This, however, leaves Falcon in a very interesting position. For small data sets, when the data mostly fits in memory, the fact that Falcon has neither clustering by primary key nor covering indexes is not critical, because little random I/O is required. In the same case, the fact that the indexes are a bit longer does not cause much trouble, and the fact that InnoDB has to cache full pages rather than individual rows does not bother you too much either. Also, optimized index retrieval does not really matter if the data is in memory.

    So this leaves a rather narrow range of working-set-to-memory ratios in which Falcon’s optimizations will allow it to excel. I.e., assume we have 16GB of memory and a 10GB InnoDB database that would be 7GB in Falcon (for example), and assume for the sake of argument that the working set matches the database size.

    If you get, say, 20GB in InnoDB and 14GB in Falcon, or 30GB in InnoDB and 21GB in Falcon, we can see the difference; but when we get to 100GB in InnoDB vs. 70GB in Falcon, the lack of a clustered key and covering indexes may cause serious problems.
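Working these numbers through (same assumptions: 16GB of memory, and the Falcon copy at roughly 70% of the InnoDB size):

```python
# The band where Falcon's smaller footprint actually pays off -- where
# the Falcon copy fits in memory but the InnoDB copy does not -- is
# narrow. Sizes in GB; the 0.7 factor is the example's assumption.

memory_gb = 16
falcon_factor = 0.7  # assumed: Falcon database ~70% of the InnoDB size

for innodb_gb in (10, 20, 30, 100):
    falcon_gb = innodb_gb * falcon_factor
    innodb_in_ram = innodb_gb <= memory_gb
    falcon_in_ram = falcon_gb <= memory_gb
    if innodb_in_ram:
        verdict = "both fit in memory -> little advantage either way"
    elif falcon_in_ram:
        verdict = "only Falcon fits -> its smaller footprint pays off"
    else:
        verdict = "neither fits -> clustered/covering indexes dominate"
    print(f"{innodb_gb}GB InnoDB / {falcon_gb:.0f}GB Falcon: {verdict}")
```

With these assumptions, only the 20GB/14GB case lands in the favorable band – which is the “rather narrow range” in question.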

    Again, these are only my assumptions so far – real life will show us how it really is.

    I think you greatly underestimate the number of cases in which a covering index is a lifesaver.

  4. peter says:

    Vadim,

    My comments on this post.

    1) PBXT: “Never update the row,” as far as I understand, really applies to the variable-length portion of the row – the fixed-length portion still needs to be updated to link to the previous row version. I think the assumption in PBXT is to keep the fixed-length portion small so that this file fits in memory. In general, PBXT seems very well optimized for BLOB handling, and for handling long rows in general. Also worth noting: currently PBXT really treats different databases as different instances, which can cause many gotchas if you have transactions spanning multiple databases, both for commit and for the sake of REPEATABLE READ isolation.

    2) Falcon: It is worth noting that MySQL’s official marketing never positions Falcon this way per se (for obvious reasons), but in many users’ minds it is indeed seen as a replacement for InnoDB. Visiting customers, I often hear the question of when they should migrate to Falcon and which benefits they will get. I always tell them it is too early to migrate and too early to say, because Falcon is under aggressive development. Due to very different concepts, both in performance properties and in transaction implementation details, Falcon is unlikely to be an easy drop-in replacement for InnoDB, but for certain applications, especially newly designed ones, it offers some attractive features. I do agree, however, that the lack of clustering by primary key and the inability to use covering indexes mean there is no way to get predictable data locality, which is extremely important for large databases.

    3) Solid: This one is actually better positioned as an “InnoDB replacement,” as it has a much closer architecture, and applications may be easy to port if the performance is adequate. Solid also has clustering by primary key and, unlike InnoDB, has key compression (the lack of which is one of InnoDB’s major problems, in my opinion). The problems with Solid, based on listening to their talk, may be the handling of long transactions (spilling the Bonsai Tree to disk may not be that efficient) and writing new pages to new locations, which is likely to cause significant problems.

    When comparing transactional storage engines, though, one needs to take a lot of variables into account: how stable the engine and its integration are (the last thing you want is losing your data – that is worse than crashes or wrong query results), and crash recovery reliability and speed, including rare cases such as partial page writes. This is an area where knowledge will simply have to be accumulated.

    It is also worth noting that different storage engines and features have various levels of readiness – as I understand it, ScaleDB is not publicly available yet. And the MySQL world is not free (as in beer) any more: Solid’s synchronous replication offering may be commercial only (the web site does not say exactly yet), and NitroEDB and Infobright are very specialized storage engines and, not surprisingly, commercial. Plus, Infobright is currently Windows-only and read-only (a special loader-compressor must be used), as I understand it.

    P.S. For Planet MySQL readers who may wonder why Peter is commenting on Peter – the original post was made by Vadim. A number of people write for MySQLPerformanceBlog these days, but it all shows up under Peter Zaitsev on Planet MySQL.

  5. Ryan says:

    About clustered indexes: it seems to me that you could use InnoDB for the tables where you rely on the clustered index and give InnoDB a small buffer pool. If you’re mainly doing lookups on these tables by clustered index, they don’t necessarily need to be in the buffer pool – if most of the accesses were going to be cache hits, you probably wouldn’t need the table clustered to begin with. Then, for all other tables where you don’t need your data clustered, you could use Falcon with a large page and row cache and benefit from the better use of memory (especially in the case of the row cache). If you could fit twice as many records in memory with Falcon as with InnoDB, you could see a large improvement in performance in the common case where you’re disk-bound on reads. At this point, though, Falcon is really a good year or two out before I would consider using it in a production environment. Also, I need to find out how well things would work in a mixed-mode environment with some tables on InnoDB and some on Falcon.
