Should you move from MyISAM to Innodb ?

There is significant portion of customers which are still using MyISAM when they come to us, so one of the big questions is when it is feasible to move to Innodb and when staying on MyISAM is preferred ?

I generally prefer to see Innodb as the main storage engine because it makes life much simpler in the end for most users – you do not get to deal with recovering tables on the crash or partially executed statements. Table locks is no more problem, hot backups are easy, though there are some important things which we have to consider on case by case basics before recommending the move.

Is MyISAM used as default or as a choice ? This is the most important question to ask upfront. Sometimes MyISAM is there just because it is default, in other cases this is deliberate choice with system being optimized to deal with MyISAM limits, for example there is a dedicated slave available for all long reporting queries. In case MyISAM was chosen not just happened to be it is important to build the good argument to suggest Innodb.

Application Readiness Application should be ready to work with Innodb, for example be ready to deal with deadlocks which can happen with Innodb even if you do not use transactions, but which are not existent with MyISAM. QA has to be performed as part of the move.

Performance Innodb has a lot to offer in terms of performance – Performance benefits and drawbacks. On the benefits side we usually see clustering by primary key, caching data, higher concurrency, background flushes while on the drawbacks side we see significantly large table size (especially if data size is close to memory size), generally slower writes, slower blob handling, concurrency issues, problems dealing with very large number of tables, slow data load and ALTER TABLE and others. Another big one is COUNT(*) without where clause which is often the show stopper for them move until it is worked around.

Operations What is good for MyISAM kills Innodb, such as copying binary tables between the servers. It is important the team understands Innodb and knows how to handle it, or be able to learn it. It is also important to adjust processes as required to work with Innodb. For example binary copy of one of the databases from the Slave to the dev envinronment works great for MyISAM but does not work with Innodb. Backup tools like “mysqlhotcopy” does not work etc. Note Performance also affects Operations aspects a lot – for example using mysqldump as a backup may well work for MyISAM but will start taking way too much time to do restore for Innodb. On large scale installations mysqldump does not work anyway but it may still work for you when you’re running MyISAM but instantly break upon upgrading to Innodb.

Features The MyISAM features which forbid moving to Innodb are typically Full Text Search and RTREE indexes/GIS with Full Text being much more common. There are workarounds for both of them, including dedicated MyISAM slave or shadow table but it is important to consider them.

How about Mixing Storage Engines ? Sure you can mix storage engines but I suggest you doing is wisely. It complicates operations tasks (backups, balancing, performance analyzes) as well as it exercises not so common paths in the MySQL server – in particular Optimizer may have harder time because costs between storage engines may not be well balanced or replication of mixed table types which is quite complicated.

I prefer to pick one storage engine (typically Innodb) and when use other tables when it really gives substantial gains. I would not switch table to MyISAM because it gives 5% performance improvement but I can perfectly use MyISAM (or Archive) for logging.

Innodb Needs Tuning As a final note about MyISAM to Innodb migration I should mention about Innodb tuning. Innodb needs tuning. Really. MyISAM for many applications can work well with defaults. I’ve seen hundreds of GB databases ran with MyISAM with default settings and it worked reasonably. Innodb needs resources and it will not work well with defaults a lot. Tuning MyISAM from defaults rarely gives more than 2-3 times gain while it can be as much as 10-50 times for Innodb tables in particular for write intensive workloads. Check here for details.

Note: As Few people questioned me, I indeed forgot to clarify the scope here – I’m mainly speaking about OLTP/ Traditional web applications. for Analytics things are a lot different.

Share this post

Comments (61)

  • Alex Yaroshevich

    Неплохая статья) Спасибо.

    January 12, 2009 at 12:00 am
  • Loc Dinh

    Very nice article Peter, thanks for the information. Just one thing wasn’t mentioned and would be great if you can clarify for me:

    My database has been running on MyISAM for a while and it does have multilingual data (Chinese, Japanese, French, English …), do you know any effect it might have to existing data if i changed to InnoDB?

    Thanks

    January 12, 2009 at 12:00 am
  • jayaram

    @ Ernesto Vargas…

    Thank you very much for your quick response :)
    Yes , have_innodb is yes
    At peak time minimum 50 to 60 read queries and 40 to 60 write queries on the same table.

    Please give me my.cnf setting for innodb engine for this large data tables.
    we are using 30 tables like this.

    Please give me suggestions about working with innodb for these big tables.
    And mostly me fear about backups…
    Suggest me about better backup plans with innodb

    Thank you

    January 12, 2009 at 12:00 am
  • Ernesto Vargas

    @jayaram…

    Do a SHOW VARIABLES and make sure have_innodb say YES. Then do an ALTER TABLE ENGINE INNODB and you will be it.

    How many queries are getting stuck on with table level locking at peak time? With Innodb you will no have that problems since its ACID complaint.

    January 12, 2009 at 12:00 am
  • jayaram

    I am working with one table it contains 30million entries about 3GB, which is MyISAM engine.
    And my application uses this table as so many of read and writes with in 1sec.

    Problem is most of queries in locking stage, i.e problem with table level locking.

    Can i move to InnoDB ? , What are the -ve and +ve things..

    Could you please help me………………

    January 12, 2009 at 12:00 am
  • Sam

    I am a native speaker and I understood it, but it was painful to read. Good work on publishing the info, and don’t mind the grammar police, they are just French descendants.

    January 12, 2009 at 12:00 am
  • Dale

    We evaluated switching to InnoDB.

    Performance vs. MySQL was horrible, even after tuning. SELECT performance is crucial and it just did not hold up.

    Management of the massive InnoDB binary file was daunting unlike MyISAM which has each table in its own set of files.

    The show stopper was restore of tables. If we needed to do a restore of any table in InnoDB, we would need to restore the full InnoDB database from tape, move it to another sql server and then find what we are looking for (at least that seems to be the flow of things to be done for such). In our environment, some of our tables are hundreds of gigs and our total database is much larger – the time to do this is just unacceptable.

    To be real honest, I don’t know why a production shop would use InnoDB based on what we have experienced thus far. It is slow, hard to manage and not at all viable in a DR situation when you have to quickly restore a single database or table.

    I would be interested in other comments in regard to the DR aspect, am I correct in the steps necessary to restore a single table from tape?

    January 12, 2009 at 3:56 pm
  • Ernesto Vargas

    I have seen many cases were people use MyISAM because of the Full-Text and they don’t know other alternatives…Sphinx Search Engine for example.

    Most people that use MyISAM may lack the knowledge and expertise to manage a full blow transaction database engine like InnoDB. At the end what matter is that the data is safe, trustable and reliable and you cannot guarantee that with MyISAM on site with medium concurrency.

    Peter. What is your best method for InnoDB backup on hunders of GB datasets?

    January 12, 2009 at 4:52 pm
  • Kevin Burton

    I think the biggest hurdle for switching is the massive 3-4x database bloat that comes with InnoDB.

    You guys might want to take a stab at solving that problem. I haven’t looked at the internals much but my hunch is that the on disk performance efficiency could be improved.

    January 12, 2009 at 5:05 pm
  • DaveB1

    Interesting topic, but difficult to read. Is your native language not English? I would suggest having someone running through your article to correct bad grammar and punctuation before posting.

    January 12, 2009 at 5:41 pm
  • Fineas K.

    The right tool for the right job. I don’t think this is a wise recommendation. Both MyISAM and InnoDB have their place. I tend to use both where they accel and would never remove either as a viable option.

    January 12, 2009 at 6:07 pm
  • Ms. Anthrope

    @Dale
    You likely needed better tuning – hire a good MySQL DBA and/or consultant.

    As for restores, selective restoring from binary files is painful in InnoDB. If you weren’t using LVM and doing raw backups w/ InnoDB, you probably weren’t backing your databases up properly either. For per-table restores, we use mysqldump + –single-transaction and splice out what we need with some internal tools when the need arises.

    January 12, 2009 at 6:10 pm
  • peter

    Dale,

    I specially list Performance as one of concerns. There are cases when Innodb will not perform as good as MyISAM no questions… in particular if you tuned your applications for MyISAM for years.

    Innodb does not require you to use single file if you use innodb_file_per_table. You can also backup and restore single table if you use this option (though it is a bit complicated) the catch is you can restore it only to the same server it was backed up.

    It is also interesting why you would frequently need to restore single table from tape ? Quite typically the tables are cross referenced so this would result in inconsistent data.

    In general I do not know anything in your application… may be Innodb is not a good match for you or may be you just know MyISAM much better.

    I’m confident my general advice holds. You do not have to believe me just check out how many Alexa Top 100 sites (as an example) use mostly Innodb vs Mostly MyISAM.

    January 12, 2009 at 6:13 pm
  • peter

    Ernesto,

    I mainly speak about our customers, who mainly have their application grew from the baby stage and so it makes sense for them to hire professional expertise to do the review. They generally plan for grow, scalability, want to have good HA etc… and yes we’re here for them to both help with their issues and teach them to so stuff themselves.

    Sphinx is indeed one of the leading ways to simply move away from MyISAM only, Full Text Search which also has various scalability issues.

    Regarding Innodb Backups – we mainly use LVM (or other snapshot) based backup which is the most typical solution. The second most common one is cold backup from the slave.

    January 12, 2009 at 6:17 pm
  • peter

    Kevin,

    Right though it is very data and query related. I know you guys try to keep large portion of your working set in memory which is basically worst case scenario. If data is small (will fit in memory in MyISAM and Innodb) or huge – 10x or more of memory size for MyISAM and Innodb the size based performance difference will not be that large any more.

    The Indexes (which are unpacked) is the biggest problem for Innodb tables which should be fixable though not as trivial. As we get more resources to work on XtraDB I think this would be good thing to look at.

    January 12, 2009 at 6:21 pm
  • peter

    Ms. Anthrope,

    For 100G+ tables MySQLdump restore may not be a very nice option – Innodb does not build indexes by sort in 5.0/5.1 Though this feature is available in Innodb Plugin and XtraDB. On large data sets this can get load 10x+ faster though still no match to binary backups.

    By the way making Innodb files copyable between servers is another thing we have on our mind for XtraDB.

    January 12, 2009 at 6:23 pm
  • peter

    Fineas K,

    Right. Tool for the job but you’ve got to start somewhere.

    MySQL by default starts with Everything in MyISAM which I do not think is a sensible default. I prefer to have Innodb by default because it is safe – you do not care about crashes or single wrong suboptimal query locking up the whole thing. This is even if you do not care about transactions and other Innodb features. Innodb paranoid checksums also made this storage engine the most reliable on stable hardware. Dealing with sporadic MyISAM corruption is extremely fun 🙂

    Starting with Innodb and when using MyISAM (or other storage engines) in case when there is a benefit of doing so is my main idea 🙂

    January 12, 2009 at 6:28 pm
  • peter

    @DaveB1:

    You’re right. English is not my native language. Depending on the mood and phase of the moon I may produce more or less readable text. Please feel free skipping unreadable parts – I do not mind, also you can “translate” the post to Perfect English and republish – I do not mind, just give me a credit 🙂

    January 12, 2009 at 6:30 pm
  • David

    Peter,
    I think you mostly talk about OLTP between MyISAM and InnoDB
    However, you have to realize that a lot of people has use MySQL in their OLAP application or hybird operation.
    In that case, I think MyISAM has the advantage.
    I also look forward to the new Maria storage engine, which is transaction safe engine.

    January 12, 2009 at 7:16 pm
  • surf

    typo?
    with deadlocks which can happen with Innodb even if you do not use transactions, but which are not existent with Innodb.

    January 12, 2009 at 8:29 pm
  • Dale

    @Anthrope you state:
    > You likely needed better tuning – hire a good MySQL DBA and/or consultant.
    We hired and discussed this with a certified MySQL consultant from Sun – he agreed with our cons and with how we had tuned our system.

    > As for restores, selective restoring from binary files is painful in InnoDB. If you weren’t using LVM and doing raw backups w/ InnoDB,
    > you probably weren’t backing your databases up properly either. For per-table restores, we use mysqldump + –single-transaction and
    > splice out what we need with some internal tools when the need arises.

    Right, try doing a mysqldump and splicing out what you need when maybe a critical table of data needs to be restored quickly and your ops is down at a rate of $5k per minute. The above would take us hours of time costing us hundreds of thousands of dollars potentially. The point being, I have to plan on a “really bad day” hoping it will never occur. It is easily in the realm of possibility that we would have to restore a table from tape to deal with some kind of unknown situation. InnoDB was not built for this scenario. The other person’s idea of using the InnoDB feature of one file per table was good, but appears to be very unnatural and hard to manage?

    I realize many “top sites” use InnoDB. I would be curious to know why they decided such and how they handle the query throughput issue and of course their DR plans for a critical failure and need to restore a table rapidly.

    Thanks for the article!

    D

    January 12, 2009 at 8:34 pm
  • peter

    David,

    Yes I speak mostly about OLTP. OLAP with full table scans and similar stuff indeed is the good usage for MyISAM. I should have been more clear with that.

    January 12, 2009 at 10:09 pm
  • peter

    surf – yes fixed.

    January 12, 2009 at 10:09 pm
  • peter

    Dale,

    With downtime of $5000 per minute I’m curious how you’re dealing with repairing of these MyISAM tables or are you switching to the slave because you do not care about loosing some updates… Well of course you do not as you would not be using MyISAM if you would.

    You still did not answer WHY you need to restore single table. What condition causes this ?

    Typically there are following conditions

    1) Server Down. Switch to the slave or DRBD/SAN based standby
    2) Data corruption with crash. Switch to the slave.
    3) Hacker break in or developer trashes the data – switch to time delayed slave or do roll forward recovery from backup.

    Note with very high downtime costs you can use high end systems doing snapshots every 5-10 minutes which are being instantly available if you need to roll back to old data copy.

    Also – with $5000 cost per minute you could pay us for few minutes of downtime and get .ibd files movable for XtraDB 🙂

    In any case there is general sense an there are exceptions. It is very possible we would have recommended you to stay with Innodb reviewing your application. Just recognize it is not typical. Working for a long time for a single company with single class of the applications you may feel like things are the same for everyone.

    January 12, 2009 at 10:19 pm
  • Shlomi Noach

    @Dale,

    Hi,

    1. Most DBAs and companies (I’ve met and seen a few) do not need to do single table restore on a daily (weekly?) basis. Actually, there shouldn’t be a reason to do so in the first place. I would suggest that there is some design flaw, or otherwise programmatic bug, which leads you to this need.

    2. You speak of volumes of 100GB and above, probably you have more than 1TB of data. You say you cannot afford the restoration time from a mysqldump’s single table parsing.
    In that case, I suspect you will certainly NOT have the time for a REPAIR TABLE on your MyISAM’s corrupted table. You say you need to prepare for a “really bad day”. I’ve met a DBA who has many TB of data, all in MyISAM. On average, 5 days each week are consumed by REPAIR TABLE. The larger tables take 4-5 *days* to restore.
    Please be prepared for a very bad day. It *will* come.

    3. I’m not suggesting InnoDB is right for you, since I don;t know anything about your application. Perhaps MyISAM is the right choice. You’ve has a review by a MySQL consultant. I’m assuming the review was professional.
    But you must remember that your application is probably tuned to work with MyISAM. Just as your application would not work well with Oracle right now, there’s always adjustments to the types of queries you run against the storage engine.

    4. InnoDB is far from being perfect.

    5. Are you using a Unix-like OS? If so, what file system are you using? Unjournalled, like ext2, or journaled, like ext3, xfs? If you’re running Windows, do you useFAT32 of NTFS?
    I suspect you’re using journaled file systems, although they are *slower*. The reason is – because they’re safer. MyISAM’s crashes are a show stopper for myself.

    6. “…To be real honest, I don’t know why a production shop would use InnoDB based on what we have experienced thus far. …”
    I think Peter summed it up very well in comment #18.

    If the above sounded like a “always use InnoDB” preach, that was not my intent. I was merely trying to point some issues with regard to your post.

    As a final suggestion, please read Peter’s comments again, they are very helpful.

    Best regards,
    Shlomi

    January 12, 2009 at 11:14 pm
  • Baron Schwartz

    About performance — there are many cases when InnoDB can outperform MyISAM significantly, especially when you know how to use its clustered index and how to take advantage of index-covered queries.

    There is no way to say one is faster than the other. It always depends on the specific case.

    January 12, 2009 at 11:37 pm
  • peter

    Baron… Right I mention couple of examples here in Performance Section.

    My reasoning for “go with Innodb by default” is peace of mind rather than claims it always gets best performance.

    January 13, 2009 at 12:03 am
  • Log Buffer

    On the MySQL Performance Blog, Peter Zaitsev’s question is, should you move from MyISAM to Innodb? Peter and his readers lay out the criteria he think are at play when answering this question for yourself.

    Log Buffer #131

    January 16, 2009 at 2:01 pm
  • mike

    We are having performance difficulties with the innodb plugin from innodb.com , using ver. 1.01 and mysql 5.1.24:

    Our 15 tables are partitioned (with 1024 partitions each), the innodb_file_per_table is on, and data is compressed (key-buffer-size=4). Total db content is only 15G.
    Our Inserts drop to 2500 samples / seconds (400 kB per second) when compression is on. We perform bulk inserts (i.e. values (),(),(),…).
    I set the innodb_buffer_pool_size to 1G (having 4G ram) which seems to be the highest buffer allowed. if exeeded, mysql does not recognise the search engine (!) anymore.
    Removing the compression helps for insert performance.

    Create table and alter table statements are very slow too if we have partitions (indep. from compression).
    Without partitions, the innodb plugin was o.k., slower than MYISAM (incl. inserts, selects) but still o.k.

    Unfortunately we need the compression because the db (goal: several TB) may be populated with a lot of redundant data.

    Any idea? would falcon be an alternative?

    January 21, 2009 at 4:04 am
  • peter

    Mike,

    Really your application deserves more accurate lock. The 1024 partitions is likely overkill – this really makes each table to be 1024 tables underneath which makes it surely slow. The buffer pool also can be much larger – there are people using over 100G for buffer pool just fine.

    January 21, 2009 at 10:52 pm
  • Shlomi Noach

    Mike,

    This one has a 0.999 probability of being the solution to your 1GB problem:
    Install a 64bit linux.

    I suspect you have a 32bit installation, which is why the storage engine would not accept more than 1GB: it’s out of address space. In total, you may squeeze some 1.5G from all MySQL components together…

    On a 4GB machine, setting innodb_buffer_pool_size to 2.5G (assuming only InnoDB is used, machine is solely MySQL) is probably a reasonable value, which leaves you much enough space for complicated queries and many connections.

    I suspect that moving from 1GB to 2.5GB will increase you InnoDB performance a lot, based on the fact you only have 15GB of data.

    Regards

    January 21, 2009 at 11:02 pm
  • mike

    Hi there,
    We are using this many partitions as the application most probably will run for around 10 years and gather data for very long timespans. In order to facilitate pruning, we chose the maximum amount of partitions per table, which currently corresponds to 3,5 days per partition (partition by range). Best case scenario would have been if would have been able to compress this data aditionally.
    Yes, our OS is SLES 9 32bit. Maybe we could change something there…
    Today we experienced a good performance increase if we changed from Barracuda back to Antelope file format. We do not have compression then, but InnoDB plugin performs with around 13000 samples/second on inserts compared to 2000-3000 with Barracuda and compression.

    cheers

    January 22, 2009 at 3:54 am
  • peter

    Mike,

    I understand benefits of partitioning. I’m just warning you about performance penalties of dealing with 1024 of underlying partitions. The compression does have considerable overhead on heavy insert no surprises here.

    January 22, 2009 at 10:03 am
  • Roman Lvov

    @Peter
    >Yes I speak mostly about OLTP. OLAP with full table scans and similar stuff indeed is the good usage for MyISAM. I should have been more clear with that.

    Uff, ok, good. I was surprised to hear that InnoDB is almost always better choice than MyISAM. Because I’m that guy who develops OLAP system. My DBs are very small compared to the mentioned above cases – mysqldump of the largest DB is only 550M. But queries upon data can run for seconds and even tens seconds (I even had one which ran 3 mins before optimization), as each query joins several tables, often 5-6-7 ones. And when the number of rows in the two biggest tables (which participate in almost every query) reached 300K and 50K, performance issues are visible. Data updates are performed once a day, and can be re-done if in the case of a crash entire DB has to be restored from a daily backup, so I almost perfectly live without InnoDB. So I think I should stick to MyISAM.
    I can guess that now, when you started developing XtraDB based on InnoDB, you will be almost entirely focused on InnoDB issues. It would be really great if you find some time to continue covering MyISAM as well.

    January 28, 2009 at 5:26 am
  • peter

    Roman,

    First I should note MySQL optimizer is often weak when handling complex queries which can be the problem unrelated to storage engines.

    Second – It is wrong to Assume MyISAM will be always faster for read only – there are many cases when it does not, in particular because it only can cache “row” data in OS cache, while Innodb can do this in the buffer pool. So in reality I would do the benchmark and decide…

    Regarding XtraDB – we’re not getting into the Product basics – we do XtraDB because we have a lot of customers with Innodb which are having issues with it and XtraDB is our effort to help them. As Maria, Falcon, PBXT will mature we will be helping customers with them and blogging about them actively too.

    MyISAM does not get too many posts because it is simple 🙂

    January 28, 2009 at 12:32 pm
  • Vladimir Rusinov

    Hi!

    I’ve translated it to russian: http://greenmice.info/ru/node/106

    HTH to any of my russian colleagues.

    March 14, 2009 at 11:59 am
  • Ed

    The deal breaker: InnoDB does not support AUTO_INCREMENT in positions other than the 1st.

    If you have an app that uses that MyISAM feature porting it to InnoDB can be a major headache.

    Example:

    #############################################
    # MyISAM

    CREATE TABLE MyISAM_animals (grp ENUM(‘fish’,’mammal’,’bird’) NOT NULL, id MEDIUMINT NOT NULL AUTO_INCREMENT, name CHAR(30) NOT NULL, PRIMARY KEY (grp,id))ENGINE=MyISAM;

    INSERT INTO MyISAM_animals (grp,name) VALUES (‘mammal’,’dog’),(‘mammal’,’cat’),(‘bird’,’penguin’),(‘fish’,’lax’),(‘mammal’,’whale’),(‘bird’,’ostrich’);

    SELECT * FROM MyISAM_animals ORDER BY grp,id;
    +——–+—-+———+
    | grp | id | name |
    +——–+—-+———+
    | fish | 1 | lax |
    | mammal | 1 | dog |
    | mammal | 2 | cat |
    | mammal | 3 | whale |
    | bird | 1 | penguin |
    | bird | 2 | ostrich |
    +——–+—-+———+

    #############################################
    # InnoDB

    CREATE TABLE InnoDB_animals (grp ENUM(‘fish’,’mammal’,’bird’) NOT NULL, id MEDIUMINT NOT NULL AUTO_INCREMENT, name CHAR(30) NOT NULL, PRIMARY KEY (grp,id))ENGINE=InnoDB;
    ### ERROR 1075 (42000): Incorrect table definition; there can be only one auto column and it must be defined as a key

    CREATE TABLE InnoDB_animals (grp ENUM(‘fish’,’mammal’,’bird’) NOT NULL, id MEDIUMINT NOT NULL AUTO_INCREMENT, name CHAR(30) NOT NULL, PRIMARY KEY (id,grp))ENGINE=InnoDB;

    INSERT INTO InnoDB_animals (grp,name) VALUES (‘mammal’,’dog’),(‘mammal’,’cat’),(‘bird’,’penguin’),(‘fish’,’lax’),(‘mammal’,’whale’),(‘bird’,’ostrich’);

    SELECT * FROM InnoDB_animals ORDER BY grp,id;

    +——–+—-+———+
    | grp | id | name |
    +——–+—-+———+
    | fish | 4 | lax |
    | mammal | 1 | dog |
    | mammal | 2 | cat |
    | mammal | 5 | whale |
    | bird | 3 | penguin |
    | bird | 6 | ostrich |
    +——–+—-+———+

    #############################################

    April 8, 2009 at 9:28 am
  • Shlomi Noach

    @Ed (31)

    Why would this be a deal breaker?

    April 8, 2009 at 8:50 pm
  • Tom

    To call MyISAM a database is a joke. To have something not crash safe and not supporting ACID transactions and call it “database”, you must have missed something in your education. Better compare it to cache products, and then the speed is not very impressive either. DRAM prices have changed the game in ways the DB industry have yet to fathom. DB devs are years behind in optimizing the sweet DRAM/DISK relation point.

    June 1, 2009 at 7:26 am
  • haram

    is there a way to do full text search using InnoDB

    October 18, 2009 at 10:43 am
  • MikeTrest

    Business Continuity / Disaster Recovery using InnoDB? We use an internal process based on mysqldump and innodb file