EmergencyEMERGENCY? Get 24/7 Help Now!

MySQL 8.0: The end of MyISAM

 | October 11, 2016 |  Posted In: MySQL

PREVIOUS POST
NEXT POST

MyISAMThis blog discusses the gradual end of MyISAM in MySQL.

The story that started 20 years ago is coming to its end. I’m talking about the old MyISAM storage engine that was the only storage provided by MySQL in 1995, and was available in MySQL for 20+ years. Actually, part of my job as a MySQL consultant for 10+ years was to discover MyISAM tables and advise customers how to convert those to InnoDB.

(Check your MySQL installation, you may still have MyISAM tables).

MySQL 5.7 still used MyISAM storage for the system tables in the MySQL schema.

In MySQL 8.0 (DMR version as of writing), the MyISAM storage engine is still available. But in a very limited scope:

  • After introducing the new data dictionary, the MyISAM tables are gone from the system schema (“mysql” db).
  • Working with MyISAM is harder now (and discouraged): you can’t just copy MyISAM tables into a running MySQL server, they will not be discovered (unlike InnoDB, where you can use “ALTER TABLE … IMPORT TABLESPACE”)
  • However, you can create a table engine=MyISAM, and it will work as before

InnoDB implemented all the older, missing features:

Feature MyISAM InnoDB
Full Text Indexes yes Since MySQL 5.6
Portable tables (tablespaces) yes Since MySQL 5.6
Spatial Indexes/RTREE (GIS) yes Since MySQL 5.7
Last update for table yes Since MySQL 5.7

(http://dev.mysql.com/worklog/task/?id=6658)

Suitable for temp tables yes Since MySQL 5.7

Also complex selects uses InnoDB ondisk temp tables

Faster count(*) yes *Faster in MySQL 5.7 but does not store counter

 

So the only MyISAM advantages left are:

  1. Tables will be smaller on disk compared to uncompressed InnoDB tables.
  2. The count(*) is still much faster in MyISAM:

I would not use MyISAM unless there is a specific case, and for well-known reasons (MyISAM are non-transactional, table level locks, with no crash recovery, etc.)

My colleague Laurynas Biveinis also suggested converting MyISAM to an optional storage engine plugin.

PREVIOUS POST
NEXT POST
Alexander Rubin

Alexander joined Percona in 2013. Alexander worked with MySQL since 2000 as DBA and Application Developer. Before joining Percona he was doing MySQL consulting as a principal consultant for over 7 years (started with MySQL AB in 2006, then Sun Microsystems and then Oracle). He helped many customers design large, scalable and highly available MySQL systems and optimize MySQL performance. Alexander also helped customers design Big Data stores with Apache Hadoop and related technologies.

46 Comments

  • MyISAM still does WAY better on the hot row problem, such as an increment of the same value over and over, as in a sequence. I did a significant number of tests on this, when I worked at Percona. They are probably still on the internal “experts” mailing list (they would be replies to Jay). I’d like to see some way to get InnoDB perform like MyISAM in such scenarios, as MyISAM exceeds the performance of InnoDB significantly in this particular use case.

    • Sam Lambert has previously written about a workaround for this problem:
      http://samlambert.com/posts/mysql-slotted-counter

      Depends on the use case of course how viable it is – but since we are all moving in the direction of semi-sync and GR, hot rows just becomes a hard problem due to Callaghan’s Law.

      • This is one reason why autonomous/anonymous transactions would be useful (retrying a one row trx is easy to do compared to a general group commit conflict). Regardless, there are data stores that are more suited to counters and such. MySQL is not always the best tool for the job.

  • Merge table still missing.
    Have a case where some collectors have a table that are updated. Every 5 min. the tables are moved to to a central machine and merge table gives the full view.
    The rows are (relative) stable(2M+) There are 20M+ updates/hour.
    I am considering to centralize DB(innodb or maybe NDB) But a perfomance test have to be done!

  • How terribly depressing that the fast, small, efficient MyISAM engine is going away, forcing us to replace it with its bloated successor, InnoDB, which eats memory and disk resources and belches out complex, hard-to-manage, hard to backup, hard to restore, hard to copy, hard to move, databases. We have maybe 4000 run-years (400 MySQL instances running for the past 10 years) on MyISAM and we’ve been really happy with it. The great FUD factor that sold InnoDB to the world–the fear of data corruption–never materialized. InnoDB is going to force us to buy five times as much hardware for basically the same level of performance for our particular application. How very exciting.

      • Lol. It’s medical data, and is being reviewed every day by doctors and nurses, often going all the way back to a patient’s first visit years ago. It’s not corrupted. Further, we do occasional mysqlchecks just to be 100% sure. They never show any corruption. Thanks for your input.

        • LOL.. you don’t have checksums. use a real storage engine. Listen to a person that has over 20 years of experience. Or not, your loss. your customers, your data.

          Just let me know where you work, so I never step through your doors.

          • if a bit changes in a value, your precious mysqlcheck won’t find any problems. A 1 can turn into a 2 and you will have no idea. Then what happens when a medical mistake works. SHAME ON YOUR FOR PUTTING IMPORTANT DATA IN MYISAM. SHAME!

          • My ‘lol’ apparently offended you, which was not my intention. MyISAM does have checksums. Just create the tables with CHECKSUM=1 and a live checksum is maintained as the table changes.

          • and what does that give you? Do you compare that checksum to a copy of InnoDB data to make sure it has the same checksum at the same binlog position? Otherwise it is useless, no more useful than the count of the table to make sure data is OK.

          • I’m not following your last comment. As I said, we use MyISAM. So I’m not sure how your question, “Do you compare that checksum to a copy of InnoDB data to make sure it has the same checksum at the same binlog position?” applies. Also, can you explain the purpose of MyISAM’s running-checksum functionality if it truly “does nothing for you” as you stated?

          • Also, I’d be curious to know what you think about MyRocks. It seems as if it could be a welcome alternative to the bloated, complex, hard-to-manage, grisly nightmare called InnoDB.

  • The storage engine API has always been hard, now it is harder and there will be less storage engine diversity. I hope that MyRocks is able to keep up with the API to keep upstream from being an InnoDB-only world.

  • Let me guess: only few inserts and few updates.
    And MyISAM is only performant for only one session.
    If you do not change the data afterwards, MyISAM is nice.

    BUT:
    – no ACID
    – MyISAM is for high performance not suitable.
    => not professional

      • Justin – you can make your point without insulting Alan. I have been a fan of MyISAM the past few years. Maybe I root for the underdog. I was happy to learn that it worked for Alan.

        • There wasn’t anything insulting here. If you use MyISAM, there are no checksums. You must validate your data to the source of truth if you want your warehouse to return true answers. If a 1 turns into a 2, how else will you know?

          • Lesson learned at Yahoo – ask Jeremy Cole about those SuperMicro risers that corrupted data because the PCI bus has no checksums. We never would have known if we were using MyISAM. Hardware CAN NOT BE TRUSTED.

          • MySQL was weak in the past about validating input (ask a Postgres fan). Postgres didn’t have page checksums for a long time. Yet we put important data in both and both eventually got better. Regardless, you don’t need to go ALL CAPS to make your point.

          • Oh, well, that is the truth. Would you put important medical records in MyISAM. For the love of God I hope not.

            Turn down durability at the session level if you want. XtraDB supports that.

          • You used a DBMS with weak or no support for data validation. Doesn’t matter much whether data was corrupted by disk or an application, the result is still corruption. Do I get to write “shame on you” now?

          • Corruption means you can’t read the data you wrote. Writing an application that writes bad data isn’t my problem, and using a product that doesn’t prevent it isn’t my problem either. This is 2016. Don’t keep the source of truth in MyISAM. It is gross negligence.

          • I was a god damned Oracle DBA that was forced to use MySQL. I never used MyISAM, never suggested anyone use MyISAM, and as a consultant I converted dozens of customers from MyISAM to InnoDB. In fact, ironically, an old boss from AdBrite just wrote me an email saying he wanted help converting data to MyISAM. Do you think I said YES? No.

  • There is no change for table-based logs (general and slow log. Docs at http://dev.mysql.com/doc/refman/8.0/en/log-destinations.html say:

    “By default, the log tables use the CSV storage engine that writes data in comma-separated values format. For users who have access to the .CSV files that contain log table data, the files are easy to import into other programs such as spreadsheets that can process CSV input.

    The log tables can be altered to use the MyISAM storage engine. You cannot use ALTER TABLE to alter a log table that is in use. The log must be disabled first. No engines other than CSV or MyISAM are legal for the log tables.”

  • When I was benchmarking myISAM vs. InnoDB vs. Mongo on 10M records with aggregations (GROUP BY, HAVING, WHERE, COUNT, SUM – but no joins) the results were surprisingly excellent for myISAM. myISAM responded in average by 3-6 secs for any complex queries, while InnoDB took in avg. 60 – 90 secs. Same for mongo (of course, NOSQL cannot do performative aggregations).

    This is just to point out, that there are scenarios where performance matters more than ACID. Also data integrity is no question for myISAM unless HW comes out of resources. We have been using myISAM for 10 years with 20 GB tables in memory – not a problem. Also many performant storage systems (Cassandra, Aerospike, ..) allow choosing strategy between data integrity and performance. So myISAM should not be thrown away, it still has it’s good purpose in real world scenarios.

    • Careful. There’s at least one guy in this thread who may claw out your eyes if you say “MyISAM” and “production” in the same paragraph. 😉 But I’m with you.

  • I am largely with Eric here (sorry Justin). MyISAM tables are a great tool in the hands of a DBA that understands their strengths and weaknesses. MyISAM tables sure do carry their risks, but so do InnoDB. When I worked at Percona, I saw a good number of InnoDB data recovery cases – many of which unsuccessful, and no MyISAM data recoveries. I do no think it was due only to the customer use ratio of about 80% InnoDB to 20% MyISAM. When MyISAM table gets corrupted, usually REPAIR TABLE recovers it losing maybe a row or two. Even in the case of really bad corruption, a lot of times REPAIR TABLE … USE_FRM does the trick. So a MyISAM user usually will not pay $450/hour to a third-party to do the above as they can handle it themselves. With InnoDB if you hit a bug or one really not so critical byte is wrong due to hardware or OS malfunction, the entire storage engine will not start and you are now studying and patching the source and viewing the tablespace in a hex editor to see if you can get around it. This requires some expertise that goes quite a bit beyond just searching Google.

    There is a dark side to the “all-or-nothing” part of the ACID paradigm – if it cannot give you all, it will give you nothing, so if some piece of data that you can really live without gets corrupted, you lose the rest of it as well.

    With a little bit of legwork, the performance limitations of MyISAM resulting out of the lock contention can be coded around. Keep your tables small, partition the data across multiple tables, write some logic to aggregate. Good software design principles on the client end will make it transparent to most of the client developers. After all, well-written client should have an abstracted data layer (which could be written by someone who understands MyISAM) and nothing except the internals of that layer should be sending queries directly to the server. Keep your queries fast – validate all queries with EXPLAIN before you make them live – make this part of your code review policy. These principles are good (though rarely followed in practice) for any storage engine, for that matter.

    Key cache lock contention can be alleviated with per-table key caches.

    MyISAM is an excellent storage engine in a data warehousing application, or anything mostly read when it is not critical to lose occasional writes.

    You could also make an argument for MyISAM even when losing individual writes has a high cost. For those high-cost writes you could have some kind of a redundant source – e.g your own “transaction” log – in essence emulating a transaction. More work than what most people would be willing to do, but I would argue that in the end it has potential for a more stable system. The system will still work significantly less than in the case of InnoDB where the extra work is done by default and has to happen on the main database server while the do-it-yourself transaction logging can happen anywhere. My argument that any time you make your system do more than what it absolutely needs you introduce a potential point of failure – regardless of how much it buys you in theoretical terms. The harder it works, the more likely it is to experience some kind of failure on the hardware/OS level, or hit excessively high resource usage that will make the system practically unusable. So even if there is nothing logically or physically wrong, you could be losing records simply because your requests time out. In theory, you have not lost anything because the query was never issued. In practice, you have lost a lot of money and customer good will on the system not being available. Therefore practical availability and performance can be significantly better even though theoretically the system appears more vulnerable to failure. I think that is what Eric has observed through the many years of his experience with MyISAM.

    If somebody wants to throw some rotten eggs and tomatoes at me for suggesting there is a decent use case for MyISAM, I am presenting on the subject at OpenWest in Sandy, Utah on July 12th at 11:30 am 🙂

Leave a Reply to sbester Cancel reply