September 20, 2014

MySQL Crash Recovery

MySQL is known for its stability but as any other application it has bugs so it may crash sometime. Also operation system may be flawed, hardware has problems or simply power can go down which all mean similar things – MySQL Shutdown is unexpected and there could be various inconsistences. And this is not only problem as we’ll see.

MySQL has angel process mysqld_safe which will restart MySQL Server in most cases. It is great, unless you have run into some bug which causes it to crash again – such crashes qucikly following one another are kind of worse because they explore many less tested code paths in MySQL and so problem potential is larger.

So lets look at the problem which happen during the crash which might need to take care of or which may seriously affect MySQL Performance.

MyISAM Corruption – If you’re writting to MyISAM tables there is very large chance of them becoming corrupted during the crash. Note corruption may be hidden and do not expose itself instantly – you may notice wrong query results days after crash. Sometimes corrupted tables may be reason for further crashes or hangs, and corruption may spread itself further in the table. You probably do not want any of these so it is very good idea to run MySQL with myisam_recover option which will make sure all improperly closed MyISAM tables are checked first time it is accessed. This option is however rather painful to use with web applications – users may issue different queries which may trigger check/repair running for many tables at onces, which typically make system extremely slow and also can use up all allowed connections or run out of memory ( myisam_sort_buffer_size is normally set pretty lage). If this becomes the problem I use tiny script which moves out all MyISAM tables out of MySQL database directory, checks them with MyISAMchk and moves them back to running server. This looks scary but it works great – until table is checked and ready application gets error rather than stalling forever which allows application to become partially functional as soon as possible. This hack is needed only in some cases – in most cases using Innodb for tables which you need to be recovered fast is better solution.

Innodb Recovery – Unless you have some hardware problems (99%) or found new Innodb bug (1%) Innodb recovery should be automatic and bring your database to consistent state. Depending on innodb_flush_lot_at_trx_commit setting you may lose few last committed transactions but it is it. It is Performance of this process which may cause the problems. As I already wrote innodb_log_file_size and innodb_buffer_pool_size affect recovery time significantly as well as your workload. I should also mention if you have innodb_file_per_table=1 your recovery speed will depend on number of Innodb tables you have, as well as many other operations, so beware.

Binary log corruption – Binary log may become corrupted and out of sync with database content. This will sometimes break replication but if you’re just planning on using binary log for point in time recovery it can go unnoticed. sync_binlog Is helping by syncing binary log, but at performance penalty. If using Innodb you also might with to use innodb-safe-binlog option in MySQL 4.1 so your Innodb log and binary log are synchronized. In MySQL 5.0 XA is taking care of this synchronization.

.frm Corruption – Few people know MySQL is not really ACID even with Innodb tables, at least not for DDL statements.
There is a chance of failing for example during CREATE statement with table created in Innodb dictionary but .frm not created or not completely written. Partially written .frm files or .frm being unsync with internal Innodb dictionary may cause MySQL to fail with wierd error messages. In MySQL 4.1 sync_frm option was added which reduces this problem as time window when it can happen is much less. Still if failure happens just during writting .frm file nasty things may happen, not to mention such potentially multiple operation DDL statements as RENAME TABLE – these are most vulnerable.

master.info corruption – If slave happens to crash you can also have relay logs corruption and master.info being corrupted. Not to mention MyISAM tables can contain partially completed statements as well as some of updates totally lost. The safe approach it to reclone the slaves if they crash or you can take the risks and try to continue. Sometimes you might be able to manually find appropriate position even if master.info file is out of sync but I would not be basing my failure handling scenarios.

Cold Start – If you restart MySQL server its caches (key_buffer, innodb_buffer_pool, query_cache,table_cache) are cleaned, so may be OS caches. This may reduce performance dramatically. So if you’re bringing server back after crash you might want to populate caches. For MyISAM key_cache this can be done by using LOAD INDEX INTO CACHE statement, for other storage engines it can be done by issuing large index scan queries. Full table scan queries allow to preload table data ether in storage engine caches or in OS cache. You can save these into .sql file and use –init-file to make sure it is run on startup. The other approach is to prime server with real servers (ie clone queries from other slave) before putting traffic to it.
In case application is not highly available so there is only one server you might with to start serving only some users initially (returning error to others) and gradually increase the load as server warms up. This may sound strange but makes a lot of sense as not only waiting for pages which never load is more frustrating for users than getting honest “try again later” message, but also – warmup takes longer time on extreme load.

Innodb statistics – Unlike MyISAM Innodb does not store index cardinality in tables, instead it computes them on first table access after startup. This may take significant time if you have very large number of tables (Some users have hundreds of thousands of tables per database host). This one is pretty much part of cold start problems but I wanted to point out it separately. To warmup this data you might run select 1 from _table_ limit 1 for each table or any other statement – it is table open which is important.

There are other problems which you may experience related to MySQL Crash Recovery – Restoring data from backup, corrupted Innodb tablespace recovery etc but I should write about them some other time.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. If you want to automate the select 1 from _table_ limit 1 instead of executing it once per table.
    In 5.0 you can just do “SELECT COUNT(*) FROM INFORMATION_SCHEMA.TABLES;
    This will open all tables once for you and update InnodB statistics in the process.

  2. peter says:

    Good idea Tobias,

    Having script ready you might not think about automating it further using new version features :)

  3. sawara says:

    In “Binary log corruption”, you say “In MySQL 5.0 XA is taking care of this synchronization”. If you know any document or URL about synchronization between binary log and InnoDB log w/ innodb_support_xa, would you like to tell me?

    Does MySQL AB Document refer to this synchronization?

  4. peter says:

    Sawara,

    I’ve looked into the docs and I can’t find much about it. I just know binary log file is used as one of transaction resources so it is synchronized with other transactional storage engines. You need sync_binlog of course to make it really work.

  5. mortenb says:

    Hi Is there any way to recreate a create statement from a table.frm file?, when the table.MYI has become corrupted.
    And it is so big, running analyze table or myisamchk takes days? table.MYD is 160GB and table.MYI is 80GB?

    Any input greatly appreciated (mysql 4.0.27)

    Thanks

  6. peter says:

    I’d place empty .MYI and .MYD file for this table and use REPAIR TABLE .. USE_FRM; it should recreate .MYI file based on .FRM information.

  7. Wart says:

    Hi,
    We have a dedicated mysql box with multiple mixed myisam innodb databases on them.
    Sometimes the server crashes for any particular reason and mysql starts to repair itself.

    so far so good

    however, when we look at our server load graphs we see that the average load has gone up quite allot since the crash.

    do you have any idea why this happens?

  8. Michal Aichinger says:

    Hi,
    you wrote: “Innodb recovery should be automatic and bring your database to consistent state.” But it is not true in some cases. It happened that we had no space on data disk last week and all innodb tables were corupted and MySQL did not repair it.

  9. peter says:

    Michal,

    What error message did you have ?

    Recovery could fail simply because there is no space on device in this case it should succeed when you clean things up.

    Of course it also could be some kind of bug… but this is important difference. MyISAM is expected to be broken in case of crash and Innodb is expected to fix itself and if it does not something is wrong :)

  10. MSK says:

    Hi,

    We are facing this error frequently we our db is growing day by day. We are unable to recover the innodb that is being corrupted because of no disc space. Can you help me with the steps to recreate the innodb? Following are the output in mysql.err file.

    InnoDB: Apply batch completed
    InnoDB: Starting rollback of uncommitted transactions
    InnoDB: Rolling back trx with id 0 7197457, 11 rows to undoInnoDB: Error: trying to access page number 2395455360 in space 0,
    InnoDB: space name .\ibdata1,
    InnoDB: which is outside the tablespace bounds.
    InnoDB: Byte offset 0, len 16384, i/o type 10
    070807 16:51:15InnoDB: Assertion failure in thread 256 in file .\fil\fil0fil.c line 3853
    InnoDB: We intentionally generate a memory trap.
    InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
    InnoDB: If you get repeated assertion failures or crashes, even
    InnoDB: immediately after the mysqld startup, there may be
    InnoDB: corruption in the InnoDB tablespace. Please refer to
    InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
    InnoDB: about forcing recovery.
    070807 16:51:15 [ERROR] ..\mysql\bin\mysqld-nt: Got signal 11. Aborting!

    thanks,
    MSK

  11. gigiduru says:

    It feels funny to read in the same page about MySQL’s stability and then about all those table/engine corruption cases.
    No later then today, I was sooo close to have to deal with huge MyISAM tables corruption. I tried to create table like old_table, without noticing that the original table was locked out by someone else’s query – so my ddl completely hang. I waited like 5 secs, then I pressed Ctrl+C. Now, in 4.0.26 it’s a bad idea to do this. The MySQL server went away faster then I was able to blink. My huge luck was that no one else was running any inserts/updates/deletes (all the tables being myisam, that would have been a total disaster).
    I didn’t quite hear so often about table corruption coming from the RDBMS itself, without external causes, at Oracle, Postgresql or SQL Server. It might be just me.
    Peter, you might to steer your company towards Oracle or Postgresql consulting services. I bet you’ll find it more rewarding and challenging/less frustrating. From my part, MySQL can go down or I myself being generous, let them be bought by Oracle. It’ll teach them how to have a clear vision and how to implement right.

  12. peter says:

    gigiduru,

    If MySQL crashed in your case that is a but and it does not normally happen.

    Regarding MyISAM locking and corruption it is a choice – if you want want table locks and repair after server crash you go with MyISAM if you want transactions and multi versioning you go with Innodb.

  13. dboyr says:

    peter

    how can you say “it does not normally happen” if a little thing like CTR+C puts mysql server down? if I have a power loss in the middle of some huge transactions, what then?
    Should I understand that I couldn’t trust that my data will not be safe in a mysql database?
    In oracle I have a lot of fail safe mechanisms in place. so it’s very difficult to lose data. With oracle I don’t have to worry that a user stops his session. the integrity of the database can not be questioned. It remains intact. I DON’T LOSE DATA!!!

  14. peter says:

    dboyr,

    I never have seen CTRL-C on the user console to crash MySQL Server this is what I mean. So this is not by design for sure.

    MyISAM tables – yes these are not designed to be crash save. They WILL most likely become corrupted if MySQL Server crashes.

    However you do not have to use MyISAM tables. Innodb tables do not have this problem.

  15. gigiduru says:

    So basically, you’re saying that if I want to choose MyISAM speeed AND Innodb reliability, MySQL is not a choice…
    That’s what I wanted to hear.
    It’s all about vision. “MyISAM tables – yes these are not designed to be crash safe.” – this is a total complete flop. If MySQL would have stayed in the realm of test/toy/web-serving realm, I wouldn’t mind to deal with it. But touting and beating the drum that MySQL is enterprise ready… that’s dangerous. You must be aware that there are companies out there that are doing even financial transactions, serving the clients from the myisam tables. I can hear the clock ticking.
    And at the philosophical level, I’m starting to believe that the majority of people choosing MySQL as their preferred database are prone to choose the path of minimal resistance – masking it under words like “efficiency”, “simplicity is genius” (see the so much touted ease of use) – without giving a second thinking what’s expecting them down on the road – THE minefield. Once they are in, they will defend their poor choice until death do them apart. But after all, it’s JUST a poor choice, which can be corrected.
    And if you’re giving me a response that it’s all about making things better, improving and building on something that others already built, please… choose something more worthy. Postgresql it’s MUCH more worthy in this regard. I know enough about MySQL but few things about PostgreSQL so it’s kind of logic for me to reach this conclusion.

  16. gigiduru says:

    Just a little follow up, just take a look at these two web pages, and see if you notice something unusual:
    http://www.mysql.com/why-mysql/quality/
    vs
    http://bugs.mysql.com/bug.php?id=30234

  17. peter says:

    gigiduru,

    This is again wrong comparison. Innodb can mean both speed AND safety in many cases. Though MyISAM, MEMORY, ARCHIVE storage engines all can be faster for some workloads.

    Indeed you’re right many people have heard MySQL has transactions but do not bother to check it is in case you’re using Innodb tables.

    I never said MySQL is for everyone. But this blog is for people trying to get most out of their MySQL install as well as know how to avoid the issues.

  18. pablo says:

    Hi, can someone post the script which moves out all MyISAM tables out of MySQL database directory, checks them with MyISAMchk and moves them back to running server? Thanks in advance..

  19. SILENT HACKER says:

    can any one tel me how to maintain your own log files for mysql?

  20. Kleyber says:

    Hi,

    I have a big problem here: We have noticed that our server was hanging up without any reason, apparently. So, we have made a backup (using MySQLDump) and we had to reinstall everything (Linux server, Apache, MySQL) and when we tried to restore the .SQL file, we noticed that in a certain point of the file, there is some strange data, but the file is very big (490 MB) to be opened for any editor. My question is: How to recover this .SQL file? Is there any editor which opens that file?

    Thanks in advance,

  21. Hi Peter,

    I am a technical writer, currently got a project on MySQL database. Your blog posts are very helpful for me.

  22. Harshal says:

    Hey Peter,

    I was getting an error

    Mysql service failed
    could not start mysql daemon …

    Mysql engine is : InnoDB .. I guess its corrupted ?

    could you please help me… how to recover or set mysql in recovery mode..

    Your help will be appreciated

    Thank you.

    Regards,
    Harshal

  23. Hello Harshal,

    You can try Innodb force recovery to recover MySQL data. Just add “innodb_force_recovery = 4″ line before restarting the server. You can choose number according to your need. For more details visit here: http://dev.mysql.com/doc/refman/5.0/en/forcing-innodb-recovery.html

  24. Harshal says:

    yeah addision,

    It worked..Thanks :)

  25. It is pleasure to hear that you have successfully recover your data.

Speak Your Mind

*