October 25, 2014

Recovering Innodb table Corruption

Assume you’re running MySQL with Innodb tables and you’ve got crappy hardware, driver bug, kernel bug, unlucky power failure or some rare MySQL bug and some pages in Innodb tablespace got corrupted. In such cases Innodb will typically print something like this:

InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 7.
InnoDB: You may have to recover from a backup.
080703 23:46:16 InnoDB: Page dump in ascii and hex (16384 bytes):
… A LOT OF HEX AND BINARY DATA…
080703 23:46:16 InnoDB: Page checksum 587461377, prior-to-4.0.14-form checksum 772331632
InnoDB: stored checksum 2287785129, prior-to-4.0.14-form stored checksum 772331632
InnoDB: Page lsn 24 1487506025, low 4 bytes of lsn at page end 1487506025
InnoDB: Page number (if stored to page already) 7,
InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 6353
InnoDB: Page may be an index page where index id is 0 25556
InnoDB: (index “PRIMARY” of table “test”.”test”)
InnoDB: Database page corruption on disk or a failed

and crash with assertion failure.
So what can you do to recover such a table ?

There are multiple things which can get corrupted and I will be looking in details on the simple one in this article – when page in clustered key index is corrupted. It is worse compared to having data corrupted in secondary indexes, in which case simple OPTIMIZE TABLE could be enough to rebuild it, but it is much better compared to table dictionary corruption when it may be much harder to recover the table.

In this example I actually went ahead and manually edited test.ibd file replacing few bytes so corruption is mild.

First I should note CHECK TABLE in INNODB is pretty useless. For my manually corrupted table I am getting:

First run is check table in normal operation mode – in which case Innodb simply crashes if there is checksum error (even if we’re running CHECK operation). In second case I’m running with innodb_force_recovery=1 and as you can see even though I get the message in the log file about checksum failing CHECK TABLE says table is OK. This means You Can’t Trust CHECK TABLE in Innodb to be sure your tables are good.

In this simple corruption was only in the data portion of pages so once you started Innodb with innodb_force_recovery=1 you can do the following:

Now you got all your data in MyISAM table so all you have to do is to drop old table and convert new table back to Innodb after restarting without innodb_force_recovery option. You can also rename the old table in case you will need to look into it more later. Another alternative is to dump table with MySQLDump and load it back. It is all pretty much the same stuff. I’m using MyISAM table for the reason you’ll see later.

You may think why do not you simply rebuild table by using OPTIMIZE TABLE ? This is because Running in innodb_force_recovery mode Innodb becomes read only for data operations and so you can’t insert or delete any data (though you can create or drop Innodb tables):

That was easy, right ?

I also thought so, so I went ahead and edited test.ibd a little more wiping one of the page headers completely. Now CHECK TABLE would crash even with innodb_force_recovery=1

080704 0:22:53 InnoDB: Assertion failure in thread 1158060352 in file btr/btr0btr.c line 3235
InnoDB: Failing assertion: page_get_n_recs(page) > 0 || (level == 0 && page_get_page_no(page) == dict_index_get_page(index))
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even

If you get such assertion failures most likely higher innodb_force_recovery values would not help you – they are helpful in case there is corruption in various system areas but they can’t really change anything in a way Innodb processes page data.

The next comes trial and error approach:

You may think will will scan the table until first corrupted row and get result in MyISAM table ? Unfortunately test2 ended up to be empty after the run. At the same time I saw some data could be selected. The problem is there is some buffering taking place and as MySQL crashes it does not store all data it could recover to MyISAM table.

Using series of queries with LIMIT can be handly if you recover manually:

As you can see I can get rows from the table in the new one until we finally touch the row which crashes MySQL. In this case we can expect this is the row between 200 and 300 and we can do bunch of similar statements to find exact number doing “binary search”

Note even if you do not use MyISAM table but fetch data to the script instead make sure to use LIMIT or PK Rangers when MySQL crashes you will not get all data in the network packet you potentially could get due to buffering.

So now we found there is corrupted data in the table and we need to somehow skip over it. To do it we would need to find max PK which could be recovered and try some higher values

So we tried to skip 30 rows and it was too little while skipping 80 rows was OK. Again using binary search you can find out how many rows do you need to skip exactly to recover as much data as possible. Row size can be good help to you. In this case we have about 280 bytes per row so we get about 50 rows per page so not a big surprise 30 rows was not enough – typically if page directory is corrupted you would need to skip at least whole page. If page is corrupted at higher level in BTREE you may need to skip a lot of pages (whole subtree) to use this recovery method.

It is also well possible you will need to skip over few bad pages rather than one as in this example.

Another hint – you may want to CHECK your MyISAM table you use for recovery after MySQL crashes to make sure indexes are not corrupted.

So we looked at how to get your data back from simple Innodb Table Corruption. In more complex cases you may need to use higher innodb_force_recovery modes to block purging activity, insert buffer merge or recovery from transactional logs all together. Though the lower recovery mode you can run your recovery process with better data you’re likely to get.

In some cases such as if data dictionary or “root page” for clustered index is corrupted this method will not work well – in this case you may wish to use Innodb Recovery Toolkit which is also helpful in cases you’ve want to recover deleted rows or dropped table.

I should also mention at Percona we offer assistance in MySQL Recovery, including recovery from Innodb corruptions and deleted data.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Farshad says:

    Hey , thanks but I have a serious problem in recovering some innodb mysql tables

    I discussed it here :

    http://stackoverflow.com/questions/6224158

    Problem is that I just have .frm file of a table and no backup

    Please help me , this is very important to us

  2. Rahul Rajpal says:

    I also find a good link that discusses all the corruption scenarios for bot MyISAM and InnoDB files. It also has explanations to repair the corrupted files. The link is mentioned below:
    http://www.datarecovery-info.com/database-recovery/mysql-recovery.html

  3. One more detail to mention: the bug that prevents innodb_force_recovery from working right. This bit me a couple weeks ago recovering data for a client: I set innodb_force_recovery=4 and suddenly all tables had 0 rows. More info: http://www.mysqlperformanceblog.com/2007/05/24/innodb-recovery-update-the-tricks-what-failed/ and http://bugs.mysql.com/bug.php?id=28604

  4. roy says:

    Will it be easiler to recover, if same error happen on MyISAM table?

  5. peter says:

    Roy,

    With MYISAM you can just do REPAIR TABLE most of the time. It is also isolated in few files so you never have to deal with dictionary corruption etc.

  6. James says:

    Any thoughts on what to do if the InnoDB log gets corrupted?

    Story behind the question: MySQL hit a bad sector during a write to one of its ib_logfiles. MySQL outright crashed (there was something in the log about it purposefully generating a segmentation fault so it would have a complete core dump; definitely not a confidence instilling feature of production release software).

    Long story short, we could start MySQL back up with innodb_force_recovery=6 so it didn’t try touching the logs, but we weren’t able to copy enough data to make a difference out of the table (the table has about half a billion records in it). We ended up recovering from a month-old backup and binary logs, but that was a hassle and took a few weeks (it’s not a mission critical application, so it wasn’t a big deal).

  7. peter says:

    If logs are corrupted you either can do dump and restore or if you have a backup point in time recovery is often better approach.

    You just should have backups setup in a way you can get point in time recovery within time desired.

  8. Funny, we’ve had a DB box acting up (InnoDB complaining about page corruption) for several months. It’s never the same page. The files on disk appear to be OK, it’s like they get corrupted in memory.

    I’ve dumped/reloaded the DBs, and swapped nearly everything in the box. The last remaining pieces are the motherboard and the processors.

    Yuck:(

  9. How about your RAID controller?

    Search Google for what Jeremy Cole has to say about LSI cards and risers — there might be something relevant there. Just a vague memory.

  10. peter says:

    Justin,

    The memory subsystem (which included mainboard) also can be the problem.
    In many cases with random Innodb corruptions doing memory diagnostic with memtest86 or similar tools pointed the problems.

  11. Baron,
    I did swap the card (a 3ware SATA controller), but there is a riser involved. We have ~10 other systems with an identical setup, and this is the only one with issues.

    Peter,
    I suspected ram at first, so I ran memtest86 for 24hrs with no failures. I think I have one ram permutation left to try before I swap the motherboard.

    Sadly, this problem takes over a week to appear in production, so it takes a while to narrow down:(

    thanks!

  12. peter says:

    Justin,

    If it is just one box you may just better off moving stuff to other box – in many case simply calling box “buggy” and not using it for critical stuff is more cost effective than spending hundred of hours to figure out what exactly is the issue.

    Also I should note if you have problem happening during the week in average you can’t be sure box is “good” by running just 24h worth of tests.

  13. Peter,
    Yeah, it’s just a slave for reporting, so it’s not a big deal when it acts up….

  14. peter says:

    Oh well,

    Then you can try more rounds replacing hardware :)

  15. The trick is to make sure you time the hardware work when it’s REALLY hot in the office. The data center is MUCH cooler:)

  16. Toby says:

    Did you file a bug on the CHECK TABLE deficiency?

  17. peter says:

    Toby,

    I ended writing this post well after midnight so I forgot to post the bug. Posted it now – thanks.
    http://bugs.mysql.com/bug.php?id=38078

  18. Chris says:

    Unfortunately, this didn’t help me at all recover from Innodb corruption. My db corrupted so badly, I can’t even connect to the server to run these commands.

  19. v2k says:

    Thanks for this guide; I was able to write a script to pull out the readable bits of a corrupted database.

  20. rey says:

    Thanks. You just saved my life!! I created a copy of a corrupt table using the MyISAM engine and populated it with all data from the old InnoDB table. It worked perfectly, with no problems. Big thanks again!!

  21. aashish says:

    i deleted 600000 records from db but ibdata1 size is still same ,how can i retrieve that unused memory space ? optimization is not working and it also does not support repair

  22. Erickson says:

    Thanks a lot. It resolves a serious problem i was getting in my server. Only here I found the solution. Hugs

  23. Nitin says:

    Hi Peter,
    I am facing an issue in mysql .
    I am getting this “ERROR 1467 (HY000): Failed to read auto-increment value from storage engine”.
    When i see auto-increment value using show table status it shows 0.also my table column type is Id int(3) NOT NULL AUTO_INCREMENT,.
    Please tell me how can i fix this issue so i couldn’t come in future and what caould be the reason of this error in my case.

  24. Jo Valerio says:

    Hi Nitin,

    Try this solution:

    ALTER TABLE table_name AUTO_INCREMENT =1

    Hope it help.

  25. Mitchell says:

    FT

  26. Hi!

    Why do you think, that crashed mysql can start? My mysql don’t start. How do anything? I don’t understand, all howtos assumes the running mysql. In this case i have no test.database (5.1) i have frm files, ibdata, logs. If i try to start mysql that drop lot of nulls on screen and a goodbye.

    131021 23:30:46 InnoDB: Initializing buffer pool, size = 16.0M
    131021 23:30:46 InnoDB: Completed initialization of buffer pool
    InnoDB: The log sequence number in ibdata files does not match
    InnoDB: the log sequence number in the ib_logfiles!
    131021 23:30:46 InnoDB: Database was not shut down normally!
    InnoDB: Starting crash recovery.
    InnoDB: Reading tablespace information from the .ibd files…
    InnoDB: Restoring possible half-written data pages from the doublewrite
    InnoDB: buffer…
    InnoDB: Database page corruption on disk or a failed
    InnoDB: file read of page 7.
    InnoDB: You may have to recover from a backup.
    131021 23:30:46 InnoDB: Page dump in ascii and hex (16384 bytes):
    len 16384; hex 0000000000
    [...]
    I* o /;InnoDB: End of page dump
    131021 23:30:47 InnoDB: Page checksum 1575996416, prior-to-4.0.14-form checksum 1371122432
    InnoDB: stored checksum 0, prior-to-4.0.14-form stored checksum 1227526798
    InnoDB: Page lsn 0 0, low 4 bytes of lsn at page end 1878327599
    InnoDB: Page number (if stored to page already) 0,
    InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 0
    InnoDB: Page may be a freshly allocated page
    InnoDB: Database page corruption on disk or a failed
    InnoDB: file read of page 7.
    InnoDB: You may have to recover from a backup.
    InnoDB: It is also possible that your operating
    InnoDB: system has corrupted its own file cache
    InnoDB: and rebooting your computer removes the
    InnoDB: error.
    InnoDB: If the corrupt page is an index page
    InnoDB: you can also try to fix the corruption
    InnoDB: by dumping, dropping, and reimporting
    InnoDB: the corrupt table. You can use CHECK
    InnoDB: TABLE to scan your table for corruption.
    InnoDB: See also http://dev.mysql.com/doc/refman/5.1/en/forcing-innodb-recovery.html
    InnoDB: about forcing recovery.
    InnoDB: Ending processing because of a corrupt database page.

    The force recovery crashed the sql server.

    innodb is sucks

  27. Santa Firr says:

    If you want recover your deleted MySQL database then your any third party MySQL recovery software to repair and recover all over MySQL database, The software design on latest technologies, which support all kind version of MS windows operating system. Visit this video and follow the steps : http://www.youtube.com/watch?v=a7npwYA-egI

Speak Your Mind

*