September 1, 2014

Innodb Recovery Update – The tricks what failed.

As I wrote we had been recovering corrupted Innodb Tablespace and it is finally done now.

As this was over than 1TB worth of data we really tried to avoid dumping the data and find some other way to recovery.
Examining Innodb page content and crash information we figured out it should be page of insert buffer itself, rather than page belonging to some of the pages touched by insert buffer.

If that would be just the table it should have been easy – dropping the table with corrupted page would remove all appropriate insert buffer records and be able just to load that table back. Unfortunately we were not that lucky.

If the page corrupted in insert buffer itself there seems to be no way to make Innodb avoid touching it. Even when we dropped all Innodb tables (for test purposes) Innodb still was crashing if we attempted to enable insert buffer.

I think it would be nice for Innodb to have some additional recovery options for example for skipping over insert buffer or undo buffer. True this would leave some tables corrupted but at least you could get by by dumping in reimporting couple of tables rather than full dump and restore.

The bug which we found during attempt to dump tables with innodb_force_recovery=4 was confirmed to be added in 5.0.33, meaning you could not really recover corrupted Innodb tables this way for about 6 months.

What does this mean ? I guess it could mean all sorts of different things:

Versions later than MySQL 5.0.33 are not frequently used True many distributions still have versions before this one which could have some impact.

Innodb tables rarely become corrupt Indeed even if you leave out crashes in my experience Innodb tables become corrupted more seldom than MyISAM. Thanks to checksums and crashing in case of corruptions and a lot of assertions most bugs were cleared out.

People finding the bug do not report it This is also probably true. Myself I probably report half of the bugs I find, the once I find most annoying or find easy to provide repeatable example for so they can be fixed.

Corruption usually happens in secondary index, so simple ALTER TABLE fixes it This also seems to be the case. In my experience probably 80-90% of Innodb corruptions are fixed this way, especially if they are not caused by faulty hardware.
Primary keys are typically simple integer keys in Innodb while secondary keys may have very complex structure.

Compex corruptions are resolved by dump and restore from backup. I think this is the main reason. In fact dump and reload process can be so slow restoring even week old backup and running roll forward recovery using binary log can be a lot faster.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Paul says:

    The “6 months” bit, does that mean it was fixed in a later release?

  2. peter says:

    Paul,

    6 months means it was broken at least 6 months ago, it was working before that. It probably should be fixed in next MySQL release – you should check bug report for that.

Speak Your Mind

*