How to recover table structure from InnoDB dictionary

To recover a dropped or corrupt table with Percona Data Recovery Tool for InnoDB you need two things: media with records(ibdata1, *.ibd, disk image, etc.) and a table structure. Indeed, there is no information about the table structure in an InnoDB page. Normally we either recover the structure from .frm files or take it from some old backup.

A new tool sys_parser can recover the table structure from InnoDB dictionary.

Why do we need a new tool anyway? It is absolutely critical to have an accurate table definition to ensure a successful recovery. Even an unnoticeable difference like NULL or NOT NULL can shift all values by a byte and thus will spoil the result. That’s why I prefer the structure from .frm files over taken from backups. But in some cases even .frm files is not an option:

  • Table was dropped and innodb_file_per_table is ON
  • Frm file corrupt, zeroed out, lost or SHOW CREATE TABLE crashes MySQL

There is yet another source of information about the table structure – InnoDB dictionary. Let’s review tables from the dictionary and see what it can give us. We will need four of them:

  • SYS_TABLES
  • SYS_INDEXES
  • SYS_COLUMNS
  • SYS_FIELDS

SYS_TABLES
Here InnoDB keeps correspondence between human readable table names and their internal identifiers.

NAME is a human readable table name in form database_name/table_name e.g. sakila/actor. ID is a table identifier. We will need the table id to find indexes of the table.

SYS_INDEXES
This table lists all indexes the table has, secondary as well as the primary.

So, TABLE_ID is our table_id. ID here is the index identifier. We need it to find InnoDB pages which belong to the table’s index. Which one? A table can have many secondary indexes, but only in the primary index we can find all fields. It must exist for any InnoDB table. If explicitely defined its NAME is PRIMARY. If the primary key is not defined InnoDB will use a unique secondary index as the primary. If there is no any unique index InnoDB will create one implicitely. Its name will be GEN_CLUST_INDEX.
It doesn’t matter how the primary index gets created it will have minimal ID among the indexes of the table.

SYS_COLUMNS
Table SYS_COLUMNS stores fields names and type information of the table.

Here TABLE_ID is a well known table identifier, POS – a position of a field in the table. NAME is the name of a field, MTYPE and PRTYPE store information about the field type, encoding, NULL/NOT NULL properties etc.
LEN is the maximum number of bytes a field uses to store a value. I’m not sure what PREC is used for. It sounds like a short from “precision”, but at least for DECIMAL type where it would make sense it is still zero. If you know how InnoDB uses PREC please let me know.

So, we know all fields of the table, we can get the type. Is it enough for the recovery? No.

SYS_FIELDS
We need to know what fields form the primary key. The matter is regardless at what position primary key fields are defined in CREATE TABLE statement internally they always go first in a record. The second issue we should take into account is internal fields DB_TRX_ID and DB_ROLL_PTR . These two fields always reside between the primary key fields and the rest of the fields.
SYS_FIELDS lists fields of all indexes, including the primary.