For a long time long types like BLOB, TEXT were not supported by Percona InnoDB Recovery Tool. The reason consists in a special way InnoDB stores BLOBs.
An InnoDB table is stored in a clustered index called PRIMARY. It must exist even if a user hasn’t defined the primary index. The PRIMARY index pages are identified by 8-bytes number index_id. The highest 4 bytes are always 0, so index_id is often notated as o:<4 bytes number>, e.g. 0:258. The pages are ordered in a B-tree. Primary index is used as a key. Inside a page records are stored in a linked list.
InnoDB page by default is 16k. Obviously if a record is too long, a single page can’t store it. If the total record size is less than UNIV_PAGE_SIZE/2 – 200 (this is roughly 7k) then the full record is stored in the page of PRIMARY index. Let’s call it internal. In InnoDB sources they have type FIL_PAGE_INDEX*. If the record is longer than 7k bytes, only first 768 bytes of every BLOB field are stored internally. The rest is stored in external pages. They have type FIL_PAGE_TYPE_BLOB. Page type is stored in a FIL_PAGE_TYPE field of the page header . In an earlier post Peter described in details how BLOBs are stored.
Let me illustrate a record format of the example of the table:
CREATE TABLE `t1` (
`ID` int(11) unsigned NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT
Here COMPACT format is used, which is default in MySQL >= 5.1.
The record consists of four parts:
- Offsets. Effectively these are field lengths. Only variable length types have offsets. The offset can be one or two bytes depending on maximum field size. The highest bit of the offset is 1 if the field is stored in external pages (i.e. long BLOB field)
- NULL fields. A bit per NULL-able field padded to minimum number of bytes to store all flags.
- So called extra bytes. These are 5 bytes where different flags are stored like “record is deleted” flag. The last two bytes are the relative pointer to the next record in the page.
- User data. TRX_ID is a transaction id. PTR_ID is a pointer in a rollback segment to the old version of the record.
So if a field is the long one, it has 1) The highest bit of the offset is set to “1″, 2) After 768 bytes there are 20 bytes in the end where the following:
- BTR_EXTERN_SPACE_ID – space id where the next piece of the field is stored
- BTR_EXTERN_PAGE_NO – page id
- BTR_EXTERN_OFFSET – offset inside a page. An external page has a header. The similar pointer to the next page is stored in it.
- BTR_EXTERN_LEN – length of the next piece.
The external pages are linked until BTR_EXTERN_PAGE_NO is FIL_NULL.
Percona InnoDB Recovery Tool supports now recovery of long fields. It is still in development branch, but should be released after QA tests.
The complexity of BLOB fields brings prerequisites to successfully recover a record with BLOB : all pieces of the BLOB field must be reachable by pointers. That means BTR_EXTERN_PAGE_NO, BTR_EXTERN_OFFSET and BTR_EXTERN_LEN must not be corrupted.
The tool outputs the recovered table in tab-separated values format. BLOBs are printed in a hex form – 0ACD86…
To upload the table back you should utilize UNHEX function:
LOAD DATA INFILE '/path/to/datafile'
REPLACE INTO TABLE <table_name>
FIELDS TERMINATED BY '\t'
OPTIONALLY ENCLOSED BY '"'
LINES STARTING BY '<table_name>\t'
blobfield = UNHEX(@var1),
datefield2 = FROM_UNIXTIME(@var2,'%Y %D %M %h:%i:%s %x');
* – there is a typo in Recovery of Lost or Corrupted InnoDB Tables Presentation