September 1, 2014

Innodb vs MySQL index counts

I had a customer recently who a few strange errors in their mysqld.err log:

This customer was running Percona Server 5.1 and they got this error on two tables during a maintenance window when they were adding indexes to the same tables.  We had a suspicion that it had something to do with Fast index creation in Innodb, and that it had been corrected when the ALTER TABLE completed because the errors had not recurred.

Reproducing the error on a test system is simple:

  1. create an Innodb table
  2. make a copy of the .frm file
  3. do an ALTER TABLE to add an index
  4. then copy the old .frm file back into place
  5. re-open the table  (Might need a FLUSH TABLES or mysqld restart here)

From my testing, I saw that the error only happened when the table was opened and not on every table access.  So, it was a possibility that the indexes were out of sync and we weren’t seeing new errors in the log simply because the table hadn’t been re-opened.

But, before getting too crazy, how can we verify the problem still exists?  We need a way to compare the output of SHOW CREATE TABLE to what Innodb thinks.  What Innodb thinks is in the Innodb Data dictionary.

  • The first recommendation I got was to simply use the INFORMATION_SCHEMA.INNODB_SYS_INDEXES table, which exists in Percona Server 5.1, but doesn’t appear in MySQL until 5.6 (if the manual is to be trusted).  I’d probably consider this on a newer version of Percona Server or MysqL 5.6.
  • Another person (I’m looking at you, Baron) was adverse to trusting INNODB_SYS_INDEXES from some bad experiences with it, and suggested the Innodb Table monitor instead, see my next post for how that turned out, but this basically will regurgitate the entire Innodb Data dictionary to the mysqld error log file.
  • If I had to do it over again, I think I’d simply try doing:  FLUSH TABLES table1, table2; to force the tables to close and be reopened and simply see if the error message comes back.  That might something of a performance impact, but it seems to be the most stable.
In this case, it turned out that the indexes were not out of sync, so I didn’t have to do anything to fix it.
However if I did have to fix it, I found on my test table that the extra index in Innodb could be removed by doing:

This, of course, rebuilds the whole table based on the .frm table definition and removes the existing index in Innodb, which might not be desirable, but at least you can re-add it later.  However, it’s not the greatest thing to do on a live production database master if you can help it.

Another solution might be to figure out what index was missing via the Innodb data dictionary (more on that in a minute), create a separate table identical to the existing .frm, add that index to it, and copy the new .frm back over the original.  Kind of scary.

My advice is to ensure the error still exists before trying to fix it.

About Jay Janssen

Jay joined Percona in 2011 after 7 years at Yahoo working in a variety of fields including High Availability architectures, MySQL training, tool building, global server load balancing, multi-datacenter environments, operationalization, and monitoring. He holds a B.S. of Computer Science from Rochester Institute of Technology.

Comments

  1. Jay,

    I wonder what happens if old .frm file will also contain smaller number of columns. Will the recovery work in this case or
    would alter table fail too ?

  2. Let’s see:

    I copied the .frm of this table:

    CREATE TABLE test (
    id int(10) unsigned NOT NULL,
    data varchar(100) DEFAULT NULL,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB

    > ALTER TABLE test ADD COLUMN foo VARCHAR(20);
    Query OK, 1 row affected (0.07 sec)
    Records: 1 Duplicates: 0 Warnings: 0

    CREATE TABLE test (
    id int(10) unsigned NOT NULL,
    data varchar(100) DEFAULT NULL,
    foo varchar(20) DEFAULT NULL,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB

    Inserted some data, so there is something in foo:

    > insert into test( id, data, foo ) VALUES (2, 1000, ‘some foo’ );
    Query OK, 1 row affected (0.01 sec)

    > select * from test;
    +—-+——-+———-+
    | id | data | foo |
    +—-+——-+———-+
    | 1 | 49999 | NULL |
    | 2 | 1000 | some foo |
    +—-+——-+———-+
    2 rows in set (0.00 sec)

    Copied the old .frm back, then:

    > flush tables;
    Query OK, 0 rows affected (0.01 sec)

    > select * from test;
    +—-+——-+
    | id | data |
    +—-+——-+
    | 1 | 49999 |
    | 2 | 1000 |
    +—-+——-+
    2 rows in set (0.00 sec)

    Interestingly, I do not see an error in my mysqld.err file complaining about the missing column. The Innodb Table Monitor shows me that ‘foo’ is still in the Data Dictionary:

    TABLE: name test/test, id 33, flags 1, columns 6, indexes 1, appr.rows 2
    COLUMNS: id: DATA_INT DATA_UNSIGNED DATA_BINARY_TYPE DATA_NOT_NULL len 4; data: DATA_VARCHAR prtype 524303 len 100; foo: DATA_VARCHAR prtype 524303 len 20; DB_ROW_ID: DATA_SYS prtype 256 len 6; DB_TRX_ID: DATA_SYS prtype 257 len 6; DB_ROLL_PTR: DATA_SYS prtype 258 len 7;
    INDEX: name PRIMARY, id 50, fields 1/5, uniq 1, type 3
    root page 3, appr.key vals 2, leaf pages 1, size pages 1
    FIELDS: id DB_TRX_ID DB_ROLL_PTR data foo

    > alter table test ENGINE=Innodb;
    Query OK, 2 rows affected (0.02 sec)
    Records: 2 Duplicates: 0 Warnings: 0

    > select * from test;
    +—-+——-+
    | id | data |
    +—-+——-+
    | 1 | 49999 |
    | 2 | 1000 |
    +—-+——-+
    2 rows in set (0.00 sec)

    Seems to be ok. The Innodb table monitor proves that ‘foo’ was successfully removed:

    ————————————–
    TABLE: name test/test, id 35, flags 1, columns 5, indexes 1, appr.rows 2
    COLUMNS: id: DATA_INT DATA_UNSIGNED DATA_BINARY_TYPE DATA_NOT_NULL len 4; data: DATA_VARCHAR prtype 524303 len 100; DB_ROW_ID: DATA_SYS prtype 256 len 6; DB_TRX_ID: DATA_SYS prtype 257 len 6; DB_ROLL_PTR: DATA_SYS prtype 258 len 7;
    INDEX: name PRIMARY, id 52, fields 1/4, uniq 1, type 3
    root page 3, appr.key vals 2, leaf pages 1, size pages 1
    FIELDS: id DB_TRX_ID DB_ROLL_PTR data

    I might think twice about this before trying it on a larger table. :)

  3. Jay,

    What if you try to insert in the table when frm is in mismatch with actual table structure ?

    Also what I read here is different – if for some reason you get old .frm you will have a silent data loss on ALTER TABLE – think about the case when “foo” was some important data added a month ago and when for some reason old .frm file was restored.

    Might be Innodb needs more checks here.

  4. Same situation as before (old table frm already copied back, and extra column missing):

    > select * from test;
    +—-+——-+———-+
    | id | data | foo |
    +—-+——-+———-+
    | 1 | 49999 | NULL |
    | 2 | 1000 | some foo |
    +—-+——-+———-+
    2 rows in set (0.00 sec)

    ## .frm copied back

    > flush tables;
    Query OK, 0 rows affected (0.00 sec)

    > select * from test;
    +—-+——-+
    | id | data |
    +—-+——-+
    | 1 | 49999 |
    | 2 | 1000 |
    +—-+——-+
    2 rows in set (0.00 sec)

    > insert into test( id, data, foo ) VALUES (3, 1001, ‘some more foo’ );
    ERROR 1054 (42S22): Unknown column ‘foo’ in ‘field list’
    > insert into test( id, data ) VALUES (3, 1001 );
    Query OK, 1 row affected (0.01 sec)

    > select * from test;
    +—-+——-+
    | id | data |
    +—-+——-+
    | 1 | 49999 |
    | 2 | 1000 |
    | 3 | 1001 |
    +—-+——-+
    3 rows in set (0.00 sec)

    It seems to handle it ok. For curiosity, what happens if the .frm has an *extra* column compared with Innodb:

    > show create table test\G
    *************************** 1. row ***************************
    Table: test
    Create Table: CREATE TABLE test (
    id int(10) unsigned NOT NULL,
    data varchar(100) DEFAULT NULL,
    foo varchar(20) DEFAULT NULL,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    1 row in set (0.00 sec)

    > select * from test;
    +—-+——-+——+
    | id | data | foo |
    +—-+——-+——+
    | 1 | 49999 | NULL |
    | 2 | 1000 | NULL |
    | 3 | 1001 | NULL |
    +—-+——-+——+
    3 rows in set (0.00 sec)

    > alter table test drop column foo;
    Query OK, 3 rows affected (0.02 sec)
    Records: 3 Duplicates: 0 Warnings: 0

    > select * from test;
    +—-+——-+
    | id | data |
    +—-+——-+
    | 1 | 49999 |
    | 2 | 1000 |
    | 3 | 1001 |
    +—-+——-+
    3 rows in set (0.00 sec)

    ## copy the .frm with 3 columns back on top

    > flush tables;
    Query OK, 0 rows affected (0.00 sec)

    > select * from test;
    +—-+——-+——+
    | id | data | foo |
    +—-+——-+——+
    | 1 | 49999 | NULL |
    | 2 | 1000 | NULL |
    | 3 | 1001 | NULL |
    +—-+——-+——+
    3 rows in set (0.00 sec)

    > insert into test( id, data ) VALUES (4, 1002 );
    Query OK, 1 row affected (0.00 sec)

    > select * from test;
    +—-+——-+——+
    | id | data | foo |
    +—-+——-+——+
    | 1 | 49999 | NULL |
    | 2 | 1000 | NULL |
    | 3 | 1001 | NULL |
    | 4 | 1002 | NULL |
    +—-+——-+——+
    4 rows in set (0.00 sec)

    > insert into test( id, data, foo ) VALUES ( 5, 1003, ‘more foo’ );
    Query OK, 1 row affected (0.00 sec)

    > select * from test;
    +—-+——-+——+
    | id | data | foo |
    +—-+——-+——+
    | 1 | 49999 | NULL |
    | 2 | 1000 | NULL |
    | 3 | 1001 | NULL |
    | 4 | 1002 | NULL |
    | 5 | 1003 | NULL |
    +—-+——-+——+
    5 rows in set (0.00 sec)

    Ouch. The column is there in the .frm, but clearly ignored in Innodb. No errors in the mysqld.err. (this was on vanilla MySQL 5.5)

    Let’s see if the ALTER TABLE trick works:

    > alter table test ENGINE=Innodb;
    Query OK, 5 rows affected (0.01 sec)
    Records: 5 Duplicates: 0 Warnings: 0

    > select * from test;
    +—-+——-+——+
    | id | data | foo |
    +—-+——-+——+
    | 1 | 49999 | NULL |
    | 2 | 1000 | NULL |
    | 3 | 1001 | NULL |
    | 4 | 1002 | NULL |
    | 5 | 1003 | NULL |
    +—-+——-+——+
    5 rows in set (0.00 sec)

    > insert into test( id, data, foo ) VALUES ( 6, 1004, ‘even more foo’ );
    Query OK, 1 row affected (0.00 sec)

    > select * from test;
    +—-+——-+—————+
    | id | data | foo |
    +—-+——-+—————+
    | 1 | 49999 | NULL |
    | 2 | 1000 | NULL |
    | 3 | 1001 | NULL |
    | 4 | 1002 | NULL |
    | 5 | 1003 | NULL |
    | 6 | 1004 | even more foo |
    +—-+——-+—————+
    6 rows in set (0.00 sec)

    Yup, seems to. Seems like this is another blog post by itself.

  5. I guess I’m surprised there isn’t at least an error in the .err file when you open the table like when the indexes are mismatched.

  6. Regarding INNODB_SYS_INDEXES from IS, when reading about them, I saw a note which I remember now, that they are in sync with the disk, in other words, the dictionary on disk is the source of truth which is followed, not the memory.

    PS: Got the article — http://blogs.innodb.com/wp/?p=636

    “There is another benefit of Information Schema SYSTEM TABLES in that, by design, the data is read from system table directly, rather than fetching data from their in memory representations (like innodb_table_monitor does). So it is a true representation of what we have on disk. If there is an mismatch (unlikely) between the in memory information and on disk data, these system tables will display the real image on disk, and help DBA and even developer to debug or better understand the system metadata behavior.”

Speak Your Mind

*