October 1, 2014

On Character Sets and Disappearing Tables

The MySQL manual tells us that regardless of whether or not we use “SET FOREIGN_KEY_CHECKS=0″ before making schema changes, InnoDB will not allow a column referenced by a foreign key constraint to be modified in such a way that the foreign key will reference a column with a mismatched data type. For instance, if we have these two tables:

trying to do something like “ALTER TABLE bar DROP j” or “ALTER TABLE bar MODIFY COLUMN j j SMALLINT NOT NULL” will produce an error unless we first remove the foreign key constraint present in table “foo”. Indeed, if we try it, that’s exactly what happens:

And that’s exactly what we’d hope would happen, too, lest table “foo” suddenly end up with an unsatisfiable FK reference due to a mismatched data type (non-existent certainly qualifies as mismatched in this case). No harm was done by this operation, which is also what we’d expect: table “foo” and table “bar” remain in their original, unmodified, and unmolested state, and if we really want to make this modification, we can drop the FK constraint and then proceed with the ALTER.  No surprises yet.

But are there situations in which this operation doesn’t unfold quite so neatly? Obviously, there are, or there’d be little point in this blog post. When is a matching datatype not really a matching datatype? If you guessed that it has something to do with character fields and character sets, then you’d be absolutely correct.  Consider the following 3 tables, each one with a single row inserted into it:

Suppose that we decide that we’ll never have any use for UTF8 in this database at all, and that we want to achieve the best performance possible, so we’re going to convert everything to ASCII. Or perhaps we have some other tables with ASCII identifiers and we’re doing JOINs against tables in this database and seeing that indexes aren’t being used in the way we were expecting. Whatever the reason, we decide to convert these columns to be ASCII.  We might assume that since the underlying data type is remaining as a CHAR(32) that we don’t need to worry about dropping constraints and that we can just modify the table in place.  The first table certainly proceeds well enough:

But suppose that this was actually a long-running ALTER statement. Maybe we got pre-occupied, or we had to pick up the task the next day, and when we came back, we forgot that we had to re-disable foreign key checking since we were in a new session. We move on to table 2:

Oops, we forgot to disable foreign key checking. No problem, right? Not exactly….

“Oops” indeed. Where we once had three tables, there are now only 2.  The “dos2″ table is nowhere to be found.  The good news here is that the data is not completely gone – it still exists in the temporary tables that were created as part of the ALTER operation and can be found in the data directory, but it’s no longer readily accessible by you or your applications, and the database itself is now in an odd state of limbo.  You can’t drop the database, you can’t recreate the “dos2″ table, and you can’t turn the orphaned temporary table back into its former self unless you’re handy with a hex editor and have a good knowledge of InnoDB file formats.  Moreover, trying to modify any of the remaining tables could cause the same thing to happen to them as well.

Yikes.  Not a good day.  The one thing that you can do, fortunately, is recover the data into a new table, but you’ll need to go take a look inside the InnoDB data directory for this database to get the actual filename of the temporary table rather than the temporary table name that was displayed in the error message.  Once you do that, you can run something like this:

But your database is still potentially inconsistent, and you can’t explicitly drop those temporary tables or rename them.

This looks somewhat like closed bug 13378, but I think there’s something of an inconsistency present when MySQL won’t allow us to change an FK-referenced column from INT to SMALLINT, but it will allow us to change from a multi-byte character column to a single-byte character column, so I’ve submitted this as bug report 65701.

So what did we learn here?

  • Character set mismatches are bad enough when dealing with JOINs, but they can be an absolute nightmare when dealing with FKs.
  • The safest way to modify a column referenced by FKs is to drop the constraints first.
  • Sometimes you won’t know that an operation has failed badly until it’s too late. Just one more reason (as if you needed any more) to keep good backups that are tested regularly.

Tested under MySQL 5.5.24.

About Ernie Souhrada

Ernie joined Percona in April 2012 as a Senior Consultant. In his previous lives, he has been everything from a Perl/Java developer to a Linux sysadmin, a MySQL DBA to a Cisco network engineer, and a security auditor to an IT engineering manager, many of these things all at the same time. When not working on MySQL, he might be found on the ski slope, at a psytrance festival, or at the nearest sushi bar.

Comments

  1. William says:

    Hi Ernie,

    Very good post!

    I remembered reading another post (“Hacking FRM files to switch data types” http://goo.gl/chzTJ) I think that can be used in this case to avoid the problems shown.

Speak Your Mind

*