October 1, 2014

Best kept MySQLDump Secret

Many people use mysqldump –single-transaction to get consistent backup for their Innodb tables without making database read only. In most cases it works, but did you know there are some cases when you can get table entirely missing from the backup if you use this technique ?

The problem comes from the fact how MySQL’s Transactions work with DDL, In particular ALTER TABLE. When ALTER TABLE is Performed in many cases it will Create temporary table with modified structure, copy data to that table and when drop original table and rename such temporary table to original name.

How does data visibility works in this case ? DDLs are not transactional and as such the running transaction will not see the contents of old table once it is dropped, transaction also will see the new table which was created after transaction was started, including table created by ALTER TABLE statement. Transactions however apply to DATA which is stored in this table and so data which was inserted after start of transaction (by ALTER TABLE statement) will not be visible. In the end we will get new structure in the dump but no data.

Here is example:

As you can see as we altered table C at the same time as mysqldump was running we got table empty table with new structure in mysqldump instead of table with valuable data.

This is a pretty edge case scenario neither the less it can be problem for some workloads which run ALTER TABLE regularly during normal operation. I also hope if you get some empty
tables in your mysqldump –single-transaction backups you will know the potential cause for it.

What are potential solutions for this problem ? you can use mysqldump –lock-all-tables instead which does not have this problem at the cost of having database read only for the
duration of operation. You can also use Percona Xtrabackup, LVM or other database backup approach which does not relay on transaction visibility.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Dmitry Lenev says:

    Hi Peter!

    It is not really a secret… Quoting MySQL manual from http://dev.mysql.com/doc/refman/5.5/en/mysqldump.html :

    “While a –single-transaction dump is in process, to ensure a valid dump file (correct table contents and binary log coordinates), no other connection should use the following statements: ALTER TABLE, CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE. A consistent read is not isolated from those statements, so use of them on a table to be dumped can cause the SELECT that is performed by mysqldump to retrieve the table contents to obtain incorrect contents or fail.”

  2. Dima,

    OK. My bad. Great to hear it is documented. Still I wonder how many people know about it… beyond you can use
    –single-transaction to get consistent backup

  3. I wrote about the alter table issue a few years ago and filed bug http://bugs.mysql.com/bug.php?id=28432 which was incorrectly closed as fixed. I’m surprised that more people weren’t upset about it. Maybe I should have added notes about mysqldump.

  4. This is so mind-bogglingly stupid behaviour. What drugs were people on when they decided this kind of dumb behaviour was a good idea? I seriously want to know so I can avoid ever taking them.

  5. Stewart,

    I do not think this is a “stupid decision” which was ever taken. This is just how things were designed with MySQL and how transactions were retrofitted to it. MySQL has funny history this way. It still has lots of things going to its non transactional past, yet non transactional table support are all but gone in recent versions. The question is whenever MySQL will ever be redesigned to focus on transactional storage engines or whenever this will remain Drizzle’s turf.

  6. As we learnt when doing a lot of the initial refactoring for Drizzle, turning MySQL into something that truly focused on transactions leads to some rather interesting corner cases you discover/remove. If they do, good luck to those with existing applications :)

  7. huarong says:

    I did the same test on windows2003 mysql 5.5.15

    while mysqldump is going , alter table was BLOCKED!

    Eventually mysqldump got the old table definition and old data in it.

    DROP TABLE IF EXISTS z;
    /*!40101 SET @saved_cs_client = @@character_set_client */;
    /*!40101 SET character_set_client = utf8 */;
    CREATE TABLE z (
    n char(1) DEFAULT ”
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
    /*!40101 SET character_set_client = @saved_cs_client */;


    — Dumping data for table z

    LOCK TABLES z WRITE;
    /*!40000 ALTER TABLE z DISABLE KEYS */;
    INSERT INTO z VALUES (‘a’);
    /*!40000 ALTER TABLE z ENABLE KEYS */;
    UNLOCK TABLES;
    /*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

  8. 5.5 introduces metadata locking which fixes the problem.

  9. Eric,

    I tested it with MySQL 5.5 I think what huarong had is just different timing. ALTER TABLE Will be blocked if that table is being dumped right now. However if mysqldump is processing table “A” and you’re altering table “Z” at this time Alter Table
    will not be blocked.

  10. Ah that makes sense. MDL won’t kick in until the table has been accessed.

  11. paraman says:

    Only one way out. Make a backup copy of a replicated server. Prior to this, run a stop slave

  12. Paul says:

    I’m finding that invoking mysqldbexport.exe or mysqldbcopy.exe directly (without first launching MySQLWorkbench) seems to accomplish about any task blazingly fast while avoiding the issues associated with mysqldump.exe.

  13. Paul Otto says:

    Taking backups from a replicated slave with replication stopped is probably the best way to do it with mysqldump. You could also look at making a script to iterate through the tables in the database and mysqldump them one at a time, each with –single-transaction. Problem with that is your coordinates would differ for each table… using this to rebuild a slave wouldn’t be a problem, just logistically painful — but couldn’t be used for a new master or stand-alone db server, as relational integrity could be compromised.

  14. Abhijit Buchake says:

    Agree with Peter. Though it is not a secret, too few people know this (making this as a secret) :)

  15. Hi Peter,
    I would like to know if indices on any given columns are included in mysqldump created by the MySQL database server.
    Or is it that I should log into my remote host server and create them.

    Regards,

    Jimmy

Speak Your Mind

*