An argument for not using mysqldump

I have a 5G mysqldump which takes 30 minutes to restore from backup.  That means that when the database reaches 50G, it should take 30×10=5 hours to restore.  Right?  Wrong.

Mysqldump recovery time is not linear.  Bigger tables, or tables with more indexes, take disproportionately longer to restore, because every row inserted has to maintain ever-larger index structures.

If I restore from a raw backup (LVM snapshot, xtrabackup, InnoDB hot backup), it is very easy to model how much longer recovery will take: it is essentially a file copy, so it scales with the size of the data.

I can tell progress with mysqldump by monitoring the rate at which SHOW GLOBAL STATUS LIKE 'Handler_write'; increases, and comparing it to my knowledge of roughly how many rows are in each table.  But progress != a magic number like “17 minutes”.  Not unless I do a lot of complex modeling.
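As a sketch (the 10-second sampling window and the credentials-free invocation are illustrative), the Handler_write rate can be checked like this:

```shell
# Sample Handler_write twice, 10 seconds apart, and report the current
# insert rate. Comparing this rate against known table row counts gives a
# rough sense of restore progress.
before=$(mysql -N -e "SHOW GLOBAL STATUS LIKE 'Handler_write'" | awk '{print $2}')
sleep 10
after=$(mysql -N -e "SHOW GLOBAL STATUS LIKE 'Handler_write'" | awk '{print $2}')
echo "$(( (after - before) / 10 )) rows/sec"
```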

I am not saying a 5 hour recovery is good or bad.  What I am saying is knowing remaining time is very important during disaster recovery.  Being able to say “we’ll be back at 2PM” is much better than saying “we’ll be back between 1PM and 4PM.. maybe”.

Comments (27)

  • Gwen Shapira

    Responsible DBAs test restore from their backups on a regular basis.
Say, every 2 months. During this check, it is useful to record how long the restore took, for exactly this purpose. When management asks “how much longer will this take?”, I can look at my charts and say “20 more minutes, captain”.

    It is still good to be able to calculate the remaining time, but maybe not as critical.

    November 8, 2010 at 11:22 am
  • Nikolay

    I have a huge MySQL server with lots of MyISAM/Maria databases. Can someone elaborate on how we can do a non-blocking backup, even if we back up some dirty (mid-transaction) data?

    November 8, 2010 at 11:49 am
  • Gerry

    I wouldn’t favor depending solely on mysqldump for backups unless you have some other HA considerations in place (i.e. master-master). That way you can work on the restore while the service is still online.

    My $.02

    November 8, 2010 at 12:20 pm
  • Morgan Tocker

    @Gwen – when people test their backups it is not always on the same hardware. I (of course) advocate testing backups, but it’s not always easy:

    – Comparisons will never truly be equal (hopefully bigger hardware is on the main DB, and any skew is of “the good kind”).

    – When you find restore time is starting to take longer, seldom will a customer have the opportunity to just change their backup plan overnight. Raw backups take more space, and have some reduced flexibility when restoring smaller amounts of data (although there is a feature in XtraDB to restore one file_per_table table at a time).

    It is good to plan with this knowledge in advance.

    November 8, 2010 at 12:20 pm
  • Morgan Tocker

    @Nikolay – The best place to ask this question is in our forums – Thanks!

    November 8, 2010 at 12:27 pm
  • Rolando Edwards

    There are two alternatives, one using mysqldump and one not.

    Option 1 (using mysqldump)

    Write a customized backup script
    1) Extract the names of all databases (except mysql and information_schema)
    2) FLUSH TABLES WITH READ LOCK
    3) Record Log File and Position from SHOW MASTER STATUS;
    4) Run all mysqldumps of each database in parallel as background processes

    Apply this in reverse when restoring
    Load all the mysqldumps back in parallel

    Example : I have a client at my DB/Web hosting company with 650+ databases (450+ GB) in one mysql instance.
    mysqldump databases one by one : 7 hours
    mysqldump in parallel, 7 databases at a time : 3.5 hours

    I have performed 3 full restorations in this client’s lifetime with LogicWorks

    All 650+ DBs (450+ GB), 10.5 hours, loading 32 mysqldumps in parallel at a time

    I am sure 5GB should present no challenge if you parallel dump all databases. If this involves one database, then parallel dump all individual tables (which I have also scripted for other DB Hosting clients)
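    The parallel-dump part of Option 1 might be sketched like this (a rough sketch only: credentials, concurrency, and paths are illustrative, and it omits the locking needed to record a consistent binlog position):

    ```shell
    # Dump every database except mysql/information_schema, with at most 7
    # mysqldump processes running at a time.
    mysql -N -e 'SHOW DATABASES' \
      | grep -vE '^(mysql|information_schema)$' \
      | xargs -P 7 -I{} sh -c 'mysqldump --single-transaction "$1" > "/backups/$1.sql"' _ {}
    ```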

    Option 2 (not using mysqldump)

    Maatkit’s very own toolset has Parallel Dump and Parallel Restore, using SELECT ... INTO OUTFILE and LOAD DATA INFILE under the hood.

    November 8, 2010 at 8:54 pm
  • Morgan Tocker

    @Rolando – Option 1 is really hard to get perfect. You need to make sure that you hold the lock after FLUSH TABLES WITH READ LOCK, so in command-line scripting you need to add a SELECT SLEEP(9999999..) after it. It sounds like you’ve got it to work – but I’m not sure it’s the best solution.

    There are other issues with mysqldump – but I didn’t want to cover them in this article. mysqldump needs to churn all the data through the buffer pool (something often undesired), and the backup as you described it is not “hot” (non blocking). You can make it hot – but this synchronizes the backup to the start of running mysqldump. Xtrabackup synchronizes with the end of the backup, which is probably the better choice if you do not want to bloat undo space.

    November 8, 2010 at 9:14 pm
  • Patrick Casey

    If you’re just running InnoDB, you can run mysqldump with the --single-transaction flag, which will get you a consistent logical backup. It’ll take a long time on a larger database, but it will give you a snapshot in a non-blocking manner.
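    A minimal invocation of that (the output path is illustrative):

    ```shell
    # Consistent, non-blocking logical backup of InnoDB tables: the whole
    # dump runs inside one REPEATABLE READ transaction.
    mysqldump --single-transaction --all-databases > /backups/full.sql
    ```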

    If you have a non transactional engine in your mix (myisam I’m looking at you), then there really is no way to get a “hot” consistent dump. You can issue flush tables with read lock, but to do this right you have to hold the lock until your dump completes. A read lock is just that, a lock, and that means no delta transactions as long as the backup is in flight.

    If you are on some form of snapshot aware storage you can execute:

    Flush tables with read lock
    Take the snapshot
    Release lock

    This will get you a consistent snap even on MyISAM. The problem is the FLUSH can take an unpredictable amount of time b/c it must wait for all in-flight queries to complete. That means it’s impossible to guarantee a “no downtime backup” via this method. You’re locked up for flush time + snap time.
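    One way to script the hold-the-lock-while-snapping sequence (a sketch, untested; volume names and paths are made up) is to keep a single mysql session open on a FIFO, since the read lock disappears the moment that session closes:

    ```shell
    # Keep one mysql session alive across the snapshot.
    mkfifo /tmp/mysql.pipe
    mysql -N < /tmp/mysql.pipe > /tmp/mysql.out &
    exec 3> /tmp/mysql.pipe                                 # hold the session open
    echo "FLUSH TABLES WITH READ LOCK; SELECT 'locked';" >&3
    until grep -q locked /tmp/mysql.out; do sleep 1; done   # wait for the lock
    lvcreate --snapshot --size 10G --name mysql-snap /dev/vg0/mysql
    echo "UNLOCK TABLES;" >&3
    exec 3>&-                                               # closing the session drops the lock
    rm /tmp/mysql.pipe
    ```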

    If you’re pure innodb, I generally trust its recovery engine so you can snap it w/o the read lock. This is a crash consistent copy and you’ll have to go through a recovery when you try to restore that backup. There are horror stories on the internet of this form of snap coming up corrupt, but I will say I’ve personally never had a problem with it.

    November 8, 2010 at 9:24 pm
  • Baron Schwartz

    Very often there is no way at all to make any kind of dump & load work, period. It is not uncommon for me to see clients whose dumps would take days or weeks to load a single table. At such sizes, dumps are completely unusable. One of the main problems is building indexes.

    November 8, 2010 at 9:54 pm
  • Lucas

    I really like mysqldump, but it has some compatibility problems between different versions of MySQL. Maybe it’s better now.

    November 9, 2010 at 4:04 am
  • Mark Wilden

    Don’t update indexes for each record added. Drop them before the load, then recreate them after.
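    As a hypothetical example of that (table and index names are made up; whether it pays off depends on the server’s index-build behavior, so benchmark it first):

    ```shell
    mysql mydb -e "ALTER TABLE orders DROP INDEX idx_customer;"
    mysql mydb < orders_dump.sql    # bulk load without maintaining the secondary index
    mysql mydb -e "ALTER TABLE orders ADD INDEX idx_customer (customer_id);"
    ```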

    November 9, 2010 at 4:13 pm
  • Morgan Tocker

    @Mark – what you are suggesting is only a valid optimization in Percona Server/InnoDB plugin releases (which have the fast index create feature). For regular releases, it will actually be worse to add the indexes after full import.

    There are still some limitations with the fast index create feature. For example, BUG #33650, and that it will never work with primary keys.

    November 9, 2010 at 7:13 pm
  • Mark Wilden

    Actually, I was just making a general comment – not necessarily referring specifically to Percona plugins.

    However, I don’t see how it could be slower to create the index all at once than to do it one record at a time.

    November 9, 2010 at 7:34 pm
  • Morgan Tocker

    @Mark: It’s in the manual:

    “In MySQL versions up to 5.0, adding or dropping an index on a table with existing data can be very slow if the table has many rows. The CREATE INDEX and DROP INDEX commands work by creating a new, empty table defined with the requested set of indexes. It then copies the existing rows to the new table one-by-one, updating the indexes as it goes.”

    In the text it says up to 5.0, but it also applies to 5.1 if you don’t use the plugin.

    November 9, 2010 at 8:46 pm
  • Mark Wilden

    Wow, I had no idea – that’s pretty incredible. I guess it’s for the benefit of the clustered index, but still.

    Thanks for the information!

    November 9, 2010 at 8:53 pm
  • Blagomir Ivanov

    In my opinion, there are rare cases where you have to restore the entire database. I have done a full restore only a few times in my life. In most cases there is only one, or a few, broken tables. The best solution for me was to write a script that takes all the database and table names and archives each table (with mysqldump) in a separate file.
    At the end I have one dump file per table.

    I don’t know if it’s the best, but it works for me 🙂

    November 10, 2010 at 4:33 pm
  • Morgan Tocker

    @Blagomir – I agree with you. Real “disasters” happen far less frequently than accidental deletes.

    You can still restore a table at a time with xtrabackup/innodb hot backup. You may want to check out this discussion on Shlomi’s blog (and my reply):

    November 11, 2010 at 9:20 am
  • Patrick Casey


    We use a similar technique for a subset of our logical backups and it works well.
    One challenge folks should be aware of, though, is that if you do not explicitly dump all tables within one transaction, each table will be dumped at a different point in time, so they won’t necessarily be consistent between tables.

    For example, if you have a Food table and a Meals table (which contains a foreign key to Food), and you dump Food at 12:51 PM and Meals at 1:22 PM, then it’s possible there will be a reference in Meals to a row of Food which didn’t exist at 12:51 PM.

    Depending on the data model this may or may not be an issue for any given app, but it’s an important constraint to be aware of if you’re considering going down this road.

    November 11, 2010 at 10:03 am
  • Andy

    Use LVM for your filesystems and place MySQL on an LVM partition.

    You can then lock all tables and snapshot the partition – only a few seconds of lock time.
    Then copy the snapshot to another partition or device.

    You now have a full copy of the raw data – so restore is only the time taken to copy it into place – no restoring of dumps etc.
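    The copy-off step might look like this (assuming a snapshot /dev/vg0/mysql-snap already exists; all names are illustrative):

    ```shell
    mount -o ro /dev/vg0/mysql-snap /mnt/mysql-snap
    rsync -a /mnt/mysql-snap/ /backup/mysql-raw/   # the raw datadir copy is the backup
    umount /mnt/mysql-snap
    lvremove -f /dev/vg0/mysql-snap                # snapshots cost extra write IOPS while held
    ```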

    Here is a really nice tool to do it all –

    November 16, 2010 at 12:27 pm
  • Patrick Casey

    Andy, another challenge I’ve run into with LVM snapshots is that they really load up the disk subsystem.
    While I have a snap mounted I’m doubling my write iops.
    To use the snap as a backup, I have to copy it off main system and onto a backup device, which means I have to read hundreds of GB of data, which, again, loads up my io subsystem at the same time I’m taking a double-write penalty.

    We’ve found in our environment that SAN snaps work great (very low cost), but that LVM snaps are too intrusive on the production servers for regular use, so we typically snap the slaves nightly and the masters weekly since doing an LVM snap on a busy database will grind it down too much to tolerate.

    November 18, 2010 at 10:02 am
  • Andy

    yes, LVM does have the issue of extra writes… so doing snapshots off the slave helps.

    November 20, 2010 at 8:53 am
  • Marc

    What I’m really missing here is the “this is the way to do it” with a full example. I’m no sql management expert but find myself having to set one up regardless.
    I’m looking for the “works in 99% of cases” solution, but it almost seems like it’s more of a “there’s no good solution”. Which is kind of weird for such a popular database system.

    January 8, 2011 at 8:27 am
  • Ilan Hazan

    Another argument against using mysqldump for recovery is that restoring a dumped table onto the MySQL master can lead to serious replication delay.
    To overcome the replication delay caused by restoring the dump on the master, you need to spread out the massive inserts. This can be done with the MySQL SLEEP() function.
    See “As restoring a dump table into the MySQL master – you better get some sleep”.
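    One way to spread the load (a sketch; the per-1000-lines granularity is arbitrary, and line counts only approximate statement counts):

    ```shell
    # Inject a one-second pause after every 1000 lines of the dump so the
    # slaves get a chance to catch up while the master loads data.
    awk 'NR % 1000 == 0 { print; print "SELECT SLEEP(1);"; next } { print }' dump.sql \
      | mysql mydb
    ```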

    April 7, 2011 at 7:39 am
  • Mark Wilden

    @Morgan I do see in that in 5.1, creating a *secondary* index is faster with ADD INDEX than inserting one-by-one.

    June 29, 2012 at 2:03 pm
  • Simon J Mudd

    @Andy, I have been doing LVM snapshots for cloning and backups. Indeed it works well and is pretty easy to set up. However, I have seen considerable performance problems on CentOS 6 with xfs which were not apparent in CentOS 5, especially with filesystems of a few hundred GB or more. This can make using this method rather more cumbersome than you would expect.

    November 21, 2013 at 8:46 am
