Logical MySQL backup tool Mydumper 0.9.1 now available

The new Mydumper 0.9.1 version, which includes many new features and bug fixes, is now available. You can download the code from here.

A significant change in this version is that Mydumper now handles all schema objects, so there is no longer a dependency on mysqldump to ensure that complex schemas are backed up alongside the data.

Let’s review some of the new features:

Full schema support for Mydumper/Myloader

Mydumper now takes care of backing up the schema, including Views and Merged tables. As a result, we now have these new associated options:

-d, --no-data Do not dump table data
-G, --triggers Dump triggers
-E, --events Dump events
-R, --routines Dump stored procedures and functions

These options are not enabled by default, to keep backward compatibility with existing mixed solutions that use mysqldump for DDL.
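
As a sketch, a full logical backup including all schema objects could look like the following (the output directory and user are hypothetical, and credentials are assumed to come from an option file):

```shell
# Dump data plus triggers (-G), events (-E) and routines (-R);
# none of these are enabled by default, matching mysqldump-era workflows.
mydumper -u backup -G -E -R -o /backups/full/
```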

Reduced locking options

--trx-consistency-only      Transactional consistency only

You can think of this as mysqldump's --single-transaction, but with the binlog position still recorded. Obviously, this position only applies to transactional tables (TokuDB included). One advantage of this option is that the global read lock is held only for thread coordination, so it is released as soon as the transactions are started.

GTIDs and Multisource Slave 

GTIDs are now recorded in the metadata file. Mydumper can also detect a multisource slave (MariaDB 10.1.x) and will record all of the slave coordinates.

Myloader single database restore

Until now the only option was to copy the database files to a different directory and restore from it. However, we now have a new option available:

-s, --source-db                   Database to restore

It can also be used in combination with -B, --database to restore into a different database name.
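
A sketch of how the two options might combine (the dump directory and database names here are hypothetical):

```shell
# Restore only the `sales` schema from the dump directory,
# loading it into a new database named `sales_copy`.
myloader -d /backups/full/ -s sales -B sales_copy
```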

Full list of Bug Fixes:

#1431410 innodb stats tables
#1440403 *-post and *-triggers compressed files corrupt
#1470891 functions may be needed by SP and views
#1390437 segmentation fault against Percona MySQL 5.6.15-63.0
#1446280 Segmentation fault on Debian Wheezy
#1399715 Typo in --tables-list option in manpage
#1428608 missing -K option in mydumper manpage
#1440437 myloader: wrong database name in message when -B used
#1457091 tokudb detection doesn’t work
#1481747 Unable to compile r179 WITH_BINLOG=ON (undeclared ‘bj’)
#1507574 Assertion when broken mrg tables
#841651 dump view definitions
#1485688 make compile error myloader.c:209



Comments (56)

  • Sheyda Amini

    What is the advantage of using this tool over mysqldump?

    November 12, 2015 at 6:29 pm
  • SuperQ

    mydumper can run many threads in parallel, where mysqldump does each table in series. Backups and restores happen much more quickly for databases with a large number of tables.

    November 13, 2015 at 3:31 am
  • Gareth Smith

    Similarly, what is the advantage of using this tool over *mysqlpump* – the new parallel backup tool in MySQL 5.7 ?

    November 13, 2015 at 4:37 am
  • Max Bubenick

    Hi Gareth, there are several differences between them:

    – Parallel restores
    Only with mydumper can you perform parallel restores, using myloader.

    – Files per table
    mydumper uses one file per table, so you can easily do partial restores.

    – Chunking
    mydumper supports chunking, which makes a huge performance difference, as parallelism is by chunk instead of by table. In some cases this feature alone improves times by more than 50%, and not only backup but restore times as well.

    That said, mysqlpump is a new tool and I'm sure it will gain more features; the most important thing here is that we are seeing an effort to take logical backup tools to the next level.

    November 13, 2015 at 8:12 am
  • Phil Stracchino

    Can you talk a little more about the chunking? I was always under the impression mydumper parallelized table-by-table, and that’s certainly the way it appears to be operating in the MySQL process list and in the filesystem. Is chunking something that needs to be enabled?

    I just ran a test of 0.9.1 overnight, plugged my wrapper script for it into my nightly backups instead of my standard mysqldump-based script, and it worked perfectly. But when I run it in a timed benchmark against the mysqldump script, I’m only seeing about 7 seconds speed-up in my environment, out of just over 6 minutes total. This appears to be because more than 90% of my DB data is in two very large tables (bacula.File and dspam.dspam_signature_data), so the gating factor on backup length is how long it takes to dump those two tables, and they’re ending up being dumped serially because of the order in which mydumper gets to them.

    Would it be possible to add some smarts to have an optional large-table mode whereby mydumper takes some time out at the start to calculate what the largest tables are, and if the largest tables are larger in size than, say, total database size divided by number of threads, assign those giant tables to their own threads in order to make sure they are parallelized, then divide the smaller tables first-come-first-served among the remaining threads as usual?

    November 13, 2015 at 9:16 am
  • Phil Stracchino

    Gareth, in addition to what Max pointed out, there are some serious and poorly-documented shortcomings to mysqlpump.
    One, which is not explicitly called out anywhere but which can be found mentioned as an aside in the options documentation, is that in parallelizing mode, mysqlpump can interleave INSERTs to different tables in the single dump file. This means that doing a selective restore of just a few accidentally-erased tables from a mysqlpump dump is extremely difficult, possibly prohibitively difficult, whereas with mydumper/myloader it is a trivially simple operation.
    The other, even more serious, which does not seem to be exposed in the official documentation at all (although it is present in the mysqlpump announcement from the MySQL server team), is that mysqlpump’s multiple threads do not have a shared synchronization point for their dump operations — which means that mysqlpump dumps are *inconsistent*. This basically means that at its present state of development, mysqlpump is safe ONLY for offline dumps.

    November 13, 2015 at 10:06 am
  • Max Bubenick

    Hi Phil,

    Maybe this topic is worth a post by itself, but to simplify: mydumper has two chunking options, neither enabled by default.

    -r, --rows Try to split tables into chunks of this many rows. This option turns off --chunk-filesize

    -F, --chunk-filesize Split tables into chunks of this output file size. This value is in MB

    --rows only works on tables with an integer index for now, and the logic still needs to be improved. With this option, mydumper uses a thread per chunk instead of per table.

    --chunk-filesize is useful when you don't have an integer index; in this case mydumper uses a thread per table but splits the output into multiple files of this size. It won't improve backup times the way --rows does, but it still helps restore times, since the restore can be done by multiple threads.
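
The two modes described above can be sketched as follows (directory and values are illustrative, not recommendations):

```shell
# Chunk by rows where tables have an integer index...
mydumper -r 100000 -o /backups/chunked/

# ...or chunk by output file size in MB where they don't;
# this mainly speeds up restores, since myloader can load the files in parallel.
mydumper -F 256 -o /backups/chunked/
```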

    November 13, 2015 at 11:51 am
  • George

    Thanks for the wonderful and continued work on mydumper !

    November 13, 2015 at 7:11 pm
  • George

    Oh, one bug I found: the version number is reported incorrectly, 0.9.1 vs. 0.9.0:

    mydumper -V
    mydumper 0.9.0, built against MySQL 10.0.22-MariaDB

    November 13, 2015 at 7:12 pm
  • Gareth Smith

    Thanks for the explanation Phil. As you note, inconsistent backups are pretty dangerous.

    November 16, 2015 at 6:03 am
  • Fadi El-Eter (itoctopus)

    Hi Max,

    Thanks for mentioning this tool as well as its advantages over the regular mysqldump that we all use.

    I have some quick questions:

    – Is it stable? Can it be used on a production environment?
    – Do you guys use it for your clients?
    – Are there any known bugs/limitations?
    – Is it really free?

    November 16, 2015 at 11:45 am
  • Max Bubenick


    – Is it stable? Can it be used on a production environment?
    Yes, you can use it in a production environment, but as with any backup solution you need to run your own tests: not only taking a backup, but also restoring and checksumming it. Also be sure that you know how it works and how it can affect your production workload.

    – Do you guys use it for your clients?
    mydumper is the principal tool for our logical backup solution, so yes, we use it.

    – Are there any known bugs/limitations?
    You can check https://bugs.launchpad.net/mydumper for bugs and limitations.

    – Is it really free?
    Yes, it is!

    November 16, 2015 at 12:26 pm
  • Hung

    Hi Max,
    Is this tool an online dumper? (like InnoExtraBackup)

    November 16, 2015 at 9:21 pm
  • Narendra

    Hi Max,

    I got the below error while taking a backup using mydumper. I tried to resolve it but had no luck. Please help me with this:

    /home/mysql/mydumper-0.9.1/mydumper -S /mysql/mysql.sock -u xxxx -p'xxxx' -B dataabses -o /home/backups/

    option parsing failed: Error parsing option -o, try --help

    November 18, 2015 at 5:13 am
  • Max Bubenick

    Hi Narendra,

    You need to add a blank space between the parameter and its value. Also, I would recommend adding the credentials to .my.cnf rather than passing them on the command line.
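
A minimal sketch of the option-file approach, assuming mydumper honors the standard [client] group of .my.cnf as most MySQL clients do (user and password are placeholders):

```ini
# ~/.my.cnf (chmod 600) -- keeps credentials off the command line and out of `ps`
[client]
user=backup
password=s3cret
```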

    November 18, 2015 at 7:18 am
  • Narendra

    Cool. Thanks Max

    November 18, 2015 at 3:04 pm
  • Narendra

    Hi Max,

    Is there any option to back up multiple databases (more than one, but not all) at a time using mydumper?
    /home/mysql/mydumper-0.9.1/mydumper -S /mysql/mysql.sock -u xxxx -p 'xxxx' -B db1 db2 db3 -o /home/backups/
    I used the above to back up three databases, but after executing it I see that it backs up db1 only.

    November 20, 2015 at 2:27 pm
  • Max Bubenick

    Hi Narendra,

    -B is for single-database dumps; for multiple databases you can use the --regex option.


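A hedged sketch of the --regex approach (the database names db1..db3 are hypothetical). The pattern is matched against db.table names, and the trailing literal dot keeps a database like db10 from matching db1:

```shell
# Dump only db1, db2 and db3; anchor each name with an escaped dot
mydumper --regex '^(db1\.|db2\.|db3\.)' -o /home/backups/
```
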
    November 23, 2015 at 8:11 am
  • leafonsword

    What's the difference between --trx-consistency-only and --less-locking?
    It seems that with --less-locking, mydumper will do a short FTWRL, which looks the same as --trx-consistency-only~

    November 24, 2015 at 6:24 am
    • Max Bubenick

      Hi leafonsword,

      There are two differences: consistency and FTWRL time.
      --trx-consistency-only does not care about the consistency of non-transactional tables, and releases the lock right after thread coordination.
      --less-locking instead releases the FTWRL after looping through all databases and locking the non-transactional tables with separate threads.

      So the --less-locking FTWRL is shorter than the default, and --trx-consistency-only is shorter than --less-locking, but all your tables need to be InnoDB (or TokuDB) if you want a fully consistent backup.

      November 24, 2015 at 7:37 am
      • Phil Stracchino

        ….But --less-locking will still yield a consistent backup even if there are non-transactional tables present, correct?

        November 24, 2015 at 11:31 am
  • Max Bubenick

    Correct Phil

    November 24, 2015 at 1:26 pm
  • leafonsword

    Thanks Max! If the default engine for MySQL's system tables is InnoDB or TokuDB, I think --trx-consistency-only will be very helpful~
    I have another question: why does the --use-savepoints option reduce metadata lock holding time?

    November 24, 2015 at 8:35 pm
  • Max Bubenick

    Hi leafonsword,

    To reduce MDL times, you can use savepoints and roll back to them after each table is dumped; this doesn't affect consistency but releases the MDL on the table.

    November 25, 2015 at 9:22 am
  • leafonsword

    Though mydumper is very strong, I think it would be perfect if it added these features:
    1. support mysqldump's --where option
    2. support loading just the schema or just the data when the backup dir contains both
    3. support backing up views
    4. support backing up multiple databases at one time

    November 25, 2015 at 10:43 pm
    • Phil Stracchino

      Leafonsword, your point 3 is already trivially easy to do with mydumper, because it dumps schema and data separately. Just create a temp directory, link the schema or data files for just the schemas or individual tables you want into there, and run myloader in that directory. Shazam! Done.

      Mydumper is the best tool I have ever encountered for doing partial restores of DB data. Nothing else comes close.

      November 26, 2015 at 6:31 pm
      • leafonsword

        If I just want to dump some databases, isn't dumping all databases and then filtering a waste?

        November 26, 2015 at 9:16 pm
        • Phil Stracchino

          Leafonsword, I meant point 2 not point 3. Sorry. Partial restores are easy.

          Max already addressed your other points.

          November 30, 2015 at 8:03 am
          • leafonsword

            Hi Phil, I have used this method before, but I think it's a little tedious; if myloader could control it with an option, it would be more convenient~

            December 1, 2015 at 9:10 pm
  • Max Bubenick

    Hi leafonsword,

    Thanks for your feedback. I'm actively working on mydumper/myloader and adding new features. About your points: I agree with 1 and 2, and depending on priorities we may have those soon.

    Points 3 and 4 are already in mydumper! Views are dumped by default, and for multiple databases you can use --regex as I explained above.

    November 26, 2015 at 7:31 am
    • leafonsword

      Ha, that's right!
      One more thing: could you put mydumper on GitHub? GitHub looks better than Launchpad, and Percona also has its official repository; maybe mydumper could be in a repository there too~

      November 26, 2015 at 8:41 am
  • leafonsword

    After restoring using myloader, does it need to “flush privileges”?

    November 29, 2015 at 10:51 pm
  • Narendra

    Hi Max ,
    Do we have an option to back up an entire DB as a single SQL file, so we can handle the backup files more easily, instead of having per-table schema and data files?

    March 9, 2016 at 10:03 am
    • Phil Stracchino

      Narendra, that would almost certainly require major re-engineering, and would in many ways work in direct opposition to mydumper's multi-threaded dumps and reloads. If you want a single monolithic dump file, why not just use mysqldump, which already works that way? If your concern is easily moving the complete dump around, you could just tar the dump directory.

      March 9, 2016 at 10:46 am
  • Hi, when I use mydumper to back up my DB, an error happens. But on another DB server it works OK.

    [root@localhost backup]# mydumper --user backup --password backup@123 -B mysql -o ./test -c -e

    ** (mydumper:36394): CRITICAL **: Couldn’t write data to a file: No such file or directory

    ** (mydumper:36394): CRITICAL **: Could not write create database for mysql
    Segmentation fault

    [root@localhost backup]# ls test
    metadata.partial mysql.columns_priv.sql.gz mysql.db.sql.gz mysql.event.sql.gz mysql.func.sql.gz mysql-schema-create.sql.gz

    [root@localhost backup]# mydumper -V
    mydumper 0.9.1, built against MySQL 5.6.29-76.2

    [root@localhost backup]# mysql -V
    mysql Ver 14.14 Distrib 5.6.28-76.1, for Linux (x86_64) using 6.2

    [root@localhost backup]# cat /etc/centos-release
    CentOS Linux release 7.2.1511 (Core)

    April 29, 2016 at 8:54 am
  • Rick Jans

    Can the output of mydumper be piped into the input of another program? We want to encrypt the resulting text files, but would like to do that on the fly, such as piping mydumper output to the input of openssl.

    May 10, 2016 at 11:52 am
  • rsterbal

    Do you offer a webinar on how to use the product?

    October 21, 2016 at 11:07 am
  • Phil Stracchino

    Just play with it a little and you should find it straightforward enough that you don’t need a webinar to understand it. It’s really pretty simple. Because it’s a lot smarter than mysqldump, it does not need mysqldump’s huge number of command-line options.

    October 21, 2016 at 11:21 am
  • Phil Stracchino

    --statement-size is how mydumper handles max_allowed_packet. Is there a technical reason why mydumper cannot simply auto-set --statement-size to accommodate global.max_allowed_packet upon initial connection to the target server?

    November 18, 2016 at 12:23 pm
    • Max Bubenick

      Hi Phil,

      Not really, but mostly I would like to keep it safe. MySQL's default value is 1M, and if I increased it I would expect people to start getting ER_NET_PACKET_TOO_LARGE when restoring. Maybe I can keep the default but add an option to auto-detect it.

      November 18, 2016 at 1:12 pm
      • Phil Stracchino

        That’s why I was suggesting auto-detect, though I can see how auto-detection might cause a problem if a database is dumped from a server with, say, max_allowed_packet = 16M and then loaded onto one with max_allowed_packet = 1M. However, what will happen *now* in that case? Will it currently fail anyway?

        November 18, 2016 at 1:36 pm
        • Max Bubenick

          Not sure what you mean by *now*, but I would guess you mean if we made that change :). In that case, you would only be able to restore to instances where max_allowed_packet is equal to or greater than that of your backup source.

          As it stands, with --statement-size defaulting to 1M, even if the source server allows 16M the generated statements are 1M, which won't fail even for default MySQL installations.

          On the other hand, this could cause issues if you have large fields, like blobs, bigger than --statement-size, but I haven't seen that often.

          November 18, 2016 at 2:02 pm
          • Phil Stracchino

            Actually, the reason this came up was a customer who is getting numerous 'Row bigger than statement-size' warnings because their database has large numbers of huge XML strings. (Which they're trying to strip out, but…)

            So does mydumper's --statement-size option limit the WRITE statement size, then, and have nothing to do with actually retrieving the data? That is not clear from the documentation. Then perhaps auto-setting --statement-size isn't actually the best way to handle this warning.

            If I understand your explanation correctly, the warning is actually harmless. Is that correct? If so, max_allowed_packet auto-detection probably isn't needed at all, but perhaps an option to suppress the warning would be useful, just to cut down on log noise.

            November 18, 2016 at 2:13 pm
  • Max Bubenick

    You've got it right: the warning is harmless, as the backup is still consistent. To avoid the warnings you can set the verbosity to error level (1) only.

    November 18, 2016 at 2:20 pm
    • Phil Stracchino

      OK, then I’m going to just have my wrapper script manage that and forget the entire max_allowed_packet detection idea altogether.


      November 18, 2016 at 2:37 pm
      • Phil Stracchino

        (I note, by the way, that this is yet another detail in which mydumper is smarter and more advanced than mysqldump.)

        November 18, 2016 at 3:06 pm
  • Phil Stracchino

    This is a feature request.

    I’ve run into a problem using mydumper on XtraDB Cluster. The environment in question has four XtraDB nodes, the fourth a non-voting node (weight 0) which runs as a dedicated backup node, and is set to DESYNCED during DB dumps. This node is fairly often being evicted from the cluster during the backup, which causes it to declare itself non-primary and stop answering queries, which causes the backup to fail.

    Do you suppose it would be possible for mydumper to detect when it is connected to an XtraDB Cluster node (by checking to see whether the status variable wsrep_cluster_status exists), and if so, pause/suspend if wsrep_cluster_status becomes other than ‘Primary’?

    November 21, 2016 at 12:05 pm
  • Phil Stracchino

    I have another feature request, to address a current production problem.

    We have a client who has tens of thousands of schemas and a very high churn rate on schemas. MOST nights, their DB backup reports errors because they have dropped one or more schemas in between when the backup run started and when mydumper actually got to that schema. This produces false failure reports in monitoring.

    Something like the following combination of features would be extremely useful:
    — Add a command-line option to pass to mydumper to tell it that it is OK for schemas to be missing by the time it gets to them.
    — If this option is in use, each time mydumper begins to dump tables from a new schema, check that the schema still exists, and if not, drop that entire schema and its contents from the list of objects to be dumped.
    — Change reporting of the ‘table or schema does not exist’ case by default from an error to a warning (because it’s not an error, mydumper didn’t fail; it’s just that the object had been removed from the DB by the time mydumper got to it).
    — If the “missing is OK” option is in use, “missing table/schema” is further downgraded to an informational message instead of a warning

    Of course, if you can suggest a better way to handle this…? Right now, the only way I can come up with to prevent the false dump-failure alerts is to suspend replication to the dedicated backup slave while backups are running.

    December 15, 2016 at 11:52 am
  • Kumar

    Hi Max,

    I have installed 0.9.1, but it is showing built against mysql 5.1.73

    /usr/local/bin/mydumper -V;
    mydumper 0.9.1, built against MySQL 5.1.73

    And also, when I compare the mentioned features with 0.9.0, I do not see any difference between the two versions; they look the same.

    February 3, 2017 at 11:53 am
  • Phil Stracchino

    There is a mydumper-0.9.1 RPM built for RHEL6/Centos6/OEL6 located at http://repo.enetres.net/x86_64/mydumper-0.9.1-1.el6.x86_64.rpm. However, it is ONLY for RHEL6 compatibles. Does anyone know of a RHEL7-compatible RPM package for mydumper 0.9.1?

    March 27, 2017 at 10:27 am
  • Phil Stracchino

    Max, I just discovered a bug in myloader.

    Capsule summary: Filename matching when using myloader -s is incorrect, causing false matches.

    Suppose you have schemas schema1 through schema9, and you run:

    myloader -eod dumps/ -s schema1

    The whole of schema1 will be restored, leaving the other schemas untouched.

    Now suppose you run the same command, but you have schemas schema1 through schema13. myloader will restore schema1, BUT will then also go on and try to load schema10, schema11, schema12, and schema13 into schema1.

    I presume the problem here is that myloader SHOULD be matching files against \.*, but is actually incorrectly matching against * (which will also match files for any schema whose name begins with ).

    June 28, 2017 at 2:06 pm
  • Phil Stracchino

    argh. Let me try that last sentence again using different delimiters:

    I presume the problem here is that myloader SHOULD be matching files against [schemaname]\.*, but is actually incorrectly matching against [schemaname]* (which will also match files for any schema whose name begins with [schemaname]).
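
The prefix-matching difference is easy to demonstrate with shell globs over hypothetical dump file names:

```shell
# myloader -s schema1 should only pick up schema1.* files;
# a bare schema1* prefix also catches schema10..schema13.
cd "$(mktemp -d)"
touch schema1.t.sql schema10.t.sql schema13.t.sql
ls schema1*   # lists all three files: the buggy behavior
ls schema1.*  # lists only schema1.t.sql: the intended behavior
```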

    June 28, 2017 at 2:08 pm
  • Phil Stracchino

    I just completed a large (several hundred gigabytes) data load into an AWS Aurora RDS instance. The data was all apparently, as far as I can see, loaded without errors. However, at completion, myloader emitted the following error messages:

    bp-bastion.CUSTOMER.com:root:/mnt/decomp:18 # time myloader -oed db2.CUSTOMER.com-20180213/

    ** (myloader:1732): CRITICAL **: Error switching to database CUSTOMER whilst restoring table (null)

    ** (myloader:1732): CRITICAL **: Error switching to database CUSTOMER whilst restoring table userLogins

    ** (myloader:1732): CRITICAL **: Error switching to database CUSTOMER whilst restoring table DC_numbers

    ** (myloader:1732): CRITICAL **: Error switching to database CUSTOMER whilst restoring table assignments_periods

    ** (myloader:1732): CRITICAL **: Error switching to database CUSTOMER whilst restoring table teachers

    ** (myloader:1732): CRITICAL **: Error switching to database CUSTOMER whilst restoring table user

    ** (myloader:1732): CRITICAL **: Error switching to database CUSTOMER whilst restoring table students

    ** (myloader:1732): CRITICAL **: Error switching to database CUSTOMER whilst restoring table students_activities_periods

    ** (myloader:1732): CRITICAL **: Error switching to database CUSTOMER whilst restoring table periods_students

    real 189m32.755s
    user 18m41.804s
    sys 1m30.280s

    The database exists and as far as we can tell, all tables were fully loaded.

    Do you have any explanation of what these errors mean? The ‘table (null)’ message is particularly troubling.

    February 14, 2018 at 2:36 pm
  • Max Bubenick

    Hi Phil, thank you for your feedback!

    Please feel free to open bugs on github https://github.com/maxbube/mydumper/issues.

    About those specific errors: that message is returned when myloader issues a USE db; in this case, USE CUSTOMER. I will need to figure out why that could be failing.

    The (null) is odd; I guess it could be related to a table name that is not being parsed correctly when splitting db.table.sql. If you could report a bug with at least the table names, that would be great, and I can take a look.

    February 19, 2018 at 1:28 pm
    • Phil Stracchino

      Well, database name there was changed to protect customer confidentiality, but all table names are left unchanged.

      I’ll file a bug as requested.

      February 19, 2018 at 4:24 pm
    • Phil Stracchino

      Filed #105

      February 19, 2018 at 4:28 pm

Comments are closed.