
The mysqlpump Utility

April 17, 2017 | Posted In: Insight for DBAs, MySQL, Percona Toolkit


In this blog, we’ll look at the mysqlpump utility.

mysqlpump is a utility that performs logical backups (which means backing up your data as SQL statements instead of as a raw copy of the data files). It was added in MySQL Server version 5.7.8, and can be used to dump a database or a set of databases to a file that can then be loaded on another SQL server (not necessarily a MySQL server).

Its usage is similar to mysqldump, but it includes a new set of features. Many of the options are the same, but it was written from scratch to avoid being limited to mysqldump compatibility.
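To give a feel for that similarity, here is a minimal sketch of a dump-and-restore round trip (the host names, user names and database names below are placeholders):

```shell
# Dump two databases to a file; --databases treats all name arguments as databases:
mysqlpump --host=127.0.0.1 --user=backup_user -p \
    --databases sakila world > dump.sql

# Load the dump on another server with the regular mysql client:
mysql --host=other-host --user=restore_user -p < dump.sql
```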

The Main Features Include:

  • To make the dump process faster, it allows parallel processing of databases and objects within databases.
  • There are more options to customize your dumps and choose which databases and objects to dump (tables, stored programs, user accounts), using the --include-* and  --exclude-* parameters.
  • User accounts can now be dumped as CREATE USER and GRANT statements, instead of as inserts directly into the MySQL system database.
  • Information between the client and the server can be compressed using the --compress option. This feature is very useful for remote backups, as it saves bandwidth and transfer time. You can also compress the output file using --compress-output, which supports ZLIB and LZ4 compression algorithms.
  • It has an estimated progress indicator. This is really useful to check the current status of the dump process. You can see the total amount of rows dumped and the number of databases completed. It also reports an estimate of the total time to complete the dump.
  • Creation of secondary indexes for InnoDB tables happens after data load for shorter load times.
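As a quick sketch of the compression options mentioned above (the file names are made up; MySQL 5.7 ships lz4_decompress and zlib_decompress utilities for unpacking the resulting files):

```shell
# Compress client/server traffic and the output file itself with LZ4:
mysqlpump --compress --compress-output=LZ4 > dump.lz4

# Decompress before restoring (lz4_decompress takes input and output file names):
lz4_decompress dump.lz4 dump.sql
```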

Exclude/Include:

This feature provides more control over customizing your dumps and lets you filter the data you need. You can be more selective with the data you want to dump (databases, tables, triggers, events, routines, users), saving file size, processing time and transfer time when copying/moving the file to another host.

Keep in mind that there are some options that are mutually exclusive: e.g., if you use the --all-databases option, the --exclude-databases parameter won’t take effect. By default, mysqlpump will not dump the following databases unless you specify them using the --include-databases option: INFORMATION_SCHEMA, performance_schema, ndbinfo and sys.

Values for these options need to be declared by comma-separated listing. Using a “%” as a value for any of the exclude/include options acts as a wildcard. For example, you can dump all databases starting with “t” and “p” by adding the option --include-databases=t%,p%  to the command line.
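On the command line, that wildcard example would look like this (the output file name is arbitrary):

```shell
# Dump every database whose name starts with "t" or "p":
mysqlpump --include-databases=t%,p% > t_and_p_databases.sql
```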

For users, routines, triggers and events, mysqlpump has --include-* and --exclude-* options with similar usage. Some specific notes:

  • Triggers are dumped by default, but you can also filter them using the --include-triggers/ --exclude-triggers options
  • Routines and events are not dumped by default, and need to be specified in the command line with --routines and --events, or the corresponding --include and  --exclude options
  • Keep in mind that if a stored procedure and a function have the same name, then include/exclude applies to both
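A couple of sketches of these filters in use (the database and routine names are hypothetical):

```shell
# Dump mydb but leave all of its triggers out of the dump:
mysqlpump mydb --exclude-triggers=% > mydb_no_triggers.sql

# Restrict which stored routines make it into the dump, using a wildcard:
mysqlpump mydb --include-routines=cleanup_% > mydb_cleanup_routines.sql
```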

Parallel Processing:

This feature allows you to process several databases, and tables within the databases, in parallel. By default, mysqlpump uses one processing queue with two threads. You can increase the number of threads for this default queue with --default-parallelism. Unless you create additional queues, all the databases and/or tables you elect to dump go through the default queue.

To create additional queues you can use the --parallel-schemas option, which takes two parameters: the number of threads for the queue and the subset of databases this queue processes. Database names should be given as a comma-separated list. As an example, you could have schemas c, d, e, f, g and h processed by the default queue (which uses three threads), while tables from schemas a and b are processed by a separate queue that uses four threads.
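Reconstructing that example as a command (three threads in the default queue, a separate four-thread queue for schemas a and b; the output file name is arbitrary):

```shell
# --parallel-schemas takes [N:]db_list; a,b get their own 4-thread queue,
# everything else goes through the 3-thread default queue:
mysqlpump --default-parallelism=3 --parallel-schemas=4:a,b \
    --databases a b c d e f g h > dump.sql
```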

User Accounts:

User accounts can be dumped using this tool. Below is a comparison of the Percona Toolkit tool pt-show-grants versus mysqlpump, to check their differences.

By default, mysqlpump doesn’t dump user account definitions (even while dumping the MySQL database). To include user accounts on the dump, you must specify the --users option.

Here’s an example on how use mysqlpump to get only user accounts dumped to a file:
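One way to do this is to exclude every database so that only the account definitions are written (the output file name is arbitrary):

```shell
# Dump only user accounts, skipping all schema objects and data:
mysqlpump --exclude-databases=% --users > user_accounts.sql
```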

As you can see above, the tool makes sure the session uses known values for the time zone and character sets. This won’t affect users; it’s part of the dump process to ensure correctness while restoring on the destination.

Comparing it with pt-show-grants from Percona Toolkit, we can see that mysqlpump dumps the CREATE USER information as well. The statements produced by mysqlpump are the right ones to run to recreate users (and should be the preferred method), especially because of the NO_AUTO_CREATE_USER sql_mode: if it is enabled, the GRANT-only output of pt-show-grants is useless for recreating missing users.

Here’s an example of pt-show-grants usage:
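A minimal invocation would look something like this (connection values are placeholders; --ask-pass prompts for the password interactively):

```shell
# Print all grants as executable GRANT statements:
pt-show-grants --host=localhost --user=root --ask-pass
```

Because the output is plain SQL, it is commonly piped straight into a mysql client connected to another server.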

Some Miscellaneous Notes:

  • One of the differences from mysqldump is that mysqlpump adds CREATE DATABASE statements to the dump by default, unless that is suppressed with the --no-create-db option.
    • There’s an important, closely related difference in the dump process: mysqlpump includes the database name in the CREATE TABLE statements. This causes a problem when trying to use the tool to restore a duplicate under a different database name.
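You can see this by inspecting the dump (the database and table names here are hypothetical):

```shell
mysqlpump mydb > mydb.sql
grep 'CREATE TABLE' mydb.sql
# The table names come out qualified with the database, e.g.:
#   CREATE TABLE `mydb`.`t1` ( ... )
# so the file can't simply be loaded into a differently named database.
```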

5 Comments

    • It’s not supported for versions 5.6 and 5.5; it fails with the error below:

      mysqlpump: [ERROR] (2) Error during handling options
      Dump process encountered error and will not continue.
      Server version is not compatible

      Kabilesh

  • Small correction:
    mysqlpump dumps routines and events by default. In this post it is mentioned that “Routines and events are not dumped by default”

  • Interesting, but it doesn’t look like it does a good job with consistent snapshot locking. Or that it saves the binlog/GTID positions in order to make the backups useful for creating new slaves. This is a feature of mydumper[0] that I depend on.

    This is a common pattern I have used for typical master/slave clusters:
    1. Use mydumper to backup version A slave server
    2. Use myloader to restore dataset to version B server.
    3. Attach version B server to version A master as a slave.
    4. Create binary (xtrabackup) copies of version B slave.
    5. Remove all version A slaves
    6. Failover version A master to version B slave.

    This allows for a very clean InnoDB binary dataset upgrade across major versions.

    [0]: https://github.com/maxbube/mydumper/

  • Thank you, really interesting utility!
    The only thing that seems weird to me is why mysqlpump doesn’t use the same params as mysqldump.
    I.e., why does it use --exclude-tables instead of --ignore-table, for example?
