Periodic backups are a given in database life. They come in different flavors: binary backups (Percona XtraBackup), binlog backups, disk snapshots (LVM, EBS, etc.), and the classic logical backups, the kind you can take with tools like mysqldump, mydumper, or mysqlpump. Each has its own purpose, MTTR, retention policy, and so on.
Another given is that taking backups gets slower as your datadir grows: more data stored means more data to read and back up. But data is not the only thing that grows; the number of MySQL instances in your environment usually increases as well. So why not take advantage of those extra instances to make logical backups faster?
The idea is simple: instead of taking the whole backup from a single server, use all the servers available. This proof of concept focuses only on the replicas in a Master/Slave(s) topology. One could use the master too, but in this case I've decided to leave it alone to avoid adding the backup overhead there.
On a Master/3-Slaves topology:

With a small datadir of around 64GB of data (excluding index size) and 300 tables (schema "sb"):
```
+--------------+--------+--------+-----------+----------+-----------+----------+
| TABLE_SCHEMA | ENGINE | TABLES | ROWS      | DATA (M) | INDEX (M) | TOTAL(M) |
+--------------+--------+--------+-----------+----------+-----------+----------+
| meta         | InnoDB |      1 |         0 |     0.01 |      0.00 |     0.01 |
| percona      | InnoDB |      1 |         2 |     0.01 |      0.01 |     0.03 |
| sb           | InnoDB |    300 | 295924962 | 63906.82 |   4654.68 | 68561.51 |
| sys          | InnoDB |      1 |         6 |     0.01 |      0.00 |     0.01 |
+--------------+--------+--------+-----------+----------+-----------+----------+
```
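A summary like this can be produced from `information_schema.TABLES`. The query below is a sketch of such a query, not necessarily the exact one used for the table above; it is stored in a variable so it can be piped into the `mysql` client on any host:

```shell
# Sketch: per-schema size summary, similar to the table above.
# Run it with, e.g.:  echo "$QUERY" | mysql -hmysql2 -t
QUERY=$(cat <<'SQL'
SELECT TABLE_SCHEMA, ENGINE, COUNT(*) AS TABLES,
       SUM(TABLE_ROWS) AS `ROWS`,
       ROUND(SUM(DATA_LENGTH)/1024/1024, 2)                AS `DATA (M)`,
       ROUND(SUM(INDEX_LENGTH)/1024/1024, 2)               AS `INDEX (M)`,
       ROUND(SUM(DATA_LENGTH + INDEX_LENGTH)/1024/1024, 2) AS `TOTAL(M)`
FROM information_schema.TABLES
WHERE TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema')
GROUP BY TABLE_SCHEMA, ENGINE;
SQL
)
echo "$QUERY"
```

Note that `TABLE_ROWS` for InnoDB is an estimate, so the row counts are approximate.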
Using the 3 replicas, the distributed logical backup with mysqldump took 6 minutes, 13 seconds:
```
[root@mysql1 ~]# ls -lh /data/backups/20200101/
total 56G
-rw-r--r--. 1 root root 19G Jan  1 14:37 mysql2.sql
-rw-r--r--. 1 root root 19G Jan  1 14:37 mysql3.sql
-rw-r--r--. 1 root root 19G Jan  1 14:37 mysql4.sql
[root@mysql1 ~]# stat /data/backups/20200101/mysql2.sql
  File: '/data/backups/20200101/mysql2.sql'
  Size: 19989576285   Blocks: 39042144   IO Block: 4096   regular file
Device: 10300h/66304d   Inode: 54096034   Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2020-01-01 14:31:34.948124516 +0000
Modify: 2020-01-01 14:37:41.297640837 +0000
Change: 2020-01-01 14:37:41.297640837 +0000
 Birth: -
```
The same backup type on a single replica took 11 minutes, 59 seconds:
```
[root@mysql1 ~]# time mysqldump -hmysql2 --single-transaction --lock-for-backup sb > /data/backup.sql

real    11m58.816s
user    9m48.871s
sys     2m6.492s
[root@mysql1 ~]# ls -lh /data/backup.sql
-rw-r--r--. 1 root root 56G Jan  1 14:52 /data/backup.sql
```
In other words:
The distributed one was 48% faster!
And this is a fairly small dataset. Worth a shot. So, how does it work?
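The 48% figure checks out from the two wall-clock times:

```shell
# 11m59s for the single-replica run vs 6m13s for the distributed one
single=$((11*60 + 59))                                # 719 s
dist=$((6*60 + 13))                                   # 373 s
echo "$(( (single - dist) * 100 / single ))% faster"  # prints: 48% faster
```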
The logic is simple and can be divided into stages.
The full script can be found here:
https://github.com/nethalo/parallel-mysql-backup/blob/master/dist_backup.sh
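Reading the script's log (shown further down), the flow condenses into four stages. Here is a sketch of those stages, not the script itself: the `run_sql` helper is a dry-run stand-in that only prints the statement it would send (swap its body for `mysql -h"$1" -e "$2"` to actually execute), and the hosts and binlog coordinates are illustrative.

```shell
# Dry-run sketch of the backup flow; see dist_backup.sh for the real thing.
MASTER=mysql1
REPLICAS="mysql2 mysql3 mysql4"

run_sql() { echo "[$1] $2"; }   # stand-in for: mysql -h"$1" -e "$2"

# 1. Freeze the binlog on the master (Percona Server's backup locks),
#    so every replica can converge on one consistent position.
run_sql "$MASTER" "LOCK BINLOG FOR BACKUP;"

# 2. Tell every replica to stop applying events at that same position.
COORDS="MASTER_LOG_FILE = 'mysql-bin.000358', MASTER_LOG_POS = 895419795"
for h in $REPLICAS; do
  run_sql "$h" "STOP SLAVE; START SLAVE UNTIL $COORDS;"
done

# 3. Dump a disjoint chunk of tables from each replica, in parallel.
#    (The real script derives each host's table list from the
#    percona.metabackups helper table.)
for h in $REPLICAS; do
  echo "mysqldump -h$h --single-transaction sb <tables for $h> > /data/backups/$h.sql &"
done

# 4. Release the master and resume replication everywhere.
run_sql "$MASTER" "UNLOCK BINLOG;"
for h in $REPLICAS; do run_sql "$h" "START SLAVE;"; done
```

Because all replicas stop at the same master coordinates before dumping, the per-host dumps together form a single consistent snapshot.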
It's worth noting that the script keeps its own log describing every step. It looks like this:
```
[200101-16:01:19] [OK] Found 'mysql' bin
[200101-16:01:19] [Info] SHOW SLAVE HOSTS executed
[200101-16:01:19] [Info] Count tables OK
[200101-16:01:19] [Info] table list gathered
[200101-16:01:19] [Info] CREATE DATABASE IF NOT EXISTS percona
[200101-16:01:19] [Info] CREATE TABLE IF NOT EXISTS percona.metabackups
[200101-16:01:19] [Info] TRUNCATE TABLE percona.metabackups
[200101-16:01:19] [Info] Executed INSERT INTO percona.metabackups (host,chunkstart) VALUES('mysql3',0)
[200101-16:01:19] [Info] Executed INSERT INTO percona.metabackups (host,chunkstart) VALUES('mysql4',100)
[200101-16:01:19] [Info] Executed INSERT INTO percona.metabackups (host,chunkstart) VALUES('mysql2',200)
[200101-16:01:19] [Info] lock binlog for backup set
[200101-16:01:19] [Info] slave status position on mysql3
[200101-16:01:19] [Info] slave status file on mysql3
[200101-16:01:19] [Info] slave status position on mysql4
[200101-16:01:19] [Info] slave status file on mysql4
[200101-16:01:19] [Info] slave status position on mysql2
[200101-16:01:19] [Info] slave status file on mysql2
[200101-16:01:19] [Info] set STOP SLAVE; START SLAVE UNTIL MASTER_LOG_FILE = 'mysql-bin.000358', MASTER_LOG_POS = 895419795 on mysql3
[200101-16:01:20] [Info] set STOP SLAVE; START SLAVE UNTIL MASTER_LOG_FILE = 'mysql-bin.000358', MASTER_LOG_POS = 895419795 on mysql4
[200101-16:01:20] [Info] set STOP SLAVE; START SLAVE UNTIL MASTER_LOG_FILE = 'mysql-bin.000358', MASTER_LOG_POS = 895419795 on mysql2
[200101-16:01:20] [Info] Created /data/backups/20200101/ directory
[200101-16:01:20] [Info] Limit chunk OK
[200101-16:01:20] [Info] Tables list for mysql3 OK
[200101-16:01:20] [OK] Dumping mysql3
[200101-16:01:20] [Info] Limit chunk OK
[200101-16:01:20] [Info] Tables list for mysql4 OK
[200101-16:01:20] [OK] Dumping mysql4
[200101-16:01:20] [Info] Limit chunk OK
[200101-16:01:20] [Info] Tables list for mysql2 OK
[200101-16:01:20] [OK] Dumping mysql2
[200101-16:01:20] [Info] UNLOCK BINLOG executed
[200101-16:01:20] [Info] set start slave on mysql3
[200101-16:01:20] [Info] set start slave on mysql4
[200101-16:01:20] [Info] set start slave on mysql2
```
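The `chunkstart` values in the log (0, 100, 200) show how the 300 tables are split: each of the 3 replicas gets a contiguous chunk of 100 tables. A minimal sketch of that arithmetic (the function names here are illustrative, not taken from the script):

```shell
# How many tables each replica dumps, and at which offset its chunk starts.
chunk_size() {                       # total tables / number of replicas
  echo $(( $1 / $2 ))
}
chunk_start() {                      # replica index * chunk size
  echo $(( $1 * $2 ))
}

size=$(chunk_size 300 3)             # 100 tables per replica
for i in 0 1 2; do
  chunk_start "$i" "$size"           # prints 0, 100, 200
done
```

Each offset then becomes a `LIMIT`-style window into the ordered table list, so the per-host dumps never overlap.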
Some basic requirements:
Interesting or not?
Since this is a proof of concept, it lacks features that will eventually arrive (if it becomes a more mature tool), like:
Let us know in the comments section!