Today we are going to present another way to copy a database between servers using our Percona XtraBackup utility. How does it differ from the many methods already out there on the web? Not much, but it is interesting in terms of performance and availability.
We will combine the xbstream utility with the power of pigz and socat to take advantage of multi-processing when multiple processors are available and, at the same time, reduce network bandwidth usage in cases where the network is the bottleneck. Let's explain each component:
socat: This stands for SOcket CAT. It is a utility for data transfer between two addresses.
What makes socat so versatile is the fact that an address can represent a network socket, any file descriptor, a Unix domain datagram or stream socket, TCP and UDP (over both IPv4 and IPv6), SOCKS 4/4a over IPv4/IPv6, SCTP, PTY, datagram and stream sockets, named and unnamed pipes, raw IP sockets, OpenSSL, or, on Linux, even any arbitrary network device.
pigz: This stands for parallel implementation of gzip, and it is a fully functional replacement for gzip that exploits multiple processors and multiple cores to the hilt when compressing data.
xbstream: the streaming format shipped with Percona XtraBackup, which (with parallelism enabled) can process several files in parallel.
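These pieces chain together over plain Unix pipes. As a minimal local sketch of the compress-then-decompress leg (falling back to gzip when pigz is not installed; pigz is command-line compatible with gzip for the flags used here):

```shell
# Pick pigz when available, otherwise fall back to plain gzip
if command -v pigz >/dev/null 2>&1; then GZ=pigz; else GZ=gzip; fi

# Compress on one side of the pipe, decompress on the other,
# just like the sender/receiver pair in the steps below
echo "payload" | "$GZ" -c | "$GZ" -dc
```

In the real setup, socat sits between the two halves of this pipe and carries the compressed stream across the network.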
Packages required: pigz, socat, and, of course, Percona XtraBackup
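Before starting, a quick pre-flight check (a small helper of our own, not part of the packages) confirms the tools are on the PATH of each node:

```shell
# Report which of the required binaries are present on this node
for bin in pigz socat xtrabackup; do
  if command -v "$bin" >/dev/null 2>&1; then
    echo "$bin: found"
  else
    echo "$bin: MISSING"
  fi
done
```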
Servers Involved in the Task
Source: the source database where the backup will be taken (MySQL 5.7 installed on CentOS 7.8)
Target: the destination where the backup will be sent (MySQL 5.7 installed on CentOS 7.8)
Steps
1. Install the required packages in the Source and Target nodes:
Source # yum install -y pigz socat
Target # yum install -y pigz socat
If you haven’t installed Percona XtraBackup (on both servers), please follow the instructions explained in https://www.percona.com/doc/percona-xtrabackup/2.4/index.html#installation
Make sure you have a user with proper privileges to do the backup on Source db:
+---------------------------------------------------------------------------+
| Grants for bkpuser@localhost                                              |
+---------------------------------------------------------------------------+
| GRANT RELOAD, PROCESS, REPLICATION CLIENT ON *.* TO 'bkpuser'@'localhost' |
+---------------------------------------------------------------------------+
2. On Target, execute:
Stop current database service (if any):
Target # systemctl stop mysqld
Delete the datadir contents (assuming MySQL was installed with default settings), and make absolutely sure you’re logged in to the Target server!
Target # rm -rf /var/lib/mysql/*
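Since that rm is destructive, a small guard (our own addition, not in the original steps) can confirm mysqld really is stopped before wiping the datadir:

```shell
# Refuse to clear the datadir while a mysqld process is still running
if pgrep -x mysqld >/dev/null 2>&1; then
  echo "mysqld is still running; do NOT run the rm"
else
  echo "mysqld is stopped; safe to clear the datadir"
fi
```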
Finally, we’re going to execute the command to receive the backup from the Source:
Target # socat -u TCP-LISTEN:4444,reuseaddr stdio | pigz -dc -p 4 - | xbstream -p 4 -x -C /var/lib/mysql
3. On Source, execute the command to send the backup to the Target node:
Source # xtrabackup --defaults-file=/etc/my.cnf --backup --user=bkpuser --password=Bkpuser123! --stream=xbstream --parallel=4 --no-timestamp --target-dir=/tmp | pigz -k -1 -p 4 - | socat -u stdio TCP:Target:4444
You will see an output like this:
xtrabackup: recognized server arguments: --datadir=/var/lib/mysql --server-id=1 --log_bin=/var/lib/mysql/mysql-bin --innodb_log_file_size=200M --innodb_log_files_in_group=2 --open_files_limit=65535 --parallel=4
xtrabackup: recognized client arguments: --backup=1 --user=bkpuser --password=* --stream=xbstream --target-dir=/tmp
200822 11:10:16 version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup' as 'bkpuser' (using password: YES).
200822 11:10:16 version_check Connected to MySQL server
200822 11:10:16 version_check Executing a version check against the server...
200822 11:10:16 version_check Done.
200822 11:10:16 Connecting to MySQL server host: localhost, user: bkpuser, password: set, port: not set, socket: not set
Using server version 5.7.30-log
xtrabackup version 2.4.20 based on MySQL server 5.7.26 Linux (x86_64) (revision id: c8b4056)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql
xtrabackup: open files limit requested 65535, set to 65535
xtrabackup: using the following InnoDB configuration:
xtrabackup:   innodb_data_home_dir = .
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 209715200
InnoDB: Number of pools: 1
200822 11:10:16 >> log scanned up to (6724690490)
xtrabackup: Generating a list of tablespaces
InnoDB: Allocated tablespace ID 2 for mysql/plugin, old maximum was 0
xtrabackup: Starting 4 threads for parallel data files transfer
200822 11:10:16 [01] Streaming ./ibdata1
...etc
4. Once finished with step 3, you will see an output like this on the Target node:
...
MySQL binlog position: filename 'mysql-bin.000091', position '102205647'
200822 11:10:21 [00] Streaming <STDOUT>
200822 11:10:21 [00]        ...done
200822 11:10:21 [00] Streaming <STDOUT>
200822 11:10:21 [00]        ...done
xtrabackup: Transaction log of lsn (4308505553) to (4308505562) was copied.
200822 11:10:21 completed OK!
The receiving command from step 2 will finish as well, so next execute the following commands on the Target node:
Target # xtrabackup --prepare --use-memory=1G --target-dir=/var/lib/mysql/
From the documentation:
After you made a backup with the xtrabackup --backup option, you’ll first need to prepare it in order to restore it. Data files are not point-in-time consistent until they’ve been prepared, because they were copied at different times as the program ran, and they might have been changed while this was happening. If you try to start InnoDB with these data files, it will detect corruption and crash itself to prevent you from running on damaged data. The xtrabackup --prepare step makes the files perfectly consistent at a single instant in time, so you can run InnoDB on them.
Target # chown -R mysql:mysql /var/lib/mysql/*
Target # systemctl start mysqld
And you have cloned a new database!
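The whole transfer boils down to a single pipeline: stream the files, compress, push through a socket, decompress, unpack. That same shape can be rehearsed locally, with tar standing in for xbstream, gzip for pigz, and a plain pipe for the socat link (all paths here are throwaway demo paths):

```shell
# Create a tiny fake datadir to stream (demo paths only)
SRC=$(mktemp -d) && DST=$(mktemp -d)
echo "some rows" > "$SRC/t1.ibd"

# stream | compress | (the network hop would go here) | decompress | unpack
tar -C "$SRC" -cf - . | gzip -c | gzip -dc | tar -C "$DST" -xf -

ls "$DST"
```

Swapping tar for xbstream, gzip for pigz, and inserting the socat sender/receiver pair in the middle gives you exactly the two commands used in steps 2 and 3.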
You can, of course, set up this new database as a replica with these additional steps executed in the Target node:
Look at the content of the file xtrabackup_binlog_info; it will be something like:
Target # cat /var/lib/mysql/xtrabackup_binlog_info
mysql-bin.000091	102205647
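Those two fields are exactly what CHANGE MASTER needs. As a convenience, the statement can be generated straight from the file (a small helper of our own; a temp file with the sample values stands in for the real path here):

```shell
# Sample xtrabackup_binlog_info content: "<binlog file><TAB><position>"
INFO=$(mktemp)
printf 'mysql-bin.000091\t102205647\n' > "$INFO"

# Read the two whitespace-separated fields and emit a ready-to-paste statement
read -r BINLOG_FILE BINLOG_POS < "$INFO"
echo "CHANGE MASTER TO master_log_file='${BINLOG_FILE}', master_log_pos=${BINLOG_POS};"
```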
(We’re assuming the following user/grants already exist in the Source db; if not, create them.)
mysql> show grants for replicator;
+----------------------------------------------------+
| Grants for replicator@%                            |
+----------------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%' |
+----------------------------------------------------+
Connect to the database, and run:
Target # mysql -u root -p
mysql> change master to master_host='Source',master_port=3306,master_user='replicator',master_password='R3pl1c4t10n!',master_log_file='mysql-bin.000091',master_log_pos=102205647;
Query OK, 0 rows affected (0.00 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> pager egrep -i "Master_Host|Master_User|Master_Port|file|behind"
mysql> show slave status\G
                  Master_Host: master
                  Master_User: replicator
                  Master_Port: 3306
              Master_Log_File: mysql-bin.000091
               Relay_Log_File: relay.000001
        Relay_Master_Log_File: mysql-bin.000091
               Until_Log_File:
           Master_SSL_CA_File:
        Seconds_Behind_Master: 0
             Master_Info_File: mysql.slave_master_info
1 row in set (0.00 sec)
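The pager line simply runs the SHOW SLAVE STATUS output through egrep, so the same filter can be tried on any captured text to see what it keeps and drops:

```shell
# Feed a few sample status lines through the same filter used as the pager;
# lines matching none of the patterns (e.g. Skip_Counter) are dropped
printf 'Master_Host: source\nSkip_Counter: 0\nSeconds_Behind_Master: 0\n' \
  | egrep -i "Master_Host|Master_User|Master_Port|file|behind"
```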
And that’s all. Happy streaming!
Comment (1)
Hi Fernando,
A couple of thoughts:
1. Shouldn’t you be demonstrating this with MySQL 8.0 as that’s the current GA version? MySQL 5.7 is fine, but quite old. There are plenty of good reasons to use 8.0 in favour of 5.7.
2. The use of pigz: Provide some performance statistics vs. using xtrabackup in a “traditional fashion” so we can see the gain in speed that’s possible. Sometimes the network is the bottleneck and sometimes it is not, so having an idea of the improvement you can get, even in your test environment, would be useful. Try this for a ~500 GB to 1 TB sized database, as that’s not a trivial size and is large enough for improved performance to make a difference.
3. I’m also curious: Is xtrabackup speed on 5.7 any different from 8.0, either doing a standard cross-server backup or using pigz? We rarely look at comparisons in backup speeds, so it would be interesting to see if there are any noticeable differences for what, from the user perspective, is likely to be the same “dataset”. I wouldn’t expect any, but there might be.
4. How do such backup / clone speeds compare to using something like “native cloning”? That’s new in 8.0 and requires copying between servers of the exact same version, but I remember looking a long time ago and performance (low clone time) was pretty good.
A couple of extra comments on these points might make a good follow up to this post.
Regards,
Simon