Comparing S3 Streaming Tools with Percona XtraBackup

Backups can be made over the network in two ways: save the backup to disk and then transfer it, or stream it directly without saving it locally. Both approaches have their strong and weak points. The second one in particular depends heavily on upload speed, which directly determines the backup time. Chunk size and the number of upload threads also influence it.

Percona XtraBackup 2.4.14 gained S3 streaming, which is the ability to upload backups directly to S3-compatible storage without saving them locally first. We developed this feature to improve backup upload speeds in Percona Operator for XtraDB Cluster.

There are many implementations of S3-compatible storage: AWS S3, Google Cloud Storage, Digital Ocean Spaces, Alibaba Cloud OSS, MinIO, and Wasabi.

We measured the speed of AWS CLI, gsutil, MinIO client, rclone, gof3r, and the xbcloud tool (part of Percona XtraBackup) on AWS (in single-region and multi-region setups) and on Google Cloud. XtraBackup was compared in two variants: the default configuration, and one with a tuned chunk size and number of upload threads.

Here are the results.

AWS (Same Region)

The backup data was streamed from an AWS EC2 instance to AWS S3, both in the us-east-1 region.

| tool | settings | CPU | max mem | speed | speed comparison |
|---|---|---|---|---|---|
| AWS CLI | default settings | 66% | 149 MB | 130 MiB/s | baseline |
| AWS CLI | 10 MB block, 16 threads | 68% | 169 MB | 141 MiB/s | +8% |
| MinIO client | not changeable | 10% | 679 MB | 59 MiB/s | -55% |
| rclone rcat | not changeable | 102% | 7138 MB | 139 MiB/s | +7% |
| gof3r | default settings | 69% | 252 MB | 97 MiB/s | -25% |
| gof3r | 10 MB block, 16 threads | 77% | 520 MB | 108 MiB/s | -17% |
| xbcloud | default settings | 10% | 96 MB | 25 MiB/s | -81% |
| xbcloud | 10 MB block, 16 threads | 60% | 185 MB | 134 MiB/s | +3% |
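
The speed comparison column is the relative difference from the baseline row. A minimal sketch of that arithmetic, using the speeds from the table above (the function name `relative_to_baseline` is ours and exists only for illustration):

```python
def relative_to_baseline(speed_mib_s: float, baseline_mib_s: float) -> int:
    """Return the speed difference from the baseline, rounded to whole percent."""
    return round((speed_mib_s / baseline_mib_s - 1) * 100)


# Speeds (MiB/s) from the same-region test; AWS CLI with defaults is the baseline.
baseline = 130
speeds = {
    "AWS CLI, tuned": 141,
    "MinIO client": 59,
    "rclone rcat": 139,
    "gof3r, default": 97,
    "gof3r, tuned": 108,
    "xbcloud, default": 25,
    "xbcloud, tuned": 134,
}

for tool, speed in speeds.items():
    print(f"{tool}: {relative_to_baseline(speed, baseline):+d}%")
```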

 

Tip: If you run MySQL on an EC2 instance and keep your backups within a single region, use snapshots instead.

AWS (From US to EU)

The backup data was streamed from AWS EC2 in us-east-1 to AWS S3 in eu-central-1.

| tool | settings | CPU | max mem | speed | speed comparison |
|---|---|---|---|---|---|
| AWS CLI | default settings | 31% | 149 MB | 61 MiB/s | baseline |
| AWS CLI | 10 MB block, 16 threads | 33% | 169 MB | 66 MiB/s | +8% |
| MinIO client | not changeable | 3% | 679 MB | 20 MiB/s | -67% |
| rclone rcat | not changeable | 55% | 9307 MB | 77 MiB/s | +26% |
| gof3r | default settings | 69% | 252 MB | 97 MiB/s | +59% |
| gof3r | 10 MB block, 16 threads | 77% | 520 MB | 108 MiB/s | +77% |
| xbcloud | default settings | 4% | 96 MB | 10 MiB/s | -84% |
| xbcloud | 10 MB block, 16 threads | 59% | 417 MB | 123 MiB/s | +101% |
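
The two rows that use the same tuned settings (10 MB block, 16 threads) can be compared directly. The sketch below does this for xbcloud and gof3r using the numbers from the table above. Pairing these two rows is our own reading of the table, and `percent_change` is an illustrative helper, not part of any tool.

```python
# Tuned rows (10 MB block, 16 threads) from the US-to-EU table.
xbcloud = {"speed_mib_s": 123, "max_mem_mb": 417}
gof3r = {"speed_mib_s": 108, "max_mem_mb": 520}


def percent_change(value: float, reference: float) -> float:
    """Relative change of value against reference, in percent."""
    return (value / reference - 1) * 100


speed_gain = percent_change(xbcloud["speed_mib_s"], gof3r["speed_mib_s"])
mem_change = percent_change(xbcloud["max_mem_mb"], gof3r["max_mem_mb"])

print(f"speed: {speed_gain:+.0f}%, memory: {mem_change:+.0f}%")
# prints "speed: +14%, memory: -20%"
```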

 

Tip: Think about disaster recovery and what you will do when a whole region is unavailable. It makes no sense to back up to the same region; always transfer backups to another region.

Google Cloud (From US to EU)

The backup data was streamed from a Compute Engine instance in us-east1 to Cloud Storage in europe-west3. Interestingly, Google Cloud Storage supports both its native protocol and the S3 (interoperability) API, so Percona XtraBackup can transfer data to Google Cloud Storage directly via the S3 (interoperability) API.

| tool | settings | CPU | max mem | speed | speed comparison |
|---|---|---|---|---|---|
| gsutil | not changeable, native protocol | 8% | 246 MB | 23 MiB/s | baseline |
| rclone rcat | not changeable, native protocol | 6% | 61 MB | 16 MiB/s | -30% |
| xbcloud | default settings, S3 protocol | 3% | 97 MB | 9 MiB/s | -61% |
| xbcloud | 10 MB block, 16 threads, S3 protocol | 50% | 417 MB | 133 MiB/s | +478% |
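
To put these speeds in perspective, here is a rough estimate of how long a backup would take to stream at the measured rates. The 500 GiB backup size is a hypothetical figure chosen for illustration and was not part of the test; the speeds come from the table above, and the estimate assumes the rate stays constant for the whole transfer.

```python
def transfer_minutes(size_gib: float, speed_mib_s: float) -> float:
    """Estimate streaming time in minutes for size_gib of data at speed_mib_s."""
    return size_gib * 1024 / speed_mib_s / 60


BACKUP_GIB = 500  # hypothetical backup size, for illustration only

# Speeds (MiB/s) from the Google Cloud test.
measured = [
    ("gsutil", 23),
    ("xbcloud, default", 9),
    ("xbcloud, tuned", 133),
]

for tool, speed in measured:
    print(f"{tool}: about {transfer_minutes(BACKUP_GIB, speed):.0f} min")
```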

 

Tip: A cloud provider can block your account for many reasons, such as human or automated mistakes, abuse involving inappropriate content after a hack, an expired credit card, sanctions, etc. Think about disaster recovery and what you will do if a cloud provider blocks your account; it may make sense to back up to another cloud provider or on-premises.

Conclusion

With tuned settings, the xbcloud tool (part of Percona XtraBackup) is 2-5 times faster than native cloud vendor tools over long distances, and it is 14% faster and requires 20% less memory than comparable tools with the same settings. xbcloud is also the most reliable tool for transferring backups to S3-compatible storage, for two reasons:

  • It calculates md5 sums during the upload, stores them in a .md5/filename.md5 file, and verifies them on download (gof3r does the same).
  • It sends data in 10 MB chunks and resends a chunk if a network failure occurs.
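
The two points above can be sketched in a few lines. This is only an illustration of the general idea (a checksum computed while uploading, and per-chunk retries), not xbcloud's actual implementation. The `send_chunk` callback, the retry limit, and the `flaky_send` stand-in for the network are all assumptions made for the example.

```python
import hashlib
import io
from typing import BinaryIO, Callable

CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB chunks, as described above
MAX_ATTEMPTS = 3               # assumed retry limit, for illustration


def upload_stream(stream: BinaryIO,
                  send_chunk: Callable[[int, bytes], None],
                  chunk_size: int = CHUNK_SIZE) -> str:
    """Send a stream chunk by chunk, resending failed chunks; return the md5 hex digest."""
    md5 = hashlib.md5()
    index = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        md5.update(chunk)  # the checksum is built up during the upload
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                send_chunk(index, chunk)
                break
            except ConnectionError:
                if attempt == MAX_ATTEMPTS:
                    raise  # give up after the last attempt
        index += 1
    return md5.hexdigest()


# Demo with an in-memory "storage" whose first request fails.
stored = {}
failures_left = [1]


def flaky_send(index: int, chunk: bytes) -> None:
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise ConnectionError("simulated network failure")
    stored[index] = chunk


data = b"x" * 25
digest = upload_stream(io.BytesIO(data), flaky_send, chunk_size=10)

# On download, the reassembled data is checked against the stored checksum.
downloaded = b"".join(stored[i] for i in sorted(stored))
print(hashlib.md5(downloaded).hexdigest() == digest)
```

Only the failed chunk is sent again, so a short network failure costs at most one chunk of repeated traffic rather than a restart of the whole backup.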

PS: Please find instructions on GitHub if you would like to reproduce this article’s results.

Comments (2)

  • Vladislav Vaintroub

    Are you sure AWS S3 CLI settings are not changeable? https://docs.aws.amazon.com/cli/latest/topic/s3-config.html lists some of the parameters, and I think this makes it changeable

    November 26, 2019 at 12:52 pm
    • Mykola Marzhan

      Vladislav,

      I see a +8% performance increase after tuning the AWS CLI.
      We have updated the blog post.
      Thank you a lot!

      November 26, 2019 at 2:36 pm
