
Just how useful are binary logs for incremental backups?

July 21, 2009
Author
Morgan Tocker

We’ve written about replication slaves lagging behind masters before, but another side effect of the binary log being serialized is that it limits how effective the binary log is for incremental backup.  Let me make up some numbers for the purposes of this example:

  • We have 2 servers in a master-slave topology.
  • The database size is 100 GB (same tables on each).
  • The slave machine barely keeps up with the master (at 90% capacity during peak, 75% during offpeak).
  • The peak window is 12 hours, the offpeak window is 12 hours.

Provided that the backup method was raw data files, it shouldn’t take much more than 30 minutes to restore 100GB (at 50MB/s), but replaying one day of binary logs would take almost 20 hours more ((12 * 0.9) + (12 * 0.75) = 19h48m).
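
The restore and replay figures can be checked with a short Python sketch. The function names are hypothetical, and the sketch assumes 1 GB = 1024 MB and that replaying a window of binary logs takes the same time the slave spent busy applying them originally.

```python
def restore_minutes(size_gb, throughput_mb_s):
    """Minutes needed to copy raw data files back (assumes 1 GB = 1024 MB)."""
    return size_gb * 1024 / throughput_mb_s / 60


def replay_hours(windows):
    """Hours needed to replay one day of binary logs.

    windows is a list of (duration_hours, slave_utilization) pairs, where
    utilization is the fraction of the window the slave spent applying events.
    """
    return sum(hours * utilization for hours, utilization in windows)


# Numbers made up in the example: 100 GB at 50 MB/s,
# 12 h peak at 90% and 12 h offpeak at 75%.
restore = restore_minutes(100, 50)
replay = replay_hours([(12, 0.90), (12, 0.75)])
print(f"restore: about {restore:.0f} min, replay: about {replay:.1f} h")
```

The restore is bounded by disk throughput, while the replay is bounded by how busy the single-threaded slave already was, which is why the second number dominates.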

If you wanted to do something like set up a new slave from a 24-hour-old backup, and apply the binary logs continuously until it catches up, you would be waiting almost 5 days (each day has 24h - 19h48m = 4h12m of “free” capacity, and 19h48m / 4h12m = 4.7 days).
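
A minimal sketch of the catch-up estimate, in Python. The function name and parameters are hypothetical; it assumes the backlog is measured in hours of replay work and that only the slave's spare capacity each day goes toward clearing it.

```python
def catch_up_days(backlog_hours, daily_replay_hours, day_hours=24.0):
    """Days until a new slave catches up with the master.

    backlog_hours: replay work already owed when the slave starts.
    daily_replay_hours: replay work generated by each new day of traffic.
    """
    spare_hours = day_hours - daily_replay_hours
    if spare_hours <= 0:
        # With no spare capacity the backlog never shrinks.
        raise ValueError("slave has no spare capacity; it will never catch up")
    return backlog_hours / spare_hours


# One day of backlog (19.8 h of replay), with 19.8 h of new work arriving daily.
days = catch_up_days(19.8, 19.8)
print(f"catch-up time: about {days:.1f} days")
```

The estimate is very sensitive to the spare capacity in the denominator: as the slave's daily replay time approaches 24 hours, the catch-up time grows without bound.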

So what are the solutions?
If you are using all InnoDB tables, an XtraBackup incremental backup should be much faster than using binary logs.  You can understand Vadim’s excitement when he announced this feature a few months ago.

If you are using multiple storage engines, then your options are to either try to time-delay slaves (to keep them close to up to date), or hope that it’s not often that you need to restore!  Eventually this problem should be lessened by a MySQL Server feature: parallel execution on slaves.  Let’s hope that it gets enough testers to make it into a new release very quickly.

6 Comments
Peter Zaitsev
16 years ago

Morgan,

I would mention that if you’re running your slave at 90% capacity during peak load, you’re asking for big trouble.

I would highly recommend keeping the slave thread busy no more than 30-50% of the time – this makes sure you have at least some time to take action as load or data size grows. With 90% you can wake up one morning and see things never catch up 🙂

Daniellek
16 years ago

Hardcore (but working) solution:
Setup: Master+2 slaves (one slave is “production”, and the other is “backup”)

mysql-backup stop
tar czvf ….
mysql-backup start

Shlomi Noach
16 years ago

Just to mention that XtraBackup incremental backups do not back up MyISAM/ARCHIVE tables (yet), and so provide incremental backup only for InnoDB tables.

Justin Swanhart
16 years ago

I’m curious why the implementation notes in that workload preclude UPDATE. UPDATE can be logically decomposed into DELETE followed by INSERT. This doesn’t inhibit transaction serialization.

Robert Hodges
16 years ago

Hi Morgan,

Have you tried Tungsten Replicator? It has built-in time-delay replication. Also, it now has built-in backup/restore support (checked in; will be in an official build shortly). I’m going to look at putting in support for invoking XtraBackup as it seems pretty good, especially the incremental capacity.

Finally, as Peter mentioned, running replication anywhere near full capacity is asking for trouble. One thing we are planning for later in the year in Tungsten is to split events into different “channels” for replication in parallel. Parallel replication seems to be particularly helpful when you have events that are truly slow, such as DDL, or for replication of data that are already sharded.
