Just how useful are binary logs for incremental backups?

July 21, 2009

Author

Morgan Tocker



We’ve written about replication slaves lagging behind masters before, but another side effect of the binary log being serialized is that it limits how useful the log is for incremental backup. Let me make up some numbers for the purposes of this example:

We have two servers in a master-slave topology.

The database size is 100 GB (same tables on each).

The slave machine barely keeps up with the master (at 90% capacity during peak, 75% during off-peak).

The peak window is 12 hours, and the off-peak window is 12 hours.

Provided that the backup consists of raw data files, it shouldn’t take much more than 30 minutes to restore 100 GB (at 50 MB/s), but replaying one day of binary logs would take nearly 20 more hours ((12 * 0.9) + (12 * 0.75) = 19h48m).
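The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the made-up numbers in this example; the function names are hypothetical, and the model assumes that replaying a window of binary logs takes as long as the slave originally spent applying it (window length times utilization).

```python
def restore_minutes(db_size_gb, throughput_mb_per_s):
    """Minutes to copy raw data files back at a sustained throughput."""
    return db_size_gb * 1024 / throughput_mb_per_s / 60


def replay_hours(windows):
    """Hours to replay binary logs.

    `windows` is a list of (window_hours, slave_utilization) pairs.
    Assumption: replay costs the same time the slave was busy originally.
    """
    return sum(hours * utilization for hours, utilization in windows)


restore = restore_minutes(100, 50)               # roughly 34 minutes
replay = replay_hours([(12, 0.90), (12, 0.75)])  # roughly 19.8 hours

print(f"restore: {restore:.0f} min")
print(f"replay:  {replay:.1f} h")
```

Changing the utilization figures shows how quickly the replay time approaches a full day as the slave gets closer to saturation.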

If you wanted to do something like set up a new slave from a 24-hour-old backup and apply the binary logs continuously until it catches up, you would be waiting almost 5 days for that to happen (each day has 24h - 19h48m = 4h12m of “free” capacity, and 19h48m / 4h12m = 4.7 days).
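The catch-up estimate follows the same reasoning: the new slave has to absorb each new day of logs and can only work off the backlog with whatever capacity is left over. A minimal sketch, with a hypothetical function name and the simplifying assumption that load is the same every day:

```python
def catch_up_days(replay_hours_per_day, backlog_days=1.0):
    """Days until a slave restored from an old backup catches up.

    replay_hours_per_day: hours needed to replay one day of binary logs.
    backlog_days: age of the backup in days when replay starts.
    Assumption: every day produces the same amount of replay work.
    """
    spare_hours = 24 - replay_hours_per_day
    if spare_hours <= 0:
        raise ValueError("slave has no spare capacity and never catches up")
    backlog_hours = backlog_days * replay_hours_per_day
    return backlog_hours / spare_hours


days = catch_up_days(19.8)  # 19.8 h of backlog / 4.2 h spare per day
print(f"catch-up: {days:.1f} days")
```

Note how sensitive the result is to spare capacity: the wait grows without bound as the daily replay time approaches 24 hours.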

So what are the solutions?

If you are using all InnoDB tables, an XtraBackup incremental backup should be much faster than using binary logs. You can understand Vadim’s excitement when he announced this feature a few months ago.

If you are using multiple storage engines, then your options are either to try time-delayed slaves (to keep them close to up to date) or to hope that you don’t need to restore often! Eventually this problem should be lessened by a MySQL Server feature: parallel execution on slaves. Let’s hope it gets enough testers to make it into a new release very quickly.