Incremental backups

  • Filter
  • Time
  • Show
Clear All
new posts

  • Incremental backups

    I'm putting together a backup strategy of a 100% InnoDB database using innobackupex for a large part of the process. I have a very active database that I want to be able to recover up to as close as possible to the time of failure, even if the media is lost.

    One question that I can't get clear in my head is the difference between taking incremental backups using innobackupex and simply snapshotting the binary logs.

    To that extent, I'm taking daily full hot backups with innobackupex with the backups sent immediately to both onsite and offsite (online) storage. So, recovering one of those backups will result in losing up to one day's worth of transactions.

    So, in order to recover the remaining transactions, I'm looking at either:

    1. Watching the binary logs and copying each completed log to onsite and offsite storage each time a new log is created, forcing a log creation with FLUSH LOGS each hour if no log is created in the meantime. In the case of media failure, I'd recover the last daily full backup and then reapply all transactions from subsequent logs.

    2. Instead of watching and copying the binary logs each hour, I could do an incremental backup with innobackupex each hour. I would then use innobackupex's process to merge the incremental backups during restore.

    The way I see it, both #1 and #2 would result in the same situation - with a maximum of one hours' worth of transactions lost. However #2 results in a much larger backup size since and a longer process each hour.

    So - the question is what's the benefit of running innobackupex incremental backups over simply archiving the binary logs?

    Please keep in mind that this is just one of the components of the backup strategy. Other components include a weekly mysqldump of the database, disk mirroring and a replication slave, but those are outside the scope of this question.


  • #2
    I'd look at one of the features in MySQL 5.6's version of mysqlbinlog (it's backwards compatible). It lets you stream binary logs in near-realtime. You could do this to create a near-realtime backup of them.

    Alternatively, mydumper has the same capability, though I don't have experience using it.


    • #3
      Thanks Baron - the log streaming looks very promising. I'll give it a go on a test box. In the meantime, I'm still trying to nut out the benefits of a incremental innobackupex backup vs snapshotting the logs.

      As far as I can see, snapshotting the logs will result in a smaller backup, (since there's no delta overhead, no extra copies of .frm and other files) and quicker (since there's no need to run the incremental backup process with its own mysql engine and log following). However, the incremental backup will likely be faster to recover from since the logs can be applied to a full backup as soon as the backup's done (rather than after a crash).

      Any other positives/negatives I'm missing?



      • #4
        I think incremental backups can be rather complex to manage and keep track of. When there is additional complexity in other things, i.e. when you want anything more than dead-simple roll-forwards, it starts to feel to me like more cost, effort, and risk than payoff sometimes. This is a personal opinion.


        • #5
          I am kind of struggling with the same idea as tphipps.

          Basically I have my single weekly consistent snapshot, a point in time backup of the database, regardless of whether they are innoDB or myisam, which I can run every week as a hot backup using innobackupex.

          I can prepare this to some staging point, or I can mount it on some mysql and export a logical SQL dump of it. whatever its a lovely consistent copy of the database with a binary position number.

          But surely now is when I reap the benefit of the consistent backup achieved in the previous step?

          I should be able to use binary log exported SQL to roll the backup forwards in a relatively easily.
          The incremental backups should gzip right down to a small size.
          And I can run the SQL through a filter to restore particular databases to particular points in time.

          But the xtrabackup incremental strategy is to make the much bigger backups using the inno file copy methods, and presumably READ lock the myisam tables for that as well. I don't get that part.