Incremental Backups Failing

  • Incremental Backups Failing

    Hello, everyone -

    I have a situation where xtrabackup occasionally fails to create an incremental backup. The failures seem random and intermittent, and I have been unable to figure out what causes them.

    A tail of the log file entry for the error looks like this:

    IMPORTANT: Please check that the backup run completes successfully.
    At the end of a successful backup run innobackupex
    prints "completed OK!".

    innobackupex: Using mysql Ver 14.14 Distrib 5.5.25, for Linux (x86_64) using readline 5.1
    innobackupex: Using mysql server version Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

    innobackupex: Created backup directory /data/backups/incremental/04
    130315 04:00:09 innobackupex: Starting mysql with options: --password=xxxxxxxx --user='bkpuser' --unbuffered --
    130315 04:00:09 innobackupex: Connected to database with mysql child process (p
    130315 04:00:11 innobackupex: Connection to database server closed

    130315 04:00:11 innobackupex: Starting ibbackup with command: xtrabackup_55 --defaults-group="mysqld" --backup --suspend-at-end --target-dir=/data/backups/incremental/04 --incremental-basedir='/data/backups/incremental/03' --parallel=4
    innobackupex: Waiting for ibbackup (pid=24908) to suspend
    innobackupex: Suspend file '/data/backups/incremental/04/xtrabackup_suspended'

    xtrabackup: Error: cannot open /data/backups/incremental/03/xtrabackup_checkpoints
    xtrabackup: error: failed to read metadata from /data/backups/incremental/03/xtrabackup_checkpoints
    innobackupex: Error: ibbackup child process has died at /usr/bin/innobackupex line 374.
    + '[' 2 -eq 0 ']'
    + echo '\nIncremental Backup failed at 04\n'
    \nIncremental Backup failed at 04\n
    + echo -e 'Incremental Backup failed at 04: '
    + mail -s 'Please check the status log.' -- -r bkpuser@localhost
    + '[' 0 -eq 1 ']'
    + exit 1

    So, to me, it looks like the most important part of all that is:

    xtrabackup: Error: cannot open /data/backups/incremental/03/xtrabackup_checkpoints
    xtrabackup: error: failed to read metadata from /data/backups/incremental/03/xtrabackup_checkpoints
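
    Since the failure comes down to the base directory's metadata file being unreadable, a pre-flight check before each incremental run might make it easier to catch the moment it goes missing. The following is only a minimal sketch: the function name `has_valid_checkpoints` is hypothetical, and it assumes the usual `key = value` layout of `xtrabackup_checkpoints` (a `to_lsn = <number>` line). It does not explain why the file is missing; it only detects that it is.

```shell
#!/bin/bash
# Hypothetical helper, not part of xtrabackup or the backup script.
# Succeeds only if DIR contains a readable xtrabackup_checkpoints file
# that has a numeric to_lsn entry.
has_valid_checkpoints() {
    local dir="$1"
    local file="$dir/xtrabackup_checkpoints"

    # The file must exist and be readable by the backup user.
    [ -r "$file" ] || return 1

    # Assumption: the file holds "key = value" lines, including to_lsn.
    grep -Eq '^to_lsn[[:space:]]*=[[:space:]]*[0-9]+' "$file"
}
```

    Called as `has_valid_checkpoints /data/backups/incremental/03 || ls -la /data/backups/incremental/03` just before the incremental starts, it would at least record what the base directory looked like at the time of the failure.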

    I worked with the system administrator for this system, and he found no indication of a problem in the storage subsystem: nothing suggesting that it had gone offline, switched to read-only, or anything like that.

    Has anyone had any similar experiences?

    This seems to happen on all of the servers I manage where xtrabackup is installed, from time to time.

    Most of the time, xtrabackup runs just fine. Unfortunately, this is 'not good enough'.

    Any and all hints, guidance, suggestions, etc. are greatly appreciated!

    /David C.

    P.S. Once one incremental fails, the hourly incrementals keep failing until the next full backup runs at 2:00 AM. After that, the incrementals are able to find the metadata again and they pick back up.
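
    That pattern would be consistent with each hourly run using the previous hour's directory as its `--incremental-basedir`, so that one directory without `xtrabackup_checkpoints` breaks every later run in the chain. That is an assumption about the script, not something the log proves. Under that assumption, a sketch like the one below could pick the newest directory that still has a readable metadata file instead of blindly using the previous one. The function name `latest_valid_base` is hypothetical, and it assumes the directory names (01, 02, ...) sort in backup order and that stale directories from an earlier cycle have been cleaned up.

```shell
#!/bin/bash
# Hypothetical helper, not part of xtrabackup or the backup script.
# Prints the last subdirectory of ROOT (in glob sort order) that contains
# a readable xtrabackup_checkpoints file; returns 1 if none does.
latest_valid_base() {
    local root="$1"
    local found=""
    local dir

    # The glob expands in sorted order, so the last match wins.
    for dir in "$root"/*/; do
        [ -r "${dir}xtrabackup_checkpoints" ] && found="${dir%/}"
    done

    [ -n "$found" ] || return 1
    printf '%s\n' "$found"
}
```

    For example, `base=$(latest_valid_base /data/backups/incremental)` could feed `--incremental-basedir="$base"`. Any restore would then have to apply the incrementals in the same order in which they were actually chained.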

    P.P.S. I have uploaded the script that manages the backups. I am not the author, nor am I very talented with scripting, but anyone who is talented is welcome to have a look. I removed an email address and replaced it with a placeholder; that is the only change from what is running.

    System Information:

    xtrabackup --version
    xtrabackup version 2.0.1 for Percona Server 5.1.59 unknown-linux-gnu (x86_64) (revision id: 446)

    uname -a
    Linux 2.6.32-220.17.1.el6.x86_64 #1 SMP Wed May 16 00:01:37 BST 2012 x86_64 x86_64 x86_64 GNU/Linux

    CentOS release 6.2 (Final)

    df -h
    Filesystem  Size  Used  Avail  Use%  Mounted on
                34G   2.7G  29G    9%    /
    tmpfs       5.9G  0     5.9G   0%    /dev/shm
    /dev/sda1   485M  68M   392M   15%   /boot
    /dev/sdb1   394G  238G  137G   64%   /data

    free -m
                        total   used   free   shared   buffers   cached
    Mem:                11912   9850   2062        0         2       26
    -/+ buffers/cache:          9820   2092
    Swap:                6015    784   5231

  • #2

    Does anyone have any ideas at all?
    Even a 'feeling' about where the problem might be would be great.
    The problem is still happening, and we still don't have any more information.
    Thank you very much for your consideration.
    /David C.