Thoughts on InnoDB Incremental Backups
Peter Zaitsev
For normal InnoDB “hot” backups we use LVM or other snapshot-based technologies with pretty good success. However, incremental backups remain a problem.
First, why do you need incremental backups at all? Why not just take full backups daily? The answer is space: if you want to keep several generations to restore from, keeping many full copies of a large database is inefficient, especially if only a couple of percent of it changes per day.
The solution MySQL offers, using the binary log, works in theory but is not very useful in practice because it may take far too long to catch up. Even if you have very light updates and can replay a full day of updates within an hour, it will take over 24 hours to cover a month's worth of binary logs (roughly 30 hours for 30 days), and quite typically you would have much higher update traffic.
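To make the catch-up estimate concrete, here is a minimal back-of-the-envelope sketch. The function name and the "6 hours per day" figure are illustrative assumptions, not measurements, and a constant replay rate is assumed.

```python
def replay_hours(days_of_binlog, replay_hours_per_day):
    """Hours needed to replay binary logs, assuming a constant replay rate."""
    return days_of_binlog * replay_hours_per_day


# Light load from the text: one day of updates replays in one hour.
print(replay_hours(30, 1))   # 30
# Hypothetical heavier load: one day of updates takes six hours to replay.
print(replay_hours(30, 6))   # 180
```

Under the heavier assumption the restore would take more than a week, which is why binary logs alone are a poor substitute for incremental backups.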
Another solution is rdiff, which is a great general-purpose tool, though you can do much better with InnoDB in particular.
InnoDB pages carry a great deal of information in their headers that helps with incremental backup. There is basically a page version (the log sequence number, or LSN, of the last change to the page), which allows you to quickly check whether a page is newer. There is a page checksum, and finally the offset of the page (where it should be in the data file) is stored in the page itself.
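As a rough illustration, the fields mentioned above can be read from the start of a page like this. This is a sketch, not a reference parser: it assumes the default 16 KB page size and the classic uncompressed page header layout (checksum, page number, previous page, next page, LSN, all big-endian), which is worth checking against your server version. The names `FIL_HEADER` and `read_page_header` are made up for this example.

```python
import struct

PAGE_SIZE = 16 * 1024  # assumption: default InnoDB page size

# First 24 bytes of the page header, big-endian:
# checksum (4), page number (4), prev page (4), next page (4), LSN (8)
FIL_HEADER = struct.Struct(">IIIIQ")


def read_page_header(page):
    """Return the checksum, page number and LSN stored at the start of a page."""
    checksum, page_no, _prev_page, _next_page, lsn = FIL_HEADER.unpack_from(page, 0)
    return {"checksum": checksum, "page_no": page_no, "lsn": lsn}
```

The page number multiplied by the page size gives the byte offset of the page in its data file, and the LSN is what a backup tool would compare against the LSN of the previous backup.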
Using this data it should be easy to implement a very efficient yet simple incremental backup tool for InnoDB.
In a way similar to rdiff, the tool could either update the backup and store the rollback changes or, if dealing with a read-only compressed backup, create a roll-forward recovery log, which can also be easily compressed.
What the tool would need to do is go through the pages of each InnoDB file and simply write all the new pages to a separate file. Because pages already carry position information, there is no need for complex “diff” metadata.
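The scan described above might look roughly like the following. It is only a sketch under several assumptions: 16 KB uncompressed pages, the LSN stored as an 8-byte big-endian integer at byte offset 16 of each page, and a data file that is read from a consistent snapshot (for example an LVM snapshot) rather than from a live, changing file. The function names and the `since_lsn` parameter are hypothetical, and no checksum verification is done.

```python
import struct

PAGE_SIZE = 16 * 1024  # assumption: default InnoDB page size
LSN_OFFSET = 16        # assumption: 8-byte big-endian LSN in the page header


def page_lsn(page):
    """Return the LSN recorded in the page header."""
    return struct.unpack_from(">Q", page, LSN_OFFSET)[0]


def write_changed_pages(data_path, delta_path, since_lsn):
    """Copy every page whose LSN is newer than since_lsn into delta_path.

    Pages are written back to back with no extra metadata, because each
    page already records its own position. Returns the number of pages copied.
    """
    copied = 0
    with open(data_path, "rb") as src, open(delta_path, "wb") as dst:
        while True:
            page = src.read(PAGE_SIZE)
            if len(page) < PAGE_SIZE:
                break  # end of file (a trailing partial page is ignored)
            if page_lsn(page) > since_lsn:
                dst.write(page)
                copied += 1
    return copied
```

Here `since_lsn` would be the LSN recorded at the time of the previous backup. The resulting file is just a sequence of whole pages, so it also compresses well with any general-purpose compressor.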
For recovery we can simply read this new-pages file and put the pages back in their original places.
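A minimal sketch of that recovery step follows. It assumes the new-pages file is a plain sequence of 16 KB uncompressed pages and that each page stores its page number as a 4-byte big-endian integer at byte offset 4; the function name `apply_changed_pages` is hypothetical. It overwrites the restored copy of the data file in place, so it should only ever be run against a backup copy, never a file the server has open.

```python
import struct

PAGE_SIZE = 16 * 1024  # assumption: default InnoDB page size
PAGE_NO_OFFSET = 4     # assumption: 4-byte big-endian page number in the header


def apply_changed_pages(delta_path, data_path):
    """Write each page from delta_path back to its own position in data_path.

    Returns the number of pages restored.
    """
    restored = 0
    with open(delta_path, "rb") as delta, open(data_path, "r+b") as dst:
        while True:
            page = delta.read(PAGE_SIZE)
            if len(page) < PAGE_SIZE:
                break  # end of the new-pages file
            page_no = struct.unpack_from(">I", page, PAGE_NO_OFFSET)[0]
            dst.seek(page_no * PAGE_SIZE)  # the page knows where it belongs
            dst.write(page)
            restored += 1
    return restored
```

To restore to a given generation, the new-pages files would be applied to the full backup in order, oldest first, so that later versions of a page overwrite earlier ones.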
Of course this means .frm files, InnoDB logs and MyISAM system tables need to be copied in full, but they typically do not make up any considerable portion of an InnoDB database.