Where the open source community meets: Secure your spot for Percona Live Amsterdam! - Register

Downloads

Blog

Faster Point In Time Recovery with LVM2 Snaphots and Binary Logs

February 24, 2012

Author

Jervin Real

Insight for DBAs

Share this Post:

LVM snapshots is one powerful way of taking a consistent backup of your MySQL databases – but did you know that you can now restore directly from a snapshot (and binary logs for point in time recovery) in case of that ‘Oops’ moment? Let me show you quickly how.

This howto assumes that you already have a decent know how of LVM and snaphots and using LVM2 >= 2.02.58 (January 2010) which is when the --merge option was made available to the lvconvert binary. Base installs of Ubuntu 11.04 (Natty) and CentOS 5.7 includes packages with this feature, previous releases might also include them via supplemental repositories i.e. updates on CentOS. If you are using InnoDB, it is also important that your transaction logs (ib_logfile*) are on the same logical volume, if not, you could potentially trigger crash recovery when an LSN mismatch occurs and still end up with inconsistent data.

Now, assuming I have the following logical volumes – mysql-data for my datadir, and mysql-logs for my binary logs. I also have the latest snapshot of the mysql-data logical volume taken named ‘mysql-data-201202230157‘ using a script* I put together to make sure I have a consistent snapshot of the MySQL data files. Restoring snapshot alone may not be enough since there can be a lot more events from the time of the snapshots until you discover the problem, so it is really important that you have your binary logs on a different LV or copied it someplace else if they are on the same LV before we restore.

[root@sb logs]# /root/bin/lvmsnap.sh snapshot
Taking a new snapshot ..
done

Trimming excess snapshots ..
  Logical volume "mysql-data-201202230135" successfully removed
done

  LV                      VG       Attr   LSize  Origin     Snap%  Move Log Copy%  Convert
  lv_root                 VolGroup -wi-ao  5.54g                                          
  lv_swap                 VolGroup -wi-ao  1.97g                                          
  mysql-data              sb       owi-ao 20.00g                                          
  mysql-data-201202230150 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-data-201202230153 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-data-201202230155 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-data-201202230157 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-logs              sb       -wi-ao 10.00g

[root@sb logs]# /root/bin/lvmsnap.sh snapshot

Taking a new snapshot ..

done

Trimming excess snapshots ..

Logical volume "mysql-data-201202230135" successfully removed

done

LV VG Attr LSize Origin Snap% Move Log Copy% Convert

lv_root VolGroup -wi-ao 5.54g

lv_swap VolGroup -wi-ao 1.97g

mysql-data sb owi-ao 20.00g

mysql-data-201202230150 sb swi-a- 5.00g mysql-data 0.00

mysql-data-201202230153 sb swi-a- 5.00g mysql-data 0.00

mysql-data-201202230155 sb swi-a- 5.00g mysql-data 0.00

mysql-data-201202230157 sb swi-a- 5.00g mysql-data 0.00

mysql-logs sb -wi-ao 10.00g

Suppose sometime after the snapshot, I accidentally dropped the salaries tables from the employees database! (Oops, I can have an angry mob of employees who may not get their salaries on time!).

mysql> delete from salaries where emp_no = 10001;
Query OK, 17 rows affected (0.15 sec)

mysql> drop table salaries; -- Ooops!
Query OK, 0 rows affected (0.49 sec)

mysql> alter table employees add column age smallint unsigned not null default 0;
Query OK, 300024 rows affected (13.28 sec)
Records: 300024  Duplicates: 0  Warnings: 0

mysql> update employees set age = ((YEAR(CURDATE())-YEAR(birth_date)) - (RIGHT(CURDATE(),5)<RIGHT(birth_date,5)));
Query OK, 300024 rows affected (6.51 sec)
Rows matched: 300024  Changed: 300024  Warnings: 0

mysql> delete from salaries where emp_no = 10001;

Query OK, 17 rows affected (0.15 sec)

mysql> drop table salaries; -- Ooops!

Query OK, 0 rows affected (0.49 sec)

mysql> alter table employees add column age smallint unsigned not null default 0;

Query OK, 300024 rows affected (13.28 sec)

Records: 300024 Duplicates: 0 Warnings: 0

mysql> update employees set age = ((YEAR(CURDATE())-YEAR(birth_date)) - (RIGHT(CURDATE(),5)<RIGHT(birth_date,5)));

Query OK, 300024 rows affected (6.51 sec)

Rows matched: 300024 Changed: 300024 Warnings: 0

With the last snapshot I have above, I should be able to restore up to before the first DELETE statement above. If you use my LVM snapshot script*, it also saves the binary log coordinates when the snapshot was taken and saves it into a file specified as variable on the script. Below is the binary log coordinates for when the last snapshot on my list above.

[root@sb logs]# cat mysql-data-201202230157-binlog-info
File	Position	Binlog_Do_DB	Binlog_Ignore_DB
mysql-bin.000022	336796712

[root@sb logs]# cat mysql-data-201202230157-binlog-info

File Position Binlog_Do_DB Binlog_Ignore_DB

mysql-bin.000022 336796712

Using the coordinates above, I can start searching for the position of the DELETE statement so we can skip that after the snapshot restart. Using the below command and some inline searches, I was able to pinpoint the position of the delete statement, it is at 336797160.

[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336796712 mysql-bin.000022
...
# at 336797160
#120223  1:59:55 server id 1  end_log_pos 336797275 	Query	thread_id=47	exec_time=1	error_code=0
SET TIMESTAMP=1329980395/*!*/;
DROP TABLE `salaries` /* generated by server */
...

[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336796712 mysql-bin.000022

...

# at 336797160

#120223 1:59:55 server id 1 end_log_pos 336797275 Query thread_id=47 exec_time=1 error_code=0

SET TIMESTAMP=1329980395/*!*/;

DROP TABLE `salaries` /* generated by server */

...

Now, let’s restore our data from the snapshot, under the hood restore is really a “rollback” to the snapshot state when it was taken. We will do this using lvconvert’s –merge option, to merge the state of the snapshot to the original LV.

[root@sb logs]# /etc/init.d/mysql stop
Shutting down MySQL (Percona Server).....                  [  OK  ]
[root@sb logs]# umount /mysql/data
[root@sb logs]# lvconvert --merge /dev/sb/mysql-data-201202230157
  Merging of volume mysql-data-201202230157 started.
  mysql-data: Merged: 2.3%
  mysql-data: Merged: 0.2%
  mysql-data: Merged: 0.0%
  Merge of snapshot into logical volume mysql-data has finished.
  Logical volume "mysql-data-201202230157" successfully removed
[root@sb logs]# mount /mysql/data
[root@sb logs]# /etc/init.d/mysql start
Starting MySQL (Percona Server)                            [  OK  ]

[root@sb logs]# /etc/init.d/mysql stop

Shutting down MySQL (Percona Server)..... [ OK ]

[root@sb logs]# umount /mysql/data

[root@sb logs]# lvconvert --merge /dev/sb/mysql-data-201202230157

Merging of volume mysql-data-201202230157 started.

mysql-data: Merged: 2.3%

mysql-data: Merged: 0.2%

mysql-data: Merged: 0.0%

Merge of snapshot into logical volume mysql-data has finished.

Logical volume "mysql-data-201202230157" successfully removed

[root@sb logs]# mount /mysql/data

[root@sb logs]# /etc/init.d/mysql start

Starting MySQL (Percona Server) [ OK ]

You should shutdown MySQL first, then unmount the logical volume holding the MySQL data. This way you don’t have to deactivate/activate the original logical volume to start the merging. So let’s see if our salaries table is restored.

mysql> SHOW TABLES;
+---------------------+
| Tables_in_employees |
+---------------------+
...
| salaries            |
...
+---------------------+
6 rows in set (0.05 sec)

mysql> SHOW TABLES;

+---------------------+

| Tables_in_employees |

+---------------------+

...

| salaries |

...

+---------------------+

6 rows in set (0.05 sec)

Success! Your salaries data is back. But, we still have to apply the data after the snapshot, skipping the DROP statement. You should take another snapshot now – in case you missed to skip the DROP statement! I know for a fact that for every MySQL restart, the logs are flushed and a new binary log is created, looking at the current binary logs after restoring the snapshot, I know I have to apply mysql-bin.00022 only starting from position 336796712 and skipping the DROP statement at position 336797160:

mysql> show binary logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
...
| mysql-bin.000022 | 336797725 |
| mysql-bin.000023 |       107 |
+------------------+-----------+
23 rows in set (0.00 sec)

mysql> select count(*) from salaries where emp_no = 10001;
+----------+
| count(*) |
+----------+
|       17 |
+----------+
1 row in set (0.00 sec)

[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336796712 --stop-position 336797133 mysql-bin.000022 | mysql
[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336797275 mysql-bin.000022 | mysql

mysql> select count(*) from salaries where emp_no = 10001;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

mysql> show tables;
+---------------------+
| Tables_in_employees |
+---------------------+
...
| salaries            |
...
+---------------------+
6 rows in set (0.00 sec)

mysql> show binary logs;

+------------------+-----------+

| Log_name | File_size |

+------------------+-----------+

...

| mysql-bin.000022 | 336797725 |

| mysql-bin.000023 | 107 |

+------------------+-----------+

23 rows in set (0.00 sec)

mysql> select count(*) from salaries where emp_no = 10001;

+----------+

| count(*) |

+----------+

| 17 |

+----------+

1 row in set (0.00 sec)

[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336796712 --stop-position 336797133 mysql-bin.000022 | mysql

[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336797275 mysql-bin.000022 | mysql

mysql> select count(*) from salaries where emp_no = 10001;

+----------+

| count(*) |

+----------+

| 0 |

+----------+

1 row in set (0.00 sec)

mysql> show tables;

+---------------------+

| Tables_in_employees |

+---------------------+

...

| salaries |

...

+---------------------+

6 rows in set (0.00 sec)

So, as you can see, I now have a consistent data and still have my salaries table back.

Because leaving a production server with active snapshots can affect performance, this is not really an advisable backup solution. If your server somehow blew up in flames, hardware problems or encounter and LVM bug, your snapshots are useless. Also, you cannot test restore your snapshots – they are one time use!

However, if you can tolerate the extra IO overhead i.e. development or staging server , then this is still a valid backup method of course on top of your regularly tested (offsite) backups.

Another ideal use case for this method is when you are planning to execute a long running ALTER or server upgrade, this method can be good quick rollback procedure in case something fails during the operation.

* While writing this blog I hacked a quick shell script to create snapshots and uploaded it here. By no means it is perfect, you can use lvmsnap.sh snapshot to create snapshots. The restore functionality is not finished but you can use it for quick testing 🙂 i.e. lvmsnap.sh restore <snapshot-timestamp>

0 0 votes

Article Rating

3 Comments

Oldest

Newest Most Voted

sen

13 years ago

Assume we clone the data from a standby server(S1) in which data replicated from a active master(M) by doing stop slave in standby(S1) to create new standby server(S2) like M —> S1 —-> S2

The data will be consistent in this approach but while starting the S2 it goes to INNODB crash recovery & takes long time to start.

How to over come this ?