Estimating Undo Space needed for LVM SnapshotPeter Zaitsev
Quite frequently I find LVM installed on the system but no free space left to be used as snapshot undo space, which means LVM is pretty much unusable for backups or required space is very small – created without good understanding on how much space do you need for undo.
LVM Snapshot works kind of like Multi Versioning in Innodb – when you write the new data the new version for the block is stored in undo space. When snapshot is being read if the block is being remapped it is read from the undo location.
The LVM Snapshots versioning is however different from database Multi Versioning because only one old version per snapshot is needed while in databases you often would have multiple old versions of row stored.
Such structure allows us to understand the overhead you can expect from LVM. Writes suffer the penalty when it is the first time block is written to since snapshot was created. In this case single write is transformed to 2 writes, however similar to Innodb DoubleWrite the overhead is not double – the blocks written to the snapshot can be buffered few blocks at once using relatively sequential IO. Also if you modify same block again there is almost no overhead, and there is no space requirements.
Another issue with LVM snapshot it may convert pretty much sequential IO done while scanning the files to somewhat random IO – because blocks which need to be read from undo space will require disk seek, unless they are buffered already. This means your backup speed can be slower than you would expect especially in case you have a lot of scattered modifications.
Now back to our question of space requirements for MySQL Backups using LVM. As you can see from the description above in theory this would be the set of data which gets at least one modification during the period of backup. This is simple, however how do you find that ?
One simple though not very reliable way is to look at VMSTAT or IOSTAT. Find out number of KB being written to the given LVM volume per second. Now keep in account Innodb double write buffer (so writes to Innodb tablespace are double) and the fact you are probably not writing completely different data all the time you can reduce this number. The half of that number is often good ballpark figure.
So if I see 10MB/sec written to device I would assume I need 5MB of undo space per second of backup. Now say my backup takes 1 hour which is 3600 second – This means 18GB worth of undo space should be enough. Though you may wish to give you some breathing space.
Of course depending on database or workload this may need to be adjusted. If you have 10GB database size (including logs etc) you will not need 18GB undo snapshot no matter how heavy IO is. Also if you know the IO pattern, say most of traffic comes from batch updates updating pretty much every page in 10GB table you can account for that as well.
If you want to get really accurate number you better to use your experience – run the backup and see how snapshot growths during the process so you can see how correct was your original estimates.
Assuming you have already some space allocated for LVM undo space and it is not overly large – what can you do to keep space requirements to minimum ?
First do not place logs in LVM, I mean binary logs, slow query log, general query log etc, if they get any serious traffic. You can use other techniques to get consistent binary logs in the backup and you do not want query log and general logs anyway. However DO place Innodb Log Files on the same LVM volume as you need them to be snapshoted at the same time as tablespaces. The logs are especially nasty because they are always expanded so if you get 10GB worth of logs during the time of backup you will need 10GB of undo space.
Second, limit batch activities – Backups are often done at night when server activity is minimal, but this is also the time when you might want to do maintainence or batch job, which well may require a lot of undo space. Running ALTER TABLE or OPTIMIZE TABLE will cause table rebuild and use a lot of space. Batch jobs such as recounting rating summary tables recreation etc are also possible trouble makers.
Third,Use dedicated LVM volume for MySQL Instance. This is good idea for many reason and control over snapshot size is yet another one. If you just have one “/” snapshot which holds everything you may get a lot of junk IO activity causing overhead and eating LVM undo space. For example MySQL on disk temporary files and tables can waste quite a lot of such space.
The last advice I would give on snapshot space estimation would be not to try to save too much on it. Hard drive space is often cheap these days and so it may be better of allocating a bit more space for LVM snapshot and having piece of mind. Also keep in mind you can allocate undo space for any physical volume – it does not have to be same 15K RPM RAID10 volume your data lives. It can be single SATA hard drive (and so you may be able to allow full database size for undo space without thinking). If such hard drive dies you will only use snapshot but not your data, so it is pretty safe.
Though of course you should consider performance overheads mentioned in such operation.
Actually exactly this configuration could be lifesaver if you have system with LVM setup but zero free space and you would like to setup replication for large database with little no – connect USB hard drive place LVM snapshot on it and you can copy your database without shutting down server. Though I would only recommend doing it “once” to migrate to the server with proper LVM configuration with no downtime as it is too fragile for every day use. If you can’t use USB hard drive – NBD (Network Block Device) or iSCSI would also work.