One of the typical problems I see when people set up ext2/3/4 file systems is sticking to the defaults for behavior on errors. By default these file systems are configured to Continue when an error (such as an I/O error or a metadata inconsistency) is discovered, which can let corruption keep spreading. This manifests itself in the worst way when a device has “flapping” problems, returning errors every so often, as this causes random pieces of data and metadata to be lost. Not good for a system running MySQL Server. As far as I understand, this problem is limited to EXT2/3/4, while other file systems such as XFS will not continue if consistency problems are discovered.
So how can you check which error behavior mode your file system has? Run dumpe2fs /dev/sda1 and you will get something like this:
dumpe2fs 1.41.14 (22-Dec-2010)
Filesystem volume name: <none>
Last mounted on: /mnt/data
Filesystem UUID: f9f7a0c3-0350-46d5-9930-29c3ac1f4b32
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 226918400
Block count: 3630694400
Reserved block count: 0
Free blocks: 3616208434
Free inodes: 226918374
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 316
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 2048
Inode blocks per group: 128
RAID stride: 8
RAID stripe width: 80
Flex block group size: 16
Filesystem created: Mon Aug 22 23:03:21 2011
Last mount time: Mon Aug 22 23:18:25 2011
Last write time: Wed Aug 24 00:01:56 2011
Mount count: 2
Maximum mount count: -1
Last checked: Wed Aug 24 00:01:56 2011
Check interval: 0 (<none>)
Lifetime writes: 54 GB
Reserved blocks uid: 0 (user unknown)
Reserved blocks gid: 0 (group unknown)
First inode: 11
This output has a lot of interesting items, and I’ll get to some of them in a moment. What we’re concerned with right now is Errors behavior: Continue.
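Since only one line of that output matters here, a small filter can pull it out. The following is a minimal sketch, not part of the original post: the function name errors_behavior is made up for illustration, and it assumes the field is labelled exactly "Errors behavior:" as in the output above.

```shell
#!/bin/sh
# Print the value of the "Errors behavior" field from dumpe2fs-style text on stdin.
# (errors_behavior is a hypothetical helper name, not a standard tool.)
errors_behavior() {
    sed -n 's/^Errors behavior:[[:space:]]*//p'
}

# Example against a real device (needs root). The -h flag limits dumpe2fs
# to the superblock information, which is all we need here:
#   dumpe2fs -h /dev/sda1 2>/dev/null | errors_behavior
```

Reading from stdin keeps the helper usable on saved output as well as on a live device.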
We can change the behavior to remount-ro, which causes the file system to be remounted read-only, or to panic, which causes a kernel panic. I believe remount-ro is the best option for a database server, though panic might be a good option in a high availability setup, where it makes the server crash instead of continuing in a half-working mode, throwing errors and so on (depending on which file system became read-only).
To set the error behavior to a different value, run tune2fs -e remount-ro /dev/sda1, which should produce output something like:
tune2fs 1.41.14 (22-Dec-2010)
Setting error behavior to 2
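The number in that message is the value stored in the superblock rather than the name passed to -e. As far as I know the ext2/3/4 on-disk format uses 1 for continue, 2 for remount-ro and 3 for panic. A small lookup, with the illustrative (hypothetical) function name error_behavior_name, might look like this:

```shell
#!/bin/sh
# Map the numeric code printed by tune2fs to the matching -e argument.
# Codes follow the ext2/3/4 superblock convention: 1, 2, 3.
# (error_behavior_name is a made-up helper for illustration.)
error_behavior_name() {
    case "$1" in
        1) echo continue ;;
        2) echo remount-ro ;;
        3) echo panic ;;
        *) echo "unknown ($1)" ;;
    esac
}

# Usage: error_behavior_name 2
```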
It is worth noting that when an error is discovered during operation, an EXT3 or EXT4 file system will force a file system check on the next startup, which is handy.
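On a host with several ext file systems it can be tedious to check each device by hand. The sketch below is one possible way to audit them, under the assumptions that /proc/mounts is available and that the script runs as root. The function names ext_devices and report_error_behavior are invented for this example.

```shell
#!/bin/sh
# List device paths of ext2/ext3/ext4 file systems from /proc/mounts-style
# input on stdin (fields: device, mount point, type, options, ...).
ext_devices() {
    awk '$3 ~ /^ext[234]$/ { print $1 }'
}

# Print "device<TAB>error behavior" for every mounted ext2/3/4 file system.
# Needs root so that dumpe2fs can read the superblock.
report_error_behavior() {
    ext_devices < /proc/mounts | sort -u | while read -r dev; do
        mode=$(dumpe2fs -h "$dev" 2>/dev/null |
               sed -n 's/^Errors behavior:[[:space:]]*//p')
        printf '%s\t%s\n' "$dev" "${mode:-unknown}"
    done
}

# Usage (as root): report_error_behavior
```

Any device reported as Continue is a candidate for tune2fs -e remount-ro.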
Now, I know some people are concerned about setting the error behavior to remount-ro or panic, because it means even a minor error in file system data structures, which may affect only one file, will take out the whole file system. I do not think these concerns are valid. First, with recent Linux versions and quality hardware the EXT3 file system is extremely stable (EXT4 is good too, though it is newer and I have a shorter history with it). So if you have errors popping up, you are very likely looking at hardware issues, which can cause all kinds of other nasty problems, especially for a database server. Second, the question comes down to what you care about most: consistency or availability? Are you ready to risk some data becoming inconsistent, and increased data loss, for the system to be “up” (potentially serving wrong data) a little bit longer? For most systems it is not a worthwhile tradeoff. What is more, if you’re running Innodb, chances are it will not buy you more “up time” either, as Innodb is very sensitive to corruption, and if any file system errors are reported back to MySQL/Innodb it will assert and restart.
Now let’s look at a couple of other options you might want to tune with tune2fs:
Reserved block count: 0 This is the number of blocks reserved for root. It often defaults to 5% of total blocks, which is probably not needed for a partition you store MySQL data on: chances are the MySQL server is the only one writing to this partition anyway, so the reserved space would just be wasted. Some people like to keep it at some nonzero number so they have a space reserve, and if their database runs out of space they can buy a little bit of time before they find a more permanent solution.
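Maximum mount count: -1 and Check interval: 0 (<none>) These control when a periodic file system check is forced, based on the number of mounts and on the time since the last check respectively. With the values shown here, both periodic checks are disabled.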
To change these options you can run tune2fs -m0 -i0 -c -1 /dev/sda1, which sets the reserved block percentage, check interval and maximum mount count respectively.
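To put the reserved block percentage in perspective, here is a rough calculation of how much space a given percentage would set aside. It is only a sketch: the function name reserved_mib is invented for this example, and it assumes a shell with 64-bit integer arithmetic and a block size that is a multiple of 1024 bytes (which holds for ext2/3/4).

```shell
#!/bin/sh
# Estimate the space reserved for root, in MiB.
#   $1 = block count, $2 = block size in bytes, $3 = reserved percentage
# (reserved_mib is a hypothetical helper; integer arithmetic rounds down.)
reserved_mib() {
    echo $(( $1 * $3 / 100 * ($2 / 1024) / 1024 ))
}

# Block count and block size taken from the dumpe2fs output earlier in the post:
#   reserved_mib 3630694400 4096 5    # prints 709120
```

For the file system in this post, the 5% default would have set aside 709120 MiB, roughly 692 GiB, which is a lot of space to leave idle on a dedicated MySQL data partition.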