Make your file system error resilient

August 24, 2011
Author
Peter Zaitsev
Share this Post:

One of the typical problems I see setting up ext2/3/4 file system is sticking to defaults when it comes to behavior on errors. By default these filesystems are configured to Continue when error (such as IO error or meta data inconsistency) is discovered which can continue spreading corruption. This manifests itself in a worst way when device have some “flapping” problems returning errors every so often as this would cause some random pieces of data and meta data to be lost. Not good for system running mySQL Server. As far as I understand this problem is limited to EXT2/3/4 while over systems like XFS will not continue if consistency problems are discovered.

So how can you check what error behavior mode your file system has ? Run dumpe2fs /dev/sda1 and you will get something like this:

dumpe2fs 1.41.14 (22-Dec-2010)
Filesystem volume name:
Last mounted on: /mnt/data
Filesystem UUID: f9f7a0c3-0350-46d5-9930-29c3ac1f4b32
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg spars
e_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 226918400
Block count: 3630694400
Reserved block count: 0
Free blocks: 3616208434
Free inodes: 226918374
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 316
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 2048
Inode blocks per group: 128
RAID stride: 8
RAID stripe width: 80
Flex block group size: 16
Filesystem created: Mon Aug 22 23:03:21 2011
Last mount time: Mon Aug 22 23:18:25 2011
Last write time: Wed Aug 24 00:01:56 2011
Mount count: 2
Maximum mount count: -1
Last checked: Wed Aug 24 00:01:56 2011
Check interval: 0 ()
Lifetime writes: 54 GB
Reserved blocks uid: 0 (user unknown)
Reserved blocks gid: 0 (group unknown)
First inode: 11

This has a lot of interesting items and I’ll get into some of them a second later. What we’re concerned with right now is Errors behavior: Continue.
We can change behavior to remount-ro which will cause filesystem to become read-only and panic which will cause kernel panic. I believe remount-ro is the best option to use for the database server, though panic might be good option in high availability setup which would cause server to crash instead of continuing
in half working mode throwing errors etc (depending on which filesystem became read only)

To set error behavior to different value run tune2fs -e remount-ro /dev/sda1 which should have output something like:

tune2fs 1.41.14 (22-Dec-2010)
Setting error behavior to 2

It is worth to note when error is discovered during the operation EXT3, EXT4 filesystem will force file system check on the next startup which is handy.

Now I now some people are concerned about setting filesystem behavior to remount-ro or panic because this means even minor error in filesystem data structures which may be affects one file will take out whole file system. I do not think these concerns are valid. First with recent Linux versions and quality hardware EXT3 filesystem is extremely stable (EXT4 is good too though It is newer and I have shorter history with it). So if you have the error popping up you are very likely looking at hardware issues which can cause all kind of other nasty problems especially for database server. Second. The question comes to what you care the most – Do you care about consistency or availability ? Are you ready to risk for some data becoming inconsistent and increased data loss for system to be “up” (potentially serving wrong data) a little bit longer ? For most systems it is not worth tradeoff. Even more if you’re running Innodb chances are you will not buy you more “up time” either as Innodb is very
sensitive to corruptions and if any of file system errors are reported back to MySQL/Innodb it will assert and restart.

Now lets look at couple of other options you might want to tune with tune2fs:

Reserved block count: 0 Number of blocks reserved for root. It often defaults to 5% of total blocks, which is probably not needed for partition you store MySQL data on, as chances are MySQL server is only one doing writes on this partition anyway it just would be wasted if allocated. Some people like to keep it at some number so they have space reserve and if their database ran out of space they can buy a little bit of time before they find more permanent solution.

Maximum mount count: -1 and Check interval: 0 () These corresponds to automatic file system check on startup which is normally done once per so many mounts or so many days. Large partitions with many files can take a lot of time to check and can cause unwanted surprise when you’re restarting server and expecting it to be back in 5 minutes yet it takes 30+ because it has to check file systems. I believe it is much better to disable both these auto check functions for your data partition and just check it manually as needed, same as you would every so often check MySQL tables for corruption.

To change those options you can run tune2fs -m0 -i0 -c -1 /dev/sda1 changing reserved block percent, check interval and mount count appropriately.

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved