Modern filesystems are well equipped to deal with large writes. One area that remains challenging, however, is efficiently writing out “microdata”, such as metadata and small portions of large files, while still getting good I/O utilization when the data is read back. The challenge is evident in mount options like “noatime”, which disables updating a file’s access time on reads. That kind of solution sidesteps the problem rather than solving it. Another approach, delayed allocation, coalesces small writes in memory for as long as possible before writing them out to disk. Filesystems like ext4 and Btrfs use delayed allocation as a best-effort attempt at reducing fragmentation and random I/O.
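To make the coalescing idea concrete, here is a toy sketch in Python. It is not how ext4 or Btrfs implement delayed allocation (real filesystems work on pages and defer block allocation until writeback); the `WriteCoalescer` class and its `write` and `flush` methods are hypothetical names invented for this illustration. It only shows the principle: small writes are buffered in memory, and at flush time adjacent bytes are merged so that fewer, larger extents are handed to the disk.

```python
from typing import Dict, List, Tuple

Extent = Tuple[int, bytes]  # (file offset, contiguous data)


class WriteCoalescer:
    """Toy buffer (hypothetical, for illustration) that merges small writes."""

    def __init__(self) -> None:
        # Maps byte offset -> byte value for every dirty byte in memory.
        self._dirty: Dict[int, int] = {}

    def write(self, offset: int, data: bytes) -> None:
        # Record the bytes in memory only; a later write to the same
        # offset overwrites the earlier one.
        for i, value in enumerate(data):
            self._dirty[offset + i] = value

    def flush(self) -> List[Extent]:
        """Return contiguous extents in offset order and clear the buffer."""
        extents: List[Extent] = []
        start = None
        buf = bytearray()
        for off in sorted(self._dirty):
            if start is not None and off == start + len(buf):
                # Byte directly follows the current run: extend it.
                buf.append(self._dirty[off])
            else:
                # Gap found: close the current run and start a new one.
                if start is not None:
                    extents.append((start, bytes(buf)))
                start, buf = off, bytearray([self._dirty[off]])
        if start is not None:
            extents.append((start, bytes(buf)))
        self._dirty.clear()
        return extents


if __name__ == "__main__":
    c = WriteCoalescer()
    # Five small, out-of-order writes...
    for offset, data in [(8, b"cd"), (0, b"ab"), (2, b"ef"), (10, b"gh"), (100, b"z")]:
        c.write(offset, data)
    # ...become three larger extents at flush time.
    print(c.flush())  # [(0, b'abef'), (8, b'cdgh'), (100, b'z')]
```

The catch is the one this post is about: a buffer like this only helps while the dirty data fits in memory and the writes happen to land next to each other.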
Isn’t there a way to fundamentally solve filesystem fragmentation and random I/O?
This week, I’ll be speaking at HotStorage 2012 in Boston. My talk will present TokuFS, a filesystem that uses Fractal Tree® indexes. The goal of TokuFS is to demonstrate that microwrites don’t have to become hopelessly slow once the working set exceeds memory. The talk and the associated conference paper show orders-of-magnitude improvements on workloads such as small file creation, large directory reads and writes, and random updates within a file.
My talk takes place on Thursday at 5:00 pm (it’s the last of three talks in the slot that starts at 4:00 pm). Details on the talk can be found here.