Why ZFS Affects MySQL Performance



In this blog post, we’ll look at how ZFS affects MySQL performance when the two are used together.

ZFS and MySQL have a lot in common, since both are transactional software. Both have properties that, by default, favor consistency over performance. By stacking two such layers between the application and persistent disk, we are logically doubling the amount of work the whole system does to commit data, and reducing its output accordingly. So within the ZFS layer, where does the bulk of that work really come from?

Consider the comparative test below, run on a bare metal server. It has a reasonably tuned config (discussed in a separate post; results and scripts here). These numbers are from sysbench tests on hardware with six SAS drives behind a RAID controller with a write-back cache. Ext4 was configured on RAID10 software RAID, and ZFS uses the equivalent layout (a stripe across three mirror vdevs, each a pair of drives).

There are a few obvious observations here. One is that the ZFS results show a large gap between the median and the 95th percentile, which indicates regular, sharp drops in performance. The most glaring, however, is that with the write-only update-index workload, overall performance could drop to 50%.
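To make the median versus 95th percentile comparison concrete, here is a minimal Python sketch. The function names and the sample values are hypothetical and are not taken from the tests above; the nearest-rank percentile method is also an assumption, since the post does not say how its percentiles were computed.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def median_p95_gap(samples):
    """Return (median, p95, p95 / median) for a series of measurements."""
    med = statistics.median(samples)
    p95 = percentile(samples, 95)
    return med, p95, p95 / med


# Made-up per-interval measurements, for illustration only.
steady = [100] * 20
bursty = [100] * 17 + [400] * 3

print(median_p95_gap(steady))
print(median_p95_gap(bursty))
```

A ratio close to 1 means the measurements sit near the median most of the time. A large ratio means a noticeable share of intervals behaved very differently from the typical one, which is the pattern described for the ZFS results.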


Looking further into the IO metrics for the update-index tests (95th percentile from /proc/diskstats), ZFS’s behavior tells us a few more things:



  1. ZFS batches writes better: IO size per operation is larger, with only a minimal increase in latency.
  2. ZFS reads are heavily scattered and random: the high response times and low read IOPS and throughput mean significantly more disk seeks.
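The metrics behind these two observations (IOPS, IO size per operation, response time) can be derived from the counters in /proc/diskstats. The sketch below is one way to do that in Python and is not the script used for the tests; the helper names, the device name `sda`, and the two snapshots are hypothetical. The field positions follow the kernel's documented diskstats layout.

```python
from typing import NamedTuple

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


class DiskSample(NamedTuple):
    reads: int
    sectors_read: int
    ms_reading: int
    writes: int
    sectors_written: int
    ms_writing: int


def parse_diskstats(text, device):
    """Extract the cumulative counters for one device from diskstats text."""
    for line in text.splitlines():
        f = line.split()
        # f[0..2] are major, minor, device name; counters start at f[3].
        if len(f) >= 11 and f[2] == device:
            return DiskSample(int(f[3]), int(f[5]), int(f[6]),
                              int(f[7]), int(f[9]), int(f[10]))
    raise KeyError(device)


def interval_metrics(prev, curr, seconds):
    """Per-interval IOPS, KB per operation and average response time (ms)."""
    out = {}
    for kind, ops, sectors, ms in (
        ("read", curr.reads - prev.reads,
         curr.sectors_read - prev.sectors_read,
         curr.ms_reading - prev.ms_reading),
        ("write", curr.writes - prev.writes,
         curr.sectors_written - prev.sectors_written,
         curr.ms_writing - prev.ms_writing),
    ):
        out[kind + "_iops"] = ops / seconds
        out[kind + "_kb_per_op"] = sectors * SECTOR_BYTES / 1024 / ops if ops else 0.0
        out[kind + "_await_ms"] = ms / ops if ops else 0.0
    return out


# Two made-up snapshots taken 10 seconds apart (hypothetical numbers).
before = "   8       0 sda 1000 0 16000 5000 2000 0 64000 4000 0 3000 9000"
after = "   8       0 sda 1100 0 17600 7000 2500 0 192000 5000 0 4000 12000"

metrics = interval_metrics(parse_diskstats(before, "sda"),
                           parse_diskstats(after, "sda"), 10)
print(metrics)
```

In practice the snapshots would come from reading /proc/diskstats at a fixed interval during the benchmark, and a percentile would then be taken over each per-interval series. Large KB per write with low write response time matches observation 1; low read IOPS with high read response time matches observation 2.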

If we focus on observation #2, there are a number of possible sources of random reads:

  • InnoDB pages that are not in the buffer pool
  • When ZFS records are updated, metadata also has to be read and updated

This means that updates on cold InnoDB records involve multiple random reads that are not present with filesystems like ext4. While ZFS has some tunables for improving synchronous reads, tuning them can be hit-or-miss when trying to fit specific workloads. For this reason, ZFS introduced the L2ARC, where faster drives are used to cache frequently accessed data so that it can be read with low latency.

In upcoming posts, we’ll look in more detail at how ZFS affects MySQL, at the tests above and the configuration behind them, and at how we can further improve performance from here.



Comments (8)

  • ovaistariq

    ZFS has a lot to improve from the stability perspective before it can really be used for MySQL data storage at scale.

    February 17, 2018 at 3:35 am
    • Yves Trudeau

      ZFS on Linux has improved a lot over the last few years and it now has a number of large corporate users. When was the last time you tested and evaluated ZFS?

      February 19, 2018 at 10:27 am
    • Ben Francis

      Uh, ZFS is _the_ “data storage at scale” filesystem. Jim Salter has written detailed technical articles on the wonders of ZFS data safety: http://jrs-s.net/2015/02/03/will-zfs-and-non-ecc-ram-kill-your-data/

      February 19, 2018 at 10:36 am
  • Tobias

    This benchmark really is not in favour of ZFS. It’s best practice to have a separate SSD ZFS ZIL device for good write performance.

    This benchmark is doing two disk writes for every MySQL write, because the data is first written to the ZFS intent log and later to the real blocks in the filesystem, simply to provide the crash-safe, durable semantics of ZFS. No wonder writes are 50% slower with ZFS than with ext4.

    February 17, 2018 at 4:18 am
    • Yves Trudeau

      I do agree with you, a SLOG would have a big impact here. Let’s say Jervin is presenting a worst-case scenario, and even then, considering the benefits of using ZFS, the cost is not that high. I hope he’ll present results with a decent SLOG just to size the difference. Like you said, I often got close to twice the speed for write-intensive workloads with a good SLOG device.

      February 19, 2018 at 10:35 am
    • Jervin Real

      Indeed, there is no dedicated log device in this test, as it is only meant to demonstrate how much slower MySQL can get if you simply switch your existing hardware to ZFS from another filesystem like ext4. We will cover more cases in future blog posts.

      February 19, 2018 at 1:10 pm
  • Jason

    Yeaaa, this benchmark process is horribly thought out and misinformed. You should probably reach out to some real ZFS experts.

    February 17, 2018 at 10:53 am
  • Sinisha

    Indeed, I would like to see a benchmark of tuned ZFS vs. XFS vs. ext4. When using ZFS, you can disable the InnoDB doublewrite buffer.

    April 12, 2018 at 4:07 am
