In this blog post, I’ll take a look at the performance of the Samsung 960 Pro SSD NVME.
First, I know the Samsung 960 Pro is a consumer SSD NVME drive, not intended for sustained data center workloads. But the AnandTech review looked good enough that I decided to take it for a test spin to see if it would work well with MySQL benchmarks.
Before that, I decided to do a simple sysbench file IO test to see how the drives handled sustained workloads, and if it would start acting up.
My expectation for a consumer SSD drive is that its write consistency will suffer. Many of those drives can sustain high bursts for short periods of time but have to slow down to keep up with write leveling (and other internal activities SSDs must to do). This is not what I saw, however.
I did a benchmark on E5-2630L V3, 64GB RAM Ubuntu 16.04 LTS, XFS Filesystem, Samsung 960 Pro 512GB (FW:1B6QCXP7):
sysbench --num-threads=64 --max-time=86400 --max-requests=0 --test=fileio --file-num=1 --file-total-size=260G --file-io-mode=async --file-extra-flags=direct --file-test-mode=rndrd run
Note: I used asynchronous direct IO to keep it close to how MySQL (InnoDB) submits IO requests.
This is what the “Read Throughput” graph looks in Percona Monitoring and Management (PMM):
As you can see, in addition to some reasonable ebbs and flows we have some major dips from about 1.5GB/sec of random reads to around 800MB/sec. This almost halves the performance. We can clearly see two of those dips, with the third one starting when the test ended.
What is really interesting is that as I did a read-write test, it performed much more uniformly:
sysbench --num-threads=64 --max-time=86400 --max-requests=0 --test=fileio --file-num=1 --file-total-size=260G --file-io-mode=async --file-extra-flags=direct --file-test-mode=rndrw run
Any ideas on what the cause of such strange periodic IO performance regression for reads could be?
This does not look like overheating throttling. It is much too regular for that (and I checked the temperature – is wasn’t any different during this performance regression).
One theory I have is “read disturb management”: could the SSD need to rewrite the data after so many reads? By my calculations, every cell is read some 166 times during the eight hours between those gaps. This doesn’t sound like a lot.
What are your thoughts?