November 28, 2014

Testing Samsung SSD SATA 256GB 830 – not all SSDs are created equal

I personally like PCIe-based flash, but from a pricing point of view our customers are looking for cheaper alternatives. SATA SSD is an option. There are many products based on MLC technology, and I would say the Intel 320 is the most popular. I do not particularly like its write performance (I wrote about it before), which is why I am looking for comparable alternatives. The Samsung 830 256GB looked like a good product, so I decided to test it.

For tests I use sysbench fileio with a 16KiB block size (to match the workload from InnoDB, as this is my primary use case), and I recently switched to async IO mode. There are two reasons for that. First, MySQL/InnoDB uses async writes, so this emulates database load. Second, async mode shows the maximum possible throughput. It does not show reliable latency, though, as there appears to be no reliable way in the Linux asynchronous IO library to get timing metrics for an individual IO block.

So my testing command line looks like:

Note that I gather metrics every 10 seconds to see how stable the performance is. This really helps to expose some artifacts, as you will see in the following graphs.

Hardware for tests: HP ProLiant DL380 G6, filesystem: ext4, mounted with nobarrier.

The results for random write case (8 async IO threads):

It seems that InnoDB is not alone with its flushing problems. You can see periodic stalls in throughput (zero throughput for periods of 20-30 sec). When there are no drops, the drive keeps write throughput at the 323 MiB/sec level.
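Stalls like these can be picked out of the 10-second samples mechanically. Below is a minimal sketch, assuming the per-interval throughput has already been parsed into a list of MiB/sec values. The function name `find_stalls`, the 1 MiB/sec threshold, and the sample numbers are illustrative assumptions, not part of the original test setup or its results.

```python
from typing import List, Tuple


def find_stalls(
    samples_mib_s: List[float],
    interval_s: int = 10,
    threshold_mib_s: float = 1.0,
) -> List[Tuple[int, int]]:
    """Return (start_second, duration_seconds) for each run of
    consecutive samples whose throughput is below the threshold."""
    stalls = []
    run_start = None  # index of the first sample in the current stall
    for i, value in enumerate(samples_mib_s):
        if value < threshold_mib_s:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            stalls.append((run_start * interval_s, (i - run_start) * interval_s))
            run_start = None
    # A stall that is still in progress when the samples end.
    if run_start is not None:
        stalls.append(
            (run_start * interval_s, (len(samples_mib_s) - run_start) * interval_s)
        )
    return stalls


# Made-up samples for illustration only, not measurements from this test.
samples = [323.0, 322.5, 0.0, 0.0, 0.0, 323.1, 323.0, 0.0, 0.0, 322.8]
print(find_stalls(samples))  # [(20, 30), (70, 20)]
```

With a 10-second reporting interval, a stall shorter than one interval only shows up as a dip rather than a zero, so the threshold would need to be raised to catch it.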

I really thought that these stalls were write-related, so I was totally surprised to see them in random reads as well.
The results for random read case:

I do not have a good explanation for this. When there are no drops, the drive keeps 375 MiB/sec throughput. My wild guess about the drops is that the drive periodically cleans an internal cache or something similar.
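For readers who think in IOPS rather than MiB/sec, the two steady-state figures can be converted using the 16KiB block size from the test setup. This is plain arithmetic on the numbers quoted above; the helper name `mib_s_to_iops` is only an illustration.

```python
BLOCK_KIB = 16  # block size used in the sysbench fileio runs


def mib_s_to_iops(throughput_mib_s: float, block_kib: int = BLOCK_KIB) -> float:
    """Convert throughput in MiB/sec to I/O operations per second."""
    return throughput_mib_s * 1024 / block_kib


print(mib_s_to_iops(323))  # random write, no stall: 20672.0
print(mib_s_to_iops(375))  # random read, no stall: 24000.0
```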

To better understand what kind of response time we should expect, I ran random reads in sync IO mode, this time for 1-64 threads.

The throughput:

We are getting to the peak throughput at 16-32 threads.

And response time:

For 16 threads, we may expect 0.96ms response time, which increases to 1.62ms under 32 threads.
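As a rough cross-check, the response times can be turned into an implied throughput with Little's Law. This is a sketch under stated assumptions, not a measurement: it assumes each thread keeps exactly one synchronous 16KiB request in flight at all times and that the quoted response times are averages. The function name is hypothetical.

```python
BLOCK_KIB = 16  # block size used in the tests


def implied_throughput_mib_s(threads: int, response_ms: float) -> float:
    """Little's Law for a closed loop: IOPS = requests in flight / response time.
    Assumes one outstanding request per thread (sync IO)."""
    iops = threads / (response_ms / 1000.0)
    return iops * BLOCK_KIB / 1024


for threads, response_ms in [(16, 0.96), (32, 1.62)]:
    estimate = implied_throughput_mib_s(threads, response_ms)
    print(f"{threads} threads: ~{estimate:.1f} MiB/sec")
```

Under these assumptions the estimate is about 260 MiB/sec at 16 threads and about 309 MiB/sec at 32 threads. Doubling the thread count adds less than 20% throughput while response time grows by roughly two thirds, which fits the observation that the peak is reached somewhere in the 16-32 thread range.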

The periodic drops that I observe for both random reads and random writes do not allow me to recommend this drive for database server usage, even though in general this drive provides much better throughput than the Intel 320 (some results for Intel 320).

If you are interested in more SSD and MySQL questions, I will be giving a webinar “MySQL and SSD” on May 9. It will be the same as my talk at Percona Live MySQL Conference 2012; if you did not attend my talk, you are welcome to join the webinar.


About Vadim Tkachenko

Vadim leads Percona's development group, which produces Percona Cloud Tools, Percona Server, Percona XtraDB Cluster and Percona XtraBackup. He is an expert in solid-state storage, and has helped many hardware and software providers succeed in the MySQL market.

Comments

  1. Rodalpho says:

    My guess would be some kind of fault in the SSD’s garbage collection routines. It’s a consumer drive, not meant for high queuing async access from multiple DB threads, and the default routines couldn’t keep up.

  2. Rodalpho,

    It can be related to garbage collection routines, but I can see it also in the read-only case, which puzzles me. The garbage collector should not be that active in a read-only workload.

  3. Rodalpho says:

    That depends, did you run the read-only case closely following the writes? GC normally runs at low priority and can continue for hours. It usually tries to run when the interface is idle, but if you’re constantly running benchmarks, it could just go ahead and do it anyway. That’s my theory, at any rate.

  4. Yeah... it would be quite silly to stall reads for so long because of garbage collection.

    Vadim – did you look at Intel 520 drives? These look like a lot higher performance than the 320 model, yet they are reasonably comfortable. Here is a nice comparison table between models:

    http://www.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-330-series.html

  5. George says:

    Samsung 830 SSD’s GC differs from Intel 320 series AFAIK. Intel 320 does active GC while Samsung 830s do idle GC.

    Have you tested the Samsung 830 SSD with over-provisioning? Formatted at 100GB, 120GB, 200GB, and 240GB capacities?

  6. Vadim,

    So the Samsung 830 can handle 300MB+/sec writes and the STEC Mach16 can handle 150MB/sec, though the Samsung has much more severe stalls. I wonder, however, what happens if you load the Samsung below its full capacity, writing 150MB/sec as the STEC can do, or maybe 200MB/sec? It could be that in this case there will be enough idle room to avoid stalls while still providing better performance than the STEC.

    For many workloads, in reality you care more about what load the system can handle while staying stable than about whether it becomes unstable at full utilization.

  7. Peter,

    Take a look at the random read sync case with 1-2 threads.
    In this case the card is not fully loaded, but there are still periodic stalls.

  8. Michael Burman says:

    If it’s a GC issue (it shouldn’t affect reads that much, though it’s possible with a highly random I/O pattern), as these drives are designed to work with TRIM, would it be possible to see results under Windows as well? I know it’s not normally an environment to run MySQL on, but if there’s a reasonable advantage with SSDs, then maybe it shouldn’t be counted out, especially if using these consumer drives. And at least it should show whether these issues are related to the GC or to the internal cache.

  9. Michael,

    I am not interested in Windows; I have neither the hardware nor the technical experience to do that.
    I think TRIM is overrated and probably not relevant to this benchmark.
    I do not create/delete many files. I just do random writes into files created right after a secure erase procedure.

  10. Jacob Sohn says:

    Vadim,

    It would be interesting to see if pm830 wattage drops during a stall, and if it does, I am curious to see whether it is caused by built-in power management. Perhaps it would be kind of sneaky of Samsung to set this limit in sm825 in firmware.

  11. Terje says:

    Could this be some other problem?

    I tried a Samsung 830 drive at home. Could not reproduce the issues you see here.
    This drive had been properly trashed in advance by my home cooked tool for stressing SSD GC and it happens to be on a 3Gb/s SATA controller.

    When I try sysbench I get very steady performance in the 60-70MByte/sec area. It occasionally drops down towards 50MB/sec, but with 1sec samples, I hardly ever get below 50MB/sec over a 6 hour test run.

    Just for comparison, an X25m 160GB G2 hovers around 20-30MB/sec after the same “pre trashing”, with regular drops towards 10MB/sec.

    Both drives were over-provisioned by 20% in addition to the standard over-provisioning.

  12. sam brown says:

    Well he does have a fusion i/o board and an extra nic according the build specs (only the dl380 g7 came with 4 nic’s on system board).

    We could assume he was using the P410/1GB FBWC, but that might be a big assumption. The P410/256 has half the cache bandwidth (32+8) instead of the 512/1024 (64+8) crippling the use of caching.

    Unlike the megaraid controllers you don’t have as many SSD options like the SBR modifying “fastpath” (aka over clock the controller) – ever wonder why a m5015 cross flashed to a H700 has more iops? The PLL dividers are in the SBR, which are trumped by the hardware add-on “fast path” keys.

    P410/256 with caching enabled (not possible to do cut through io?) would be severely crippled.

    From my observation you would need a p410/512 for 4 samsung 830’s.

    It seems to be consistent that using cachecade-like solutions or non-raid ssd solutions results in far superior performance always.

    I’d like to see a review of splicing your database files/indexes across ssd’s without raid, with trim.

    PNY prevail elite embedded SF-2282 480gb with intel HET emlc would be a great example.
