Final Words

In terms of performance, the NVMe version of the SM951 offers an upgrade over its AHCI sibling. The average data rate (i.e. large IO performance) isn't dramatically better than that of the AHCI version, but when it comes to small IO latency the SM951 and NVMe in general show their might. Typically the NVMe version offers about a 10-20% improvement in average latency over the AHCI version, which is a healthy boost in performance given that the two utilize identical hardware.

It's obvious that the SM951-NVMe has been designed for mainstream client workloads. In our Heavy and Light traces it sets new records, but in The Destroyer, our most IO intensive trace, the SM951-NVMe is outperformed by the SSD 750. While Intel specifically built a client-oriented firmware for the SSD 750, the company made it clear that it focused on sustained random IO performance rather than high peak throughput, and the tradeoff pays off as long as the IO workload is intensive enough (think multiple VMs, for instance). Another area where the SSD 750 beats the SM951-NVMe by a substantial margin is steady-state performance, which contributes heavily to The Destroyer benchmark since the trace effectively puts the drive into steady-state.

Speaking of steady-state performance, there are two things I was specifically happy to see in the SM951-NVMe. The first one is the unbelievable IO consistency, which isn't that significant for a client drive, but if Samsung can pull off something equivalent (with higher performance, of course) in the enterprise space, then I'll be excited. It never hurts to have that level of consistency in a client drive either, but it just isn't used to its full potential since client SSDs and workloads are more about peak than sustained performance, which is the opposite of enterprise workloads.

The second one is low queue depth random read performance. This is the area where we haven't seen much improvement in the past few years because ultimately the bottlenecks have been AHCI overhead and NAND latency. Fixing the latter requires a new type of non-volatile memory (e.g. ReRAM, MRAM or NRAM) with significantly lower read latency, but that isn't on the horizon until around 2020. In the meantime, the only way to improve random read latency is to cut the driver stack overhead, which is exactly the purpose of NVMe. The reason why I'm so excited about low queue depth random reads is that they account for a large share of the total IOs in typical client workloads (especially the less intensive ones), so any improvement will translate to better user experience and performance, which is ultimately what a consumer is looking for.

Despite all this, I have to admit that I walk away a little disappointed. A 10-20% performance improvement isn't marginal, but after all the hype about NVMe I was expecting a little more. I have a strong feeling that NVMe is capable of much more, but the technology needs time to mature. From what I have heard in talks with SSD OEMs, the generic NVMe driver that Microsoft includes in Windows 8.1 has some severe shortcomings, which is why nearly everyone has their own custom driver, at least for now. I think Samsung and the SM951-NVMe desperately need a custom driver to unleash the full potential of the drive, and I sure hope that the retail version of the drive will feature one.

All in all, the SSD 750 remains the best option for very IO intensive workloads, but for a more typical enthusiast the SM951-NVMe provides better performance, although not substantially better than the AHCI version. If you need an SSD today, I wouldn't wait for the NVMe version because its availability is a mystery to all and you may end up waiting for months. Nevertheless, if the SM951-NVMe were easily available and reasonably priced, I would give it our "Recommended by AnandTech" award, but for now one can only drool over it.

74 Comments

  • patrickjp93 - Thursday, June 25, 2015 - link

    They aren't. All storage transfer commands go through the PCH. Your PCIe SSDs do not connect to the CPU directly in most cases. Some enterprise grade drives do, but most consumer drives do not.
  • Kristian Vättö - Friday, June 26, 2015 - link

    PCIe is PCIe regardless of whether the controller is inside the CPU or PCH. PCH merely acts as a hub for different interfaces, but ultimately it connects to the CPU as well since that is where all the processing is done.
  • CajunArson - Thursday, June 25, 2015 - link

    Yeah so are we missing some sound and FURY [hint hint] about this SSD on a stick?
  • Kristian Vättö - Thursday, June 25, 2015 - link

    Fury X is coming, Ryan just needed one more day because the flu has been undermining his ability to work.
  • DigitalFreak - Thursday, June 25, 2015 - link

    (hint hint) The 980ti is faster than the Fury X all around.
  • CajunArson - Thursday, June 25, 2015 - link

    I'm not disagreeing with that statement.
    I just want the review.
  • lilmoe - Thursday, June 25, 2015 - link

    +1

    A DX12 showdown between FuryX and 980ti would be highly welcome as well.
  • Gigaplex - Thursday, June 25, 2015 - link

    The Fury X wins in some of the 4k tests. The 980Ti seems faster overall, but it's not "all around".
  • mr_tawan - Friday, June 26, 2015 - link

    From what I've read, it looks like the Fury has advantages when it comes to memory-intensive use case.
  • SofS - Thursday, June 25, 2015 - link

    About the driver issue, how do different operating systems fare? Like 32/64 bits, XP/7/8/10 and Linux old/new (for instance CentOS/Fedora).
