NVMe vs AHCI: Another Win for PCIe

Improving performance is never just about hardware. Faster hardware can only help to reach the limits of software and ultimately more efficient software is needed to take full advantage of the faster hardware. This applies to SSDs as well. With PCIe the potential bandwidth increases dramatically and to take full advantage of the faster physical interface, we need a software interface that is optimized specifically for SSDs and PCIe.

AHCI (Advanced Host Controller Interface) dates back to 2004 and was designed with hard drives in mind. While that doesn't rule out SSDs, AHCI is more optimized for high latency rotating media than low latency non-volatile storage. As a result AHCI can't take full advantage of SSDs and since the future is in non-volatile storage (like NAND and MRAM), the industry had to develop a software interface that abolishes the limits of AHCI.

The result is NVMe, short for Non-Volatile Memory Express. It was developed by an industry consortium with over 80 members and the development was directed by giants like Intel, Samsung, and LSI. NVMe is built specifically for SSDs and PCIe and as software interfaces usually live for at least a decade before being replaced, NVMe was designed to be capable of meeting the industry needs as we move to future memory technologies (i.e. we'll likely see RRAM and MRAM enter the storage market before 2020).

Feature              | NVMe                                    | AHCI
Latency              | 2.8 µs                                  | 6.0 µs
Maximum Queue Depth  | Up to 64K queues with 64K commands each | Up to 1 queue with 32 commands each
Multicore Support    | Yes                                     | Limited
4KB Efficiency       | One 64B fetch                           | Two serialized host DRAM fetches required

Source: Intel

The biggest advantage of NVMe is its lower latency. This is mostly due to a streamlined storage stack and the fact that NVMe requires no register reads to issue a command. AHCI requires four uncacheable register reads per command, which results in ~2.5µs of additional latency.
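To put those numbers side by side, here is a small back-of-the-envelope sketch in Python. It only uses the figures quoted above (2.8 µs, 6.0 µs, four register reads, ~2.5 µs); the variable names are mine, and treating the ~2.5 µs as evenly split across the four reads is an assumption for illustration, not something Intel's data states.

```python
# Figures quoted in the article (attributed there to Intel).
NVME_LATENCY_US = 2.8
AHCI_LATENCY_US = 6.0
AHCI_REGISTER_READS = 4        # uncacheable register reads per command
REGISTER_READ_COST_US = 2.5    # approximate total latency added by those reads

# Difference between the two interfaces' latency figures.
gap_us = AHCI_LATENCY_US - NVME_LATENCY_US

# Assumption: the ~2.5 µs is spread evenly over the four reads.
per_read_us = REGISTER_READ_COST_US / AHCI_REGISTER_READS

# How much of the gap, and of AHCI's total, the register reads explain.
share_of_gap = REGISTER_READ_COST_US / gap_us
share_of_ahci = REGISTER_READ_COST_US / AHCI_LATENCY_US

print(f"Latency gap:           {gap_us:.1f} µs")
print(f"Cost per register read: {per_read_us:.3f} µs")
print(f"Share of the gap:       {share_of_gap:.0%}")
print(f"Share of AHCI latency:  {share_of_ahci:.0%}")
```

With these inputs the register reads come to roughly 0.6µs each and account for about three quarters of the 3.2µs difference between the two interfaces, which is why removing them matters so much.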

Another important improvement is support for multiple queues and higher queue depths. With multiple queues, each CPU core can submit commands through its own queue, so the CPU can be used to its full potential and IOPS is not bottlenecked by a single core.

Source: Microsoft
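The queue limits in the table can be illustrated with a toy model in Python. This is only a sketch: `max_outstanding` and `accepted_commands` are hypothetical helpers of mine, "64K" is read as 64 × 1024, and the model ignores everything a real driver or controller does apart from queue capacity.

```python
from collections import deque

K = 1024  # assumption: "64K" in the table means 64 * 1024


def max_outstanding(queues, depth):
    """Upper bound on commands in flight: number of queues times queue depth."""
    return queues * depth


def accepted_commands(num_queues, depth, cores, commands_per_core):
    """Toy model: every core submits a burst of commands at once.

    Cores are mapped onto the available queues round-robin. A command is
    accepted only if its queue still has a free slot; otherwise the core
    would have to wait, which this sketch simply counts as not accepted.
    """
    queues = [deque() for _ in range(num_queues)]
    accepted = 0
    for core in range(cores):
        queue = queues[core % num_queues]
        for i in range(commands_per_core):
            if len(queue) < depth:
                queue.append((core, i))
                accepted += 1
    return accepted


ahci_limit = max_outstanding(1, 32)
nvme_limit = max_outstanding(64 * K, 64 * K)
print("AHCI max outstanding commands:", ahci_limit)
print("NVMe max outstanding commands:", nvme_limit)

# Four cores each issue 32 commands in one burst.
print("Single shared queue:", accepted_commands(1, 32, cores=4, commands_per_core=32))
print("One queue per core: ", accepted_commands(4, 32, cores=4, commands_per_core=32))
```

In the shared-queue case the first core fills all 32 slots and the other three have to wait, whereas with one queue per core every command is accepted immediately. That is the single-core bottleneck the multiple-queue design is meant to avoid.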

Obviously enterprise is the biggest beneficiary of NVMe because the workloads are so much heavier and SATA/AHCI can't provide the necessary performance. The client market benefits from NVMe too, just not as much. As I explained on the previous page, even moderate improvements in performance result in increased battery life and that's what NVMe will offer. Thanks to lower latency the disk usage time will decrease, which results in more time spent at idle and thus increased battery life. There can also be corner cases where the better queue support helps with performance.

Source: Intel

With future non-volatile memory technologies and NVMe the overall latency can be cut to one fifth of the current ~100µs latency and that's an improvement that will be noticeable in everyday client usage too. Currently I don't think any of the client PCIe SSDs support NVMe (enterprise has been faster at adopting NVMe) but the SF-3700 will once it's released later this year. Driver support for both Windows and Linux exists already, so it's now up to SSD OEMs to release compatible SSDs.
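As a rough illustration of what "one fifth of ~100µs" means, the short Python sketch below works out the resulting latency and how many strictly sequential requests per second each latency allows. The one-request-at-a-time model (queue depth 1, nothing else in the way) is my simplifying assumption, and `serial_ops_per_second` is a hypothetical helper, not a measurement.

```python
CURRENT_LATENCY_US = 100.0                    # approximate figure from the article
FUTURE_LATENCY_US = CURRENT_LATENCY_US / 5    # "cut to one fifth"


def serial_ops_per_second(latency_us):
    """Requests per second if each request waits for the previous one to finish."""
    return 1_000_000 / latency_us


print(f"Future latency: {FUTURE_LATENCY_US:.0f} µs")
print(f"Serial requests/s today:  {serial_ops_per_second(CURRENT_LATENCY_US):,.0f}")
print(f"Serial requests/s future: {serial_ops_per_second(FUTURE_LATENCY_US):,.0f}")
```

Client workloads tend to run at low queue depths, so a drop from ~100µs to ~20µs per request is the kind of change that shows up directly in responsiveness rather than only in benchmark throughput.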


130 Comments


  • R0H1T - Thursday, March 13, 2014

    "This is actually the same motherboard as our 2014 SSD testbed but with added SATAe functionality."

    Does this mean you're going to test next gen SSD's with this(SATAe) & if so perhaps anytime during the current 2014 calendar year?
  • ddriver - Thursday, March 13, 2014

    So why not use 2 lane PCIE for the SSD instead - it does look like it uses less power and has higher bandwidth than SATAE?
  • DanNeely - Thursday, March 13, 2014

    Mini ITX with a discrete GPU (or any other card) or mATX with dual GPU setups either don't have anywhere to put a PCIe SSD or don't have anywhere good to put one.
  • SirKnobsworth - Saturday, March 15, 2014

    That's what M.2 is for.
  • Bigman397 - Friday, April 04, 2014

    Which is a much better solution than retrofitting controllers and protocols meant for rotational media.
  • Kristian Vättö - Thursday, March 13, 2014

    The motherboard in our 2014 testbed is the normal Z87 Deluxe without SATAe. There aren't any official SATAe products yet so we're not sure how we'll test those but the ASUS board is certainly an option.
  • MrPoletski - Thursday, March 13, 2014

    I wonder what ridiculous speed SSD's we are going to start seeing with this tech. Quite exciting really.
  • nathanddrews - Friday, March 14, 2014

    The Future!

    http://www.tomsitpro.com/articles/intel-silicon-ph...
  • thevoiceofreason - Thursday, March 13, 2014

    "because after all we are using cabling that should add latency"
    Why would you assume that?
  • DiHydro - Thursday, March 13, 2014

    When talking about one nanosecond signals, a charge will travel approximately 30 cm or 1 foot. If you add length onto a signal path, it will delay your transmission speed.
