Checking Intel's Numbers

The product brief for the Optane SSD DC P4800X provides a limited set of performance specifications and omits sequential throughput entirely. It gives latency and throughput targets for 4kB random reads, random writes, and a 70/30 mix of reads and writes.

This section has our results for how the Optane SSD measures up to Intel's advertised specifications and how the flash SSDs fare on the same tests. The rest of this review provides deeper analysis of how these drives perform across a range of queue depths, transfer sizes, and read/write mixes.

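4kB Random Read at a Queue Depth of 1 (QD1)

| Drive | Throughput (MB/s) | Throughput (IOPS) | Mean latency (µs) | Median latency (µs) | 99th percentile latency (µs) | 99.999th percentile latency (µs) |
|---|---|---|---|---|---|---|
| Intel Optane SSD DC P4800X 375GB | 413.0 | 108.3k | 8.9 | 9 | 10 | 37 |
| Intel SSD DC P3700 800GB | 48.7 | 12.8k | 77.9 | 76 | 96 | 2768 |
| Micron 9100 MAX 2.4TB | 35.3 | 9.2k | 107.7 | 104 | 117 | 306 |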

Intel's queue depth 1 specifications are expressed in terms of latency, and a throughput specification at QD1 would be redundant. Intel specifies a "typical" latency of less than 10µs, and most QD1 random reads on the Optane SSD take 8 or 9µs; even the 99th percentile latency is still 10µs.
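The redundancy follows from Little's law: with a fixed number of commands in flight, throughput is roughly the queue depth divided by the mean latency. Here is a minimal sketch in Python using the mean latencies from the table above. The function name is ours, chosen for illustration, and the estimate ignores any per-command overhead outside the measured latency, so it should land at or slightly above the measured IOPS.

```python
def estimated_iops(queue_depth: int, mean_latency_us: float) -> float:
    """Little's law: throughput = commands in flight / mean time per command."""
    return queue_depth / (mean_latency_us * 1e-6)

# Mean QD1 random read latency (µs) and measured IOPS, from the table above.
qd1_reads = {
    "Optane P4800X": (8.9, 108_300),
    "Intel P3700": (77.9, 12_800),
    "Micron 9100 MAX": (107.7, 9_200),
}

for drive, (latency_us, measured) in qd1_reads.items():
    estimate = estimated_iops(1, latency_us)
    print(f"{drive}: estimated {estimate / 1000:.1f}k IOPS, "
          f"measured {measured / 1000:.1f}k IOPS")
```

For the Optane SSD, an 8.9µs mean latency works out to roughly 112k IOPS, close to the 108.3k we measured.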

The 99.999th percentile target is less than 60µs, which the Optane SSD beats by a wide margin. Overall, the Optane SSD passes with ease. The flash SSDs are 8-12x slower on average, and the 99.999th percentile latency of the Intel P3700 is far worse, at around 75x slower.
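A 99.999th percentile figure reflects only the slowest 1 in 100,000 operations, so it takes a long run to measure at all. The sketch below applies the nearest-rank definition to synthetic data. The sample values are made up for illustration and do not come from our tests, and a benchmark tool may compute its percentiles differently.

```python
import math

def nearest_rank_percentile(samples, percentile):
    """Smallest sample with at least `percentile` percent of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(percentile / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

# Synthetic latencies in µs: mostly fast operations plus a few slow outliers.
latencies = [9] * 999_980 + [40] * 15 + [2500] * 5

print("median:  ", nearest_rank_percentile(latencies, 50))
print("99th:    ", nearest_rank_percentile(latencies, 99))
print("99.999th:", nearest_rank_percentile(latencies, 99.999))
```

In this made-up data set the five 2500µs outliers are rarer than 1 in 100,000, so the 99.999th percentile reports 40µs rather than 2500µs.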

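4kB Random Read at a Queue Depth of 16 (QD16)

| Drive | Throughput (MB/s) | Throughput (IOPS) | Mean latency (µs) | Median latency (µs) | 99th percentile latency (µs) | 99.999th percentile latency (µs) |
|---|---|---|---|---|---|---|
| Intel Optane SSD DC P4800X 375GB | 2231.0 | 584.8k | 25.5 | 25 | 41 | 81 |
| Intel SSD DC P3700 800GB | 637.9 | 167.2k | 93.9 | 91 | 163 | 2320 |
| Micron 9100 MAX 2.4TB | 517.5 | 135.7k | 116.2 | 114 | 205 | 1560 |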

The Optane SSD's QD16 random read throughput is 584.8k IOPS, about 6% above the official specification of 550k IOPS. Its 99.999th percentile latency is 81µs, well under the target of less than 150µs. The flash SSDs are 3-5x slower on most metrics, and 20-30x slower at the 99.999th percentile.
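The ratios quoted here can be reproduced directly from the QD16 random read table. A short Python sketch follows; the dictionary layout and the `slowdown` helper are our own illustrative names, and the numbers are copied from the table above.

```python
BASELINE = "Optane P4800X"

# QD16 4kB random read results from the table above.
results = {
    "Optane P4800X": {"iops_k": 584.8, "mean_us": 25.5, "p99_us": 41, "p99999_us": 81},
    "Intel P3700": {"iops_k": 167.2, "mean_us": 93.9, "p99_us": 163, "p99999_us": 2320},
    "Micron 9100 MAX": {"iops_k": 135.7, "mean_us": 116.2, "p99_us": 205, "p99999_us": 1560},
}

def slowdown(drive, metric):
    """How many times worse `drive` is than the baseline on `metric`."""
    value, base = results[drive][metric], results[BASELINE][metric]
    # Higher is better for IOPS; lower is better for every latency metric.
    return base / value if metric == "iops_k" else value / base

for drive in results:
    if drive == BASELINE:
        continue
    ratios = ", ".join(f"{m}: {slowdown(drive, m):.1f}x" for m in results[drive])
    print(f"{drive} -> {ratios}")
```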

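4kB Random Write at a Queue Depth of 1 (QD1)

| Drive | Throughput (MB/s) | Throughput (IOPS) | Mean latency (µs) | Median latency (µs) | 99th percentile latency (µs) | 99.999th percentile latency (µs) |
|---|---|---|---|---|---|---|
| Intel Optane SSD DC P4800X 375GB | 360.6 | 94.5k | 8.9 | 9 | 10 | 64 |
| Intel SSD DC P3700 800GB | 350.6 | 91.9k | 9.2 | 9 | 18 | 81 |
| Micron 9100 MAX 2.4TB | 160.9 | 42.2k | 22.2 | 22 | 24 | 76 |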

For QD1 random writes, Intel specifies the same 10µs latency, while the 99.999th percentile target is relaxed from 60µs to 100µs. In our results, the QD1 random write throughput of the Optane SSD (360.6 MB/s) is a bit lower than its QD1 random read throughput (413.0 MB/s), but the latency is roughly the same (8.9µs mean, 10µs at the 99th percentile).

However, the Optane SSD only manages a passing score when the application uses asynchronous I/O APIs. Simple synchronous write() system calls push the average latency up to 11-12µs.
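For reference, the synchronous path means issuing one blocking write at a time and waiting for each to return. The Python sketch below times that call pattern against a temporary file on a POSIX system. It only illustrates the pattern and is not the benchmark used for this review. Because it does not bypass the page cache, it measures system call overhead rather than drive latency. The file size, operation count, and helper name are arbitrary choices for the example.

```python
import os
import random
import tempfile
import time

BLOCK = 4096          # 4kB transfer size
FILE_BLOCKS = 1024    # 4MB scratch file (arbitrary size for the example)

def time_sync_writes(fd, count):
    """Issue `count` blocking 4kB writes at random offsets; return latencies in µs."""
    payload = os.urandom(BLOCK)
    latencies = []
    for _ in range(count):
        offset = random.randrange(FILE_BLOCKS) * BLOCK
        start = time.perf_counter()
        os.pwrite(fd, payload, offset)  # returns once the kernel has accepted the write
        latencies.append((time.perf_counter() - start) * 1e6)
    return latencies

with tempfile.TemporaryFile() as scratch:
    scratch.truncate(FILE_BLOCKS * BLOCK)
    samples = time_sync_writes(scratch.fileno(), 1000)
    print(f"mean latency: {sum(samples) / len(samples):.1f} µs")
```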

Thanks to their capacitor-backed DRAM caches, the flash SSDs also handle QD1 random writes very well. The Intel P3700 keeps latency mostly below 10µs, and all three drives have 99.999th percentile latency below Intel's 100µs target for the Optane SSD.

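4kB Random Write at a Queue Depth of 16 (QD16)

| Drive | Throughput (MB/s) | Throughput (IOPS) | Mean latency (µs) | Median latency (µs) | 99th percentile latency (µs) | 99.999th percentile latency (µs) |
|---|---|---|---|---|---|---|
| Intel Optane SSD DC P4800X 375GB | 2122.5 | 556.4k | 27.0 | 23 | 65 | 147 |
| Intel SSD DC P3700 800GB | 446.3 | 117.0k | 134.8 | 43 | 1336 | 9536 |
| Micron 9100 MAX 2.4TB | 1144.4 | 300.0k | 51.6 | 34 | 620 | 3504 |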

The Optane SSD DC P4800X is specified for 500k random write IOPS using four threads to provide a total queue depth of 16. In our tests, the Optane SSD scored 556.4k IOPS, exceeding the specification by more than 11%. This equates to a random write throughput of more than 2GB/s.

The flash SSDs depend more heavily on the parallelism that comes with higher capacities, so lower-capacity models tend to be slower. In this case, the 2.4TB Micron 9100 fares much better than the 800GB Intel P3700. The Micron 9100 hits its own specification right on the nose with 300k IOPS, and the Intel P3700 comfortably exceeds its own 90k IOPS specification while remaining the slowest of the three by far. The Optane SSD stays well below its 200µs limit for 99.999th percentile latency at 147µs, while the flash SSDs have outliers of several milliseconds. Even at the 99th percentile, the flash SSDs are 10-20x slower than the Optane SSD.
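The 99.999th percentile targets quoted in this section can be checked against our Optane SSD measurements in a few lines of Python. The list layout and the `headroom_percent` helper are illustrative names of our own, and the figures are the ones stated in the text and tables above.

```python
# (test, 99.999th percentile target in µs, measured µs) for the Optane SSD.
tail_latency = [
    ("QD1 random read", 60, 37),
    ("QD16 random read", 150, 81),
    ("QD1 random write", 100, 64),
    ("QD16 random write", 200, 147),
]

def headroom_percent(target_us, measured_us):
    """Share of the latency budget left unused, in percent."""
    return (target_us - measured_us) / target_us * 100

for test, target, measured in tail_latency:
    status = "pass" if measured < target else "FAIL"
    print(f"{test}: {measured}µs vs <{target}µs -> {status}, "
          f"{headroom_percent(target, measured):.0f}% headroom")
```

The QD16 random write result has the least headroom of the four, but it is still about a quarter below its limit.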

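4kB Random Mixed 70/30 Read/Write at a Queue Depth of 16 (QD16)

| Drive | Throughput (MB/s) | Throughput (IOPS) | Mean latency (µs) | Median latency (µs) | 99th percentile latency (µs) | 99.999th percentile latency (µs) |
|---|---|---|---|---|---|---|
| Intel Optane SSD DC P4800X 375GB | 1929.7 | 505.9k | 29.7 | 28 | 65 | 107 |
| Intel SSD DC P3700 800GB | 519.9 | 136.3k | 115.5 | 79 | 1672 | 5536 |
| Micron 9100 MAX 2.4TB | 518.0 | 135.8k | 116.0 | 105 | 1112 | 3152 |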

On a 70/30 read/write mix, the Optane SSD DC P4800X scores 505.9k IOPS, which beats the specification of 500k IOPS by 1%. Both of the flash SSDs deliver roughly the same throughput as each other, a little over a quarter of the speed of the Optane SSD. Intel doesn't provide a latency specification for this workload, but the measurements unsurprisingly fall in between the random read and random write results. While low-end consumer SSDs sometimes perform dramatically worse on mixed workloads than on pure read or write workloads, none of these drives has that problem, as expected for products in this market segment.
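To summarize the throughput side, the margin over each QD16 specification is simple arithmetic. Here is a brief Python sketch using the specified and measured IOPS quoted in this section; the `margin_percent` helper is an illustrative name of our own.

```python
# (test, specified IOPS, measured IOPS) for the Optane SSD at QD16.
throughput = [
    ("4kB random read", 550_000, 584_800),
    ("4kB random write", 500_000, 556_400),
    ("4kB 70/30 mixed", 500_000, 505_900),
]

def margin_percent(spec_iops, measured_iops):
    """Percentage by which the measurement exceeds the specification."""
    return (measured_iops - spec_iops) / spec_iops * 100

for test, spec, measured in throughput:
    print(f"{test}: {measured / 1000:.1f}k vs {spec / 1000:.0f}k specified "
          f"({margin_percent(spec, measured):+.1f}%)")
```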

117 Comments

  • Billy Tallis - Friday, April 21, 2017 - link

    I said the NVMe driver wasn't manually switched into polling mode; I left it with the default behavior which on 4.8 seems to be not polling unless the application requests. I'm certainly not seeing the 100% CPU usage that would be likely if it was polling.

    If I'd had more time, I would have experimented with the latest kernel versions and the various tricks to get even lower latency.
  • tuxRoller - Friday, April 21, 2017 - link

    I wasn't claiming that you disabled polling, only that polling was disabled, since it should be on by default for this device.
    Assuming you were looking at the sysfs interface, was the key that was set to 0 called io_poll or io_poll_delay? The latter, set to 0, enables hybrid polling, so the CPU wouldn't be pegged.
    Either way, you wouldn't need a new kernel, just to enable a feature the kernel has had since 4.4 for these low latency devices.
    Also, did you disable the pagecache (direct=1) in your fio commands? If you didn't, that would explain why aio was faster since it uses dio.
    Btw, it's not my intent to unnecessarily criticize you because I realize the tests were performed under constrained circumstances. I just would've appreciated some comment in the article that a critical feature for this hardware was not enabled in the kernel.
  • yankeeDDL - Friday, April 21, 2017 - link

    Optane was supposed to be 1000x faster, have 1000X endurance and be 10x denser than NAND (http://hothardware.com/ContentImages/NewsItem/4020...
    I realize this is the first product, but saying that it fell short of expectation is an understatement.
    It has lower endurance, lower density and it is measurably faster, but certainly nowhere close to 1000X.
    Oh, did I mention it is 5-10X more expensive?

    I am quite disappointed, to be honest. It will get better, but "not ready" is something that comes to mind reading the article.
  • Billy Tallis - Friday, April 21, 2017 - link

    3D XPoint memory was supposed to be 1000x faster than NAND, 1000x more durable than NAND, and 10x denser than DRAM. Those claims were about the 3D XPoint memory itself, not the Optane SSD built around that memory.
  • ddriver - Friday, April 21, 2017 - link

    It is probably as good as they said... if you compare it to the shittiest SD card from 10 years ago. Still technically NAND ;)
  • yankeeDDL - Monday, April 24, 2017 - link

    I disagree. I can agree that the speed may be limited by the drive, but even so, it falls short by a large factor. The durability and the density, however, are pretty much platform independent and they are not there by a very, very long shot. Intel itself demonstrated that it is only 2.4-3X faster (https://en.wikipedia.org/wiki/3D_XPoint).

    It clearly has a future, especially as NAND is approaching the end of its scalability. Engineering-wise it is interesting, but today it makes very little sense, while it should have been a slam dunk. I mean, who would have thought twice before buying a 500GB drive that maxes out the SATA for $20-30? But this one ... not so much.
  • zodiacfml - Friday, April 21, 2017 - link

    It will perform better in DIMM.
  • factual - Friday, April 21, 2017 - link

    I don't see xpoint replacing dram due to both latency and endurance not being up to par, but it's going to disrupt the ssd market, and as the technology matures and prices come down, I can see xpoint revolutionizing the storage market as ssd did years ago.

    Competition is clearly worried, since it seems like paid trolls are trying to spread falsehoods and bs here and elsewhere on the web.
  • ddriver - Saturday, April 22, 2017 - link

    I just bet it will be highly disturbing to the SSD market LOL. With its inflated price, limited capacity and pretty much unnecessary advantages I can just see people lining up to buy that and leaving SSDs on the shelves.
  • factual - Saturday, April 22, 2017 - link

    You are either extremely ignorant or a paid troll!!! Anyone who understands technology knows that new tech is always expensive. When SSDs came to the market, they were much more expensive and had a lot less capacity than HDDs, but they closed the gap and disrupted the market. The same is bound to happen for Xpoint, which performs better than NAND by orders of magnitude.
