Single-Threaded Performance

This next batch of tests measures how much performance a single thread can drive using asynchronous I/O at queue depths ranging from 1 to 64. With these fast SSDs, the CPU can become the bottleneck at high queue depths, and clock speed can have a major impact on IOPS.

4kB Random Reads

With a single thread issuing asynchronous requests, all of the SSDs top out around 1.2GB/s for random reads. What separates them is how high a queue depth is necessary to reach this level of performance, and what their latency is when they first reach saturation.

[Chart: Random Read Throughput. Throughput: IOPS, MB/s. Latency: mean, median, 99th percentile, 99.999th percentile.]

For the Optane SSDs, the queue depth only needs to reach 4-6 in order to be near the highest attainable random read performance, and further increases in queue depth only add to the latency without improving throughput. The flash-based SSDs require queue depths well in excess of 32. Even long after the Optane SSDs have reached saturation and latency has begun to climb, the Optane SSDs continue to offer better QoS than the flash SSDs.
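
The flat-throughput, rising-latency behavior past saturation follows from Little's Law (requests in flight = throughput x mean latency). As a rough sketch in Python, assuming a saturated rate of about 293k IOPS (our own conversion of the ~1.2GB/s figure at an assumed 4096-byte transfer size, not a measured value), the mean latency implied by each queue depth would be:

```python
# Little's Law sketch: queue depth = IOPS x mean latency.
# SATURATED_IOPS is an assumption derived from ~1.2 GB/s of 4096-byte reads,
# not a number taken from the charts.
SATURATED_IOPS = 293_000


def mean_latency_us(queue_depth: int, iops: float) -> float:
    """Mean latency in microseconds implied by Little's Law."""
    return queue_depth / iops * 1e6


if __name__ == "__main__":
    for qd in (4, 8, 16, 32, 64):
        latency = mean_latency_us(qd, SATURATED_IOPS)
        print(f"QD{qd:<2} -> {latency:6.1f} us mean latency")
```

Under that assumption, every doubling of queue depth beyond the saturation point doubles the mean latency while IOPS stays the same. The measured curves will differ in detail.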

4kB Random Writes

The Optane SSDs offer the best single-threaded random write performance, but the margins are much smaller than for random reads, thanks to the write caches on the flash-based SSDs. The flash SSDs have random write latencies that are only 2-3x higher than the Optane SSD's latency, and the throughput advantage of the Optane SSD at saturation is less than 20%.

[Chart: Random Write Throughput. Throughput: IOPS, MB/s. Latency: mean, median, 99th percentile, 99.999th percentile.]

The Optane SSDs saturate around QD4, where the CPU becomes the bottleneck, and the flash-based SSDs follow suit between QD8 and QD16. Once all the drives are saturated at about the same throughput, the Optane SSDs offer far more consistent performance.

128kB Sequential Reads

With the large 128kB block size, the sequential read test doesn't hit a CPU/IOPS bottleneck like the random read test above. The Optane SSDs saturate at the rated throughput of about 2.4-2.5GB/s while the Micron 9100 MAX and the Intel P3608 scale to higher throughput.
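
A quick back-of-the-envelope calculation shows why block size matters here. The sketch below, in Python, converts the throughput figures quoted in this section into I/O operations per second. It assumes decimal gigabytes and transfers of 4096 and 131072 bytes, and the function name is ours for illustration.

```python
# Convert a throughput figure into the number of I/Os per second it requires.
# Assumptions: 1 GB = 1e9 bytes, 4kB = 4096 bytes, 128kB = 131072 bytes.


def iops_for_throughput(bytes_per_second: float, block_bytes: int) -> float:
    """I/O operations per second needed to sustain the given throughput."""
    return bytes_per_second / block_bytes


# ~1.2 GB/s ceiling seen in the single-threaded 4kB random read test
random_read_iops = iops_for_throughput(1.2e9, 4 * 1024)

# ~2.4 GB/s rated sequential read throughput of the Optane SSDs at 128kB
sequential_read_iops = iops_for_throughput(2.4e9, 128 * 1024)

if __name__ == "__main__":
    print(f"4kB random reads at 1.2 GB/s:       {random_read_iops:,.0f} IOPS")
    print(f"128kB sequential reads at 2.4 GB/s: {sequential_read_iops:,.0f} IOPS")
    print(f"ratio: {random_read_iops / sequential_read_iops:.0f}x")
```

Twice the bandwidth at 128kB needs only a small fraction of the I/O rate at which the single thread topped out on 4kB reads, so the per-I/O CPU cost is unlikely to be the limit in this test.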

[Chart: Sequential Read Throughput. Throughput: IOPS, MB/s. Latency: mean, median, 99th percentile, 99.999th percentile.]

The Optane SSDs reach their full sequential read speed at QD2, and the flash-based SSDs don't catch up until well after QD8. The 99th and 99.999th percentile latencies of the Optane SSDs are more than an order of magnitude lower when the drives are operating at their respective saturation points.

128kB Sequential Writes

Write caches again allow the flash-based SSDs to approach the write latency of the Optane SSDs, albeit at lower throughput. The Optane SSDs quickly exceed their specified 2GB/s sequential write throughput while the flash-based SSDs have to sacrifice low latency in order to reach high throughput.

[Chart: Sequential Write Throughput. Throughput: IOPS, MB/s. Latency: mean, median, 99th percentile, 99.999th percentile.]

As with sequential reads, the Optane SSDs reach saturation at a mere QD2, while the flash-based SSDs need until around QD8 to scale up to full throughput. By the time the flash-based SSDs reach their maximum speed, their latency has at least doubled.

58 Comments

  • "Bullwinkle J Moose" - Thursday, November 9, 2017 - link

    Humor me.....

    How fast can you copy and paste a 100GB file from and to the same Optane SSD

    I don't believe your mixed mode results adequately demonstrate the internal throughput

    At least not until you demonstrate a direct comparison
  • Billy Tallis - Thursday, November 9, 2017 - link

    Your concept of "internal throughput" has no basis in reality. File copies (on a filesystem that does not do copy-on-write) require the file data to be read from the SSD into system DRAM, then written back to the SSD. There are no "copy" commands in the NVMe command set.
  • "Bullwinkle J Moose" - Thursday, November 9, 2017 - link

    "There are no "copy" commands in the NVMe command set."
    ---------------------------------------------------------------------------------
    That might be fixed with a few more onboard processors in the future but does not answer my question

    How fast can you copy/paste 100GB on THAT specific drive?
  • "Bullwinkle J Moose" - Thursday, November 9, 2017 - link

    Better yet, I'd like you to GUESS how fast it can copy and paste based on your mixed mode analysis and then go measure it
  • Lord of the Bored - Friday, November 10, 2017 - link

    How will a new processor change that there is no way to tell the drive to do what you want? We don't trust storage devices to "do what I mean", because the cost of a mistake is too high. No device anyone should be using will say "it looks like they're writing back the data they just read in, I'mma ignore the input and duplicate it from the cache to save time." Especially since they can't know if the data is changed in advance.

    Barring a new interface standard, it will take exactly as long to copy a file to another location on the same drive as it will to read the file and then write the file, because that is the only provision within the NVMe command set.
  • "Bullwinkle J Moose" - Thursday, November 9, 2017 - link

    What would happen if Intel Colludes with AMD to implement this technology into onboard graphics instead of AMD's plan to use Flash in their graphics cards ?

    Seems to me like Internal throughput would be very important to the design
  • Samus - Thursday, November 9, 2017 - link

    That is also file system dependent. For example, in Mac OS High Sierra, you can copy and paste (duplicate) any size file instantly on any drive formatted with APFS.

    But your question of a block by block transfer of a file internally for a 100GB file would likely take 50 seconds if not factoring in file system efficiency.
  • cygnus1 - Thursday, November 9, 2017 - link

    That's not a copy of the file though. It's just a duplicate file entry referencing the same blocks. That and things like snapshots are possible thanks to the copy on write nature of that file system. But, if any of those blocks were to become corrupted, both 'copies' of the file are corrupt.
  • "Bullwinkle J Moose" - Thursday, November 9, 2017 - link

    Good call Samus

    I noticed that the Windows Fall Crappier Edition takes MUCH longer to copy/move/paste in several tests than the earlier versions of "Spyware Platform 10"

    as well as gives me a "Format Disk Now?" Prompt after formatting a new disk with Disk Manager
    as well as making a zero byte backup with Acronis 2015 (Possible anomaly, Will test again)
    as well as breaking compatibility with several programs that worked in earlier versions
    as well as asking permission to do anything with my files and then often failing to do it
    as well as, well.....you get the idea, they fix one thing and break 5 more

    Disclaimer:
    I do NOT believe that xpoint could be used in its current form for onboard graphics!
    But I'd like to know that the numbers you are getting at AnandTech match your/my expectations and if not, why not?

    Sorry if I'm sounding like an AHole but I'd like to know what this drive can really do, and then what Microsoft graciously allows it to do ?

    make sense?
  • PeachNCream - Friday, November 10, 2017 - link

    It doesn't make sense to even care what a client OS would do with this drive. It's enterprise hardware and is priced/positioned accordingly. Instead of the P4800X, check out the consumer 900p version of Optane instead. It's a lot more likely that model will end up sitting in a consumer PC or an office workstation.
