Random Read Performance

Our first test of random read performance uses very short bursts of operations issued one at a time with no queuing. The drives are given enough idle time between bursts to yield an overall duty cycle of 20%, so thermal throttling is not a factor. Each burst consists of a total of 32MB of 4kB random reads spread across a 16GB span of the disk, and the total data read over the course of the test is 1GB.
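The exact job files for this test aren't published, but the comments below confirm the tool is fio 3.6 with the windowsaio engine on Windows. A minimal sketch of a single burst, with the device path and the burst-pacing wrapper as assumptions, might look like:

    ; hypothetical fio job for a single burst: 32MB of 4kB random
    ; reads at QD1, confined to a 16GB span of the raw device
    [burst-randread]
    filename=\\.\PhysicalDrive1
    ioengine=windowsaio
    direct=1
    rw=randread
    bs=4k
    iodepth=1
    ; random offsets fall within the first 16GB of the drive
    size=16g
    ; end the burst once 32MB has been read
    io_size=32m

An outer script would run this job 32 times to reach the 1GB total, sleeping between invocations long enough to hold the overall duty cycle at 20%.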

Burst 4kB Random Read (Queue Depth 1)

The M.2 Optane modules offer the fastest burst random read speeds when tested as standalone drives, but Intel's caching system imposes substantial overhead. Even with that overhead, random read performance is far above any solution that doesn't involve 3D XPoint memory. As in past reviews, we find that the Optane Memory and Optane SSD 800P have a slight advantage here over the top-of-the-line Optane SSD 900P.

Our sustained random read performance test is similar to the random read test from our 2015 test suite: queue depths from 1 to 32 are tested, and the average performance and power efficiency across QD1, QD2 and QD4 are reported as the primary scores. Each queue depth is tested for one minute or 32GB of data transferred, whichever is shorter. After each queue depth is tested, the drive is given up to one minute to cool off so that the higher queue depths are unlikely to be affected by accumulated heat build-up. The individual read operations are again 4kB, and cover a 64GB span of the drive.
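Expressed as a single fio job file, the sweep might look like the sketch below. The device path and the power-of-two queue depth steps are assumptions, and the cool-off idle between depths would be handled by an outer script, since fio has no built-in inter-job delay of that kind. Each job ends at 32GB transferred or 60 seconds, whichever comes first:

    ; hypothetical sustained random read queue depth sweep
    [global]
    filename=\\.\PhysicalDrive1
    ioengine=windowsaio
    direct=1
    rw=randread
    bs=4k
    ; operations cover a 64GB span of the drive
    size=64g
    ; stop at 32GB transferred or 60 seconds, whichever is first
    io_size=32g
    runtime=60

    [qd1]
    iodepth=1
    [qd2]
    stonewall
    iodepth=2
    [qd4]
    stonewall
    iodepth=4
    [qd8]
    stonewall
    iodepth=8
    [qd16]
    stonewall
    iodepth=16
    [qd32]
    stonewall
    iodepth=32

The stonewall option makes each queue depth wait for the previous one to finish instead of letting all the jobs run concurrently.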

Sustained 4kB Random Read

The sustained random read test covers a larger span of the drive, and the 32GB and 64GB modules are not large enough to cache the entire dataset plus the necessary cache management metadata, leaving them with performance close to that of the hard drive. The 118GB cache is sufficient to contain the full data set for this test; its performance is below that of the Optane drives tested as standalone drives, but still out of reach of flash-based storage.

The random read performance scaling of the Optane Memory and 800P drives is rather uneven at higher queue depths, but they do still reach very high throughput. The 118GB cache configuration doesn't scale to higher queue depths as well as the standalone SSD configuration, and the 900P hits a wall at a far lower performance level than our Linux benchmarking shows it is capable of.

Random Write Performance

Our test of random write burst performance is structured similarly to the random read burst test, but each burst is only 4MB and the total test length is 128MB. The 4kB random write operations are distributed over a 16GB span of the drive, and the operations are issued one at a time with no queuing.
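Under the same assumptions as the read sketches above, a single write burst might be expressed as:

    ; hypothetical fio job for one burst: 4MB of 4kB random
    ; writes at QD1 within a 16GB span of the drive
    [burst-randwrite]
    filename=\\.\PhysicalDrive1
    ioengine=windowsaio
    direct=1
    rw=randwrite
    bs=4k
    iodepth=1
    size=16g
    ; one burst is 4MB of writes; a wrapper repeats the job
    ; 32 times for the 128MB total
    io_size=4m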

Burst 4kB Random Write (Queue Depth 1)

On the burst random write test, the larger two caching configurations perform far above what any standalone drive delivers under Windows. The 32GB Optane Memory module also scores better when used as a cache than as a standalone SSD. It is possible that Intel's caching software is also using a RAM cache and is lying to the benchmark software about whether the writes have actually made it onto non-volatile storage. However, the performance here is not actually beyond what NVMe SSDs deliver when we test them under Linux, so it is also possible that there are simply some much-needed fast paths in Intel's drivers.
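One way to probe the RAM-cache hypothesis (an editorial sketch, not part of this review's methodology; the file name is an assumption) is to force a flush after every write. A driver that buffers writes in DRAM but honors flush commands should slow down dramatically under this job:

    ; hypothetical durability probe: flush after every write
    [flush-randwrite]
    filename=testfile.bin
    ioengine=windowsaio
    direct=1
    rw=randwrite
    bs=4k
    iodepth=1
    size=16g
    io_size=4m
    ; issue a flush once each write completes
    fsync=1

Note that a driver that ignores flush requests entirely would show no slowdown, so a fast result here wouldn't settle the question on its own.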

As with the sustained random read test, our sustained 4kB random write test runs for up to one minute or 32GB per queue depth, covering a 64GB span of the drive and giving the drive up to 1 minute of idle time between queue depths to allow for write caches to be flushed and for the drive to cool down.
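The corresponding write sweep differs from the read sweep sketched earlier only in the rw option; for completeness, under the same assumptions:

    ; hypothetical sustained random write queue depth sweep
    [global]
    filename=\\.\PhysicalDrive1
    ioengine=windowsaio
    direct=1
    rw=randwrite
    bs=4k
    size=64g
    io_size=32g
    runtime=60

    [qd1]
    iodepth=1
    [qd2]
    stonewall
    iodepth=2
    [qd4]
    stonewall
    iodepth=4
    [qd8]
    stonewall
    iodepth=8
    [qd16]
    stonewall
    iodepth=16
    [qd32]
    stonewall
    iodepth=32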

Sustained 4kB Random Write

The sustained random write test covers more data than can be cached on the 64GB Optane Memory M10, so it and the 32GB cache module fall far behind mainstream SATA SSDs. The standalone Optane SSDs continue to offer great performance, and the 118GB Optane SSD 800P as a cache device tops the chart.

For the one configuration with a cache large enough to handle this test, performance scales up much sooner than in the standalone SSD configuration: QD2 gives almost the full random write speed. When the cache is too small, increasing queue depth just makes performance worse.

Comments

  • jordanclock - Wednesday, May 16, 2018 - link

    Yeah, 64GB is ~59GiB.
  • IntelUser2000 - Tuesday, May 15, 2018 - link

    Billy,

    Could you tell us why the performance is much lower? I was thinking Meltdown but 800P article says it has the patch enabled. The random performance here is 160MB/s for 800P, but on the other article it gets 600MB/s.
  • Billy Tallis - Tuesday, May 15, 2018 - link

The synthetic benchmarks in this review were all run under Windows so that they could be directly compared to results from the Windows-only caching drivers. My other reviews use Linux for the synthetic benchmarks. At the moment I'm not sure if the big performance disparity is due entirely to Windows limitations, or if there's some system tuning I could do to Windows to bring performance back up. My Linux testbed is set up to minimize OS overhead, but the Windows images used for this review were all stock out-of-the-box settings.
  • IntelUser2000 - Tuesday, May 15, 2018 - link

    What is used for the random tests? IOmeter?
  • Billy Tallis - Tuesday, May 15, 2018 - link

FIO version 3.6, Windows binaries from https://bluestop.org/fio/ (and Linux binaries compiled locally, for the other reviews). The only fio setting that had to change when moving the scripts from Linux to Windows was the ioengine option for selecting which APIs to use for IO. On Linux, QD1 tests are done with synchronous IO and higher queue depths with libaio; on Windows, all queue depths used asynchronous IO.

    In this review I also didn't bother secure erasing the drives between running the burst and sustained tests, but that shouldn't matter much for these drives.
  • IntelUser2000 - Tuesday, May 15, 2018 - link

    So looking at the original Optane Memory review, the loss must be due to Meltdown as it also gets 400MB/s.
  • Billy Tallis - Tuesday, May 15, 2018 - link

    The Meltdown+Spectre workarounds don't have anywhere near this kind of impact on Linux, so I don't think that's a sufficient explanation for what's going on with this review's Windows results.

    Last year's Optane Memory review only did synthetic benchmarks of the drive as a standalone device, not in a caching configuration because the drivers only supported boot drive acceleration at that time.
  • IntelUser2000 - Tuesday, May 15, 2018 - link

The strange performance may also explain why it's sometimes faster in caching than when it's standalone.

    Certainly the drive is capable of more than that, judging by the raw media performance.

    My point with the last review was that, whether it's standalone or not, the drive in the Optane Memory review is getting ~400MB/s, while in this review it's getting 160MB/s.
  • tuxRoller - Wednesday, May 16, 2018 - link

As Billy said, you're comparing results from two different OSes.
  • Intel999 - Tuesday, May 15, 2018 - link

Will there be a comparison between the uber-expensive Intel approach to speeding up boot times and AMD's free approach using StoreMI?
