Sequential Read

Sequential access is usually tested with 128kB transfers, which are large enough that requests can typically be striped across multiple controller channels while still moving a full page or more of data per channel. Real-world sequential transfer sizes vary widely depending on factors such as which application is moving the data and how fragmented the filesystem is.

The drives were preconditioned with two full drive writes of 4kB random writes, so the data on each drive is entirely fragmented. This may limit how much prefetching of user data the drives can perform on the sequential read tests, but they can likely still benefit from better locality of access to their internal mapping tables. These tests were conducted on the Optane Memory as a standalone SSD, not in any caching configuration.
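
For illustration, preconditioning of this sort could be scripted with a tool like fio along the lines sketched below; the device path, queue depth, and other job parameters are placeholders rather than the exact settings used for these tests.

```python
# Hypothetical preconditioning sketch: write the whole drive twice with 4kB random writes.
# Device path and fio parameters are illustrative assumptions, not the article's script.
import subprocess

DEVICE = "/dev/nvme0n1"  # assumed test device; this destroys all data on it

# Query the raw capacity so fio can be asked for two full drive writes' worth of data.
capacity = int(subprocess.check_output(["blockdev", "--getsize64", DEVICE]))

subprocess.run([
    "fio",
    "--name=precondition",
    f"--filename={DEVICE}",
    "--rw=randwrite",             # 4kB random writes leave the data fully fragmented
    "--bs=4k",
    "--iodepth=32",               # assumed preconditioning queue depth
    "--ioengine=libaio",
    "--direct=1",
    f"--io_size={2 * capacity}",  # two full drive writes
], check=True)
```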

Queue Depth 1

The test of sequential read performance at different transfer sizes was conducted at queue depth 1. Each transfer size was used for four minutes, and the throughput was averaged over the final three minutes of each test segment.
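
A transfer-size sweep like this can be scripted with fio roughly as sketched below; the size list, device path, and the use of fio's ramp_time to discard the first minute are illustrative assumptions, not the exact test harness.

```python
# Hypothetical sketch of the QD1 sequential read sweep: each transfer size runs for
# four minutes, with the first minute excluded from the average (fio's ramp_time).
# Tool choice, device path, and size list are illustrative assumptions.
import json
import subprocess

DEVICE = "/dev/nvme0n1"   # assumed test device
SIZES = ["512", "1k", "2k", "4k", "8k", "16k", "32k", "64k", "128k", "256k", "512k", "1M"]

for bs in SIZES:
    result = subprocess.run([
        "fio",
        "--name=seq-read-sweep",
        f"--filename={DEVICE}",
        "--rw=read",            # sequential reads
        f"--bs={bs}",
        "--iodepth=1",          # queue depth 1
        "--numjobs=1",
        "--ioengine=libaio",
        "--direct=1",
        "--time_based",
        "--ramp_time=60",       # ignore the first minute
        "--runtime=180",        # average over the remaining three minutes
        "--output-format=json",
    ], capture_output=True, text=True, check=True)
    bw_kib = json.loads(result.stdout)["jobs"][0]["read"]["bw"]  # KiB/s
    print(f"{bs}: {bw_kib / 1024:.1f} MiB/s")
```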

[Chart: Sequential Read (QD1, by transfer size)]

The three PCIe drives show similar growth through the small and mid-range transfer sizes, but the Optane Memory once again has the highest performance for small transfers, and it outperforms the Samsung 960 EVO across the board.

Queue Depth > 1

For testing sequential read speeds at different queue depths, we use the same overall test structure as for random reads: total queue depths of up to 64 are tested using a maximum of four threads. Each thread reads sequentially but from a different region of the drive, so the read commands the drive receives are not entirely sorted by logical block address.
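
The exact split between thread count and per-thread queue depth is not spelled out above; the sketch below shows one plausible scheme (double the thread count up to four, then the per-thread depth) together with how each thread could be assigned its own slice of the drive. Both the scheme and the capacity figure are assumptions for illustration.

```python
# Hypothetical sketch of splitting a total queue depth across at most four threads,
# each reading sequentially from its own disjoint region of the drive.
def qd_plan(total_qd: int, max_threads: int = 4):
    # Assumed scheme: add threads up to the limit first, then deepen each thread's queue.
    threads = min(total_qd, max_threads)
    per_thread_qd = total_qd // threads
    return threads, per_thread_qd

def thread_regions(capacity_bytes: int, threads: int):
    # Give each thread a disjoint slice of the drive to read sequentially.
    slice_size = capacity_bytes // threads
    return [(i * slice_size, slice_size) for i in range(threads)]

if __name__ == "__main__":
    capacity = 480 * 10**9  # assumed drive capacity for illustration
    for total_qd in (1, 2, 4, 8, 16, 32, 64):
        threads, per_qd = qd_plan(total_qd)
        print(f"QD{total_qd}: {threads} threads x QD{per_qd}, regions={thread_regions(capacity, threads)}")
```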

The Samsung 960 EVO and Optane Memory start out with QD1 sequential read performance and latency that are relatively close, but at higher queue depths the Optane Memory jumps up to significantly higher throughput.

[Chart: Sequential Read Throughput]

The two Optane devices saturate for sequential reads at QD2, but the Optane Memory experiences a much smaller jump from its QD1 throughput. The flash SSDs are mostly saturated from the start. The Crucial MX300 delivers far lower performance than the SATA interface allows, because this test is multithreaded with up to four workers reading from different parts of the drive.

[Chart: Sequential Read Latency (Mean, Median, 99th Percentile, 99.999th Percentile)]

Since all four drives are saturated through almost all of this test, the latency graphs are fairly boring: increasing queue depth increases latency. For mean and median latency the Optane Memory and the Samsung 960 EVO are relatively close, but for the 99th and 99.999th percentile metrics the 960 EVO mostly trails the Optane Memory by roughly the same factor of two by which the P4800X leads the Optane Memory.

Sequential Write

The sequential write tests are structured identically to the sequential read tests save for the direction the data is flowing. The test of sequential write performance at different transfer sizes is conducted with a single thread operating at queue depth 1. For testing a range of queue depths, a 128kB transfer size is used with up to four worker threads, each writing sequentially but to a different portion of the drive. Each sub-test (transfer size or queue depth) runs for four minutes, and the performance statistics ignore the first minute. These tests were conducted on the Optane Memory as a standalone SSD, not in any caching configuration.
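
For illustration, a single queue-depth step of this multi-threaded sequential write test could be launched with fio as sketched below, using offset_increment to keep each worker in its own region of the drive; the device path, region spacing, and thread/queue-depth split are placeholders rather than the exact parameters used here.

```python
# Hypothetical sketch of one queue-depth step of the sequential write test: 128kB
# transfers, several workers each writing sequentially to its own slice of the drive.
# Device path, slice spacing, and the thread/queue-depth split are assumptions.
import subprocess

DEVICE = "/dev/nvme0n1"         # assumed test device; this destroys data on it
threads, per_thread_qd = 4, 4   # e.g. a total queue depth of 16

subprocess.run([
    "fio",
    "--name=seq-write-qd",
    f"--filename={DEVICE}",
    "--rw=write",               # sequential writes
    "--bs=128k",
    f"--numjobs={threads}",
    f"--iodepth={per_thread_qd}",
    "--offset_increment=64g",   # assumed spacing so each worker writes a different region
    "--ioengine=libaio",
    "--direct=1",
    "--time_based",
    "--ramp_time=60",           # statistics ignore the first minute
    "--runtime=180",
    "--group_reporting",
], check=True)
```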

[Chart: Sequential Write (QD1, by transfer size)]

The enterprise-focused Optane SSD P4800X is slower than the consumer Optane Memory for sequential writes of less than 4kB, and even the Samsung 960 EVO beats the P4800X at 512B transfers. The 960 EVO's performance is inconsistent through the second half of the test, but on average it is far closer to the MX300 than to either Optane device. For larger transfers the MX300 is about a tenth the speed of the Optane Memory.

Queue Depth > 1

The sequential write throughput of the Optane SSD DC P4800X dwarfs that of the other three drives, even the Optane Memory. The Optane Memory does provide substantially higher throughput than the flash SSDs, but it does not have a latency advantage for sequential writes.

[Chart: Sequential Write Throughput]

The Crucial MX300 is the only drive that does not get a throughput boost going from QD1 to QD2; as with the random write test, it is not able to improve performance when the higher queue depth comes from multiple threads writing to the drive. The Samsung 960 EVO improves with the addition of a second thread, but beyond that it simply gets more inconsistent. The Optane Memory and P4800X are both very consistent and saturate at QD2 after a moderate improvement from QD1.

[Chart: Sequential Write Latency (Mean, Median, 99th Percentile, 99.999th Percentile)]

The flash SSDs get more inconsistent with increased thread count and queue depth, but other than that the latency charts show the predictable growth in latency that comes from the drives all being saturated in terms of throughput.

Comments

  • Billy Tallis - Wednesday, April 26, 2017 - link

    As long as you have Intel RST RAID disabled for NVMe drives, it'll be accessible as a standard NVMe device and available for use with non-Intel caching software.
  • fanofanand - Tuesday, April 25, 2017 - link

    I came here to read ddriver's "hypetane" rants, and I was not disappointed!
  • TallestJon96 - Tuesday, April 25, 2017 - link

    Too bad about the drive breaking.

    As an enthusiast who is gaming 90% of the time with my pc, I don't think this is for me right now. I actually just bought a 960 evo 500gb to complement my 1 tb 840 evo. Overkill for sure, but I'm happy with it, even if the difference is sometimes subtle.

    This technology really excites me. If they can get a system running with no DRAM or NAND, and just use a large block of XPoint, that could make for a really interesting system. Put 128 gb of this stuff paired with a 2c/4t mobile chip in a laptop, and you could get a really lean system that is fast for everyday use cases (web browsing, video watching, etc).

    For my use case, I'd love to have a reason to buy it (no more loading times ever would be very futuristic) but it'll take time to really take off.
  • MrSpadge - Tuesday, April 25, 2017 - link

    > no more loading times

    Not going to happen, because there's quite some CPU work involved with loading things.
  • SanX - Tuesday, April 25, 2017 - link

    Blahblahblah endurance, price, consumption, superspeed. Where are they? ROTFLOL At least don't show these shameful speeds if you opened your mouth this loud, Intel. No one will ever look at anything less than 3.5GB/s set by Samsung 960 Pro if you trolled about superspeeds.
  • cheshirster - Wednesday, April 26, 2017 - link

    Is there any technical reasoning why this won't work with older CPUs?
    I don't see this being any different than Intel RST.
  • KAlmquist - Thursday, April 27, 2017 - link

    I think that Intel SRT caches reads, whereas the Optane Memory caches both reads and writes. My guess is that when Intel SRT places data in the cache, it doesn't immediately update the non-volatile lookup tables indicating where that data is stored. Instead, it probably waits until a bunch of data has been added, and then records the locations of all of the cached data. The reason for this would be that NAND can only be written in page units. If Intel were to update the non-volatile mapping table every time it added a page of data to the cache, that would double the amount of data written to the caching SSD.

    If I'm correct, then with Intel SRT, a power loss can cause some of the data in the SSD cache to be lost. The data itself would still be there, but it won't appear in the lookup table, making it inaccessible. That doesn't matter because SRT only caches reads, so the data lost from the cache will still be on the hard drive.

    In contrast, Optane Memory presumably updates the mapping table for cached data immediately, taking advantage of the fact that it uses a memory technology that allows small writes. So if you perform a bunch of 4K random writes, the data is written to the Optane storage only, resulting in much higher write performance than you would get with Intel SRT.

    In short, I would guess that Optane Memory uses a different caching algorithm than Intel SRT; an algorithm that is only implemented in Intel's latest chipsets.

    That's unfortunate, because if Optane Memory were supported using software drivers only (without any chipset support), it would be a very attractive upgrade to older computer systems. At $44 or $77, an Optane Memory device is a lot less expensive than upgrading to an SSD. Instead, Optane Memory is targeted at new systems, where the economics are less compelling.
  • mkozakewich - Thursday, April 27, 2017 - link

    I would really like to see the 16GB Optane filled with system paging file (on a device with 2 or 4 GB of RAM) and then do some general system experience tests. This seems like the perfect solution: The system is pretty good about offloading stuff that's not needed, and pulling needed files into working memory for full speed; and the memory can be offloaded to or loaded from the Optane cache quickly enough that it shouldn't cause many slowdowns when switching between tasks. This seems like the best strategy, in a world where we're still seeing 'pro' devices with 4 GB of RAM.
  • Ugur - Monday, May 1, 2017 - link

    I wish Intel would release Optane sticks/drives of 1-4TB sizes asap and sell them for 100-300 more than SSDs of the same size immediately.
    I'm kinda disappointed they do this type of tiered rollout where it looks like it'll take ages until I can get an Optane drive at larger sizes for halfway reasonable prices.
    Please Intel, make it available asap, I want to buy it.
    Thanks =)
  • abufrejoval - Monday, May 8, 2017 - link

    Well, the most important thing is that Optane is now a real product on the market, for consumers and enterprise customers. So some Intel senior managers don't need to get fired or cross off items on their bonus score cards.

    Marketing will convince the world that Optane is better, and most importantly that only Intel can have it inside: no ARM, no Power, no Zen based server shall ever have it.

    For the DRAM-replacement variant, that exclusivity had a reason: without proper firmware support it won't work, and without special cache flushing instructions it would be too slow or still volatile.
    Of course, all of that could be shared with the competition, but who wants to give up a practical monopoly, which no competition can contest in court before their money runs out?

    For the PCIe variant, the Intel chipset and OS dependencies are all artificial, but doesn't that make things better for everyone? Now people can give up ECC support in cheap Pentiums and instead gain Optane support for a premium on CPUs and chipsets which use the very same hardware underneath for production cost efficiency. Whoever can sell that truly deserves their bonus!

    Actually, I’d propose they be paid in snake oil.

    For the consumer, with a linear link between Optane and its downstream storage tier, it means the storage path has twice as many opportunities to fail. For the service technician, it means four times as many test scenarios to perform. Just think about how that will double again once Optane does in fact also come to the DIMM socket! Moore's law is not finished after all! Yeah!

    Perhaps Microsoft could be talked into creating a special Optane Edition which offers much better granularity for forensic data storage, and surely there would be plenty of work for security researchers, who just love to find bugs really, really deep down in critical Intel Firmware, which is designed for the lowest Total Cost of TakeOwnership in the industry!

    Where others see crisis, Intel creates opportunities!
