Final Words: Is 3D XPoint Ready?

The Intel Optane SSD DC P4800X is a very high-performing enterprise SSD, but more importantly it is the first shipping product using Intel's 3D XPoint memory technology. After a year and a half of talking up 3D XPoint, Intel has finally shipped something. The P4800X proves that 3D XPoint memory is real and that it really works. The P4800X is just a first-generation product, but it's more than sufficient to establish 3D XPoint memory as a serious contender in the storage market.

If your workload matches its strengths, the P4800X offers performance that no other storage product can currently provide. That means workloads that need high-throughput random access under very strict latency requirements: the latency quality of service Optane achieves on both reads and writes, especially under heavy mixed read/write workloads, is a significant margin ahead of anything else available on the market.


At 50/50 reads/writes, latency QoS for the DC P4800X is 30x better than the competition

The Intel Optane SSD DC P4800X is not the fastest SSD ever on every single test. It's based on a revolutionary technology, but no matter how high expectations were, very rarely does a first-generation product take over the world unless it becomes ubiquitous and cheap on day one. The Optane SSD is ultimately an expensive niche product. If you don't need high throughput random access with the strictest latency requirements, the Optane SSD DC P4800X may not be the best choice. It is very expensive compared to most flash-based SSDs.

With the Optane SSD and 3D XPoint memory now clearly established as useful and usable, the big question is how broad their appeal will be. The original announcements around Optane promised a lot, and this initial product delivers on a few of those metrics, so to some extent the P4800X may have to grow its own market and reteach partners what Optane is capable of today. Working with developers and partners is going to be key here: Intel has to perform outreach and entice software developers to write applications that rely on extremely fast storage. That being said, there are already plenty of market segments that can never get enough storage performance, so anything above what is available in the market today will be more than welcome.

There's still much more we would like to know about the Optane SSD and the 3D XPoint memory it contains. Since our testing was remote, we have not yet had the chance to look under the drive's heatsink, or to measure the power efficiency of the Optane SSD and compare it against other SSDs. We are awaiting an opportunity to get a drive in hand, and expect some of the secrets under the hood to be exposed in due course as drives filter through the ecosystem.

117 Comments

  • ddriver - Friday, April 21, 2017 - link

    *450 ns, by which I mean lower by 450 ns. And the current xpoint controller is nowhere near hitting the bottleneck of PCIE. It would take a controller that is at least 20 times faster than the current one to even get to the point where PCIE is a bottleneck. And even faster to see any tangible benefit from connecting xpoint directly to the memory controller.

    I'd rather have some nice 3D SLC (better than xpoint in literally every aspect) on PCIE for persistent storage, and RAM in the dimm slots. Hyped as superior, xpoint is actually nothing but a big compromise. Peak bandwidth is too low even compared to NVME NAND, latency is way too high and endurance is way too low for working memory. Low queue depth performance is good, but credit there goes to the controller; such a controller will hit even better performance with SLC nand. Smarter block management could also double the endurance advantage SLC already has over xpoint.
  • mdriftmeyer - Saturday, April 22, 2017 - link

    ddriver is spot on. Just to clarify an earlier comment: he's correct, and IntelUser2000 is out of his league.
  • mdriftmeyer - Saturday, April 22, 2017 - link

    Spot on.
  • tuxRoller - Friday, April 21, 2017 - link

    We don't know how much slower the media is than dram right now.
    We know that using dram over nvme has similar (though much better worst case) perf to this.
    See my other post regarding polling and latency.
  • bcronce - Saturday, April 22, 2017 - link

    Re-reading, I see it says "typical" latency is under 10us, placing it in spitting distance of DDR3/4. It's the 99.9999th percentile that is 60us for Q1. At Q16, 99.999th percentile is 140us. That means it takes only 140us to service 16 requests. That's pretty much the same as 10us.

    Read Q1 4KiB bandwidth is only about 500MiB/s, but at Q8, it's about 2GiB/s which puts it on par with DDR4-2400.
  • ddriver - Saturday, April 22, 2017 - link

    "placing it in spitting distance of DDR3/4"

    I hope you do realize that dram latency is like 50 NANOseconds, and 1 MICROsecond is 1000 NANOseconds.

    So 10 us is actually 200 times as much as 50 ns. Thus making hypetane about 200 times slower in access latency. Not 200%, 200X.
  • tuxRoller - Saturday, April 22, 2017 - link

    Yes, the dram media is that fast but when it's exposed through nvme it has the latency characteristics that bcronce described.
  • wumpus - Sunday, April 23, 2017 - link

    That's only on a page hit. For the type of operations that 3dxpoint is looking at (4k or so) you won't find it on an open page and thus take 2-3 times as long till it is ready.

    That still leaves you with ~100x latency. And we are still wondering if losing the PCIe controller will make any significant difference to this number (one problem is that if Intel/Micron magically fixed this, the endurance is only slightly better than SLC and would quickly die if used as main memory).
  • ddriver - Sunday, April 23, 2017 - link

    Endurance for the initial batch postulated from intel's warranty would be around 30k PE cycles, and 50k for the upcoming generation. That's not "only slightly better than SLC" as SLC has 100k PE cycles endurance. But the 100k figure is somewhat old, and endurance goes down with process node. So at a comparable process, SLC might be going down, approaching 50k.

    It remains to be seen, the lousy industry is penny pinching and producing artificial NAND shortages to milk people as much as possible, and pretty much all the wafers are going into TLC, some MLC and why oh why, QLC trash.

    I guess they are saving the best for last. 3D SLC will address the lower density, samsung currently has 2 TB MLC M2, so 1 TB is perfectly doable via 3D SLC. I am guessing samsung's z-nand will be exactly that - SLC making a long overdue comeback.
  • tuxRoller - Sunday, April 23, 2017 - link

    The endurance issue is, imho, the biggest concern right now.
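
As a rough sanity check on the latency and bandwidth figures traded in the comments above, the arithmetic can be redone in a few lines of Python. This is only a sketch: the inputs are the commenters' own approximate numbers (about 50 ns for DRAM, about 10 us typical for the Optane SSD, about 500 MiB/s at Q1 and about 2 GiB/s at Q8 for 4 KiB reads), not measurements, and the function names are made up for illustration. Converting bandwidth to mean latency uses Little's law and assumes a steady stream of 4 KiB requests.

```python
# Approximate figures quoted by commenters; none of these are measured here.
DRAM_LATENCY_NS = 50       # ddriver's rough DRAM access latency
DRAM_PAGE_MISS_NS = 100    # wumpus: a page miss takes roughly 2-3x longer (2x used)
OPTANE_TYPICAL_US = 10     # bcronce's "typical" latency upper bound
BLOCK_KIB = 4              # transfer size discussed in the thread


def latency_ratio(slow_us, fast_ns):
    """How many times slower slow_us (microseconds) is than fast_ns (nanoseconds)."""
    return slow_us * 1000 / fast_ns


def mean_latency_us(queue_depth, bandwidth_mib_s, block_kib=BLOCK_KIB):
    """Little's law: mean latency = requests in flight / completion rate."""
    iops = bandwidth_mib_s * 1024 / block_kib
    return queue_depth * 1e6 / iops


page_hit_ratio = latency_ratio(OPTANE_TYPICAL_US, DRAM_LATENCY_NS)     # 200x
page_miss_ratio = latency_ratio(OPTANE_TYPICAL_US, DRAM_PAGE_MISS_NS)  # 100x
q1_latency = mean_latency_us(1, 500)    # about 7.8 us
q8_latency = mean_latency_us(8, 2048)   # about 15.3 us

print(f"Optane vs DRAM page hit:  {page_hit_ratio:.0f}x")
print(f"Optane vs DRAM page miss: {page_miss_ratio:.0f}x")
print(f"Implied mean latency at Q1: {q1_latency:.1f} us")
print(f"Implied mean latency at Q8: {q8_latency:.1f} us")
```

Under these assumptions the numbers in the thread are at least self-consistent: 500 MiB/s of 4 KiB reads at Q1 implies a mean latency just under 10 us, and that is 200 times a 50 ns DRAM access, or roughly 100 times a DRAM page miss.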
