Final Words: Is 3D XPoint Ready?

The Intel Optane SSD DC P4800X is a very high-performing enterprise SSD, but more importantly it is the first shipping product using Intel's 3D XPoint memory technology. After a year and a half of talking up 3D XPoint, Intel has finally shipped something. The P4800X proves that 3D XPoint memory is real and that it really works. The P4800X is just a first-generation product, but it's more than sufficient to establish 3D XPoint memory as a serious contender in the storage market.

If your workload matches its strengths, the P4800X offers performance that no other storage product can currently provide. That means high-throughput random access under very strict latency requirements: the latency quality of service the Optane SSD achieves on both reads and writes, especially in heavy mixed read/write workloads, is a significant margin ahead of anything else available on the market.


At 50/50 reads/writes, latency QoS for the DC P4800X is 30x better than the competition

The Intel Optane SSD DC P4800X is not the fastest SSD ever on every single test. It's based on a revolutionary technology, but no matter how high expectations were, very rarely does a first-generation product take over the world unless it becomes ubiquitous and cheap on day one. The Optane SSD is ultimately an expensive niche product. If you don't need high throughput random access with the strictest latency requirements, the Optane SSD DC P4800X may not be the best choice. It is very expensive compared to most flash-based SSDs.

With the Optane SSD and 3D XPoint memory now clearly established as useful and usable, the big question is how broad its appeal will be. The original announcements around Optane promised a lot, and this initial product delivers on only some of those metrics, so to some extent the P4800X may have to grow its own market and re-teach partners what Optane is capable of today. Working with developers and partners is going to be key here: Intel has to perform outreach and entice software developers to write applications that rely on extremely fast storage. That said, there are plenty of market segments that can never get enough storage performance, so anything above what is available in the market today will be more than welcome.

There's still much more we would like to know about the Optane SSD and the 3D XPoint memory it contains. Since our testing was remote, we have not yet had the chance to look under the drive's heatsink, or to measure the power efficiency of the Optane SSD and compare it against other SSDs. We are awaiting an opportunity to get a drive in hand, and expect some of the secrets under the hood to be exposed in due course as drives filter through the ecosystem.

117 Comments

  • extide - Thursday, April 20, 2017 - link

    Queue depth is how many commands the computer has queued up for the drive. The computer can issue commands to the drive faster than the drive can service them -- SATA, for example, supports a queue of up to 32 commands. Typical desktop use just doesn't generate enough traffic to queue up much work, so you are usually at a queue depth of 1-2, maybe 4. Some server workloads can run higher, but even on a DB server, if you are seeing queue depths of 16 I would say your storage is not fast enough for what you are trying to do. So being able to get good performance at low queue depths is truly a breakthrough.
  • bcronce - Thursday, April 20, 2017 - link

    For file servers, it's not just the queue depth that's important, it's the number of queues. FreeBSD and OpenZFS have had a lot of blogs and videos about the issues of scaling up servers, especially in regards to multi-core.

    SATA only supports 1 queue. NVMe supports up to ~65,000 with a depth of ~65,000 each. They're actually having issues saturating high end SSDs because their IO stack can't handle the throughput.

    If you have a lot of SATA drives, then you effectively have many queues, but if you want a single/few super fast device(s), like say L2ARC, you need to take advantage of the new protocol.
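The single-queue vs. multi-queue contrast above can be sketched in a few lines. This is a conceptual illustration with made-up names and sizes, not real driver code: a SATA/AHCI-style device exposes one shared queue that every core contends for, while an NVMe-style device gives each core its own submission queue.

```python
from queue import Queue

NUM_CORES = 4

# SATA/AHCI-style: a single shared queue (depth 32) that all cores contend for.
shared_queue = Queue(maxsize=32)

# NVMe-style: one submission queue per core, so cores never contend for a lock.
# (The real spec allows up to ~64K queues of ~64K entries; 1024 is illustrative.)
per_core_queues = [Queue(maxsize=1024) for _ in range(NUM_CORES)]

def submit(core_id, command, nvme_style=True):
    """Route a command to this core's own queue (NVMe) or the one shared queue (SATA)."""
    q = per_core_queues[core_id] if nvme_style else shared_queue
    q.put(command)

# Each core submits independently with no cross-core synchronization point.
for core in range(NUM_CORES):
    submit(core, f"read-lba-{core}")
print([q.qsize() for q in per_core_queues])  # one pending command per core's queue
```

This is why an IO stack written around one queue (as the comment notes of older code paths) becomes the bottleneck long before a fast NVMe device does.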
  • tuxRoller - Friday, April 21, 2017 - link

    The answer is something like the Linux kernel's block multiqueue layer (ongoing, still not the default for all devices, but it shouldn't be more than a few more cycles). It's been a massive undertaking and involved rewriting many drivers.

    https://lwn.net/Articles/552904/
  • Shadowmaster625 - Thursday, April 20, 2017 - link

    It is a pity Intel doesn't make video cards, because 16GB of this would go very well with 4GB of RAM and a decent memory controller. It would lower the overall cost and not impact performance at all.
  • ddriver - Friday, April 21, 2017 - link

    "It would lower the overall cost and not impact performance at all."

    Yeah, I bet. /s
  • Mugur - Friday, April 21, 2017 - link

    I think I read something like this when i740 was launched... :-)

    Sorry, couldn't resist. But the analogy stands.
  • ridic987 - Friday, April 21, 2017 - link

    "It would lower the overall cost and not impact performance at all."

    What? This stuff is around 50x slower than DRAM, which is itself reaching its limits in GPUs, hence features like delta color compression. Right now, when your GPU runs out of RAM it uses your system RAM as extra space, which is a far better system.
  • anynigma - Thursday, April 20, 2017 - link

    "Intel's new 3D XPoint non-volatile memory technology, which has been on the cards publically for the last couple of years"

    I think you mean "IN the cards". In this context, "ON the cards" makes it sound like we've all been missing out on 3D xPoint PCI cards for a "couple of years" :)
  • SaolDan - Thursday, April 20, 2017 - link

    I think he means it's been in the works publicly for a couple of years.
  • DrunkenDonkey - Thursday, April 20, 2017 - link

    A bit of a suggestion - can you divide (or provide in the final thoughts) SSD reviews per consumer base? A desktop user absolutely does not care about sequential performance or QD16, or even writes for that matter (except the odd time installing something). A database server couldn't care less about sequential or low-QD performance, etc. Giving the tables is good for the odd few % of readers who actually know what to look for; the rest just look at the end of the graph and come away with a stunningly wrong idea. Just a few comparisons tailored per use case would make it so easy for the masses. It was Anand who fought for that during the early SandForce days - he forced OCZ to reconsider their ways and tweak SSDs for real-world performance, not graph-based performance, and got me as a follower. Let that not die in vain, and let those who lack the specific knowledge be informed. Just look at the comments and see how people interpret the results.
    I know this is enterprise grade SSD, but it is also a showcase for a new technology that will come in our hands soonish.
