Who is the Optane SSD 900P for?

With a price per GB a little over twice that of the fastest flash-based consumer SSDs, the Optane SSD 900P is an exclusive high-end product. For most desktop usage, drives like the 960 PRO are already fast enough that storage is no longer a severe bottleneck. The most noticeable delays due to storage performance on a 960 PRO come when moving large files around, and the Optane SSD doesn't offer any significant improvement to sequential transfer speeds. Random writes can be a challenge for flash-based SSDs, but volatile write caches and SLC caches allow them to handle short bursts with very high performance.

The unprecedented random read performance of the Optane SSD 900P is its biggest strength on paper, but not one that will often lead to a proportional speedup in overall application performance. Too many programs and filesystems are still designed with mechanical hard drive performance in mind as the baseline, and further increases to SSD performance serve mainly to shift the bottlenecks further onto the CPU, RAM, network, and even the user's own reaction time.

The scenarios where a drive like the Optane SSD 900P can offer meaningful and worthwhile performance improvements can be broadly categorized as situations where the Optane SSD can help with one of two problems:

1. Storage is too slow

About the only time a desktop could challenge the sequential access performance of a high-end PCIe SSD (based on flash or 3D XPoint) is when dealing with high resolution uncompressed video. The Optane SSD doesn't help much here because of its limited capacity, and the PCIe 3 x4 link itself is a bottleneck at the highest refresh rates and bit depths. For video work, flash-based SSDs are definitely a better choice, and RAID arrays of cheaper SATA SSDs may be a better option than PCIe SSDs. Desktop workloads that require extremely high sustained random write performance are very rare, and SLC caching on a flash-based SSD nicely takes care of most realistic quantities of random writes.
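As a rough sanity check on the link bottleneck described above, the data rate of an uncompressed video stream can be compared against the theoretical ceiling of a PCIe 3.0 x4 link. This is only a back-of-envelope sketch: it assumes unpacked 4:4:4 video with three channels per pixel, ignores PCIe protocol overhead, and the function names and example formats are illustrative rather than taken from the article. Real drives sustain less than the link ceiling.

```python
# Back-of-envelope check: uncompressed video data rate vs. a PCIe 3.0 x4 link.
# Assumptions (not from the article): unpacked 4:4:4 video with three channels
# per pixel, and the link's theoretical rate with no protocol overhead.

# PCIe 3.0 signals at 8 GT/s per lane with 128b/130b line encoding.
PCIE3_BYTES_PER_S_PER_LANE = 8e9 * 128 / 130 / 8


def video_bytes_per_s(width, height, bits_per_channel, fps, channels=3):
    """Data rate of an uncompressed video stream in bytes per second."""
    bits_per_frame = width * height * channels * bits_per_channel
    return bits_per_frame * fps / 8


def pcie3_bytes_per_s(lanes=4):
    """Theoretical PCIe 3.0 ceiling (after line encoding) for a lane count."""
    return PCIE3_BYTES_PER_S_PER_LANE * lanes


if __name__ == "__main__":
    link = pcie3_bytes_per_s(4)
    # Example formats are illustrative choices, not cases cited in the article.
    for name, w, h, bits, fps in [
        ("4K 60 fps 10-bit", 3840, 2160, 10, 60),
        ("4K 120 fps 12-bit", 3840, 2160, 12, 120),
        ("8K 60 fps 10-bit", 7680, 4320, 10, 60),
    ]:
        rate = video_bytes_per_s(w, h, bits, fps)
        verdict = "exceeds" if rate > link else "fits within"
        print(f"{name}: {rate / 1e9:.2f} GB/s {verdict} "
              f"the {link / 1e9:.2f} GB/s link")
```

Under these assumptions, a 4K 60 fps 10-bit stream needs about 1.87 GB/s and fits comfortably, while 4K 120 fps at 12-bit (about 4.48 GB/s) and 8K 60 fps at 10-bit (about 7.46 GB/s) both exceed the roughly 3.94 GB/s the x4 link can carry at best.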

That said, there are some situations where higher random read performance can be quite noticeable. Searching through a large volume of data is a common case, such as searching through a video, but it usually presents enough opportunities for parallelization that the drive's queue depth will climb up to the range where flash-based SSDs come close to the Optane SSD. Game level load times can in theory benefit greatly from faster read speeds, but in practice decompressing the assets after loading them into RAM quickly becomes the bottleneck. Most of the other situations where the performance advantage of the Optane SSD will really help are better described as a different kind of problem:

2. RAM is too small

In the workstation market, there are abundant examples of compute tasks with a memory working set that doesn't fit in RAM. Almost any simulation or rendering task will have a parameter for mesh density or particle count that can very quickly scale the memory requirements from a few GB to tens or hundreds of GB. An Optane SSD is far slower than four to eight channels of DDR4, but 16GB DIMMs are at least 6-7 times more expensive per GB than the Optane SSD 900P, and putting more than 128GB of DRAM in an ATX motherboard is even more expensive.

Intel PR provided an example of using SideFX Houdini to render a high-resolution animation that included a 1.1 billion particle water simulation. Their test used a machine with a 10-core CPU and 64GB of RAM, and compared the 512GB Samsung 960 PRO against the 480GB Optane SSD 900P. The total memory requirements (DRAM+swap) of the rendering job were not disclosed, but the resulting 2.7x speedup is very plausible for a task that absolutely hammers the swap device. With a sufficiently high thread count to keep the queue depth high, that margin could be narrower (especially with the fastest 2TB 960 PRO), but then context switch overhead would become problematic. With the Optane SSD 900P, the random read latency is low enough that it would be hard to host more than two swap-limited threads per core without context switch overhead wasting more time than waiting on the SSD.
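The trade-off between hiding swap latency with extra threads and paying for context switches can be illustrated with a toy model. This is a sketch and not Intel's or the article's methodology: it assumes every thread repeats the same cycle of a fixed compute burst, one context switch, and one blocking page-in, and all timing values below are hypothetical placeholders chosen only to show the shape of the effect.

```python
import math

# Toy model of swap-limited threads sharing one CPU core.
# Assumption: each thread repeats compute -> context switch -> blocking read.
# All times are in microseconds and are hypothetical, not measured values.


def core_utilization(n_threads, compute_us, switch_us, read_latency_us):
    """Fraction of core time spent on useful compute with n_threads per core."""
    cycle_us = compute_us + switch_us + read_latency_us
    # While the core still idles waiting on the drive, each thread adds work.
    latency_bound = n_threads * compute_us / cycle_us
    # Once the wait is fully hidden, switch overhead caps the useful fraction.
    switch_bound = compute_us / (compute_us + switch_us)
    return min(latency_bound, switch_bound)


def threads_to_hide_latency(compute_us, switch_us, read_latency_us):
    """Smallest thread count per core at which extra threads stop helping."""
    cycle_us = compute_us + switch_us + read_latency_us
    return math.ceil(cycle_us / (compute_us + switch_us))


if __name__ == "__main__":
    compute_us, switch_us = 5, 5  # hypothetical burst and switch costs
    for label, latency_us in [("low-latency drive", 10),
                              ("higher-latency drive", 80)]:
        n = threads_to_hide_latency(compute_us, switch_us, latency_us)
        print(f"{label}: {n} threads per core hide a {latency_us} us read")
```

With these placeholder numbers, a drive with a 10 µs read latency is fully hidden by two threads per core, and a third thread adds only switching overhead, whereas an 80 µs read needs nine threads per core before the core stops idling. The lower the read latency, the sooner additional threads stop paying for themselves.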

Star Citizen Bundle

Even though gaming isn't the ideal workload for the Optane SSD 900P to show off its performance, Intel is marketing the 900P to gaming enthusiasts. They're bundling a code for the game Star Citizen with the 900P, and including a new in-game spaceship variant as an exclusive item for Optane SSD customers. Intel has partnered with Star Citizen developer Roberts Space Industries (RSI) to hold a launch event for the 900P at CitizenCon 2017 today, which they are streaming live on Twitch and YouTube. Attendees will have the chance to playtest the Intel-exclusive Sabre Raven ship, but it is still undergoing final QA and will not be immediately available to Optane SSD 900P customers. The web page for redeeming the Star Citizen game code had not gone live as of the time of writing, so I was unable to attempt any testing with the game. (ed: I remember when AMD was offering a Star Citizen bundle in 2014 as well. The game still hasn't shipped.)

At the media briefing for the 900P, an RSI representative said they are exploring ways to optimize the Star Citizen experience on Optane SSDs, but not many specifics were provided. One approach under consideration is using less compression for some game assets, freeing up CPU time but relying on high storage performance. It didn't sound like this work was close to release. In the game's current state, RSI claims they've seen load times improve by 20-25%, but they didn't specify what other storage device they were comparing against.

200 Comments

  • ddriver - Friday, October 27, 2017

    I got myself a bucket of salt. The necessary requirement to swallow that Houdini "2.7x better" claim from the launch PR.

    I've been rendering stuff since the days of 3d max for frigging DOS. And I am yet to experience a scenario where CPU load is not in the 99% range.

    Having a rendering job that cannot feed the CPU to above 10% load with the insanely fast 960 pro has got to be an unprecedented case of cooked-up benchmark in human history.
  • extide - Friday, October 27, 2017

    Did you read the article? It pretty clearly explains how they got that result, and it makes sense.
  • ddriver - Friday, October 27, 2017

    Oh yeah, I get it. Hypetane is a synthetic beast. Which allows to showcase said advantage as long as you focus on it in a carefully devised and completely detached from real-world usage workload.

    Don't get me wrong. It is good that hypetane is now available in capacities that actually allow to use it. And if endurance turns out to be tangibly better than nand, I might actually buy it. Low queue depth performance is good, especially random read, which may not be of that much practical use to most of the people out there, but I could make good use of that.

    But it will remain "hypetane" even after I go and buy it. Because intel said "1000 times better", and it is not even 10 times better. A zero on its own might be nothing, but two zeroes after a positive number make quite a lot of difference.
  • ddriver - Friday, October 27, 2017

    "no other alternative nonvolatile memory technology is close to being ready to challenge 3D XPoint"

    Except for SLC, which was so good it was immediately abandoned once inferior and more profit friendly NAND implementations were available.

    A SLC based product coupled with MRAM cache will easily humiliate hypetane in its few strong aspects.

    Too bad NAND drives are now moving to TLC and QLC, even MLC is heading in the "luxury item" category. Too bad because 3D SLC has tremendous potential. Let's see if it gets realized.
  • extide - Friday, October 27, 2017

    How would that work. SLC is slower than Optane, can't be written at a block level, needs trash collection, etc. Then you cache it with a technology similar to Optane? Why not just build a drive with all MRAM, oh yeah, too expensive. Looks like Optane wins.
  • ddriver - Friday, October 27, 2017

    Nope, SLC is actually faster. Look it up.

    And what it cannot do is write at the bit level. Which is not really a big deal. Even CPUs cannot address RAM at below a byte, if you want single bit operations, you have to use bitwise operators. Writing at a higher level is actually very efficient, because it reduces overhead. If single bit addressing was important, that's how computers would work.

    Furthermore, single bit writes produce a significant challenge when tracking wear levels. Hypetane still wears out, you know... It will be tremendously harder to accurately track wear at bit level, and I am about 99.999999% sure it is not how intel does it, meaning that a lot of that supposed extra endurance will be forfeited by managing wear at a coarsely grained level. They won't be managing that at bit level, the overhead will be tremendous and will completely diminish potential advantages.

    The MRAM cache will reduce a lot of write amplification and garbage collection.

    It also looks like 3d SLC has about 3 times the density of the chips intel is currently using for hypetane.

    "Why not just build a drive with all MRAM" - density is too low. Which is also why we use RAM for working memory, I mean volatility can easily be solved by say adding a RTG battery to a DRAM drive, giving it effectively about a century of continuous, uninterrupted power. It is doable, but then again, redundant, and while it is true that the industry does a lot of pointless things nowadays, the only ones that qualify are those with a desirable usability to profitability ratio, and a RTG DRAM drive is simply too good to offer...

    "Looks like Optane wins" - anyone can win when running unopposed. The moment someone makes a SLC/MRAM hybrid and it loses to hypetane, I will retract my statement and admit I was wrong. I have zero problem with that ;)
  • vanilla_gorilla - Friday, October 27, 2017

    So you're saying Optane sucks because it would be slower than a drive that doesn't exist?
  • ddriver - Friday, October 27, 2017

    No, I am saying it "sucks" because for all intents and purposes, it is not any faster than a 2 year old drive that it was supposed to beat by a 1000 times.

    And the reason I put it "sucks" is because I never said it does suck. I give it a very realistic valuation. What sucks is how far that realistic valuation is from what intel promised. Which is entirely on them.
  • name99 - Friday, October 27, 2017

    He's saying two distinct things.
    (a) This costs too much for what it delivers. IF Samsung wanted to compete with it, they could do so with a suite of existing technologies. But they probably won't do so because there is little demand for a product like this; honestly it only exists so that Intel can say "see, 3D-XPoint is too, real".

    (b) The place where 3D-XPoint ACTUALLY makes sense is, more or less, what AnandTech says --- as a slower (but much larger) RAM replacement. That's what plays to the technology's strengths (simple controller, byte-level access). But Intel STILL are not shipping that --- which makes one wonder WTF not?

    It IS reasonable to point out that Intel has been lying about this product since the day it was announced, and that the only reason they're shipping these SSD drives is to throw up more smoke to hide the fact that the actually sensible use case remains (for some reason) impossible.

    Being a fanboy isn't about always praising your company, it's about refusing to criticize your company even when they're clearly in the wrong. Intel is clearly in the wrong here, in the sense that nothing that they promised about Optane is actually reality even today, two years after the announcement.
    If you think that's reasonable behavior, ask yourself how you would react if your favorite villainous company did the same.
    Would you be impressed if AMD announced that they're going to ship a GPU 1000x faster than the competition, and two years later all they have is something 2.7x as fast (under very specialized circumstances)?
    Would you let Apple off the hook if they said that the Apple car was going to have 1000x the range of a Tesla, then they shipped two years later, a car with 2.7x the range of a Tesla?
  • Drumsticks - Saturday, October 28, 2017

    Re: AMD example: if AMD claimed a product would be 100x or 1000x faster than Nvidia, but only delivered something 6-10x faster in the majority of cases, and on par in the rest, for only 2-3x more money, I'd still be pretty satisfied.