Analyzing Intel-Micron 3D XPoint: The Next Generation Non-Volatile Memory

by Kristian Vättö, Ian Cutress & Ryan Smith on July 31, 2015 11:00 AM EST

Products

During the event, Intel and Micron made it clear that this week's announcement is solely about the underlying 3D XPoint technology. Products based on this new technology will follow sometime next year and the companies were quite tight-lipped when it came to details, but they did give away a few hints. First of all, the co-operation between Intel and Micron only exists at the memory technology level and both companies are developing their own 3D XPoint based products, similar to how the two have operated in the SSD/NAND business. Technically this means that the two will be competing against each other, although it's possible that each company will take a unique approach to utilizing 3D XPoint in an end product.

One takeaway from the presentation and Q&A was Intel's emphasis on NVMe. Intel has been a strong advocate of the technology ever since its inception, and as a matter of fact Intel was the first SSD vendor to ship NVMe SSDs in high volume with the introduction of the DC P3700 and its derivatives last year. While NVMe has mostly been associated with NAND so far, since NAND is the mainstream non-volatile memory, the core architecture was built to scale with future memory technologies with even lower latencies (after all, NVMe stands for Non-Volatile Memory Express). Given that software interfaces tend to stick around for at least a decade, it's obvious that NVMe had to be designed with more than just NAND in mind.

With NVMe it's certain that we will see 3D XPoint based PCIe SSDs. Whether these will be add-in cards or 2.5" drives remains to be seen, but I'm more inclined to say add-in cards (at least initially) because of the connector limitations. U.2 (formerly SFF-8639) supports only four PCIe 3.0 lanes, resulting in effective real world bandwidth of about 3.2GB/s. NAND is already capable of saturating that for read operations, so even though 3D XPoint would improve write and random IO performance, the full potential would ultimately go unused without a higher bandwidth interface. An add-in card doesn't share the limitations of U.2 and could support up to 16 lanes with over 10GB/s of bandwidth available, but the downside would be more limited serviceability, since add-in cards can't be front-loaded like 2.5" drives can. As enterprises have used add-in cards in the past (Fusion-io never made anything but add-in cards), I don't see serviceability being a major hurdle for the companies that really need 3D XPoint for their workloads. On the other hand, I wouldn't be surprised to see Intel pushing for an 8-lane U.2-like standard, but it really needs industry-wide support to get air under its wings.
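To put rough numbers on the lane counts discussed above, here is a small sketch of the theoretical PCIe 3.0 link bandwidth. The 8 GT/s signalling rate and 128b/130b encoding are properties of PCIe 3.0; the function name and the idea of carrying the x4 efficiency over to wider links are illustrative assumptions, not figures from Intel or Micron.

```python
# Rough PCIe 3.0 bandwidth estimate per direction (a sketch, not a measurement).
GT_PER_SEC = 8e9        # PCIe 3.0 signalling rate per lane, in transfers/s
ENCODING = 128 / 130    # 128b/130b line encoding efficiency


def pcie3_bandwidth_gbs(lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    bits_per_sec = GT_PER_SEC * ENCODING * lanes
    return bits_per_sec / 8 / 1e9


# The article quotes about 3.2GB/s real-world throughput for four lanes.
# ASSUMPTION: the same ratio of real-world to theoretical throughput
# applies to wider links as well.
REAL_WORLD_X4_GBS = 3.2
efficiency = REAL_WORLD_X4_GBS / pcie3_bandwidth_gbs(4)

for lanes in (4, 8, 16):
    theoretical = pcie3_bandwidth_gbs(lanes)
    estimated = theoretical * efficiency
    print(f"x{lanes}: {theoretical:.2f} GB/s theoretical, "
          f"~{estimated:.1f} GB/s estimated real world")
```

Under that assumption a 16-lane add-in card lands comfortably above the 10GB/s mark, while a hypothetical 8-lane connector would sit roughly halfway between the two.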

With Intel being the other party in the joint venture, it's guaranteed that 3D XPoint will get all the support and love it needs on the platform side. Intel can integrate more PCIe lanes and/or accelerate the development of PCIe 4.0 for its upcoming platforms to create the necessary bandwidth and push for 3D XPoint if needed, which is something that no other memory vendor could do.

AgigA's DDR4 NVDIMM: A Future 3D XPoint Form Factor?

While Intel will clearly be pursuing the storage aspect of 3D XPoint through NVMe, I suspect Micron might take a more memory-like approach since it's a memory company as much as it's a storage company. It was made clear that 3D XPoint can be used in memory and storage applications because the technology is bit-addressable and can work in a similar fashion to DRAM. Bringing 3D XPoint closer to the CPU and connecting it through a DDR4 interface would obviously yield the best performance and eliminate any bottlenecks that PCIe has. There are already NAND-based products that do this, such as SanDisk's ULLtraDIMM, and a couple of months ago JEDEC paved the way by releasing a standard for DDR4 NVDIMMs, a new standard set to fill the gap between DRAM and SSDs. While NVDIMMs will require driver work due to the lack of a standardized software interface like NVMe, I do believe 3D XPoint is the right technology for bringing NVDIMMs to the market and it would make sense for Micron to do so.

Applications

Section by Ryan Smith

The use cases for 3D XPoint are potentially significant in number and Intel/Micron believe that it will open the doors for all sorts of new applications. Overall the computing industry has had access to high speed non-volatile memory technologies before – magnetic core memory is the traditional poster child here – so there is some precedent here and some fundamental research into the field from the early days of computing. However with magnetic core memory having become outmoded before the majority of our readers were even born, the modern computing industry has developed around the current paradigm of fast DRAM and slow permanent storage. As a result while the potential applications are numerous, it’s still in many ways an uncharted area in computer science.

The most immediate application of 3D XPoint based products will be as a layer of storage between DRAM and SSDs. Over the history of computing the number of layers between storage and processors has continued to build – multiple layers of on-die caches, off-die caches, caching SSDs, etc – and 3D XPoint memory would further fit into that hierarchy as a storage medium that bridges the gap between DRAM and the current fastest non-volatile storage. By treating 3D XPoint memory as another layer of cache, 3D XPoint can be used to further speed up applications that are currently bound by either memory capacity or storage latency.

The Traditional Memory Hierarchy (Image Source: Tommy MacWilliam, Harvard)

Given the costs of 3D XPoint, the first such applications are expected to be on the enterprise side. Enterprise users make heavy use of storage at all layers in order to balance performance needs against the relatively small capacity of DRAM. Database servers in particular adapt well to caching, and it’s easy enough to imagine a next-generation database system using 3D XPoint to backstop DRAM. Since 3D XPoint is non-volatile, it can even be an exclusive cache – that is, its contents don’t need to be in lower layers as well – which eliminates a good deal of overhead. A database system in this context would only need to write contents to SSDs and other, lower layers of storage when data gets expelled from the 3D XPoint cache, an occurrence that may be particularly rare with a properly tuned database.
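The exclusive cache idea can be illustrated with a minimal sketch. Everything here is hypothetical: the class name, the LRU eviction policy, and the plain dictionaries standing in for the 3D XPoint tier and the SSD are assumptions chosen for illustration, not a description of how any real database would use the technology.

```python
from collections import OrderedDict


class ExclusiveCacheTier:
    """LRU tier that holds the only copy of a record until it is evicted."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.backing = backing_store   # dict standing in for SSD storage
        self.cache = OrderedDict()     # stand-in for the non-volatile cache tier
        self.backing_writes = 0        # counts writes to the slower tier

    def put(self, key, value):
        # New data lands only in the cache tier. Any stale copy in the
        # backing store is dropped so each record lives in exactly one tier.
        self.backing.pop(key, None)
        self.cache[key] = value
        self.cache.move_to_end(key)
        self._evict_if_needed()

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key]
        # Miss: move (not copy) the record up from the backing store.
        # Raises KeyError if the key exists in neither tier.
        value = self.backing.pop(key)
        self.cache[key] = value
        self._evict_if_needed()
        return value

    def _evict_if_needed(self):
        # The backing store is written only when a record is expelled.
        while len(self.cache) > self.capacity:
            old_key, old_value = self.cache.popitem(last=False)
            self.backing[old_key] = old_value
            self.backing_writes += 1


ssd = {}
tier = ExclusiveCacheTier(capacity=2, backing_store=ssd)
tier.put("a", 1)
tier.put("b", 2)
tier.put("a", 3)   # update in place, no write to the backing store
tier.put("c", 4)   # evicts "b", the least recently used record
print(ssd, tier.backing_writes)
```

Because the cache tier is assumed to be non-volatile, updates to hot records never touch the slower tier at all; only the single eviction results in a write to the stand-in SSD.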

Many of these benefits of a cache layer are applicable to other types of storage-heavy servers as well, though I expect databases will be the king. Perhaps the more interesting aspect – and certainly more relatable to the public at large – will be what 3D XPoint-backed servers are used for. Intel and Micron are eager to point out the “big science” uses for the technology; projects and systems such as the Large Hadron Collider and Oak Ridge’s Titan supercomputer can generate a massive amount of data, and while processing all of that data is first and foremost a processor issue, feeding that data for processing is a big problem as well. Any kind of analysis in which individual processors need RAM-like access to an SSD-sized pool of data could benefit.

The catch is that there’s still a lot of research that’s needed into figuring out what the best uses may be. This kind of shift in access times and capacity doesn’t just make computers faster, but it can change the fundamentals of what algorithms are best. Just as GPUs required scientists to figure out how to spread out their work in a massively parallel (and high latency) manner, putting 3D XPoint to its full use will require newer algorithms that are capable of effectively utilizing direct access to so much data at once.

Meanwhile I would be surprised if the financial industry didn’t jump on this early, as they are prone to jumping on major technologies in order to try to get an edge in a highly competitive and lucrative field. In this aspect it’s not so much that 3D XPoint would improve processing speed – such work is already offloaded to large RAM pools when possible – but rather it would enable traders and analysts to run simulations against much larger datasets much more effectively.

As for the consumer space, the same principles about an additional cache layer would apply, but I’m not so sure we’d see consumers pick it up in the same manner. Much of this has to do with what the eventual costs and capacities of 3D XPoint products would be, as consumers are much more price sensitive than professional users. In the consumer space we’ve seen sporadic use of NAND-backed hard drives, for example, but by and large consumers have stuck with discrete SSDs and HDDs. Consumers either don’t want to pay the premium for SSDs, or have enough money to just buy large SSDs outright, leaving little of a middle ground.

That said I’ve seen some interesting pitches for 3D XPoint in the gaming space that have some merit, as games are something of a special case for consumer workloads. By and large we want fast access to game resources since those resources are accessed on-demand and are needed to progress in a game’s execution, but the assets themselves aren’t volatile. Only a small part of the working set for a game is volatile data – player positions, AI decision trees, game state, etc – while the rest of it is static data such as models, world geometry, and textures. 3D XPoint in turn would be fast enough that it could be used as a replacement for RAM in holding these assets, but as the data is non-volatile it wouldn’t thrash 3D XPoint P/E cycles very much, and any write speed disadvantage compared to DRAM would be immaterial.

But again, this is going to depend on the cost of the technology; if it were to become cheap enough that 50-100GB could be thrown in a game console or gaming PC, then you could store the entirety of most games in 3D XPoint memory, which would reduce load times to the time required to process the data and set up the game state. This is more important in consoles, which currently store their games on a mechanical drive and could then recall data rather quickly on first boot or adjust for large amounts of memory swapping for more detailed titles. High end PCs with large amounts of DRAM can already use RAMDisks, perhaps nullifying the point there.

Last but not least of course are the implications for 3D XPoint as a wholesale replacement for DRAM. The more limited lifetime of 3D XPoint relative to DRAM certainly poses some challenges in this respect, but I suspect the bigger issue will be overall bandwidth. By the time 3D XPoint becomes available in bulk, DRAM technology should be to the point where faster generations of DDR4 are available and HBM is widely deployed. Given that future generations of HBM are targeting 1TB/sec or more of memory bandwidth, it’s unlikely that 3D XPoint is going to be able to match the bandwidth of contemporary high-bandwidth DRAM solutions. So any rumors of the impending death of DRAM are likely premature.

IoT & Embedded, A Good Fit For 3D XPoint?

But with that said, while 3D XPoint isn’t likely to replace DRAM in a wholesale manner for all applications, there is clearly room for it to replace DRAM in some situations where DRAM is used primarily for its bandwidth and latency versus solid state storage. Replacing DRAM with 3D XPoint in embedded applications for example would be very practical – many embedded uses don’t need high bandwidth or low latency as much as they just need something better than traditional NAND – and I wouldn’t rule out smartphones here either, at least to an extent. If individual 3D XPoint chips can be produced small and cheap enough, then the most lucrative use for the tech as a DRAM replacement may be in the vast legions of low-performance devices, rather than in high-performance hardware that actually needs the full speed and latency of DRAM.

80 Comments

FunBunny2 - Friday, July 31, 2015
If you want to know what's being sold, go back and look up Unity Semiconductor's CMOx tech. Rambus bought them, then Rambus and Micron settled, including a patent sharing arrangement. The last Unity CEO said, just before Rambus bought them, that 2015 was production year. Could be.
nwarawa - Friday, July 31, 2015
I can't wait for this to be a normal conversation:
A:"How much storage do you have?"
B:"256GB"
A:"RAM or on your drive?"
B:"Yes."
ajp_anton - Friday, July 31, 2015
10^15 P/E cycles for DRAM? How does this work, as typical DRAM does on the order of 10^16 cycles in a year? I'm assuming a P/E cycle is the same as a clock cycle because of the constant refreshing, is this wrong?
Crazy1 - Saturday, August 1, 2015
I had to look this up, but the DDR3 standard calls for at least 8 refresh commands every 7.8 usec. Rounding down to the nearest 50ns, that comes to one refresh every 950 ns. When calculated out, that equals roughly 3.32x10^13 cycles/year. That means DDR3 should survive up to 30 years with a 10^15 P/E cycles rating, while never turning off your computer or putting it in hibernate.

In a refresh cycle, the information in a cell is read, then rewritten. There is no erase. I'm not sure the speed a typical P/E cycle occurs when erasing and writing new data is required. If it is significantly quicker than 950ns, there may be a decrease in lifespan from 30 years. However, unless you run intensive programs that delete and write new information to all memory cells every 32ns, you are not going to exceed the 10^15 P/E cycles in a year.
TallestJon96 - Friday, July 31, 2015
Excellent work. Anandtech always has the best information and reviews, even if they are the last.

This is pretty exciting stuff. If storage can become fast enough, then perhaps we will not need memory. Theoretically this would be a massive improvement to efficiency and performance. I would argue that the perfect computer would only have a processor and extremely fast storage. This is not enough to fill the gap, but storage is certainly catching up.

As a gamer, the idea of having my game loaded onto storage that is fast enough to not need to load into the memory is pretty appealing. Zero load time, no texture streaming issues, and potentially larger scale.

I have to wonder about bandwidth with this tech. Latency is clearly between ram and SSDs, but is closer to ram. But I haven't seen any solid bandwidth stats.
Freakie - Friday, July 31, 2015
In the article they mention that gamers already can by-pass slow NAND and HDD speeds by just creating a RAMDisk. If you have 32GB of RAM, you could take 8GB of it for your system memory, turn the other 24GB into a RAM disk, and put all of your game files onto it and then your games will load their resources at the speed of your RAM.

And DDR4 is coming down in price very quickly so it isn't such a crazy idea. The cheapest 32GB DDR4 kit I can find is $176 which means 64GB will cost you $350 for games that have 40GB of resources. While not incredibly cheap, it's also not totally unreasonable especially if you're already complaining about SSD's not loading game resources fast enough.
Friendly0Fire - Saturday, August 1, 2015
Sadly, 24GB is a bit short for modern games and 8GB for the OS and the game is also a bit on the low side. Games are finally taking advantage of 64-bit executables (and thus far larger memory cap) and it's showing up as a dramatic increase in asset size, both on disk and in memory.

64GB of RAM might get you there, but I think 32's on the short-ish side. 3D XPoint would side-step the issue by providing far more storage than contemporary games would likely need.
lordken - Sunday, August 2, 2015
As said by Friendly0Fire, 24GB is unfortunately nothing today, many games today have 20-50GB disk requirements (not sure if devs are plain lazy to optimize or they really need that much space for stuff)
Plus don't forget that you need to first fetch data into the ramdisk after boot, and wait for it to flush out before shutdown. So personally I would not bother with ramdisks, and load times probably don't depend solely on read time from storage. On some games I didn't see much difference between HDD and SSD load performance (which shows either bad game engine/coding or some other bottleneck, maybe my CPU).
And not to say leaving only 8GB for OS is really not that great.
JKflipflop98 - Monday, August 3, 2015
Not to mention it's a giant pain in the butt to have to create the ram drive, copy all the files over, and then create all the links needed to actually run the game. By the time you're done futzing around with all that crap, you've cost yourself 10x the time you've saved in loading screens.
lordken - Sunday, August 2, 2015
"This is pretty exciting stuff. If storage can become fast enough, then perhaps we will not need memory. "
imho this will "never" be true, RAM will always be faster, no matter how much you make storage faster you can still also improve RAM which in turn will always keep ahead of storage. Plus as shown in article it is much closer to CPU and thus better perf/latencies etc.

Maybe in case when Xpoint v3 reach performance level of DDR3/4 then diminishing returns could start to kick in , but still by that time we will probably have DDR5/6 or HBM3. So I think RAM will stick around, even if it could perhaps shift into CPU L4 like cache with HBM for example.
