Analyzing Intel-Micron 3D XPoint: The Next Generation Non-Volatile Memory

by Kristian Vättö, Ian Cutress & Ryan Smith on July 31, 2015 11:00 AM EST

Products

During the event, Intel and Micron made it clear that this week's announcement is solely about the underlying 3D XPoint technology. Products based on this new technology will follow sometime next year and the companies were quite tight-lipped when it came to details, but they did give away a few hints. First of all, the co-operation between Intel and Micron only exists at the memory technology level and both companies are developing their own 3D XPoint based products, similar to how the two have operated in the SSD/NAND business. Technically this means that the two will be competing against each other, although it's possible that each company will take a unique approach to utilizing 3D XPoint in an end product.

One takeaway from the presentation and Q&A was Intel's emphasis on NVMe. Intel has been a strong advocate of the technology ever since its inception, and as a matter of fact Intel was the first SSD vendor to ship NVMe SSDs in high volume with the introduction of the DC P3700 and its derivatives last year. While NVMe has mostly been associated with NAND so far, since NAND is the mainstream non-volatile memory, the core architecture was built to scale with future memory technologies with even lower latencies (after all, NVMe stands for Non-Volatile Memory Express). Given that software interfaces tend to stick around for at least a decade, it's obvious that NVMe had to be designed with more than just NAND in mind.

With NVMe it's certain that we will see 3D XPoint based PCIe SSDs. Whether these will be add-in cards or 2.5" drives remains to be seen, but I'm more inclined to say add-in cards (at least initially) because of the connector limitations. U.2 (formerly SFF-8639) supports only four PCIe 3.0 lanes, resulting in an effective real-world bandwidth of about 3.2GB/s. NAND is already capable of saturating that for read operations, so even though 3D XPoint would improve write and random IO performance, the full potential would ultimately go unused without a higher bandwidth interface. An add-in card doesn't share the limitations of U.2 and could support up to 16 lanes with over 10GB/s of bandwidth available, but the downside would be more limited serviceability, since add-in cards can't be front-loaded like 2.5" drives can. As enterprises have used add-in cards in the past (Fusion-io never made anything but add-in cards), I don't see serviceability being a major hurdle for the companies that really need 3D XPoint for their workloads. On the other hand, I wouldn't be surprised to see Intel pushing for an 8-lane U.2-like standard, but it really needs industry-wide support to get air under its wings.
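To put rough numbers on the lane counts above, here is a minimal sketch of the arithmetic. The 8 GT/s signalling rate and 128b/130b encoding are standard PCIe 3.0 figures; the efficiency factor is an assumption, back-derived from the roughly 3.2GB/s real-world x4 figure quoted above rather than measured, and the function names are made up for illustration.

```python
# PCIe 3.0 signalling: 8 GT/s per lane with 128b/130b line encoding.
PCIE3_GT_PER_S = 8.0
PCIE3_ENCODING = 128 / 130


def pcie3_raw_gb_per_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s (10^9 bytes/s),
    before any packet or protocol overhead."""
    bits_per_s = PCIE3_GT_PER_S * 1e9 * PCIE3_ENCODING * lanes
    return bits_per_s / 8 / 1e9


def effective_gb_per_s(lanes: int, efficiency: float) -> float:
    """Raw bandwidth scaled by an assumed real-world efficiency factor."""
    return pcie3_raw_gb_per_s(lanes) * efficiency


# Assumption: infer the efficiency from the ~3.2GB/s quoted for four lanes,
# then apply the same factor to wider links.
X4_EFFICIENCY = 3.2 / pcie3_raw_gb_per_s(4)

if __name__ == "__main__":
    for lanes in (4, 8, 16):
        raw = pcie3_raw_gb_per_s(lanes)
        eff = effective_gb_per_s(lanes, X4_EFFICIENCY)
        print(f"x{lanes}: raw {raw:.2f} GB/s, effective about {eff:.1f} GB/s")
```

Under that assumption a 16-lane card lands comfortably above the 10GB/s mark, while a hypothetical 8-lane connector would sit in between.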

With Intel being the other party in the joint venture, it's guaranteed that 3D XPoint will get all the support and love it needs on the platform side. Intel can integrate more PCIe lanes and/or accelerate the development of PCIe 4.0 for its upcoming platforms to create the necessary bandwidth and push for 3D XPoint if needed, which is something that no other memory vendor could do.

AgigA's DDR4 NVDIMM: A Future 3D XPoint Form Factor?

While Intel will clearly be pursuing the storage aspect of 3D XPoint through NVMe, I suspect Micron might take a more memory-like approach since it's a memory company as much as it's a storage company. It was made clear that 3D XPoint can be used in memory and storage applications because the technology is bit-addressable and can work in a similar fashion to DRAM. Bringing 3D XPoint closer to the CPU and connecting it through a DDR4 interface would obviously yield the best performance and eliminate any bottlenecks that PCIe has. There are already NAND-based products that do this, such as SanDisk's ULLtraDIMM, and a couple of months ago JEDEC paved the way by releasing a standard for DDR4 NVDIMMs, a new standard set to fill the gap between DRAM and SSDs. While NVDIMMs will require driver work due to the lack of a standardized software interface like NVMe, I do believe 3D XPoint is the right technology for bringing NVDIMMs to the market and it would make sense for Micron to do so.

Applications

Section by Ryan Smith

The use cases for 3D XPoint are potentially significant in number and Intel/Micron believe that it will open the doors for all sorts of new applications. Overall the computing industry has had access to high speed non-volatile memory technologies before – magnetic core memory is the traditional poster child here – so there is some precedent here and some fundamental research into the field from the early days of computing. However with magnetic core memory having become outmoded before the majority of our readers were even born, the modern computing industry has developed around the current paradigm of fast DRAM and slow permanent storage. As a result while the potential applications are numerous, it’s still in many ways an uncharted area in computer science.

The most immediate application of 3D XPoint based products will be as a layer of storage between DRAM and SSDs. Over the history of computing the number of layers between storage and processors has continued to build – multiple layers of on-die caches, off-die caches, caching SSDs, etc – and 3D XPoint memory would further fit into that hierarchy as a storage medium that bridges the gap between DRAM and the current fastest non-volatile storage. By treating 3D XPoint memory as another layer of cache, 3D XPoint can be used to further speed up applications that are currently bound by either memory capacity or storage latency.

The Traditional Memory Hierarchy (Image Source: Tommy MacWilliam, Harvard)
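As a rough way to see why an extra layer helps, the sketch below applies the textbook average-access-time calculation to a tiered hierarchy. Every latency and hit rate in it is a placeholder assumption chosen for round numbers, not a figure from Intel, Micron, or any measurement, and the `Tier` type and tier names are purely illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tier:
    name: str
    latency_ns: float   # assumed cost of probing this tier
    hit_rate: float     # fraction of requests reaching this tier that it serves


def average_access_ns(tiers: List[Tier]) -> float:
    """Expected latency when a request walks down the tiers until it hits.

    Each tier's latency is paid by every request that reaches it; the last
    tier should have hit_rate 1.0 so that every request is eventually served.
    """
    expected = 0.0
    p_reach = 1.0
    for tier in tiers:
        expected += p_reach * tier.latency_ns
        p_reach *= 1.0 - tier.hit_rate
    return expected


# Placeholder numbers only (hypothetical, for illustration).
WITHOUT_XPOINT = [
    Tier("DRAM", 100, 0.90),
    Tier("SSD", 100_000, 1.0),
]
WITH_XPOINT = [
    Tier("DRAM", 100, 0.90),
    Tier("3D XPoint", 1_000, 0.90),
    Tier("SSD", 100_000, 1.0),
]

if __name__ == "__main__":
    print(f"two tiers:   {average_access_ns(WITHOUT_XPOINT):.0f} ns")
    print(f"three tiers: {average_access_ns(WITH_XPOINT):.0f} ns")
```

With these made-up inputs the middle tier absorbs most of the requests that would otherwise fall through to the SSD, which is the whole argument for treating 3D XPoint as another layer of cache; real gains would depend entirely on actual latencies and hit rates.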

Given the costs of 3D XPoint, the first such applications are expected to be on the enterprise side. Enterprise users make heavy use of storage at all layers in order to balance performance needs against the relatively small capacity of DRAM. Database servers in particular adapt well to caching, and it’s easy enough to imagine a next-generation database system using 3D XPoint to backstop DRAM. Since 3D XPoint is non-volatile, it can even be an exclusive cache – that is, its contents don’t need to be in lower layers as well – which eliminates a good deal of overhead. A database system in this context would only need to write contents to SSDs and other, lower layers of storage when data gets expelled from the 3D XPoint cache, an occurrence that may be particularly rare with a properly tuned database.
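The exclusive-cache idea can be sketched in a few lines. This is a toy model under stated assumptions, not how any real database or 3D XPoint product works: the class name is invented, a plain dict stands in for the SSD layer, and least-recently-used eviction is just one possible policy.

```python
from collections import OrderedDict


class ExclusiveNVCache:
    """Toy LRU cache standing in for a non-volatile tier above slower storage.

    Exclusive means a record lives in the cache or in the backing store,
    never both. Because the cache tier is assumed to be non-volatile, writes
    are not mirrored downward; the backing store is written only on eviction.
    """

    def __init__(self, capacity: int, backing_store: dict):
        self.capacity = capacity
        self.backing = backing_store      # dict standing in for the SSD layer
        self.entries = OrderedDict()      # oldest entry first
        self.backing_writes = 0           # how often the lower tier is written

    def put(self, key, value):
        self.backing.pop(key, None)       # drop any stale lower-tier copy
        self.entries[key] = value
        self.entries.move_to_end(key)     # mark as most recently used
        self._evict_if_needed()

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)
            return self.entries[key]
        # Miss: promote from the lower tier (raises KeyError if absent there),
        # removing it below so the two tiers stay exclusive.
        value = self.backing.pop(key)
        self.entries[key] = value
        self._evict_if_needed()
        return value

    def _evict_if_needed(self):
        while len(self.entries) > self.capacity:
            old_key, old_value = self.entries.popitem(last=False)
            self.backing[old_key] = old_value   # the only write to the lower tier
            self.backing_writes += 1


if __name__ == "__main__":
    ssd = {}
    cache = ExclusiveNVCache(capacity=2, backing_store=ssd)
    cache.put("a", 1)
    cache.put("b", 2)
    cache.put("a", 10)    # update stays in the cache tier
    cache.put("c", 3)     # over capacity: least recently used entry is expelled
    print(ssd, cache.backing_writes)
```

The counter makes the point of the paragraph above visible: repeated updates to hot records never touch the lower tier, and the larger the cache relative to the working set, the rarer evictions become.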

Many of these benefits of a cache layer are applicable to other types of storage-heavy servers as well, though I expect databases will be the king. Perhaps the more interesting aspect – and certainly more relatable to the public at large – will be what 3D XPoint-backed servers are used for. Intel and Micron are eager to point out the “big science” uses for the technology; projects and systems such as the Large Hadron Collider and Oak Ridge’s Titan supercomputer can generate a massive amount of data, and while processing all of that data is first and foremost a processor issue, feeding that data for processing is a big problem as well. Any kind of analysis that could benefit from individual processors having RAM-like access to an SSD-sized pool of data could benefit.

The catch is that there’s still a lot of research that’s needed into figuring out what the best uses may be. This kind of shift in access times and capacity doesn’t just make computers faster, but it can change the fundamentals of what algorithms are best. Just as how GPUs required scientists to figure out how to spread out their work in a massively parallel (and high latency) nature, putting 3D XPoint to its full use will require newer algorithms that are capable of effectively utilizing direct access to so much data at once.

Meanwhile I would be surprised if the financial industry didn’t jump on this early, as they are prone to jumping on major technologies in order to try to get an edge in a highly competitive and lucrative field. In this aspect it’s not so much that 3D XPoint would improve processing speed – such work is already offloaded to large RAM pools when possible – but rather it would enable traders and analysts to run simulations against much larger datasets much more effectively.

As for the consumer space, the same principles about an additional cache layer would apply, but I’m not so sure we’d see consumers pick it up in the same manner. Much of this has to do with what the eventual costs and capacities of 3D XPoint products would be, as consumers are much more price sensitive than professional users. In the consumer space we’ve seen sporadic use of NAND-backed hard drives, for example, but by and large consumers have stuck with discrete SSDs and HDDs. Consumers either don’t want to pay the premium for SSDs, or have enough money to just buy large SSDs outright, leaving little of a middle ground.

That said I’ve seen some interesting pitches for 3D XPoint in the gaming space that have some merit, as games are something of a special case for consumer workloads. By and large we want fast access to game resources since those resources are accessed on-demand and are needed to progress in a game’s execution, but the assets themselves aren’t volatile. Only a small part of the working set for a game is volatile data – player positions, AI decision trees, game state, etc – while the rest of it is static data such as models, world geometry, and textures. 3D XPoint in turn would be fast enough that it could be used as a replacement for RAM in holding these assets, but as the data is non-volatile it wouldn’t thrash 3D XPoint P/E cycles very much, and any write speed disadvantage compared to DRAM would be immaterial.

But again, this is going to depend on the cost of the technology; if it were to become cheap enough that 50-100GB could be thrown in a game console or gaming PC, then you could store the entirety of most games in 3D XPoint memory, which would reduce load times to the time required to process the data and set up the game state. This matters more for consoles, which currently store their games on a mechanical drive and could then recall data rather quickly on first boot or accommodate large amounts of memory swapping in more detailed titles. High-end PCs with large amounts of DRAM can already use RAM disks, which perhaps nullifies the point there.

Last but not least of course are the implications for 3D XPoint as a wholesale replacement for DRAM. The more limited lifetime of 3D XPoint relative to DRAM certainly poses some challenges in this respect, but I suspect the bigger issue will be overall bandwidth. By the time 3D XPoint becomes available in bulk, DRAM technology should be to the point where faster generations of DDR4 are available and HBM is widely deployed. Given that future generations of HBM are targeting 1TB/sec or more of memory bandwidth, it’s unlikely that 3D XPoint is going to be able to match the bandwidth of contemporary high-bandwidth DRAM solutions. So any rumors of the impending death of DRAM are likely premature.

IoT & Embedded, A Good Fit For 3D XPoint?

But with that said, while 3D XPoint isn’t likely to replace DRAM in a wholesale manner for all applications, there is clearly room for it to replace DRAM in some situations where DRAM is used primarily for its bandwidth and latency versus solid state storage. Replacing DRAM with 3D XPoint in embedded applications for example would be very practical – many embedded uses don’t need high bandwidth or low latency as much as they just need something better than traditional NAND – and I wouldn’t rule out smartphones here either, at least to an extent. If individual 3D XPoint chips can be produced small and cheap enough, then the most lucrative use for the tech as a DRAM replacement may be in the vast legions of low-performance devices, rather than in high-performance hardware that actually needs the full speed and latency of DRAM.

80 Comments

Alexvrb - Friday, July 31, 2015 - link
I don't think so... this is slower than current RAM. They aren't very likely to use HBM only on an APU for various reasons, so you're still going to be using something like DDR4 for your main memory. Which again, is faster than this XPoint tech.

XPoint is however a lot denser than RAM, and it's non-volatile so it will make excellent high-speed storage if we can get a better interface. I think in a few years we could at least be using it as a cache for NAND devices or as "boot drives" similar to how we were using then-costly NAND-based SSDs not so long ago.
lilmoe - Monday, August 3, 2015 - link
If we're talking more in a "conventional" non-enterprise, consumer/professional product sense, then I believe this type of memory would be more of a complement to eDRAM (or other forms of higher density, lower speed cache memory), with DRAM completely omitted from the hierarchy. But this may fundamentally change the way operating systems and applications work, and depending on design/application, may lead to breakthrough performance gains.
Scoobmx - Friday, July 31, 2015 - link
Ian, I have some serious doubts that this is STT-MRAM. The endurance and density numbers don't really line up. STT has virtually limitless endurance but fairly poor density due to the high current required, hence the need for a large transistor. I don't have the hubris to claim that it's impossible, but I believe it highly unlikely. Source: completed my dissertation in nanomagnetic logic and memory devices last year.
J03_S - Friday, July 31, 2015 - link
It might very well be Perpendicular Magnetic Anisotropic Magnetic Tunneling Junction STT-MRAM. It's a variant of STT-MRAM that does not suffer from the density issues and is more than one order of magnitude more efficient than Spin torque transfer. It was covered in the AIP journal and published back in April of 2014 by Luc Thomas and associates. At the time they had IBM producing chips for them as the entire process is fully compatible with the existing CMOS backend and requires no special changes be made to the process. This expedited the research quite a bit as they were able to test fully functioning chips.
jjj - Friday, July 31, 2015 - link
About the positioning in the market you are being a bit misleading initially.
The technology itself is likely able to compete with NAND in pricing; there would be a process and layers race but it could be doable.
So it's not really in between NAND and DRAM, cost wise, at least that's not a must; it will cost us a lot more than NAND because Intel and Micron will milk the hell out of it.
About output, that's a strategy matter, the goal being to maximize profits, nothing else matters. The 2 companies are trying to justify their initial prices and markets by placing it in the middle - sure it is in the middle perf wise and cost is likely higher for now than the most efficient NAND.
When you comment about power vs NAND you forget to say that it would be per bit and that's kinda relevant.
When you talk about how the layers are made and costs, it would be important to point out that 3D NAND has very poor planar density compared to 2D NAND. The density here seems to be very close to 2D NAND density. You make it sound like it would cost a lot more than 3D NAND and i don't think that's the case at all. Sure maybe it's 2-4 times more for now but that's not too far and it's a lot cheaper than RAM. Yes scaling the layers seems costlier here than with 3D NAND.
When talking die size it stops being as misleading as some previous bits. On die size it looks more like 18+ dies and close to 23 so some 13x16mm for 208-ish mm2.
High cell efficiency would be good too when scaling so if they go 16nm 4 layers in gen 2, it would be interesting.
Micron can double its profits once they max that facility (and Intel takes half), i was assuming they'll push SSDs at 4-5$ per GB too but i'm sure they'll try to go even higher if they can.
As far as i know PCIe 4.0 was due in 2017 so not too far away.
You keep pushing their agenda at the end about where it can go. Look, DDR3 is some 4.5$ per GB, DDR4 getting close to 6.5$ per GB, 128Gb NAND is some 5$ but the range is pretty wide for NAND (3.5-6$). Could they sell it in phones at 1-2$ per GB? Easily, but they won't at first, it's more profitable not to. Will they do it in gen 2-3, yeah they will. They need to expand it slowly before others have their own 3D ReRAM solutions and have a solid base by that time, while making a lot of money with it in the few years of monopoly.
Ofc in phones they can go for 4-8GB at 3$ per GB and lesser RAM to save power. Don't forget power in phones, just on that and it's worth using a hybrid RAM/ReRAM in high end.

So overall i think you fail to make a clear distinction between the technology and the financial strategy. The big limitation in adoption is the very high margins, the technology itself seems plenty capable and cheap. In IoT could be interesting too when it gets cheap enough but it's not ideal since it's not quite as cheap and dense as the industry would like, a lot more is needed there long term.
Anyway, great that we have this 5 or more years before it was expected, not so great (for us) that it might take a while before prices become accessible for consumers. At least this forces others to accelerate their ReRAM roadmaps.
zodiacfml - Friday, July 31, 2015 - link
Hmm, I think I know why Intel is so invested in this. This will eventually replace NAND drives as performance storage while the NAND drives of today become the cold, backup storage replacing spinning disk drives. I feel that 3D-NAND has more potential for higher density and lower power versus disks. It might become more cost effective or cheaper than hard disks when OEMs start using NAND in cheap and mid range PCs because of the scale and fewer buyers of hard disks.
DrKlahn - Friday, July 31, 2015 - link
I think short term you may see Intel and Micron put a small amount of XPoint as a read/write cache onto their enterprise and performance oriented SSDs. It would give them a decent performance advantage with a price bump modest enough to still attract consumers.
Drumsticks - Friday, July 31, 2015 - link
I've been looking forward to this writeup! I work in NSG at Intel (the Non-Volatile Memory Solutions Group i.e. the people developing 3D XPoint) and we've been super excited for this reveal.

It's fun to see the industry analysis, and as always Anandtech has one of the most in-depth!
Vlad_Da_Great - Friday, July 31, 2015 - link
@Drumsticks. Keep up the good work, the world is moving thanks to people like you and INTC as a company. Thank you!!!
jjj - Friday, July 31, 2015 - link
Forgot to mention that in a promo video they claim SSDs with this would be up to 10x faster over PCIe/NVMe. https://www.youtube.com/watch?t=184&v=Wgk4U4qV...
No idea how they do the math ofc so i wouldn't expect 10x random.
