Intel's Return to DRAM: Haswell GT3e to Integrate 128MB eDRAM?
by Anand Lal Shimpi on April 23, 2013 11:58 AM EST
We've known for a while now that Intel will integrate some form of DRAM on-package for the absolute highest end GPU configurations of its upcoming Haswell SoC. Memory bandwidth is a very important enabler of GPU (and multi-core CPU) performance, but delivering enough of it has typically required very high speed interfaces (read: high power) and/or very wide interfaces (read: large die areas). Neither of the traditional approaches to scaling memory bandwidth is low power or cost effective, which has kept them out of ultra mobile and integrated processor graphics.
The days of simple performance scaling by throwing more transistors at a design are quickly coming to an end. Moore's Law will continue, but much like the reality check that building low power silicon gave us a while ago, building high performance silicon will need some out of the box thinking going forward.
Dating back to Ivy Bridge (3rd gen Core/2012), Intel had plans to integrate some amount of DRAM onto the package in order to drive the performance of its processor graphics. Embedding DRAM onto the package adds cost and heat, and allegedly Paul Otellini wasn't willing to greenlight the production of a part that only Apple would use, so it was canned. With Haswell, DRAM is back on the menu, and this time it's actually going to come out.
We've referred to the Haswell part with embedded DRAM as Haswell GT3e. The GT3 refers to the GPU configuration (40 EUs), while the lowercase e denotes embedded DRAM. Haswell GT3e will only be available in a BGA package (soldered-on, not socketed), and is only expected to appear alongside higher TDP (read: not Ultrabook) parts. The embedded DRAM will increase the thermal load of the SoC, although it shouldn't be as painful as including a discrete GPU + high speed DRAM. Intel's performance target for Haswell GT3e is NVIDIA's GeForce GT 650M.
What we don't know about GT3e is the type, size and speed of memory that Intel will integrate. Our old friend David Kanter at RealWorldTech presented a good thesis on the answers to those questions. Based on some sound logic and digging through the list of papers to be presented at the 2013 VLSI Technology Symposium in Kyoto, Kanter believes that the title of this soon to be presented Intel paper tells us everything we need to know:
"A 22nm High Performance Embedded DRAM SoC Technology Featuring Tri-Gate Transistors and MIMCAP COB"
According to Kanter's deductions (and somewhat validated by our own sources), Haswell GT3e should come equipped with 128MB of eDRAM connected to the main SoC via a 512-bit bus. Using eDRAM vs. commodity DDR3 makes sense as the former is easier to integrate into Intel's current fabs. Power, manufacturability and cost concerns also contributed to the creation of Intel's own DRAM design. The interface width is a bit suspect as that would require a fair amount of area at the edges of the Haswell die, but the main takeaway is that we're dealing with a parallel interface. Kanter estimates the bandwidth at roughly 64GB/s, not anywhere near high-end dGPU class but in the realm of what you can expect from a performance mainstream mobile GPU. At 22nm, Intel's eDRAM achieves a density of around 17.5Mbit/mm^2, which works out to ~60mm^2 for the 128MB eDRAM array itself. Add in any additional interface logic and Kanter estimates the total die area for the eDRAM component to be around 70 - 80mm^2. Intel is rumored to be charging $50 for the eDRAM adder on top of GT3, which would deliver very good margins for Intel. It's a sneaky play that allows Intel to capture more of the total system BoM (Bill of Materials) that would normally go to a discrete GPU company like NVIDIA, all while increasing utilization of its fabs. NVIDIA will still likely offer better performing solutions, not to mention the benefits of much stronger developer relations and a longer history of driver optimization. This is just the beginning, however.
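The area and bandwidth figures quoted above can be sanity-checked with some quick arithmetic. The Python sketch below is only a back-of-the-envelope check, not anything from Kanter's analysis: the function names are made up for illustration, it assumes MB and Mbit use the same prefix convention, and it assumes peak bandwidth is simply bus width times transfer rate. The transfer rate it prints is therefore an implied value under that simple model, not a figure reported by Intel or Kanter.

```python
def edram_area_mm2(capacity_mbytes: float, density_mbit_per_mm2: float) -> float:
    """Raw array area for a given capacity at a given bit density."""
    capacity_mbit = capacity_mbytes * 8  # 8 bits per byte
    return capacity_mbit / density_mbit_per_mm2


def implied_transfer_rate_gtps(bandwidth_gbytes_per_s: float, bus_width_bits: int) -> float:
    """Transfers per second (in GT/s) needed to hit a bandwidth on a given bus.

    Assumes the simple model: bandwidth = (bus width in bytes) * transfer rate.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bandwidth_gbytes_per_s / bytes_per_transfer


# Figures taken from the article: 128MB, 17.5Mbit/mm^2, 512-bit bus, ~64GB/s.
area = edram_area_mm2(128, 17.5)
rate = implied_transfer_rate_gtps(64, 512)

print(f"eDRAM array area: {area:.1f} mm^2")    # 58.5 mm^2, i.e. the ~60mm^2 quoted
print(f"Implied transfer rate: {rate:.1f} GT/s")  # 1.0 GT/s under the simple model
```

The raw array comes out just under 60mm^2, which is consistent with the 70 - 80mm^2 total once interface logic is added.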
Based on leaked documents, the embedded DRAM will act as a 4th level cache and should work to improve both CPU and GPU performance. In server environments, I can see embedded DRAM acting as a real boon to multi-core performance. The obvious fit in the client space is to improve GPU performance in games. At only 128MB I wouldn't expect high-end dGPU levels of performance, but we should see a substantial improvement compared to traditional processor graphics. Long term you can expect Intel to bring eDRAM into other designs. There's an obvious fit with its mobile SoCs, although there we're likely talking about something another 12 - 24 months out.
AMD is expected to integrate a GDDR5 memory controller in its future APUs, similar to what it has done with the PlayStation 4 SoC, as its attempt to solve the memory bandwidth problem for processor based graphics.
Spunjji - Tuesday, April 23, 2013
$50 on top of the already-inflated cost of an i7... no, I don't think I'm too interested in that just yet. Could be nice a generation or two from now.
aruisdante - Tuesday, April 23, 2013
It's only available in solder-on packages. Meaning OEM only. It's meant to replace the lower end DGPU's in mid-range ultrabooks. Not for consumers. So the end cost is about the same when you subtract the cost of the DGPU
glugglug - Tuesday, April 23, 2013
That aspect of it pisses me off. I've been looking forward to building a new desktop to upgrade from my current Bloomfield when Haswell comes out, and so wanted this to be the integrated DRAM version.
The L4 cache would be a huge win for servers as well if they didn't make the stupid thing soldered only. While you could argue that noone ever upgrades a server CPU, they just replace the server, we do have hundreds of dual socket HP servers here ordered with only one socket populated because of core count limits on software licensing.
HisDivineOrder - Wednesday, April 24, 2013
Well, if the rumors that Intel intends not to offer user-installed versions of Broadwell are correct, perhaps this is the reason. So they can offer more eDRAM versions of the CPU.
Spunjji - Thursday, April 25, 2013
I'm aware that it's not for consumers. What I was trying to express is that where graphics performance is important, given the choice between a machine using an i7 with its accompanying price premium plus this on top vs. something like an i3/i5 with discrete, I am still going to be looking at the latter.
seamonkey79 - Tuesday, April 30, 2013
Yea, not quite sure what the purpose is. This type of addition would be fantastic on a higher end-ish i3 or i5 that would give me a decent system and decent graphics, but more portable. Sticking it in with an i7, most people will still want a dGPU to power their games because even with eDRAM, Haswell won't be powerful enough.
jeffkibuule - Tuesday, April 23, 2013
Do you think Apple was relying on Ivy Bridge having eDRAM in its 2012 Retina MacBook Pros?
Death666Angel - Tuesday, April 23, 2013
Well, the 15" model has a GT 650M, so even better performance than IVB would have with eDRAM. And I've read that it is more a software issue (and single thread CPU performance issue) than a GPU performance issue. Many people are fine running 1440p/1600p displays off their Ultrabooks in Windows without performance drawbacks in the general UI. :)
fteoath64 - Thursday, April 25, 2013
The GT3e in the new Haswell chip has to perform 30% better overall compared to the GT650M in order for Apple to ditch the dGpu. There is still a competitor in the GT 750M newchip from Nvidia which might just double the GT650 and if so would still be useful for high-end Retinas if Apple so decides. The improvements to the 13inch Retinas will certainly be iGpu while the highest end would probably get the dGpu (maybe of AMD persuasion ?). Kepler would be king for a long while as Intel still struggles to be "good enough" in the gpu arena.
Flunk - Sunday, May 5, 2013
The GT 750M is already available, it's a higher-clocked version of the same chip the 650M uses.