The Memory Bandwidth Challenge

Rattner predicted that by 2015 Intel CPUs would have tens or hundreds of cores on each die, which in turn would require a great deal of memory bandwidth. The problem with memory bandwidth at that level is that you effectively become pin limited: you can't physically fit enough pins on a microprocessor package to support a memory bus wide enough to feed those tens or hundreds of cores.

One solution that Rattner presented was 3D die and wafer stacking. Normally microprocessor circuits are laid out on a flat 2D surface; as the name implies, 3D die and wafer stacking builds on top of that, literally.

First let's talk about wafer stacking. Wafer stacking involves placing two identically sized and shaped wafers on top of each other and using through-silicon vias (vertical interconnects) to connect the top wafer layer to the bottom layer. The best example of an application would be a DRAM wafer sitting on top of a CPU wafer, meaning that you would have main memory (not cache, which would still be inside your CPU) sitting directly on top of your CPU.

With wafer stacking, instead of having hundreds or thousands of pins between your CPU and main memory, you have 1 to 10 million connections between them, which directly increases memory bandwidth. What's interesting is that this method of stacking could also mean the end of external memory.
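To get a feel for why the connection count matters, here is a rough back-of-the-envelope sketch. It assumes peak bandwidth scales linearly with the number of data connections, and the per-connection rate of 1 Gbit/s is a hypothetical figure chosen only for illustration; it is not a number from Rattner's talk. In practice many pins or vias carry power and ground rather than data, so real figures would be lower.

```python
def aggregate_bandwidth_gb_per_s(connections: int, per_connection_mbps: float) -> float:
    """Peak aggregate bandwidth in gigabytes per second.

    connections: number of data-carrying links (pins or through-silicon vias)
    per_connection_mbps: data rate of each link in megabits per second
    """
    bits_per_second = connections * per_connection_mbps * 1e6
    return bits_per_second / 8 / 1e9  # bits -> bytes -> gigabytes


# Hypothetical per-link rate, assumed identical for pins and vias.
PER_LINK_MBPS = 1000.0

# Connection counts taken from the ranges quoted in the text.
scenarios = [
    ("package pins (thousands)", 1_000),
    ("stacked wafer, low end", 1_000_000),
    ("stacked wafer, high end", 10_000_000),
]

for label, count in scenarios:
    bandwidth = aggregate_bandwidth_gb_per_s(count, PER_LINK_MBPS)
    print(f"{label}: {bandwidth:,.0f} GB/s")
```

Whatever per-link rate you plug in, the ratio between the scenarios stays the same: going from thousands of pins to millions of vias raises the ceiling on bandwidth by three to four orders of magnitude.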

Die stacking is another possibility, where you could stack multiple dies of different sizes on top of the CPU core logic; those dies could be DRAM, Flash memory, or anything else really. Intel showed off an 8-layer configuration using die stacking, which according to Intel is a very realistic option.

Rattner was fairly confident in the potential of die and wafer stacking, so it's a technology that we'll definitely have to keep an eye on as time goes on. There are limitations to consider, such as power and thermal dissipation, but there are solutions in the works for those as well (e.g. nanoscale thermal pumps).

15 Comments

  • HardwareD00d - Thursday, March 3, 2005

    Wishful thinking for Intel. Who the heck knows what will happen in 10 years. The earth will probably be taken over by Soviet Russia and Yakov Smirnov will be president.

    When I hear about this stuff I keep thinking back to the Prescott and how it was going to scale to 5 GHz+. Yeah right.
  • Dualboy24 - Thursday, March 3, 2005

    #2 The Super Resolution technique would work only with a video signal which, as stated, was a 3-second low-resolution one. It would not work with a single still low-resolution image. The software just compares the pixels and motion between frames to generate a single clean frame. 3 seconds can mean 30 to 90 frames for analysis in total, depending on the rate of capture. Still very impressive how clear it made it.

    Impressive stuff. I can't wait till 2015... I want 100 cores now! I bet developer tools and compilers will start to take advantage of huge multithreading.
  • ZobarStyl - Thursday, March 3, 2005

    Anyone think the idea of sticking two cores directly on top of one another for Intel's current processor line is a little bit ambitious? We're already curious about how 2 Prescotts side by side will fare, but one on top of another, how can that dissipate such a tremendous amount of heat?
  • quanta - Thursday, March 3, 2005

    Maybe it was just Anand's screenshot, but the 'Super Resolution' technique looks like someone making up the whole thing when the source text is not supposed to be readable. Perfect for tampering with evidence. :)
  • coldpower27 - Thursday, March 3, 2005

    Interesting stuff.
