History: Where GDDR5 Reaches Its Limits

To really understand HBM we’d have to go all the way back to the first computer memory interfaces, but in the interest of expediency and sanity, we’ll condense that lesson down to the following: the history of computer memory interfaces is a consistent cycle of moving between wide parallel interfaces and fast serial interfaces. Serial ports and parallel ports, USB 2.0 and USB 3.1 (Type-C), SDRAM and RDRAM – there is a continual process of developing faster interfaces, then developing wider interfaces, and switching back and forth between them as conditions call for it.

So far in the race for PC memory, the pendulum has swung far in the direction of serial interfaces. Through 4 generations of GDDR, memory designers have continued to ramp up clockspeeds in order to increase available memory bandwidth, culminating in GDDR5 and its blistering 7Gbps+ per pin data rate. GDDR5 in turn has been with us on the high-end for almost 7 years now, longer than any previous memory technology, and in the process has gone farther and faster than initially planned.

But in the cycle of interfaces, the pendulum has finally reached its apex for serial interfaces when it comes to GDDR5. Back in 2011 at an AMD video card launch I asked then-graphics CTO Eric Demers about what happens after GDDR5, and while he expected GDDR5 to continue on for some time, it was also clear that GDDR5 was approaching its limits. High speed buses bring with them a number of engineering challenges, and while there is still headroom left on the table to do even better, the question arises of whether it’s worth it.


AMD 2011 Technical Forum and Exhibition

The short answer in the minds of the GPU community is no. GDDR5-like memories could be pushed farther, both with existing GDDR5 and with theoretical differential I/O based memories (think USB/PCIe buses, but for memory), however doing so would come at the cost of greatly increased power consumption. In fact even existing GDDR5 implementations already draw quite a bit of power; thanks to the complicated clocking mechanisms of GDDR5, a lot of memory power is spent merely on distributing and maintaining GDDR5’s high clockspeeds. Any future GDDR5-like technology would only ratchet up the problem, along with introducing new complexities such as a need to add more logic to the memory chips themselves, a somewhat painful combination as logic and dense memory are difficult to fab together.

The current GDDR5 power consumption situation is such that by AMD’s estimate some 15-20% of the Radeon R9 290X’s power consumption (against a 250W TDP) goes to memory. This is even after the company went with a wider, slower 512-bit GDDR5 memory bus running at 5Gbps so as to better contain power consumption. Using an even faster, higher-power memory standard would only serve to exacerbate that problem.
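
For those keeping score, a quick back-of-the-envelope sketch in Python (not from AMD; it uses only the TDP percentage and bus figures quoted above) puts those numbers in absolute terms:

# Rough check of the memory power figures cited above.
tdp_watts = 250                              # R9 290X board power (TDP)
mem_share_low, mem_share_high = 0.15, 0.20   # AMD's estimated share of power going to memory
print(f"Estimated memory power: {mem_share_low * tdp_watts:.1f}-{mem_share_high * tdp_watts:.0f} W")  # ~37.5-50 W

# Peak bandwidth of the 290X's 512-bit bus at a 5Gbps per-pin data rate
bus_width_bits, data_rate_gbps = 512, 5.0
print(f"Peak bandwidth: {bus_width_bits / 8 * data_rate_gbps:.0f} GB/s")  # 320 GB/s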

All the while, power consumption for consumer devices has been on a downward slope as consumers (and engineers) have made power consumption an increasingly important issue. The mobile space, with its fixed battery capacity, is of course the prime example, but even in the PC space power consumption for CPUs and GPUs has peaked and since come down some. The trend is towards more energy efficient devices – the idle power consumption of a 2005 high-end GPU would be intolerable in 2015 – and that throws yet another wrench into faster serial memory technologies: memory power consumption would be going up at exactly the same time as overall device power consumption is expected to come down, leaving individual devices with lower power limits to work with.

Finally, coupled with all of the above have been issues with scalability. We’ll get into this more when discussing the benefits of HBM, but in a nutshell GDDR5 also ends up taking a lot of space, especially when we’re talking about 384-bit and 512-bit configurations for current high-end video cards. At a time when everything is getting smaller, there is also a need to further miniaturize memory, something that GDDR5 and potential derivatives wouldn’t be well suited to resolve.

The end result is that in the GPU memory space, the pendulum has started to swing back towards parallel memory interfaces. GDDR5 has been taken to the point where going any further would be increasingly inefficient, leading to researchers and engineers looking for a wider next-generation memory interface. This is what has led them to HBM.

Comments

  • jjj - Tuesday, May 19, 2015 - link

    "First part of the solution to that in turn was to develop something capable of greater density routing, and that something was the silicon interposer. "
    "Moving on, the other major technological breakthrough here is the creation of through-silicon vias (TSVs). "

    You guys are acting like interposers and TSVs were created by AMD and Hynix for this; it's hugely misleading the way you chose to phrase things.
    And ofc, as always when you do this kind of article (Aptina, Synaptics, Logitech and a few more in the last few years), it's more advertising than anything else. You don't talk about other similar technologies, existing or potential; you just glorify the one you are presenting.
  • Crunchy005 - Tuesday, May 19, 2015 - link

    This isn't an article on HBM itself but on AMD's next-gen cards. They are focusing on AMD because of that fact. If this were about HBM itself I'm sure they would talk about other technologies out there as well. Don't criticize them for staying on topic in the article.
  • testbug00 - Tuesday, May 19, 2015 - link

    A side note for the article: ATI also was the main developer of GDDR3, with JEDEC helping a little. Nvidia launched with it first, but ATI __DID__ most of the design work.
  • testbug00 - Tuesday, May 19, 2015 - link

    Having finished the article, I was also under the impression that high-clock GDDR5 used 2-2.5 watts per chip on the board. I don't see why 7Gbps GDDR5 with 50% more chips would use only 5% more power. (Currently on the graph the 290 == 16 chips @ 5Gbps, ~30W; Titan X = 24 chips @ 7Gbps, ~31.5W.)

    Given AMD's ~15-20% for the 290X, I would put that at around 35-50W, while NVidia's solution is at least 50W. Of course, I could be wrong!
  • testbug00 - Tuesday, May 19, 2015 - link

    As a note, I get that you used the best GDDR5 bandwidth/W you can get... However, that's likely at the best point in the perf/watt curve. I suspect that's under 5Gbps, based on AMD's claimed GDDR5 consumption on the 290(X) and their memory clock.

    Which would put AMD's under that number, and NVidia's further under it.
  • testbug00 - Tuesday, May 19, 2015 - link

    Oh, and here's the slide you have that "proves" it: http://images.anandtech.com/doci/9266/HBM_9_Compar...

    That means at 7Gbps, at max bandwidth/watt, the Titan X should be using ~63 watts of power: (28 / 10.66) * 24 = 63.04.
  • Ryan Smith - Tuesday, May 19, 2015 - link

    They're rough estimates based on power consumption per bit of bandwidth and should be taken as such. Titan X has more chips, but it doesn't have to drive a wider memory bus.
  • HighTech4US - Tuesday, May 19, 2015 - link

    Facts have never gotten in the way of testbug's anti-Nvidia drivel.
  • testbug00 - Tuesday, May 19, 2015 - link

    So, should I assume that GDDR5 chips don't use power if you don't make a wider bus? And that 7Gbps is the best bandwidth/watt point of GDDR5? Or that GDDR5 power consumption doesn't change when you raise or lower the clockspeed?

    Nvidia's generalized power is just easier to calculate because they use 7Gbps. Anyhow, my guesstimate for the 290X is that it uses 32W, given perfect power scaling from 5Gbps to 7Gbps and the fact that it has fewer chips to power.

    The reality is probably AMD's is 40-50W and NVidia is 50-60W. Running more GDDR5 chips at higher clockspeeds, even on a smaller bus, should result in higher power usage.

    I have rose-tinted glasses, but I also do have a brain.
  • silverblue - Tuesday, May 19, 2015 - link

    It's quite the role-reversal, really. Back in the GT 200 days, NVIDIA were throwing out cards with wider memory buses, and AMD showed them that it was (mostly) unnecessary.

    Whichever solution uses the most power for GDDR5 stands to gain the most with a move to HBM. I'd be interested in seeing how much juice the memory on a 12GB card uses...
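
As an aside on the power estimates debated in the comments above, here is a minimal sketch (in Python, assuming 32-bit GDDR5 chips and the ~10.66 GB/s-per-watt figure from AMD's comparison slide) of the two estimation methods in play: one scales power with total bus bandwidth, the other with per-chip bandwidth, which effectively double-counts chips that share a channel (Ryan's point that Titan X doesn't have to drive a wider bus despite having more chips):

GDDR5_GBS_PER_WATT = 10.66  # GB/s per watt, from AMD's comparison slide

def power_from_bus(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Estimate memory power from total bus bandwidth (the article's rough method)."""
    bandwidth_gbs = bus_width_bits / 8 * data_rate_gbps
    return bandwidth_gbs / GDDR5_GBS_PER_WATT

def power_from_chips(num_chips: int, data_rate_gbps: float) -> float:
    """Estimate memory power from per-chip bandwidth (the commenter's method)."""
    per_chip_gbs = 32 / 8 * data_rate_gbps  # assumes each chip drives a full 32-bit channel
    return num_chips * per_chip_gbs / GDDR5_GBS_PER_WATT

print(f"R9 290X, 512-bit @ 5Gbps:  {power_from_bus(512, 5.0):.1f} W")    # ~30 W
print(f"Titan X, 384-bit @ 7Gbps:  {power_from_bus(384, 7.0):.1f} W")    # ~31.5 W
print(f"Titan X, 24 chips @ 7Gbps: {power_from_chips(24, 7.0):.1f} W")   # ~63 W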
