History: Where GDDR5 Reaches Its Limits

To really understand HBM we'd have to go all the way back to the first computer memory interfaces, but in the interest of expediency and sanity, we'll condense that lesson down to the following: the history of computer memory interfaces is a consistent cycle of moving between wide parallel interfaces and fast serial interfaces. Serial ports and parallel ports, USB 2.0 and USB 3.1 (Type-C), SDRAM and RDRAM: there is a continual process of developing faster interfaces, then developing wider interfaces, and switching back and forth between the two as conditions call for it.

So far in the race for PC memory, the pendulum has swung far in the direction of serial interfaces. Through 4 generations of GDDR, memory designers have continued to ramp up clockspeeds in order to increase available memory bandwidth, culminating in GDDR5 and its blistering 7Gbps+ per-pin data rate. GDDR5 in turn has been with us on the high end for almost 7 years now, longer than any previous memory technology, and in the process has gone farther and faster than initially planned.

But in the cycle of interfaces, with GDDR5 the pendulum has finally reached its apex for serial interfaces. Back in 2011 at an AMD video card launch I asked then-graphics CTO Eric Demers what happens after GDDR5, and while he expected GDDR5 to continue on for some time, it was also clear that GDDR5 was approaching its limits. High-speed buses bring with them a number of engineering challenges, and while there is still headroom left on the table to do even better, the question arises of whether it's worth it.


[Image: AMD 2011 Technical Forum and Exhibition]

The short answer in the minds of the GPU community is no. GDDR5-like memories could be pushed farther, both with existing GDDR5 and theoretical differential I/O based memories (think USB/PCIe buses, but for memory), however doing so would come at the cost of greatly increased power consumption. In fact even existing GDDR5 implementations already draw quite a bit of power; thanks to the complicated clocking mechanisms of GDDR5, a lot of memory power is spent merely on distributing and maintaining GDDR5's high clockspeeds. Any future GDDR5-like technology would only ratchet up the problem, along with introducing new complexities such as the need to add more logic to memory chips, a somewhat painful combination as logic and dense memory are difficult to fab together.

The current GDDR5 power consumption situation is such that by AMD's estimate, 15-20% of the Radeon R9 290X's power consumption (on a 250W TDP) goes to memory. This is even after the company went with a wider, slower 512-bit GDDR5 memory bus clocked at 5GHz so as to better contain power consumption. Using an even faster, more power-hungry memory standard would only serve to exacerbate that problem.
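For a sense of scale, those numbers are easy to sanity-check. The short Python sketch below (an illustrative back-of-the-envelope calculation, not AMD's own methodology) applies the standard peak-bandwidth formula and AMD's 15-20% estimate to the 290X's 512-bit, 5Gbps configuration:

```python
# Back-of-the-envelope sketch of the figures quoted above. This applies the
# standard peak-bandwidth formula and AMD's 15-20% memory power estimate;
# it is an illustration, not AMD's own methodology.

bus_width_bits = 512        # R9 290X memory bus width
data_rate_gbps = 5          # GDDR5 per-pin data rate ("5GHz" effective)
tdp_watts = 250             # R9 290X board TDP
mem_share = (0.15, 0.20)    # AMD's estimated memory share of board power

# Peak bandwidth = bus width (bits) x per-pin rate (Gbps) / 8 bits per byte
bandwidth_gb_s = bus_width_bits * data_rate_gbps / 8
print(f"Peak memory bandwidth: {bandwidth_gb_s:.0f} GB/s")  # 320 GB/s

# Implied memory subsystem power draw from AMD's estimate
low_w, high_w = (tdp_watts * s for s in mem_share)
print(f"Implied memory power: ~{low_w:.0f}-{high_w:.0f} W")  # ~38-50 W
```

In other words, roughly 38-50W of a 250W board goes to memory alone, and that is the baseline any faster GDDR5-like successor would have pushed even higher.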

All the while, power consumption for consumer devices has been on a downward slope as consumers (and engineers) have made energy efficiency an increasingly important issue. The mobile space, with its fixed battery capacity, is of course the prime example, but even in the PC space power consumption for CPUs and GPUs has peaked and since come down some. The trend is towards more energy efficient devices (the idle power consumption of a 2005 high-end GPU would be intolerable in 2015), and that throws yet another wrench into faster serial memory technologies: their power consumption would be going up exactly as overall power consumption is expected to come down, leaving individual devices with lower power limits to work with as a result.

Finally, coupled with all of the above have been issues with scalability. We'll get into this more when discussing the benefits of HBM, but in a nutshell, GDDR5 also ends up taking up a lot of space, especially when we're talking about the 384-bit and 512-bit configurations of current high-end video cards. At a time when everything is getting smaller, there is also a need to further miniaturize memory, something that GDDR5 and potential derivatives are not well suited to resolve.

The end result is that in the GPU memory space, the pendulum has started to swing back towards parallel memory interfaces. GDDR5 has been taken to the point where going any further would be increasingly inefficient, leading researchers and engineers to look for a wider next-generation memory interface. This is what has led them to HBM.

Comments

  • chizow - Tuesday, May 19, 2015 - link

    Nvidia has already confirmed HBM2 support with Pascal (see the ref PCB on last page). I guess they weighed the pros/cons of low supply/high costs and limited VRAM on HBM1 and decided to wait until the tech matured. HBM1 also has significantly less bandwidth than what HBM2 claims (1+TB/s).
  • DanNeely - Tuesday, May 19, 2015 - link

    Probably part of it; but I suspect passing on HBM1 is part of the same more conservative engineering approach that's led to nVidia launching on new processes a bit later than ATI has over the last few generations. Going for the next big thing early on potentially gives a performance advantage, but comes at a cost. Manufacturing is generally more expensive because early adopters end up having to fund more of the upfront expenses in building capacity, and being closer to the bleeding edge generally results in the engineering to make it work being harder. A dollar spent on fighting bleeding-edge problems is either going to contribute to higher device costs, or to less engineering being available to optimize other parts of the design.

    There's no right answer here. In some generations ATI got a decent boost from either a newer GDDR standard or GPU process. At other times, nVidia's gotten big wins from refining existing products; the 7xx/9xx series' major performance-per-watt wins being the most recent example.
  • chizow - Wednesday, May 20, 2015 - link

    Idk, I think AMD's early moves have been pretty negligible. GDDR4 for example was a complete flop, made no impact on the market, Nvidia skipped it entirely and AMD moved off of it even in the same generation with the 4770. GDDR5 was obviously more important, and AMD did have an advantage with their experience with the 4770. Nvidia obviously took longer to get their memory controller fixed, but since then they've been able to extract higher performance from it.

    And that's not even getting into AMD's proclivity for going to a leading-edge process node sooner than Nvidia. Negligible performance benefit, certainly more efficiency (except when we are stuck on 28nm), but not much in the way of increased sales, profits, margins etc.
  • testbug00 - Tuesday, May 19, 2015 - link

    They probably also didn't have the engineering set up for it. *rollseyes* For all of NVidia's software superiority in the majority of cases, it is commonly accepted that AMD has far better physical design.

    And, they also co-developed HBM. That probably doesn't hurt!

    Nvidia probably wouldn't have gone with it anyways, but, I don't think they even had the option.
  • chizow - Tuesday, May 19, 2015 - link

    No, the article covers it quite well: AMD tends to move to next-gen commodity processes as soon as possible in an attempt to generate competitive advantage, but unfortunately for them this seldom pays off, typically increasing their risk and exposure without any significant payoff. This is just another example, as HBM1 clearly has limitations and trade-offs related to capacity, cost and supply.

    As for not having the option lol, yeah I am sure SK Hynix developed the process to pander it to only AMD and their measly $300M/quarter in GPU revenue.
  • testbug00 - Tuesday, May 19, 2015 - link

    Next gen process? What does that have to do with HBM again? There you lose me, even with that slight explanation.

    Now, HBM has issues, but supply isn't one of them. Capacity (whether AMD really can make an 8GB card, though a 6GB card would be enough, really) is the real issue. Cost is a lesser one; it can be partially offset, so the extra cost of HBM won't all be eaten by AMD or added to the card. However, the cost will be higher than if the card had 4GB of GDDR5.

    AMD *worked with* SK Hynix to develop this technology. This technology is going to be widely adopted. At least, SK Hynix believed that enough to be willing to push forward with it while only having AMD as a partner (it appears to me). There's obviously some merit with it.
  • chizow - Tuesday, May 19, 2015 - link

    HBM is that next-gen, commodity process....

    How can you say HBM doesn't have supply/yield issues? You really can't say that; in fact, if it follows the rest of the DRAM industry's historical pricing, prices are going to be exponentially higher until production ramps for the mainstream.

    This article already lists out a number of additional costs that HBM carries, including the interposer itself which adds complexity, cost and another point of failure to a fledgling process.
  • testbug00 - Tuesday, May 19, 2015 - link

    Because HBM doesn't bring any areas where you get to reduce cost?
    Currently, it does and will add a net cost, but it also can reduce some costs. *yawn*
  • chizow - Thursday, May 21, 2015 - link

    What? Again, do you think it will cost more, or not? lol.
  • Ranger101 - Wednesday, May 20, 2015 - link

    Lol @ Chizowshill doing what he does best, Nvidia troll carrot still visibly protruding, stenching out the Anandtech forums... thanks for the smiles dude.
