Cayman: The Last 32nm Castaway

With the launch of the Barts GPU and the 6800 series, we touched on the fact that AMD was counting on the 32nm process to give them a half-node shrink to take them into 2011. When TSMC fell behind schedule on the 40nm process, and then on the 32nm process before canceling it outright, AMD had to start moving on plans for a new generation of 40nm products instead.

The 32nm predecessor of Barts was among the earlier projects to be sent to 40nm. Even before 32nm was canceled, TSMC’s pricing was going to make 32nm more expensive per transistor than 40nm, a problem for a mid-range part where AMD has specific margins they’d like to hit. Had Barts been made on the 32nm process as projected, it would have cost more to make than on the 40nm process, even though the 32nm version would have been smaller. Thus 32nm was uneconomical for gaming GPUs, and Barts was moved to the 40nm process.

Cayman, on the other hand, was going to be a high-end part. Being uneconomical is certainly undesirable, but high-end parts carry high margins, especially if they can be sold in the professional market as compute products (just ask NVIDIA). As such, while Barts went to 40nm, Cayman’s predecessor stayed on the 32nm process until the very end. The Cayman team did begin planning to move back to 40nm before TSMC officially canceled the 32nm process, but had AMD had a choice at the time, they would rather have had Cayman on the 32nm process.

As a result, the Cayman we’re seeing today is not what AMD originally envisioned as a 32nm part. AMD won’t tell us everything that they had to give up to create the 40nm Cayman (there have to be a few surprises for 28nm), but we do know a few things. First and foremost was size; AMD’s small die strategy is not dead, but getting the boot from the 32nm process does take the wind out of it. At 389mm² Cayman is the largest AMD GPU since the disastrous R600, and well off the sub-300mm² size that the small die strategy dictates. In terms of efficient usage of space though, AMD is doing quite well; Cayman has 2.64 billion transistors, 500 million more than Cypress. AMD was able to pack 23% more transistors in only 16% more space.
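To make the density comparison concrete, here is a small sketch of the arithmetic. The Cayman figures (2.64 billion transistors, 389mm², roughly 500 million more transistors than Cypress) come from the text above; the Cypress die area of 334mm² is an assumption taken from the commonly cited figure, since this page does not state it, and the function names are purely illustrative.

```python
# Figures from the article: Cayman has 2.64 billion transistors on a
# 389 mm^2 die, about 500 million more transistors than Cypress.
CAYMAN_TRANSISTORS = 2.64e9
CAYMAN_AREA_MM2 = 389.0
CYPRESS_TRANSISTORS = CAYMAN_TRANSISTORS - 0.5e9

# Assumption: 334 mm^2 is the commonly cited Cypress die size;
# it is not stated on this page.
CYPRESS_AREA_MM2 = 334.0


def percent_increase(new, old):
    """Return how much larger `new` is than `old`, in percent."""
    return (new / old - 1.0) * 100.0


def density_mtr_per_mm2(transistors, area_mm2):
    """Return transistor density in millions of transistors per mm^2."""
    return transistors / 1e6 / area_mm2


transistor_gain = percent_increase(CAYMAN_TRANSISTORS, CYPRESS_TRANSISTORS)
area_gain = percent_increase(CAYMAN_AREA_MM2, CYPRESS_AREA_MM2)
density_gain = percent_increase(
    density_mtr_per_mm2(CAYMAN_TRANSISTORS, CAYMAN_AREA_MM2),
    density_mtr_per_mm2(CYPRESS_TRANSISTORS, CYPRESS_AREA_MM2),
)

print(f"Transistor count increase: {transistor_gain:.1f}%")
print(f"Die area increase:         {area_gain:.1f}%")
print(f"Density increase:          {density_gain:.1f}%")
```

Under these assumptions the transistor count grows faster than the die area, so the density on the same 40nm process improves by a few percent. That is the sense in which AMD is using its space more efficiently.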

Even then, just reaching that die size is a compromise between features and production costs. AMD didn’t simply settle for a larger GPU; they also had to give up some things to keep it from being even larger. SIMDs were on the chopping block; 32nm Cayman would have had more SIMDs for more performance. Features were also lost, and this is where AMD is keeping mum. We know PCI Express 3.0 functionality was scheduled for the 32nm part, but AMD had to give up their PCIe 3.0 controller for a smaller 2.1 controller to make up the die size difference. This in all honesty may have worked out better for them: PCIe 3.0 ended up being delayed until November, so suitable motherboards are still months away at the least.

The end result is that Cayman as we know it is a compromise made to get it out on 40nm. AMD got their new VLIW4 architecture, but they had to give up performance and an unknown number of features to get there. On the flip side this will make 28nm all the more interesting, as we’ll get to see many of the features that were supposed to make it for 2010 but never arrived.


167 Comments


  • fausto412 - Wednesday, December 15, 2010

    6970 just 4 to 6 fps faster in Bad Company 2 than my 5870? WTF!

    not worth the upgrade. what a lame ass successor.
  • Kibbles - Wednesday, December 15, 2010

    It's 7% faster at 1920 and 9% faster at 2560. BC2 obviously doesn't need the extra GPU power at 1680.

    I wouldn't call it weak, but this card certainly isn't the clear winner that the 5870 was.
  • fausto412 - Wednesday, December 15, 2010

    its weak if i was expecting a response to the gtx580 to upgrade to.

    may as well stay with my 5870.
  • ClownPuncher - Wednesday, December 15, 2010

    For now... But who really bases their purchase on one game anymore? It looks like 10.12 or 11.1 drivers will help performance a good amount.
  • fausto412 - Wednesday, December 15, 2010

    I base my performance on 1 game...because it is a very taxing game and my #1 game right now.
  • MeanBruce - Wednesday, December 15, 2010

    Yup, dude I heard the AMD 7000 series might make an early appearance next July, with the die shrink @28nm you might want to wait and pick up a 7970!
  • fausto412 - Wednesday, December 15, 2010

    that's what i'm considering now. need to upgrade for 30% more performance than 5870 for it to make sense.
  • Stuka87 - Wednesday, December 15, 2010

    The game is CPU limited at lower resolutions. BC2 is known for being more CPU bound than GPU bound.

    But I was hoping for a larger jump over the previous cards :/
  • fausto412 - Wednesday, December 15, 2010

    I understand BFBC2 is more cpu bound. But in this testing Anandtech did they use a TOP TOP TOP of the line cpu so that rules that out as a bottleneck.
  • Belard - Wednesday, December 15, 2010

    Yeah... at least the model numbers didn't make things confusing!

    In some benchmarks, the 6950 is faster than your 5870... but it would have made far more sense to call these 6850/6870 or even 6830/6850..

    AMD screwed up with the new names...
