Cayman: The Last 32nm Castaway

With the launch of the Barts GPU and the 6800 series, we touched on the fact that AMD had been counting on TSMC's 32nm process to give them a half-node shrink to carry them into 2011. When TSMC fell behind schedule on the 40nm process, and then on the 32nm process before canceling it outright, AMD had to start moving on plans for a new generation of 40nm products instead.

The 32nm predecessor of Barts was among the earlier projects to be sent back to 40nm. Even before 32nm was canceled, TSMC's pricing was set to make the new process more expensive per transistor than 40nm, a problem for a mid-range part where AMD had specific margins they wanted to hit. Had Barts been made on the 32nm process as projected, it would have cost more to produce than the 40nm version, even though the 32nm die would have been smaller. Thus 32nm was uneconomical for this class of gaming GPU, and Barts was moved to the 40nm process.

Cayman, on the other hand, was going to be a high-end part. Being uneconomical is certainly undesirable, but high-end parts carry high margins, especially if they can be sold in the professional market as compute products (just ask NVIDIA). As such, while Barts went to 40nm, Cayman's predecessor stayed on the 32nm process until the very end. The Cayman team did begin planning a move back to 40nm before TSMC officially canceled the 32nm process, but given the choice at the time, AMD would rather have built Cayman at 32nm.

As a result the Cayman we're seeing today is not what AMD originally envisioned as a 32nm part. AMD won't tell us everything they had to give up to create the 40nm Cayman (there have to be a few surprises left for 28nm), but we do know a few things. First and foremost was size: AMD's small die strategy is not dead, but getting the boot from the 32nm process does take the wind out of it. At 389mm² Cayman is the largest AMD GPU since the disastrous R600, and well off the sub-300mm² size that the small die strategy dictates. In terms of efficient use of space, though, AMD is doing quite well; Cayman packs 2.64 billion transistors, 500 million more than Cypress's 2.15 billion. AMD was able to fit 23% more transistors into only 16% more space.
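
As a quick back-of-the-envelope check on those figures (a sketch only; the Cypress numbers of 2.15 billion transistors and 334mm² are AMD's published specs):

    # Transistor density check: Cypress vs. Cayman, per AMD's published specs
    cypress_transistors, cypress_area_mm2 = 2.15e9, 334
    cayman_transistors,  cayman_area_mm2  = 2.64e9, 389

    print(f"Transistors: +{cayman_transistors / cypress_transistors - 1:.0%}")  # ~ +23%
    print(f"Die area:    +{cayman_area_mm2 / cypress_area_mm2 - 1:.0%}")        # ~ +16%

    # Net packing improvement achieved on the same 40nm node
    density_gain = (cayman_transistors / cayman_area_mm2) / (cypress_transistors / cypress_area_mm2)
    print(f"Density:     {density_gain:.2f}x")                                  # ~ 1.05x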

Even then, just reaching that die size was a compromise between features and production costs. AMD didn't simply settle for a larger GPU; they had to give things up to keep it from growing larger still. SIMDs were on the chopping block: the 32nm Cayman would have had more SIMDs, and with them more performance. Features were also lost, and this is where AMD is keeping mum. We do know PCI Express 3.0 functionality was scheduled for the 32nm part; for the 40nm Cayman, AMD traded their PCIe 3.0 controller for a smaller PCIe 2.1 controller to claw back some of the die size difference. In all honesty this may have worked out better for them: PCIe 3.0 ended up being delayed until November, so suitable motherboards are still at least months away.

The end result is that Cayman as we know it is a compromise made to get it out the door on 40nm. AMD got their new VLIW4 architecture, but they had to give up performance and an unknown number of features to get there. On the flip side, this will make 28nm all the more interesting, as we'll get to see many of the features that were supposed to arrive in 2010 but never did.

Comments

  • MeanBruce - Wednesday, December 15, 2010 - link

    TechPowerUp.com shows the 6850 as 95% faster, almost double the performance of the 4850, and 100% more efficient than the 4850 at 1920x1200. I'm also upgrading an old 4850; as for the 6950, check their charts when they go up later today.
  • mapesdhs - Monday, December 20, 2010 - link

    Today I will have completed my benchmark pages comparing the 4890, 8800 GT and GTX 460 1GB (at 800 and 850 core speeds), in both single-card and CF/SLI configurations, across a range of tests. You should be able to extrapolate between known 4850/4890 differences, the data I've accumulated, and known GTX 460 vs. 68xx/69xx differences (bearing in mind I'm testing with 460s at much higher core clocks than the 675MHz reference speed used in this article). Email me at mapesdhs@yahoo.com and I'll send you the URL once the data is up. I'm testing with 3DMark06, Unigine (Heaven, Tropics and Sanctuary), X3TC, Stalker COP, Cinebench, Viewperf and PT Boats. Later I'll also test with Vantage, 3DMark11 and AvP.

    Ian.
  • ZoSo - Wednesday, December 15, 2010 - link

    Helluva 'bang for the buck', that's for sure! Currently I'm running a 5850, but I have been toying with the idea of SLI or CF. For a $300 difference, CF is the way to go at this point.
    I'm in no rush; I'm going to wait at least a month or two before I pull any triggers ;)
  • RaistlinZ - Wednesday, December 15, 2010 - link

    I'm a bit underwhelmed from a performance standpoint. I see nothing that will make me want to upgrade from my trusty 5870.

    I would like to see a 2x6950 vs 2x570 comparison though.
  • fausto412 - Wednesday, December 15, 2010 - link

    Exactly my feelings.

    It's like thinking Miss Universe is about to screw you and then you find out it's her mom... who's probably still hot, but def not Miss Universe.
  • Paladin1211 - Wednesday, December 15, 2010 - link

    CF scaling is truly amazing now; I'm glad nVidia finally has something to catch up to in terms of drivers. Meanwhile, the ATI wrong-refresh-rate bug is still not fixed: it's stuck at 60Hz where the monitor can do 75Hz. "Refresh force", "refresh lock", "ATI refresh fix", disabling/enabling EDID, manually setting monitor attributes in CCC, an EDID hack... nothing works. Even the "HUGE" 10.12 driver can't get my friend's old Samsung SyncMaster 920NW to work at its native 1440x900@75Hz, in either XP 32-bit or Win 7 64-bit. My next monitor will be a 120Hz model for sure, and I don't want to risk ruining my investment, AMD.
  • mapesdhs - Monday, December 20, 2010 - link

    I'm not sure if this will help fix the refresh issue (I do the following to fix max-res limits), but try downloading the drivers for the monitor and modifying the data file before installing them. Check to ensure it lists the monitor's genuine maximum resolution and/or maximum refresh rate.

    I've been using various CRT models which share the same Sony tube that can do 2048x1536, but every vendor selling models based on this tube ships drivers that limit the max res to 1800x1440 by default, so I edit the file to enable 2048x1536 and then it works fine, e.g. on the HP P1130 (a sketch of this kind of edit follows below).

    Bit daft that a monitor's own drivers don't by default let one exploit the monitor to its maximum potential.

    Anyway, good luck!!

    Ian.
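
    To illustrate the kind of .inf edit Ian describes, here is a hypothetical sketch; the section name and scan ranges are illustrative, not copied from a real HP P1130 driver. A Windows monitor .inf caps both the advertised maximum resolution and the allowable scan ranges, so both need raising:

        ; Hypothetical monitor .inf excerpt (values illustrative only).
        ; MaxResolution caps the modes Windows will offer; the MODES key
        ; holds the horizontal scan range (kHz) and refresh range (Hz).
        [P1130.AddReg]
        HKR,,MaxResolution,,"1800,1440"        ; raise to "2048,1536"
        HKR,"MODES\1800x1440",Mode1,,"30.0-121.0,50.0-160.0"
        ; rename the key to MODES\2048x1536 (widening the ranges if
        ; needed) so the tube's true maximum mode becomes selectable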
  • techworm - Wednesday, December 15, 2010 - link

    Future DX11 games will stress the GPU and video RAM increasingly hard, and it's then that the 6970 will shine. It's obvious the 6970 is a better, more future-proof purchase than the GTX 570, which will be frame-buffer limited in near-future games.
  • Nickel020 - Wednesday, December 15, 2010 - link

    In the table showing whether PowerTune affects an application or not, there's a yes for 3DMark, but in the text you mention two applications saw throttling (with 3DMark it would be three). Is this an error?

    Also, you should maybe note that you're measuring whole-system power in the PowerTune tables; it might be confusing for people who don't read your reviews very often to see that the measured power draw is way higher than the PowerTune level.

    Reading the rest now :)
  • stangflyer - Wednesday, December 15, 2010 - link

    Sold my 5970 waiting for the 6990. With my 5970 playing games at 5040x1050, I would always have a fourth extended monitor hooked up to a Tritton UVE-150 USB-to-VGA adapter. This let me game while the fourth monitor displayed my TeamSpeak, Afterburner, and various other things.
    My question is this: can I use the new 6950/6970 for triple-monitor gaming and also run a fourth screen extended at the same time? I have three matching Dell native DisplayPort monitors and a fourth with VGA/DVI. Can I use the two DPs and the two DVIs on the 6970 simultaneously? I have been looking for this answer for hours and can't find it! Thanks for the help.
