Civilization V

Our final game, Civilization 5, gives us an interesting look at things that other RTSes cannot match, with a much weaker focus on shading in the game world, and a much greater focus on creating the geometry needed to bring such a world to life. In doing so it uses a slew of DirectX 11 technologies, including tessellation for said geometry, driver command lists for reducing CPU overhead, and compute shaders for on-the-fly texture decompression.

Civilization V

Civilization V

Because of the fact that Civilization V uses driver command lists, we were originally not going to include it in this benchmark suite as a gaming benchmark. If it were solely a DCL test it would do a good job highlighting the fact that AMD doesn’t currently support the feature, but a poor job of actually explaining any hardware/architectural differences.  It was only after we saw AMD’s reviewer’s guide that we decided to go ahead and include it, because quite frankly we didn’t believe the numbers AMD had published.

With the GTX 580 and the 6970, the 6970 routinely lost to the GTX 580 by large margins. We had long assumed this was solely due to NVIDIA’s inclusion of DCLs, as we’ve seen a moderate single-GPU performance deficit and equally moderate multi-GPU lead for AMD melt away when NVIDIA added DCL support. The 7970 required that we rethink this.

If Civilization V was solely a DCL test, then our 2560 results would be impossible – the 7970 is winning by 12% in a game NVIDIA previous won by a massive margin. NVIDIA only regains their lead at 1680, which at this resolution we’re not nearly as likely to be GPU-bound.

So what changed? AMD has yet to spill the beans, but short of a secret DCL implementation for just CivV we have to look elsewhere. Next to DCL CivV’s other killer feature is its use of compute shaders, and GCN is a compute architecture. To that extent we believe at this point that while AMD is still facing some kind of DCL bottleneck, they have completely opened the floodgates on whatever compute shader bottleneck was standing in their way before. This is particularly evident when comparing the 7970 to the 6970, where the 7970 enjoys a consistent 62% performance advantage. It’s simply an incredible turnabout to see the 7970 do so well when the 6970 did so poorly.

Of course if this performance boost really was all about compute shaders, it raises a particularly exciting question: just how much higher could AMD go if they had DCLs? Hopefully one day that’s an answer we get to find out.

Starcraft II Compute: The Real Reason for GCN
Comments Locked

292 Comments

View All Comments

  • CrystalBay - Thursday, December 22, 2011 - link

    Hi Ryan , All these older GPUs ie (5870 ,gtx570 ,580 ,6950 were rerun on the new hardware testbed ? If so GJ lotsa work there.
  • FragKrag - Thursday, December 22, 2011 - link

    The numbers would be worthless if he didn't
  • Anand Lal Shimpi - Thursday, December 22, 2011 - link

    Yep they're all on the new testbed, Ryan had an insane week.

    Take care,
    Anand
  • Lifted - Thursday, December 22, 2011 - link

    How many monitors on the market today are available at this resolution? Instead of saying the 7970 doesn't quite make 60 fps at a resolution maybe 1% of gamers are using, why not test at 1920x1080 which is available to everyone, on the cheap, and is the same resolution we all use on our TV's?

    I understand the desire (need?) to push these cards, but I think it would be better to give us results the vast majority of us can relate to.
  • Anand Lal Shimpi - Thursday, December 22, 2011 - link

    The difference between 1920 x 1200 vs 1920 x 1080 isn't all that big (2304000 pixels vs. 2073600 pixels, about an 11% increase). You should be able to conclude 19x10 performance from looking at the 19x12 numbers for the most part.

    I don't believe 19x12 is pushing these cards significantly more than 19x10 would, the resolution is simply a remnant of many PC displays originally preferring it over 19x10.

    Take care,
    Anand
  • piroroadkill - Thursday, December 22, 2011 - link

    Dell U2410, which I have :3

    and Dell U2412M
  • piroroadkill - Thursday, December 22, 2011 - link

    Oh, and my laptop is 1920x1200 too, Dell Precision M4400.
    My old laptop is 1920x1200 too, Dell Latitude D800..
  • johnpombrio - Wednesday, December 28, 2011 - link

    Heh, I too have 3 Dell U2410 and one Dell 2710. I REALLY want a Dell 30" now. My GTX 580 seems to be able to handle any of these monitors tho Crysis High-Def does make my 580 whine on my 27 inch screen!
  • mczak - Thursday, December 22, 2011 - link

    The text for that test is not really meaningful. Efficiency of ROPs has almost nothing to do at all with this test, this is (and has always been) a pure memory bandwidth test (with very few exceptions such as the ill-designed HD5830 which somehow couldn't use all its theoretical bandwidth).
    If you look at the numbers, you can see that very well actually, you can pretty much calculate the result if you know the memory bandwidth :-). 50% more memory bandwidth than HD6970? Yep, almost exactly 50% more performance in this test just as expected.
  • Ryan Smith - Thursday, December 22, 2011 - link

    That's actually not a bad thing in this case. AMD didn't go beyond 32 ROPs because they didn't need to - what they needed was more bandwidth to feed the ROPs they already had.

Log in

Don't have an account? Sign up now