Civ V, Battlefield, STALKER, and DIRT 2

Civilization V continues to be the oddball among our benchmarks. It started out as a title with low framerates and poor multi-GPU scaling, but in recent months AMD and NVIDIA have rectified this somewhat. As a result it’s now possible to crack 60fps at 2560 with a pair of high-end GPUs, albeit with some difficulty. In our experience Civ V is a hybrid-bottlenecked game: we have every reason to believe it’s bottlenecked by the CPU at certain points, but the disparity between NVIDIA’s and AMD’s performance indicates there’s a big difference in how the two are setting things up under the hood.

When we started using Bad Company 2 a year ago it was actually a rather demanding benchmark; anything above 60fps at 2560 required SLI/CF. Today that’s still true, but at 52fps the GTX 580 comes close to closing that gap. On the flip side, two GPUs push scores well past 60fps, and three GPUs will push that over 120fps. Now if we could just get a 120Hz 2560 monitor…

The Bad Company 2 Waterfall benchmark is our other minimum framerate benchmark, as it provides very consistent results. NVIDIA normally does well here with a single GPU, but with two GPUs the gap closes to the point where NVIDIA may be CPU limited, as indicated by our GTX 580 SLI/GTX 590 scores. At three GPUs AMD falls just short of a 60fps minimum, while the triple GTX 580 setup actually loses performance, which would indicate uneven scaling for NVIDIA with three GPUs.

STALKER is another title that is both shader heavy and potentially VRAM-intensive. When moving from 1GB cards to 2GB cards we’ve seen the average framerate climb a respectable amount, which may be why AMD does so well here with multiple GPUs given the 512MB advantage in VRAM. With three GPUs the GTX 580 can crack 60fps, but the 6970 can clear 90fps.

We’ve seen DiRT 2 become CPU limited with two GPUs at 1920, so it shouldn’t come as a surprise that with three GPUs a similar thing happens at 2560. Although we can never be 100% sure that we’re CPU limited versus just seeing poor scaling, the fact that our framerates top out at only a few FPS above our top 1920 scores is a solid sign of this.

                            Radeon HD 6970          GeForce GTX 580
GPUs                      1->2   2->3   1->3       1->2   2->3   1->3
Civilization V            168%    99%   167%       170%    95%   160%
Battlefield: BC2 Chase    200%   139%   278%       189%   129%   246%
Battlefield: BC2 Water    206%   131%   272%       148%    85%   125%
STALKER: CoP              189%   121%   231%       149%   104%   157%
DiRT 2                    181%   120%   219%       177%   105%   186%

So what does multi-GPU scaling look like in this batch of games? The numbers favor AMD at this point, particularly thanks to STALKER. Throwing out the CPU-limited DiRT 2, the average framerate gain for an AMD card moving from one GPU to two is 185%; NVIDIA’s gain under the same circumstances is only 169%.
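For reference, these averages (and the triple-GPU averages discussed below) can be roughly reproduced from the table above. Here is a minimal sketch, assuming the averages cover only the three average-framerate tests (Civilization V, the Bad Company 2 Chase scene, and STALKER), with DiRT 2 and the Waterfall minimum-framerate results left out; the exact method isn't spelled out in the text, so treat this as an approximation:

```python
# Per-game scaling factors from the table above, in percent of the
# previous configuration's framerate.
scaling = {
    # game:                  (AMD 1->2, AMD 2->3, NV 1->2, NV 2->3)
    "Civilization V":         (168,  99, 170,  95),
    "Battlefield: BC2 Chase": (200, 139, 189, 129),
    "STALKER: CoP":           (189, 121, 149, 104),
}

def average(column):
    values = [row[column] for row in scaling.values()]
    return sum(values) / len(values)

print(f"AMD    1->2: {average(0):.0f}%   2->3: {average(1):.0f}%")
print(f"NVIDIA 1->2: {average(2):.0f}%   2->3: {average(3):.0f}%")

# This lands within about a point of the figures in the text:
# AMD ~186%/120% and NVIDIA ~169%/109%, vs. the quoted 185%/120%
# and 169%/110%. Note also that the cumulative 1->3 column is roughly
# the product of the two steps, e.g. Civ V on AMD: 1.68 * 0.99 ~= 1.66,
# against the 167% listed in the table.
```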

For the case of two GPUs, AMD’s worst showing is Civilization V at 168%, while for NVIDIA it’s STALKER at 149%. In Civilization V the similar gains (168% vs. 170% for NVIDIA) hide the fact that the GTX 580 already starts out at a much better framerate, so while the gains are comparable the final performance is not. STALKER meanwhile presents us with an interesting case where the GTX 580 and Radeon HD 6970 start out close and end up far apart; AMD has both the scaling and the performance advantage thanks to NVIDIA’s limited gains here.

As for scaling with three GPUs, as was the case with two GPUs the results are in AMD’s favor. We still see some weak scaling at times – or none at all, as in the case of Civilization V – but AMD’s average gain of 120% over a dual-GPU configuration isn’t too bad. NVIDIA’s average gain of 110% is effectively only half of that improvement, owing to an even larger performance loss in Civilization V and almost no gain in STALKER. Battlefield: Bad Company 2 is the only title where NVIDIA sees significant gains, and while the specter of CPU limits always looms overhead, I’m not sure what’s going on in STALKER for NVIDIA; perhaps we’re looking at the limits of 1.5GB of VRAM?

Looking at minimum framerates in Battlefield: Bad Company 2, the situation is strongly in AMD’s favor with both two and three GPUs, as AMD scales practically perfectly with two GPUs and relatively well with three. I strongly believe this has more to do with the game than the technology, but at the end of the day NVIDIA’s poor triple-GPU scaling in this benchmark really puts a damper on things.

Comments

  • DanNeely - Sunday, April 3, 2011 - link

    Does DNetc not have 4xx/5xx nVidia applications yet?
  • Pirks - Sunday, April 3, 2011 - link

    They have CUDA 3.1 clients that work pretty nice with Fermi cards. Except that AMD cards pwn them violently, we're talking about an order of magnitude difference between 'em. Somehow RC5-72 code executes 10x faster on AMD than on nVidia GPUs, I could never find precise explanation why, must be related to poor match between RC5 algorithm and nVidia GPU architecture or something.

    I crack RC5-72 keys on my AMD 5850 and it's almost 2 BILLION keys per second. Out of 86,000+ participants my machine is ranked #43 from the top (in the daily stats graph, but still, man...#43! I'm gonna buy two 5870s sometime and my rig may just make it to the top 10!!! out of 86,000!!! this is UNREAL man...)

    On my nVidia 9800 GT I was cracking like 146 million keys per second, this very low rate is soo shameful compared to AMD :)))
  • DanNeely - Monday, April 4, 2011 - link

    It's not just dnetc; ugly differences in performance also show up in the milkyway@home and Collatz Conjecture projects on the BOINC platform. They're much larger than the 1/8 vs. 1/5 (1/4 in 69xx?) FP64:FP32 ratios would justify; IIRC both are about 5:1 in AMD's favor.
  • Ryan Smith - Sunday, April 3, 2011 - link

    I love what the Dnet guys do with their client, and in the past it's been a big help to us in our articles, especially on the AMD side.

    With that said, it's a highly hand optimized client that almost perfectly traces theoretical performance. It doesn't care about cache, it doesn't care about memory bandwidth; it only cares about how many arithmetic operations can be done in a second. That's not very useful to us; it doesn't tell us anything about the hardware.

    We want to stick to distributed computing clients that have a single binary for both platforms, so that we're looking at the performance of a common OpenCL/DirectCompute codepath and how it performs on two different GPUs. The Dnet client just doesn't meet that qualification.
  • tviceman - Sunday, April 3, 2011 - link

    Ryan are you going to be using the nvidia 270 drivers in future tests? I know they're beta, but it looks like you aren't using WHQL AMD drivers either (11.4 preview).
  • Ryan Smith - Sunday, April 3, 2011 - link

    Yes, we will. The benchmarking for this article was actually completed shortly after the GTX 590 launch, so by the time NVIDIA released the 270 drivers it was already being written up.
  • ajp_anton - Sunday, April 3, 2011 - link

    When looking at the picture of those packed sardines, I had an idea.
    Why don't the manufacturers make the radial fan hole go all the way through the card? With three or four cards tightly packed, the middle card(s) will still have some air coming through the other cards, assuming the holes are aligned.
    Even with only one or two cards, the (top) card will have access to more fresh air than before.
  • semo - Sunday, April 3, 2011 - link

    Correct me if I'm wrong, but the idea is to keep the air flowing through the shroud body rather than passing straight through the fans. I think this is a moot point though, as I can't see anyone using a 3x GPU config without water cooling or something even more exotic.
  • casteve - Sunday, April 3, 2011 - link

    "It turns out adding a 3rd card doesn’t make all that much more noise."

    Yeah, I guess if you enjoy 60-65dBA noise levels, the 3rd card won't bother you. Wouldn't it be cheaper to just toss a hairdryer inside your PC? You'd get the same level of noise and room heater effect. ;)
  • slickr - Sunday, April 3, 2011 - link

    I mean Mass Effect 2 is a console port, Civilization 5 is the worst game to choose to benchmark as it's a turn-based game and not real time, and HAWX is 4 years outdated and basically a game that Nvidia made; it's not even funny anymore seeing how it gives the advantage to Nvidia cards every single time.

    Replace with:
    Crysis Warhead to Aliens vs. Predator
    BattleForge to Shogun 2: Total War
    HAWX to Shift 2
    Civilization 5 to StarCraft 2
    Mass Effect 2 to Dead Space 2
    Wolfenstein to ArmA 2
    +add Mafia 2
