Crysis, BattleForge, Metro 2033, and HAWX

For the sake of completeness we have included both 2560x1600 and 1920x1200 results in our charts. However with current GPU performance a triple-GPU setup only makes sense at 2560, so that’s the resolution we’re going to be focusing on for commentary and scaling purposes.

Crysis is normally our first benchmark, so it’s quite amusing to open with an exact tie. The triple GTX 580 setup and the triple 6970 setup both reach 65.6fps at 2560x1600 with full Enthusiast settings. This is an appropriate illustration of AMD and NVIDIA’s relative performance as of late, as the two are normally very close when it comes to cards at the same price. It’s probably not the best start for the triple GTX 580 though, as it means NVIDIA’s lead at one and two cards has melted away by the third.

We have however finally established what it takes to play Crysis at full resolution on a single monitor with every setting turned up – it takes no fewer than three GPUs to do the job. Given traditional GPU performance growth curves, it should be possible to do this on a single GPU by early 2014 or so, only some 7 years after the release of Crysis: Warhead. If you want SSAA though, you may as well throw in another few years.

Moving on, it’s interesting to note that while we had a tie at 2560 with Enthusiast settings for the average framerate, the same cannot be said of the minimums. At 2560, no matter the quality, AMD has a distinct edge in the minimum framerate. This is particularly pronounced at 2560E, where moving from two to three GPUs causes a drop in the minimum framerate on the GTX 580. This is probably a result of the differences in the cards’ memory capacity: additional GPUs require additional memory, and it seems the GTX 580 with its 1.5GB has reached its limit. We never seriously imagined we’d find a notable difference between 1.5GB and 2GB at this point in time, but here we are.

BattleForge is a shader-bound game that normally favors NVIDIA, and this doesn’t change with three GPUs. However even though it’s one of our more intensive games, three GPUs is simply overkill for one monitor.

Metro 2033 is the only other title in our current lineup that can challenge Crysis for the title of the most demanding game, and here that’s a bout it would win. Even with three GPUs we can’t crack 60fps, and we still haven’t enabled a few extra features such as Depth of Field. The 6970 and GTX 580 are normally close with one and two GPUs, and we see that relationship extend to three GPUs. The triple GTX 580 setup has the lead by under 2fps, but it’s not the lead one normally expects from the GTX 580.

Our next game is HAWX, a title that shifts us towards games that are CPU bound. Even with that it’s actually one of the most electrically demanding games in our test suite, which is why we use it as a backup for our power/temperature/noise testing. Here we see both the triple GTX 580 and triple 6970 crack 200fps at 2560, with the GTX 580 taking top honors.

Multi-GPU scaling at 2560x1600 (performance relative to the smaller configuration)

                  Radeon HD 6970          GeForce GTX 580
GPUs              1->2   2->3   1->3      1->2   2->3   1->3
Crysis G+E Avg    185%   134%   249%      181%   127%   230%
Crysis E          188%   142%   268%      184%   136%   252%
Crysis G+E Min    191%   141%   270%      181%   116%   212%
Crysis E Min      186%   148%   277%      185%    83%   155%
BattleForge       194%   135%   263%      199%   135%   269%
Metro 2033        180%   117%   212%      163%   124%   202%
HAWX              190%   115%   219%      157%   117%   185%
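
As a rough sanity check on the table, the 1->3 column should be close to the product of the 1->2 and 2->3 columns, since scaling steps multiply rather than add. The sketch below assumes nothing beyond the percentages in the table; the function names are illustrative only. The published 1->3 figures were presumably computed from the underlying framerates, so products of the rounded percentages only match to within a couple of points.

```python
# Scaling percentages transcribed from the table above.
# Each tuple is (1->2, 2->3, 1->3) for one benchmark.
SCALING = {
    "Radeon HD 6970": {
        "Crysis G+E Avg": (185, 134, 249),
        "Crysis E": (188, 142, 268),
        "Crysis G+E Min": (191, 141, 270),
        "Crysis E Min": (186, 148, 277),
        "BattleForge": (194, 135, 263),
        "Metro 2033": (180, 117, 212),
        "HAWX": (190, 115, 219),
    },
    "GeForce GTX 580": {
        "Crysis G+E Avg": (181, 127, 230),
        "Crysis E": (184, 136, 252),
        "Crysis G+E Min": (181, 116, 212),
        "Crysis E Min": (185, 83, 155),
        "BattleForge": (199, 135, 269),
        "Metro 2033": (163, 124, 202),
        "HAWX": (157, 117, 185),
    },
}


def chained_scaling(one_to_two, two_to_three):
    """Combine two scaling steps (in percent) into one overall percentage."""
    return one_to_two * two_to_three / 100.0


def largest_gap(table):
    """Largest difference, in percentage points, between the published 1->3
    figure and the product of the two published steps."""
    return max(
        abs(chained_scaling(a, b) - c)
        for games in table.values()
        for (a, b, c) in games.values()
    )


if __name__ == "__main__":
    for card, games in SCALING.items():
        for game, (a, b, c) in games.items():
            print(f"{card:16} {game:15} chained {chained_scaling(a, b):6.1f}%  published {c}%")
    print(f"largest gap: {largest_gap(SCALING):.1f} points")
```

The GTX 580's Crysis E Min row shows why the steps have to be multiplied: a 185% first step followed by an 83% second step lands at roughly 154%, in line with the published 155%.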

Having taken a look at raw performance, what does the scaling situation look like? Altogether it’s very good. Setting aside the CPU-limited HAWX, the weakest game in a dual-GPU configuration for both AMD and NVIDIA is Metro 2033, where AMD gets 180% of a single video card’s performance while NVIDIA manages 163%. At the other end, NVIDIA manages almost perfect scaling in BattleForge at 199%, while AMD’s best showing is in the same game at 194%.

Adding in a 3rd GPU significantly shakes things up however. The best case scenario for going from two GPUs to three GPUs is 150%, which appears to be a harder target to reach. At 142% under Crysis with Enthusiast settings AMD does quite well, which is why they close the overall performance gap there. NVIDIA doesn’t do quite as well however, managing 136%. Among average framerates the weakest for both is HAWX, which is what we’d expect for a game passing 200fps and almost assuredly running straight into a CPU bottleneck.
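
To make the 150% ceiling concrete, here is a minimal sketch of how an ideal scaling figure and a "share of the ideal gain" could be computed. It assumes perfectly linear scaling as the ideal; the function names are hypothetical and not part of any benchmark tool.

```python
def ideal_scaling(gpus_before, gpus_after):
    """Best-case performance ratio, assuming perfectly linear scaling."""
    return gpus_after / gpus_before


def gain_efficiency(observed_pct, gpus_before, gpus_after):
    """Fraction of the ideal *gain* that was actually realized.

    observed_pct is a scaling percentage such as 142 (meaning 142% of the
    performance of the smaller configuration).
    """
    ideal_gain = ideal_scaling(gpus_before, gpus_after) - 1.0
    observed_gain = observed_pct / 100.0 - 1.0
    return observed_gain / ideal_gain


if __name__ == "__main__":
    print(f"ideal 2->3 scaling: {ideal_scaling(2, 3):.0%}")
    print(f"6970, Crysis E, 2->3:    {gain_efficiency(142, 2, 3):.0%} of the ideal gain")
    print(f"GTX 580, Crysis E, 2->3: {gain_efficiency(136, 2, 3):.0%} of the ideal gain")
```

By this measure, a 142% result going from two to three GPUs captures about 84% of the possible gain, while 136% captures about 72%.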

The Crysis minimum framerate gives us a moment’s pause though. AMD gets almost perfect scaling moving from two to three GPUs when it comes to minimum framerates in Crysis, while NVIDIA ends up losing performance with Enthusiast settings. This is likely less a story of GPU scaling and more a story about GPU memory, but regardless the outcome is a definite hit in performance. Minimum framerate scaling from one to two GPUs is rather close between NVIDIA and AMD with full Enthusiast settings, and slightly in AMD’s favor with Gamer + Enthusiast. Going from two to three GPUs, however, AMD has a definite advantage at both quality settings.

Sticking with average framerates and throwing out a clearly CPU limited HAWX, neither side seems to have a strong advantage moving from two GPUs to three GPUs; the average scaling is 131%, or some 62% of the theoretical maximum gain. AMD does have a slight edge here, but keep in mind we’re looking at percentages, so AMD’s edge is often a couple of frames per second at best.
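
The 131% figure can be approximately reproduced from the rounded 2->3 percentages in the scaling table. The sketch below assumes a simple arithmetic mean over the four average-framerate results per vendor (both Crysis settings, BattleForge, and Metro 2033), with HAWX excluded; the original figure may have been derived from unrounded data.

```python
from statistics import mean

# 2->3 GPU scaling (percent) for average framerates, HAWX excluded.
TWO_TO_THREE = {
    "Radeon HD 6970": {
        "Crysis G+E Avg": 134,
        "Crysis E": 142,
        "BattleForge": 135,
        "Metro 2033": 117,
    },
    "GeForce GTX 580": {
        "Crysis G+E Avg": 127,
        "Crysis E": 136,
        "BattleForge": 135,
        "Metro 2033": 124,
    },
}


def vendor_mean(table, card):
    """Arithmetic mean of one card's scaling percentages."""
    return mean(table[card].values())


def overall_mean(table):
    """Arithmetic mean over every result from every card."""
    return mean(pct for games in table.values() for pct in games.values())


def share_of_ideal_gain(scaling_pct, ideal_pct=150.0):
    """Realized gain as a fraction of the ideal gain (150% for 2->3 GPUs)."""
    return (scaling_pct - 100.0) / (ideal_pct - 100.0)


if __name__ == "__main__":
    for card in TWO_TO_THREE:
        print(f"{card}: {vendor_mean(TWO_TO_THREE, card):.1f}%")
    avg = overall_mean(TWO_TO_THREE)
    print(f"overall: {avg:.2f}% ({share_of_ideal_gain(avg):.1%} of the ideal gain)")
```

With the rounded table values, AMD's mean comes out slightly ahead of NVIDIA's, consistent with the slight edge described above.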

Going from one GPU to two GPUs also gives AMD a minor advantage, with the average performance being 186% for AMD versus 182% for NVIDIA. Much like we’ve seen in our individual GPU reviews though, this almost constantly flip-flops based on the game being tested, which is why in the end the average gains are so close.


97 Comments


  • DanNeely - Sunday, April 3, 2011

    Does DNetc not have 4xx/5xx nVidia applications yet?
  • Pirks - Sunday, April 3, 2011

    They have CUDA 3.1 clients that work pretty nice with Fermi cards. Except that AMD cards pwn them violently, we're talking about an order of magnitude difference between 'em. Somehow RC5-72 code executes 10x faster on AMD than on nVidia GPUs, I could never find precise explanation why, must be related to poor match between RC5 algorithm and nVidia GPU architecture or something.

    I crack RC5-72 keys on my AMD 5850 and it's almost 2 BILLION keys per second. Out of 86,000+ participants my machine is ranked #43 from the top (in daily stats graph but still, man...#43! I gonna buy two 5870 sometime and my rig may just make it to top 10!!! out of 86,000!!! this is UNREAL man...)

    On my nVidia 9800 GT I was cracking like 146 million keys per second, this very low rate is soo shameful compared to AMD :)))
  • DanNeely - Monday, April 4, 2011

    It's not just dnetc, ugly differences in performance also show up in the milkyway@home and collatz conjecture projects on the boinc platform. They're much larger than the 1/8 vs 1/5 (1/4 in 69xx?) differences in FP64/FP32 would justify; IIRC both are about 5:1 in AMD's favor.
  • Ryan Smith - Sunday, April 3, 2011

    I love what the Dnet guys do with their client, and in the past it's been a big help to us in our articles, especially on the AMD side.

    With that said, it's a highly hand optimized client that almost perfectly traces theoretical performance. It doesn't care about cache, it doesn't care about memory bandwidth; it only cares about how many arithmetic operations can be done in a second. That's not very useful to us; it doesn't tell us anything about the hardware.

    We want to stick to distributed computing clients that have a single binary for both platforms, so that we're looking at the performance of a common OpenCL/DirectCompute codepath and how it performs on two different GPUs. The Dnet client just doesn't meet that qualification.
  • tviceman - Sunday, April 3, 2011

    Ryan are you going to be using the nvidia 270 drivers in future tests? I know they're beta, but it looks like you aren't using WHQL AMD drivers either (11.4 preview).
  • Ryan Smith - Sunday, April 3, 2011

    Yes, we will. The benchmarking for this article was actually completed shortly after the GTX 590, so by the time NVIDIA released the 270 drivers it was already being written up.
  • ajp_anton - Sunday, April 3, 2011

    When looking at the picture of those packed sardines, I had an idea.
    Why don't the manufacturers make the radial fan hole go all the way through the card? With three or four cards tightly packed, the middle card(s) will still have some air coming through the other cards, assuming the holes are aligned.
    Even with only one or two cards, the (top) card will have access to more fresh air than before.
  • semo - Sunday, April 3, 2011

    Correct me if I'm wrong but the idea is to keep the air flowing through the shroud body and not through and through the fans. I think this is a moot point though as I can't see anyone using a 3x GPU config without water cooling or something even more exotic.
  • casteve - Sunday, April 3, 2011

    "It turns out adding a 3rd card doesn’t make all that much more noise."

    Yeah, I guess if you enjoy 60-65dBA noise levels, the 3rd card won't bother you. Wouldn't it be cheaper to just toss a hairdryer inside your PC? You'd get the same level of noise and room heater effect. ;)
  • slickr - Sunday, April 3, 2011

    I mean mass effect 2 is a console port, Civilisation 5 is the worst game to choose to benchmark as its a turn based game and not real time and Hawx is 4 years outdated and basically game that Nvidia made, its not even funny anymore seeing how it gives the advantage to Nvidia cards every single time.

    Replace with:
    Crysis warhead to Aliens vs predator
    battleforge to shogun 2 total war
    hawx to Shift 2
    Civilization 5 to Starcraft 2
    Mass Effect 2 with Dead Space 2
    Wolfeinstein to Arma 2
    +add Mafia 2
