Compute

Shifting gears, as always our final set of benchmarks is a look at compute performance. As we saw with the GTX 680, GK104 appears to be significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Cache and register file pressure in particular seem to give GK104 grief, which means that GK104 can still do well in certain scenarios, but falls well short in others.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.

It’s quite shocking to see the GTX 670 do so well here. For sure it’s struggling relative to the Radeon HD 7900 series and the GTX 500 series, but compared to the GTX 680 it’s only trailing by 4%. This is a test that should cause the gap between the two cards to open up due to the lack of shader performance, but clearly that is not the case. Perhaps we’ve been underestimating the memory bandwidth needs of this test? If that’s the case, AMD’s significant memory bandwidth advantage certainly helps to cement the 7970’s lead.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

SmallLuxGPU on the other hand finally shows us that larger gap we’ve been expecting between the GTX 670 and GTX 680. The GTX 680’s larger number of SMXes and higher clockspeed cause the GTX 670 to fall behind by 10%, performing worse than the GTX 570 or even the GTX 470. More so than any other test, this is the test that drives home the point that GK104 isn’t a strong compute GPU while AMD offers nothing short of incredible compute performance.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL AES routine that encrypts/decrypts an 8K x 8K pixel square image file. The result of this benchmark is the average time to encrypt the image over a number of iterations of the AES cipher.
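The "average time over a number of iterations" approach can be sketched in a few lines. This is only an illustration in Python, not the benchmark's actual OpenCL code: `average_seconds` and `stand_in_workload` are hypothetical names, and the workload is a simple byte-wise XOR standing in for the AES kernel.

```python
import time


def average_seconds(workload, iterations):
    """Run workload() `iterations` times and return the mean wall-clock time per run."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    total = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        workload()
        total += time.perf_counter() - start
    return total / iterations


def stand_in_workload(size=1 << 16, key=0x5A):
    """Hypothetical placeholder for the encrypt step: XOR every byte with a key."""
    data = bytes(size)  # `size` zero bytes standing in for image data
    return bytes(b ^ key for b in data)


if __name__ == "__main__":
    print(f"{average_seconds(stand_in_workload, 10):.6f} s per iteration")
```

Averaging over several runs smooths out one-off costs such as kernel compilation or a cold cache, which is presumably why the benchmark reports a mean rather than a single timing.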

Once again the GTX 670 has a weak showing here, although not as bad as with SmallLuxGPU. Still, it’s weak enough to fall behind the GTX 570, though at least it beats the 7950. Clockspeeds help, as showcased by the EVGA GTX 670SC, but nothing really makes up for the missing SMX.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
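To make the all-pairs idea concrete, here is a rough CPU-side sketch in Python. It is not the SDK sample's shader code. The function names, the use of 2D points and the tile size are assumptions for illustration, and slicing the list into tiles only imitates how a compute shader would stage a block of particles in shared memory before every thread reads it.

```python
import random


def neighbor_counts_naive(points, radius):
    """O(n^2): for every particle, test the distance to every other particle."""
    r2 = radius * radius
    counts = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and (xi - xj) ** 2 + (yi - yj) ** 2 < r2:
                counts[i] += 1
    return counts


def neighbor_counts_tiled(points, radius, tile_size=64):
    """Same O(n^2) work, but the inner loop walks one cached tile at a time."""
    r2 = radius * radius
    counts = [0] * len(points)
    for tile_start in range(0, len(points), tile_size):
        # Stand-in for loading one block of particles into shared memory.
        tile = points[tile_start:tile_start + tile_size]
        for i, (xi, yi) in enumerate(points):
            for k, (xj, yj) in enumerate(tile):
                if tile_start + k != i and (xi - xj) ** 2 + (yi - yj) ** 2 < r2:
                    counts[i] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.random(), rng.random()) for _ in range(256)]
    assert neighbor_counts_naive(pts, 0.1) == neighbor_counts_tiled(pts, 0.1)
    print("tiled and naive results match")
```

Tiling does not reduce the number of distance tests. On a GPU the benefit comes from each particle being fetched from slow memory once per tile instead of once per thread, which is why this kind of workload leans so heavily on shared memory and cache behavior.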

For reasons we’ve yet to determine, this benchmark strongly dislikes the GTX 670 in particular. There doesn’t seem to be a performance regression in NVIDIA’s drivers, and there’s not an incredible gap due to TDP; it just struggles on the GTX 670. As a result, performance of the GTX 670 only hits 42% of the GTX 680, which is well below what the GTX 670 should theoretically be getting. Barring some kind of esoteric reaction between this program and the unbalanced GPC, a driver issue is still the most likely culprit, but it looks to only affect the GTX 670.
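For a rough sense of what the GTX 670 "should theoretically be getting", one can scale shader count by clockspeed. The sketch below assumes the reference specifications (1344 CUDA cores at a 915MHz base clock for the GTX 670, 1536 cores at 1006MHz for the GTX 680), which are not stated on this page. It also ignores GPU Boost, memory bandwidth and cache effects, so treat the result as a ballpark figure only. The helper name `theoretical_ratio` is hypothetical.

```python
def theoretical_ratio(cores_a, clock_a_mhz, cores_b, clock_b_mhz):
    """Naive shader throughput of card A relative to card B (cores x clock)."""
    return (cores_a * clock_a_mhz) / (cores_b * clock_b_mhz)


# Assumed reference specifications; boost clocks are deliberately ignored.
GTX_670 = {"cuda_cores": 1344, "base_clock_mhz": 915}
GTX_680 = {"cuda_cores": 1536, "base_clock_mhz": 1006}

ratio = theoretical_ratio(
    GTX_670["cuda_cores"], GTX_670["base_clock_mhz"],
    GTX_680["cuda_cores"], GTX_680["base_clock_mhz"],
)

if __name__ == "__main__":
    print(f"GTX 670 vs GTX 680, naive shader throughput: {ratio:.1%}")
```

Under these assumptions the GTX 670 lands at roughly 80% of the GTX 680 on paper, so a measured 42% is about half of what shader count and clockspeed alone would predict.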

Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for GK104. Folding@Home and similar initiatives are still one of the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.

Whenever NVIDIA sends over a benchmark you can expect they have good reason to, and this is certainly the case for Folding@Home. GK104 is still a slouch given its resources compared to GF110, but at least it can surpass the GTX 580. At 970 nanoseconds per day the GTX 670 can tie the GTX 580, while the GTX 680 can pull ahead by 6%. Interestingly this benchmark appears to be far more constrained by clockspeed than the number of shaders, as the EVGA GTX 670SC outperforms the GTX 680 thanks to its 1188MHz boost clock, which it manages to stick to the entire time.


414 Comments


  • CeriseCogburn - Sunday, May 13, 2012 - link

    28 isn't playable and yes, the nVidia card really wins that game, as we see in the 680 test, which I had to point out as you, the amd fanboy despite your claim to own a 680, never noticed like all the rest, including the author to a large degree, in the 680 release review here.

    So take your temporary Gaming Evolved amd game driver hack that disabled nVidia's winning sweep across all resolutions and celebrate, a fool of course needs to do so, you're welcome for pointing it out.

    (roll eyes at the immense ignorance, again)

    Now enjoy the video http://www.youtube.com/watch?v=J0eZEdpsgjk

    I know amd told us many, many times, as did so many little named posters here for so many years, that nVidia was evil for TWIMTBP work and what they did to the amd cards performance in those efforts.

    Maybe they should note this little problem that developed ?

    ROFL
  • Galidou - Sunday, May 13, 2012 - link

    ''(roll eyes at the immense ignorance, again)''

    Such a troll again, such a lack of respect, indirect attacks, most useless comment on earth... immense ignorance, comon we're speaking about video cards, someone not knowing that you can change a buck for 4 quarters might be an ignorant... unless he's a tribal that lived in africa all his life... and then the ''again'' omg the inflammatory stuff you're able to say in 7 words sentence... I'm unsure you realize what you do... you're being really mean...

    All that and I could only say you're mean... I guess respect ain't give to everyone, sad it can't be bought, because it must be the most important value, ALL AROUND, a man can have. Everything starts with respect, real wisdom is acquired through respect.
  • CeriseCogburn - Sunday, May 13, 2012 - link

    All you do is attack, this is the last response you get from me unless you're on topic with a point, and as respectful as you demand others be, which you are not, you're the worst so far, a pure troll with no points at all.

    The other posters are trying to make points, not you. Attention for you is over.
  • Galidou - Sunday, May 13, 2012 - link

    He mentions the puny 1,25gb because the card CAN'T run it and is usually a good performer against the competition at that resolution. You say it beats the 7870 in the next page, by 1-2 fps, I don't even call that a beating. Plus in a game that favors Nvidia.

    ''This is the kind of crap we have to put up with here, at least we who have a brain and can see what's going on.''

    I think you meant ''we who are Nvidia's fanboys''

    It may not be the most neutral of comments but it's not the worst, you're just looking to find things against Nvidia and enumerate them because that's what Nvidia's fanboys do. What do they do, get mad as soon as there's a little reason to.
  • CeriseCogburn - Sunday, May 13, 2012 - link

    No other card can run it with gaming frame rates in his test.
    Since he didn't point that out, I DID.

    I guess he'll have to work harder to find a valid reason to dis the card since he has claimed nVidia is keeping it on, and the egg sure looks like that is correct - a lot of stock present.

    Now, you validated my point, but want to call it petty, but a similar thing happens on nearly every gaming page.

    At least what I point out is some pathetic grammar nazi problem, huh, which all of the rest of you seem to love to do so much, in every review posting it appears to be a contest for that, and I agree with the reviewer that PM'ing him to offer a correction is actually adult like and responsible.

    That of course is different than what bothers me, and we shall see, a valid complaint is usually responded to in a good way, so there may be some thought ahead, I certainly expect positive results for my efforts.
    As is so often claimed here by those in charge they respond to readers and what they want, so this fits that case fine.

    On that note along those lines I already advocated a single gaming chart with the collated data of the various cards in their overclocked performance states, as it seems to me that would be a nice added feature to reviews and would settle some of the rancor on the reviewed cards sometimes having OC'ed versions added in their release.
  • SamsungAppleFan - Thursday, May 10, 2012 - link

    first of all, thanks for the article, but you guys (anandtech) take wayyyyyy too long between new articles. get on it guys, seriously. and i'm still waiting on my gs3 full review lol.
  • GlItCh017 - Thursday, May 10, 2012 - link

    This card can really shine if it likes what you like. I'm a huge FPS fan, so in scenario's such as BF3 the GTX 670 vs. Radeon HD 6970 is a no brainer.
  • Morg. - Thursday, May 10, 2012 - link

    Sure, like most FPS's won't be on Unreal4 instead of frostbite ;)

    That engine, for some reason, favors nVidia and I don't think it's a good GPU performance metric, although if you're going to play frostbite content, it's clearly important.
  • Morg. - Thursday, May 10, 2012 - link

    Nevermind, I knew why but I hadn't seen it mentioned yet.
    http://www.geforce.com/whats-new/articles/johan-an...
    So .. buying a graphics board because it is favored by a botched graphical engine which is temporary - meh. If you plan on keeping your pc 2 or 3 years, fck the marketing, get raw power instead ;)
  • antef - Thursday, May 10, 2012 - link

    Are you saying AMD has the better GPU for most FPS titles outside ones running Frostbite?
