Compute

Shifting gears, as always our final set of benchmarks is a look at compute performance. As we saw with the GTX 680, GK104 appears to be significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Cache and register file pressure in particular seem to give GK104 grief, which means that GK104 can still do well in certain scenarios, but falls well short in others.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.

It’s quite shocking to see the GTX 670 do so well here. For sure it’s struggling relative to the Radeon HD 7900 series and the GTX 500 series, but compared to the GTX 680 it’s only trailing by 4%. This is a test that should cause the gap between the two cards to open up due to the lack of shader performance, but clearly that is not the case. Perhaps we’ve been underestimating the memory bandwidth needs of this test? If that’s the case, AMD’s significant memory bandwidth advantage certainly helps to cement the 7970’s lead.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

SmallLuxGPU on the other hand finally shows us that larger gap we’ve been expecting between the GTX 670 and GTX 680. The GTX 680’s larger number of SMXes and higher clockspeed cause the GTX 670 to fall behind by 10%, performing worse than the GTX 570 or even the GTX 470. More so than any other test, this is the test that drives home the point that GK104 isn’t a strong compute GPU while AMD offers nothing short of incredible compute performance.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL routine that AES encrypts and decrypts an 8K x 8K pixel square image file. The result of this benchmark is the average time to encrypt the image over a number of iterations of the AES cipher.
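To make the "average time over a number of iterations" metric concrete, here is a minimal timing sketch. It is not the AESEncryptDecrypt code: `average_seconds` and `xor_stand_in` are hypothetical names of our own, the workload is a trivial XOR placeholder rather than AES, and the buffer is far smaller than an 8K x 8K image.

```python
import time

def average_seconds(workload, data, iterations=10):
    """Run workload(data) `iterations` times and return the mean wall time in seconds."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    total = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        workload(data)
        total += time.perf_counter() - start
    return total / iterations

def xor_stand_in(data, key=0x5A):
    # Placeholder workload only; this is NOT AES.
    return bytes(b ^ key for b in data)

image = bytes(256 * 256)  # small stand-in buffer, not a real 8K x 8K image
avg = average_seconds(xor_stand_in, image, iterations=5)
print(f"average time per iteration: {avg:.6f} s")
```

Averaging over several passes smooths out one-off stalls, which is presumably why the benchmark reports a mean rather than a single run.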

Once again the GTX 670 has a weak showing here, although not as bad as with SmallLuxGPU. It still falls behind the GTX 570, but at least it manages to beat the 7950. Clockspeeds help, as showcased by the EVGA GTX 670SC, but nothing really makes up for the missing SMX.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
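For readers unfamiliar with the technique, the following is a rough CPU-side sketch of an O(n^2) neighbor search processed in tiles. It is not the DirectX SDK sample: the function name, the 2D positions, and the tile size are our own assumptions, and the slice of `positions` merely stands in for a block of particles staged into shared memory on the GPU.

```python
def count_neighbors_tiled(positions, radius, tile_size=64):
    """For each particle, count the other particles closer than `radius`.

    Every particle is still compared against every other particle (O(n^2));
    the tiling only changes the order in which the data is visited.
    """
    n = len(positions)
    r2 = radius * radius
    counts = [0] * n
    for tile_start in range(0, n, tile_size):
        # On a GPU this tile would be loaded once into shared memory,
        # then read by every thread in the group.
        tile = positions[tile_start:tile_start + tile_size]
        for i, (xi, yi) in enumerate(positions):
            for j, (xj, yj) in enumerate(tile, start=tile_start):
                if i == j:
                    continue  # a particle is not its own neighbor
                dx, dy = xi - xj, yi - yj
                if dx * dx + dy * dy < r2:
                    counts[i] += 1
    return counts

particles = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
print(count_neighbors_tiled(particles, radius=2.0, tile_size=2))
```

The tile size does not change the result, only how often each particle's data would have to be fetched from slower memory, which is where the shared memory caching pays off.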

For reasons we’ve yet to determine, this benchmark strongly dislikes the GTX 670 in particular. There doesn’t seem to be a performance regression in NVIDIA’s drivers, and there’s not an incredible gap due to TDP; it just struggles on the GTX 670. As a result performance of the GTX 670 only hits 42% of the GTX 680, which is well below what the GTX 670 should theoretically be getting. Barring some kind of esoteric reaction between this program and the unbalanced GPC, a driver issue is still the most likely culprit, but it looks to only affect the GTX 670.
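As a back-of-the-envelope check on what the GTX 670 "should theoretically be getting", the sketch below scales throughput by SMX count times clockspeed. The SMX counts and base clocks are NVIDIA's reference specifications as we recall them, not figures from this page, and the model ignores boost clocks, memory bandwidth, and everything else, so treat it as a rough estimate only.

```python
# Assumed reference specs (not taken from this page): SMX count and base clock in MHz.
GTX_680 = {"smx": 8, "base_mhz": 1006}
GTX_670 = {"smx": 7, "base_mhz": 915}

def theoretical_ratio(card, reference):
    """Naive shader throughput ratio: (SMX count * clock) relative to a reference card."""
    return (card["smx"] * card["base_mhz"]) / (reference["smx"] * reference["base_mhz"])

expected = theoretical_ratio(GTX_670, GTX_680)
observed = 0.42  # the 42% figure from the fluid simulation result above
print(f"expected about {expected:.0%} of the GTX 680, observed {observed:.0%}")
```

Under these assumptions the GTX 670 should land at roughly 80% of the GTX 680, so a 42% result is about half of what shader resources alone would predict.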

Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for GK104. Folding@Home and similar initiatives are still one of the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.

Whenever NVIDIA sends over a benchmark you can expect they have good reason to, and this is certainly the case for Folding@Home. GK104 is still a slouch given its resources compared to GF110, but at least it can surpass the GTX 580. At 970 nanoseconds per day the GTX 670 can tie the GTX 580, while the GTX 680 can pull ahead by 6%. Interestingly this benchmark appears to be far more constrained by clockspeed than the number of shaders, as the EVGA GTX 670SC outperforms the GTX 680 thanks to its 1188MHz boost clock, which it manages to stick to the entire time.


414 Comments


  • kingkazuma - Friday, May 11, 2012 - link

    i was honestly surprised that Nvidia didn't make the 600 series more powerful in gpcomputing

    well guess its 7970 for me... but i would want one of these for gaming :D
  • CeriseCogburn - Friday, May 11, 2012 - link

    Gee what programs are you going to use the amd card for in compute ?
    I'd sure like to hear for once what amd can do, instead of experiencing it's massively frustrating failures and apologists and recent unbelievable hypocrits.
    What the heck "compute" are you using a 7970 for - I'd sure like to know what programs... and what exactly stream is capable of, but all we ever get is a blank and some fat slow drooling talking point, except when it's pointed out that nVidia has ten thousand more drivers and support and software base in compute so it's the only way to go.
    Would you enlighten us ?
  • CeriseCogburn - Sunday, May 13, 2012 - link

    PS - the 7970 lost the computer benchmarks here, and the 680 won.

    I guess you never checked the results and just went with the "on paper !" fantasies.

    Good luck squandering up valid programs and valid amd software.

    At least you can use the now "we'll never do it !" "we are love, we are amd! " "we demand open source! " "we demand corporate responsibility !" - PROPRIETARY amd hacked openCL winzip.

    Enjoy that gigantic hypocrisy unzipping.

    Amd is for suckers.
  • james.jwb - Friday, May 11, 2012 - link

    I normally like to read the comment section here, but honestly can't be bothered since certain trolls have come here and for the 10th time (in the last 2/3 years), get to run a riot for a few weeks until someone at Anandtech finally butts in and bans him.

    Yawn to this scenario once again.

    Move to Disqus, Anandtech, and start moderating. You'll get 500 comments per important article and won't have to let controversy and trolls stay to bolster discussion.
  • Gastec - Tuesday, November 13, 2012 - link

    I subscribe to that. If they don't take action maybe we should.
  • CeriseCogburn - Friday, May 11, 2012 - link

    Civ5 " On our final test the 7970 sees a slight resurgence compared to the past few games, preventing NVIDIA from sweeping the whole back half of our tests. "
    Well, actually, that's not nVidia sweeping "the whole back HALF" that's nVidia sweeping the entire last 75% or 3/4ths, and if it weren't for the TWS2 bug, amd could claim only 2 out of 10 games, losing a FULL 80%, more than 3/4ths of all gaming tests.

    Instead of hearing the awful truth since the 7970 dives as resolution goes up losing miserably, which is ALWAYS pointed out when nVidia cards react that, puttting down some imaginary problem the reviewer guesses at concerning nVidia, instead we hear how amd shows a slight "resurgence" like a good terrorist card, and it's not noted it's only at the lowest resolution shown, of course.
    Next the reviewer tells us how the 600 series doesn't do very well "against the 500 series" here - yet the 580 BEATS the 7950 at both 1920 and at 2560 - in other words, the truth is, the GTX500 series does EXCEPTIONALLY WELL here, smacking down even the "resurgent" amd cards brother.
    The 570 is about 40% and then 30% ahead of the 6970, as another example of how well the 500 series plays this game. That's extended performance, not 600 series "interesting it's not far ahead".
    Of course, IMO only an amd fanboy could come up with that kind of wording and analysis.
    Was it so "interesting" that the reviewer couldn't see the 580 and 570 cleaning the clocks of their competition and even above their competition ?

    So when the nVidia just prior tier does well, it's the current card not doing so well against it.

    When the current nVidia cards do win 75% of the games against amd, it's belittled to less than half, with the special less than half "sweep" phrasing, with the 7970 amd flagship losing past the lowest tested resolution as the "catch".
    What a bad joke for Nvidia the pro amd words are in these reviews games pages.
  • medi01 - Friday, May 11, 2012 - link

    "7970 dives as resolution goes up losing miserably"

    Are you on a crack, or something?
  • CeriseCogburn - Friday, May 11, 2012 - link

    Look at the civ5 page - Civilization 5 gaming page review, as soon as you put down your stupid dumb you down drug.
  • medi01 - Saturday, May 12, 2012 - link

    So what are you smoking? Is it crack that dumbs down and turns into zealont, or were you born an idiot?
  • CeriseCogburn - Saturday, May 12, 2012 - link

    Excellent rebuttal, you've made your point for amd so well.
