Compute

Shifting gears, as always our final set of benchmarks is a look at compute performance. As we saw with the GTX 680, GK104 appears to be significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Cache and register file pressure in particular seem to give GK104 grief, which means that GK104 can still do well in certain scenarios, but falls well short in others.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.

It’s quite shocking to see the GTX 670 do so well here. For sure it’s struggling relative to the Radeon HD 7900 series and the GTX 500 series, but compared to the GTX 680 it’s only trailing by 4%. This is a test that should cause the gap between the two cards to open up due to the lack of shader performance, but clearly that is not the case. Perhaps we’ve been underestimating the memory bandwidth needs of this test? If that’s the case, given AMD’s significant memory bandwidth advantage it certainly helps to cement the 7970’s lead.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

SmallLuxGPU on the other hand finally shows us that larger gap we’ve been expecting between the GTX 670 and GTX 680. The GTX 680’s larger number of SMXes and higher clockspeed cause the GTX 670 to fall behind by 10%, performing worse than the GTX 570 or even the GTX 470. More so than any other test, this is the test that drives home the point that GK104 isn’t a strong compute GPU while AMD offers nothing short of incredible compute performance.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL routine that AES encrypts/decrypts an 8K x 8K pixel square image file. The result of this benchmark is the average time to encrypt the image over a number of iterations of the AES cipher.
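
The measurement described above, an average time over repeated runs, can be sketched as a small timing harness. This is a minimal illustration in Python and not the benchmark's actual code: `average_seconds`, `xor_transform`, and the buffer size are hypothetical, and the XOR pass is only a stand-in for a real AES kernel.

```python
import time


def average_seconds(workload, iterations):
    """Return the mean wall-clock time of workload() over `iterations` runs."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    total = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        workload()
        total += time.perf_counter() - start
    return total / iterations


def xor_transform(data, key=0x5A):
    """Stand-in for an encryption pass (NOT AES): XOR every byte with a key."""
    return bytes(b ^ key for b in data)


# Hypothetical small buffer; the real test works on an 8K x 8K image.
image = bytes(range(256)) * 64
mean = average_seconds(lambda: xor_transform(image), iterations=5)
print(f"average time per pass: {mean:.6f} s")
```

Averaging over several iterations smooths out run-to-run noise, which is presumably why the benchmark reports a mean rather than a single timing.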

Once again the GTX 670 has a weak showing here, although not as bad as with SmallLuxGPU. Still, it falls behind the GTX 570, though it at least manages to beat the 7950. Clockspeeds help, as showcased by the EVGA GTX 670SC, but nothing really makes up for the missing SMX.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
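
To illustrate the idea of an O(n^2) neighbor search that caches data in tiles, here is a minimal CPU-side sketch in Python. It is not the DirectX SDK sample's code: the function name, the 2D points, the radius test, and the tile size are all assumptions chosen for illustration. The `cached` slice plays the role that shared memory would play on a GPU, where a thread group loads a tile of particles once and every thread then reads from it.

```python
def neighbor_counts_tiled(points, radius, tile=64):
    """Count neighbors within `radius` for every point using all-pairs checks.

    The work is still O(n^2); tiling only changes the order of the checks so
    that a small block of points is reused many times while it is "cached".
    """
    n = len(points)
    r2 = radius * radius
    counts = [0] * n
    for start in range(0, n, tile):
        # GPU analogy: this tile would be loaded into shared memory once.
        cached = points[start:start + tile]
        for i, (xi, yi) in enumerate(points):
            for j, (xj, yj) in enumerate(cached, start):
                if i == j:
                    continue  # a particle is not its own neighbor
                dx = xi - xj
                dy = yi - yj
                if dx * dx + dy * dy < r2:
                    counts[i] += 1
    return counts


# Hypothetical toy input: two nearby particles and one far away.
particles = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
print(neighbor_counts_tiled(particles, radius=1.5, tile=2))
```

The result does not depend on the tile size; only the memory access pattern changes, which is what makes this style of optimization sensitive to cache and shared memory behavior.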

For reasons we’ve yet to determine, this benchmark strongly dislikes the GTX 670 in particular. There doesn’t seem to be a performance regression in NVIDIA’s drivers, and there’s not an incredible gap due to TDP; it just struggles on the GTX 670. As a result performance of the GTX 670 only hits 42% of the GTX 680, which is well below what the GTX 670 should theoretically be getting. Barring some kind of esoteric reaction between this program and the unbalanced GPC, a driver issue is still the most likely culprit, but it looks to only affect the GTX 670.

Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for GK104. Folding@Home and similar initiatives are still one of the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.

Whenever NVIDIA sends over a benchmark you can expect they have good reason to, and this is certainly the case for Folding@Home. GK104 is still a slouch given its resources compared to GF110, but at least it can surpass the GTX 580. At 970 nanoseconds per day the GTX 670 can tie the GTX 580, while the GTX 680 can pull ahead by 6%. Interestingly this benchmark appears to be far more constrained by clockspeed than by the number of shaders, as the EVGA GTX 670SC outperforms the GTX 680 thanks to its 1188MHz boost clock, which it manages to stick to the entire time.
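
Relative figures such as "ahead by 6%" can be converted to and from scores with simple arithmetic. The sketch below is a hedged illustration in Python; the helper names are hypothetical, and the derived value is only what the quoted numbers would imply if the 6% lead is measured against a 970 nanoseconds per day baseline. It is not a measured result.

```python
def percent_lead(score, baseline):
    """Percentage by which `score` exceeds `baseline` (negative if it trails)."""
    return (score / baseline - 1.0) * 100.0


def score_from_lead(baseline, lead_percent):
    """Score implied by a baseline and a percentage lead over it."""
    return baseline * (1.0 + lead_percent / 100.0)


# Assumption: the 6% lead is relative to the 970 ns/day figure quoted above.
implied = score_from_lead(970.0, 6.0)
print(f"implied score: {implied:.1f} ns/day")
print(f"lead recovered from scores: {percent_lead(implied, 970.0):.1f}%")
```

The same helper applies to the other comparisons on this page, since "trailing by 4%" is simply a lead of -4% against the faster card's score.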


414 Comments


  • chizow - Thursday, May 10, 2012 - link

    Except in this case, the "underdog" AMD initiated this pricing debacle with the terribly overpriced 7970 and the "leader" Nvidia was content to follow, selling their mid-range ASIC GK104 as a high-end SKU.

    While Nvidia did improve the situation with their GK104 pricing, its still by far, the worst increase we've seen from a price:performance perspective in the last decade of GPUs.
  • CeriseCogburn - Sunday, May 13, 2012 - link

    You're in the GTX670 review, it's $399, it has come out fast, and it's awesome and beats the more expensive flagship 7970, and destroys any historical price/perf you've got handy.
    Utterly decimates it.
    Best in years, best in a decade is now the line you should be using for the GTX670.
  • Crazyeyeskillah - Thursday, May 10, 2012 - link

    don't buy it if you can't afford it, other people will gladly take your place in line. I'm just glad we have some next gen products from both companies to choose from. If anything we are very fortunate to have so many products available that can max out all our games at present.
  • chizow - Thursday, May 10, 2012 - link

    Its not a matter of being able to afford it, its about standards and expectations, which I'm not willing to lower for substandard offerings for products that are neither essential for survival nor expire on their own due to wear.

    They're high-priced toys and nothing more and there's *PLENTY* of other distractions in that endless category of entertainment to compete with, especially when these new offerings don't offer compelling reasons to upgrade over my last-gen $500 GPUs.

    The other consideration is buying these parts at high premiums sets a bad precedence, where the consumer gets *LESS* for their money and similarly gives Nvidia free reign to set a new bar for premium price and performance in the future.

    We've already gotten a taste of this with the GTX 690 for $1000!!! What do you think is next with GK110? Why don't you look historically at the reaction to the 8800 Ultra at $830? Nvidia is *STILL* trying to downplay that part and justify their pricing decisions, but with a mid-range ASIC like GK104 selling for $500 premium flagship prices, Nvidia is once again positioned to sell an "Ultra" part at ultra-premium pricing. For what? A part that performs as you would've expected from a $500 flagship to begin with, roughly +50% more than the last-gen flagship.....
  • Crazyeyeskillah - Thursday, May 10, 2012 - link

    i don't buy any of that wahhh
  • CeriseCogburn - Thursday, May 10, 2012 - link

    Charlie D from semi-accurate buys it 100%, why no U ?
  • chizow - Thursday, May 10, 2012 - link

    Yeah I know, you're too busy blithely buying overpriced GPUs to understand what I'm talking about.
  • CeriseCogburn - Friday, May 11, 2012 - link

    Maybe if you provided a percentage with a simple texted chart, heck you don't need to do ten years, the doubter could gauge the level of your sourness properly - after all .01% less of a jump in performance below the worst jump in the last ten years fits all of your descriptions 100%.
    So why are you moaning about .01% ?
  • SlyNine - Thursday, May 10, 2012 - link

    Well when the 7970 came out that was by far the worst. Its alot better now, but I agree this jump hasn't been one for value at all. People don't remember the great videocards I guess. The 5870 was the last one in my eyes.
  • CeriseCogburn - Friday, May 11, 2012 - link

    5870 jumped from the 4890. Now please, let's see this enormous perf increase somewhere... as compared to the current.
    No less than that, the 5870 was replaced by the 6870, also not so great a leap.
    We keep hearing about these ephemeral perf increases, but so far NO ONE, and I mean NO ONE has provided even a simple percent increase chart - and you know why ?
    Because you people love to quaff out moaning fantasies like "double performance" and says things like "the great GTX880 !" (after of course bitching for a four years it was extremely overpriced and not ever worth it).
    So let's see it my friends, where pray tell is this great alluded to but never actually defined gigantic performance increase now not seen ?
    4890-5870-6970 ????
    Come on now, let's have one of you true believers gum up the work and give us a good percentage comparison we don't have to rip apart for immense biased game picking.
    Should take one of you all but 10 minutes. Charts are everywhere.
    Use the anand bench for cripes sakes, I'm sick of hearing the moanings and fantasies with no simple effort of a commonly available percentage - and you know why - because I'm calling BS !
    Now - let's see it !
