Compute

Shifting gears, as always our final set of benchmarks is a look at compute performance. As we have seen with GTX 680, GK104 appears to be significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Cache and register file pressure in particular seem to give GK104 grief, which means that GK104 can still do well in certain scenarios, but falls well short in others.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.

It’s quite shocking to see the GTX 670 do so well here. For sure it’s struggling relative to the Radeon HD 7900 series and the GTX 500 series, but compared to the GTX 680 it’s only trailing by 4%. This is a test that should cause the gap between the two cards to open up due to the lack of shader performance, but clearly this is not the case. Perhaps we’ve been underestimating the memory bandwidth needs of this test? If that’s the case, AMD’s significant memory bandwidth advantage certainly helps to cement the 7970’s lead.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

SmallLuxGPU on the other hand finally shows us that larger gap we’ve been expecting between the GTX 670 and GTX 680. The GTX 680’s larger number of SMXes and higher clockspeed cause the GTX 670 to fall behind by 10%, performing worse than the GTX 570 or even the GTX 470. More so than any other test, this is the test that drives home the point that GK104 isn’t a strong compute GPU while AMD offers nothing short of incredible compute performance.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL routine that AES encrypts/decrypts an 8K x 8K pixel square image file. The results of this benchmark are the average time to encrypt the image over a number of iterations of the AES cipher.
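As a rough illustration of how an "average time over a number of iterations" figure is produced, here is a minimal timing sketch. It is not the AESEncryptDecrypt code: `average_seconds` and `xor_transform` are hypothetical names, and the XOR transform is only a stand-in workload, not AES.

```python
import time


def average_seconds(work, iterations=5):
    """Run `work()` `iterations` times and return the mean wall-clock time."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    total = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        work()
        total += time.perf_counter() - start
    return total / iterations


def xor_transform(data, key):
    """Placeholder workload (a repeating-key XOR). This is NOT AES."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


if __name__ == "__main__":
    # A small buffer stands in for the benchmark's 8K x 8K image.
    image = bytes(256 * 256)
    key = b"hypothetical-key"
    mean = average_seconds(lambda: xor_transform(image, key), iterations=5)
    print(f"average time per pass: {mean * 1000:.2f} ms")
```

Averaging over several passes smooths out one-off effects such as first-run setup cost, which is presumably why the benchmark reports a mean rather than a single run.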

Once again the GTX 670 has a weak showing here, although not as bad as with SmallLuxGPU. Still, the deficit is enough to put it behind the GTX 570, though it does at least beat the 7950. Clockspeeds help, as showcased by the EVGA GTX 670SC, but nothing really makes up for the missing SMX.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
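To make the O(n^2) nearest neighbor idea concrete, here is a minimal CPU-side sketch in Python. It is not the DirectX SDK sample: the function names, the 2D positions, and the `tile_size` parameter are assumptions for illustration. The tile loop mimics how a GPU thread group would stage a block of particle positions in shared memory once and then let every thread read from that cached copy.

```python
import random


def neighbor_counts_naive(positions, radius):
    """Reference O(n^2) version: test every particle against every other."""
    r2 = radius * radius
    counts = []
    for i, (xi, yi) in enumerate(positions):
        count = 0
        for j, (xj, yj) in enumerate(positions):
            if i != j and (xi - xj) ** 2 + (yi - yj) ** 2 <= r2:
                count += 1
        counts.append(count)
    return counts


def neighbor_counts_tiled(positions, radius, tile_size=64):
    """Same O(n^2) work, but walked one tile of particles at a time."""
    n = len(positions)
    r2 = radius * radius
    counts = [0] * n
    for tile_start in range(0, n, tile_size):
        # Stand-in for shared memory: the tile is loaded once and reused
        # by every particle i below.
        tile = positions[tile_start:tile_start + tile_size]
        for i, (xi, yi) in enumerate(positions):
            for offset, (xj, yj) in enumerate(tile):
                if tile_start + offset == i:
                    continue  # a particle is not its own neighbor
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= r2:
                    counts[i] += 1
    return counts


if __name__ == "__main__":
    random.seed(0)
    cloud = [(random.random(), random.random()) for _ in range(500)]
    tiled = neighbor_counts_tiled(cloud, 0.05)
    print("results match naive:", tiled == neighbor_counts_naive(cloud, 0.05))
```

The tiling does not reduce the number of distance tests; on a GPU its benefit is that each position is fetched from slow memory once per tile instead of once per thread, which is the kind of caching the paragraph above describes.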

For reasons we’ve yet to determine, this benchmark strongly dislikes the GTX 670 in particular. There doesn’t seem to be a performance regression in NVIDIA’s drivers, and there’s not an incredible gap due to TDP; the benchmark simply struggles on the GTX 670. As a result performance of the GTX 670 only hits 42% of the GTX 680, which is well below what the GTX 670 should theoretically be getting. Barring some kind of esoteric reaction between this program and the unbalanced GPC, a driver issue is still the most likely culprit, but it looks to only affect the GTX 670.

Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for GK104. Folding@Home and similar initiatives are still one of the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.

Whenever NVIDIA sends over a benchmark you can expect they have good reason to, and this is certainly the case for Folding@Home. GK104 is still a slouch given its resources compared to GF110, but at least it can surpass the GTX 580. At 970 nanoseconds per day the GTX 670 can tie the GTX 580, while the GTX 680 can pull ahead by 6%. Interestingly this benchmark appears to be far more constrained by clockspeed than the number of shaders, as the EVGA GTX 670SC outperforms the GTX 680 thanks to its 1188MHz boost clock, which it manages to stick to the entire time.


414 Comments


  • snakefist - Saturday, May 12, 2012 - link

    "...that the 365mm die is over 43% larger than the 300mm die."

    die size is in mm2 and NOT in diameter (mm). do your math again... and gtx680 die is 294mm2... to your pleasure, it increases the size difference... the real one, not the 43% you came out with - somehow

    now, these were two of the things you learned from me :)

    reading more, instead writing would help you, as well anger-management i suggested earlier :)
  • CeriseCogburn - Saturday, May 12, 2012 - link

    Why then I await your math calculation.
    I'll let you know now I'll be back after 3 seasons pass to give you time to prepare your answer.

    It doesn't surprise me one iota the stupid amd fanboy even increased the nVidia core size for that always needed amd liar cheat, nor that praise for his sanity followed on unopposed except by yours truly. Sometimes letting a liar even have part of his lie and still proving him wrong is good enough.

    No need however to correct my shorthand text concerning circular vs rectangular area, but as I imagine the stupidity you are surrounded with inside your own head you thought it a possibility, and it clearly indicates you didn't read the part of the thread discussed as well.
  • snakefist - Sunday, May 13, 2012 - link

    you do realize that 300mm is about a length of an A4 page, don't you?

    i don't need to calculate anything, it's clearly that nvidia die is ~20% smaller than amd one...

    on the other hand, unlike you, i know how to calculate, maybe that explains why your mistake was so obvious to me...

    "stupidity you are surrounded with" - sadly, true - but i'm only surrounded by you... but than again, it's the only reason i even talk to you - it's kinda fun because i don't get angry at all (quite the opposite), and you're spilling poison - seriously, how long you spent on writing that 1,000,000 comments about nvidia being better?

    i've spent about half an hour talking to you in total, and for own amusement purposes
  • CeriseCogburn - Sunday, May 13, 2012 - link

    That's over 22%
  • snakefist - Friday, May 18, 2012 - link

    i did right to come back here, more laughter!

    YOU are correcting ME?

    what now, i should calculate 22% decimals, and correct you?

    man, i KNOW math, you DON'T. otherwise, it would strike you immediately for 2x mistake you made in post (lets assume square was a type, thought i'm not quite sure).

    now ~20% means ABOUT and it is that way because it applies in similar fashion to both bigger-than and less-than scenarios, things you wouldn't of course know anything about

    but then again, i proclaim FULL VICTORY for you on math issue, you were right all along, even when you wrote 43, 46 or whatever you did in the first place (without "~" which means "approximate" for you, and you didn't used it, meaning it was exactly 43 (or 46, whatever))

    you're mathematical genius and i envy you a great deal on vast amount of hardware knowledge you have. happy?
  • snakefist - Friday, May 18, 2012 - link

    oh, "typo", not "type"
  • CeriseCogburn - Friday, May 11, 2012 - link

    Here we get it again, since nVidia's 670, the card under review did better than expected, something is wrong...

    " For reasons that aren’t entirely clear Batman isn’t as shader performance bottlenecked as we would have expected, leading to it doing so well compared to the GTX 680 here. "

    Something is wrong, Batman is not so shader bottlenecked, and since it's so easy, the "harvested"(defective according to the reviewer) 670 core can do well.

    What happened to the 570 attack this time ? Nothing, since.the 570 beats the 7870 at 2560 here, but since we can't cut the 570 down, we won't mention it.

    Instead of mentioning the 670OC beats the amd flagship at the highest resolution 5760x1200, the reviewer has to play that down, so only mentions the 670OC "coming to parity" with the 6970 at middle resolution, 2560, after the STOCK 670 beats the amd 7970 at the low 1920 resolution !
    ROFL - once again the analysis favors and coddles amd.

    " EVGA’s overclock, even if it’s once again only around 3%, is just enough to close that gap and to bring the GTX 670 to parity with the GTX 680 and the 7970. "

    No, the OC shows the 670 beating the 7970 at the highest triple screen 5760 resolution, and the STOCK 670 BEATS THE 7970 at the lowest resolution, 1920... so somehow "that's parity".
    How the heck does that work ?

    Do we hear once in all this game page commentary what the 7950 at the very same $399 as the 670 price is doing ?
    I don't think we do.
    Where is that ?
    Instead of attacking the 7950 that is currently the same price as the 670, we get the reviewer over and over again attacking the GTX570 that he notes nVidia mentioned to him, making him think the GTX570 will be part of nVidia's line up for some time he states. Not once did he point out how well the GTX570 did against the amd competition.
    Not once do swe hear how the 7950 costs the same but loses, loses loses. Nothing specific.
    Instead we hear 670 vs 680 or attack the 570, or make excuses for the 7970 or call it inexplicable..
  • medi01 - Friday, May 11, 2012 - link

    Can someone ban this zealot please?
  • sausagefingers - Friday, May 11, 2012 - link

    +1
  • CeriseCogburn - Friday, May 11, 2012 - link

    You can blame silverblue who begged for proof about the bias in these articles. Go call your buddy so you can both smack talk about me, or heck post it here openly like you do, why not you're innocent no matter what you do as an amd fanboy, right ?
