Compute

Shifting gears, as always our final set of benchmarks is a look at compute performance. As we have seen with the GTX 680, GK104 appears to be significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Cache and register file pressure in particular seem to give GK104 grief, which means that GK104 can still do well in certain scenarios, but falls well short in others.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.

It’s quite shocking to see the GTX 670 do so well here. For sure it’s struggling relative to the Radeon HD 7900 series and the GTX 500 series, but compared to the GTX 680 it’s only trailing by 4%. This is a test that should cause the gap between the two cards to open up due to the lack of shader performance, but clearly that is not the case. Perhaps we’ve been underestimating the memory bandwidth needs of this test? If that’s the case, given AMD’s significant memory bandwidth advantage it certainly helps to cement the 7970’s lead.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

SmallLuxGPU on the other hand finally shows us that larger gap we’ve been expecting between the GTX 670 and GTX 680. The GTX 680’s larger number of SMXes and higher clockspeed cause the GTX 670 to fall behind by 10%, performing worse than the GTX 570 or even the GTX 470. More so than any other test, this is the test that drives home the point that GK104 isn’t a strong compute GPU while AMD offers nothing short of incredible compute performance.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL AES encryption routine that encrypts/decrypts an 8K x 8K pixel square image file. The results of this benchmark are the average time to encrypt the image over a number of iterations of the AES cipher.
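The "average time over a number of iterations" measurement can be sketched in a few lines of Python. This is only an illustration of the averaging method, not the AESEncryptDecrypt code itself: the names `average_runtime` and `xor_stand_in` are hypothetical, and the workload is a trivial XOR transform standing in for AES, which the Python standard library does not provide.

```python
import time


def average_runtime(workload, iterations=10, warmup=1):
    """Return the mean wall-clock time in seconds of one call to `workload`.

    A few untimed warm-up calls run first so one-off setup costs
    (caches, lazy initialization) do not skew the average.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    for _ in range(warmup):
        workload()
    start = time.perf_counter()
    for _ in range(iterations):
        workload()
    return (time.perf_counter() - start) / iterations


def xor_stand_in(data, key=0x5A):
    """Placeholder transform (assumption): NOT AES, just enough work to time."""
    return bytes(b ^ key for b in data)


if __name__ == "__main__":
    # A small 256 x 256 RGBA buffer keeps the pure-Python demo quick;
    # the benchmark described above uses an 8K x 8K image on the GPU.
    image = bytes(256 * 256 * 4)
    seconds = average_runtime(lambda: xor_stand_in(image), iterations=5)
    print(f"average time per pass: {seconds * 1000:.2f} ms")
```

Averaging over several passes smooths out run-to-run noise, which matters when the cards being compared are only a few percent apart.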

Once again the GTX 670 has a weak showing here, although not as bad as with SmallLuxGPU. Still, it’s weak enough to fall behind the GTX 570, though at least it’s fast enough to beat the 7950. Clockspeeds help, as showcased by the EVGA GTX 670SC, but nothing really makes up for the missing SMX.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
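To illustrate the access pattern an O(n^2) nearest neighbor method relies on, here is a minimal CPU-side sketch in Python. It is not the DirectX SDK sample's shader code. It tests every particle pair, but walks the particles in fixed-size tiles, the way a compute shader would stage one tile of positions in shared memory and let the whole thread group reuse it. The names `count_neighbors_tiled`, `count_neighbors_naive`, `radius`, and `tile_size` are assumptions made for this example.

```python
import random


def count_neighbors_tiled(positions, radius, tile_size=64):
    """Count, for each particle, how many other particles lie within `radius`.

    Brute-force O(n^2) pair test, processed tile by tile. On a GPU each tile
    would be copied into shared memory once and reused by every thread in the
    group; here the `tile` slice merely stands in for that cached copy.
    """
    n = len(positions)
    r2 = radius * radius
    counts = [0] * n
    for start in range(0, n, tile_size):
        tile = positions[start:start + tile_size]  # stand-in for shared memory
        for i, (xi, yi) in enumerate(positions):
            for offset, (xj, yj) in enumerate(tile):
                if start + offset == i:
                    continue  # a particle is not its own neighbor
                dx, dy = xi - xj, yi - yj
                if dx * dx + dy * dy <= r2:
                    counts[i] += 1
    return counts


def count_neighbors_naive(positions, radius):
    """Untiled reference version, used only to check the tiled result."""
    r2 = radius * radius
    counts = []
    for i, (xi, yi) in enumerate(positions):
        total = 0
        for j, (xj, yj) in enumerate(positions):
            if i == j:
                continue
            dx, dy = xi - xj, yi - yj
            if dx * dx + dy * dy <= r2:
                total += 1
        counts.append(total)
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [(rng.random(), rng.random()) for _ in range(500)]
    tiled = count_neighbors_tiled(particles, radius=0.05, tile_size=64)
    print("tiled matches naive:", tiled == count_neighbors_naive(particles, 0.05))
```

Tiling does not reduce the number of pair tests; it only changes where the data is read from. That is why this style of kernel leans so heavily on shared memory and cache behavior.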

For reasons we’ve yet to determine, this benchmark strongly dislikes the GTX 670 in particular. There doesn’t seem to be a performance regression in NVIDIA’s drivers, and there’s not an incredible gap due to TDP; it just struggles on the GTX 670. As a result, performance of the GTX 670 only hits 42% of the GTX 680, which is well below what the GTX 670 should theoretically be getting. Barring some kind of esoteric reaction between this program and the unbalanced GPC, a driver issue is still the most likely culprit, but it looks to only affect the GTX 670.
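As a rough check on what the GTX 670 "should theoretically be getting", the sketch below compares peak shader throughput for the two cards. The SMX counts, the 192 cores per SMX, and the base clocks are assumed reference specifications that are not stated in this section, and boost clocks are ignored, so treat the result as a ballpark figure only. The helper names are hypothetical.

```python
def theoretical_throughput(smx_count, cores_per_smx, clock_mhz):
    """Peak single-precision rate in GFLOPS, counting one FMA as two operations."""
    return smx_count * cores_per_smx * clock_mhz * 2 / 1000.0


# Assumed reference specifications (base clocks, boost ignored).
GTX_680 = {"smx_count": 8, "cores_per_smx": 192, "clock_mhz": 1006}
GTX_670 = {"smx_count": 7, "cores_per_smx": 192, "clock_mhz": 915}


def expected_ratio(slower, faster):
    """Fraction of the faster card's peak throughput the slower card offers."""
    return theoretical_throughput(**slower) / theoretical_throughput(**faster)


if __name__ == "__main__":
    ratio = expected_ratio(GTX_670, GTX_680)
    print(f"GTX 670 theoretical throughput: {ratio:.1%} of GTX 680")
    print("Measured in the fluid simulation: 42% of GTX 680")
```

Under these assumptions the GTX 670 has roughly four fifths of the GTX 680's peak throughput, so a 42% result cannot be explained by the missing SMX and lower clocks alone.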

Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for GK104. Folding@Home and similar initiatives are still one of the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.

Whenever NVIDIA sends over a benchmark you can expect they have good reason to, and this is certainly the case for Folding@Home. GK104 is still a slouch given its resources compared to GF110, but at least it can surpass the GTX 580. At 970 nanoseconds per day the GTX 670 can tie the GTX 580, while the GTX 680 can pull ahead by 6%. Interestingly, this benchmark appears to be far more constrained by clockspeed than by the number of shaders, as the EVGA GTX 670SC outperforms the GTX 680 thanks to its 1188MHz boost clock, which it manages to stick to the entire time.

414 Comments

  • Gastec - Tuesday, November 13, 2012 - link

    Your every comment is an attack at ATi/AMD video cards or people who seem to be using them( maybe). Why?
    You get payed to do negative publicity for AMD on the review sites? Because having a Ati card die on you in the middle of some important event in you gaming life( like raiding in WoW , am I close or am I close ;-) could not be the only reason.
  • shin0bi272 - Friday, May 11, 2012 - link

    I think the reason for the missing memory chips is because they will be releasing the 685 in aug or sep which is supposed to be 4gb and run on a 512bit bus. It could be possible to increase the size of the gpu core and double the amount of ram and stil have it on a card this length.

    30% faster than the 670 (685 is supposed to be 25% faster than the 680 and the 670 is 5% slower than the 680) on the same size card but using 2x8 pin connectors instead of 2x6pin. Now imagine an after market or water cooler on it... yeah.

    You'll get great FPS on all those brand new console ports.
  • KivBlue - Friday, May 11, 2012 - link

    $400 for a graphics card is just too much.
  • medi01 - Saturday, May 12, 2012 - link

    For me too. In 200$-ish range it looks like AMD 7850 / 7870 are the only reasonable options.

    PS
    Honestly I don't get all the hype about 680/670. Cards are only marginally better than AMDs offering (losing in some games, winning in some games).

    Power consumption difference according to techpowerup is only 2 watt in idle, about 9 watt at full load. Not a big deal either.

    Basically a slight price drop by AMD on 7950/7970 (for whoever really wants those) once these cards actually become available and that's it.

    I also wonder, how many "enthusiasts" with multi-monitor setups in the need of a faster card are out there.

    PPS
    Worst part of it would be nVidia releasing confusing mix of completely different cards lower end cards released under the same name, to confuse consumer.
  • CeriseCogburn - Saturday, May 12, 2012 - link

    I guess considering you think $200 equals $335 and that also equals $250, we can say your comment equals a big fat lie, and when a big fat lie is what one immediately starts off with, everyone knows something is WRONG.
  • Gastec - Tuesday, November 13, 2012 - link

    Again you attack someone who posted a comment about AMD cards, just because. You are obviously a troll and someone from this, STILL RESPECTED computer magazine should ban you.
  • Gastec - Tuesday, November 13, 2012 - link

    Yes but people who buy these have enough money to buy even the $3000-4000. Tesla K20 ones . Many of them have money from their parents, if you catch my drift.
  • RegEDDIT - Sunday, May 13, 2012 - link

    I managed to buy one from Amazon before they went out of stock, and I must say, I am pleased. BF3 plays like a champ, Skyrim is smooth as butter, and Adobe Premiere edits like a champ now with Nvidia hardware acceleration. This is on a 1920x1080 monitor with an old q6700 quad core @ 2.666 GHz and 800Mhz RAM. I do not expect to buy another card for a long while.
  • CeriseCogburn - Sunday, May 13, 2012 - link

    Here's COMPUTE SOFTWARE BASE in action.

    " Adobe Premiere edits like a champ now with Nvidia hardware acceleration "

    Nvidia wins. amd loses in compute.
  • Zebo - Sunday, May 13, 2012 - link

    7950 has 40-50% OC potential being servilely down tuned @ 800Mhz.

    If AMD is smart they will release a 1100Mhz version and wreck 670s party.

    If you're an overclocked you'd be dumb to buy 670 with its limited control and potential of 7950. Let alone of you're on water.