Compute

Shifting gears, as always our final set of benchmarks is a look at compute performance. As we have seen with GTX 680, GK104 appears to be significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Cache and register file pressure in particular seem to give GK104 grief, which means that GK104 can still do well in certain scenarios, but falls well short in others.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.

It’s quite shocking to see the GTX 670 do so well here. For sure it’s struggling relative to the Radeon HD 7900 series and the GTX 500 series, but compared to the GTX 680 it’s only trailing by 4%. This is a test that should cause the gap between the two cards to open up due to the lack of shader performance, but clearly that is not the case. Perhaps we’ve been underestimating the memory bandwidth needs of this test? If that’s the case, AMD’s significant memory bandwidth advantage certainly helps to cement the 7970’s lead.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

SmallLuxGPU on the other hand finally shows us that larger gap we’ve been expecting between the GTX 670 and GTX 680. The GTX 680’s larger number of SMXes and higher clockspeed cause the GTX 670 to fall behind by 10%, performing worse than the GTX 570 or even the GTX 470. More so than any other test, this is the test that drives home the point that GK104 isn’t a strong compute GPU while AMD offers nothing short of incredible compute performance.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL routine that AES encrypts and decrypts an 8K x 8K pixel square image file. The result of this benchmark is the average time to encrypt the image over a number of iterations of the AES cipher.
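To make the "average time over a number of iterations" metric concrete, here is a minimal Python sketch of that kind of timing loop. It is not the AESEncryptDecrypt code: the `average_seconds` and `xor_pass` names are hypothetical, the workload is a simple XOR pass standing in for AES, and the buffer is far smaller than the 8K x 8K image used in the real test.

```python
import time

def average_seconds(workload, iterations=5):
    """Return the mean wall-clock time of calling workload() `iterations` times."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    total = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        workload()
        total += time.perf_counter() - start
    return total / iterations

def xor_pass(buffer, key=0x5A):
    """Stand-in workload (NOT AES): XOR every byte with a fixed key."""
    return bytes(b ^ key for b in buffer)

# Hypothetical tiny "image"; the real benchmark works on an 8K x 8K image.
image = bytes(64 * 64)
mean = average_seconds(lambda: xor_pass(image), iterations=10)
print(f"mean time per pass: {mean * 1e3:.3f} ms")
```

Averaging over several passes smooths out one-off stalls, which is why a lower number in this kind of test means a faster card.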

Once again the GTX 670 has a weak showing here, although not as bad as with SmallLuxGPU. It still falls behind the GTX 570, though it does at least beat the 7950. Clockspeeds help, as showcased by the EVGA GTX 670SC, but nothing really makes up for the missing SMX.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
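For readers unfamiliar with the approach, the idea behind an O(n^2) neighbor search with a shared memory cache can be sketched on the CPU in a few lines of Python. This is an illustration only, not the SDK sample's shader: the function names are hypothetical, it counts neighbors within a radius rather than computing fluid densities and forces, and the `shared` list merely stands in for a tile of particles staged in on-chip shared memory.

```python
def neighbor_counts_bruteforce(points, radius):
    """Count neighbors within `radius` by testing every pair: O(n^2) work."""
    r2 = radius * radius
    counts = []
    for i, (xi, yi) in enumerate(points):
        count = 0
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue  # a particle is not its own neighbor
            dx, dy = xi - xj, yi - yj
            if dx * dx + dy * dy <= r2:
                count += 1
        counts.append(count)
    return counts

def neighbor_counts_tiled(points, radius, tile=4):
    """Same O(n^2) search, but walking the particle list one tile at a time.

    Each tile is copied into `shared` once and then reused by every particle,
    mimicking how a compute shader stages data in shared memory so that
    threads do not all re-read it from video memory.
    """
    r2 = radius * radius
    counts = [0] * len(points)
    for start in range(0, len(points), tile):
        shared = points[start:start + tile]  # stand-in for a shared memory tile
        for i, (xi, yi) in enumerate(points):
            for k, (xj, yj) in enumerate(shared):
                if start + k == i:
                    continue
                dx, dy = xi - xj, yi - yj
                if dx * dx + dy * dy <= r2:
                    counts[i] += 1
    return counts

particles = [(0, 0), (1, 0), (0, 1), (5, 5), (5, 6)]
print(neighbor_counts_bruteforce(particles, 1.5))
print(neighbor_counts_tiled(particles, 1.5, tile=2))
```

Both functions do the same amount of arithmetic; the tiled version only changes where the data is read from, which is exactly the kind of optimization that leans on a GPU's cache and shared memory resources.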

For reasons we’ve yet to determine, this benchmark strongly dislikes the GTX 670 in particular. There doesn’t seem to be a performance regression in NVIDIA’s drivers, and there’s not an incredible gap due to TDP; it just struggles on the GTX 670. As a result the GTX 670 only hits 42% of the GTX 680’s performance, which is well below what the GTX 670 should theoretically be getting. Barring some kind of esoteric reaction between this program and the unbalanced GPC, a driver issue is still the most likely culprit, but it looks to only affect the GTX 670.
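As a rough sanity check on what "theoretically" means here, the short Python sketch below estimates the GTX 670's expected shader throughput relative to the GTX 680. It assumes performance scales with SMX count times base clock and ignores boost clocks and memory bandwidth entirely; the base clocks (915MHz and 1006MHz) come from the cards' published specifications rather than from this page, and the helper name is hypothetical.

```python
# SMX counts and base clocks; boost behavior is deliberately ignored.
GTX_680 = {"smx": 8, "base_mhz": 1006}
GTX_670 = {"smx": 7, "base_mhz": 915}

def relative_throughput(card, reference):
    """Naive shader throughput of `card` as a fraction of `reference`."""
    return (card["smx"] * card["base_mhz"]) / (reference["smx"] * reference["base_mhz"])

expected = relative_throughput(GTX_670, GTX_680)
observed = 0.42  # fluid simulation result quoted above
print(f"expected: {expected:.1%} of GTX 680")
print(f"observed: {observed:.0%} of GTX 680")
print(f"observed / expected: {observed / expected:.1%}")
```

Under those assumptions the GTX 670 should land at roughly 80% of the GTX 680, so a 42% result is only a little over half of what simple scaling predicts.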

Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for GK104. Folding@Home and similar initiatives are still one of the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.

Whenever NVIDIA sends over a benchmark you can expect they have good reason to, and this is certainly the case for Folding@Home. GK104 is still a slouch given its resources compared to GF110, but at least it can match or surpass the GTX 580. At 970 nanoseconds per day the GTX 670 can tie the GTX 580, while the GTX 680 can pull ahead by 6%. Interestingly, this benchmark appears to be far more constrained by clockspeed than by the number of shaders, as the EVGA GTX 670SC outperforms the GTX 680 thanks to its 1188MHz boost clock, which it manages to stick to the entire time.

414 Comments

  • CeriseCogburn - Sunday, May 13, 2012 - link

    Are you going to put up with crashing amd drivers and a burning electric bill OC with added instability and a water tower cost and then all of a sudden save a miniscule bit on card cost ? Are you going to add to your suffering with no adaptive v-sync, no also added smooth frame rate target, no instant per game optimum settings from a massive nVidia server farm embedded automagically in the superior nVidia drivers ?

    Are you going to stand for no bezel peek feature ?
    Are you going to put up with the more expensive and hassled 3 monitor connection issues of the amd cards ?
    Are you going to sit there undisturbed by the epic failure of amd 3D gaming vs Nvidia's available and awesome implementation ?
    Are you going to put up with no amd 120hz monitor support there too ?

    Isn't your original stance there the very opposite of "no one buys these cards to run on just one monitor and certainly not 1900x1200" argument ?

    Since the amd overlcocks "so well" as you claim vs nVidia, what is amd releasing a pre overclocked version going to do other than allow amd partners to charge more ?
    ROFL - it will do nothing.
  • saturn85 - Monday, May 14, 2012 - link

    the folding@home benchmark is great!!
    i think the performance unit "point per day (ppd)" is preferable compare to "nanosecond per day (ns/day)".
  • TheMan876 - Tuesday, May 15, 2012 - link

    Glad to see 3 monitor resolutions getting benchmarked since I just moved to that setup. Can't wait to see SLI on this card!
  • Death666Angel - Thursday, May 17, 2012 - link

    Prices for the GTX 670 and the HD 7970 are similar in Germany, at max a difference of about 30€. :-)
    If I had to buy a card today, I'd probably get a GTX 680, but I don't regret the 500€ I spent on a 7970 with a watercooling block and OC capabilities of 1300/1700. :-)
  • Brainling - Thursday, May 24, 2012 - link

    I had been waiting patiently for the release of the 670 or the 660ti, depending on availability, cost and performance. After reading this review of the 670, I bought one on the spot (release day morning, while Newegg still had some)....it was a good decision.

    This card replaced an HD6870, and while that was a decent card, it's like night and day. In informal tests I did, I found this card to be twice as powerful in most scenarios. Nvidia has really outdone themselves with their new Kepler architecture. They've created one of the most powerful hyper-parallel architectures available to do, and have done so at greatly decreased power draw and heat (aka: less noise). It's rare to ever see my 670 spike above 60C, with the stock blower cooler.

    All in all a great purchase, and one I'm very glad I made.
  • smartypnt4 - Sunday, May 27, 2012 - link

    I know they're on the site in other reviews, but it would be nice if you could include a few dual-GPU cards in the benchmark comparisons. It probably only matters to a few people like me, but it'd be nice to have.

    For me, I want them because I'm trying to make a decision: do I get a second 6950 to crossfire with the one I already have for $200, or do I go out and buy a new card?

    From what I've seen, outside the edge case games such as Batman and some of the games running on Frostbite, a 6990 pretty much trades blows with the 680 and the 7970. So, I'm thinking that for me, since I have the headroom in my PSU, getting a second 6950 makes a whole lot of sense, even though the setup will consume almost twice as much power as one new card.

    Just my two cents.
  • codeus - Monday, June 4, 2012 - link

    Good review but so much focus on EVGA's warranty changes smacks of this being a sponsored (and therefore biased?) review.
  • pilotofdoom - Monday, June 11, 2012 - link

    Anyone else notice that the GTX 670 outperformed the GTX 680 in the Microsoft’s Detail Tessellation test on Normal settings?

    I'm guessing it's a simple mistake, since there's no mention of the reversal in the text. Not like it really matters anyways, being a synthetic benchmark compared to actual gaming performance.
  • chrisrobhay2 - Friday, June 29, 2012 - link

    Which leader does Anandtech use for the Civilization V Compute test? I'm just curious because my overclocked GTX 670 wipes the floor with all of these cards in almost all of the leader tests, so I want to make sure that I'm looking at the right information.
  • warmbit - Tuesday, July 17, 2012 - link

    If you want to see what we really have GTX670 performance in games is worth taking a look at this overview:

    http://warmbit.blogspot.com/2012/05/analiza-wyniko...

    On the right side, select your language for translation (Google Translate).
