Compute Performance

Shifting gears, as always our final set of real-world benchmarks is a look at compute performance. As we have seen with GTX 680 and GTX 670, GK104 appears to be significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Cache and register file pressure in particular seem to give GK104 grief, which means that GK104 can still do well in certain scenarios, but falls well short in others. For GTX 660 Ti in particular, this is going to be a battle between the importance of shader performance – something it has just as much of as the GTX 670 – and cache/memory pressure from losing that ROP cluster and cache.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.

In Civilization V, memory bandwidth and cache are clearly more important than raw compute performance. Although this isn’t a worst-case outcome for the GTX 660 Ti, it drops substantially from the GTX 670. As a result its compute performance is barely better than that of the GTX 560 Ti, which wasn’t a strong compute performer in the first place.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

Ray tracing likes memory bandwidth and cache, which means another tough run for the GTX 660 Ti. In fact it’s now slower than the GTX 560 Ti. Compared to the 7950 this isn’t even a contest. GK104 is generally bad at compute, and GTX 660 Ti is turning out to be especially bad.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL routine that AES encrypts and decrypts an 8K x 8K pixel square image file. The result of this benchmark is the average time to encrypt the image over a number of iterations of the AES cipher.
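To illustrate what "average time over a number of iterations" means in practice, here is a minimal timing sketch in Python. It is not the AESEncryptDecrypt code: the names `average_seconds` and `xor_pass` are hypothetical, and the workload is a simple XOR pass standing in for the AES kernel purely so the example runs without extra libraries.

```python
import time


def average_seconds(work, iterations, clock=time.perf_counter):
    """Run `work` `iterations` times and return the mean wall time per run."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    total = 0.0
    for _ in range(iterations):
        start = clock()
        work()
        total += clock() - start
    return total / iterations


def xor_pass(data, key=0x5A):
    """Stand-in workload (NOT AES): XOR every byte with a fixed key."""
    return bytes(b ^ key for b in data)
```

A call such as `average_seconds(lambda: xor_pass(bytes(1 << 16)), iterations=10)` returns the mean seconds per pass. Averaging over several iterations smooths out one-off costs such as the first run's setup, which is presumably why the benchmark reports a mean rather than a single run.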

The GTX 660 Ti does finally turn things around on our AES benchmark, thanks to the fact that this benchmark generally favors NVIDIA hardware. At the same time the gap between the GTX 670 and GTX 660 Ti is virtually non-existent.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
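For readers unfamiliar with the technique, the sketch below shows the general shape of an O(n^2) neighbor pass and a tiled variant of it. It is plain Python, not the DirectX SDK sample's shader code; the function names and the 2D neighbor-count workload are assumptions chosen for illustration. The tile staged in the outer loop plays the role that shared memory plays in a compute shader: a block of particle data is loaded once and then reused by every particle that has to be compared against it.

```python
def neighbor_counts_naive(points, radius):
    """For each particle, count particles within `radius` (itself included)."""
    r2 = radius * radius
    counts = []
    for xi, yi in points:
        count = 0
        for xj, yj in points:
            dx, dy = xi - xj, yi - yj
            if dx * dx + dy * dy < r2:
                count += 1
        counts.append(count)
    return counts


def neighbor_counts_tiled(points, radius, tile_size=4):
    """Same O(n^2) result, but the inner loop walks one staged tile at a time."""
    r2 = radius * radius
    counts = [0] * len(points)
    for start in range(0, len(points), tile_size):
        # Stage a tile of particles once; a compute shader would copy this
        # tile into shared memory so every thread in the group can reuse it.
        tile = points[start:start + tile_size]
        for i, (xi, yi) in enumerate(points):
            for xj, yj in tile:
                dx, dy = xi - xj, yi - yj
                if dx * dx + dy * dy < r2:
                    counts[i] += 1
    return counts
```

Both functions do the same number of distance checks. The tiled form only changes the order in which data is touched, which is why a workload like this tends to depend on on-chip storage and shader throughput more than on external memory bandwidth.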

The compute shader fluid simulation provides the GTX 660 Ti another bit of reprieve, although like other GK104 cards it’s still relatively weak. Here it’s virtually tied with the GTX 670 so it’s clear that it isn’t being impacted by cache or memory bandwidth losses, but it needs about 10% more to catch the 7950.

Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for GK104. Folding@Home and similar initiatives are still one of the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.

Interestingly, Folding@Home proves to be rather insensitive to the differences between the GTX 670 and GTX 660 Ti, which is not what we would have expected. The GTX 660 Ti isn’t doing all that much better than the GTX 570, once more reflecting that GK104 is generally struggling with compute performance, but it’s not a bad result.


313 Comments


  • CeriseCogburn - Saturday, August 25, 2012 - link

    The 660Ti has a bios SUPER roxxor feature...in the MSI version.. ROFL !! hahaha
    http://www.techpowerup.com/reviews/MSI/GTX_660_Ti_...

    It seems that MSI has added some secret sauce, no other board partner has, to their card's BIOS. One indicator of this is that they raised the card's default power limit from 130 W to 175 W, which will certainly help in many situations.
    The card essentially uses the same power as other cards, but is faster - leading to improved performance per Watt.
    Overclocking works great as well and reaches the highest real-life performance, despite not reaching the lowest GPU clock. This is certainly an interesting development. We will, hopefully, see more board partners pick up this change.
    ROFL HAHAHAAHAAAAAAAAAAA
    So this is the one you want now Galidou.
    " Pros: This thing is pretty amazing. Tried running Skyrim on Ultra, 2k textures, and 14 other visual mods. With this card, I ran it all with no lagg at all, with a temp under 67. Love it. "
    http://www.newegg.com/Product/Product.aspx?Item=N8...
  • Galidou - Tuesday, September 4, 2012 - link

    Gigabyte did the same, the board power is up to 180 watts if you tweak it and still both overclocked(my wife's gigabyte 660 ti OC and my 7950 sapphire 7950 OC) the 7950 wins hands down at 3 monitor resolution.

    How can you still trying to explain things when the only side of the medal you can speak of is Nvidia. Sorry, I see the good of both while you can't say a good thing about AMD. Both of my computer uses intel overclocked sandy bridge/ivy bridge K cpus, I'm no AMD fan but I can recognize I did the right thing and I did my research and having BOTH freaking cards in HANDS and testing them side by side with my 3570k @ 4,6ghz.

    My 7950 wins @ 3 monitors in skyrim EASILY, you can't say anything to that because you ain't got both cards in hands. Geez, will you freaking understand some day. And no I ain't got any freaking problem with my drivers... And I paid the 7950 the same price than the gtx 660 ti. EXACT same price. 319$ before taxes.

    Geez it's complicated when arguing with you because you ain't open to any opinions/facts other than: AMD IS CRAP, NVIDIA WINS EVERYTHING, AMD IS CRAP, NVIDIA WINS EVERYTHING, HERE'S MY LINK TO A WEBSITE THAT SHOWS THE 660TI WINNING AGAINST A 7970 AT EVERYTHING EVEN 6 MONITORS LOOK LOOK LOOK.
  • TheJian - Friday, August 24, 2012 - link

    I was speaking to their finances. If you see in one of my other posts, I believed they deserved 20bil from Intel, but courts screwed them. That is part of what I meant. They deserved their profits and more. Tough to get profits when Intel is stealing them basically by blocking your products at every end.

    No comment was directed at "dumb" employees. I said it was hard to overcome, not easy. Also that they had the crown for 3 years and weren't allowed to get just desserts. I'm sorry you didn't get that from the posts. I like AMD. I just fear they're on their last financial leg. I've owned their stock 4 times over the last 10 years. There doesn't look like there will be a 5th is all I'm saying. I speak from a stock/company financial position sometimes since I've bought both and follow their income statements. I'm sure they're all great people that work there, no comment on them (besides management's mishandling of Dirk Meyer, ATI overpurchase).
  • felipetga - Thursday, August 16, 2012 - link

    I have been holding to upgrade my GTX 460 256bits. I wonder if this card will be bottlenecked by my C2Q 9550 @ 3.6ghz....
  • dishayu - Thursday, August 16, 2012 - link

    It won't. You need to SLI/CF 2 top end cards for the processor to be a bottleneck.
  • tipoo - Thursday, August 16, 2012 - link

    Only on some games, but the majority aren't as CPU intensive as they are GPU intensive, so it would still be a nice upgrade for you.
  • Jamahl - Thursday, August 16, 2012 - link

    Do you realise that the majority of 660 Ti's being benchmarked at other techsites are overclocked vs the stock Radeons?
  • Biorganic - Thursday, August 16, 2012 - link

    Exactly this. Anyone who follows these respective cards, 7950:670, 7970:680 etc knows that the AMD alternatives have excellent overclocking potential. All these reviews are comparing high clocked GTX vs stock or very conservatively boosted AMD cards. I can get my 7950 to 1000 MHz on stock voltage. That will destroy this toy they call a TI. Sorry but the results seem a bit biased.
  • Ryan Smith - Thursday, August 16, 2012 - link

    "Sorry but the results seem a bit biased."

    Just so we're clear, are you talking about our article, or articles on other sites?

    If it's the former, in case you've missed it, we are explicitly testing a reference clocked GTX 660 Ti in the form of Zotac's card at reference clocks (this is hardware identical to their official reference clocked model).
  • mwildtech - Thursday, August 16, 2012 - link

    Biased?? This guy is an idiot. Anandtech is the least biased tech site on the interwebs. Ryan - awesome review! keep up the good work.
