Compute Performance

Shifting gears, as always our final set of real-world benchmarks is a look at compute performance. As we have seen with the GTX 680 and GTX 670, GK104 appears to be significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Cache and register file pressure in particular seem to give GK104 grief, which means that GK104 can still do well in certain scenarios but falls well short in others. For the GTX 660 Ti in particular, this is going to be a battle between the importance of shader performance, something it has just as much of as the GTX 670, and cache/memory pressure from losing a ROP cluster and its associated cache.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.

In Civilization V, memory bandwidth and cache are clearly more important than raw compute performance. Although this isn’t a worst-case outcome for the GTX 660 Ti, its performance drops substantially from the GTX 670. As a result its compute performance is barely better than that of the GTX 560 Ti, which wasn’t a strong performer at compute in the first place.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

Ray tracing likes memory bandwidth and cache, which means another tough run for the GTX 660 Ti. In fact it’s now slower than the GTX 560 Ti. Compared to the 7950 this isn’t even a contest. GK104 is generally bad at compute, and GTX 660 Ti is turning out to be especially bad.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL routine that AES encrypts and decrypts an 8K x 8K pixel square image file. The result of this benchmark is the average time to encrypt the image over a number of iterations of the AES cipher.
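
To make the "average time over a number of iterations" measurement concrete, here is a minimal sketch in Python. It is not the AESEncryptDecrypt code: the `xor_buffer` workload is a hypothetical stand-in for the AES kernel, and the function names, buffer size, and iteration count are all assumptions chosen for illustration.

```python
import time


def xor_buffer(buf, key):
    """Hypothetical stand-in workload (NOT AES): XOR every byte with a key."""
    return bytes(b ^ key for b in buf)


def average_seconds(work, iterations=10):
    """Return the mean wall-clock time of calling work() `iterations` times."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    total = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        work()
        total += time.perf_counter() - start
    return total / iterations


if __name__ == "__main__":
    # A small buffer keeps the sketch quick; the real benchmark uses an
    # 8K x 8K image and runs on the GPU through OpenCL.
    image = bytes(256 * 256)
    mean = average_seconds(lambda: xor_buffer(image, 0x5A), iterations=5)
    print(f"average time per pass: {mean:.6f} s")
```

Averaging over several passes smooths out run-to-run noise, which is presumably why the benchmark reports a mean rather than a single timing.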

The GTX 660 Ti does finally turn things around on our AES benchmark, thanks to the fact that this benchmark generally favors NVIDIA. At the same time the gap between the GTX 670 and GTX 660 Ti is virtually non-existent.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
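
As a rough illustration of what an O(n^2) nearest neighbor pass with tiling looks like, here is a CPU-side sketch in Python. It is not the DirectX SDK sample: the function name, the 2D points, and the tile size are assumptions, and the `tile` slice only mimics the block of particle data that a GPU thread group would load into shared memory once and then reuse.

```python
def count_neighbors_tiled(points, radius, tile_size=64):
    """Brute-force O(n^2) neighbor count, processed one tile at a time.

    points is a list of (x, y) tuples. For every point we count how many
    other points lie closer than `radius`.
    """
    n = len(points)
    r2 = radius * radius
    counts = [0] * n
    for tile_start in range(0, n, tile_size):
        # On a GPU, this tile would be copied into shared memory once and
        # then read by every thread in the group instead of hitting DRAM.
        tile = points[tile_start:tile_start + tile_size]
        for i, (xi, yi) in enumerate(points):
            for j, (xj, yj) in enumerate(tile, start=tile_start):
                if i == j:
                    continue  # a particle is not its own neighbor
                dx, dy = xi - xj, yi - yj
                if dx * dx + dy * dy < r2:
                    counts[i] += 1
    return counts


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
    print(count_neighbors_tiled(pts, radius=1.5, tile_size=2))
```

Every particle is still compared against every other particle, so the work stays quadratic; tiling only changes where the data is read from. That would be consistent with a test that leans on shader throughput and on-chip storage more than on memory bandwidth.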

The compute shader fluid simulation provides the GTX 660 Ti another bit of reprieve, although like other GK104 cards it’s still relatively weak. Here it’s virtually tied with the GTX 670 so it’s clear that it isn’t being impacted by cache or memory bandwidth losses, but it needs about 10% more to catch the 7950.

Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for GK104. Folding@Home and similar initiatives are still one of the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.

Interestingly, Folding@Home proves to be rather insensitive to the differences between the GTX 670 and GTX 660 Ti, which is not what we would have expected. The GTX 660 Ti isn’t doing all that much better than the GTX 570, once more reflecting that GK104 is generally struggling with compute performance, but it’s not a bad result.

313 Comments

  • Oxford Guy - Thursday, August 16, 2012 - link

    What is with the 285 being included? It's not even a DX 11 card.

    Where is the 480? Why is the 570 included instead of the 580?

    Where is the 680?
  • Ryan Smith - Saturday, August 18, 2012 - link

    The 285 was included because I wanted to quickly throw in a GTX 285 card where applicable, since NVIDIA is promoting the GTX 660 Ti as a GTX 200 series upgrade. Basically there was no harm in including it where we could.

    As for the 480, it's equivalent to the 570 in performance (eerily so), so there's never a need to break it out separately.

    And the 680 is in Bench. It didn't make much sense to include a card $200 more expensive which would just compress the results among the $300 cards.
  • CeriseCogburn - Sunday, August 19, 2012 - link

    So you're saying the 680 is way faster than the 7970 which you included in every chart, since the 7970 won't compress those $300 card results.
    Thanks for admitting that the 7970 is so much slower.
  • Pixelpusher6 - Friday, August 17, 2012 - link

    Thanks Ryan. Great review as always.

    I know one of the differentiating factors for the Radeon 7950s is the 3GB of ram but I was curious are there any current games which will max out 2GB of RAM with high resolution, AA, etc.?

    I think it's interesting how similar AMD's and Nvidia's GPUs are this generation. I believe Nvidia will be releasing the GTX 660 non Ti based on GK106. Leaked specs seem to be similar to this card but the texture units will be reduced to 64. I wonder how much of a performance reduction this will account for. I think it will be hard for Nvidia to get the same type of performance / $ as say GTX 460 / 560 Ti this generation because of having to have GK104 fill in more market segments.

    Also I wasn't aware that Nvidia was still having trouble meeting demand with GK104 chips; I thought those issues were all cleared up. I think when AMD released their 7000 series chips they should have taken advantage of being first to market and been more competitive on price to grow market share rather than increase margins. At that time someone sitting on 8800GT era hardware would be hard pressed to upgrade knowing that AMD's inflated prices would come down once Nvidia brought their GPUs to market. People who hold on to their cards for a number of years are unlikely to upgrade 6 months later to Nvidia's product. If AMD cards were priced lower at this time a lot more people would have bought them, thereby beating Nvidia before they even have a card to market. I do give some credit to AMD for preparing for this launch and adjusting prices, but in my opinion this should have been done much earlier. AMD management needs to be more aggressive and catch Nvidia off guard, rather than just reacting to whatever they do. I would "preemptively" strike at the GTX 660 non Ti by lowering prices on the 7850 to $199. Instead it seems they'll follow the trend and keep it at $240-250 right up until the launch of the GTX 660 then lower it to $199.
  • Ryan Smith - Saturday, August 18, 2012 - link

    Pixelpusher, there are no games we test that max out 2GB of VRAM out of the box. 3GB may one day prove to be advantageous, but right now even at multi-monitor resolutions 2GB is doing the job (since we're seeing these cards run out of compute/render performance before they run out of RAM).
  • Sudarshan_SMD - Friday, August 17, 2012 - link

    Where are naked images of the card?
  • CeriseCogburn - Thursday, August 23, 2012 - link

    You don't undress somebody you don't love.
  • dalearyous - Friday, August 17, 2012 - link

    it seems the biggest disappointment i see in comments is the price point.

    but if this card comes bundled with borderlands 2, and you were already planning on buying borderlands 2 then this puts the card at $240, worth it IMO.
  • rarson - Friday, August 17, 2012 - link

    but it's the middle of freaking August. While Tahiti was unfortunately clocked a bit lower than it probably should have been, and AMD took a bit too long to bring out the GE edition cards, Nvidia is now practically 8 months behind AMD, having only just released a $300 card. (In the 8 months that have gone by since the release of the 7950, its price has dropped from $450 to $320, effectively making it a competitor to the 660 Ti. AMD is able to compete on price with a better-performing card by virtue of the fact that it simply took Nvidia too damn long to get their product to market.) By the time the bottom end appears, AMD will be ready for Canary Islands.

    It's bad enough that Kepler (and Fermi, for that matter) was so late and so not available for several months, but it's taking forever to simply roll out the lower-tier products (and yes, I know 28nm wafers have been in short supply, but that's partially due to Nvidia's crappy Kepler yields... AMD have not had such supply problems). Can you imagine what would have happened if Nvidia actually tried to release GK110 as a consumer card? We'd have NOTHING. Hot, unmanufacturable nothing.

    Nvidia needs to get their shit together. At the rate they're going, they'll have to skip an entire generation just to get back on track. I liked the 680 because it was a good performer, but that doesn't do consumers any good when it's 4 months late to the party and almost completely unavailable. Perhaps by the end of the year, 28nm will have matured enough and Nvidia will be able to design something that yields decently while still offering the competitiveness that the 680 brought us, because what I'd really like to see is both companies releasing good cards at the same time. Thanks to Fermi and Kepler, that hasn't happened for a while now. Us consumers benefit from healthy competition and Nvidia has been screwing that up for everyone. Get it together, Nvidia!
  • CeriseCogburn - Sunday, August 19, 2012 - link

    So as any wacko fanboy does, you fault nVidia for releasing a card later that drives the very top end tier amd cards down from the 579+ shipping I paid to $170 less plus 3 free games.
    Yeah buddy, it's all nVidia's fault, and they need to get their act together, and if they do in fact get their act together, you can buy the very top amd card for $150, because that's likely all it will be worth.
    Good to know it's all nVidia's fault. AMD from $579+ plus shipping to $409 and 3 free games and nVidia sucks for not having its act together.
    The FDA as well as the EPA should ban the koolaid you're drinking.
