Compute Performance

As always, we wrap up our real-world benchmarks with a look at compute performance. As we saw with the GTX 680 and GTX 670, Kepler is significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Compounding this, GK106 has only 5 SMXes versus GK104's 8, which will likely depress compute performance even further.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game's leader scenes. Note that this is a DX11 DirectCompute benchmark.
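For readers curious about the mechanics, a throughput benchmark of this sort boils down to decompressing the same payload over and over and reporting an average rate. The sketch below is a purely illustrative CPU stand-in of our own written in Python, with zlib standing in for the game's GPU texture codec; Civ V's actual test runs as a DirectCompute shader, and the sizes and iteration counts here are arbitrary.

    import os
    import time
    import zlib

    # 4 MiB of "texture" data, pre-compressed once up front
    payload = zlib.compress(os.urandom(4 * 1024 * 1024))

    iterations = 50
    start = time.perf_counter()
    for _ in range(iterations):
        raw = zlib.decompress(payload)  # the operation under test
    elapsed = time.perf_counter() - start

    # report average decompression throughput across all iterations
    mb = len(raw) * iterations / (1024 * 1024)
    print(f"{mb / elapsed:.1f} MB/s average over {iterations} iterations")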

It’s interesting then that despite the obvious difference between the GTX 660 and GTX 660 Ti in theoretical compute performance, the GTX 660 actually beats the GTX 660 Ti here. Despite being a compute benchmark, Civilization V’s texture decompression benchmark is more sensitive to memory bandwidth and cache performance than it is to shader performance, giving us the results we see above. Given the GTX 660 Ti’s poor showing in this benchmark, this is a good thing for NVIDIA, since it means the GTX 660 doesn’t fall any farther behind. Still, the GTX 660 is effectively tied with the 7850 and well behind the 7870.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.
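To give a feel for why ray tracing leans so heavily on raw shader throughput, here is a toy Python/NumPy sketch (our own illustration, far simpler than LuxRender's actual kernels) of the ray/sphere intersection test a path tracer evaluates millions of times per image:

    import numpy as np

    rays = 1_000_000
    origin = np.zeros(3, dtype=np.float32)

    # a batch of random unit-length ray directions
    dirs = np.random.randn(rays, 3).astype(np.float32)
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)

    center = np.array([0.0, 0.0, 5.0], dtype=np.float32)  # sphere center
    radius = 1.0

    # ray/sphere quadratic: t^2 + 2(d.oc)t + (oc.oc - r^2) = 0
    # a ray hits iff the discriminant (d.oc)^2 - (oc.oc - r^2) >= 0
    oc = origin - center
    b = dirs @ oc
    c = oc @ oc - radius * radius
    hits = (b * b - c) >= 0

    print(f"{hits.mean():.4f} of rays hit the sphere")

Nearly all of the work here is floating-point arithmetic on values that fit in registers, so a real path tracer scales with shader throughput rather than memory bandwidth, which is exactly where GK106 gives up the most ground.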

SmallLuxGPU sees us shift towards an emphasis on pure compute performance, which of course is going to be the GTX 660’s weak point. Over 2 years after the launch of the GTX 460, SLG performance has gone exactly nowhere, with the GTX 460 and GTX 660 turning in exactly the same scores. Thank goodness the 8800GT is terrible at this benchmark, otherwise the GTX 660 would be in particularly bad shape.

It goes without saying that with the GTX 660’s poor compute performance here, the 7800 series is well in the lead. The 7870 more than trebles the GTX 660’s performance, an indisputable victory if there ever was one.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL AES encryption routine that AES encrypts/decrypts an 8K x 8K pixel square image file. The results of this benchmark are the average time to encrypt the image over a number of iterations of the AES cipher.
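As a back-of-the-envelope illustration of what's being timed, the Python sketch below runs AES over an 8K x 8K, 4-bytes-per-pixel buffer on the CPU with pycryptodome and averages across iterations. This is only a stand-in of our own; the benchmark proper runs the cipher as an OpenCL kernel on the GPU, and the key, pixel format, and iteration count here are assumptions for illustration.

    import os
    import time
    from Crypto.Cipher import AES  # pip install pycryptodome

    key = os.urandom(16)
    image = os.urandom(8192 * 8192 * 4)  # 8K x 8K pixels, 4 bytes per pixel

    iterations = 5
    total = 0.0
    for _ in range(iterations):
        # ECB encrypts each 16-byte block independently, which is what
        # makes AES embarrassingly parallel and a good fit for GPUs
        cipher = AES.new(key, AES.MODE_ECB)
        start = time.perf_counter()
        cipher.encrypt(image)
        total += time.perf_counter() - start

    print(f"average encrypt time: {total / iterations * 1000:.0f} ms")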

Our AES benchmark was one of the few compute benchmarks where the GTX 660 Ti had any kind of lead, but the significant loss of compute resources has erased that for the GTX 660. At 395ms it’s a hair slower than the 7850, never mind the 7870.

For our next benchmark we’re looking at the fluid simulation sample from the DirectX SDK, a DirectCompute benchmark that simulates the motion and interactions of a 16k particle fluid using a compute shader.
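The NumPy sketch below is a rough stand-in of our own for the kind of work a single O(n²) simulation step performs; the real benchmark runs as a compute shader and tiles the pairwise pass through on-chip shared memory so that each position is fetched from DRAM only once. The particle count and kernel here are simplified assumptions.

    import numpy as np

    n = 4096                 # particle count (scaled down from 16k)
    radius = 0.05            # smoothing radius
    pos = np.random.rand(n, 2).astype(np.float32)

    # brute-force pairwise distances: the O(n^2) part of the algorithm
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # simple SPH-style density kernel: nearby particles contribute more
    density = ((radius - dist).clip(min=0) ** 2).sum(axis=1)
    print(density[:4])

Because every particle reads every other particle's position, each position is touched n times; how well those reads stay on-chip is why this test stresses cache and bandwidth as much as raw ALU throughput.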

The fluid simulation is another benchmark that depends on a stronger mix of memory bandwidth and cache performance rather than being purely dependent on compute resources. As a result the GTX 660 still trails the GTX 660 Ti, but not by a great amount. Even so, the GTX 660 is no match for the 7800 series.

Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for Kepler. Folding@Home and similar initiatives remain among the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.

As we’ve seen previously with GK104, this is one of the few compute benchmarks that shows any kind of significant performance advantage for Little Kepler compared to Little Fermi. GTX 660 drops by 12% compared to GTX 660 Ti, but this is still good enough for a 60% performance advantage over GTX 460.

Comments

  • yeeeeman - Saturday, September 15, 2012 - link

    Really, G80 was a revolution on its own. It was a spectacular jump in performance compared to the previous generation, and combined with the 65nm process technology it gave birth to some of the finest video cards.
    The real setback here is that the gaming industry is driven by the lowest common denominator, and we all know that consoles are the most important. They are sold in the largest quantities, and most games are designed for their power, not higher.
    For PCs, games receive a DX11 treatment, with some fancy features that enhance the quality a little bit, but it can never make up for the fact that the textures and the game are designed for a much slower platform.
    So given these facts, why change my 9600GT, when it can handle pretty much everything?
  • steelnewfie - Saturday, September 15, 2012 - link

    "For the 2GB GTX 660, NVIDIA has outfit the card with 8 2Gb memory modules"

    Should read outfitted.

    Also 8 2Gb memory modules? Did you mean 2GB? Either is incorrect by my math.

    If there are 8 banks, should not each module be 256 MB?

    Otherwise, great articles, keep up the good work!
  • Ryan Smith - Saturday, September 15, 2012 - link

    Individual memory modules are labeled by their capacity in bits, not bytes. So each module is 2 gigabits (Gb), which is 256MB. 8x2Gb is how the card ends up with 2 gigabytes (GB) of RAM.
  • MrBubbles - Saturday, September 15, 2012 - link

    Cool, I have a GTX 260 and since NVidia is deliberately breaking their driver support for games like Civ 5 I guess this is the card to get.
  • saturn85 - Saturday, September 15, 2012 - link

    nice folding@home benchmark.
  • JWill97 - Thursday, September 27, 2012 - link

    For me, I really think it's the best card you can buy at this price. I'm not a fan of either NVidia or AMD (neutral), but really, in the $200+ segment nvidia takes it. But I'm still wondering why reviewers aren't using Max Payne 3 as one of the game benchmarks? A lot of cards would struggle playing it.
  • Grawbad - Friday, March 1, 2013 - link

    "NVIDIA has spent a lot of time in the past couple of years worrying about the 8800GT/9800GT in particular. “The only card that matters” was a massive hit for the company straight up through 2010, which has made it difficult to get users to upgrade even 4 years later."

    I am one of those. I purchased a 9800 GTX and that sucker runs everything. Mind you, all my other components were quality too, so I didn't bottleneck myself. But this card has run everything I have ever thrown at it. Only recently have I had to start watching the AA a bit. Which is why I am now, 5 years later, in the market for a new card. 5 Years.

    Indeed, those cards were astounding.

    Mine was an EVGA 9800 GTX with a lifetime warranty. Thank goodness for that as it finally went out on me this year and I had to RMA it. And now that I am looking into getting a new card it seems EVGA has dropped their lifetime warranty. That makes me sad.

    Anyways, yeah, those were and still are great cards. I mean, if you picked up a 9800 GTX today, you would be able to run even the newest games. Albeit you'll need to turn down AA and such, but you can still get GREAT graphics out of most anything even today.
