Compute Performance

As always, our final set of real-world benchmarks looks at compute performance. As we have seen with GTX 680 and GTX 670, Kepler appears to be significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Further compounding this is the fact that GK106 only has 5 SMXes versus the 8 SMXes of GK104, which will likely further depress compute performance.
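
To put the SMX gap in rough numbers, here is a minimal sketch of the theoretical throughput ratio. It assumes Kepler's 192 CUDA cores per SMX and 2 FLOPs per core per clock (one fused multiply-add); the 1000MHz clock is a hypothetical placeholder applied to both chips, not a real card's clock speed.

```python
# Rough theoretical single-precision throughput for a Kepler GPU.
# Assumptions (not from the article): 192 CUDA cores per SMX,
# 2 FLOPs per core per clock, and a placeholder clock speed.
CORES_PER_SMX = 192
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_gflops(smx_count, clock_mhz):
    """Return peak GFLOPS for a given SMX count and core clock in MHz."""
    flops_per_clock = smx_count * CORES_PER_SMX * FLOPS_PER_CORE_PER_CLOCK
    return flops_per_clock * clock_mhz / 1000.0


clock_mhz = 1000.0                    # hypothetical, same for both chips
gk106 = peak_gflops(5, clock_mhz)     # 5 SMXes
gk104 = peak_gflops(8, clock_mhz)     # 8 SMXes
ratio = gk106 / gk104                 # fraction of GK104's peak at equal clocks

print(f"GK106: {gk106:.0f} GFLOPS, GK104: {gk104:.0f} GFLOPS, ratio: {ratio:.3f}")
```

At equal clocks the ratio depends only on the SMX counts, so a full GK106 has 5/8 of a full GK104's theoretical shader throughput. Shipping cards differ in clock speeds and enabled SMXes, so real ratios will vary.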

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.

It’s interesting then that despite the obvious difference between the GTX 660 and GTX 660 Ti in theoretical compute performance, the GTX 660 actually beats the GTX 660 Ti here. Despite being a compute benchmark, Civilization V’s texture decompression benchmark is more sensitive to memory bandwidth and cache performance than it is to shader performance, giving us the results we see above. Given the GTX 660 Ti’s poor showing in this benchmark, this is a good thing for NVIDIA, since it means they don’t fall any farther behind. Still, the GTX 660 is effectively tied with the 7850 and well behind the 7870.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

SmallLuxGPU sees us shift towards an emphasis on pure compute performance, which of course is going to be GTX 660’s weak point here. More than two years after the launch of the GTX 460, SLG performance has gone exactly nowhere, with the GTX 460 and GTX 660 turning in the exact same scores. Thank goodness the 8800GT is terrible at this benchmark, otherwise the GTX 660 would be in particularly bad shape.

It goes without saying that with the GTX 660’s poor compute performance here, the 7800 series is well in the lead. The 7870 more than trebles the GTX 660’s performance, an indisputable victory if there ever was one.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL AES encryption routine that AES encrypts/decrypts an 8K x 8K pixel square image file. The results of this benchmark are the average time to encrypt the image over a number of iterations of the AES cipher.

Our AES benchmark was one of the few compute benchmarks where the GTX 660 Ti had any kind of lead, but the significant loss of compute resources has erased that for the GTX 660. At 395ms it’s a hair slower than the 7850, never mind the 7870.

The fluid simulation is another benchmark that includes a stronger mix of memory bandwidth and cache rather than being purely dependent on compute resources. As a result the GTX 660 still trails the GTX 660 Ti, but not by a great amount. Even so, the GTX 660 is no match for the 7800 series.

Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for Kepler. Folding@Home and similar initiatives are still one of the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.

As we’ve seen previously with GK104, this is one of the few compute benchmarks that shows any kind of significant performance advantage for Little Kepler compared to Little Fermi. GTX 660 drops by 12% compared to GTX 660 Ti, but this is still good enough for a 60% performance advantage over GTX 460.
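
The two percentages above can be chained to see where the three cards sit relative to one another. This is a sketch of the arithmetic only: it normalizes the GTX 460 to 1.0 and assumes "drops by 12%" means the GTX 660 scores 88% of the GTX 660 Ti. The function name is illustrative.

```python
def relative_scores(drop_vs_ti=0.12, gain_vs_460=0.60):
    """Normalize Folding@Home results to GTX 460 = 1.0.

    drop_vs_ti:  GTX 660 deficit relative to GTX 660 Ti (0.12 = 12% slower)
    gain_vs_460: GTX 660 advantage relative to GTX 460 (0.60 = 60% faster)
    """
    gtx460 = 1.0
    gtx660 = gtx460 * (1.0 + gain_vs_460)
    # GTX 660 = (1 - drop) * GTX 660 Ti, so divide to recover the Ti's score.
    gtx660ti = gtx660 / (1.0 - drop_vs_ti)
    return {"GTX 460": gtx460, "GTX 660": gtx660, "GTX 660 Ti": gtx660ti}


scores = relative_scores()
for card, score in scores.items():
    print(f"{card}: {score:.2f}x")
```

Under that reading, the GTX 660 Ti works out to roughly 1.8x the GTX 460 in this benchmark.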


147 Comments

  • Zds - Saturday, September 15, 2012

    "Reference clock" is very different from "reference PCB". The operative words are "clock" and "PCB", not "reference".
  • Redshift_91 - Wednesday, September 19, 2012

    a superclocked card is not reference clocked, thus the keyword is "reference". Unless you're going to argue that a superclocked card is reference clocked and thus the very idea of overclocking is thrown out the window.
  • guidryp - Thursday, September 13, 2012

    "NVIDIA has spent a lot of time in the past couple of years worrying about the 8800GT/9800GT in particular"

    I am still using a 8800GT without much need to upgrade. I don't play any new games so I really can't justify an upgrade. Though of course you get that upgrade itch. So the first thing I wondered was, how much power/noise compared to my 8800GT (I have giant slow fan on mine).
  • Anonymous Blowhard - Thursday, September 13, 2012

    Now that the 600-series has gotten a firm foothold, older cards like the GTX460 have been available for around $100 if you're patient enough to wait for sales and rebates.

    Pick one based on the NV reference design if you're concerned about noise; I've had models from MSI and EVGA that both performed admirably in terms of noise and temperature. Blower-style fans can be extremely loud if you buy the wrong model (ZOTAC) so do your homework.

    I came from an 8800GT myself and didn't feel the need to upgrade, but there's a definite benefit even in "low end" games based on Source/UE3. The ability to crank up the details/AA and still hold a solid 60fps is wonderful. Well worth the money.
  • DanNeely - Thursday, September 13, 2012

    buying a 2+ generation old high end card is almost never a good idea. What you save upfront over an equivalent lower mid range card is quickly lost due to the significantly higher power draw.
  • rarson - Friday, September 14, 2012

    Huh? How expensive is electricity where you live? I can't imagine the power difference making up the cost difference in less than 2 years of constant use.

    I replaced my 3870 with a 6850 a few months ago, and it actually uses a bit less power at idle, which is where my GPU spends the bulk of its time, so I'm actually saving a tiny bit. Sure, the 460 uses more power under load, but the 8800GT uses significantly more power than the 460 during idle (about 20W!).
  • CeriseCogburn - Thursday, November 29, 2012

    If you're worried about 20 watts at idle, you're definitely an amd fanboy.
    Probably something else too I won't mention since humiliating yourself is already a public past time.
  • gamara - Thursday, June 6, 2013

    20W x 2 days is 1 KW hr. 15 KW hr a month, 180 KW hrs a year. At $.10 a KW hr, that's $18. In California, some places it runs almost triple that, so if you use So Cal Ed, and are in Tier 3 or 4, you pay almost $50 a year extra for those 20 watts.
  • guidryp - Friday, September 14, 2012

    I am patient enough to wait for the gtx 660 to get down to $150.

    If I do upgrade, one thing that is a must, is getting 3+ monitor capability.

    I currently drive my TV and desktop monitor, and would like a second desktop monitor.

    Here the power usage looks in line with the 8800GT and NVidia finally allows 3+ monitors.
  • raghu78 - Thursday, September 13, 2012

    GTX 660 is actually weak competition. Nvidia's pricing sucks . USD 200 would have really made it an amazing card. Performance wise its stuck between the HD 7850 and HD 7870 but pricing wise its nearer to HD 7870. the GTX 660 is up against a faster chip in the HD 7870. and needs a price correction . GTX 660 OC matches a HD 7870

    http://www.techpowerup.com/reviews/ASUS/GeForce_GT...

    Also anandtech's gaming suite is quite out of date. They are testing Portal 2 which is useless and don't have a single game released in 2012 like Alan Wake, Max Payne 3, Dirt Showdown, Sniper Elite V2, Diablo III, Sleeping Dogs. most sites have started including newer games . hardocp has included sleeping dogs. techpowerup has included alan wake, sniper elite v2, max payne 3, diablo III. techreport has max payne 3 and dirt showdown. And to state that GTX 660 is faster than HD 7870 or the better card with such an obsolete suite is ridiculous
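
The idle-power arithmetic raised in the comments (an extra 20W of constant draw, priced per kWh) can be checked with a short sketch. The $0.10 rate comes from the comment itself; the $0.30 rate is a hypothetical stand-in for the "almost triple" tier, and the function name is illustrative.

```python
def annual_cost_usd(extra_watts, price_per_kwh, hours_per_day=24.0, days=365):
    """Yearly cost in USD of drawing `extra_watts` for `hours_per_day` hours daily."""
    kwh_per_year = extra_watts * hours_per_day * days / 1000.0
    return kwh_per_year * price_per_kwh


# 20W running around the clock for a full year.
cost_low = annual_cost_usd(20, 0.10)    # rate quoted in the comment
cost_high = annual_cost_usd(20, 0.30)   # hypothetical tripled rate

print(f"${cost_low:.2f} per year at $0.10/kWh")
print(f"${cost_high:.2f} per year at $0.30/kWh")
```

A full 365-day year gives about 175 kWh, slightly under the comment's rounded 180 kWh, so the estimates of roughly $18 and roughly $50 per year hold up as approximations for a card that idles around the clock.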
