Compute

Shifting gears, as always our final set of benchmarks is a look at compute performance. As we saw with the GTX 680, GK104 appears to be significantly less balanced between rendering and compute performance than GF110 or GF114 were, and as a result compute performance suffers. Cache and register file pressure in particular seem to give GK104 grief, which means it can still do well in certain scenarios but falls well short in others.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.
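
Civ V’s shader isn’t public, so as a rough illustration of this class of workload, below is a minimal CUDA sketch of decompressing BC1 (DXT1) blocks, one common GPU texture compression format. To be clear, this is our own illustrative code with our own names, not the game’s HLSL DirectCompute shader, and it ignores BC1’s transparent mode for brevity.

```cuda
#include <cuda_runtime.h>
#include <cstdint>

// Expand a packed 5:6:5 color to RGBA8 (R in the low byte).
__device__ uint32_t expand565(uint16_t c) {
    uint32_t r = (c >> 11) & 0x1F, g = (c >> 5) & 0x3F, b = c & 0x1F;
    r = (r << 3) | (r >> 2);   // replicate high bits: 5 -> 8 bits
    g = (g << 2) | (g >> 4);   // 6 -> 8 bits
    b = (b << 3) | (b >> 2);
    return 0xFF000000u | (b << 16) | (g << 8) | r;
}

// Channel-wise weighted average of two RGBA8 colors.
__device__ uint32_t blend(uint32_t a, uint32_t b, uint32_t wa, uint32_t wb) {
    uint32_t r  = ((a & 0xFF) * wa + (b & 0xFF) * wb) / (wa + wb);
    uint32_t g  = (((a >> 8) & 0xFF) * wa + ((b >> 8) & 0xFF) * wb) / (wa + wb);
    uint32_t bl = (((a >> 16) & 0xFF) * wa + ((b >> 16) & 0xFF) * wb) / (wa + wb);
    return 0xFF000000u | (bl << 16) | (g << 8) | r;
}

// One thread decodes one 8-byte BC1 block into a 4x4 patch of RGBA8 texels.
// Opaque mode only; real decoders also handle the c0 <= c1 transparent mode.
__global__ void decodeBC1(const uint64_t* blocks, uint32_t* out,
                          int blocksWide, int blocksHigh) {
    int bx = blockIdx.x * blockDim.x + threadIdx.x;
    int by = blockIdx.y * blockDim.y + threadIdx.y;
    if (bx >= blocksWide || by >= blocksHigh) return;

    uint64_t blk = blocks[by * blocksWide + bx];
    uint32_t pal[4];
    pal[0] = expand565((uint16_t)(blk & 0xFFFF));          // endpoint color 0
    pal[1] = expand565((uint16_t)((blk >> 16) & 0xFFFF));  // endpoint color 1
    pal[2] = blend(pal[0], pal[1], 2, 1);                  // 2/3 c0 + 1/3 c1
    pal[3] = blend(pal[0], pal[1], 1, 2);                  // 1/3 c0 + 2/3 c1

    uint32_t idx = (uint32_t)(blk >> 32);  // 16 two-bit palette indices
    for (int t = 0; t < 16; ++t) {
        int px = bx * 4 + (t & 3), py = by * 4 + (t >> 2);
        out[py * blocksWide * 4 + px] = pal[(idx >> (2 * t)) & 3];  // row stride = image width
    }
}
```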

It’s quite shocking to see the GTX 670 do so well here. To be sure, it’s struggling relative to the Radeon HD 7900 series and the GTX 500 series, but it only trails the GTX 680 by 4%. This is a test where the GTX 670’s reduced shader performance should cause the gap between the two cards to open up, but clearly that is not the case. Perhaps we’ve been underestimating the memory bandwidth needs of this test? If so, AMD’s significant memory bandwidth advantage certainly helps cement the 7970’s lead.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.
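
SmallLuxGPU’s real kernels are OpenCL and considerably more sophisticated, but at the heart of any GPU ray tracer is a per-ray intersection test executed across millions of rays in parallel. Here is a toy CUDA sketch of that primitive; the names and structure are entirely our own, not SmallLuxGPU’s code.

```cuda
#include <cuda_runtime.h>
#include <math.h>

struct Ray { float3 o, d; };  // origin and normalized direction

// Classic ray-sphere hit test via the quadratic formula (simplified
// because |d| = 1). Returns the nearest positive hit distance in t.
__device__ bool hitSphere(const Ray& r, float3 c, float radius, float& t) {
    float3 oc = make_float3(r.o.x - c.x, r.o.y - c.y, r.o.z - c.z);
    float b = oc.x * r.d.x + oc.y * r.d.y + oc.z * r.d.z;             // dot(oc, d)
    float cterm = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - radius * radius;
    float disc = b * b - cterm;        // quarter discriminant
    if (disc < 0.0f) return false;     // ray misses the sphere entirely
    t = -b - sqrtf(disc);              // nearer of the two roots
    return t > 0.0f;                   // hit must be in front of the origin
}

// One thread per ray: test every ray against a single sphere.
__global__ void traceRays(const Ray* rays, float* hitT, int n,
                          float3 center, float radius) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float t;
    hitT[i] = hitSphere(rays[i], center, radius, t) ? t : -1.0f;  // -1 = miss
}
```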

SmallLuxGPU, on the other hand, finally shows us the larger gap we’ve been expecting between the GTX 670 and GTX 680. The GTX 680’s greater number of SMXes and higher clockspeed push the GTX 670 10% behind, where it performs worse than the GTX 570 or even the GTX 470. More so than any other test, this is the one that drives home the point that GK104 isn’t a strong compute GPU, while AMD offers nothing short of incredible compute performance.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL sample that AES encrypts/decrypts an 8K x 8K pixel square image file. The result of this benchmark is the average time to encrypt the image over a number of iterations of the AES cipher.
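
The methodology is simple: run the kernel many times and report the average. AMD’s sample does its timing through OpenCL; as a rough CUDA analogue of the same measurement loop (our own sketch with hypothetical names, not the sample’s code):

```cuda
#include <cuda_runtime.h>

// Average a kernel's wall time over many iterations using CUDA events.
// The AES sample reports this kind of per-iteration average, though it
// measures through OpenCL's profiling API rather than CUDA.
float averageKernelMs(void (*launchKernel)(), int iterations) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    launchKernel();            // warm-up launch, excluded from the average
    cudaDeviceSynchronize();

    cudaEventRecord(start);
    for (int i = 0; i < iterations; ++i)
        launchKernel();        // e.g. one full encrypt pass over the image
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float totalMs = 0.0f;
    cudaEventElapsedTime(&totalMs, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return totalMs / iterations;
}
```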

Once again the GTX 670 has a weak showing here, although not as bad as with SmallLuxGPU. It still falls behind the GTX 570, but at least it manages to beat the 7950. Clockspeeds help, as showcased by the EVGA GTX 670SC, but nothing really makes up for the missing SMX.

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
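
That shared memory optimization is the classic tiling pattern: each thread block stages a chunk of particle positions in fast on-chip memory so the O(n^2) inner loop doesn’t hammer DRAM. The sample itself is an HLSL compute shader; the sketch below re-creates the idea in CUDA with our own names and a simplified density calculation.

```cuda
#include <cuda_runtime.h>

#define TILE 256  // must match the launch's blockDim.x

// All-pairs O(n^2) density pass with shared memory tiling. Each block
// cooperatively loads TILE particle positions, every thread tests its
// own particle against the cached tile, then the block moves on.
__global__ void allPairsDensity(const float4* pos, float* density,
                                int n, float h2) {
    __shared__ float4 tile[TILE];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    float4 pi = (i < n) ? pos[i] : make_float4(0.f, 0.f, 0.f, 0.f);
    float sum = 0.0f;

    for (int base = 0; base < n; base += TILE) {
        int j = base + threadIdx.x;
        tile[threadIdx.x] = (j < n) ? pos[j] : make_float4(0.f, 0.f, 0.f, 0.f);
        __syncthreads();  // tile fully loaded before anyone reads it

        for (int k = 0; k < TILE && base + k < n; ++k) {
            float dx = pi.x - tile[k].x;
            float dy = pi.y - tile[k].y;
            float dz = pi.z - tile[k].z;
            float r2 = dx * dx + dy * dy + dz * dz;
            if (r2 < h2)  // poly6-style smoothing kernel, constants omitted
                sum += (h2 - r2) * (h2 - r2) * (h2 - r2);
        }
        __syncthreads();  // done with this tile before it's overwritten
    }
    if (i < n) density[i] = sum;
}
```

Without the staging step, every pair test would be a global memory load, which is precisely the kind of cache pressure that gives GK104 grief.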

For reasons we’ve yet to determine, this benchmark strongly dislikes the GTX 670 in particular. There doesn’t seem to be a performance regression in NVIDIA’s drivers, and TDP differences can’t account for a gap this large; it simply struggles on the GTX 670. As a result the GTX 670 only hits 42% of the GTX 680’s performance, which is well below what it should theoretically be getting. Barring some kind of esoteric reaction between this program and the unbalanced GPC, a driver issue is still the most likely culprit, but it looks to only affect the GTX 670.

Finally, we’re adding one last benchmark to our compute run. NVIDIA and the Folding@Home group have sent over a benchmarkable version of the client with preliminary optimizations for GK104. Folding@Home and similar initiatives are still among the most popular consumer compute workloads, so it’s something NVIDIA wants their GPUs to do well at.

Whenever NVIDIA sends over a benchmark you can expect they have good reason to, and this is certainly the case for Folding@Home. GK104 is still a slouch given its resources compared to GF110, but at least it can match or surpass the GTX 580. At 970 nanoseconds per day the GTX 670 ties the GTX 580, while the GTX 680 pulls ahead by 6%. Interestingly, this benchmark appears to be far more constrained by clockspeed than by shader count, as the EVGA GTX 670SC outperforms the GTX 680 thanks to its 1188MHz boost clock, which it manages to hold the entire time.

Comments

  • SlyNine - Saturday, May 12, 2012 - link

    No, the 5870 was replaced by the 6970. The 5870 was faster than the 6870.

    The wall was coming: from the 9700 Pro, which needed a power adapter, to video cards that need 2 power adapters and take up 2 slots. That was how they got those 2x and even 4x increases. The 9700 Pro was as much as 6x faster than a 4600 at times.

    But like I said, this wall was coming, and from now on expect all performance improvements to come from architecture and node improvements.
  • CeriseCogburn - Sunday, May 13, 2012 - link

    My text > " 4890-5870-6970 ???? "

    It was a typo earlier, dippy do.

    The 9700 Pro was not 6X faster than a 4600 ever, period - once again we have your spew and nothing else. But below we have the near EQUAL benchmarks.

    http://www.anandtech.com/show/947/20

    http://www.anandtech.com/show/947/22

    6X, 4X, 2X your rear end... another gigantic lie.

    Congrats on lies so big - hey at least your insane amd fanboy imagination and brainwashing of endless lies is being exposed.

    Keep up the good work.
  • Iketh - Thursday, May 10, 2012 - link

    do you listen to yourself? you're just as bad as wreckage....

    you have never and will never run a corporation
  • CeriseCogburn - Thursday, May 10, 2012 - link

    How can I disagree, as obviously you are another internet blogger CEO - one of the many thousands we now have online with corporate business school degrees, endlessly babbling about profits without a single price or cost for a single component of a single video card discussed under your belts.
    It's amazing how many of you tell us who can cut prices and remain profitable - when none of you have even the tiniest inkling of the cost of any component whatsoever, let alone the price it's sold at by nVidia, or amd for that matter.
    I'm glad so many of you are astute and learned CEO mind masters, though.
  • chizow - Thursday, May 10, 2012 - link

    You really don't need to be an internet blogger or CEO, you don't even need a business degree although it certainly wouldn't hurt (especially in accounting).

    Just a rudimentary understanding of financial statements and you can easily understand Nvidia's business model, then see when and why they are most successful financially by looking at the market landscape and what products were selling and for how much.

    I can tell you right now, Nvidia was at its most profitable during G80 and G92's run of success (6 straight quarters of record profits that have been unmatched since), so we know for a fact what kind of revenues, margins and ASPs for components they can succeed with by looking at historical data.
  • CeriseCogburn - Friday, May 11, 2012 - link

    The G92s were the most wide-ranging selection of core hacks, bit widths, memory configs, etc., released in an enormous number of different card versions - while this release is a flagship-only tier thus far - so they don't relate at all.
    So no, you're stuck in the know-exactly-nothing spot I claimed you are in, no matter what you spew about former releases.
    Worse than that, nVidia's profit came from chipset sales and high end cards then - and getting information to show the G80/G92/G92b/G94 etc. profitability by itself will cost you a lot of money buying industry information.
    So you know nothing again, and tried to use a false equivalency.
    Thanks for trying though, and I certainly won't say you should change your personal stance on pricing of the "mid tier" 680; on the other hand, I don't see you making a reasonable historical pricing/performance analysis against current release prices - you haven't done that, and I've been reading all of your comments of course, and otherwise often agree with you.
    As I've said, the GTX580 was this year $499 - the 7970 released and 2.5 months later we're supposed to see the 580 killer not just at $499, but at $299 as the semi-accurate rumors and purported and unbelievable "insider anonymous information" rumors told us - that $299, since it was so unbelievable if examined at all, has become $399, or maybe $449, or $420, whatever the moaner wants it to be...
    I frankly don't buy any of it - and for good reason - this 680 came in as it did because it's a new core and they stripped it down for power/perf and that's that - and they drove amd pricing down.
    Now they're driving it down further.
    If the 680 had hit at $299 like everyone claimed it would (bouncing off Charlie D's less than honest cranium and falling back on unquoted and anonymous "industry wide" claimed rumors, or a single nVidia slide, or posted trash prediction charts proven to be incorrect), then where would the 670 be priced now? $250?
    I suggest the performance increase, along with the massive driver improvement bundle and keeping within the 300 watt power requirements, means that there is nowhere else to go right now.
    The "secret" "held back" performance is nowhere - the rumored card not here yet is a compute monster - so goodbye power/perf win and the giant PR advantage not to mention the vast body of amd fanboys standing on that alone - something nVidia NEVER planned to lead with this time - the big Kepler.
    It's not that nVidia outperformed itself, it's that their secrecy outperformed all the minds of the rabble - and all that's left is complainers who aren't getting something for nothing or something for half price as they hoped.
  • chizow - Thursday, May 10, 2012 - link

    I don't need to run a corporation to understand good and bad business. The fact there are *OUTRAGED* GTX 680 buyers who feel *CHEATED* after seeing the GTX 670 price:performance drives the point home.

    Nvidia really needs to be careful here as they've successfully upset their high-end target market on two fronts:

    1) High-end enthusiasts like myself who are upset they decided to follow AMD's lackluster price:performance curve and market a clearly mid-range ASIC (GK104) as a high-end SKU (GTX 670, 680, 690) and charge high-end premiums for it.

    2) High-end enthusiasts who actually felt the GTX 680 was worthy of its premium price tag, paid the $500 asking price and often, more to get them. Only to see that premium completely eroded by a card that performs within a few % points, yet costs 20% less and is readily available on the market.

    Talk about losing insane value overnight, you don't need to run a business to understand the kind of anger and angst that can cause.
  • CeriseCogburn - Friday, May 11, 2012 - link

    Well, the $$ BURN $$ is still less than the $$ BURN $$ the amd flagship cost - $130+, and that's on the very same card, not a lower, shader-cut version that needs to be overclocked.
    So as far as angry dollar burning goes, yeah - except amd has done worse in dollar costs than nVidia, and with the same card.
    Nice to know; hopefully your theory has a lot of strong teeth, then the high end buyers can hold back and drive the price down...
    (seems a dream, doesn't it)
  • CeriseCogburn - Friday, May 11, 2012 - link

    Let's not forget there, rage guy, that the 7970 burn of $130+ bucks just turned into a $180 or $200 burn.

    Yet, CURRENTLY, all GTX680 owners can unload for upwards of $500... LOL

    Not so for 7970 owners; they are already perma-burned.

    I guess you just didn't think it through; it was more important to share a falsity and rage against nVidia.
    Nice try, you've failed.
  • chizow - Sunday, May 13, 2012 - link

    Yes I've said from Day 1 the 7970 was horribly overpriced; it was just an extension of the 40nm price:performance curve 18 months after the fact.

    But that doesn't completely let Nvidia off the hook since they obviously used AMD's weak offering as the launching point to use a mid-range ASIC as their high-end SKU.

    End result is the consumer gets the SMALLEST increase in performance for their money in the last decade of GPUs. I don't understand why this is so hard for you to understand. Look at the benchmarks, do the math and have a seat.
