Compute and Tessellation

Moving on from our look at gaming performance, we have our customary look at compute performance, bundled with a look at theoretical tessellation performance. Unlike our gaming benchmarks where NVIDIA’s architectural enhancements could have an impact, everything here should be dictated by the core clock and SMs, with shader and polymorph engine counts defining most of these tests.
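As a rough guide to what "dictated by the core clock and SMs" implies, the expected gap between the two cards can be sketched from their SM counts and core clocks. The figures below are the commonly published launch specifications (15 SMs at 700MHz for the GTX 480, 16 SMs at 772MHz for the GTX 580); they are not stated on this page, so treat them as assumptions, and treat the result as a ceiling rather than a prediction.

```python
# Launch specs as commonly published; NOT taken from this page (assumption).
GTX_480 = {"sms": 15, "core_mhz": 700}
GTX_580 = {"sms": 16, "core_mhz": 772}


def theoretical_scaling(new, old):
    """Ratio of (SM count x core clock) between two cards.

    This is a simple throughput model: it ignores memory bandwidth,
    drivers, and any architectural changes.
    """
    return (new["sms"] * new["core_mhz"]) / (old["sms"] * old["core_mhz"])


ratio = theoretical_scaling(GTX_580, GTX_480)
print(f"Theoretical uplift: {(ratio - 1) * 100:.1f}%")
```

Under those assumptions the ceiling works out to a little under 18%, so a measured gain of around 15% in a compute-bound test would mean the card is realizing most of its theoretical improvement.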

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes.

We previously discovered that NVIDIA did rather well in this test, so it shouldn’t come as a surprise that the GTX 580 does even better. Even without the benefits of architectural improvements, the GTX 580 still ends up pulling ahead of the GTX 480 by 15%. The GTX 580 also does well against the 5970 here, which does see a boost from CrossFire but ultimately falls short, showcasing why multi-GPU cards can be inconsistent at times.

Our second compute benchmark is Cyberlink’s MediaEspresso 6, the latest version of their GPU-accelerated video encoding suite. MediaEspresso 6 doesn’t currently utilize a common API, and instead has codepaths for both AMD’s APP (née Stream) and NVIDIA’s CUDA APIs, which gives us a chance to test each API with a common program bridging them. As we’ll see this doesn’t necessarily mean that MediaEspresso behaves similarly on both AMD and NVIDIA GPUs, but for MediaEspresso users it is what it is.

We include MediaEspresso 6 largely to show that not everything that’s GPU-accelerated is GPU-bound, and ME6 illustrates this nicely. Once we move away from sub-$150 GPUs, APIs and architecture become much more important than raw speed. The 580 is unable to differentiate itself from the 480 as a result.
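One way to see why a faster GPU can fail to show up in a result like this is Amdahl's law: only the portion of the runtime that is actually GPU-bound benefits from a faster GPU. The sketch below is illustrative only. The GPU-bound fractions are hypothetical values, not measurements of MediaEspresso, and the 1.15x figure simply reuses the roughly 15% gain seen in the GPU-bound tests.

```python
def overall_speedup(gpu_fraction, gpu_speedup):
    """Amdahl's law: only the GPU-bound fraction of runtime gets faster.

    gpu_fraction: share of total runtime spent waiting on the GPU (0.0-1.0)
    gpu_speedup:  how much faster the new GPU runs that share (1.15 = 15%)
    """
    return 1.0 / ((1.0 - gpu_fraction) + gpu_fraction / gpu_speedup)


# Hypothetical GPU-bound fractions, chosen only to show the trend.
for gpu_fraction in (0.1, 0.5, 1.0):
    gain = overall_speedup(gpu_fraction, 1.15) - 1.0
    print(f"GPU-bound fraction {gpu_fraction:.0%}: {gain:.1%} faster overall")
```

When only a small share of the encode time is spent on the GPU, a 15% faster GPU yields an overall gain that is easily lost in run-to-run variation.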

Our third GPU compute benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. While it’s still in beta, SmallLuxGPU recently hit a milestone by implementing a complete ray tracing engine in OpenCL, allowing them to fully offload the process to the GPU. It’s this ray tracing engine we’re testing.

SmallLuxGPU is rather straightforward in its requirements: compute and lots of it. The GTX 580 attains most of its theoretical performance improvement here, coming in at a bit over 15% over the GTX 480. It does get bested by a couple of AMD’s GPUs however, a showcase of where AMD’s theoretical performance advantage in compute isn’t so theoretical.

Our final compute benchmark is a Folding @ Home benchmark. Given NVIDIA’s focus on compute for Fermi and in particular GF110 and GF100, cards such as the GTX 580 can be particularly interesting for distributed computing enthusiasts, who are usually looking for the fastest card in the coolest package. This benchmark is from the original GTX 480 launch, so this is likely the last time we’ll use it.

If I said the GTX 580 was 15% faster, would anyone be shocked? So long as we’re not CPU bound it seems, the GTX 580 is 15% faster through all of our compute benchmarks. This coupled with the GTX 580’s cooler/quieter design should make the card a very big deal for distributed computing enthusiasts.

At the other end of the spectrum from GPU computing performance is GPU tessellation performance, used exclusively for graphical purposes. Here we’re interested in things from a theoretical architectural perspective, using the Unigine Heaven benchmark and Microsoft’s DirectX 11 Detail Tessellation sample program to measure the tessellation performance of a few of our cards.

NVIDIA likes to heavily promote their tessellation performance advantage over AMD’s Cypress and Barts architectures, as it’s by far the single biggest difference between them and AMD. Not surprisingly the GTX 400/500 series does well here, and between those cards the GTX 580 enjoys a 15% advantage in the DX11 tessellation sample, while Heaven is a bit higher at 18% since Heaven is a full engine that can take advantage of the architectural improvements in GF110.

Seeing as how NVIDIA and AMD are still fighting over the importance of tessellation, both among developers and in public, these numbers shouldn’t be used as long-range guidance. NVIDIA clearly has an advantage; getting developers to use additional tessellation in a meaningful manner is another matter entirely.


159 Comments


  • Haydyn323 - Tuesday, November 09, 2010 - link

    Nobody seems to be taking into account the fact that the 580 is a PREMIUM level card. It is not meant to be compared to a 6870. Sure 2x 6870s can do more. This card is not, however, geared for that category of buyer.

    It is geared for the enthusiast who intends to buy 2 or 3 580s and completely dominate benchmarks and get 100+ fps in every situation. Your typical gamer will not likely buy a 580, but your insane gamer will likely buy 2 or 3 to play their 2560x1600 monitor at 60fps all the time.

    I fail to see how AMD is destroying anything here. Cost per speed AMD wins, but speed possible, Nvidia clearly wins for the time being. If anyone can come up with something faster than 3x 580s in the AMD camp feel free to post it in response here.
  • TemplarGR - Tuesday, November 09, 2010 - link

    Do you own NVIDIA stock, or are you a fanboy? Because really, only one of the two could not see how AMD destroys NVIDIA. AMD's architecture is much more efficient.

    How many "insane gamers" exist, that would pay 1200 or 1800 dollars just for gpus, and adding to that an insanely expensive PSU, tower and mainboard needed to support such a thing? And play what? Console ports? On what screens? Maximum resolution is still 2560x1600 and even a single 6870 could do fine in most games in it...

    And just because there may be about 100 rich kids in the whole world with no lives who could create such a machine, does it make 580 a success?

    Do YOU intend to create such a beast? Or would you buy a mainstream NVIDIA card, just because the possibility of 3x 580s exists? Come on...
  • Haydyn323 - Tuesday, November 09, 2010 - link

    So, the answer is no; you cannot come up with something faster. Also, as shown right here on Anandtech:

    http://www.anandtech.com/show/3987/amds-radeon-687...

    A single 6870 cannot play most modern games at anywhere near 60fps at 2560x1600. Even the 580 needs to be SLI'd to guarantee it.

    That is all.
  • Haydyn323 - Tuesday, November 09, 2010 - link

    Oh and yes I do intend to buy a couple of them in a few months. One at first and add another later. I also love when fanboys call other fanboys, "fanboys." It doesn't get anyone anywhere.
  • smookyolo - Tuesday, November 09, 2010 - link

    PC games are not simply console ports, the fact that you need a top of the line PC to get even close to 60 FPS in most cases at not even maximum graphics settings is proof of this.

    PC "ports" of console games have been tweaked and souped up to have much better graphics, and can take advantage of current gen hardware, instead of the ancient hardware in consoles.

    The "next gen" consoles will, of course, be worse than PCs of the time.

    And game companies will continue to alter their games so that they look better on PCs.

    It's a fact, live with it.
  • mapesdhs - Tuesday, November 09, 2010 - link


    'How many "insane gamers" exist, that would pay 1200 or 1800 dollars just for gpus, ...'

    Actually the market for this is surprisingly strong in some areas, especially
    CA I was told. I suspect it's a bit like other components such as top-spec
    hard drives and high-end CPUs: the volumes are smaller but the margins
    are significantly higher for the seller.

    Some sellers even take a loss on low-end items just to retain the custom,
    making their money on more expensive models.

    Ian.
  • QuagmireLXIX - Sunday, November 14, 2010 - link

    "Maximum resolution is still 2560x1600 and even a single 6870 could do fine in most games in it..."

    Multiple monitors (surround, eyefinity) resolutions get much larger.
  • 7Enigma - Tuesday, November 09, 2010 - link

    Just to clarify your incorrect (or misleading) statement 2 6870's in CF use significantly more power than a single 580, but also perform significantly better in most games (minimum frame rate issue noted however).
  • TemplarGR - Tuesday, November 09, 2010 - link

    True. I made a mistake on this one. Only in idle power it consumes slightly less. My bad.
  • cjb110 - Tuesday, November 09, 2010 - link

    "The thermal pads connecting the memory to the shroud have once again wiped out the chip markets", wow powerful adhesive that! Bet Intel's pissed.
