Compute and Tessellation

Moving on from gaming performance, we have our customary look at compute performance, bundled with theoretical tessellation performance. Unlike our gaming benchmarks, where NVIDIA’s architectural enhancements could have an impact, everything here should be dictated by the core clock and SM count, with shader and polymorph engine counts defining most of these tests.
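As a rough sketch of that expectation, the theoretical gain can be estimated as unit count multiplied by clock speed. The unit counts and clocks below are the publicly listed specifications for the two cards rather than figures from this page, so treat them as assumptions; the helper names are purely illustrative.

```python
# Back-of-the-envelope estimate of GTX 580 vs. GTX 480 theoretical scaling.
# ASSUMPTION: unit counts and core clocks are the cards' published specs,
# not numbers taken from this page.
GTX_480 = {"cuda_cores": 480, "polymorph_engines": 15, "core_clock_mhz": 700}
GTX_580 = {"cuda_cores": 512, "polymorph_engines": 16, "core_clock_mhz": 772}


def theoretical_throughput(card: dict, unit_key: str) -> float:
    """Relative throughput: number of units multiplied by clock speed."""
    return card[unit_key] * card["core_clock_mhz"]


def theoretical_gain(new: dict, old: dict, unit_key: str) -> float:
    """Fractional advantage of `new` over `old` for the given unit type."""
    return theoretical_throughput(new, unit_key) / theoretical_throughput(old, unit_key) - 1.0


shader_gain = theoretical_gain(GTX_580, GTX_480, "cuda_cores")
geometry_gain = theoretical_gain(GTX_580, GTX_480, "polymorph_engines")

print(f"Shader-bound gain:   {shader_gain:.1%}")
print(f"Geometry-bound gain: {geometry_gain:.1%}")
```

Both ratios work out the same because one SM carries 32 CUDA cores and one polymorph engine, so enabling the 16th SM scales shaders and geometry together. Under these assumptions the ceiling is a little under 18%, which is the yardstick for the roughly 15% gains measured in the tests that follow.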

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes.

We previously discovered that NVIDIA did rather well in this test, so it shouldn’t come as a surprise that the GTX 580 does even better. Even without the benefits of architectural improvements, the GTX 580 still ends up pulling ahead of the GTX 480 by 15%. The GTX 580 also does well against the 5970 here, which does see a boost from CrossFire but ultimately falls short, showcasing why multi-GPU cards can be inconsistent at times.

Our second compute benchmark is Cyberlink’s MediaEspresso 6, the latest version of their GPU-accelerated video encoding suite. MediaEspresso 6 doesn’t currently utilize a common API, and instead has codepaths for both AMD’s APP (née Stream) and NVIDIA’s CUDA APIs, which gives us a chance to test each API with a common program bridging them. As we’ll see, this doesn’t necessarily mean that MediaEspresso behaves similarly on both AMD and NVIDIA GPUs, but for MediaEspresso users it is what it is.

We include MediaEspresso 6 largely to showcase that not everything that’s GPU accelerated is GPU-bound. Once we move away from sub-$150 GPUs, APIs and architecture become much more important than raw speed. The 580 is unable to differentiate itself from the 480 as a result.
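One way to see why a faster GPU can fail to show up in a benchmark like this is Amdahl’s law: only the GPU-bound share of the runtime shrinks. The sketch below is illustrative only. The GPU-time fractions are hypothetical, not measurements of MediaEspresso, and the 1.15x figure simply mirrors the 580’s lead in the GPU-bound tests.

```python
def overall_speedup(gpu_fraction: float, gpu_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only the GPU-bound share of
    the runtime (gpu_fraction, between 0 and 1) gets faster by gpu_speedup."""
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be between 0 and 1")
    if gpu_speedup <= 0.0:
        raise ValueError("gpu_speedup must be positive")
    return 1.0 / ((1.0 - gpu_fraction) + gpu_fraction / gpu_speedup)


# HYPOTHETICAL workloads: the fractions are made up for illustration.
for label, fraction in [("mostly CPU-bound encode", 0.20),
                        ("mostly GPU-bound render", 0.95)]:
    gain = overall_speedup(fraction, 1.15) - 1.0
    print(f"{label}: {gain:.1%} faster overall")
```

With only a fifth of the runtime on the GPU, a 15% faster GPU yields under 3% end to end, which is easily lost in run-to-run variation.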

Our third GPU compute benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. While it’s still in beta, SmallLuxGPU recently hit a milestone by implementing a complete ray tracing engine in OpenCL, allowing them to fully offload the process to the GPU. It’s this ray tracing engine we’re testing.

SmallLuxGPU is rather straightforward in its requirements: compute and lots of it. The GTX 580 attains most of its theoretical performance improvement here, coming in a bit over 15% ahead of the GTX 480. It does get bested by a couple of AMD’s GPUs, however, showing that AMD’s theoretical performance advantage in compute isn’t always so theoretical.

Our final compute benchmark is a Folding@Home benchmark. Given NVIDIA’s focus on compute for Fermi and in particular GF110 and GF100, cards such as the GTX 580 can be particularly interesting for distributed computing enthusiasts, who are usually looking for the fastest card in the coolest package. This benchmark is from the original GTX 480 launch, so this is likely the last time we’ll use it.

If I said the GTX 580 was 15% faster, would anyone be shocked? So long as we’re not CPU bound, it seems, the GTX 580 is 15% faster through all of our compute benchmarks. This, coupled with the GTX 580’s cooler and quieter design, should make the card a very big deal for distributed computing enthusiasts.

At the other end of the spectrum from GPU computing performance is GPU tessellation performance, used exclusively for graphical purposes. Here we’re interested in things from a theoretical architectural perspective, using the Unigine Heaven benchmark and Microsoft’s DirectX 11 Detail Tessellation sample program to measure the tessellation performance of a few of our cards.

NVIDIA likes to heavily promote their tessellation performance advantage over AMD’s Cypress and Barts architectures, as it’s by far the single biggest difference between them and AMD. Not surprisingly the GTX 400/500 series does well here, and between those cards the GTX 580 enjoys a 15% advantage in the DX11 tessellation sample, while Heaven is a bit higher at 18% since Heaven is a full engine that can take advantage of the architectural improvements in GF110.

Seeing as NVIDIA and AMD are still fighting over the importance of tessellation, both among developers and in public, these numbers shouldn’t be used as long-range guidance. NVIDIA clearly has an advantage – getting developers to use additional tessellation in a meaningful manner is another matter entirely.


160 Comments

  • AnnonymousCoward - Wednesday, November 10, 2010 - link

    I'm with you, that AMD still has a superior performance per power design. But with the 580, nvidia took Fermi from being outrageous to competitive in that category, and even wins by a wide margin with idle power. Looking at the charts, the 580 also has a vastly superior cooling system to the 5970. Mad props to nvidia for turning things around.
  • FragKrag - Tuesday, November 9, 2010 - link

    Still no SC2? :(
  • Ryan Smith - Tuesday, November 9, 2010 - link

    Honestly, I ran out of time. I need to do a massive round of SC2 benchmarking this week, at which time it will be in all regular reviews and will be in Bench.
  • ph3412b07 - Tuesday, November 9, 2010 - link

    There is always some debate as to the value of single gpu solutions vs multi gpu. I've noticed that the avg/max framerate in multi gpu setups is in fact quite good in some cases, but the min fps paints a different picture, with nearly all setups and various games being plagued by micro-stutter. Has anybody else come across this as reason to go with a more expensive single card?
  • eXces - Tuesday, November 9, 2010 - link

    Why did u not include some overclocked 5970? Like u did with GTX 460 when u reviewed 6800 series?
  • Ryan Smith - Wednesday, November 10, 2010 - link

    If you don't recall from our 5970 review, we disqualified our 5970 when running at 5870 clocks. The VRMs on the 5970 cannot keep up with the power draw on some real world applications, so it does not pass our muster at those speeds by even the loosest interpretation.
  • 529th - Tuesday, November 9, 2010 - link

    I knew OCCT was a culprit of causing problems.
  • Ph0b0s - Tuesday, November 9, 2010 - link

    Was very interested to look at the review today to see how the new GTX580 and other DX11 card options are in comparison to my GTX 285 SLI setup. But unfortunately for the games I am playing BFBC2, Stalker etc and would base my decision on, I still don't know as my card is not represented. I know why, because they are DX11 games and my card is DX10, but my card still runs them and I would want to know how they compare even if one is running DX10 and the other running DX11. Even Anandtech's chart system gives no measure for my cards in these games. Please sort this out. Just because a card does not run the latest version of directx does not mean it should be forgotten. Especially since the people most likely to be looking at upgrading are those with this generation of card rather than people with DX 11 hardware...
  • mapesdhs - Wednesday, November 10, 2010 - link


    Don't worry, I'll have some useful info for you soon! 8800GT vs. 4890 vs. 460, in all
    three cases testing 1 & 2 cards. You should be able to easily extrapolate from the
    results to your GTX285 vs. 580 scenario. Send me an email (mapesdhs@yahoo.com)
    and I'll drop you a line when the results are up. Data for 8800 GT vs. 4890 is already up:

    http://www.sgidepot.co.uk/misc/pctests.html
    http://www.sgidepot.co.uk/misc/stalkercopbench.txt

    but I'm adding two more tests (Unigine and X3TC).

    Ian.
  • juampavalverde - Tuesday, November 9, 2010 - link

    NVIDIA exceed AMD with this... as long as the barts should have been 6770, this fermi slight improvement just in this universe can be called 5xx series. it is just the gf100 done right, and should have been named properly, as gtx 490.
