Compute and Tessellation

Moving on from our look at gaming performance, we have our customary look at compute performance, bundled with a look at theoretical tessellation performance. Unlike our gaming benchmarks where NVIDIA’s architectural enhancements could have an impact, everything here should be dictated by the core clock and SMs, with shader and polymorph engine counts defining most of these tests.
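As a rough illustration of that reasoning, the expected gain can be estimated as the ratio of SM count times core clock between the two cards. This is only a back-of-the-envelope sketch. The specifications below (16 SMs at 772MHz for the GTX 580, 15 SMs at 700MHz for the GTX 480) are the cards' reference specs and are not stated in this section, and the function name is our own. It also assumes perfect scaling with no memory or CPU bottlenecks.

```python
def theoretical_speedup(sms_new, clock_new_mhz, sms_old, clock_old_mhz):
    """Ratio of (SM count x core clock) between two GPUs of the same architecture family."""
    return (sms_new * clock_new_mhz) / (sms_old * clock_old_mhz)


# Assumed reference specifications (not given in this section of the article).
GTX_580 = {"sms": 16, "core_mhz": 772}
GTX_480 = {"sms": 15, "core_mhz": 700}

speedup = theoretical_speedup(
    GTX_580["sms"], GTX_580["core_mhz"],
    GTX_480["sms"], GTX_480["core_mhz"],
)

# Split the gain into its two contributing factors.
clock_gain = GTX_580["core_mhz"] / GTX_480["core_mhz"]
sm_gain = GTX_580["sms"] / GTX_480["sms"]

print(f"Clock speed alone: +{(clock_gain - 1) * 100:.1f}%")
print(f"SM count alone:    +{(sm_gain - 1) * 100:.1f}%")
print(f"Combined:          +{(speedup - 1) * 100:.1f}%")
```

Under these assumptions the combined ceiling works out to roughly 17 to 18%, so a measured gain of around 15% would mean a test is capturing most, but not all, of the theoretical improvement.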

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes.

We previously discovered that NVIDIA did rather well in this test, so it shouldn’t come as a surprise that the GTX 580 does even better. Even without the benefits of architectural improvements, the GTX 580 still ends up pulling ahead of the GTX 480 by 15%. The GTX 580 also does well against the 5970 here, which does see a boost from CrossFire but ultimately falls short, showcasing why multi-GPU cards can be inconsistent at times.

Our second compute benchmark is Cyberlink’s MediaEspresso 6, the latest version of their GPU-accelerated video encoding suite. MediaEspresso 6 doesn’t currently utilize a common API, and instead has codepaths for both AMD’s APP (née Stream) and NVIDIA’s CUDA APIs, which gives us a chance to test each API with a common program bridging them. As we’ll see, this doesn’t necessarily mean that MediaEspresso behaves similarly on both AMD and NVIDIA GPUs, but for MediaEspresso users it is what it is.

We include MediaEspresso 6 largely to showcase that not everything that’s GPU accelerated is GPU-bound. Once we move away from sub-$150 GPUs, APIs and architecture become much more important than raw speed, and as a result the 580 is unable to differentiate itself from the 480.

Our third GPU compute benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. While it’s still in beta, SmallLuxGPU recently hit a milestone by implementing a complete ray tracing engine in OpenCL, allowing them to fully offload the process to the GPU. It’s this ray tracing engine we’re testing.

SmallLuxGPU is rather straightforward in its requirements: compute and lots of it. The GTX 580 attains most of its theoretical performance improvement here, coming in a bit over 15% faster than the GTX 480. It does get bested by a couple of AMD’s GPUs however, a showcase of where AMD’s theoretical performance advantage in compute isn’t so theoretical.

Our final compute benchmark is a Folding @ Home benchmark. Given NVIDIA’s focus on compute for Fermi and in particular GF110 and GF100, cards such as the GTX 580 can be particularly interesting for distributed computing enthusiasts, who are usually looking for the fastest card in the coolest package. This benchmark is from the original GTX 480 launch, so this is likely the last time we’ll use it.

If I said the GTX 580 was 15% faster, would anyone be shocked? So long as we’re not CPU bound it seems, the GTX 580 is 15% faster through all of our compute benchmarks. This coupled with the GTX 580’s cooler/quieter design should make the card a very big deal for distributed computing enthusiasts.

At the other end of the spectrum from GPU computing performance is GPU tessellation performance, used exclusively for graphical purposes. Here we’re interested in things from a theoretical architectural perspective, using the Unigine Heaven benchmark and Microsoft’s DirectX 11 Detail Tessellation sample program to measure the tessellation performance of a few of our cards.

NVIDIA likes to heavily promote their tessellation performance advantage over AMD’s Cypress and Barts architectures, as it’s by far the single biggest difference between them and AMD. Not surprisingly the GTX 400/500 series does well here, and between those cards the GTX 580 enjoys a 15% advantage in the DX11 tessellation sample, while Heaven is a bit higher at 18% since Heaven is a full engine that can take advantage of the architectural improvements in GF110.

Seeing as how NVIDIA and AMD are still fighting about the importance of tessellation in the company of both developers and the public, these numbers shouldn’t be used as long-range guidance. NVIDIA clearly has an advantage; getting developers to use additional tessellation in a meaningful manner is another matter entirely.


159 Comments

  • nitrousoxide - Tuesday, November 09, 2010 - link

    Delaying is something good because it indicates that Cayman can be very big, very fast and...very hungry making it hard to build. What AMD needs is a card that can defeat GTX580, no matter how hot or power-hungry it is.
  • GeorgeH - Tuesday, November 09, 2010 - link

    Is there any word on a fully functional GF104?

    Nvidia could call it the 560, with 5="Not Gimped".
  • Sihastru - Tuesday, November 09, 2010 - link

    I guess once GTX470 goes EOL. If GTX460 had all it's shaders enabled then the overclocked versions would have canibalized GTX470 sales. Even so, it will happen on occasion.
  • tomoyo - Tuesday, November 09, 2010 - link

    My guess is there will be GTX 580 derivatives with less cores enabled as usual, probably a GTX 570 or something. And then an improved GTX 460 with all cores enabled as the GTX 560.
  • tomoyo - Tuesday, November 09, 2010 - link

    Good to see nvidia made a noticeable improvement over the overly hot and power hungry GTX 480. Unfortunately way above my power and silence needs, but competition is a good thing. Now I'm highly curious how close the Radeon 69xx will come in performance or if it can actually beat the GTX 580 in some cases.
    Of course the GTX 480 is completely obsolete now, more power, less speed, more noise, ugly to look at.
  • 7eki - Tuesday, November 09, 2010 - link

    What we got here today is a higher clocked, better cooled GTX 480 with a slightly better power consumption. All of that for only 80$ MORE ! Any first served version of non referent GTX 480 is equipped with a much better cooling solution that gives higher OC possibilites and could kick GTX 580's ass. If we compare GTX 480 to a GTX580 clock2clock we will get about 3% of a difference in performance. All thanks to 32 CUDA processors, and a few more TMU's. How come the reviewers are NOW able to find pros of something that they used to criticise 7 months ago ? Don't forget that AMD's about to break their Sweet Spot strategy just to cut your hypocrites tongues. I bet that 6990's going to be twice as fast as what we got here today . If we really got anything cause I can't really tell the difference.
  • AnnonymousCoward - Tuesday, November 09, 2010 - link

    32W less for 15% more performance, still on 40nm, is a big deal.
  • 7eki - Wednesday, November 10, 2010 - link

    32W and 15% you say ? No it isn't a big deal since AMD's Barts GPUs release. Have on mind that GTX580 still consumes more energy than a faster (in most cases) and one year older multi GPU HD5970. In that case even 60 would sound ridiculosly funny. It's not more than a few percent improvement over GTX480. If you don't believe it calculate how much longer will you have to play on your GTX580 just to get your ~$40 spent on power consumption compared to a GTX480 back. Not to mention (again) that a nonreferent GTX480 provides much better cooling solutions and OC possibilities. Nvidia's diggin their own grave. Just like they did by releasing GTX460. The only thing that's left for them right now is to trick the reviewers. But who cares. GTX 580 isn't going to make them sell more mainstream GPUs. It isn't nvidia whos cutting HD5970 prices right now. It was AMD by releasing HD6870/50 and announcing 6970. It should have been mentioned by all of you reviewers who treat the case seriously. Nvidia's a treacherous snake and the reviewers job is not to let such things happen.
  • Sihastru - Wednesday, November 10, 2010 - link

    Have you heard about the ASUS GTX580 Voltage Tweak edition that can be clocked up to 1100 MHz, that's more then 40% OC? Have you seen the EVGA GTX580 FTW yet?

    The fact that a single GPU card is in some cases faster then a dual GPU card built with two of the fastest competing GPU's tells a lot of good things about that single GPU card.

    This "nVidia in the Antichrist" speech is getting old. Repeating it all over the interwebs doesn't make it true.
  • AnnonymousCoward - Wednesday, November 10, 2010 - link

    I'm with you, that AMD still has a superior performance per power design. But with the 580, nvidia took Fermi from being outrageous to competitive in that category, and even wins by a wide margin with idle power. Looking at the charts, the 580 also has a vastly superior cooling system to the 5970. Mad props to nvidia for turning things around.
