Compute and Tessellation

Moving on from our look at gaming performance, we have our customary look at compute performance, bundled with a look at theoretical tessellation performance. Unlike our gaming benchmarks where NVIDIA’s architectural enhancements could have an impact, everything here should be dictated by the core clock and SMs, with shader and polymorph engine counts defining most of these tests.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes.

We previously discovered that NVIDIA did rather well in this test, so it shouldn’t come as a surprise that the GTX 580 does even better. Even without the benefits of architectural improvements, the GTX 580 still ends up pulling ahead of the GTX 480 by 15%. The GTX 580 also does well against the 5970 here, which does see a boost from CrossFire but ultimately falls short, showcasing why multi-GPU cards can be inconsistent at times.

Our second compute benchmark is Cyberlink’s MediaEspresso 6, the latest version of their GPU-accelerated video encoding suite. MediaEspresso 6 doesn’t currently utilize a common API, and instead has codepaths for both AMD’s APP (née Stream) and NVIDIA’s CUDA APIs, which gives us a chance to test each API with a common program bridging them. As we’ll see this doesn’t necessarily mean that MediaEspresso behaves similarly on both AMD and NVIDIA GPUs, but for MediaEspresso users it is what it is.

We throw MediaEspresso 6 in largely to showcase that not everything that’s GPU accelerated is GPU-bound, as ME6 showcases this nicely. Once we move away from sub-$150 GPUs, APIs and architecture become much more important than raw speed. The 580 is unable to differentiate itself from the 480 as a result.

Our third GPU compute benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. While it’s still in beta, SmallLuxGPU recently hit a milestone by implementing a complete ray tracing engine in OpenCL, allowing them to fully offload the process to the GPU. It’s this ray tracing engine we’re testing.

SmallLuxGPU is rather straightforward in its requirements: compute and lots of it. The GTX 580 attains most of its theoretical performance improvement here, coming in at a bit over 15% over the GTX 480. It does get bested by a couple of AMD’s GPUs however, a showcase of where AMD’s theoretical performance advantage in compute isn’t so theoretical.

Our final compute benchmark is a Folding @ Home benchmark. Given NVIDIA’s focus on compute for Fermi and in particular GF110 and GF100, cards such as the GTX 580 can be particularly interesting for distributed computing enthusiasts, who are usually looking for the fastest card in the coolest package. This benchmark is from the original GTX 480 launch, so this is likely the last time we’ll use it.

If I said the GTX 580 was 15% faster, would anyone be shocked? So long as we’re not CPU bound it seems, the GTX 580 is 15% faster through all of our compute benchmarks. This coupled with the GTX 580’s cooler/quieter design should make the card a very big deal for distributed computing enthusiasts.

At the other end of the spectrum from GPU computing performance is GPU tessellation performance, used exclusively for graphical purposes. Here we’re interesting in things from a theoretical architectural perspective, using the Unigine Heaven benchmark and Microsoft’s DirectX 11 Detail Tessellation sample program to measure the tessellation performance of a few of our cards.

NVIDIA likes to heavily promote their tessellation performance advantage over AMD’s Cypress and Barts architectures, as it’s by far the single biggest difference between them and AMD. Not surprisingly the GTX 400/500 series does well here, and between those cards the GTX 580 enjoys a 15% advantage in the DX11 tessellation sample, while Heaven is a bit higher at 18% since Heaven is a full engine that can take advantage of the architectural improvements in GF110.

Seeing as how NVIDIA and AMD are still fighting about the importance of tessellation in both the company of developers and the public, these numbers shouldn’t be used as long range guidance. NVIDIA clearly has an advantage – getting developers to use additional tessellation in a meaningful manner is another matter entirely.

Wolfenstein Power, Temperature, and Noise
POST A COMMENT

159 Comments

View All Comments

  • Iketh - Thursday, November 11, 2010 - link

    ok EVERYONE belonging to this thread is on CRACK... what other option did AMD have to name the 68xx? If they named them 67xx, the differences between them and 57xx are too great. They use nearly as little power as 57xx yet the performance is 1.5x or higher!!!

    im a sucker for EFFICIENCY... show me significant gains in efficiency and i'll bite, and this is what 68xx handily brings over 58xx

    the same argument goes for 480-580... AT, show us power/performance ratios between generations on each side, then everyone may begin to understand the naming

    i'm sorry to break it to everyone, but this is where the GPU race is now, in efficiency, where it's been for cpus for years
    Reply
  • MrCommunistGen - Tuesday, November 09, 2010 - link

    Just started reading the article and I noticed a couple of typos on p1.

    "But before we get to deep in to GF110" --> "but before we get TOO deep..."

    Also, the quote at the top of the page was placed inside of a paragraph which was confusing.
    I read: "Furthermore GTX 480 and GF100 were clearly not the" and I thought: "the what?". So I continued and read the quote, then realized that the paragraph continued below.
    Reply
  • MrCommunistGen - Tuesday, November 09, 2010 - link

    well I see that the paragraph break has already been fixed... Reply
  • ahar - Tuesday, November 09, 2010 - link

    Also, on page 2 if Ryan is talking about the lifecycle of one process then "...the processes’ lifecycle." is wrong. Reply
  • Aikouka - Tuesday, November 09, 2010 - link

    I noticed the remark on Bitstreaming and it seems like a logical choice *not* to include it with the 580. The biggest factor is that I don't think the large majority of people actually need/want it. While the 580 is certainly quieter than the 480, it's still relatively loud and extraneous noise is not something you want in a HTPC. It's also overkill for a HTPC, which would delegate the feature to people wanting to watch high-definition content on their PC through a receiver, which probably doesn't happen much.

    I'd assume the feature could've been "on the board" to add, but would've probably been at the bottom of the list and easily one of the first features to drop to either meet die size (and subsequently, TDP/Heat) targets or simply to hit their deadline. I certainly don't work for nVidia so it's really just pure speculation.
    Reply
  • therealnickdanger - Tuesday, November 09, 2010 - link

    I see your points as valid, but let me counterpoint with 3-D. I think NVIDIA dropped the ball here in the sense that there are two big reasons to have a computer connected to your home theater: games and Blu-ray. I know a few people that have 3-D HDTVs in their homes, but I don't know anyone with a 3-D HDTV and a 3-D monitor.

    I realize how niche this might be, but if the 580 supported bitstreaming, then it would be perfect card for anyone that wants to do it ALL. Blu-ray, 3-D Blu-Ray, any game at 1080p with all eye-candy, any 3-D game at 1080p with all eye-candy. But without bitstreaming, Blu-ray is moot (and mute, IMO).

    For a $500+ card, it's just a shame, that's all. All of AMD's high-end cards can do it.
    Reply
  • QuagmireLXIX - Sunday, November 14, 2010 - link

    Well said. There are quite a few fixes that make the 580 what I wanted in March, but the lack of bitstream is still a hard hit for what I want my PC to do.

    Call me niche.
    Reply
  • QuagmireLXIX - Sunday, November 14, 2010 - link

    Actually, this is killing me. I waited for the 480 in March b4 pulling the trigger on a 5870 because I wanted HDMI to a Denon 3808 and the 480 totally dropped the ball on the sound aspect (S/PDIF connector and limited channels and all). I figured no big deal, it is a gamer card after all, so 5870 HDMI I went.

    The thing is, my PC is all-in-one (HTPC, Game & typical use). The noise and temps are not a factor as I watercool. When I read that HDMI audio got internal on the 580, I thought, finally. Then I read Guru's article and seen bitstream was hardware supported and just a driver update away, I figured I was now back with the green team since 8800GT.

    Now Ryan (thanks for the truth, I guess :) counters Gurus bitstream comment and backs it up with direct communication with NV. This blows, I had a lofty multimonitor config in mind and no bitstream support is a huge hit. I'm not even sure if I should spend the time to find out if I can arrange the monitor setup I was thinking.

    Now I might just do a HTPC rig and Game rig or see what 6970 has coming. Eyefinity has an advantage for multiple monitors, but the display-port puts a kink in my designs also.
    Reply
  • Mr Perfect - Tuesday, November 09, 2010 - link

    So where do they go from here? Disable one SM again and call it a GTX570? GF104 is to new to replace, so I suppose they'll enable the last SM on it for a GTX560. Reply
  • chizow - Tuesday, November 09, 2010 - link

    There's 2 ways they can go with the GTX 570, either more disabled SM than just 1 (2-3) and similar clockspeed to the GTX 580, or fewer disabled SM but much lower clockspeed. Both would help to reduce TDP but with more disabled SM that would also help Nvidia unload the rest of their chip yield. I have a feeling they'll disable 2-3 SM with higher clocks similar to the 580 so that the 470 is still slightly slower than the 480.

    I'm thinking along the same lines as you though for the GTX 560, it'll most likely be the full-fledged GF104 we've been waiting for with all 384SP enabled, probably slightly higher clockspeeds and not much more than that, but definitely a faster card than the original GF104.
    Reply

Log in

Don't have an account? Sign up now