For a while now we’ve been trying to establish a proper cross-platform compute benchmark suite to add to our GPU articles. It’s not been entirely successful.

While GPUs have been compute capable in some form since 2006 with the launch of G80, and AMD significantly improved their compute capabilities in 2009 with Cypress, the software has been slow to catch on. From gatherings such as NVIDIA’s GTC we’ve seen first-hand how GPU computing is being used in the high-performance computing market, but the consumer side hasn’t materialized as quickly: the right situations for using GPU computing are less straightforward, and many developers are unwilling to attach themselves to a single platform in the process.

2009 saw the ratification of OpenCL 1.0 and the launch of DirectCompute, and while the launch of these cross-platform APIs removed some of the roadblocks, we heard as recently as last month from Adobe and others that there’s still work to be done before companies can confidently deploy GPU compute accelerated software. The immaturity of OpenCL drivers was cited as one cause; there’s also the fact that a lot of computers simply don’t have a suitable compute-capable GPU – it’s Intel that’s the world’s biggest GPU vendor, after all.

So here in the fall of 2010 our search for a wide variety of GPU compute applications hasn’t panned out quite like we expected it to. Widespread adoption of GPU computing in consumer applications is still around the corner, so for the time being we have to get creative.

With that in mind we’ve gone ahead and cooked up a new GPU compute benchmark suite based on the software available to us. On the consumer side we have the latest version of Cyberlink’s MediaEspresso video encoding suite and an interesting sub-benchmark from Civilization V. On the professional side we have SmallLuxGPU, an OpenCL based ray tracer. We don’t expect this to be the be-all and end-all of GPU computing benchmarks, but it gives us a place to start and allows us to cover both cross-platform APIs and NVIDIA & AMD’s platform-specific APIs.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes.
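Firaxis hasn’t published their shader, so as a stand-in for the kind of per-block work being moved onto the GPU, here’s a minimal CPU-side sketch of classic DXT1/BC1 block decompression in Python (our own illustration; Civ V’s actual algorithm and texture formats may differ):

```python
import struct

def rgb565_to_rgb888(c):
    """Expand a packed 16-bit 5:6:5 color to an (r, g, b) 8-bit tuple."""
    r = (c >> 11) & 0x1F
    g = (c >> 5) & 0x3F
    b = c & 0x1F
    # Replicate the high bits into the low bits to fill the 8-bit range.
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))

def decode_dxt1_block(block):
    """Decode one 8-byte DXT1/BC1 block into a 4x4 grid of RGB tuples."""
    c0, c1, indices = struct.unpack('<HHI', block)
    p0, p1 = rgb565_to_rgb888(c0), rgb565_to_rgb888(c1)
    if c0 > c1:  # four-color mode: two interpolated colors
        palette = [p0, p1,
                   tuple((2 * a + b) // 3 for a, b in zip(p0, p1)),
                   tuple((a + 2 * b) // 3 for a, b in zip(p0, p1))]
    else:        # three-color mode plus transparent black
        palette = [p0, p1,
                   tuple((a + b) // 2 for a, b in zip(p0, p1)),
                   (0, 0, 0)]
    # One 2-bit palette index per texel, row-major from the low bits.
    return [[palette[(indices >> (2 * (4 * y + x))) & 0x3] for x in range(4)]
            for y in range(4)]
```

Each block decodes independently, which is what makes the job embarrassingly parallel and a natural fit for a DirectCompute shader, one thread per block.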

In our look at Civ V’s performance as a game, we noted that it favors NVIDIA’s GPUs at the moment, and this may be part of the reason why. NVIDIA’s GPUs clean up here, particularly when compared to the 6800 series and its reduced shader count. Furthermore within the GPU families the results are very straightforward, with the order following the relative compute power of each GPU. To be fair to AMD they made a conscious decision to not chase GPU computing performance with the 6800 series, but as a result it fares poorly here.

Our second compute benchmark is Cyberlink’s MediaEspresso 6, the latest version of their GPU-accelerated video encoding suite. MediaEspresso 6 doesn’t currently utilize a common API, and instead has codepaths for both AMD’s APP (née Stream) and NVIDIA’s CUDA APIs, which gives us a chance to test each API with a common program bridging them. As we’ll see this doesn’t necessarily mean that MediaEspresso behaves similarly on both AMD and NVIDIA GPUs, but for MediaEspresso users it is what it is.

We decided to go ahead and use MediaEspresso in this article not knowing what we’d find, and it turns out the results were both more and less than we were expecting at the same time. While our charts don’t show it, video transcoding isn’t all that GPU intensive with MediaEspresso; once we achieve a certain threshold of compute performance on a GPU – such as a GTX 460 in the case of an NVIDIA card – the rest of the process is CPU bottlenecked. As a result all of our Fermi NVIDIA cards at the GTX 460 level or better take just as long to encode our sample video, and while the AMD cards show some stratification, it’s on the order of only a couple of seconds. From this it’s clear that with Cyberlink’s technology having a GPU is going to help, but it can’t completely offload what’s historically been a CPU-intensive activity.
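The plateau we’re describing is Amdahl’s law at work: once the GPU finishes its share quickly enough, total encode time converges on the fixed CPU-bound portion. A minimal sketch with hypothetical numbers (the 60-second CPU share and the GPU work figures below are illustrative, not measured):

```python
def transcode_time(cpu_seconds, gpu_work, gpu_throughput):
    """Total wall time when the CPU portion is fixed and only the GPU
    portion scales with GPU speed (illustrative numbers, not measured)."""
    return cpu_seconds + gpu_work / gpu_throughput

# Hypothetical job: 60s of CPU-bound work plus 120 units of GPU work.
for speed in (1, 2, 4, 8):  # relative GPU throughput
    print(speed, transcode_time(60, 120, speed))
```

Doubling GPU throughput from 4x to 8x saves only 15 seconds here, while the 60-second CPU floor never moves – the same diminishing-returns curve our GTX 460-and-up results trace.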

As for an AMD/NVIDIA cross comparison, the results are straightforward but not particularly enlightening. It turns out that MediaEspresso 6 is significantly faster on NVIDIA GPUs than it is on AMD GPUs, but since we’ve already established that MediaEspresso 6 is CPU limited when using these powerful GPUs, it doesn’t say anything about the hardware. AMD and NVIDIA both provide common GPU video encoding frameworks for their products that Cyberlink taps in to, and it’s here where we believe the difference lies.

In particular we see MediaEspresso 6 achieve 50% CPU utilization (4 cores) when being used with an NVIDIA GPU, while it only achieves 13% CPU utilization (1 core) with an AMD GPU. At this point it would appear that the CPU portions of NVIDIA’s GPU encoding framework are multithreaded while AMD’s framework is single-threaded. And since the performance bottleneck for video encoding still lies with the CPU, this would be why the NVIDIA GPUs do so much better than the AMD GPUs in this benchmark.
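As a sanity check on those utilization figures, if we assume the testbed CPU is a hyper-threaded quad-core exposing eight logical processors (an assumption on our part), the numbers line up almost exactly with whole thread counts:

```python
def utilization(busy_threads, logical_cpus=8):
    """Overall CPU utilization when `busy_threads` logical processors are
    pegged and the rest are idle (assumes a hyper-threaded quad-core)."""
    return 100.0 * busy_threads / logical_cpus

print(utilization(4))  # NVIDIA path, ~4 busy threads -> 50.0%
print(utilization(1))  # AMD path, 1 busy thread -> 12.5%, near the observed 13%
```

Four saturated threads give exactly the 50% we measured on the NVIDIA path, and one saturated thread gives 12.5%, within rounding of the 13% we saw on the AMD path.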

Our final GPU compute benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. While it’s still in beta, SmallLuxGPU recently hit a milestone by implementing a complete ray tracing engine in OpenCL, allowing them to fully offload the process to the GPU. It’s this ray tracing engine we’re testing.
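SmallLuxGPU’s actual kernels are written in OpenCL and are considerably more involved, but the innermost operation of any ray tracer is the ray-primitive intersection test it evaluates millions of times per frame. A minimal Python sketch of the ray-sphere case (our own illustration, not SmallLuxGPU code):

```python
import math

def ray_sphere_t(origin, direction, center, radius):
    """Return the distance along a normalized ray to its first hit on a
    sphere, or None on a miss -- the core primitive test a ray tracer
    evaluates for huge numbers of rays per frame."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * c  # quadratic term a == 1 for a normalized direction
    if disc < 0:
        return None  # ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0 else None

# Ray from the origin down +z toward a unit sphere centered at z=5.
print(ray_sphere_t((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # 4.0
```

Because every ray runs the same short arithmetic-heavy test independently, work like this maps cleanly onto wide GPU shader arrays, which is why a fully offloaded ray tracing engine is such a natural OpenCL showcase.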

Compared to our other two GPU computing benchmarks, SmallLuxGPU follows the theoretical performance of our GPUs much more closely. As a result our Radeon GPUs, with their difficult-to-utilize VLIW5 design, end up topping the charts by a significant margin, while the fastest comparable NVIDIA GPU is still 10% slower than the 6850. Ultimately what we’re looking at amounts to a best-case scenario for these GPUs, and it’s as good an example as any that in the right circumstances AMD’s VLIW5 shader design can go toe-to-toe with NVIDIA’s compute-focused design and still win.

At the other end of the spectrum from GPU computing performance is GPU tessellation performance, used exclusively for graphical purposes. For the Radeon 6800 series, AMD enhanced their tessellation unit to offer better tessellation performance at lower tessellation factors. In order to analyze the performance of AMD’s enhanced tessellator, we’re using the Unigine Heaven benchmark and Microsoft’s DirectX 11 Detail Tessellation sample program to measure the tessellation performance of a few of our cards.

Since Heaven is a synthetic benchmark at the moment (the DX11 engine isn’t currently used in any games) we’re less concerned with performance relative to NVIDIA’s cards and more concerned with performance relative to the 5870. Compared to the 5870 the 6870 ends up being slightly slower when using moderate amounts of tessellation, while it pulls ahead when using extreme amounts of tessellation. Considering that the 6870 is around 7% slower in games than the 5870 this is actually quite an accomplishment for Barts, and one that we can easily trace back to AMD’s tessellator improvements.

Our second tessellation test is Microsoft’s DirectX 11 Detail Tessellation sample program, which is a much more straightforward test of tessellation performance. Here we’re simply looking at the framerate of the program at different tessellation levels, specifically level 7 (the default level) and level 11 (the maximum level). Here AMD’s tessellation improvements become even more apparent, with the 6870 handily beating the 5870. In fact our results are very close to AMD’s own internal results – at level 7 the 6870 is 43% faster than the 5870, while at level 11 that improvement drops to 29% as the increased level leads to an increasingly large tessellation factor. However this also highlights the fact that AMD’s tessellation performance still collapses at high factors compared to NVIDIA’s GPUs, making it all the more important for AMD to encourage developers to use more reasonable tessellation factors.
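The collapse at high factors follows from the geometry: the triangle count per patch grows roughly with the square of the tessellation factor, so the tessellator’s output (and all the setup work downstream of it) explodes as the factor climbs. A rough model (our simplification; the D3D11 tessellator’s exact triangle counts differ):

```python
def triangles_per_patch(factor):
    """Approximate triangle count for uniform tessellation of a triangle
    patch: each edge splits into `factor` segments, giving factor**2
    triangles (a simplification of the D3D11 tessellator's exact rules)."""
    return factor * factor

for f in (1, 7, 11, 64):  # 64 is the D3D11 maximum tessellation factor
    print(f, triangles_per_patch(f))
```

Going from a modest factor to the maximum multiplies the triangle load by orders of magnitude, which is why reasonable factors matter so much more to AMD’s narrower tessellation hardware than to NVIDIA’s.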


  • StriderGT - Friday, October 22, 2010 - link

    I agree with you that the inclusion of the FTW card was a complete caving and casts a shadow over the so far excellent reputation of AnandTech. I believe the whole motivation was PR related (retaining a workable relationship with nvidia), but was it worth it?!

    Look how ugly this sort of thing can get; they do not even include the test setup... Quote from techradar.com:

    We expected the 6870 to perform better than it did – especially as this is essentially being pitched as a GTX 460 killer.
    The problem is, Nvidia's price cuts have made this an impossible task, with the FTW edition of the GTX 460 rolling in at just over £170, yet competently outperforming the 6870 in every benchmark we threw at it.
    In essence, therefore, all the 6870 manages is to unseat the 5850 which given its end of life status isn't too difficult a feat. We'd still recommend buying a GTX 460 for this sort of cash. All tests ran at 1,920 x 1,080 at the highest settings, apart from AvP, which was ran at 1,680 x 1,050.

    http://www.techradar.com/reviews/pc-mac/pc-compone...
  • oldscotch - Friday, October 22, 2010 - link

    ...where a Civilization game would be used for a GPU benchmark.
  • AnnihilatorX - Friday, October 22, 2010 - link

    It's actually quite taxing on the maps. It lags on my HD4850.

    The reason is, it uses DX 11 DirectCompute features on texture decompression. The performance is noticeably better on DX11 cards.
  • JonnyDough - Friday, October 22, 2010 - link

    "Ultimately this means we’re looking at staggered pricing. NVIDIA and AMD do not have any products that are directly competing at the same price points: at every $20 you’re looking at switching between AMD and NVIDIA."

    Not when you figure in NVidia's superior drivers, or power consumption...depending on which one matters most to you.
  • Fleeb - Friday, October 22, 2010 - link

    I looked at the load power consumption charts and saw that the Radeon cards are better in this department, so I don't clearly understand your statement. Did you mean that the nVidia cards in these tests should be better because of superior power consumption, or that their power consumption is "superior" in the sense that nVidia cards consume more power?
  • jonup - Friday, October 22, 2010 - link

    I think he meant the nVidia has better drivers but worse power consumption. So it all depends on what you value most. At least that's how I took it.
  • zubzer0 - Friday, October 22, 2010 - link

    Great review!

    If you have the time I would be very happy if you could test how well these boards do in Age of Conan DX10.

    Some time ago (Feb. 2009) you included Age of Conan in your reviews, but since then DX10 support has been added to the game. I have yet to see an official review of current graphics cards' performance in AoC DX10.

    Btw, with the add-on "Rise of the Godslayer" the graphics in the new Khitai zone are gorgeous!
  • konpyuuta_san - Friday, October 22, 2010 - link

    In my case (pun intended), the limiting factor is the physical size of the card. I've abandoned the ATX formats completely, going all out for mini-ITX (this one is Silverstone's Sugo SG06). The king of ITX cases might still be the 460, but this is making me feel a bit sore about the 460 I'm just about to buy. Especially since the 6870 is actually only $20 more than the 6850 where I live and the 6850 is identically priced to the 460. There's just no way I can fit a 10.5 inch card into a 9 inch space. The 9 inch 6850 would fit, but there's a large radiator mounted on the front of the case, connected to a cpu water cooling block, that will interfere with the card. I've considered some crazy mods to the case, but those options just don't feel all that attractive. The GTX 460 is a good quarter inch shorter and I'm getting a model with top-mounted power connectors, so there's ample room for everything in this extremely packed little gaming box. I'm still kind of trying to find a way to put a 6850 in there (bangs and bucks and all that), which leads to my actual question, namely:

    The issue of rated power consumption; recommended minimum for the 460 is 450W (which I can support), but for the 6850 it's 500W (too much). How critical are those requirements? Does the 6850 really require a 500W supply? Despite having lower power consumption than the 460?! Or is that just to ensure the PSU can supply enough amps on whatever rail the card runs off? If my 450W SFF PSU can't supply the 6850, it really doesn't matter how much better or cheaper it is ....
  • joshua4000 - Friday, October 22, 2010 - link

    Let me get this straight: Fermi was once too expensive to manufacture due to its huge die and stuff, but its stripped-down versions sell for less and outpace newly released AMD cards (by a wide margin when looking at the 470).

    AMD's cheaper-to-manufacture cards (5xxx), on the other hand, came in overpriced once the 460 had been released (if they haven't been overpriced all along...); still, the price did not drop to levels at which nvidia could not sell products without making a loss.

    AMD has optimized an already cheap product price-wise, one that does not outperform the 470 or an OCed 460 while selling for the same amount of $.

    Considering the manufacturing and pricing of the 4870 in its last days, I guess AMD will still be making money out of those 6xxx cards even when dropping the price by 75% of MSRP.
  • NA1NSXR - Friday, October 22, 2010 - link

    Granted there have been a lot of advancements in the common feature set of today's cards and improvements in power/heat/noise, but absolute 3D performance has been stagnant. I am surprised the competition was called alive and well in the final words section. I built my PC back in 7/2009 using a 4890, which cost $180 then. Priced according to the cards in question today, it would slot in roughly the same spot, meaning pretty much no performance improvement at all since then. Yes, I will repeat myself to ward off what is certainly coming: I know the 4890 is a pig (loud, noisy, power hungry) compared to the cards here. However, ignoring those factors, 3D performance has barely budged in more than a year. Price drops on the 5xxx series were a massive disappointment for me; they never came in the way I thought was reasonable to expect after the 4xxx series. I am somewhat indifferent because in my own PC cycle I haven't been in the market for a card, but like I said before, it's a disappointment in the general market, and I wouldn't really agree with the statement that competition is alive and well, at least in any sense that benefits people who weight performance more heavily in their criteria.
