Compute Performance

Shifting gears, as always our final set of performance benchmarks is a look at compute performance. As we saw with the launch of the GTX 680, Kepler (GK104) just doesn’t do very well here, thanks in part to NVIDIA stripping out a fair bit of compute hardware and memory bandwidth on GK104 in order to focus on gaming performance. OpenCL performance is particularly bad with NVIDIA almost completely ignoring it, but even DirectCompute performance often swings AMD’s way. This isn’t to say that GK104 doesn’t have its moments, but when it comes to compute it’s typically AMD’s time to shine.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. Note that this is a DX11 DirectCompute benchmark.

The 7970 already had a significant lead in this benchmark thanks to AMD’s work on improving their DirectCompute performance, and the 7970GE extends it further. The most important factor of course is actual game performance – where the 7970GE and GTX 680 are tied – but this is clear software evidence of what we already know in hardware: that the 7970GE is far more potent at compute than the GTX 680 is.

Our next benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. We’re now using a development build from the version 2.0 branch, and we’ve moved on to a more complex scene that hopefully will provide a greater challenge to our GPUs.

Being an OpenCL title that NVIDIA isn’t taking any care to optimize for, the 7970GE simply blows the GTX 680 out of the water. It’s not even a contest here. Only one card family is even worth consideration for use here. However it’s interesting to note that the 7970GE’s performance improvement over the 7970 is a bit below average, with the 7970GE only picking up 6%. SLG does stress memory bandwidth and compute performance, but in all likelihood the 7970GE isn’t boosting as much here as it is under our gaming tests. Once AMD starts exposing real clockspeeds we’ll need to revisit this assumption.

For our next benchmark we’re looking at AESEncryptDecrypt, an OpenCL routine that AES encrypts and decrypts an 8K x 8K pixel square image file. The results of this benchmark are the average time to encrypt the image over a number of iterations of the AES cipher.
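To make the "average time over a number of iterations" metric concrete, here is a minimal timing sketch. It is not the AESEncryptDecrypt code: the `average_time_ms` helper, the `xor_pass` workload, and the iteration count are all hypothetical, and the workload is a simple XOR pass rather than AES.

```python
import time


def average_time_ms(workload, iterations):
    """Run workload() `iterations` times and return the mean wall time in ms."""
    start = time.perf_counter()
    for _ in range(iterations):
        workload()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iterations


# Placeholder data and workload (assumptions for illustration only).
# This XOR pass is NOT AES; it only stands in for "some per-image work".
buffer = bytes(range(256)) * 64


def xor_pass():
    return bytes(b ^ 0x5A for b in buffer)


mean_ms = average_time_ms(xor_pass, iterations=20)
print(f"mean time per iteration: {mean_ms:.3f} ms")
```

Averaging over many iterations smooths out one-off costs such as the first run's setup, which is why a single timed pass would be a noisier figure.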

While the 7970GE does improve upon the 7970’s already strong performance, we’re clearly reaching the point where the relatively long CPU/GPU transfer times over PCIe are taking their toll, explaining why the 7970GE could only shave off 5ms. This is actually an important point to make and is why APUs are so important to AMD’s GPU computing plans, but it also means that at a certain speed GPU performance ceases to matter.
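The diminishing return described here can be sketched with a simple model in which total time is a fixed transfer cost plus a kernel cost that shrinks as the GPU gets faster. The 200ms and 100ms figures below are hypothetical and chosen only for illustration; they are not measurements from this benchmark.

```python
def total_time_ms(transfer_ms, kernel_ms, gpu_speedup):
    """Wall time when only the kernel portion scales with GPU speed."""
    return transfer_ms + kernel_ms / gpu_speedup


# Hypothetical split between PCIe transfer and GPU kernel time.
transfer_ms = 200.0
kernel_ms = 100.0

for speedup in (1.0, 1.1, 2.0, 10.0, 1000.0):
    t = total_time_ms(transfer_ms, kernel_ms, speedup)
    print(f"GPU {speedup:>6.1f}x faster -> {t:7.2f} ms total")
```

However fast the GPU becomes, the total in this model never drops below the transfer time, which is the sense in which GPU performance eventually ceases to matter.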

Our fourth benchmark is once again looking at compute shader performance, this time through the Fluid simulation sample in the DirectX SDK. This program simulates the motion and interactions of a 16k particle fluid using a compute shader, with a choice of several different algorithms. In this case we’re using an O(n^2) nearest neighbor method that is optimized by using shared memory to cache data.
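As a rough CPU-side illustration of that access pattern (not the SDK sample's shader code), the sketch below compares a plain all-pairs neighbor search with a tiled version. The tile slice stands in for the block of particles a GPU thread group would stage in shared memory; the function names, the neighbor-counting task, and the tile size are assumptions made for this example.

```python
import random


def neighbor_counts_naive(points, radius):
    """O(n^2): test every particle against every other particle."""
    r2 = radius * radius
    counts = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and (xi - xj) ** 2 + (yi - yj) ** 2 <= r2:
                counts[i] += 1
    return counts


def neighbor_counts_tiled(points, radius, tile_size=64):
    """Same O(n^2) work, but walked one tile of particles at a time.

    On a GPU each tile would be loaded into shared memory once and then
    reused by every thread in the group, cutting global memory reads.
    """
    r2 = radius * radius
    counts = [0] * len(points)
    for start in range(0, len(points), tile_size):
        tile = points[start:start + tile_size]  # stand-in for a shared-memory load
        for i, (xi, yi) in enumerate(points):
            for offset, (xj, yj) in enumerate(tile):
                if start + offset != i and (xi - xj) ** 2 + (yi - yj) ** 2 <= r2:
                    counts[i] += 1
    return counts


random.seed(0)
particles = [(random.random(), random.random()) for _ in range(300)]
same = neighbor_counts_naive(particles, 0.1) == neighbor_counts_tiled(particles, 0.1)
print("tiled result matches naive result:", same)
```

The tiling does not reduce the number of distance tests; the benefit on real hardware comes from where the data is read from.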

In this final compute shader benchmark NVIDIA’s performance is actually quite respectable, leading to them besting the 7970. However the 7970GE provides just enough of a performance boost to push AMD ahead of NVIDIA here, giving AMD a solid majority of our standard compute benchmarks. Even when Kepler is faced with a favorable workload, it looks like the GCN-based 7970GE is capable of taking NVIDIA head-on.

Finally, we received a number of requests for some further compute benchmarking using some of the consumer programs AMD provided the press with for the Trinity launch. In particular WinZip and Handbrake were requested, so we’ve gone ahead and run those benchmarks for this review.

Starting with WinZip, WinZip 16.5 introduced OpenCL acceleration of both compression and AES archive encryption. Despite being accelerated via OpenCL, WinZip only supports AMD devices, presumably because only AMD provided technical assistance. As a result we’re looking solely at pure CPU performance and GPU accelerated performance across AMD’s lineup.

One thing immediately sticks out: WinZip isn’t very sensitive to GPU performance. Merely having a GPU increases performance rather significantly, but it doesn’t matter if it’s a fast GCN card or a GCN card at all for that matter, as even the VLIW4 based 6970 returns the same times. In fact AMD’s drivers report almost no GPU load, so it’s questionable how much of this is actually being run on the GPU versus being run on the CPU through AMD’s OpenCL CPU driver.

As for Handbrake, AMD sent along a newer version that works with discrete GPUs. AMD notes that this is still very much a work in progress, which we saw first-hand when OpenCL acceleration failed to handle two of our three test clips. It failed to properly crop one video, and failed to properly detelecine another. Handbrake’s OpenCL acceleration will of course continue to improve as it approaches release, but for the time being it’s definitely a beta.

Much like WinZip, Handbrake doesn’t appear to be particularly GPU performance sensitive, which doesn’t come as much of a surprise. Large parts of the H.264 encoding process are ill suited for GPU acceleration, so x264 is only offloading part of the process and the deciding factor is still CPU performance. The actual GPU load is very inconsistent, but generally tops out at around 40% usage.

The end result is nothing to sneeze at however. Whereas Handbrake averaged 25.6fps without GPU acceleration, with it performance increases by 24% to around 32fps. And unlike other GPU compute accelerated encoders the quality here is very consistent between the CPU and GPU paths (though GPU file size tends to be a bit larger), which means we’re retaining the same quality and customizability of Handbrake/x264 while gaining additional performance for free.
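For reference, the arithmetic behind those figures can be checked in a few lines. The 25.6fps baseline and the 24% gain come from the text above; the 10,000 frame clip length is a hypothetical value used only to turn frame rates into encode times.

```python
cpu_fps = 25.6   # Handbrake without GPU acceleration (from the text)
gain = 0.24      # reported throughput increase with OpenCL enabled

gpu_fps = cpu_fps * (1 + gain)

# Hypothetical clip length, used only to convert fps into encode time.
frames = 10_000
cpu_seconds = frames / cpu_fps
gpu_seconds = frames / gpu_fps
time_saved_fraction = 1 - gpu_seconds / cpu_seconds

print(f"accelerated rate: {gpu_fps:.1f} fps")
print(f"encode time: {cpu_seconds:.0f} s -> {gpu_seconds:.0f} s")
print(f"time saved: {time_saved_fraction:.1%}")
```

Note that a 24% gain in frame rate works out to a little over 19% less encode time, since time scales with the reciprocal of throughput.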

Despite the fact that this is an AMD backed initiative it’s interesting to see that Handbrake’s performance isn’t heavily reliant on the GPU being used. We would have assumed that Handbrake was only optimized for AMD’s GPUs at this point, and even if that’s the case NVIDIA’s GPUs are still fast enough to make up the difference. The fact that Handbrake performance with NVIDIA’s GPUs is a hair faster is not at all what we would have expected, but at the same time this is very beta quality software and is likely dependent on the clip being used, so we wouldn’t advise reading too much into this at this time.


110 Comments


  • silverblue - Tuesday, June 26, 2012 - link

    I think that's the way people do every review. However, ordinarily I'd recommend looking back at the 680 review, but as we've seen with the new Catalyst drivers, performance can vary over a relatively short period of time. So, a future article such as "AMD's Radeon 7970 and NVIDIA's GTX 680: How Much Difference Can A Few Months Make?" might be very nice *hint hint*. ;)
  • Temelj - Thursday, July 12, 2012 - link

    For simplicity, the OC data should be put up on this graph for reference purposes and ease of use. Who on earth wants to troll a few reviews and collect this data manually? At the very least include a reference link to the previous article that compares the NVidia 680 and provides the OC scores.

    Also, instead of a conclusion write up why not have a result summary showing all performed tests, the cards there were used as reference and provide a tabular view clearly showing the top runner of each test (or top 3).
  • b3nzint - Wednesday, June 27, 2012 - link

    So what about ; DX11 DirectCompute, SmallLuxGPU, Fluid simulation, WinZip 16.5 tests. amd is winning streak. Dont buy nvidia, its an empty thing!
  • CeriseCogburn - Saturday, June 30, 2012 - link

    If you're going to use winzip to game, and support evil proprietary corruption in software by amd while using open source, great, hypocrisy and lying to stone cold stupid amd fans for years works well !
    Fluid sim - not a game
    DX11 DC - not a game
    SmallLux - not a game

    Oops ! "Empty" suddenly applies to amd when it wins any "benchmarks that are not real world for end users, ever."

    I guess empty crap no one uses, declared fraudulently, as a "win", sways the dark hollow spaces in the hearts and minds of the little amd fans. It's sad.
  • yay123 - Saturday, June 30, 2012 - link

    hi there I'm buying this card but my psu is cm gx550w does it fit well if I oc it?
  • Temelj - Thursday, July 12, 2012 - link

    If you can afford a card like this, why not just upgrade your power supply?
    Review System Requirements here: http://www.amd.com/us/products/desktop/graphics/70...
  • Jamahl - Thursday, July 5, 2012 - link

    Comments totally ruined by CeriseCogburn's bullshit on every page.

    Is this maddoctor in disguise, or one of the other Nvidia zealots? Whatever, just IP ban this weirdo and be done with it.
  • Mauhi123 - Monday, October 15, 2012 - link

    Dear All.

    Hello,
    I am having 3960x and DX79SI and graphics card asus hd7970-dc2t-3gd5
    i am not able to boot the computer. when i am bootiing the computer on mother board 2 digit led shows "00" duble zero and on led screen shows "0_" and stops, but i can reboot the computer useing ctl+atl+del. i can able to oparate bios. that means the computer is not in hanging mode.

    Please Help me ASAP......
  • seansplayin - Tuesday, November 20, 2012 - link

    I have the Xfx 7970 Ghz edition and I really am not sure what is the big deal with the noise. My Card is not that loud. Honestly Power control settings @ +20%, Gpu core 1175 and memory @ 1600 completely stable. The games I play are at 1080P MAx everything and my GPU rarely gets above 70C, which is only around 40% fan speed. @ 40% fan speed I literally cannot hear the GPU fan unless I have the speakers completely turned off and even still I have to listen carefully to actually discern that the noise I hear is coming from the Video Card. My experience in gaming the GPU fan noise is absolutely NOT an issue. when I'm running synthetic GPU benchmarking apps like geekbench's Furmark then the card will ramp up around 70% fan speed and you can hear it, but even then it is really not an Issue. I am using the latest catylist beta Driver 12.11 which as added 15% increase in BF3 FPS and 10% increase in Dirt 3, basically taking Nvidia's crown in virtually every game.
    I do lot's of Video transcoding and the openCL domination this card produces is amazing.
    Yesterday I trancoded a 1080P 5.3GB .mkv file to .mp4 with nero 11 when using AMD's app acceleration codec the transcode took 20 minutes as compared to 60 minutes when I used Nero's .mp4 codec at the same output settings. Durring the Transcoding the GPU stays at I believe 300 mhz with the GPU at 20% load average. when doing transcoding the gpu hoovers around 111F with the Fan at like 5%.
    I love this card.
    My Computer has three states, Idle 60% of the time, gaming and transcoding 40% of the time. At Idle with AMD's zero core this video card is using 10 watts less than Nvidia's 680, In gaming it's beating the 680 in almost every game now, and when it comes to encoding open cl and open gl it's basically a blowout averaging 75% more than the 680. If your an Nvidia fan (I formally was) and open CL is important to you, go with the Fermi cards because on most GPGPU processing they outperform with Kepler cards.

    IF you question anything I've said do some google homework. Catalyst 12.11 actually does what they say, I can attest to it at least when it comes to Encoding, playing BF3 and Dirt 3
