Compute

Shifting gears, we turn to compute performance. Because compute performance is more sensitive to the reduction in CUs than most of our other tests, we’re expecting the R9 Fury’s deficit relative to the R9 Fury X to be larger here than in our gaming tests.

Starting us off for our look at compute is LuxMark 3.0, the latest version of the official benchmark of LuxRender 2.0. LuxRender’s GPU-accelerated rendering mode is an OpenCL-based ray tracer that forms a part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years because it maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.

Compute: LuxMark 3.0 - Hotel

In LuxMark, with the R9 Fury X already holding the top spot, the R9 Fury cards easily take the next two spots. One interesting artifact of this is that the R9 Fury’s advantage over the GTX 980 is actually greater than the R9 Fury X’s advantage over the GTX 980 Ti, on both an absolute and a relative basis. This is despite the fact that the R9 Fury is some 13% slower than its fully enabled sibling.

For our second set of compute benchmarks we have CompuBench 1.5, the successor to CLBenchmark. CompuBench offers a wide array of different practical compute workloads, and we’ve decided to focus on face detection, optical flow modeling, and particle simulations.

Compute: CompuBench 1.5 - Face Detection

Compute: CompuBench 1.5 - Optical Flow

Compute: CompuBench 1.5 - Particle Simulation 64K

Not unlike LuxMark, tests where the R9 Fury X did well have the R9 Fury doing well too, particularly the optical flow sub-benchmark. The drop-off in that benchmark and face detection is about what we’d expect for losing 1/8th of Fiji’s CUs. On the other hand the particle simulation benchmark is hardly fazed beyond the clockspeed drop, indicating that the bottleneck lies elsewhere.
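As a rough illustration of what "about what we’d expect" means here, the sketch below estimates the theoretical gap between the two cards. The CU counts follow from the 1/8th figure above; the clockspeeds (1050MHz and 1000MHz) are assumed from the cards’ reference specifications rather than taken from this page, and the linear scaling model is a simplification, not a measurement.

```python
# Rough scaling estimate for a cut-down GPU relative to its fully enabled sibling.
# CU counts follow the "1/8th of Fiji's CUs" figure in the text; the clockspeeds
# are ASSUMED from the cards' public reference specifications.

FURY_X = {"cus": 64, "clock_mhz": 1050}
FURY = {"cus": 56, "clock_mhz": 1000}


def relative_throughput(card, baseline):
    """Theoretical shader throughput of `card` as a fraction of `baseline`,
    assuming performance scales linearly with both CU count and clockspeed."""
    cu_ratio = card["cus"] / baseline["cus"]
    clock_ratio = card["clock_mhz"] / baseline["clock_mhz"]
    return cu_ratio * clock_ratio


def clock_only_ratio(card, baseline):
    """Expected ratio for a workload that ignores CU count entirely."""
    return card["clock_mhz"] / baseline["clock_mhz"]


# Deficit for a workload bound by shader throughput (CUs and clocks both matter).
cu_bound_deficit = 1 - relative_throughput(FURY, FURY_X)
# Deficit for a workload bottlenecked elsewhere, where only the clock drop shows.
clock_bound_deficit = 1 - clock_only_ratio(FURY, FURY_X)

print(f"CU-bound workload:    ~{cu_bound_deficit:.1%} slower")
print(f"Clock-bound workload: ~{clock_bound_deficit:.1%} slower")
```

Under these assumptions a fully shader-bound test would top out at a deficit of roughly 17%, while a test that only tracks clockspeed would lose closer to 5%, which is the kind of spread we see between optical flow and the particle simulation.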

Our third compute benchmark is Sony Vegas Pro 13, an OpenGL and OpenCL video editing and authoring package. Vegas can use GPUs in a few different ways, the primary uses being to accelerate the video effects and compositing process itself, and in the video encoding step. With video encoding being increasingly offloaded to dedicated DSPs these days, we’re focusing on the editing and compositing process, rendering to a low CPU overhead format (XDCAM EX). This specific test comes from Sony, and measures how long it takes to render a video.

Compute: Sony Vegas Pro 13 Video Render

At this point Vegas is becoming increasingly CPU-bound and will be due for replacement. The R9 Fury comes in one second behind the chart-topping R9 Fury X, at 22 seconds.

Moving on, our fourth compute benchmark is FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance. Each precision has two modes, explicit and implicit, the difference being whether water atoms are included in the simulation (explicit) or not (implicit); including them adds quite a bit of work and overhead. This is another OpenCL test, utilizing the OpenCL path for FAHCore 17.

Compute: Folding @ Home: Explicit, Single Precision

Compute: Folding @ Home: Implicit, Single Precision

Compute: Folding @ Home: Explicit, Double Precision

Overall, while the R9 Fury doesn’t have to aim quite as high given its weaker GTX 980 competition, FAHBench still stresses the Radeon cards. Under the single precision tests the GTX 980 pulls ahead, and the R9 Fury only surpasses it under double precision, thanks to NVIDIA’s weaker FP64 performance.

Wrapping things up, our final compute benchmark is an in-house project developed by our very own Dr. Ian Cutress. SystemCompute is our first C++ AMP benchmark, utilizing Microsoft’s simple C++ extensions to allow the easy use of GPU computing in C++ programs. SystemCompute in turn is a collection of benchmarks for several different fundamental compute algorithms, with the final score represented in points. DirectCompute is the compute backend for C++ AMP on Windows, so this forms our other DirectCompute test.

Compute: SystemCompute v0.5.7.2 C++ AMP Benchmark

As with our other tests the R9 Fury loses some performance on our C++ AMP benchmark relative to the R9 Fury X, but only around 8%. As a result it’s competitive with the GTX 980 Ti here, blowing well past the GTX 980.

 


288 Comments


  • Sefem - Wednesday, July 15, 2015 - link

    "Draw calls are the best metric we have right now to compare AMD Radeon to nVidia ON A LEVEL PLAYING FIELD."
    Well, lets just for a moment consider this as true (and you should try to explain why :D )
    Looking at draw calls a GTX 980 should perform 2.5x faster than a 290X in DX11 (respectively 2.62M vs 1.05M draw calls) and even a GTX 960 would be 2.37x faster than the over mentioned 290X (respectively 2.49M vs 1.05M draw calls) :)
  • D. Lister - Friday, July 17, 2015 - link

    Performing minor optimizations, on an API that isn't even out yet, to give themselves the appearance of a theoretical advantage in some arbitrary GPU function, as a desperate attempt to keep themselves relevant, is so very AMD (their motto should be, "we will take your money now, and give you its worth... later..., maybe.)

    Meanwhile people at NV are optimizing for the API that is currently actually being used to make games, and raising their stock value and market share while they're at it.

    Why wouldn't AMD optimize for DX11, and instead do what it's doing? Because DX11 is a mature API, so any further improvements would be small, yet expensive, while DX12 isn't even out yet, so it would be comparatively cheaper to get bigger gains, and AMD is seriously low on funds.

    Realistically, proper DX12 games are still 2-3 years away. By that time AMD probably wouldn't even be around anymore.

    Hence, in conclusion, whatever DX12 performance the Fury trio (or AMD in general) claims, means absolutely nothing at this point.
  • FlushedBubblyJock - Wednesday, July 15, 2015 - link

    Thank GOD for nvidia or amd would have this priced so sky high no one could afford it.

    Instead of crazy high scalping greedy pricing amd only greeded up on price perf the tiny bit it could since it can't beat nvidia, who saved our wallets again !

    THANK YOU NVIDIA ! YOUR COMPETITION HAS KEPT THE GREEDY RED TEAM FROM EXORBITANT OVERPRICING LIKE THEY DID ON THEIR 290 SERIES !
  • f0d - Friday, July 10, 2015 - link

    i wasnt really impressed with the fury-x at its price point and performance
    this normal fury seems a bit better at its price point than the fury-x does

    as i write this the information on overclocking wasnt finished - i sure hope the fury overclocks much better than fury-x did because that was a massive letdown when it came to overclocking, when nvidia can get some crazy high overclocks with its maxwell it kinda makes the fury line seem not as good with its meager overclocks the fury-x had
    hopefully fury (non x) overclocks like a beast like the nvidia cards do
  • noladixiebeer - Friday, July 10, 2015 - link

    AMD haven't unlocked the voltage yet on Fury X. Hopefully, they will unlock the voltage cap soon, so the Fury X should be able to overclock better. Better than 980ti? We'll see, but Fury X still has lots of untapped resources.
  • Chaser - Saturday, July 11, 2015 - link

    Don't hold your breath. There is very little overhead in Fiji. That's clearly been divulged. As the article states Maxwell is very efficient and has a good deal of room for partners to indulge themselves. Especially the Ti.
  • chrnochime - Friday, July 10, 2015 - link

    The WC for the X makes up ~half of the price increase from non-x. For someone who's going to do moderate OC and don't want to bother doing WC conversion the X is a good choice, even over a Ti.
  • cmdrdredd - Friday, July 10, 2015 - link

    no it's not...the 980ti bests it handily. It's not a good choice at all when 980ti can overclock as well and many coolers have 0rpm fan modes for when it's at idle or very low usage.
  • akamateau - Tuesday, July 14, 2015 - link

    You haven't seen the DX12 Benchmarks yet. Anand has been keeping them from you. Once you see how much Radeon crushes nVidia you would never buy green again.

    nVidia silicon is RUBBISH with DX12 and Mantle. Radeon 290x is 33% faster than GTX 980Ti.
  • FlushedBubblyJock - Wednesday, July 15, 2015 - link

    sefem already told you...
    " "Draw calls are the best metric we have right now to compare AMD Radeon to nVidia ON A LEVEL PLAYING FIELD."
    Well, lets just for a moment consider this as true (and you should try to explain why :D )
    Looking at draw calls a GTX 980 should perform 2.5x faster than a 290X in DX11 (respectively 2.62M vs 1.05M draw calls) and even a GTX 960 would be 2.37x faster than the over mentioned 290X (respectively 2.49M vs 1.05M draw calls) :) "

    Now go back to stroking your amd spider platform.
