Compute

Shifting gears, we turn to compute performance. As compute performance is more directly affected by the reduction in CUs than most other tests, we’re expecting the R9 Fury’s performance hit relative to the R9 Fury X to be larger here than in our gaming tests.

Starting us off for our look at compute is LuxMark 3.0, the latest version of the official benchmark of LuxRender 2.0. LuxRender’s GPU-accelerated rendering mode is an OpenCL-based ray tracer that forms a part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years as it maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.

Compute: LuxMark 3.0 - Hotel

For LuxMark, with the R9 Fury X already holding the top spot, the R9 Fury cards easily take the next two spots. One interesting artifact of this is that the R9 Fury’s advantage over the GTX 980 is actually greater than the R9 Fury X’s advantage over the GTX 980 Ti, on both an absolute and a relative basis. This is despite the fact that the R9 Fury is some 13% slower than its fully enabled sibling.

For our second set of compute benchmarks we have CompuBench 1.5, the successor to CLBenchmark. CompuBench offers a wide array of different practical compute workloads, and we’ve decided to focus on face detection, optical flow modeling, and particle simulations.

Compute: CompuBench 1.5 - Face Detection

Compute: CompuBench 1.5 - Optical Flow

Compute: CompuBench 1.5 - Particle Simulation 64K

Not unlike LuxMark, tests where the R9 Fury X did well have the R9 Fury doing well too, particularly the optical flow sub-benchmark. The drop-off in that benchmark and face detection is about what we’d expect for losing 1/8th of Fiji’s CUs. On the other hand the particle simulation benchmark is hardly fazed beyond the clockspeed drop, indicating that the bottleneck lies elsewhere.
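As a rough sanity check on that expectation, the sketch below estimates how much peak shader throughput the R9 Fury gives up relative to the R9 Fury X. The 1/8th CU reduction comes from the text above; the 64 and 56 CU counts and the 1050MHz and 1000MHz reference clocks are assumptions taken from AMD's published specifications rather than from this page, and the helper name is purely illustrative.

```python
# Rough estimate of theoretical shader throughput scaling between two
# configurations of the same GPU. All figures are assumed reference specs,
# not measurements from this review.

def relative_throughput(cus, clock_mhz, base_cus, base_clock_mhz):
    """Return a card's peak throughput as a fraction of a baseline card.

    Assumes throughput scales linearly with CU count and core clock,
    which only holds for purely shader-bound workloads.
    """
    return (cus / base_cus) * (clock_mhz / base_clock_mhz)

# Assumed reference specifications (hypothetical inputs for illustration).
FURY_X = {"cus": 64, "clock_mhz": 1050}
FURY = {"cus": 56, "clock_mhz": 1000}

# Effect of the CU cut alone (clock held at the Fury X value).
cu_only = relative_throughput(
    FURY["cus"], FURY_X["clock_mhz"], FURY_X["cus"], FURY_X["clock_mhz"]
)
# Effect of the clock drop alone (CU count held at the Fury X value).
clock_only = relative_throughput(
    FURY_X["cus"], FURY["clock_mhz"], FURY_X["cus"], FURY_X["clock_mhz"]
)
# Both changes together.
combined = relative_throughput(
    FURY["cus"], FURY["clock_mhz"], FURY_X["cus"], FURY_X["clock_mhz"]
)

print(f"CU reduction alone: {1 - cu_only:.1%} slower")
print(f"Clock drop alone:   {1 - clock_only:.1%} slower")
print(f"Both combined:      {1 - combined:.1%} slower")
```

Under these assumptions, a workload that loses something in the low-to-mid teens is behaving as if it were shader-bound, while one that loses only about as much as the clock drop alone, as the particle simulation appears to, is presumably limited by something other than CU count.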

Our 3rd compute benchmark is Sony Vegas Pro 13, an OpenGL and OpenCL video editing and authoring package. Vegas can use GPUs in a few different ways, the primary uses being to accelerate the video effects and compositing process itself, and in the video encoding step. With video encoding being increasingly offloaded to dedicated DSPs these days we’re focusing on the editing and compositing process, rendering to a low CPU overhead format (XDCAM EX). This specific test comes from Sony, and measures how long it takes to render a video.

Compute: Sony Vegas Pro 13 Video Render

At this point Vegas is becoming increasingly CPU-bound and will be due for replacement. The R9 Fury comes in one second behind the chart-topping R9 Fury X, at 22 seconds.

Moving on, our 4th compute benchmark is FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance. Each precision has two modes, explicit and implicit, the difference being whether water atoms are included in the simulation, which adds quite a bit of work and overhead. This is another OpenCL test, utilizing the OpenCL path for FAHCore 17.

Compute: Folding @ Home: Explicit, Single Precision

Compute: Folding @ Home: Implicit, Single Precision

Compute: Folding @ Home: Explicit, Double Precision

Overall, while the R9 Fury doesn’t have to aim quite as high given its weaker GTX 980 competition, FAHBench still stresses the Radeon cards. Under the single precision tests the GTX 980 pulls ahead, and the R9 Fury only surpasses it under double precision, thanks to NVIDIA’s weaker FP64 performance.
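To put the FP64 gap in rough numbers, the sketch below computes theoretical peak throughput for the two cards. Every input is an assumption drawn from the vendors' public specifications rather than from this page: 3584 shaders at 1000MHz with a 1/16 FP64 rate for the R9 Fury, and 2048 shaders at a 1126MHz base clock with a 1/32 FP64 rate for the GTX 980. The function and table names are illustrative only.

```python
# Back-of-the-envelope peak throughput from assumed public specifications.
# None of these figures come from the benchmark results on this page.

def peak_gflops(shaders, clock_mhz, fp64_ratio):
    """Return (fp32, fp64) theoretical peak GFLOPS.

    Counts one fused multiply-add (2 FLOPs) per shader per clock;
    fp64_ratio is the FP64 rate as a fraction of the FP32 rate.
    """
    fp32 = 2 * shaders * clock_mhz / 1000
    return fp32, fp32 * fp64_ratio

# (shaders, clock in MHz, FP64:FP32 ratio) -- assumed reference values.
CARDS = {
    "R9 Fury": (3584, 1000, 1 / 16),
    "GTX 980": (2048, 1126, 1 / 32),
}

for name, spec in CARDS.items():
    fp32, fp64 = peak_gflops(*spec)
    print(f"{name}: {fp32:.0f} GFLOPS FP32, {fp64:.0f} GFLOPS FP64")
```

On these assumed figures the R9 Fury has roughly three times the GTX 980's peak FP64 rate, which is consistent with the double precision result. It also has the higher peak FP32 rate on paper, so the GTX 980's single precision lead is a reminder that theoretical peaks do not predict how a particular workload will actually run.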

Wrapping things up, our final compute benchmark is an in-house project developed by our very own Dr. Ian Cutress. SystemCompute is our first C++ AMP benchmark, utilizing Microsoft’s simple C++ extensions to allow the easy use of GPU computing in C++ programs. SystemCompute in turn is a collection of benchmarks for several different fundamental compute algorithms, with the final score represented in points. DirectCompute is the compute backend for C++ AMP on Windows, so this forms our other DirectCompute test.

Compute: SystemCompute v0.5.7.2 C++ AMP Benchmark

As with our other tests the R9 Fury loses some performance on our C++ AMP benchmark relative to the R9 Fury X, but only around 8%. As a result it’s competitive with the GTX 980 Ti here, blowing well past the GTX 980.

 

288 Comments

  • SolMiester - Saturday, July 11, 2015

    WOW, so much fail from AMD...might as well kiss their ass goodbye!
    Pimping the Fury at 4k, when really even the 980Ti is borderline on occasion, and releasing a card with no OC headroom at the same price as its competitor!
  • ES_Revenge - Saturday, July 11, 2015

    I didn't have too high hopes for the "regular" Fury [Pro] after the disappointing Fury X. However I have to say...this thing makes the Fury X look bad, plain and simple. With a pretty significant cut-down (numerically) in SPs and 32 fewer TMUs, you'd expect this thing to be more of a yawn. Instead it gives very near to Fury X performance and still faster than a GTX 980.

    The only problem with it is price. At $550 it still costs more than a GTX 980 and Fury has less OC potential. And at only $100 less than Fury X it's not really much of a deal considering the AIO/CLC with that is probably worth $60-80. So really you're only paying $30 or so for the performance increase of Fury X (which isn't that much but it's still faster). What I suggest AMD "needs to do" is price this thing near to where they have the 390X priced. Fury Pro at ~$400 price will pull sales from Nvidia's 980 so fast it's not funny. Accordingly the 390X should be priced lower as well.

    But I guess AMD can't really afford to undercut Nvidia at the moment so they're screwed either way. Price is high, people aren't going to bother; lower the price and people will buy but then maybe they're just losing money.

    But imagine buying one of these at $400ish, strapping on an Asetek AIO/CLC you might have lying around (perhaps with a Kraken bracket), and you have a tiny little card* with a LOT of GPU power and nice low temps, with performance like a Fury X. Well one can dream, right? lol

    *What I don't understand is why Asus did a custom PCB to make the thing *longer*??? One of the coolest things about Fury is how small the card is. They just went and ruined that--they took it and turned it back into a 290X, the clowns. While the Sapphire one still straps on an insanely large cooler, at least if you remove it you're still left with the as-intended short card.
  • FlushedBubblyJock - Thursday, July 16, 2015

    can you even believe the 390x is at $429 and $469 and $479 ... the rebrand over 2.5 years old or so... i mean AMD has GONE NUTS.
  • akamateau - Sunday, July 12, 2015

    @Ryan Smith

    Hmmm.

    You ran a whole suite of synthetic benchmarks yet you completely ignored DX12 Starswarm and 3DMark's API Overhead test.

    The question that I have is why did you omit DX12 benchmarks?

    Starswarm is NOW COMPLETE AND MATURE.

    It is also NOT synthetic but rather a full length game simulation; but you know this.

    3dMark is synthetic but it is THE prime indicator of the CPU to GPU data pipeline performance.

    They are also all we have right now to adequately judge the value of a $549 dollar AMD GPU vs a $649 nVidia GPU for new games coming up.

    Since better than 50% of games released this Christmas will be DX12 don't you think that consumers have a right to know how well a high performance API will work with a dGPU card designed to run on both Mantle and DX12?

    AMD did not position Fiji for DX11. This card IS designed for DX12 and Mantle.

    So show us how well it does.
  • Ryan Smith - Monday, July 13, 2015

    The Star Swarm benchmark is, by design, a proof of concept. It is meant to showcase the benefits of DX12/Mantle as it applies to draw calls, not to compare the gaming performance of video cards.

    Furthermore the latest version is running a very old version of the engine that has seen many changes. We will not be able to include any Oxide engine games until Ashes of the Singularity (which looks really good, by the way) is out of beta.

    Finally, the 3DMark API Overhead test is not supposed to be used to compare video cards from different vendors. From the technical guide: "The API Overhead feature test is not a general-purpose GPU benchmark, and it should not be used to compare graphics cards from different vendors."
  • FlushedBubblyJock - Thursday, July 16, 2015

    " Since better than 50% of games released this Christmas will be DX12 "
    I'LL BET YOU A GRAND THAT DOES NOT HAPPEN.

    It's always the amd fanboy future, with the svengali ESP blabbed in for full on PR BS...
  • akamateau - Sunday, July 12, 2015

    @Ryan Smith

    Do you also realise that Fiji completely outclasses Maxwell and Tesla as well?

    Gaming is a sideshow. AMD is positioning Fury x to sell for $350+ as a single unit silicon for HPC. With HBM on the package!!!

    HPGPU computing is now up for grabs. Comparing the Fiji PACKAGE to the Maxwell or Tesla PACKAGE has AMD thoroughly outclassing the Professional Workstation and HPC silicon.

    HBM stacked memory can be configured as cache and still feed GDDR5 RAM for multiple monitors.

    AMD has several patents for just that while using HBM stacked memory.

    I think that AMD is quietly positioning Fiji and Greenland next for High Performance Computing.

    Fury X2 with 17 Tflops of single precision and almost 8 Tflops of double precision is going to change the cluster server market.

    Of course Fury X2 will rock this Christmas.

    What will be the release price? I think less than $999!!!

    AMD has made a habit of being the grinch that stole nVidias Christmas.
  • Ryan Smith - Monday, July 13, 2015

    Note that Fiji is not expected to appear in any HPC systems. It has no ECC, minimal speed FP64, and only 4GB of VRAM. HPC users are generally after processors with large amounts of memory and ECC, and frequently FP64 as well.

    AMD's HPC product for this cycle is the FirePro S9170, a 32GB Hawaii card: http://www.amd.com/en-us/products/graphics/server/...
  • FlushedBubblyJock - Thursday, July 16, 2015

    ROFLMAO delusion after delusion...
  • loguerto - Sunday, July 12, 2015

    Looking at what happened with the old generation AMD and nvidia gpus, i wouldn't be surprised if after a few driver updates the fiji will be well ahead of maxwell. AMD always improved its old architectures with software updates while nvidia never quite did that, actually they downgrade their old gpus so that they can sell their next overpriced SoC.
