Compute

With GTX 980, NVIDIA surprised us with a stunning turnaround in GPU compute performance, which saw them reach the top of many compute benchmarks they couldn’t before. GTX 970 meanwhile should benefit from the same architectural and driver improvements, though since compute is nearly analogous to shader performance, this is also a case where the performance difference between the GTX 970 and GTX 980 stands to be among its widest.
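To put a rough number on that gap, here is a back-of-the-envelope sketch of theoretical single precision throughput. The core counts and base clocks are NVIDIA's published specifications rather than figures measured for this page, and the helper name is purely illustrative. Real compute workloads rarely scale this cleanly, so treat it as an upper bound on the difference rather than a prediction.

```python
# Assumed published specs (not measured here): CUDA core count and base clock in MHz.
CARDS = {
    "GTX 980": (2048, 1126),
    "GTX 970": (1664, 1050),
}

def peak_fp32_gflops(cores, clock_mhz):
    # Assumes each CUDA core retires one fused multiply-add (2 FLOPs) per clock.
    return cores * 2 * clock_mhz / 1000.0

gtx980 = peak_fp32_gflops(*CARDS["GTX 980"])
gtx970 = peak_fp32_gflops(*CARDS["GTX 970"])
ratio = gtx970 / gtx980

print(f"GTX 980: {gtx980:.0f} GFLOPS, GTX 970: {gtx970:.0f} GFLOPS, ratio {ratio:.2f}")
```

At base clocks this puts GTX 970 at roughly three quarters of GTX 980's theoretical shader throughput; boost clocks narrow the gap somewhat.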

As always we’ll start with LuxMark 2.0, the official benchmark of SmallLuxGPU 2.0. SmallLuxGPU is an OpenCL accelerated ray tracer that is part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years as ray tracing maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.

Compute: LuxMark 2.0

Because GTX 980 takes the top spot here, GTX 970 has enough headroom to still maintain a small lead over R9 290XU. So even with the GTX 970's weaker performance, it can still manage to outperform AMD's flagship in this case.

For our second set of compute benchmarks we have CompuBench 1.5, the successor to CLBenchmark. We’re not due for a benchmark suite refresh until the end of the year; however, as CLBenchmark does not know what to make of GTX 980 and is rather old overall, we’ve upgraded to CompuBench 1.5 for this review.

Compute: CompuBench 1.5 - Face Detection

Compute: CompuBench 1.5 - Optical Flow

Compute: CompuBench 1.5 - Particle Simulation 64K

In the cases where the GTX 980 does well, so does the GTX 970. In the cases where the GTX 980 wasn’t fast enough to top the charts, the GTX 970 will be similarly close behind. Overall compared to AMD’s lineup we see the whole gamut, from a tie between the GTX 970 and R9 290XU to victories for either card.

Our 3rd compute benchmark is Sony Vegas Pro 12, an OpenGL and OpenCL video editing and authoring package. Vegas can use GPUs in a few different ways, the primary uses being to accelerate the video effects and compositing process itself, and in the video encoding step. With video encoding being increasingly offloaded to dedicated DSPs these days we’re focusing on the editing and compositing process, rendering to a low CPU overhead format (XDCAM EX). This specific test comes from Sony, and measures how long it takes to render a video.

Compute: Sony Vegas Pro 12 Video Render

As expected, GTX 970 sheds a bit of performance relative to GTX 980 here. AMD still holds the overall lead in this benchmark, and against GTX 970 that lead is a little bit larger.

Moving on, our 4th compute benchmark is FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance. Each precision has two modes, explicit and implicit, the difference being whether water atoms are included in the simulation, which adds quite a bit of work and overhead. This is another OpenCL test, utilizing the OpenCL path for FAHCore 17.

Compute: Folding @ Home: Explicit, Single Precision

Compute: Folding @ Home: Implicit, Single Precision

Compute: Folding @ Home: Explicit, Double Precision

With the GTX 980 holding such a commanding lead here, even with the GTX 970’s lower performance it still is more than enough to easily beat any other card in single precision Folding @ Home workloads. Only in double precision with NVIDIA’s anemic 1:32 ratio does GTX 970 falter.
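To illustrate why the 1:32 ratio hurts so much, the sketch below derives a theoretical double precision rate from the single precision rate. The GTX 970 core count and base clock are NVIDIA's published specifications, assumed here rather than taken from this page, and the function name is hypothetical.

```python
def peak_fp64_gflops(fp32_gflops, fp64_ratio):
    # fp64_ratio is the FP64:FP32 throughput ratio, e.g. 1/32 for a 1:32 part.
    return fp32_gflops * fp64_ratio

# Assumed GTX 970 specs: 1664 CUDA cores at a 1050 MHz base clock,
# with 2 FLOPs (one fused multiply-add) per core per clock.
gtx970_fp32 = 1664 * 2 * 1050 / 1000.0
gtx970_fp64 = peak_fp64_gflops(gtx970_fp32, 1 / 32)

print(f"FP32: {gtx970_fp32:.0f} GFLOPS, FP64: {gtx970_fp64:.1f} GFLOPS")
```

Under these assumptions the card has only around a hundred GFLOPS of double precision throughput to work with, which is consistent with it falling behind in the double precision test.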

Wrapping things up, our final compute benchmark is an in-house project developed by our very own Dr. Ian Cutress. SystemCompute is our first C++ AMP benchmark, utilizing Microsoft’s simple C++ extensions to allow the easy use of GPU computing in C++ programs. SystemCompute in turn is a collection of benchmarks for several different fundamental compute algorithms, with the final score represented in points. DirectCompute is the compute backend for C++ AMP on Windows, so this forms our other DirectCompute test.

Compute: SystemCompute v0.5.7.2 C++ AMP Benchmark

Recently this has been a stronger benchmark for AMD cards than NVIDIA cards, and consequently GTX 970 doesn’t enjoy quite the lead it sees elsewhere. Though not too far behind R9 280X and even R9 290, like GTX 980 it can’t crunch numbers quite fast enough to keep up with R9 290XU.


155 Comments


  • MrSpadge - Saturday, September 27, 2014

    What you describe is what Tonga should have been. Didn't turn out so well :/
    Sure, the 285 is priced below GM204 cards, but the chip is almost as large and hence costs AMD the same to produce it. They SHOULD play in the same league.
  • thepaleobiker - Friday, September 26, 2014

    "Crysis 3 Summary" - The GTX 670 trails the R9 290XU by 10%....

    It should be the GTX 970 :)

    Also, on the page with Company of Heroes - The Charts do not display correctly, or more specifically, their headers (the thick Blue bar/heading with info about resolution etc?) are cropped out on all the images except the first one.

    Regards,
    Vishnu
  • krazyfrog - Friday, September 26, 2014

    Also, last page, fifth paragraph

    "AMD would have to cut R9 290X’s performance by nearly $200 to be performance competitive, and even then they can’t come close to matching NVIDIA’s big edge in power consumption."

    Should be '290X's price', I believe.
  • CZroe - Friday, September 26, 2014

    I have the reference style 4GB EVGA GTX 760 with the short PCB but it was discontinued shortly after launch. I got some 670/760 water blocks for SLI from Swiftech and found that only Zotac was making a short PCB 4GB GTX 760 card like my EVGA even though it has fewer memory chips (probably worse for over locking). Because the vast majority of GTX 760 cards had reference 680 PCBs, it is very difficult to tell which "reference" 760 this article is talking about. The rare short PCB or the longer one?
  • Ryan Smith - Friday, September 26, 2014

    The short PCB and the stretched PCB were virtually identical, so to answer the question I'm technically comparing it to the short PCB, but either comparison is valid. The stretched section only contains a handful of additional discrete components; it's mostly to allow fitting an open air cooler.
  • Atari2600 - Friday, September 26, 2014

    The Radeon 290x is nearly a year old now. It would be surprising if an Nvidia GPU that sacrifices DP capability wouldn't be significantly quicker per mm^2 at this point.

    The improvement in performance/watt is notable and nvidia deserve much credit for their work in this area.
  • Phasenoise - Friday, September 26, 2014

    Oh my the name of that card. "EVGA GeForce GTX 970 FTW ACX 2.0"
    Reads like my niece's texting log. omg lol gtx ftw, btb.
  • jjj - Friday, September 26, 2014

    "considering the fact that GTX 970 is a $329 card I don’t seriously expect it to be used for 4K gaming"
    You should and Nvidia should market it as lowering the entry price for 4k. It's kinda the least you need for 4k, can do 4k just not max setting everywhere for just 329$. Pair it with some bellow 500$ screen deal and 4k gaming is a lot more accessible than 2 weeks ago.
    We've looked at what's the bare minimum for 1080p for years and we are used to it, now it's 4k's turn to become more mainstream and we need to get used to looking at it the same way.
    in the US and even more so in China 4k screens are not that prohibitive anymore. 400-500$ for a screen, 330$ for a 970, an overclocked dual core and you can do budget 4k gaming at 1.2k$.
    4k should be one of the reasons some people that buy 200$ cards might go for the 970 this time around.
    And with 20nm 4k should become a lot more affordable so it's time to not think of it as a small niche.
  • garadante - Friday, September 26, 2014

    Can we get power usage with non reference 290/290X's? If I recall correctly, power usage drops something like 15-25 watts when it's running at closer to 70 C than 95 C as reference cooling profiles make it run.
  • justaviking - Friday, September 26, 2014

    Zero Fan Speed Idling...

    Will the zero fan speed capability be something that will be delivered via a driver update? I hope so. Or will it require a "Rev B" of the hardware too?

    I don't need silence, but quiet is nice. I assumed the 970 would be quite a bit quieter than the 980 due to lower TDP. The test results surprise me (not in a good way).

    This makes me even more impressed with the 980. But it still costs $200 more. Tough choice.
