Compute

Our final set of performance benchmarks is compute performance, which for dual-GPU cards is always a mixed bag. Unlike gaming, where the somewhat genericized AFR process is applicable to most games, in compute the ability of a program to make good use of multiple GPUs lies solely in the hands of the program’s authors and the algorithms they use.

At the same time, while we’re covering compute performance for completeness, the 295X2’s high price and unconventional cooling apparatus are likely to deter most serious compute users.

In any case, our first compute benchmark is LuxMark 2.0, the official benchmark of SmallLuxGPU 2.0. SmallLuxGPU is an OpenCL accelerated ray tracer that is part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years as ray tracing maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.

Compute: LuxMark 2.0

As one of the few compute tasks that’s generally multi-GPU friendly, ray tracing is going to be the best case scenario for compute performance for the 295X2. Under LuxMark AMD sees virtually perfect scaling, with the 295X2 nearly doubling the 290X’s performance under this benchmark. No other single card is currently capable of catching up to the 295X2 in this case.
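To illustrate why ray tracing tends to be multi-GPU friendly, here is a minimal sketch of dealing independent image tiles out to two devices. This is not how LuxMark or SmallLuxGPU actually schedules work; the frame size, tile size, and the function names `make_tiles` and `assign_round_robin` are assumptions chosen purely for illustration.

```python
# Illustrative sketch only: the real LuxMark scheduler is not described in
# this article. Frame size, tile size, and all names here are hypothetical.

def make_tiles(width, height, tile_size):
    """Cut a width x height frame into (x, y, w, h) tiles."""
    tiles = []
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            # Edge tiles are clipped so they never extend past the frame.
            w = min(tile_size, width - x)
            h = min(tile_size, height - y)
            tiles.append((x, y, w, h))
    return tiles


def assign_round_robin(tiles, num_devices):
    """Deal tiles to devices in turn; no tile needs another tile's result."""
    queues = [[] for _ in range(num_devices)]
    for index, tile in enumerate(tiles):
        queues[index % num_devices].append(tile)
    return queues


def pixels(queue):
    """Total pixel count covered by a device's tile queue."""
    return sum(w * h for (_x, _y, w, h) in queue)


frame_tiles = make_tiles(1920, 1080, 64)
gpu_queues = assign_round_robin(frame_tiles, num_devices=2)
for gpu, queue in enumerate(gpu_queues):
    print(f"GPU {gpu}: {len(queue)} tiles, {pixels(queue)} pixels")
```

Because each tile can be rendered without waiting on any other, the two queues can be processed in parallel. Workloads with dependencies between steps, by contrast, give the second GPU much less to do.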

Our second compute benchmark is Sony Vegas Pro 12, an OpenGL and OpenCL video editing and authoring package. Vegas can use GPUs in a few different ways, the primary uses being to accelerate the video effects and compositing process itself, and in the video encoding step. With video encoding being increasingly offloaded to dedicated DSPs these days we’re focusing on the editing and compositing process, rendering to a low CPU overhead format (XDCAM EX). This specific test comes from Sony, and measures how long it takes to render a video.

Compute: Sony Vegas Pro 12 Video Render

Sony Vegas Pro on the other hand sees no advantage from multiple GPUs. The 295X2 does just as well as the other Hawaii cards at 22 seconds, sharing the top of the chart, but the second GPU goes unused.

Our third benchmark set comes from CLBenchmark 1.1. CLBenchmark contains a number of subtests; we’re focusing on the most practical of them, the computer vision test and the fluid simulation test. The former is a useful proxy for computer imaging tasks where systems are required to parse images and identify features (e.g. humans), while the latter represents the fluid simulations that are common in professional graphics work and games alike.

Compute: CLBenchmark 1.1 Fluid Simulation

Compute: CLBenchmark 1.1 Computer Vision

Like Vegas Pro, the CLBenchmark sub-tests we use here don't scale with additional GPUs. So the 295X2 can only match the performance of the 290X on these benchmarks.

Moving on, our fourth compute benchmark is FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance. Each precision has two modes, explicit and implicit, the difference being whether water atoms are included in the simulation, which adds quite a bit of work and overhead. This is another OpenCL test, as Folding @ Home has moved exclusively to OpenCL this year with FAHCore 17.

Compute: Folding @ Home: Explicit, Single Precision

Compute: Folding @ Home: Explicit, Double Precision

Unlike most of our compute benchmarks, Folding@Home does see some degree of multi-GPU scaling. However, the outcome is really a mixed bag; single-precision performance ends up being a wash (if not a slight regression), while double-precision performance sees sub-50% scaling.
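For readers who want to put a number on "scaling," the following is a minimal sketch of how a multi-GPU gain can be computed from a pair of results. The helper names are our own invention, and the score-based figures (100 and 140) are placeholders rather than measured results; only the 22 second Vegas Pro render time comes from the charts above.

```python
# Sketch of multi-GPU scaling arithmetic. Function names are hypothetical,
# and the score-based example uses placeholder numbers, not measured data.

def speedup_from_scores(single_gpu_score, multi_gpu_score):
    """Higher-is-better results (e.g. points): speedup is multi / single."""
    return multi_gpu_score / single_gpu_score


def speedup_from_times(single_gpu_seconds, multi_gpu_seconds):
    """Lower-is-better results (e.g. render time): speedup is single / multi."""
    return single_gpu_seconds / multi_gpu_seconds


def scaling_gain(speedup):
    """Fractional gain over one GPU: 0.0 is no gain, 1.0 is doubled performance."""
    return speedup - 1.0


def scaling_efficiency(speedup, num_gpus=2):
    """Gain as a fraction of the ideal gain from the extra GPUs."""
    return scaling_gain(speedup) / (num_gpus - 1)


# From the article: both the 290X and 295X2 render the Vegas test in 22 seconds.
vegas = speedup_from_times(22.0, 22.0)
print(f"Vegas Pro gain: {scaling_gain(vegas):.0%}")

# Placeholder scores showing what "sub-50% scaling" looks like.
hypothetical = speedup_from_scores(100.0, 140.0)
print(f"Hypothetical gain: {scaling_gain(hypothetical):.0%}")
```

With two GPUs the gain and the efficiency are the same number, which is why a result described as "sub-50% scaling" means the second GPU is contributing less than half of what it ideally could.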

Wrapping things up, our final compute benchmark is an in-house project developed by our very own Dr. Ian Cutress. SystemCompute is our first C++ AMP benchmark, utilizing Microsoft’s simple C++ extensions to allow the easy use of GPU computing in C++ programs. SystemCompute in turn is a collection of benchmarks for several different fundamental compute algorithms, as described in this previous article, with the final score represented in points. DirectCompute is the compute backend for C++ AMP on Windows, so this forms our other DirectCompute test.

Compute: SystemCompute v0.5.7.2 C++ AMP Benchmark

Our final compute benchmark has the 295X2 and 290X virtually tied once again, as this is another benchmark that doesn’t scale up with multiple GPUs.

131 Comments

  • HalloweenJack - Tuesday, April 8, 2014 - link

    cheaper set of 780ti`s? 2 of them is $1300 > $1400 and the 295 isn't even in retail yet....

    anandtech going to slate the Titan Z as much? or is the pay cheques worth too much. shame to see the bias , anandtech used to be a good site before it sold out.
  • GreenOrbs - Tuesday, April 8, 2014 - link

    Not seeing the bias--Anandtech is usually pretty fair. I think you have overlooked the fact that AMD is a sponsor, not NVIDIA. If anything, "slating" Titan Z would be more consistent with your theory of "selling out."
  • nathanddrews - Tuesday, April 8, 2014 - link

    What bias?

    http://www.anandtech.com/bench/product/1187?vs=107...
    Two 780ti cards are cheaper than the 295x2, that's a fact.
    Two 780ti cards consume much less power than the 295x2, that's a fact.
    Two 780ti cards have better frame latency than the 295x2, that's a fact.
    Two 780ti cards have nearly identical performance to the 295x2, that's a fact.

    If someone was trying to decide between them, I'd recommend dual 780ti cards to save money and get similar performance. However, if that person only had a dual-slot available, it would be the 295x2 hands-down.

    The Titan Z isn't really any competition here - the 790 (790ti?) will be the 295x2's real competition. The real question is will NVIDIA price it less than or more than the 295x2?
  • PEJUman - Tuesday, April 8, 2014 - link

    I don't think the target market for this stuff (295x2 or Titan Z) are single GPU slots, as Ryan briefly mentioned, most people who are quite poor (myself included), will go with 780TI x 2 or 290x x 2, These cards are aimed at Quads.

    AMD have priced it appropriately, roughly equal perf. potential for 3k dual 295x2 vs 6k for dual titan-z. Unfortunately, 4GB may not be enough for Quads...

    I've ventured into multiGPUs in the past, I find these rely too much on driver updates (see how poorly 7990 runs nowadays, and AMD will be concentrating their resource on 295x2). Never again.
  • Earballs - Wednesday, April 9, 2014 - link

    With respect, any decision on what to buy should be made based on what your application is. Paper facts are worthless when they don't hold up to (your version of) real world tasks. Personally I've been searching for a good single card to make up for Titanfall's flaws with CF/SLI. Point is, be careful with your recommendations if they're based on facts. ;)

    Sidenote: I managed to pick up a used 290x for MSRP with the intention of adding another one once CF is fixed with Titanfall. That price:performance, which can be had today, skews the results of this round-up quite a bit IMO.
  • MisterIt - Tuesday, April 8, 2014 - link

    By drawing that much power from the PCI-lane, won't it be a fire hazard? I've read multiple posts on bitcoin/scryptcoin mining forums about motherboards catching fire due to using too many GPUs without a powered riser to lower the amount of power delivered through the PCI-lane.

    Would Anandtech be willing to test the claim from AMD by running the GPU at full load for a longer period of time under a fire controlled environment?
  • Ryan Smith - Tuesday, April 8, 2014 - link

    The extra power is designed to be drawn off of the external power sockets, not the PCIe slot itself. It's roughly 215W + 215W + 75W, keeping the PCIe slot below its 75W limit.
  • MisterIt - Tuesday, April 8, 2014 - link

    Hmm allright, thanks for the reply.
    Still rather skeptical, but I'll guess there should be plenty of users reviews before the time i'm considering to upgrade my own GPU anyways.
  • CiccioB - Tuesday, April 8, 2014 - link

    Don't 8-pin molex connector specifics indicate 150W max power draw? 215W are quite out of that limit.
  • Ryan Smith - Tuesday, April 8, 2014 - link

    Yes, but it's a bit more complex than that: http://www.anandtech.com/show/4209/amds-radeon-hd-...
