Compute

As always, we'll start with our DirectCompute game example, Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. While DirectCompute is used in many games, this is one of the only games with a benchmark that can isolate the use of DirectCompute and its resulting performance.

Compute: Civilization V

Civ V's texture decompression routines are technically multi-GPU capable, but multi-GPU scaling has never been particularly impressive here. So this test mostly reinforces what we already know about the Tahiti GPU being very capable in most DirectCompute workloads.

Our next benchmark is LuxMark2.0, the official benchmark of SmallLuxGPU 2.0. SmallLuxGPU is an OpenCL accelerated ray tracer that is part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years as ray tracing maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.

Compute: LuxMark 2.0

The 7990 isn’t billed as a compute product, but that doesn’t mean it’s at all weak at compute. On the contrary, as LuxMark doesn’t hit the ROPs hard the 7990 has no trouble staying under its 375W target, allowing it to sustain 1000MHz indefinitely. As a result the 7990 takes AMD’s compute advantage and runs with it. The 7990 is a bit more than 2x the cost of a 680, but it’s 8.5x faster. Even against GTX Titan the difference is just short of 4x; NVIDIA simply doesn’t do well in our OpenCL tests.
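To put the cost and speed figures above on one scale, here is a rough sketch of the performance-per-dollar arithmetic. The function name is hypothetical, and the cost ratio of 2.0 is a rounded assumption standing in for "a bit more than 2x", so the result is an upper-end estimate rather than a measured value.

```python
def perf_per_cost_ratio(speedup: float, cost_ratio: float) -> float:
    """Relative performance per dollar of card A versus card B.

    speedup    -- how many times faster card A is than card B
    cost_ratio -- how many times more card A costs than card B
    """
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    return speedup / cost_ratio


# Figures from the text: the 7990 is about 8.5x faster than a GTX 680 in
# LuxMark. The 2.0 cost ratio is an assumed round number for illustration.
ratio = perf_per_cost_ratio(speedup=8.5, cost_ratio=2.0)
print(f"{ratio:.2f}x LuxMark performance per dollar")
```

Since the real cost ratio is somewhat above 2, the actual advantage per dollar in this one benchmark would come out slightly lower than the printed figure.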

Our 3rd benchmark set comes from CLBenchmark 1.1. CLBenchmark contains a number of subtests; we’re focusing on the most practical of them, the computer vision test and the fluid simulation test. The former is a useful proxy for computer imaging tasks where systems are required to parse images and identify features (e.g. humans), while fluid simulations are common in professional graphics work and games alike.

Compute: CLBenchmark 1.1 Fluid Simulation

Compute: CLBenchmark 1.1 Computer Vision

These two CLBenchmark sub-tests aren’t multi-GPU capable, so the performance of the 7990 is essentially dictated by its first GPU. All that means however is that the 7990 is once again at the top of the charts, well ahead of NVIDIA’s other cards and beating Titan by 50%-100%. At the same time this is a good reminder that not every compute task scales well across multiple GPUs, which is why single-GPU products still have a strong place in the GPU compute world.

Moving on, our 4th compute benchmark is FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance. Each precision has two modes, explicit and implicit, the difference being whether water atoms are included in the simulation, which adds quite a bit of work and overhead. This is another OpenCL test, as Folding @ Home is moving exclusively to OpenCL this year with FAHCore 17.

Compute: Folding @ Home: Explicit, Single Precision

Only FAHBench’s explicit mode is multi-GPU capable, and even then the scaling to multiple GPUs isn’t great.  Still, it’s enough to help the 7990 take the top spot on this benchmark, even with NVIDIA’s latest drivers slightly closing the gap. What’s particularly interesting here though is that the 7990 is faster than the 7970GE CF, and that’s not a fluke in our results. The 7990 should by all means be at least a bit slower, and more if throttling kicks in. It looks like we’re seeing one of those rare cases where the GPUs are benefitting from the presence of the PLX bridge, as going through the relatively close-by bridge is faster than in a two-card setup where the GPUs would have to communicate through the CPU/Northbridge. However this is the only time we see such an advantage; in most other compute scenarios – let alone gaming – the PLX bridge won’t have any kind of notable impact.

Wrapping things up, our final compute benchmark is an in-house project developed by our very own Dr. Ian Cutress. SystemCompute is our first C++ AMP benchmark, utilizing Microsoft’s simple C++ extensions to allow the easy use of GPU computing in C++ programs. SystemCompute in turn is a collection of benchmarks for several different fundamental compute algorithms, as described in this previous article, with the final score represented in points. DirectCompute is the compute backend for C++ AMP on Windows, so this forms our other DirectCompute test.

Compute: SystemCompute v0.5.7.2 C++ AMP Benchmark

SystemCompute isn’t multi-GPU capable, so once again we’re leaning on the 7990’s first GPU. To that end we find the 7990 in second place, but we also see the 7990 clearly trailing the 7970GE by more than we’ve seen in our other compute benchmarks. SystemCompute does do a lot of I/O, so if FAHBench is the ideal case for showcasing the benefits of the PLX bridge in GPU to GPU I/O, then SystemCompute is a good case for showcasing the drawbacks of the PLX bridge, mainly the higher I/O latency. It’s not enough to cripple the 7990 – it’s faster than the GTX Titan even here – but it does cost the 7990 some performance.


91 Comments

  • colonelclaw - Wednesday, April 24, 2013 - link

    The card I don't understand the price/performance/name of is the Titan. Looking at these charts shouldn't Nvidia have called it the GTX780? Maybe I'm reading it wrong, but it doesn't look like much more than the standard generational change we normally get once a year from Nvidia/AMD, and follows on from 2012's 6xx series. Charging a grand for it seems a little offensive, in my opinion.
  • prime2515103 - Wednesday, April 24, 2013 - link

    "The GTX 690 is a 300W card and the 7990 is a 375W card. The GTX 690 consumes around 75W less power and puts off 75W less heat, full stop."

    If the 690 was consuming 75W less power and dissipating 75W less heat, it would be drawing 150W less in total. How did you calculate this?
  • tk11 - Wednesday, April 24, 2013 - link

    Consumed power = dissipated heat. He's just pointing out that the increased power draw also equates to an increase of 75W of heat output.
  • sulu1977 - Wednesday, April 24, 2013 - link

    3 fans? Oh please, you can do better than that. For that price I want at least 9 whizzing fans because I simply love my quiet workroom to sound like a busy airport.
  • tk11 - Wednesday, April 24, 2013 - link

    More fans != more noise because more fans running at lower speeds make less noise than fewer fans running at higher speeds.
  • chadwilson - Wednesday, April 24, 2013 - link

    I know the whole mindset of put it out on release, but I really don't see a reason to read this article without FCAT information. Anyone who would be considering a purchase will be waiting until this data comes available with the latest drivers, so the entire article IMO is moot without it.
  • JarredWalton - Wednesday, April 24, 2013 - link

    Personally, if you're concerned about FCAT I think you'll want to wait about three months before buying any dual-GPU AMD setup. Maybe they'll surprise me and fix their drivers before then, but I'm betting on partial and flaky fixes for a little while longer.
  • Beavermatic - Wednesday, April 24, 2013 - link

    looks like Nvidia already responded with a Titan Ultra model today...

    http://www.cinemablend.com/games/Nvidia-Teases-GTX...

    seeing how the 7990 is a dual-gpu card, and the Titan is a single GPU, I would hope the 7990 would beat it. You'd have been a lot wiser to compare it to Nvidia's dual GPU card, the 690 (which is already faster than the Titan to begin with).

    The fact remains, the titan is like 15 to 20% slower than the 690 or 7990, and its single GPU. That's pretty damn impressive that the single-gpu titan can compete with the dual-gpu cards. Toss in another titan for SLI, and it slaughters both of those cards, lolololol. And not by a small amount, but by leaps and bounds.

    Also, check the 7990 benchmarks, look at the microstutter and framerate averages. They are god awfully terrible as well as power consumption. What good is a card when it's rollercoastering framerates like mad? I know Nvidia's SLI has some issues as well, but they've really fined tuned it, but AMD's crossfire and multigpu cards are just horrendous, and shouldn't even be considered.
  • Nfarce - Wednesday, April 24, 2013 - link

    "The fact remains, the titan is like 15 to 20% slower than the 690 or 7990, and its single GPU. That's pretty damn impressive that the single-gpu titan can compete with the dual-gpu cards. Toss in another titan for SLI, and it slaughters both of those cards, lolololol. And not by a small amount, but by leaps and bounds."

    Yeah, and you would be "leaps and bounds" $2000 lighter in the bank account too (or in credit card debt like the way many home PC builders pay for the components in their rigs). You can bet $2k in price would not equal double performance what $1k could buy.
  • Beavermatic - Wednesday, April 24, 2013 - link

    I've got (2) Titan's in SLI and I didn't use a credit card, just sayn'
