Compute

Jumping into compute, we aren’t expecting too much here. Outside of DirectCompute, GK104 is generally a poor compute GPU, and other than the clockspeed boost, GTX 770 doesn’t have much going for it.

As always we'll start with our DirectCompute game example, Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. While DirectCompute is used in many games, this is one of the only games with a benchmark that can isolate the use of DirectCompute and its resulting performance.

Civilization V at least shows that NVIDIA’s DirectCompute performance is up to snuff in this case, though as with GTX 780, we’re reaching the limits of what this benchmark can do because of just how fast modern cards have become.

Our next benchmark is LuxMark 2.0, the official benchmark of SmallLuxGPU 2.0. SmallLuxGPU is an OpenCL-accelerated ray tracer that is part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years as ray tracing maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.

Moving on to a more general compute task, we get a reminder of how poor GK104 is here. GTX 770 can beat the slower GK104 products, and that’s it. Even GTX 570 is faster, never mind the massive lead that 7970GE holds.

Our third benchmark set comes from CLBenchmark 1.1. CLBenchmark contains a number of subtests; we’re focusing on the most practical of them, the computer vision test and the fluid simulation test. The former is a useful proxy for computer imaging tasks where systems are required to parse images and identify features (e.g. humans), while fluid simulations are common in professional graphics work and games alike.

CLBenchmark paints GTX 770 in a better light than LuxMark, but not by a great deal. The gains over the GTX 680 are minuscule since these benchmarks aren’t memory bandwidth limited, and the gap between it and the 7970GE is nothing short of enormous.

Moving on, our fourth compute benchmark is FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that distributes work to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance. Each precision has two modes, explicit and implicit, the difference being whether water atoms are included in the simulation, which adds quite a bit of work and overhead. This is another OpenCL test, as Folding @ Home has moved exclusively to OpenCL this year with FAHCore 17.

Recent core improvements in Folding @ Home continue to pay off for NVIDIA. In single precision the GTX 770 is just fast enough to hang with the 7970 vanilla, though the 7970GE is still over 10% faster. Double precision on the other hand is entirely in AMD’s favor thanks to GK104’s very poor FP64 performance.

Wrapping things up, our final compute benchmark is an in-house project developed by our very own Dr. Ian Cutress. SystemCompute is our first C++ AMP benchmark, utilizing Microsoft’s simple C++ extensions to allow the easy use of GPU computing in C++ programs. SystemCompute in turn is a collection of benchmarks for several different fundamental compute algorithms, as described in this previous article, with the final score represented in points. DirectCompute is the compute backend for C++ AMP on Windows, so this forms our other DirectCompute test.

Unlike our other compute benchmarks, SystemCompute is at least a little bit memory bandwidth sensitive, so GTX 770 pulls ahead of GTX 680 by 11%. Otherwise, as in every other compute benchmark, AMD’s cards fare far better here.
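To put that 11% in context, here is a rough back-of-the-envelope sketch of the memory bandwidth gap between the two cards. The memory figures are assumptions, since they are not stated on this page: nominal 6Gbps (GTX 680) and 7Gbps (GTX 770) GDDR5 data rates on a 256-bit bus, taken from the cards' published specifications. The helper name `bandwidth_gbs` is purely illustrative.

```python
def bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s from per-pin data rate and bus width."""
    return data_rate_gbps * bus_width_bits / 8  # 8 bits per byte


# Assumed nominal specs (not given in this section of the review).
gtx680 = bandwidth_gbs(6.0, 256)
gtx770 = bandwidth_gbs(7.0, 256)

# Theoretical bandwidth advantage of GTX 770 over GTX 680.
bandwidth_gain = gtx770 / gtx680 - 1

# SystemCompute gain reported above.
observed_gain = 0.11

# Share of the extra bandwidth that shows up as extra performance.
fraction = observed_gain / bandwidth_gain

print(f"GTX 680: {gtx680:.0f} GB/s, GTX 770: {gtx770:.0f} GB/s")
print(f"Bandwidth gain: {bandwidth_gain:.1%}, observed gain: {observed_gain:.0%}")
print(f"Fraction of bandwidth gain realized: {fraction:.0%}")
```

Under these assumptions the extra bandwidth is roughly 17%, so an 11% improvement is in line with a benchmark that is only partly bandwidth bound. Note that the core clockspeed boost also contributes, so this sketch cannot separate the two effects.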

117 Comments

  • ninjaquick - Thursday, May 30, 2013 - link

    Delta Percentages: The 7990 just needs to be removed, it skews the whole chart way too much.
  • Brainling - Thursday, May 30, 2013 - link

    It's a nice card, but not nice enough for me to upgrade my 670. If it had been a slightly more paired down GK110, I would have considered it...but the performance gains are just not enough to justify replacing my 670 (which still has little trouble with most games).

    I'll spend my computing dollar on going from Sandy Bridge -> Haswell instead, and wait for the eventual 800 series sometime next year (which should be a new micro architecture).
  • The0ne - Thursday, May 30, 2013 - link

    "... the rest of the year will be a battle of prices and bundles."

    Can't wait.
  • Runadumb - Thursday, May 30, 2013 - link

    Firstly Thank you for the detailed review.

    Right, I could pull the trigger on two of these (when they come out with 4GB versions) as they are 85%+ above my current 2x570GTX's. Thanks for having 570's in the results by the way.

    My BIG question is: Is the proper next-gen cards still due early next year or is this all we've got for the next 18 months? The rumour mill is fine here.

    As I run 3 displays and a 6000x1080p resolution I literally can never have too much performance. So if waiting until next year meant I get a better upgrade I'm happy to do it. This system can just about keep me going right now.

    I may abandon the 3 screen setup for a consumer version Oculus Rift but how long away is that? I want to hedge my bets.
  • jwcalla - Thursday, May 30, 2013 - link

    I think Maxwell is going to be more like mid-2014. It seems aggressive though... a new architecture, a shrink to 20nm and they're shoe-horning a 64-bit ARM chip in there. Lots of opportunities for delays IMO. But it might be a wise idea to wait... NVIDIA is promising (read: take with a grain of salt) 3x "GFLOPS per watt" over Kepler, and about 7-8x over Fermi. It's hard to predict how that will scale into performance though.
  • Ryan Smith - Thursday, May 30, 2013 - link

    It would be historically accurate to state that NVIDIA typically releases new architectures on new nodes, and that both Maxwell and TSMC 20nm are scheduled for 2014. When in 2014 is currently something only NVIDIA could tell you.

    But I would say this: consider this the half-time show. This is the mid-generation refresh, so we're roughly half-way between the launch of Kepler and Maxwell.
  • araczynski - Thursday, May 30, 2013 - link

    Hopefully by the time I replace my aging E8500/6970 system (which still plays everything I care about pretty well at 1080p) with a haswell, this thing will have a 4GB option so i can make another long lasting rig.
  • xdesire - Thursday, May 30, 2013 - link

    Ok i see that this is an overclocked/tweaked GTX 680 but what in the hell is that TDP?
  • !HEADHUNTERZ! - Thursday, May 30, 2013 - link

    Soooo...where does the GTX 670 FTW compare to the GTX 770!? Theyre both the same price! So which one would be a better decision to make?
  • JPForums - Thursday, May 30, 2013 - link

    They won't be priced the same for long. Unfortunately, I just can't see a GTX670FTW beating a GTX770 (especially with a factory overclock). I wonder if there is enough overclocking headroom for a GTX770FTW.
