Titan’s Compute Performance, Cont

With Rahul having covered the basics of Titan’s strong compute performance, let’s shift gears and take a look at real-world usage.

On top of Rahul’s work with Titan, as part of our 2013 GPU benchmark suite we put together a larger number of compute benchmarks to try to cover real-world usage, including the old standards of gaming usage (Civilization V) and ray tracing (LuxMark), along with several new tests. Unfortunately that effort was cut short when we discovered that OpenCL support is currently broken in the press drivers, which prevents us from running several of our tests. We still have our CUDA and DirectCompute benchmarks to look at, but a full look at Titan’s compute performance on our 2013 GPU benchmark suite will have to wait for another day.

For their part, NVIDIA of course already has OpenCL working on GK110 with Tesla. The issue is that somewhere between that and bringing up GK110 for Titan by integrating it into NVIDIA’s mainline GeForce drivers – specifically the new R314 branch – OpenCL support was broken. We expect this will be fixed in short order, but it’s not something NVIDIA checked for ahead of the press launch of Titan, and it’s not something they could fix in time for today’s article.

Unfortunately this means that comparisons with Tahiti will be few and far between for now. Most significant cross-platform compute programs are OpenCL based rather than DirectCompute based, so short of games and a couple of other cases such as Ian’s C++ AMP benchmark, we don’t have many cross-platform benchmarks to look at. With that out of the way, let’s dive into our condensed collection of compute benchmarks.

We’ll once more start with our DirectCompute game example, Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. While DirectCompute is used in many games, this is one of the only games with a benchmark that can isolate the use of DirectCompute and its resulting performance.
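To make the workload concrete, here is a minimal sketch of the kind of block decompression Civ V performs on the GPU. Civ V’s own implementation is an HLSL DirectCompute shader and isn’t public, so this is written as an illustrative CUDA kernel instead; the BC1/DXT1 block layout is standard, but the kernel structure, helper names, and launch configuration are our own assumptions.

```cuda
// Illustrative sketch only: not Civ V's actual shader. Decompresses
// BC1/DXT1 blocks (two 5:6:5 endpoints + 16 two-bit indices per 4x4 block).
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

struct BC1Block {
    uint16_t c0, c1;     // endpoint colors, 5:6:5
    uint32_t indices;    // 16 texels x 2 bits
};

// Expand a 5:6:5 color to 8-bit-per-channel ABGR.
__device__ uint32_t expand565(uint16_t c) {
    uint32_t r = ((c >> 11) & 31u) * 255u / 31u;
    uint32_t g = ((c >> 5)  & 63u) * 255u / 63u;
    uint32_t b = ( c        & 31u) * 255u / 31u;
    return (255u << 24) | (b << 16) | (g << 8) | r;
}

// Per-channel blend: (a * (den - num) + b * num) / den.
__device__ uint32_t blend(uint32_t a, uint32_t b, uint32_t num, uint32_t den) {
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t ca = (a >> shift) & 255u, cb = (b >> shift) & 255u;
        out |= (((ca * (den - num) + cb * num) / den) & 255u) << shift;
    }
    return out;
}

// One thread decompresses one 4x4 block into an RGBA8 image.
__global__ void decompressBC1(const BC1Block* blocks, uint32_t* image,
                              int blocksWide, int blocksHigh, int width) {
    int bx = blockIdx.x * blockDim.x + threadIdx.x;
    int by = blockIdx.y * blockDim.y + threadIdx.y;
    if (bx >= blocksWide || by >= blocksHigh) return;

    BC1Block blk = blocks[by * blocksWide + bx];
    uint32_t p0 = expand565(blk.c0), p1 = expand565(blk.c1);
    // Four-entry palette (opaque mode, c0 > c1): endpoints plus 1/3 and 2/3 blends.
    uint32_t palette[4] = { p0, p1, blend(p0, p1, 1, 3), blend(p0, p1, 2, 3) };

    for (int i = 0; i < 16; ++i) {        // 16 texels, 2 index bits each
        int idx = (blk.indices >> (2 * i)) & 3;
        int x = bx * 4 + (i & 3), y = by * 4 + (i >> 2);
        image[y * width + x] = palette[idx];
    }
}

int main() {
    // A single red/blue block whose indices all select endpoint 1 (blue).
    BC1Block h = { 0xF800, 0x001F, 0x55555555 };
    BC1Block* dBlk; uint32_t* dImg;
    cudaMalloc(&dBlk, sizeof h);
    cudaMalloc(&dImg, 16 * sizeof(uint32_t));
    cudaMemcpy(dBlk, &h, sizeof h, cudaMemcpyHostToDevice);
    decompressBC1<<<1, dim3(1, 1)>>>(dBlk, dImg, 1, 1, 4);
    uint32_t out[16];
    cudaMemcpy(out, dImg, sizeof out, cudaMemcpyDeviceToHost);
    printf("texel 0 = 0x%08x\n", out[0]);  // expect 0xffff0000 (opaque blue)
    return 0;
}
```

Because each block decodes independently, the work maps cleanly onto thousands of GPU threads, which is exactly what makes it a useful DirectCompute benchmark.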

Note that for 2013 we have changed the benchmark a bit, moving from using a single leader to using all of the leaders. As a result the reported numbers are higher, but they’re also not comparable with the results from our 2012 datasets.

Civilization V launched in 2010, and graphics cards have become significantly more powerful since then, far outpacing growth in the CPUs that feed them. As a result we’ve rather quickly drifted from being GPU bottlenecked to being CPU bottlenecked, as we see both in our Civ V game benchmarks and our DirectCompute benchmarks. For high-end GPUs the performance difference is rather minor; the gap between GTX 680 and Titan, for example, is 45fps, or just under 10%. Still, it’s at least enough to get Titan past the 7970GE in this case.

Our second test is one of our new tests, utilizing Elcomsoft’s Advanced Office Password Recovery utility to take a look at GPU password generation. AOPR has separate CUDA and OpenCL kernels for NVIDIA and AMD cards respectively, which means it doesn’t follow the same code path on every GPU, but it does use an optimized path for each GPU it supports. Unfortunately we’re having trouble getting it to recognize AMD 7900 series cards in this build, so we only have results for CUDA cards for the time being.

Password generation and other forms of brute-force cryptography are an area where the GTX 680 is particularly weak, thanks to the various compute aspects that were stripped out in the name of efficiency. As a result it ends up below even the GTX 580 in these benchmarks, never mind AMD’s GCN cards. But with Titan/GK110 offering NVIDIA’s full compute performance, it rips through this task. In fact it more than doubles the performance of both the GTX 680 and the GTX 580, indicating that the huge gains we’re seeing come not just from the additional functional units, but from architectural optimizations and new instructions that improve overall efficiency and reduce the number of cycles needed to process each password.
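As a rough illustration of why this workload scales so directly with compute throughput, below is a minimal CUDA sketch of GPU brute forcing. This is not Elcomsoft’s code; AOPR’s real kernels implement Office’s actual key-derivation functions, which are far heavier than the toy FNV-1a hash used here, and every name in the sketch is our own assumption. The structure is the instructive part: each thread independently decodes its global index into one candidate password and tests it, with no inter-thread communication needed.

```cuda
// Toy brute-force sketch (our own illustration, not AOPR's kernels).
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

#define PASS_LEN    4
#define CHARSET_LEN 26
#define KEYSPACE    (26 * 26 * 26 * 26)   // 456,976 candidates

__constant__ char dCharset[CHARSET_LEN + 1] = "abcdefghijklmnopqrstuvwxyz";

// Toy FNV-1a hash standing in for a real (far heavier) key-derivation function.
__host__ __device__ uint32_t toyHash(const char* s, int n) {
    uint32_t h = 2166136261u;
    for (int i = 0; i < n; ++i) { h ^= (uint8_t)s[i]; h *= 16777619u; }
    return h;
}

// One thread = one candidate: decode the global index as a base-26 string,
// hash it, and record a match. This embarrassingly parallel structure is why
// more (and better-fed) ALUs translate almost linearly into passwords/second.
__global__ void crack(uint32_t target, long long* found) {
    long long idx = (long long)blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= KEYSPACE) return;

    char cand[PASS_LEN];
    long long n = idx;
    for (int i = 0; i < PASS_LEN; ++i) {
        cand[i] = dCharset[n % CHARSET_LEN];
        n /= CHARSET_LEN;
    }
    if (toyHash(cand, PASS_LEN) == target) *found = idx;
}

int main() {
    uint32_t target = toyHash("gpus", PASS_LEN);   // pretend this is unknown
    long long hFound = -1, *dFound;
    cudaMalloc(&dFound, sizeof hFound);
    cudaMemcpy(dFound, &hFound, sizeof hFound, cudaMemcpyHostToDevice);

    crack<<<(KEYSPACE + 255) / 256, 256>>>(target, dFound);

    cudaMemcpy(&hFound, dFound, sizeof hFound, cudaMemcpyDeviceToHost);
    printf("match at candidate index %lld\n", hFound);
    return 0;
}
```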

Altogether, at 33K passwords/second, Titan is not just faster than the GTX 680, it’s faster than the GTX 690 and GTX 680 SLI, making this a test where one big GPU (and its full compute performance) is better than two smaller GPUs. It will be interesting to see where the 7970 GHz Edition and other Tahiti cards place in this test once we can get them up and running.

Our final test in our abbreviated compute benchmark suite is our very own Dr. Ian Cutress’s SystemCompute benchmark, which is a collection of several different fundamental compute algorithms. Rahul went into greater detail on this back in his look at Titan’s compute performance, but I wanted to go over it again quickly with the full lineup of cards we’ve tested.

Surprisingly, for all of its performance gains relative to GTX 680, Titan still falls notably behind the 7970GE here. Given Titan’s theoretical performance and the fundamental nature of this test we would have expected it to do better. But without additional cross-platform tests it’s hard to say whether this is something where AMD’s GCN architecture continues to shine over Kepler, or if perhaps it’s a weakness in NVIDIA’s current DirectCompute implementation for GK110. Time will tell on this one, but in the meantime this is the first solid sign that Tahiti may be more of a match for GK110 than it’s typically given credit for.

337 Comments

  • chizow - Thursday, February 21, 2013 - link

    You must not have followed the development of GPUs, and particularly flagship GPUs very closely in the last decade or so.

    G80, the first "Compute GPGPU" as Nvidia put it, was first and foremost a graphics part and a kickass one at that. Each flagship GPU after it (GT200, GT200b, GF100, GF110) has continued in this vein...driven by the desktop graphics market first, Tesla/compute market second. Hell, the Tesla business did not even exist until GT200. Jensen Huang, Nvidia's CEO, even got on stage likening his GPUs to superheroes with day jobs as graphics cards while transforming into supercomputers at night.

    Now Nvidia flips the script, holds back the flagship GPU from the gaming market that *MADE IT POSSIBLE* and wants to charge you $1K because it's got "SuperComputer Guts"??? That's bait and switch, stab in the back, whatever you want to call it. So yes, if you were actually in this market before, Nvidia has screwed you over to the tune of $1K for something that used to cost $500-$650 max.
  • CeriseCogburn - Saturday, February 23, 2013 - link

    You only spend at max $360 for a video card as you stated, so this doesn't affect you and you haven't been screwed.

    Grow up crybaby. A company may charge what it desires, and since you're never buying, who cares how many times you scream they screwed everyone?
    NO ONE CARES, not even you, since you never even pony up $500, as you yourself stated in this long, continuous crybaby whine you made here, and have been making, since the 680 was released, or rather, since Charlie fried your brain with his propaganda.

    Go get your 98 cent a gallon gasoline while you're at it, you fool.
  • chizow - Saturday, February 23, 2013 - link

    Uh no, I've spent over $1K in a single GPU purchasing transaction, have you? I didn't think so.

    I'm just unwilling to spend *$2K* for what cost $1K in the past for less than the expected increase in performance. I spent $700 this round instead of the usual $1K because that's all I was willing to pay for a mid-range ASIC in GK104 and while it was still a significant upgrade to my last set of $1K worth of graphics cards, I wasn't going to plunk down $1K for a set of mid-range GK104 GTX 680s.

    It's obvious you have never bought in this range of GPUs in the past, otherwise you wouldn't be posting such retarded replies for what is clearly usurious pricing by Nvidia.

    Now go away, idiot.
  • CeriseCogburn - Tuesday, February 26, 2013 - link

    Wrong again, as usual.
    So what it boils down to is you're a cheapskate, still disgruntled, still believe in Charlie D's lie, and are angry you won't have the current top card at a price you demand.
    I saw your whole griping list in the other thread too, but none of what you purchase or don't purchase makes a single bit of difference when it comes to your insane tinfoil hat lies that you have used for your entire argument.

    Once again, pretending you aren't aware of production capacity leaves you right where your brainless rant started a long time ago.

    You cover your tracks whining about ATI's initial price, which wasn't out of line either, and ignore nVidia's immediate crushing of it when the 680 came out, as you still complained about the performance increase there. You're a crybaby, that's it.

    That's what you have done now for months on end, whined and whined and whined, and got caught over and over in exaggerations and lies, demanding a perfectly increasing price/perf line slanting upwards, for years on end, lying about its past, which I caught you on in the earlier reviews.

    Well dummy, that's not how performance/price increases work in any area of computer parts, anyway.
    Glad you're just another freaking parrot, as the reviewers have trained you fools to automaton levels.
  • Pontius - Thursday, February 21, 2013 - link

    My only interest at the moment is OpenCL compute performance. Sad to see it's not working at the moment, but once they get the kinks worked out, I would really love to see some benchmarks.

    Also, as any GPGPU programmer knows, the number one bottleneck for GPU computing is randomly accessing memory. If you are working only within the on-chip local memory, then yes, you get blazingly fast speeds on a GPU. However, the second you do something as simple as a += on a global memory location, your performance grinds to a screeching halt. I would really like to see the performance of these cards on random memory heavy OpenCL benchmarks. Thanks for the review!
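The bottleneck Pontius describes is easy to demonstrate. Below is a minimal CUDA sketch (our own illustration, not from the article or the comment) contrasting the naive approach, where every thread does an atomic += straight to global memory, with the standard fix of accumulating in fast on-chip shared memory first. On contended data the privatized version is typically many times faster, for exactly the reason described above.

```cuda
// Contrast a global-memory += (atomicAdd on DRAM) with a shared-memory
// privatized version of the same 256-bin histogram.
#include <cstdio>
#include <cuda_runtime.h>

#define BINS 256

// Naive: every increment is a read-modify-write on global memory,
// serializing threads that collide on the same bin.
__global__ void histGlobal(const unsigned char* data, int n, unsigned int* bins) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(&bins[data[i]], 1u);
}

// Privatized: accumulate in fast on-chip shared memory, then flush one
// partial histogram per block to global memory.
__global__ void histShared(const unsigned char* data, int n, unsigned int* bins) {
    __shared__ unsigned int local[BINS];
    for (int b = threadIdx.x; b < BINS; b += blockDim.x) local[b] = 0;
    __syncthreads();

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(&local[data[i]], 1u);
    __syncthreads();

    for (int b = threadIdx.x; b < BINS; b += blockDim.x)
        atomicAdd(&bins[b], local[b]);
}

int main() {
    const int n = 1 << 22;
    unsigned char* d; unsigned int* bins;
    cudaMalloc(&d, n);
    cudaMemset(d, 7, n);   // every byte hits bin 7: worst-case contention
    cudaMalloc(&bins, BINS * sizeof(unsigned int));

    cudaMemset(bins, 0, BINS * sizeof(unsigned int));
    histGlobal<<<(n + 255) / 256, 256>>>(d, n, bins);   // slow under contention

    cudaMemset(bins, 0, BINS * sizeof(unsigned int));
    histShared<<<(n + 255) / 256, 256>>>(d, n, bins);   // same result, far fewer DRAM atomics

    unsigned int h7;
    cudaMemcpy(&h7, bins + 7, sizeof h7, cudaMemcpyDeviceToHost);
    printf("bin 7 = %u (expected %d)\n", h7, n);
    return 0;
}
```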
  • codedivine - Thursday, February 21, 2013 - link

    We may do this in the future if I get some time off from univ work. Stay tuned :)
  • Pontius - Thursday, February 21, 2013 - link

    Thanks codedivine, I'll keep an eye out.
  • Bat123Man - Thursday, February 21, 2013 - link

    The Titan is nothing more than a proof-of-concept; "Look what we can do! Woohoo! Souped up to the max!" Nvidia is not intending this card to be for everyone. They know it will be picked up by a few well-moneyed enthusiasts, but it is really just a science project so that when people think about "the fastest GPU on the market", they think Nvidia.

    How often do you guys buy the best of the best as soon as it is out the door anyway? $1000, $2000, it makes no difference, most of us wouldn't buy it even at 500 bucks. This is all about bragging rights, pure and simple.
  • Oxford Guy - Thursday, February 21, 2013 - link

    Not exactly. The chip isn't fully enabled.
