Titan’s Compute Performance, Cont.

With Rahul having covered the basis of Titan’s strong compute performance, let’s shift gears a bit and take a look at real world usage.

On top of Rahul’s work with Titan, as part of our 2013 GPU benchmark suite we put together a larger number of compute benchmarks to try to cover real world usage, including the old standards of gaming usage (Civilization V) and ray tracing (LuxMark), along with several new tests. Unfortunately that got cut short when we discovered that OpenCL support is currently broken in the press drivers, which prevents us from using several of our tests. We still have our CUDA and DirectCompute benchmarks to look at, but a full look at Titan’s compute performance on our 2013 GPU benchmark suite will have to wait for another day.

For its part, NVIDIA of course already has OpenCL working on GK110 with Tesla. The issue is that somewhere between that and bringing up GK110 for Titan by integrating it into NVIDIA’s mainline GeForce drivers – specifically the new R314 branch – OpenCL support was broken. Since the underlying support already exists, we expect this will be fixed in short order, but it’s not something NVIDIA checked for ahead of the press launch of Titan, and it’s not something they could fix in time for today’s article.

Unfortunately this means that comparisons with Tahiti will be few and far between for now. Most significant cross-platform compute programs are OpenCL based rather than DirectCompute, so short of games and a couple other cases such as Ian’s C++ AMP benchmark, we don’t have too many cross-platform benchmarks to look at. With that out of the way, let’s dive into our condensed collection of compute benchmarks.

We’ll once more start with our DirectCompute game example, Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. While DirectCompute is used in many games, this is one of the few games with a benchmark that can isolate the use of DirectCompute and its resulting performance.

Note that for 2013 we have changed the benchmark a bit, moving from using a single leader to using all of the leaders. As a result the reported numbers are higher, but they’re also not going to be comparable with this benchmark’s use from our 2012 datasets.

Civilization V launched in 2010, and graphics cards have become significantly more powerful since then, far outpacing growth in the CPUs that feed them. As a result we’ve rather quickly drifted from being GPU bottlenecked to being CPU bottlenecked, as we see both in our Civ V game benchmarks and our DirectCompute benchmarks. For high-end GPUs the performance difference is rather minor; the gap between GTX 680 and Titan for example is 45fps, or just less than 10%. Still, it’s at least enough to get Titan past the 7970GE in this case.
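
To put the 45fps figure in context, the absolute gap and the percentage together pin down roughly where both cards land. What follows is a minimal Python sketch of that arithmetic rather than anything from our benchmark tooling; the function name is hypothetical, and the only inputs are the 45fps gap and the 10% ceiling quoted above.

```python
def implied_baseline(gap_fps: float, gap_percent: float) -> float:
    """Return the baseline score at which gap_fps equals gap_percent of it.

    Hypothetical helper for illustration only.
    """
    if gap_percent <= 0:
        raise ValueError("gap_percent must be positive")
    return gap_fps * 100 / gap_percent


# Figures quoted in the text: a 45fps gap that is "just less than 10%".
GAP_FPS = 45
GAP_CEILING_PERCENT = 10

# If the gap were exactly 10%, this would be the GTX 680's score.
gtx680_floor = implied_baseline(GAP_FPS, GAP_CEILING_PERCENT)
titan_floor = gtx680_floor + GAP_FPS

print(f"GTX 680 scores above {gtx680_floor:.0f}fps")
print(f"Titan scores above {titan_floor:.0f}fps")
```

Because the gap is slightly under 10%, the GTX 680 has to be scoring somewhat above that 450fps floor and Titan somewhat above 495fps, which is well into the range where the CPU rather than the GPU sets the pace.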

Our second test is one of our new tests, utilizing Elcomsoft’s Advanced Office Password Recovery utility to take a look at GPU password generation. AOPR has separate CUDA and OpenCL kernels for NVIDIA and AMD cards respectively, which means it doesn’t follow the same code path on all GPUs but it is using an optimal path for each GPU it can handle. Unfortunately we’re having trouble getting it to recognize AMD 7900 series cards in this build, so we only have CUDA cards for the time being.

Password generation and other forms of brute force crypto are an area where the GTX 680 is particularly weak, thanks to the various compute aspects that have been stripped out in the name of efficiency. As a result it ends up below even the GTX 580 in these benchmarks, never mind AMD’s GCN cards. But with Titan/GK110 offering NVIDIA’s full compute performance, it rips through this task. In fact it more than doubles performance from both the GTX 680 and the GTX 580, indicating that the huge performance gains we’re seeing are coming from not just the additional functional units, but from architectural optimizations and new instructions that improve overall efficiency and reduce the number of cycles needed to complete work on a password.

Altogether at 33K passwords/second Titan is not just faster than GTX 680, but it’s faster than GTX 690 and GTX 680 SLI, making this a test where one big GPU (and its full compute performance) is better than two smaller GPUs. It will be interesting to see where the 7970 GHz Edition and other Tahiti cards place in this test once we can get them up and running.
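
To give the 33K passwords/second figure some scale, here is a minimal Python sketch of how long an exhaustive search would take at that rate. This is back-of-the-envelope arithmetic, not a description of how AOPR works; the helper names, the 26-letter alphabet, and the 6-character limit are all assumptions chosen for illustration.

```python
def keyspace_size(alphabet_size: int, max_length: int) -> int:
    """Count every candidate password of length 1 through max_length."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


def hours_to_exhaust(candidates: int, rate_per_second: float) -> float:
    """Worst-case hours needed to try every candidate at a fixed rate."""
    if rate_per_second <= 0:
        raise ValueError("rate_per_second must be positive")
    return candidates / rate_per_second / 3600


TITAN_RATE = 33_000   # passwords/second, the figure quoted in the text
ALPHABET = 26         # assumption: lowercase ASCII letters only
MAX_LENGTH = 6        # assumption: passwords of at most 6 characters

candidates = keyspace_size(ALPHABET, MAX_LENGTH)
print(f"{candidates:,} candidates")
print(f"{hours_to_exhaust(candidates, TITAN_RATE):.1f} hours worst case")
```

Since search time scales inversely with throughput, a card running at half of Titan’s rate would need twice as long for the same keyspace, while each extra character multiplies the keyspace by roughly the size of the alphabet.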

Our final test in our abbreviated compute benchmark suite is our very own Dr. Ian Cutress’s SystemCompute benchmark, which is a collection of several different fundamental compute algorithms. Rahul went into greater detail on this back in his look at Titan’s compute performance, but I wanted to go over it again quickly with the full lineup of cards we’ve tested.

Surprisingly, for all of its performance gains relative to GTX 680, Titan still falls notably behind the 7970GE here. Given Titan’s theoretical performance and the fundamental nature of this test we would have expected it to do better. But without additional cross-platform tests it’s hard to say whether this is something where AMD’s GCN architecture continues to shine over Kepler, or if perhaps it’s a weakness in NVIDIA’s current DirectCompute implementation for GK110. Time will tell on this one, but in the meantime this is the first solid sign that Tahiti may be more of a match for GK110 than it’s typically given credit for.

337 Comments

  • Ryan Smith - Thursday, February 21, 2013 - link

    PCI\VEN_10DE&DEV_1005&SUBSYS_103510DE

    I have no idea what a Tesla card's would be, though.
  • alpha754293 - Thursday, February 21, 2013 - link

    I don't suppose you would know how to tell the computer/OS that the card has a different PCI DevID other than what it actually is, would you?

    NVIDIA Tesla C2075 PCI\VEN_10DE&DEV_1096
  • Hydropower - Friday, February 22, 2013 - link

    PCI\VEN_10DE&DEV_1022&SUBSYS_098210DE&REV_A1

    For the K20c.
  • brucethemoose - Thursday, February 21, 2013 - link

    "This TDP limit is 106% of Titan’s base TDP of 250W, or 265W. No matter what you throw at Titan or how you cool it, it will not let itself pull more than 265W sustained."

    The value of the Titan isn't THAT bad at stock, but 106%? Is that a joke!?

    Throw in an OC for OC comparison, and this card is absolutely ridiculous. Take the 7970 GE... 1250mhz is a good, reasonable 250mhz OC on air, a nice 20%-25% boost in performance.

    The Titan review sample is probably the best case scenario and can go 27MHz past turbo speed, 115MHZ past base speed, so maybe 6%-10%. That $500 performance gap starts shrinking really, really fast once you OC, and for god sakes, if you're the kind of person who's buying a $1000 GPU, you shouldn't intend to leave it at stock speeds.

    I hope someone can voltmod this card and actually make use of a waterblock, but there's another issue... Nvidia is obviously setting a precedent. Unless they change this OC policy, they won't be seeing any of my money anytime soon.
  • JarredWalton - Thursday, February 21, 2013 - link

    As someone with a 7970GE, I can tell you unequivocally that 1250MHz on air is not at all a given. My card can handle many games at 1150MHz, but with other titles and applications (say, running some compute stuff) I'm lucky to get stability for more than a day at 1050MHz. Perhaps with enough effort playing with voltage mods and such I could improve the situation, but I'm happier living with a card for a couple years that doesn't crap out because of excessively high voltages.
  • CeriseCogburn - Saturday, February 23, 2013 - link

    " After a few hours of trial and error, we settled on a base of the boost curve of 9,80 MHz, resulting in a peak boost clock of a mighty 1,123MHz; a 12 per cent increase over the maximum boost clock of the card at stock.

    Despite the 3GB of GDDR5 fitted on the PCB's rear lacking any active cooling it too proved more than agreeable to a little tweaking and we soon had it running at 1,652MHz (6.6GHz effective), a healthy ten per cent increase over stock.

    With these 12-10 per cent increases in clock speed our in-game performance responded accordingly."

    http://www.bit-tech.net/hardware/2013/02/21/nvidia...

    Oh well, 12 is 6 if it's nVidia bash time, good job mr know it all.
  • Hrel - Thursday, February 21, 2013 - link

    YES! 1920x1080 has FINALLY arrived. It only took 6 years from when it became mainstream but it's FINALLY here! FINALLY! I get not doing it on this card, but can you guys PLEASE test graphics cards, especially laptop ones, at 1600x900 and 1280x720. A lot of the time when on a budget playing games at a lower resolution is a compromise you're more than willing to make in order to get decent quality settings. PLEASE do this for me, PLEASE!
  • JarredWalton - Thursday, February 21, 2013 - link

    Um... we've been testing 1366x768, 1600x900, and 1920x1080 as our graphics standards for laptops for a few years now. We don't do 1280x720 because virtually no laptops have that as their native resolution, and stretching 720p to 768p actually isn't a pleasant result (a 6.7% increase in resolution means the blurring is far more noticeable). For desktop cards, I don't see much point in testing most below 1080p -- who has a desktop not running at least 1080p native these days? The only reason for 720p or 900p on desktops is if your hardware is too old/slow, which is fine, but then you're probably not reading AnandTech for the latest news on GPU performance.
  • colonelclaw - Thursday, February 21, 2013 - link

    I must admit I'm a little bit confused by Titan. Reading this review gives me the impression it isn't a lot more than the annual update to the top-of-the-line GPU from Nvidia.
    What would be really useful to visualise would be a graph plotting the FPS rates of the 480, 580, 680 and Titan along with their release dates. From this I think we would get a better idea of whether or not it's a new stand out product, or merely this year's '780' being sold for over double the price.
    Right now I genuinely don't know if i should be holding Nvidia in awe or calling them rip-off merchants.
  • chizow - Friday, February 22, 2013 - link

    From Anandtech's 7970 Review, you can see relative GPU die sizes:

    http://images.anandtech.com/doci/5261/DieSize.png

    You'll also see the prices of these previous flagships have been mostly consistent, in the $500-650 range (except for a few outliers like the GTX 285 which came in hard economic times and the 8800Ultra, which was Nvidia's last ultra-premium card).

    You can check some sites that use easy performance rating charts, like computerbase.de, to get a quick idea of relative performance increases between generations, but you can quickly see that going from a new generation (not half-node) like G80 > GT200 > GF100 > GK100/110 should offer 50%+ increase, generally closer to the 80% range over the predecessor flagship.

    Titan would probably come a bit closer to 100%, so it does outperform expectations (all of Kepler line did though), but it certainly does not justify the 2x increase in sticker price. Nvidia is trying to create a new Ultra-premium market without giving even a premium alternative. This all stems from the fact they're selling their mid-range part, GK104, as their flagship, which only occurred due to AMD's ridiculous pricing of the 7970.
