Titan For Compute

Titan, as we briefly mentioned before, is not just a consumer graphics card. It is also a compute card and will essentially serve as NVIDIA’s entry-level compute product for both the consumer and pro-sumer markets.

The key enabler for this is that Titan, unlike any consumer GeForce card before it, will feature full FP64 performance, allowing GK110’s FP64 potency to shine through. Previous NVIDIA cards either had very few FP64 CUDA cores (GTX 680) or artificial FP64 performance restrictions (GTX 580), in order to maintain the market segmentation between cheap GeForce cards and more expensive Quadro and Tesla cards. NVIDIA will still be maintaining this segmentation, but in new ways.

NVIDIA GPU Comparison
                      Fermi GF100   Fermi GF104   Kepler GK104    Kepler GK110
Compute Capability    2.0           2.1           3.0             3.5
Threads/Warp          32            32            32              32
Max Warps/SM(X)       48            48            64              64
Max Threads/SM(X)     1536          1536          2048            2048
Register File         32,768        32,768        65,536          65,536
Max Registers/Thread  63            63            63              255
Shared Mem Config     16K/48K       16K/48K       16K/32K/48K     16K/32K/48K
Hyper-Q               No            No            No              Yes
Dynamic Parallelism   No            No            No              Yes
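
To see how the rows of the comparison table relate to one another, here is a minimal Python sketch that derives the thread limit from the warp limit and estimates how register usage constrains occupancy. The figures are copied from the table; the function names are our own, and the model is deliberately simplified (it ignores register allocation granularity, shared memory limits and block-count limits, all of which affect real hardware).

```python
# Figures copied from the comparison table; everything else is illustrative.
THREADS_PER_WARP = 32

GPUS = {
    "GF100": {"max_warps": 48, "register_file": 32768, "max_regs_per_thread": 63},
    "GF104": {"max_warps": 48, "register_file": 32768, "max_regs_per_thread": 63},
    "GK104": {"max_warps": 64, "register_file": 65536, "max_regs_per_thread": 63},
    "GK110": {"max_warps": 64, "register_file": 65536, "max_regs_per_thread": 255},
}


def max_threads(gpu):
    """Max resident threads per SM(X): warps per SM(X) times threads per warp."""
    return GPUS[gpu]["max_warps"] * THREADS_PER_WARP


def regs_per_thread_at_full_occupancy(gpu):
    """Registers each thread can use if every warp slot is to stay filled."""
    return GPUS[gpu]["register_file"] // max_threads(gpu)


def resident_warps_at_max_registers(gpu):
    """Whole warps that fit when every thread uses the per-thread register maximum."""
    spec = GPUS[gpu]
    threads = spec["register_file"] // spec["max_regs_per_thread"]
    return min(threads // THREADS_PER_WARP, spec["max_warps"])


for name in GPUS:
    print(
        name,
        max_threads(name),
        regs_per_thread_at_full_occupancy(name),
        resident_warps_at_max_registers(name),
    )
```

Under these simplifying assumptions, a GK110 kernel that uses all 255 registers per thread leaves room for only a handful of resident warps per SMX, so the higher register limit is a trade-off against occupancy rather than a free gain.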

We’ve covered GK110’s compute features in depth in our look at Tesla K20, so we won’t go into great detail here. As a reminder, along with increasing functional unit counts relative to GF100, GK110 has several feature improvements that raise compute efficiency and the resulting performance. Relative to the GK104-based GTX 680, Titan brings with it a much greater number of registers per thread (255 versus 63). It also carries Kepler’s shuffle instructions for intra-warp data sharing, which GK104 already supports. But most of all, Titan brings with it NVIDIA’s marquee Kepler compute features: HyperQ and Dynamic Parallelism, which respectively allow for a greater number of hardware work queues and for kernels to dispatch other kernels.
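
To give a feel for what intra-warp data sharing via shuffle looks like, here is a small emulation in plain Python rather than real CUDA. It models a warp as a list of 32 lane values and mimics the documented behavior of CUDA's `__shfl_down`, where each lane reads the value held by a lane a fixed distance above it and lanes with no such neighbor keep their own value. The helper names `shfl_down` and `warp_reduce_sum` are hypothetical, chosen for illustration only.

```python
WARP_SIZE = 32  # threads per warp, as in the comparison table


def shfl_down(values, delta):
    """Emulate a shuffle-down: lane i reads lane i + delta.

    Lanes whose source would fall outside the warp keep their own value.
    """
    return [
        values[i + delta] if i + delta < WARP_SIZE else values[i]
        for i in range(WARP_SIZE)
    ]


def warp_reduce_sum(values):
    """Tree reduction across one warp; lane 0 ends up holding the total."""
    lanes = list(values)
    if len(lanes) != WARP_SIZE:
        raise ValueError("expected exactly one value per lane")
    offset = WARP_SIZE // 2
    while offset > 0:
        shifted = shfl_down(lanes, offset)
        # Every lane adds the value it received from its neighbor.
        lanes = [own + other for own, other in zip(lanes, shifted)]
        offset //= 2
    return lanes[0]


print(warp_reduce_sum(range(WARP_SIZE)))
```

The point of the pattern is that the reduction finishes in five exchange steps (offsets 16, 8, 4, 2, 1) with values passed directly between lanes, rather than being staged through shared memory.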

With that said, there is a catch. NVIDIA has stripped GK110 of some of its reliability and scalability features in order to maintain the Tesla/GeForce market segmentation, which means Titan for compute is left for small-scale workloads that don’t require Tesla’s greater reliability. ECC memory protection is of course gone, but also gone is HyperQ’s MPI functionality, and GPU Direct’s RDMA functionality (DMA between the GPU and 3rd party PCIe devices). Other than ECC these are much more market-specific features, and as such while Titan is effectively locked out of highly distributed scenarios, this should be fine for smaller workloads.

There is one other quirk to Titan’s FP64 implementation however, and that is that it needs to be enabled (or rather, uncapped). By default Titan is actually restricted to 1/24 performance, like the GTX 680 before it. Doing so allows NVIDIA to keep clockspeeds higher and power consumption lower, knowing the apparently power-hungry FP64 CUDA cores can’t run at full load on top of all of the other functional units that can be active at the same time. Consequently NVIDIA makes FP64 an enable/disable option in their control panel, controlling whether FP64 is operating at full speed (1/3 FP32), or reduced speed (1/24 FP32).

The penalty for enabling full speed FP64 mode is that NVIDIA has to reduce clockspeeds to keep everything within spec. For our sample card this manifests itself as GPU Boost being disabled, forcing our card to run at 837MHz (or lower) at all times. And while we haven't seen it first-hand, NVIDIA tells us that in particularly TDP constrained situations Titan can drop below the base clock to as low as 725MHz. This is why NVIDIA’s official compute performance figures are 4.5 TFLOPS for FP32, but only 1.3 TFLOPS for FP64. The former is calculated around the base clock speed, while the latter is calculated around the worst case clockspeed of 725MHz. The actual execution rate is still 1/3.
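
The arithmetic behind those figures can be sketched in a few lines of Python. One input is an assumption that does not appear in this section: Titan's count of 2,688 FP32 CUDA cores. The other inputs are the clockspeeds and FP64 ratios quoted above, and the sketch follows the usual convention of counting one fused multiply-add as two floating point operations per core per clock. The function name is our own.

```python
CUDA_CORES = 2688            # assumption: Titan's FP32 core count, not stated in this section
FLOPS_PER_CORE_PER_CLOCK = 2  # one fused multiply-add counts as two operations


def peak_tflops(clock_mhz, fp64_ratio=1.0):
    """Theoretical peak throughput in TFLOPS.

    fp64_ratio is the FP64 rate relative to FP32: 1/3 with full-speed
    FP64 enabled, 1/24 in the default capped mode.
    """
    flops = CUDA_CORES * FLOPS_PER_CORE_PER_CLOCK * clock_mhz * 1e6 * fp64_ratio
    return flops / 1e12


fp32_base = peak_tflops(837)               # FP32 at the 837MHz base clock
fp64_worst = peak_tflops(725, 1 / 3)       # FP64 at the 725MHz worst case
fp64_base = peak_tflops(837, 1 / 3)        # FP64 if the card holds 837MHz
fp64_capped = peak_tflops(837, 1 / 24)     # default capped mode at 837MHz

print(f"FP32 @ 837MHz:        {fp32_base:.2f} TFLOPS")
print(f"FP64 @ 725MHz (1/3):  {fp64_worst:.2f} TFLOPS")
print(f"FP64 @ 837MHz (1/3):  {fp64_base:.2f} TFLOPS")
print(f"FP64 @ 837MHz (1/24): {fp64_capped:.2f} TFLOPS")
```

Rounded to one decimal place, the first two results match NVIDIA's official 4.5 and 1.3 TFLOPS figures, which shows that the gap between them comes from both the 1/3 rate and the lower reference clock.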

Unfortunately there’s not much else we can say about compute performance at this time, as going any further requires referencing specific performance figures. We’ll follow up on Thursday with those figures and a performance analysis.

157 Comments

  • Wolfpup - Tuesday, February 19, 2013 - link

    I think so too. And IMO this makes sense...no one NEEDS this card, the GTX 680 is still awesome, and still competitive where it is. They can be selling these elsewhere for more, etc.

    Now, who wants to buy me 3 of them to run Folding @ Home on :-D
  • IanCutress - Tuesday, February 19, 2013 - link

    Doing some heavy compute, this card could pay for itself in a couple of weeks over a 680 or two. On the business side, it all comes down to 'does it make a difference to throughput', and if you can quantify that and cost it up, then it'll make sense. Gaming, well that's up to you. Folding... I wonder if the code needs tweaking a little.
  • wreckeysroll - Tuesday, February 19, 2013 - link

    Price is going to kill this card. See the powerpoints in the previews for performance. Titan is not too much faster than what they have on the market now, so not just the same price as a 690 but 30% slower as well.

    Game customers are not pro customers.

    This card could have been nice before someone slipped a gear at nvidia and thought gamers would eat this $1000 rip-off. A few will, like anything, but not many. A big error was made here on pricing this for $1000. A sane price would have sold many more than this lunacy.

    Nvidia dropped the ball.
  • johnthacker - Tuesday, February 19, 2013 - link

    People doing compute will eat this up, though. I went to NVIDIA's GPU Tech Conference last year, and people were clamoring for a GK110 based consumer card for compute, after hearing that Dynamic Parallelism and HyperQ were limited to the GK110 and not on the GK104.

    They will sell as many as they want to people doing compute, and won't care at all if they aren't selling them to gamers, since they'll be making more profit anyway.

    Nvidia didn't drop the ball, it's that they're playing a different game than you think.
  • TheJian - Wednesday, February 20, 2013 - link

    Want to place a bet on them being out of stock on the day they're on sale? I'd be shocked if you can get one in a day if not a week.

    I thought the $500 Nexus 10 would slow some down but I had to fight for hours to get one bought and sold out in most places in under an hour. I believe most overpriced apple products have the same problem.

    They are not trying to sell this to the middle class ya know.

    Asus prices the ares 2 at $1600. They only made 1000 last I checked. These are not going to sell 10 million and selling for anything less would just mean less money, and problems meeting production. You price your product at what you think the market will bare. Not what Wreckeysroll thinks the price should be. Performance like a dualchip card is quite a feat of engineering. Note the Ares2 uses like ~475watts. This will come in around 250w. Again, quite a feat. That's around ~100 less than a 690 also.

    Don't forget this is a card that is $2500 of compute power. Even Amazon had to buy 10000 K20's just to get a $1500 price on them, and had to also buy $500 insurance for each one to get that deal. You think Amazon is a bunch of idiots? This is a card that fixes 600 series weakness and adds substantial performance to boot. It would be lunacy to sell it for under $1000. If we could all afford it they'd make nothing and be out of stock in .5 seconds...LOL
  • chizow - Friday, February 22, 2013 - link

    Except they have been selling this *SAME* class of card for much <$1000 for the better part of a decade. *SAME* size, same relative performance, same cost to produce. Where have you been and why do you think it's now OK to sell it for 2x as much when nothing about it justifies the price increase?
  • CeriseCogburn - Sunday, February 24, 2013 - link

    LOL same cost to produce....

    You're insane.
  • Gastec - Wednesday, February 27, 2013 - link

    You forget about the "bragging rights" factor. Perhaps Nvidia won't make many GTX Titan but all those they do make will definitely sell like warm bread. There are enough "enthusiasts" and other kinds of trolls out there (most of them in United States) willing to give anything to show to the Internet their high scores in various benchmarks and/or post a flashy picture with their shiny "rig".
  • herzwoig - Tuesday, February 19, 2013 - link

    Unacceptable price.
    Less than promised performance.

    Pro customers will get a Tesla, that is what those cards are for with the attenuate support and drivers. Nvidia is selling this as a consumer gaming play card and trying to reshape the high end gaming SKU as an even more premium product (doubly so!!)

    Terrible value and whatever performance it has going for it is eroded by the nonsensical pricing strategy. Surprising level of miscalculation on the greed front from nvidia...
  • TheJian - Wednesday, February 20, 2013 - link

    They'll pay $2500 to get that unless they buy 10000 like amazon (which still paid $2000/card). Unacceptable for you, but I guess you're not their target market. You can get TWO of these for the price of ONE tesla at $2000 and ONLY if you buy 10000 like amazon. Heck if buying one Tesla, I get two of these, a new I7-3770K+ a board...LOL. They're selling this as a consumer card with telsa performance (sans support/insurance). Sounds like they priced it right in line with nearly every other top of the line card released for years in this range. 7990, 690 etc...on down the line.

    Less than promised performance? So you've benchmarked it then?

    "Terrible value and whatever performance it has going for it"
    So you haven't any idea yet right?...Considering a 7990 costs a $1000 too basically, and uses 475w vs. 250w, while being 1/2 the size this isn't so nonsensical. This card shouldn't heat up your room either. There are many benefits, you just can't see beyond those AMD goggles you've got on.
