Titan For Compute

Titan, as we briefly mentioned before, is not just a consumer graphics card. It is also a compute card, and it will essentially serve as NVIDIA’s entry-level compute product for both the consumer and prosumer markets.

The key enabler for this is that Titan, unlike any consumer GeForce card before it, will feature full FP64 performance, allowing GK110’s FP64 potency to shine through. Previous NVIDIA cards either had very few FP64 CUDA cores (GTX 680) or artificial FP64 performance restrictions (GTX 580), in order to maintain the market segmentation between cheap GeForce cards and more expensive Quadro and Tesla cards. NVIDIA will still be maintaining this segmentation, but in new ways.

NVIDIA GPU Comparison
                          Fermi GF100   Fermi GF104   Kepler GK104   Kepler GK110
Compute Capability        2.0           2.1           3.0            3.5
Threads/Warp              32            32            32             32
Max Warps/SM(X)           48            48            64             64
Max Threads/SM(X)         1536          1536          2048           2048
Register File (per SM/X)  32,768        32,768        65,536         65,536
Max Registers/Thread      63            63            63             255
Shared Mem Config         16K/48K       16K/48K       16K/32K/48K    16K/32K/48K
Hyper-Q                   No            No            No             Yes
Dynamic Parallelism       No            No            No             Yes

We’ve covered GK110’s compute features in depth in our look at Tesla K20, so we won’t go into great detail here. But as a reminder, along with beefing up its functional unit counts relative to GF100, GK110 brings several feature improvements that further increase compute efficiency and the resulting performance. Relative to the GK104-based GTX 680, Titan offers a much greater number of registers per thread (255), not to mention new instructions such as the shuffle instructions for intra-warp data sharing. Most of all, Titan brings NVIDIA’s marquee Kepler compute features: Hyper-Q and Dynamic Parallelism, which allow for a greater number of hardware work queues and for kernels to dispatch other kernels, respectively.
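
To make the shuffle instructions concrete, here's a minimal CUDA sketch of a warp-level sum reduction. The kernel and helper names are our own illustration, and __shfl_down_sync() is the modern (CUDA 9+) spelling of the __shfl_down() intrinsic that Kepler introduced; compile with -arch=sm_35 or later.

```
#include <cstdio>

// Warp-level sum reduction via register shuffles: the 32 threads of a
// warp exchange values directly, with no shared memory round-trip.
__inline__ __device__ float warpReduceSum(float val) {
    // Each step folds the upper half of the active lanes into the
    // lower half: lane i adds the value held by lane i + offset.
    for (int offset = 16; offset > 0; offset >>= 1)
        val += __shfl_down_sync(0xffffffff, val, offset);
    return val;  // lane 0 ends up holding the sum of all 32 lanes
}

__global__ void warpSum(const float *in, float *out) {
    float v = warpReduceSum(in[threadIdx.x]);
    if (threadIdx.x == 0) *out = v;  // one warp's worth of input
}

int main() {
    float h_in[32], h_out = 0.0f;
    for (int i = 0; i < 32; ++i) h_in[i] = 1.0f;  // expect a sum of 32
    float *d_in, *d_out;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    warpSum<<<1, 32>>>(d_in, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("warp sum = %f\n", h_out);
    return 0;
}
```

Dynamic Parallelism is similarly easy to sketch. With compute capability 3.5 (GK110) and relocatable device code (nvcc -arch=sm_35 -rdc=true -lcudadevrt), a kernel can launch another kernel without a round-trip to the CPU; again, the names here are purely illustrative.

```
__global__ void childKernel(float *data) {
    data[threadIdx.x] *= 2.0f;
}

__global__ void parentKernel(float *data) {
    // Device-side launch; GK104 and earlier have no way to do this.
    // Under the Kepler-era programming model the child grid is
    // guaranteed to complete before the parent grid is considered done.
    if (threadIdx.x == 0)
        childKernel<<<1, 32>>>(data);
}
```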

With that said, there is a catch. NVIDIA has stripped GK110 of some of its reliability and scalability features in order to maintain the Tesla/GeForce market segmentation, which means Titan’s compute duties are limited to small-scale workloads that don’t require Tesla’s greater reliability. ECC memory protection is of course gone, but so are Hyper-Q’s MPI functionality and GPUDirect’s RDMA functionality (DMA between the GPU and 3rd-party PCIe devices). Other than ECC these are much more market-specific features, so while Titan is effectively locked out of highly distributed scenarios, it should be fine for smaller workloads.

There is one other quirk to Titan’s FP64 implementation, however: full-speed FP64 needs to be enabled (or rather, uncapped). By default Titan is actually restricted to 1/24 FP32 performance, like the GTX 680 before it. Capping FP64 this way allows NVIDIA to keep clockspeeds higher and power consumption lower, since the apparently power-hungry FP64 CUDA cores can’t run at full load on top of all of the other functional units that can be active at the same time. Consequently NVIDIA makes FP64 an enable/disable option in their control panel, controlling whether FP64 operates at full speed (1/3 FP32) or reduced speed (1/24 FP32).
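
There's no driver API that reports which mode the toggle is in, so about the closest a program can get is to measure it. As a purely hypothetical sketch (all names are ours), a double-precision FMA loop timed against its FP32 twin with cudaEvents should run roughly 3x slower when full-speed FP64 is enabled, and roughly 24x slower when capped:

```
// Illustrative FP64 throughput probe (not an NVIDIA API): time fp64Burn
// against fp32Burn with enough blocks and threads to saturate every
// SMX. The loops are FMA-bound, so the runtime ratio approximates the
// inverse of the hardware FP64 rate (~3x uncapped, ~24x capped).
__global__ void fp64Burn(double *out, int iters) {
    double a = 1.000001, b = 0.999999;
    for (int i = 0; i < iters; ++i)
        a = fma(a, b, a);              // one FP64 FMA per iteration
    out[blockIdx.x * blockDim.x + threadIdx.x] = a;
}

__global__ void fp32Burn(float *out, int iters) {
    float a = 1.000001f, b = 0.999999f;
    for (int i = 0; i < iters; ++i)
        a = fmaf(a, b, a);             // the FP32 equivalent
    out[blockIdx.x * blockDim.x + threadIdx.x] = a;
}
```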

The penalty for enabling full-speed FP64 mode is that NVIDIA has to reduce clockspeeds to keep everything within spec. For our sample card this manifests itself as GPU Boost being disabled, forcing the card to run at 837MHz (or lower) at all times. And while we haven't seen it first-hand, NVIDIA tells us that in particularly TDP-constrained situations Titan can drop below the base clock, to as low as 725MHz. This is why NVIDIA’s official compute performance figures are 4.5 TFLOPS for FP32 but only 1.3 TFLOPS for FP64: the former is calculated at the base clock, while the latter is calculated at the worst-case clockspeed of 725MHz. The actual execution rate is still 1/3.
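
For reference, here's the arithmetic behind those figures, assuming Titan's 14 active SMXes (2688 FP32 and 896 FP64 CUDA cores) and counting each FMA as 2 FLOPs per clock:

FP32: 2688 cores × 2 FLOPs/clock × 837MHz ≈ 4.5 TFLOPS
FP64: 896 cores × 2 FLOPs/clock × 725MHz ≈ 1.3 TFLOPS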

Unfortunately there’s not much else we can say about compute performance at this time, as going much further requires being able to reference specific performance figures. So we’ll follow up on Thursday with those figures and a performance analysis.

Comments
  • WhoppingWallaby - Thursday, February 21, 2013 - link

    Dude, you have some gall calling another person a fanboy. We could all do without your ranting and raving, so go troll elsewhere or calm down a little.
  • CeriseCogburn - Sunday, February 24, 2013 - link

    Oh shut up yourself you radeon rager.

    You idiots think you have exclusive rights to spew your crap all over the place, and when ANYONE disagrees you have a ***** fit and demand they stop.

    How about all you whining critical diaper pooping fanatics stop instead?
  • IanCutress - Tuesday, February 19, 2013 - link

    It's all about single-card performance. Everything just works easier with a single card. Start putting SLI into the mix and you need to take drivers into account, and doing compute requires a complete reworking of code. Not to mention the potentially lower power consumption and OC capabilities of Titan over a dual-GPU card.

    At any given price point, getting two cards adding up to that cost will always be quicker than a single card in any scenario that can take advantage of them, if you're willing to put up with it. So yes, two GTX 680s, a 690, or a Titan is a valid question, and it's up to user preference which one to get.

    I need to double check my wallet, see if it hasn't imploded after hearing the price.
  • wreckeysroll - Tuesday, February 19, 2013 - link

    lost their minds?
    how about fell and cracked their head after losing it. Smoking too much of that good stuff down there in California.

    How stupid do they take us for? Way to thumb your customers in the eye, nvidia. $1000 for a single GPU kit.

    Good laugh for the morning.
  • B3an - Tuesday, February 19, 2013 - link

    Use some ****ing common sense. You get what you pay for.

    6GB with a 384-bit memory bus, and a 551mm2 GPU. Obviously this won't be cheap, and there's no way this could be sold for anywhere near the price of a 680 without losing tons of money.

    Nvidia already had this thing in supercomputers anyway, so why not turn it into a consumer product? Some people WILL buy this. If you have the money, why not. At least NV aren't sitting on their arses like AMD are, with no new high-end GPUs this year. Even though I have AMD cards, I'm very disappointed with AMD's crap lately, as an enthusiast and someone who's just interested in GPU tech. First they literally give up on competitive performance CPUs and now it's looking like they're doing it with GPUs.
  • siliconfiber - Tuesday, February 19, 2013 - link

    Common sense is what you are missing.

    The GTX 580, 480, and 285 were all sold for much less than this card and were all used in HPC applications, had the same or much bigger die sizes, and the same or wider buses. DDR memory is dirt cheap as well.

    I have seen it all now. Largest rip-off in the history of video cards right here.
  • Genx87 - Tuesday, February 19, 2013 - link

    Oh look, I have never seen this argument before. Biggest rip-off in the history of video cards. Preceded only by every high-end video card released since the introduction of high-end discrete GPUs. And it will remain a rip-off until the next high-end GPU is released, surpassing this card's rip-off factor.
  • Blibbax - Tuesday, February 19, 2013 - link

    It's not a rip off because you don't have to buy it. The 680 hasn't gotten any slower.

    Just like with cars and anything else, when you add 50% more performance to a high-end product, it's gunna be a lot more than 50% more expensive.
  • johnthacker - Tuesday, February 19, 2013 - link

    The largest rip-offs in the history of video cards are some of the Quadro cards. This is extremely cheap for a card with such good FP64 performance.
  • TheJian - Wednesday, February 20, 2013 - link

    The GTX 580 (40nm) was not in the same league as this and only had 3B transistors; Titan has 7.1B on 28nm. 512 CUDA cores compared to 2688? It came with 1.5GB of memory too; this has 6GB. Etc, etc. The 580 did not run like a $2500 pro card at a $1500 discount either. Also, a chip this complicated doesn't YIELD well; it's very expensive to toss out the bad ones.

    Do you know the difference between system memory and graphics memory (you said DDR)? They do not cost the same. You meant GDDR? Well, this stuff costs about 4x as much, running at 6GHz not 4GHz.

    The ref clock is 876MHz, but these guys got theirs to 1176MHz:
    http://www.guru3d.com/articles-pages/geforce_gtx_t...

    The card is a monster value vs. the $2500 K20. Engineering is not FREE. Ask AMD: they lost $1.18B last year selling crap at prices that would make you happy, I guess. That's how you go out of business. Get it? They haven't made money in 10 years (lost $3-4B over that time as a whole). Think they should've charged more for their cards/chips the last ten years? I DO. If Titan is priced wrong, they will remain on the shelf. Correct? So if you're right, they won't sell. These will be gone in a day, because there are probably enough people who would pay $1500 for them that they'll sell out quickly. You have to pay $2500 to get this on the pro side.
