GPU Boost 2.0: Overclocking & Overclocking Your Monitor

The first half of the GPU Boost 2.0 story is of course the fact that with 2.0 NVIDIA is switching from power-based controls to temperature-based controls. However, there is also a second story here, and that is the impact on overclocking.

With the GTX 680, overclocking capabilities were limited, particularly in comparison to the GeForce 500 series. The GTX 680 could have its power target raised (guaranteed “overclocking”), and further overclocking could be achieved by using clock offsets. But perhaps most importantly, voltage control was forbidden, with NVIDIA going so far as to nix EVGA and MSI’s voltage-adjustable products after a short time on the market.

There are a number of reasons for this, and hopefully one day soon we’ll be able to get into NVIDIA’s Project Greenlight video card approval process in significant detail so that we can better explain this, but the primary concern was that without strict voltage limits, some of the more extreme users might blow out their cards by setting voltages too high. And while the responsibility for this ultimately falls to the user, and in some cases the manufacturer of their card (depending on the warranty), it makes NVIDIA look bad regardless. The end result was that voltage control on the GTX 680 (and lower cards) was disabled for everyone, regardless of what a card was capable of.

With Titan this has finally changed, at least to some degree. In short, NVIDIA is bringing back overvoltage control, albeit in a more limited fashion.

For Titan cards, partners will have the final say in whether they wish to allow overvolting or not. If they choose to allow it, they get to set a maximum voltage (Vmax) figure in their VBIOS. The user in turn is allowed to increase their voltage beyond NVIDIA’s default reliability voltage limit (Vrel) up to Vmax. As part of the process, however, users have to acknowledge that increasing their voltage beyond Vrel puts their card at risk and may reduce its lifetime. Only once that’s acknowledged will users be able to increase their voltages beyond Vrel.
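To make the policy concrete, here is a minimal sketch of how an overclocking utility might enforce these limits; the function names and millivolt figures are illustrative assumptions, not NVIDIA’s actual API or Titan’s real voltage table.

```cpp
// Conceptual sketch of the Vrel/Vmax scheme as an overclocking utility might
// enforce it. All names and millivolt values are illustrative assumptions,
// not NVIDIA's API or Titan's actual voltage table.
#include <algorithm>
#include <cstdio>

struct VbiosLimits {
    int vrel_mv;   // NVIDIA's default reliability voltage limit (Vrel)
    int vmax_mv;   // partner-defined ceiling stored in the VBIOS (Vmax)
};

int ApplyVoltageRequest(int requested_mv, const VbiosLimits& limits,
                        bool user_acknowledged_risk) {
    // If the partner left overvolting disabled, Vrel is the hard limit.
    if (limits.vmax_mv <= limits.vrel_mv)
        return std::min(requested_mv, limits.vrel_mv);

    // Going past Vrel requires accepting the reduced-lifetime warning.
    if (!user_acknowledged_risk)
        return std::min(requested_mv, limits.vrel_mv);

    // Even with acknowledgement, the partner's Vmax is never exceeded.
    return std::min(requested_mv, limits.vmax_mv);
}

int main() {
    const VbiosLimits titan{1162, 1200};  // hypothetical figures for illustration
    std::printf("%d mV\n", ApplyVoltageRequest(1200, titan, false)); // clamped to 1162
    std::printf("%d mV\n", ApplyVoltageRequest(1200, titan, true));  // allowed: 1200
    return 0;
}
```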

With that in mind, beyond overvolting, overclocking has also changed in some subtler ways. Memory and core offsets are still in place, but with the switch from power-based monitoring to temperature-based monitoring, the power target slider has been augmented with a separate temperature target slider.

The power target slider is still responsible for controlling the TDP as before, but with the ability to prioritize temperatures over power consumption it appears to be somewhat redundant (or at least unnecessary) for more significant overclocking. That leaves us with the temperature slider, which is really a control for two functions.

First and foremost, the temperature slider controls Titan’s target temperature. By default this is 80C, and it may be turned all the way up to 95C. The higher the temperature target, the more frequently Titan can reach its highest boost bins, in essence making this a weaker form of overclocking, just like the power target adjustment was on the GTX 680.
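As a rough illustration of this first function, the sketch below steps a hypothetical boost bin up or down around the target temperature; the clock table and stepping rule are assumptions for clarity, not GPU Boost 2.0’s actual control loop.

```cpp
// Illustrative temperature-target boosting. The clock bins and stepping rule
// are assumptions; the real GPU Boost 2.0 algorithm is NVIDIA's own.
#include <vector>

struct BoostState {
    int target_c = 80;   // temperature target slider: 80C default, 95C maximum
    int bin = 0;         // index into the boost clock table
};

// Hypothetical boost bins in MHz, lowest to highest.
const std::vector<int> kBoostBinsMHz = {837, 876, 915, 954, 993};

void StepBoost(BoostState& s, int gpu_temp_c) {
    if (gpu_temp_c < s.target_c && s.bin + 1 < static_cast<int>(kBoostBinsMHz.size()))
        ++s.bin;   // headroom below the target: climb to a higher bin
    else if (gpu_temp_c > s.target_c && s.bin > 0)
        --s.bin;   // over the target: back off a bin
    // Raising target_c lets the card sit in its top bins more of the time,
    // which is the "weaker form of overclocking" described above.
}

int main() {
    BoostState s;
    s.target_c = 95;      // slider turned all the way up
    StepBoost(s, 83);     // 83C is now well below the target, so clocks climb
    return 0;
}
```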

The second function controlled by the temperature slider is the fan curve, which for all practical purposes follows the temperature slider. With modern video cards ramping up their fan speeds rather quickly once they get into the 80C range, merely increasing the power target wouldn’t be particularly desirable in most cases due to the extra noise it generates, so NVIDIA has tied the fan curve to the temperature slider. By doing so it ensures that fan speeds stay relatively low until temperatures start exceeding the temperature target. This seems a bit counterintuitive at first, but when viewed in light of the goal – higher temperatures without an increase in fan speed – it starts to make sense.
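The fan behavior can be pictured the same way. The sketch below keys a made-up fan curve to the temperature target; the duty-cycle numbers and ramp slope are invented for illustration, but they show how moving the slider shifts the point at which the fan ramps.

```cpp
// Made-up fan curve keyed to the temperature target. The duty-cycle numbers
// and ramp slope are invented; real cards ship with vendor-tuned fan tables.
#include <cstdio>

int FanSpeedPercent(int gpu_temp_c, int target_c) {
    const int kQuietFloor = 30;      // stay near-silent up to the target
    const int kRampPerDegree = 5;    // ramp hard once the target is exceeded
    if (gpu_temp_c <= target_c)
        return kQuietFloor;
    int speed = kQuietFloor + (gpu_temp_c - target_c) * kRampPerDegree;
    return speed > 100 ? 100 : speed;
}

int main() {
    // Raising the target from 80C to 95C keeps the fan at its quiet floor
    // across a much wider temperature range.
    for (int t = 70; t <= 95; t += 5)
        std::printf("%dC: %d%% fan (80C target) vs %d%% fan (95C target)\n",
                    t, FanSpeedPercent(t, 80), FanSpeedPercent(t, 95));
    return 0;
}
```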

Finally, in what can only be described as a love letter to the boys over at 120hz.net, NVIDIA is also introducing a simplified monitor overclocking option, which can be used to increase the refresh rate sent to a monitor in order to coerce it into operating at that higher refresh rate. Notably, this isn’t anything that couldn’t be done before with some careful manipulation of the GeForce control panel’s custom resolution option, but with the monitor overclocking option exposed in PrecisionX and other utilities, monitor overclocking has been reduced to a simple slider rather than a complex mix of timings and pixel counts.

Though this feature can technically work with any monitor, it’s primarily geared towards the various Korean LG-based 2560x1440 monitors that have hit the market in the past year, a number of which have come with electronics capable of operating far in excess of their standard 60Hz. On models that can handle it, modders have been able to push some of these 2560x1440 monitors to 120Hz and beyond, doubling their native 60Hz refresh rate and greatly improving smoothness to levels similar to a native 120Hz TN panel, but without the resolution and quality drawbacks inherent to those TN products.
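For a sense of what the slider is doing under the hood, the arithmetic reduces to pixel clock = horizontal total x vertical total x refresh rate. The blanking totals below are typical CVT reduced-blanking values and are assumptions for illustration; the exact figures come from each monitor’s EDID.

```cpp
// Rough pixel-clock arithmetic behind monitor overclocking at 2560x1440.
// The blanking totals are typical CVT reduced-blanking assumptions; a real
// monitor's EDID dictates the exact timings.
#include <cstdio>

int main() {
    const double h_total = 2720.0;   // 2560 active pixels + assumed horizontal blanking
    const double v_total = 1481.0;   // 1440 active lines + assumed vertical blanking
    const double rates_hz[] = {60.0, 75.0, 120.0};
    for (double hz : rates_hz) {
        double pixel_clock_mhz = h_total * v_total * hz / 1e6;
        std::printf("%5.0f Hz -> %6.1f MHz pixel clock\n", hz, pixel_clock_mhz);
    }
    // Doubling the refresh rate doubles the required pixel clock, which is why
    // only monitors with capable driving electronics survive the overclock.
    return 0;
}
```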

Of course it goes without saying that, just like any other form of overclocking, monitor overclocking can be dangerous and risks breaking the monitor. On that note, out of our monitor collection we were able to get our Samsung 305T up to 75Hz, but whether due to the panel or the driving electronics, the higher refresh rate didn’t seem to have any impact on performance, smoothness, or responsiveness. This is truly a “your mileage may vary” situation.

Comments

  • hammer256 - Tuesday, February 19, 2013 - link

    Ryan's analysis of the target market for this card is spot on: this card is for small-scale HPC type workloads, where the researcher just wants to build a desktop-like machine with a few of those cards. I know that's what I use for my research. To me, this is the real replacement of the GTX 580 for our purposes. The price hike is not great, but when put in the context of the K20X, it's a bargain. I'm lusting to get 8 of these cards and get a Tyan GPU server.
  • Gadgety - Tuesday, February 19, 2013 - link

    While gamers see little benefit, it looks like this is the card for GPU rendering, provided the software developers at VRay, Octane and others find a way to tap into this. So one of these can replace the 3xGTX580 3GBs.
  • chizow - Tuesday, February 19, 2013 - link

    Nvidia has completely lost their minds. Throwing in a minor bone with the non-neutered DP performance does not give them license to charge $1K for this part, especially when DP on previous flagship parts carried similar performance relative to Tesla.

    First the $500 for a mid-range ASIC in GTX 680, then $1200 GTX 690 and now a $1000 GeForce Titan. Unbelievable. Best of luck Nvidia, good luck competing with the next-gen consoles at these price points, or even with yourselves next generation.

    While AMD is still at fault in all of this for their ridiculous launch pricing for the 7970, these recent price missteps from Nvidia make that seem like a distant memory.
  • ronin22 - Wednesday, February 20, 2013 - link

    Bullshit of a typical NV hater.

    The compute-side of the card isn't a minor bone, it's its prime feature, along with the single-chip GTX690-like performance.

    "especially when DP on previous flagship parts carried similar performance relative to Tesla"

    Bullshit again.
    Give me a single card that is anywhere near the K20 in DP performance and we'll talk.

    You don't understand the philosophy of this card, as many around here don't.
    Thankfully, the real intended audience is already recognizing the awesomeness of this card (read previous comments).

    You can go back to playing BF3 on your 79xx, but please close the door behind you on your way out ;)
  • chizow - Wednesday, February 20, 2013 - link

    Heh, your ignorant comments couldn't be further from the truth about being an "NV hater". I haven't bought an ATI/AMD card since the 9700pro (my gf made the mistake of buying a 5850 though, despite my input) and previously, I solely purchased *multiple* Nvidia cards in this flagship market for the last 3 generations.

    I have a vested interest in Nvidia in this respect as I enjoy their products, so I've never rooted for them to fail, until now. It's obvious to me now that between AMD's lackluster offerings and ridiculous launch prices along with Nvidia's greed with their last two high-end product launches (690 and Titan), that they've completely lost touch with their core customer base.

    Also, before you comment ignorantly again, please look up the DP performance of GTX 280 and GTX 480/580 relative to their Tesla counterparts. You will see they are still respectable, ~1/8th of SP performance, which was still excellent compared to the completely neutered 1/32 DP of GK104 Kepler. That's why there is still a high demand for flagship Fermi parts and even GT200 despite their overall reputation as a less desirable part due to their thermal characteristics.

    Lastly, I won't be playing BF3 on a 7970, try a pair of GTX 670s in SLI. There's a difference between supporting a company through sound purchasing decisions and stupidly pissing away $1K for something that cost $500-$650 in the past.

    The philosophy of this card is simple: Rob stupid people of their money. I've seen enough of this in the past from the same target audience and generally that feeling of "awesomeness" is quickly replaced by buyer's remorse as they realize that slightly higher FPS number in the upper left of their screen isn't worth the massive number on their credit card statement.
  • CeriseCogburn - Sunday, February 24, 2013 - link

    That one's been pissing acid since the 680 launch, failed and fails to recognize the superior leap of the GTX580 over the prior gen, which gave him his mental handicap believing he can get something for nothing, along with sucking down the master amd fanboy Charlie D's rumor about the "$350" flagship nVidia card blah blah blah 680 blah blah second tier blah blah blah.

    So instead the rager now claims he wasted near a grand on two 670's - R O F L - the lunatics never end here man.
  • bamboo69 - Tuesday, February 19, 2013 - link

    Origin is using EK Waterblocks? I hope they aren't nickel plated; their nickel blocks flake.
  • Knock24 - Wednesday, February 20, 2013 - link

    I've seen it mentioned in the article that Titan has HyperQ support, but I've also read the opposite elsewhere.
    Can anyone confirm that HyperQ is supported? I'm guessing the simpleHyperQ CUDA SDK example might reveal if it's supported or not.
  • torchedguitar - Wednesday, February 20, 2013 - link

    HyperQ actually means two separate things... One part is the ability to have a process act as a server, providing access to the GPU for other MPI processes. This is supported on Linux using Tesla cards (e.g. K20X) only, so it won't work on GTX Titan (it does work on Titan the supercomputer, though). The other part of HyperQ is that there are multiple hardware queues available for managing the work on multiple CUDA streams. GTX Titan DOES support this part, although I'm not sure just how many of these will be enabled (it's a tradeoff, having more hardware streams allows more flexibility in launching concurrent kernels, but also takes more memory and takes more time to initialize).

    The simpleHyperQ sample is a variation of the concurrentKernels sample (just look at the code), and it shows how having more hardware channels cuts down on false dependencies between kernels in different streams. You put things in different streams because they have no dependencies on each other, so in theory nothing in stream X should ever get stuck waiting for something in stream Y. When that does happen due to hitting limits of the hardware, it's a false dependency. An example would be when you try to time a kernel launch by wrapping it with CUDA event records (this is the simpleHyperQ sample).

    GPUs before GK110 only have one hardware stream, and if you take a program that launches kernels concurrently in separate streams, and wrap all the kernels with CUDA event records, you'll see that suddenly the kernels run one-at-a-time instead of all together. This is because in order to do the timing for the event, the single hardware channel queues up the other launches while waiting for each kernel to finish, then it records the end time in the event, then goes on to the next kernel. With HyperQ's addition of more hardware streams, you get around this problem.

    Run the simpleHyperQ sample on a 580 or a 680 through a tool like Nsight and look at the timeline... You'll see all the work in the streams show up like stair steps -- even though they're in different streams, they happen one at a time. Now run it on a GTX Titan or a K20 and you'll see many of the kernels are able to completely overlap. If 8 hardware streams are enabled, the app will finish 8x faster, or if 32 are enabled, 32x faster.

    Now, this sample is extremely contrived, just to illustrate the feature. In reality, overlapping kernels won't buy you much speedup if you're already launching big enough kernels to use the GPU effectively. In that case, there shouldn't much room left for overlapping kernels, except when you have unbalanced workloads where many threads in a kernel finish quickly but a few stragglers run way longer. With HyperQ, you greatly increase your chances that kernels in other streams can immediately start using the resources freed up when some of the threads in a kernel finish early, instead of waiting for all threads in the kernel to finish before starting the next kernel.
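To make the pattern torchedguitar describes concrete, here is a stripped-down sketch (it is not the actual simpleHyperQ sample; the kernel, stream count, and spin length are illustrative): each stream's kernel is bracketed by event records, which serializes the work on a single-hardware-queue GPU but can overlap on GK110.

```cuda
// Stripped-down illustration of kernels in separate CUDA streams, each
// bracketed by event records. Not the actual simpleHyperQ sample: the kernel,
// stream count, and spin length are made up. On pre-GK110 GPUs the single
// hardware queue serializes these; with HyperQ they can overlap.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void spin_kernel(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }  // busy-wait so overlap is visible
}

int main() {
    const int kStreams = 8;
    cudaStream_t streams[kStreams];
    cudaEvent_t start[kStreams], stop[kStreams];

    for (int i = 0; i < kStreams; ++i) {
        cudaStreamCreate(&streams[i]);
        cudaEventCreate(&start[i]);
        cudaEventCreate(&stop[i]);
    }

    for (int i = 0; i < kStreams; ++i) {
        cudaEventRecord(start[i], streams[i]);          // event before the kernel...
        spin_kernel<<<1, 1, 0, streams[i]>>>(10000000);
        cudaEventRecord(stop[i], streams[i]);           // ...and after it
    }
    cudaDeviceSynchronize();

    for (int i = 0; i < kStreams; ++i) {
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start[i], stop[i]);
        std::printf("stream %d: %.2f ms\n", i, ms);     // overlapping streams report similar times
        cudaStreamDestroy(streams[i]);
        cudaEventDestroy(start[i]);
        cudaEventDestroy(stop[i]);
    }
    return 0;
}
```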
  • vacaloca - Monday, March 4, 2013 - link

    I wanted to say that you hit the nail on the head... I just tested the simpleHyperQ example, and indeed, the Titan has 8 hardware streams enabled. For every multiple higher than 8, the "Measured time for sample" goes up.
