The NVIDIA GeForce GTX 1080 & GTX 1070 Founders Editions Review: Kicking Off the FinFET Generation

Name: The NVIDIA GeForce GTX 1080 & GTX 1070 Founders Editions Review: Kicking Off the FinFET Generation
Item: The NVIDIA GeForce GTX 1080 & GTX 1070 Founders Editions Review: Kicking Off the FinFET Generation
Author: Ryan Smith

by Ryan Smith on July 20, 2016 8:45 AM EST

Posted in
GPUs
GeForce
NVIDIA
Pascal
16nm

200 Comments | Add A Comment

200 Comments

GPU Boost 3.0: Finer-Grained Clockspeed Controls

While much of this is abstracted away in everyday GPU discussions, under the hood the concept of clockspeed is a ~~little~~ lot more complex than the simple base clock and boost clock numbers posted in specification tables. Since the introduction of Kepler, NVIDIA has introduced fine-grained voltage points, which defines a series of GPU voltages and their respective clockspeeds. The GPU in turn operates at points along the resulting curve, shifting clockspeeds based on which voltage it’s at and what the environmental conditions are.

While these voltage points have been present since Kepler, NVIDIA has not, until now, exposed them to end users. However with Pascal this is finally changing, with the introduction of GPU Boost 3.0.

With the latest rendition of NVIDIA’s GPU clockspeed management technology, NVIDIA has made the individual voltage points programmable, and in turn they are exposing this functionality to third party overclocking programs via NVAPI. Consequently it is now possible to adjust the clockspeed of Pascal GPUs at each voltage point, a much greater level of control than before.

The addition of finer-grained controls is designed to improve the flexibility of overclocking on Pascal. Prior to GPU Boost 3.0, the only way to overclock was to adjust the clockspeed for all voltage points by the same amount at the same time – or in NVIDIA’s GPU Boost 3.0 vernacular, a fixed frequency offset. While this certainly works, it limits the highest stable overclock to the lowest point on the voltage/frequency curve. If the GPU can only overclock by 50MHz at the highest voltage point, but 100MHz at a middle point, then the highest stable overclock is only going to be 50MHz.

With GPU Boost 3.0 on the other hand, each point on the curve can be adjusted individually. This means the weakest points can be overclocked to a lesser degree while the strongest points can be more significantly overclocked. All other things held equal, this should improve GPU overclocking performance, as GPU tends to shift along multiple points when it’s running. Put another way: GPU Boost 3.0 seeks to wring out the last bits of overclocking headroom along the voltage frequency curve. The only way to go higher still would be to increase the voltage, which NVIDIA hasn’t truly allowed since Fermi.

Meanwhile, the flip side of having finer-grained controls is that it’s now more work to dial in the perfect overclock. Rather than testing one overclock you now have to test nearly two-dozen voltage points to fully exploit GPU Boost 3.0’s abilities, which is time consuming at the best of times. As a result NVIDIA has also exposed a setting in NVAPI to lock the GPU at a specific voltage point. The significance of this is that it now allows overclocking utilities to go through the voltage points and discretely test each one.

The first software to implement this concept is EVGA’s Precision XOC. The latest iteration of EVGA’s overclocking software is able to go through the voltage points and run an OC ScannerX test on each one to find its stability. When a point fails, Precision XOC will then back off the frequency at that point and move on. The end result is that after a series of trials and failures, you should have the virtually-perfect overclock.

Unfortunately while this is sound in concept, in practice NVIDIA and EVGA still aren’t quite there yet. Overclocking failures can cause multiple types of failures; graphic corruption (easy to catch and recover), driver crashes (moderately difficult to recover from), and system hardlocks (very difficult to recover from). In practice, Precision XOC isn’t yet at the point where it can quickly and efficiently handle the last two cases; so driver crashes and system hardlocks still require human intervention, and Precision XOC doesn’t do a great job of resuming from where it left off.

Hopefully one day NVIDIA and EVGA will get there, but for now the only practical way to fully exploit GPU Boost 3.0 is the tedious way. This means either using traditional offset overclocking, or a mode NVIDIA calls linear overclocking, in which the slope of the voltage/frequency curve is adjusted rather than offset (think m in y=mx+b rather than b). In this case two points are picked, and all of the voltage points are overclocked to match the resulting linear curve.

Observations on Clocking with Pascal

While we’re on the subject of clockspeed management on Pascal, I want to discuss my observations with how clockspeeds work on NVIDIA’s newest GPU. When it comes to clockspeed management NVIDIA hasn’t just changed how overclocking works, but relative to Kepler/Maxwell, there are some other, subtle changes.

To start, Pascal clockspeeds are much more temperature-dependent than on Maxwell 2 or Kepler. Kepler would drop a single bin at a specific temperature, and Maxwell 2 would sustain the same clockspeed throughout. However Pascal will drop its clockspeeds as the GPU warms up, regardless of whether it still has formal thermal and TDP headroom to spare. This happens by backing off both on the clockspeed at each individual voltage point, and backing off to lower voltage points altogether.

To quantify this effect, I ran LuxMark 3.1 continuously for several minutes, until the GPU temperature leveled out. As a compute test, LuxMark does not cause the GTX 1080 to hit its 83C temperature limit nor its 180W TDP limit, so it’s a good example of the temperature compensation effect.

What we find is that from the start of the run until the end, the GPU clockspeed drops from the maximum boost bin of 1898MHz to a sustained 1822MHz, a drop of 4%, or 6 clockspeed bins. These shifts happen relatively consistently up to 68C, after which they stop.

For what it’s worth, the GTX 1080 gets up to 68C relatively quickly, so GPU performance stabilizes rather soon. But this does mean that GTX 1080’s performance is more temperature dependent than GTX 980’s. Throwing a GTX 1080 under water could very well net you a few percent performance increase by avoiding the compensation effect, along with any performance gained from avoiding the card’s 83C temperature throttle.

In any case, I believe this to be compensation for the effects of higher temperatures on the GPU, backing off on voltages/clockspeeds due to potential issues. What those issues are I’m not sure; it could be that 16nm FinFET doesn’t like high voltages at higher temperatures (NVIDIA takes several steps to minimize GPU degradation), or something else entirely.

Otherwise, outside of the temperature compensation effect, clockspeeds on GTX 1080 appear to mostly be a function of temperature or running out of boost bins (VREL limited). The card rarely appears to be TDP limited, especially at steady-state. This indicates that NVIDIA could probably increase the fan speed of the cooler a bit to get a bit more performance, but at the cost of generating a bit more noise.

Finally, how overvolting is being represented is a bit different from before. Previously NVIDIA (and EVGA Precision) would show the exact additional voltage (i.e. the voltage of the unlocked voltage points) when overvolting. However now overvolting is expressed on a percentage scale from 0% to 100%, which obfuscates what the higher voltage points actually are. However this hasn’t changed the underlying behavior of overvolting; one or more voltage points are calibrated by NVIDIA, but they are locked due to the potential for GPU degradation. Overvolting then unlocks these points, allowing the GPU to boost higher so long as there is thermal and power headroom to allow it.

SLI: The Abridged Version NVIDIA Works: ANSEL & VRWorks Audio

PRINT THIS ARTICLE

Post Your Comment
Please log in or sign up to comment.

Comments Locked

200 Comments

View All Comments

TestKing123 - Wednesday, July 20, 2016 - link
Sorry, too little too late. Waited this long, and the first review was Tomb Raider DX11?! Not 12?

This review is both late AND rushed at the same time.
Mat3 - Wednesday, July 20, 2016 - link
Testing Tomb Raider in DX11 is inexcusable.

http://www.extremetech.com/gaming/231481-rise-of-t...
TheJian - Friday, July 22, 2016 - link
Furyx still loses to 980ti until 4K at which point the avg for both cards is under 30fps, and the mins are both below 20fps. IE, neither is playable. Even in AMD's case here we're looking at 7% gain (75.3 to 80.9). Looking at NV's new cards shows dx12 netting NV cards ~6% while AMD gets ~12% (time spy). This is pretty much a sneeze and will as noted here and elsewhere, it will depend on the game and how the gpu works. It won't be a blanket win for either side. Async won't be saving AMD, they'll have to actually make faster stuff. There is no point in even reporting victory at under 30fps...LOL.

Also note in that link, while they are saying maxwell gained nothing, it's not exactly true. Only avg gained nothing (suggesting maybe limited by something else?), while min fps jumped pretty much exactly what AMD did. IE Nv 980ti min went from 56fps to 65fps. So while avg didn't jump, the min went way up giving a much smoother experience (amd gained 11fps on mins from 51 to 62). I'm more worried about mins than avgs. Tomb on AMD still loses by more than 10% so who cares? Sort of blows a hole in the theory that AMD will be faster in all dx12 stuff...LOL. Well maybe when you force the cards into territory nobody can play at (4k in Tomb Raiders case).

It would appear NV isn't spending much time yet on dx12, and they shouldn't. Even with 10-20% on windows 10 (I don't believe netmarketshare's numbers as they are a msft partner), most of those are NOT gamers. You can count dx12 games on ONE hand. Most of those OS's are either forced upgrades due to incorrect update settings (waking up to win10...LOL), or FREE on machine's under $200 etc. Even if 1/4 of them are dx12 capable gpus, that would be NV programming for 2.5%-5% of the PC market. Unlike AMD they were not forced to move on to dx12 due to lack of funding. AMD placed a bet that we'd move on, be forced by MSFT or get console help from xbox1 (didn't work, ps4 winning 2-1) so they could ignore dx11. Nvidia will move when needed, until then they're dominating where most of us are, which is 1080p or less, and DX11. It's comic when people point to AMD winning at 4k when it is usually a case where both sides can't hit 30fps even before maxing details. AMD management keeps aiming at stuff we are either not doing at all (4k less than 2%), or won't be doing for ages such as dx12 games being more than dx11 in your OS+your GPU being dx12 capable.

What is more important? Testing the use case that describes 99.9% of the current games (dx11 or below, win7/8/vista/xp/etc), or games that can be counted on ONE hand and run in an OS most of us hate. No hate isn't a strong word here when the OS has been FREE for a freaking year and still can't hit 20% even by a microsoft partner's likely BS numbers...LOL. Testing dx12 is a waste of time. I'd rather see 3-4 more dx11 games tested for a wider variety although I just read a dozen reviews to see 30+ games tested anyway.
ajlueke - Friday, July 22, 2016 - link
That would be fine if it was only dx12. Doesn't look like Nvidia is investing much time in Vulkan either, especially not on older hardware.

http://www.pcgamer.com/doom-benchmarks-return-vulk...
Cygni - Wednesday, July 20, 2016 - link
Cool attention troll. Nobody cares what free reviews you choose to read or why.
AndrewJacksonZA - Wednesday, July 20, 2016 - link
Typo on page 18: "The Test"
"Core i7-4960X hosed in an NZXT Phantom 630 Windowed Edition" Hosed -> Housed
Michael Bay - Thursday, July 21, 2016 - link
I`d sure hose me a Core i7-4960X.
AndrewJacksonZA - Wednesday, July 20, 2016 - link
@Ryan & team: What was your reasoning for not including the new Doom in your 2016 GPU Bench game list? AFAIK it's the first indication of Vulkan performance for graphics cards.

Thank you! :-)
Ryan Smith - Wednesday, July 20, 2016 - link
We cooked up the list and locked in the games before Doom came out. It wasn't out until May 13th. GTX 1080 came out May 14th, by which point we had already started this article (and had published the preview).
AndrewJacksonZA - Wednesday, July 20, 2016 - link
OK, thank you. Any chance of adding it to the list please?

I'm a Windows gamer, so my personal interest in the cross-platform Vulkan is pretty meh right now (only one title right now, hooray! /s) but there are probably going to be some devs are going to choose it over DX12 for that very reason, plus I'm sure that you have readers who are quite interested in it.

The NVIDIA GeForce GTX 1080 & GTX 1070 Founders Editions Review: Kicking Off the FinFET Generation

GPU Boost 3.0: Finer-Grained Clockspeed Controls

Observations on Clocking with Pascal

Post Your Comment

200 Comments

View All Comments

TestKing123 - Wednesday, July 20, 2016 - link

Mat3 - Wednesday, July 20, 2016 - link

TheJian - Friday, July 22, 2016 - link

ajlueke - Friday, July 22, 2016 - link

Cygni - Wednesday, July 20, 2016 - link

AndrewJacksonZA - Wednesday, July 20, 2016 - link

Michael Bay - Thursday, July 21, 2016 - link

AndrewJacksonZA - Wednesday, July 20, 2016 - link

Ryan Smith - Wednesday, July 20, 2016 - link

AndrewJacksonZA - Wednesday, July 20, 2016 - link

Log in

Don't have an account? Sign up now