The NVIDIA GeForce GTX 1080 & GTX 1070 Founders Editions Review: Kicking Off the FinFET Generation
by Ryan Smith on July 20, 2016 8:45 AM EST

Overclocking
For our final evaluation of the GTX 1080 and GTX 1070 Founders Edition cards, let’s take a look at overclocking.
Whenever I review an NVIDIA reference card, I feel it’s important to point out that while NVIDIA supports overclocking – why else would they include fine-grained controls like GPU Boost 3.0 – they have taken a hard stance against true overvolting. Overvolting is limited to NVIDIA’s built-in overvoltage function, which isn’t so much a voltage control as it is the ability to unlock 1-2 more boost bins and their associated voltages. Meanwhile TDP controls are limited to whatever value NVIDIA believes is safe for that model card, which can vary depending on its GPU and its power delivery design.
For GTX 1080FE and its 5+1 power design, we have a 120% TDP limit, which translates to an absolute maximum TDP of 216W. As for GTX 1070FE and its 4+1 design, this is reduced to a 112% TDP limit, or 168W. Both cards can be “overvolted” to 1.093v, which unlocks 1 additional boost bin. As such, the maximum clockspeed with NVIDIA’s stock programming is 1911MHz.
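The arithmetic behind those ceilings is straightforward; a quick sanity-check sketch follows. Note that the 180W/150W base TDPs and the ~13MHz boost bin size are inferred from the figures quoted above, not taken from NVIDIA documentation.

```python
# Quick sanity check on the overclocking ceilings quoted above.
# The 180W/150W base TDPs and the ~13MHz boost bin size are inferred
# from the numbers in the text, not taken from NVIDIA documentation.

def max_tdp(base_tdp_w: float, tdp_limit_pct: float) -> float:
    """Absolute power ceiling once the TDP slider is maxed out."""
    return base_tdp_w * tdp_limit_pct / 100

def max_boost(stock_max_boost_mhz: int, extra_bins: int, bin_mhz: int = 13) -> int:
    """Top boost bin after 'overvolting' unlocks additional bins."""
    return stock_max_boost_mhz + extra_bins * bin_mhz

print(max_tdp(180, 120))   # 216.0 W -> GTX 1080FE ceiling
print(max_tdp(150, 112))   # 168.0 W -> GTX 1070FE ceiling
print(max_boost(1898, 1))  # 1911 MHz -> one extra boost bin over stock
```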
GeForce GTX 1080FE Overclocking

| | Stock | Overclocked |
|---|---|---|
| Core Clock | 1607MHz | 1807MHz |
| Boost Clock | 1734MHz | 1934MHz |
| Max Boost Clock | 1898MHz | 2088MHz |
| Memory Clock | 10Gbps | 11Gbps |
| Max Voltage | 1.062v | 1.093v |
GeForce GTX 1070FE Overclocking

| | Stock | Overclocked |
|---|---|---|
| Core Clock | 1506MHz | 1681MHz |
| Boost Clock | 1683MHz | 1858MHz |
| Max Boost Clock | 1898MHz | 2062MHz |
| Memory Clock | 8Gbps | 8.8Gbps |
| Max Voltage | 1.062v | 1.093v |
Both cards ended up overclocking by similar amounts. We were able to take the GTX 1080FE another 200MHz (+12% boost) on the GPU, and another 1Gbps (+10%) on the memory clock. The GTX 1070 could be pushed another 175MHz (+10% boost) on the GPU, while memory could go another 800Mbps (+10%) to 8.8Gbps.
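Expressed against the stock clocks in the tables above, those percentages work out as follows; a minimal sketch using only the figures already listed:

```python
# Deriving the overclock percentages quoted above from the stock vs.
# overclocked figures in the tables.

cards = {
    # card: (stock boost MHz, OC boost MHz, stock memory Gbps, OC memory Gbps)
    "GTX 1080FE": (1734, 1934, 10.0, 11.0),
    "GTX 1070FE": (1683, 1858, 8.0, 8.8),
}

for card, (boost, boost_oc, mem, mem_oc) in cards.items():
    gpu_gain = (boost_oc / boost - 1) * 100
    mem_gain = (mem_oc / mem - 1) * 100
    print(f"{card}: +{boost_oc - boost}MHz GPU ({gpu_gain:.0f}%), "
          f"+{mem_oc - mem:.1f}Gbps memory ({mem_gain:.0f}%)")
```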
Both of these are respectable overclocks, but compared to Maxwell 2 where our reference cards could do 20-25%, these aren’t nearly as extreme. Given NVIDIA’s comments on the 16nm FinFET voltage/frequency curve being steeper than 28nm, this could be first-hand evidence of that. It also indicates that NVIDIA has pushed GP104 closer to its limit, though that could easily be a consequence of the curve.
Given that this is our first look at Pascal, before diving into overall performance, let’s first take a look at an overclocking breakdown. NVIDIA offers 4 knobs to adjust when overclocking: overvolting (unlocking additional boost bins), increasing the power/temperature limits, the memory clock, and the GPU clock. Though all 4 will be adjusted for a final overclock, it’s often helpful to see whether it’s GPU overclocking or memory overclocking that delivers the greater impact, especially as it can highlight where the performance bottlenecks are on a card.
To examine this, we’ve gone ahead and benchmarked the GTX 1080 4 times: once with overvolting and increased power/temp limits (to serve as a baseline), once with the memory overclock added, once with the GPU overclock added, and finally with both the GPU and memory overclocks added.
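For readers who want to run a similar breakdown on their own cards, the bookkeeping is simply a matter of comparing each configuration against the appropriate reference run. A minimal sketch of that tabulation follows; the frame rates in it are hypothetical placeholders, not our measured results, and the table below has the actual numbers.

```python
# Sketch of how a per-knob breakdown like the table below is tabulated.
# The frame rates here are hypothetical placeholders, not measured data.

stock_fps = 58.0
runs = {
    "power/temp limit raised": 60.0,   # baseline for the per-knob columns
    "+ memory overclock":      61.0,
    "+ core overclock":        65.0,
    "+ core and memory":       66.5,   # the "cumulative" configuration
}

def gain(fps: float, reference: float) -> float:
    """Percentage improvement of fps over a reference run."""
    return (fps / reference - 1) * 100

baseline = runs["power/temp limit raised"]
print(f"power/temp limit: {gain(baseline, stock_fps):+.0f}% vs stock")
print(f"core only:        {gain(runs['+ core overclock'], baseline):+.0f}% vs baseline")
print(f"memory only:      {gain(runs['+ memory overclock'], baseline):+.0f}% vs baseline")
print(f"cumulative:       {gain(runs['+ core and memory'], stock_fps):+.0f}% vs stock")
```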
GeForce GTX 1080 Overclocking Performance

| | Power/Temp Limit (+20%) | Core (+12%) | Memory (+10%) | Cumulative |
|---|---|---|---|---|
| Tomb Raider | +3% | +4% | +1% | +10% |
| Ashes | +1% | +9% | +1% | +10% |
| Crysis 3 | +4% | +4% | +2% | +11% |
| The Witcher 3 | +2% | +6% | +3% | +10% |
| Grand Theft Auto V | +1% | +4% | +2% | +8% |
Across all 5 games, the results are clear and consistent: GPU overclocking contributes more to performance than memory overclocking. To be sure, both contribute, but even after compensating for the fact that the GPU overclock was a bit larger than the memory overclock (12% vs 10%), the GPU overclock still clearly contributes more. I am a bit surprised, though, that increasing the power/temperature limit didn't have more of an effect.
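One way to make that comparison concrete is to normalize each knob's gain by the size of the overclock that produced it. Taking Ashes as an example from the table above, a 9% gain from a 12% core overclock is roughly 0.75% of performance per 1% of core clock, versus about 0.1% per 1% of memory clock. A minimal sketch of that normalization:

```python
# Normalizing the per-knob gains from the table above by the size of
# the overclock that produced them (12% core, 10% memory).

games = {
    # game: (core overclock gain %, memory overclock gain %)
    "Tomb Raider":        (4, 1),
    "Ashes":              (9, 1),
    "Crysis 3":           (4, 2),
    "The Witcher 3":      (6, 3),
    "Grand Theft Auto V": (4, 2),
}

CORE_OC_PCT, MEM_OC_PCT = 12, 10  # size of each overclock

for game, (core_gain, mem_gain) in games.items():
    core_scaling = core_gain / CORE_OC_PCT  # perf % per 1% of core clock
    mem_scaling = mem_gain / MEM_OC_PCT     # perf % per 1% of memory clock
    print(f"{game}: core {core_scaling:.2f} vs memory {mem_scaling:.2f}")
```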
Overall we’re looking at an 8%-10% increase in performance from overclocking. It’s enough to further stretch the GTX 1080FE and GTX 1070FE’s leads, but it won’t radically alter performance.
Finally, let’s see the cost of overclocking in terms of power, temperature, and noise. For the GTX 1080FE, the power cost at the wall proves to be rather significant. An 11% Crysis 3 performance increase translates into a 60W increase in power consumption at the wall, essentially moving GTX 1080FE into the neighborhood of NVIDIA’s 250W cards like the GTX 980 Ti. The noise cost is also not insignificant, as GTX 1080FE has to ramp up to 52.2dB(A), a 4.6dB(A) increase in noise. Meanwhile FurMark essentially confirms these findings, with a smaller power increase but a similar increase in noise.
As for the GTX 1070FE, neither the increase in power consumption nor the increase in noise is quite as high as on the GTX 1080FE, though the performance uplift is also a bit smaller. The power penalty is just 21W at the wall for Crysis 3 and 38W for FurMark. This translates to a 2-3dB(A) increase in noise, topping out at 50.0dB(A) for FurMark.
200 Comments
eddman - Wednesday, July 20, 2016 - link
That puts a lid on the comments that Pascal is basically a Maxwell die-shrink. It's obviously based on Maxwell, but the addition of dynamic load balancing and preemption clearly elevates it to a higher level.

Still, seeing that using async with Pascal doesn't seem to be as effective as on GCN, the question is how much of a role it will play in DX12 games in the next 2 years. Obviously async isn't the be-all and end-all when it comes to performance, but can Pascal keep up as a whole going forward or not?
I suppose we won't know until more DX12 games are out that are also optimized properly for Pascal.
javishd - Wednesday, July 20, 2016 - link
Overwatch is extremely popular right now; it deserves to be a staple in gaming benchmarks.

jardows2 - Wednesday, July 20, 2016 - link
Except that it really is designed as an e-sports style game, and can run very well on low-end hardware, so it isn't really needed for reviewing flagship cards. In other words, if your primary desire is to find a card that will run Overwatch well, you won't be looking at spending $200-$700 on the new video cards coming out.

Ryan Smith - Wednesday, July 20, 2016 - link
And this is why I really wish Overwatch was more demanding on GPUs. I'd love to use it and DOTA 2, but 100fps at 4K doesn't tell us much of use about the architecture of these high-end cards.

Scali - Wednesday, July 20, 2016 - link
Thanks for the excellent write-up, Ryan! Especially the parts on asynchronous compute and pre-emption were very thorough.
A lot of nonsense was being spread about nVidia's alleged inability to do async compute in DX12, especially after Time Spy was released, and actually showed gains from using multiple queues.
Your article answers all the criticism, and proves the nay-sayers wrong.
Some of them went so far in their claims that they said nVidia could not even do graphics and compute at the same time. Even Maxwell v2 could do that.
I would say you have written the definitive article on this matter.
The_Assimilator - Wednesday, July 20, 2016 - link
Sadly that won't stop the clueless AMD fanboys from continuing to harp on about how NVIDIA "doesn't have async compute" or that it "doesn't work". You've gotta feel for them though: NVIDIA's poor performance in a single tech demo... written with assistance from AMD... is really all the red camp has to go on. Because they sure as hell can't compete in terms of performance, or power usage, or cooler design, or adhering to electrical specifications...

tipoo - Wednesday, July 20, 2016 - link
Pretty sure the critique was of Maxwell; Pascal's async was widely advertised. It's them saying "don't worry, Maxwell can do it" to questions about it not having it, and then when Pascal is released, saying "oh yeah, performance would have tanked with it on Maxwell", that bugs people, as it should.

Scali - Wednesday, July 20, 2016 - link
Nope, a lot of critique on Time Spy was specifically *because* Pascal got gains from the async render path. People said nVidia couldn't do it, so FutureMark must be cheating/bribed.darkchazz - Thursday, July 21, 2016 - link
It won't matter much though, because they won't read anything in this article or Futuremark's statement on Async use in Time Spy. And they will keep linking some forum posts that claim nvidia does not support Async Compute.
Nothing will change their minds that it is a rigged benchmark and the developers got bribed by nvidia.
Scali - Friday, July 22, 2016 - link
Yea, not even this official AMD post will: http://radeon.com/radeon-wins-3dmark-dx12/