NVIDIA GeForce GTX 680 Review: Retaking The Performance Crown
by Ryan Smith on March 22, 2012 9:00 AM EST
Theoreticals
As with any new architecture, we want to take a few moments to look at theoretical performance. These numbers shouldn’t be taken too seriously for cross-vendor comparisons, but they often tell us more about the architectural improvements that occur from one generation to the next.
Our first theoretical test is perhaps the most perplexing: 3DMark Vantage’s pixel fill test. Typically this test is memory bandwidth bound as the nature of the test has the ROPs pushing as many pixels as possible with as little overhead as possible, which in turn shifts the bottleneck to a mix of ROP performance and the memory bandwidth needed to feed those ROPs.
Compared to the GTX 580, the GTX 680 has almost exactly the same amount of memory bandwidth (192GB/sec) and only 86% of the theoretical ROP performance (32Gpix vs. 37Gpix). In short, it shouldn’t outperform the GTX 580 here, and yet it outperforms the 580 by 33%.
Why does it do this? That’s the hard thing to answer. As we mentioned in our look at GK104’s architecture, NVIDIA did make some minor incremental improvements to their ROPs coming from GF114, such as slightly improved compression and improved polygon merging. One of those may very well be the contributing factor, particularly the compression improvements since this is a typically memory bandwidth bottlenecked test. Alternatively, it’s interesting to note that the difference between the two video cards is almost identical to the difference in the core clock. GTX 560 Ti’s results tend to blow a hole in this theory, but it bears consideration.
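To make the arithmetic behind those figures concrete, here is a minimal sketch of how the theoretical ROP numbers fall out, assuming one pixel per ROP per clock. The ROP counts and core clocks (48 ROPs at 772MHz for the GTX 580, 32 ROPs at 1006MHz base for the GTX 680) are reference specifications filled in for illustration rather than values stated above, and boost clock is ignored.

```python
# Theoretical pixel fill rate = ROP count x core clock (one pixel per ROP per clock).
# ROP counts and clocks are assumed reference specs, not figures quoted in the text.
CARDS = {
    "GTX 580": {"rops": 48, "core_mhz": 772},
    "GTX 680": {"rops": 32, "core_mhz": 1006},  # base clock only, boost ignored
}


def pixel_fill_gpix(rops: int, core_mhz: float) -> float:
    """Peak pixel fill rate in Gpix/s."""
    return rops * core_mhz / 1000.0


fill = {name: pixel_fill_gpix(**spec) for name, spec in CARDS.items()}
rop_ratio = fill["GTX 680"] / fill["GTX 580"]
clock_ratio = CARDS["GTX 680"]["core_mhz"] / CARDS["GTX 580"]["core_mhz"]

for name, gpix in fill.items():
    print(f"{name}: {gpix:.1f} Gpix/s")
print(f"GTX 680 / GTX 580 ROP throughput: {rop_ratio:.2f}")
print(f"GTX 680 / GTX 580 core clock:     {clock_ratio:.2f}")
```

Under these assumptions the two cards land at roughly 37Gpix and 32Gpix, and the core clock ratio comes out at about 1.30, which is why the measured 33% lead sits so close to the clock difference. The unrounded ROP ratio is nearer 0.87; dividing the rounded 32 by 37 gives the 86% quoted above.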
In any case, it’s an interesting turn of events and hopefully one that isn’t simply an edge case. As we’ve seen in our benchmarks GTX 680 has strong performance – even if its lead compared to the 7970 diminishes with resolution – but compared to the GTX 580 in particular it needs strong ROP performance across all games in order to deliver good performance at high resolutions and anti-aliasing.
Our second theoretical test is 3DMark Vantage’s texture fill test, which to no surprise has the GTX 680 handily clobbering all prior NVIDIA cards. NVIDIA’s inclusion of 128 texture units on GK104 versus 64 on their previous generation GPUs gives the GTX 680 far better texturing performance. The 30%+ core clock difference only serves to further widen the gap.
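The same back-of-the-envelope approach shows how the texture unit count and the clock speed compound. This is a rough sketch assuming one texel per texture unit per clock; the unit counts come from the paragraph above, while the 772MHz and 1006MHz core clocks are assumed reference specifications for the GTX 580 and GTX 680.

```python
# Peak texture fill rate = texture units x core clock (one texel per unit per clock).
def texel_rate_gtexels(texture_units: int, core_mhz: float) -> float:
    """Peak texture fill rate in GTexels/s."""
    return texture_units * core_mhz / 1000.0


gtx580 = texel_rate_gtexels(64, 772)    # 772 MHz is an assumed reference clock
gtx680 = texel_rate_gtexels(128, 1006)  # 1006 MHz is an assumed base clock

unit_gain = 128 / 64       # twice the texture units
clock_gain = 1006 / 772    # the "30%+" core clock difference
total_gain = gtx680 / gtx580

print(f"GTX 580: {gtx580:.1f} GTexels/s")
print(f"GTX 680: {gtx680:.1f} GTexels/s")
print(f"Units x{unit_gain:.1f} * clock x{clock_gain:.2f} = x{total_gain:.2f} theoretical gain")
```

On paper, then, doubling the texture units and raising the clock by about 30% works out to roughly 2.6 times the peak texturing rate, though a synthetic test will not necessarily scale by the full theoretical amount.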
Our third theoretical test is the set of settings we use with Microsoft’s Detail Tessellation sample program out of the DX11 SDK. Overall while NVIDIA didn’t make any significant changes to their tessellation hardware (peak triangle rate is still 4/cycle), they have been working on further improving performance at absurdly high tessellation factors. You can see some of this in action at the max factor setting, but even then we’re running into a general performance wall since the Detail Tessellation program can’t go to the absolute highest tessellation factors NVIDIA’s hardware supports.
Our final theoretical test is Unigine Heaven 2.5, a benchmark that straddles the line between a synthetic benchmark and a real-world benchmark as the engine is licensed but no notable DX11 games have been produced using it yet. In any case the Heaven benchmark is notable for its heavy use of tessellation, which means it’s largely a proxy test for tessellation performance. Here we can see the GTX 680 shoot well ahead of the GTX 580 – by more than we saw in the DX11 Detail Tessellation sample – but at the same time there’s a lot more going on in Heaven than just tessellation.
Honestly at this point in time I’m not sure just how much more tessellation performance is going to matter. Until DX11 is the baseline API for games, tessellation is still an add-on feature, which means it’s being used to add fine detail to specific models rather than being used on everything in a game world. This demands good tessellation at high factors but at the same time it’s subject to diminishing returns on the improvement to image quality as triangles reach single pixel sizes and smaller. To that end I’m still waiting to see the day where we see tessellation scale similarly to textures – that is by using full MIP chaining of displacement maps – at which point we can evaluate tessellation performance similar to texture performance when it comes to both measuring the performance hit and evaluating the difference in image quality.
404 Comments
SlyNine - Thursday, March 22, 2012 - link
While I love BF3, it's not the only game that matters, but it is the only example of the 680 beating the 7970 at frame rates that matter. However the 7970 is catching up as the res goes up. If we add 2 monitors does the 680 still win?
BTW Crysis and Metro 2033 FPS matters to me. Do you think the GPU world revolves around you and what you want? You are not the center of the Videocard world.
Eugene86 - Thursday, March 22, 2012 - link
No, the GPU world doesn't revolve around me but as I already said, nobody but you and a handful of other people actually care about the Crysis and Metro benchmarks because almost nobody plays those games anymore.
In my example, Battlefield 3 is a current game that is actually played by people so those benchmarks are useful.
The only reason why Crysis and Metro are used is because they are benchmark games that stress the video cards to their limits. This is nice for bragging rights but completely useless in the real world.
Nvidia and AMD both release video cards that are aimed to please their main market, which is gamers who play on a single monitor at 1080p.
SlyNine - Friday, March 23, 2012 - link
So what are you basing this on? Can you give me any sources?
I'm not buying a 600$ video card for just one game.
Plus like I said, as the settings go up, they seem to converge. I can't help but wonder if the 7970 would overtake the 680 at some point before we hit 30fps.
CeriseCogburn - Tuesday, March 27, 2012 - link
Then look at SHOGUN 2 total war in this very article man.
Wow, so many of you are so controlled and so mindless on things...
" Total War: Shogun 2 is the latest installment of the long-running Total War series of turn based strategy games, and alongside Civilization V is notable for just how many units it can put on a screen at once. As it also turns out, it’s the single most punishing game in our benchmark suite "
680 takes the top in that game man.
Galidou - Sunday, March 25, 2012 - link
''Nvidia and AMD both release video cards that are aimed to please their main market, which is gamers who play on a single monitor at 1080p''
Well then it means that gamers can be more than pleased with a radeon 6870 at 140$ that runs everything with 95% graphical options enabled or a gtx 560ti (not the 448 cores version) for around 200$ which performs a little better than the 6870 and still does the trick in every game at 1080p.
Prices taken from the bay as no regular 560ti was available on newegg for price comparison.
Oh... and for the 5% graphical options you can't turn on, you'll only notice when you go on a sunday walk in your games, but doing so will have you dead in a second if you play online against other players...
b3nzint - Monday, March 26, 2012 - link
i play metro, crysis a lot, amazing graphic! but thats not "real world" to me. maybe if i play bf3 im in real world?
CeriseCogburn - Tuesday, March 27, 2012 - link
Shogun 2 total war, the most demanding game in the benches - did you read?
Nvidia GTX680 sweeps the entire resolution set beating the slower 7970 that cannot handle modern demanding games as well.
akse - Thursday, March 22, 2012 - link
Seems impossible to do such a feat!!! Considering they launched it months later than the competitor!
arjuna1 - Thursday, March 22, 2012 - link
Correct??
You call +10 fps difference at best, on certain situations, a domination??
The only good thing this will bring is prices down, the rest is truly unremarkable, for both companies.
See you in the 8xxx/7xxx series.
Wreckage - Thursday, March 22, 2012 - link
Maybe I was a bit hasty. I did forget to mention that it hard launched with working drivers and working h.264 encoding, and is also quiet under load. Impressive++