Theoreticals

As with any new architecture, we want to take a few moments to look at theoretical performance. These numbers shouldn’t be taken too seriously for cross-vendor comparison, but they often tell us more about the interesting architectural improvements that occur from one generation to the next.

3DMark Vantage Pixel Fill

Our first theoretical test is perhaps the most perplexing: 3DMark Vantage’s pixel fill test. Typically this test is memory bandwidth bound as the nature of the test has the ROPs pushing as many pixels as possible with as little overhead as possible, which in turn shifts the bottleneck to a mix of ROP performance and the memory bandwidth needed to feed those ROPs.

Compared to the GTX 580, the GTX 680 has almost exactly the same amount of memory bandwidth (192GB/sec) and only 86% of the theoretical ROP performance (32Gpix/sec vs. the GTX 580’s 37Gpix/sec). In short, it shouldn’t outperform the GTX 580 here, and yet it outperforms the 580 by 33%.

Why does it do this? That’s the hard thing to answer. As we mentioned in our look at GK104’s architecture, NVIDIA did make some minor incremental improvements to their ROPs coming from GF114, such as slightly improved compression and improved polygon merging. One of those may very well be the contributing factor, particularly the compression improvements since this is a typically memory bandwidth bottlenecked test. Alternatively, it’s interesting to note that the difference between the two video cards is almost identical to the difference in the core clock. GTX 560 Ti’s results tend to blow a hole in this theory, but it bears consideration.
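To make the arithmetic behind the ROP figure and the core clock theory concrete, here is a minimal sketch. The ROP counts and core clocks are the commonly published reference specifications for the two cards and are not stated on this page, so treat them as assumptions; only the 33% result comes from the text above. With unrounded inputs the ROP ratio comes out near 0.87, while the rounded 32 and 37 figures give the 86% quoted above.

```python
# Theoretical pixel fill rate = ROP count x core clock.
# ASSUMPTION: ROP counts and clocks are published reference specs,
# not numbers taken from this page.
CARDS = {
    "GTX 580": {"rops": 48, "core_mhz": 772},
    "GTX 680": {"rops": 32, "core_mhz": 1006},
}

OBSERVED_GAIN = 1.33  # GTX 680 vs. GTX 580 in the pixel fill test (from the text)


def pixel_fill_gpix(rops, core_mhz):
    """Peak fill rate in Gpix/sec, assuming one pixel per ROP per clock."""
    return rops * core_mhz / 1000.0


fill = {name: pixel_fill_gpix(**spec) for name, spec in CARDS.items()}
rop_ratio = fill["GTX 680"] / fill["GTX 580"]
clock_ratio = CARDS["GTX 680"]["core_mhz"] / CARDS["GTX 580"]["core_mhz"]

for name, rate in fill.items():
    print(f"{name}: {rate:.1f} Gpix/sec theoretical")
print(f"Theoretical ROP ratio (680/580): {rop_ratio:.2f}")
print(f"Core clock ratio (680/580):      {clock_ratio:.2f}")
print(f"Observed test ratio (680/580):   {OBSERVED_GAIN:.2f}")
```

The clock ratio lands close to the observed gain, which is the coincidence the paragraph above points out; the sketch says nothing about which explanation is correct.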

In any case, it’s an interesting turn of events and hopefully one that isn’t simply an edge case. As we’ve seen in our benchmarks GTX 680 has strong performance – even if its lead compared to the 7970 diminishes with resolution – but compared to the GTX 580 in particular it needs strong ROP performance across all games in order to deliver good performance at high resolutions and anti-aliasing.

3DMark Vantage Texture Fill

Our second theoretical test is 3DMark Vantage’s texture fill test, which to no surprise has the GTX 680 handily clobbering all prior NVIDIA cards. NVIDIA’s inclusion of 128 texture units on GK104 versus 64 on their previous generation GPUs gives the GTX 680 far better texturing performance. The 30%+ core clock difference only serves to further widen the gap.
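As a rough illustration of why the gap is so wide, theoretical texture fill rate can be estimated as texture units multiplied by core clock. The texture unit counts come from the paragraph above; the core clocks (772MHz for the GTX 580, 1006MHz for the GTX 680) are published reference values rather than numbers from this page, so they are assumptions here.

```python
# Theoretical texel fill rate = texture units x core clock.
# Texture unit counts are from the article; clocks are ASSUMED reference specs.

def texel_fill_gtexels(texture_units, core_mhz):
    """Peak texel rate in Gtexels/sec, assuming one texel per unit per clock."""
    return texture_units * core_mhz / 1000.0


gtx580_rate = texel_fill_gtexels(texture_units=64, core_mhz=772)
gtx680_rate = texel_fill_gtexels(texture_units=128, core_mhz=1006)

unit_ratio = 128 / 64            # doubling of texture units
clock_ratio = 1006 / 772         # the "30%+" core clock difference
total_ratio = gtx680_rate / gtx580_rate

print(f"GTX 580: {gtx580_rate:.1f} Gtexels/sec")
print(f"GTX 680: {gtx680_rate:.1f} Gtexels/sec")
print(f"Unit ratio {unit_ratio:.2f} x clock ratio {clock_ratio:.2f} = {total_ratio:.2f}")
```

Under these assumptions the two factors compound to roughly 2.6x on paper; the measured result depends on the test and need not match the theoretical figure.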

DirectX11 Detail Tessellation Sample - Normal

DirectX11 Detail Tessellation Sample - Max

Our third theoretical test is the set of settings we use with Microsoft’s Detail Tessellation sample program out of the DX11 SDK. Overall while NVIDIA didn’t make any significant changes to their tessellation hardware (peak triangle rate is still 4/cycle), they have been working on further improving performance at absurdly high tessellation factors. You can see some of this in action at the max factor setting, but even then we’re running into a general performance wall since the Detail Tessellation program can’t go to the absolute highest tessellation factors NVIDIA’s hardware supports.
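Since the peak triangle rate per cycle is unchanged, any gain in theoretical triangle throughput would come from clock speed alone. The sketch below works that out; the 4 triangles/cycle figure is from the paragraph above, while the core clocks (772MHz and 1006MHz) are published reference values assumed for illustration.

```python
# Peak triangle throughput = triangles per cycle x core clock.
# 4 triangles/cycle is from the article; clocks are ASSUMED reference specs.
TRIANGLES_PER_CYCLE = 4


def peak_gtris(core_mhz, tris_per_cycle=TRIANGLES_PER_CYCLE):
    """Peak triangle rate in billions of triangles per second."""
    return tris_per_cycle * core_mhz / 1000.0


gtx580_tris = peak_gtris(772)
gtx680_tris = peak_gtris(1006)

print(f"GTX 580: {gtx580_tris:.2f} Gtris/sec peak")
print(f"GTX 680: {gtx680_tris:.2f} Gtris/sec peak")
print(f"Ratio:   {gtx680_tris / gtx580_tris:.2f}")
```

This is only the setup-rate ceiling. It does not model the improvements at very high tessellation factors that the text describes, which is where any difference beyond the clock ratio would have to come from.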

Unigine Heaven

Our final theoretical test is Unigine Heaven 2.5, a benchmark that straddles the line between a synthetic benchmark and a real-world benchmark as the engine is licensed but no notable DX11 games have been produced using it yet. In any case the Heaven benchmark is notable for its heavy use of tessellation, which means it’s largely a proxy test for tessellation performance. Here we can see the GTX 680 shoot well ahead of the GTX 580 – by more than we saw in the DX11 Detail Tessellation sample – but at the same time there’s a lot more going on in Heaven than just tessellation.

Honestly at this point in time I’m not sure just how much more tessellation performance is going to matter. Until DX11 is the baseline API for games, tessellation is still an add-on feature, which means it’s being used to add fine detail to specific models rather than being used on everything in a game world. This demands good tessellation at high factors but at the same time it’s subject to diminishing returns on the improvement to image quality as triangles reach single pixel sizes and smaller. To that end I’m still waiting to see the day where we see tessellation scale similarly to textures – that is by using full MIP chaining of displacement maps – at which point we can evaluate tessellation performance similar to texture performance when it comes to both measuring the performance hit and evaluating the difference in image quality.


  • SlyNine - Sunday, March 25, 2012 - link

    Well, the drivers themselves can take more CPU power to run. But with a quad core CPU the thought is laughable. Back in the single CPU/core days it was actually an issue. And before DX9 (or 10) drivers were only capable of accessing a single core, I believe.
  • SlyNine - Sunday, March 25, 2012 - link

    Then look for an overclocked review. Anandtech is always going to do an out of the box for the first review.

    This is what they(amd/nvidia) are promising you, nothing more.
  • papapapapapapapababy - Monday, March 26, 2012 - link

    USELESS !

    YESSS OMFG i cant wait to play the latest crappy kinect port with this!.... at 600.000.000 FPS and in 3-D! GTFO GUYS! REALLY....

    just put this ridiculously large, ugly, noisy, silly, and overpriced, toxic waste where it belongs: faaar away from me, ( sensible user) inside one bulky OnLive cloud server. (and pushing avatar 2 graphics, no HDps2 ports)
  • henrikfm - Monday, March 26, 2012 - link

    Most monitors have 60Hz refresh rate, you can't benefit from higher frame rates because only 60 frames are drawn.

    By looking at the benchmarks and considering a resolution of 1920, the latest cards fail in 3 games to deliver at least 60fps: Crysis, Metro and BF3. In the first two games the HD7970 beats the GTX680; it only loses in BF3, where nVidia has a clear advantage (in my opinion AMD has to work on its drivers for BF3).

    So, the GTX680 is faster when the speed really doesn't matter because you're already around 100fps. The guys who are running multiple monitors and higher resolutions will have also money to buy multiple GPU setups, and that is another story.

    Still the GTX680 is a better card, but for $500 I would expect a card to deliver at least 60fps at 1920 for a 2008 released videogame like Crysis. Neither nVidia nor AMD can do that with a single GPU, it's disappointing.
  • gramboh - Monday, March 26, 2012 - link

    I'll agree about Metro because there is a sequel (Last Light) coming out in Q1-2013 which will presumably be similar in the graphics department.

    Crysis is irrelevant other than for benchmarking, who still plays it? Single player campaign is entertaining for the eye candy once through (in 2008).

    BF3 is the game that matters because of the MP component, people will be playing it for years to come. AMD really really has to improve performance on the 7950/7970 in BF3, I won't even consider buying it (vs. the 680) unless they can make up some significant ground.
  • CeriseCogburn - Tuesday, March 27, 2012 - link

    I just have to do it, sorry.
    You forgot Shogun 2 total war, the hardest game in this bench set, that Nvidia wins in all 3 resolutions.
    You also forgot average frames are not low frames, so you need far above 60 fps average before you don't dip below 60 fps.
    Furthermore, all the eye candy is not cranked, driving the average and dips even lower when it is.
    You got nothing right.
  • b3nzint - Monday, March 26, 2012 - link

    Back in the 7970 review, it had cool tech stuff like PRT, MST hub, DDMA and bla bla bla. Why doesn't the GTX680 have sh** like that? Pardon my English. It's like this thing is built for one purpose only, and that's a success. Thanks.
  • mpx - Monday, March 26, 2012 - link

    This new Nvidia card supposedly has an architecture that burdens the CPU with scheduling etc. It may mean that it requires a faster CPU than ATI cards to reach similar performance. And since fast CPUs are expensive, it may mean it's actually more expensive.
  • BoFox - Monday, March 26, 2012 - link

    The key word in your first sentence is "supposedly".

    I see no evidence of this. It actually does far better in Starcraft 2, a game that already burdens the CPU. It also excels in Skyrim, while still doing just fine in Civilization V, which are also the most CPU-intensive games out there.
  • BoFox - Monday, March 26, 2012 - link

    In SC2, which is a very CPU-dependent game, the card still does amazingly well against the rest. The same also goes for Skyrim, beating the red team by a whopping percentage.
