Normalized Clocks: Separating Architecture & SMs from Clockspeed Increases

While we were doing our SLI benchmarking, we got several requests for GTX 580 results with normalized clockspeeds, to better separate which performance improvements are due to NVIDIA’s architectural changes and the enabling of the 16th SM, and which are due to the 10% higher clocks. So we’ve quickly run a GTX 580 at 2560 with GTX 480 clockspeeds (700MHz core, 924MHz memory) to capture this data. Games that benefit most from the clockspeed bump are going to be memory bandwidth or ROP limited, while games showing the biggest improvements in spite of the normalized clockspeeds are shader/texture limited or benefit from the texture and/or Z-cull improvements.

We’ll put two charts here: one with the actual framerates, and a second with all performance numbers normalized to the GTX 480’s performance.
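For readers who want to reproduce the second chart's normalization from their own numbers, a minimal sketch follows. The function name and the frame rates in the usage lines are hypothetical placeholders, not results from this article; the only thing taken from the text is the idea of dividing each result by the GTX 480's result in the same game.

```python
def normalize_to_baseline(baseline_fps, test_fps):
    """Express each test result as a multiple of the baseline card's result.

    Both arguments map a game name to a frame rate. A value of 1.10 means
    the tested card is 10% faster than the baseline in that game.
    """
    return {
        game: test_fps[game] / baseline_fps[game]
        for game in baseline_fps
        if game in test_fps  # skip games that were not run on both cards
    }


# Hypothetical placeholder frame rates, NOT measurements from the article.
gtx480 = {"game_a": 40.0, "game_b": 50.0}
gtx580_at_480_clocks = {"game_a": 44.0, "game_b": 51.0}

normalized = normalize_to_baseline(gtx480, gtx580_at_480_clocks)
for game, ratio in normalized.items():
    print(f"{game}: {ratio:.2f}x the GTX 480")
```

Normalizing this way puts every game on the same scale, so a game with low absolute framerates can be compared directly against one with high framerates.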

Games showing the lowest improvement in performance with normalized clockspeeds are BattleForge, STALKER, and Civilization V (which is CPU limited anyhow). At the other end are HAWX, DIRT 2, and Metro 2033.

STALKER and BattleForge hold consistent with our theory that games that benefit the least when normalized are ROP or memory bandwidth limited, as both games only see a pickup in performance once we ramp up the clocks. At the other end, HAWX, DIRT 2, and Metro 2033 still benefit from the clockspeed boost on top of their already hefty gains from the architectural improvements and the extra SM. Interestingly, Crysis looks to be the model for the average case: it benefits somewhat from the arch/SM improvements, but not a ton.

A subset of our compute benchmarks is much more straightforward here: Folding@Home and SmallLuxGPU improve 6% and 7% respectively from the increase in SMs (theoretical improvement: 6.7%), and after the clockspeed boost they end up 15% faster. From this it’s a safe bet that when GF110 reaches Tesla cards, the performance improvement for Tesla won’t be as great as it was for GeForce, since the architectural improvements were purely for gaming purposes. On the flip side, with so many SMs currently disabled, if NVIDIA can get a 16 SM Tesla out, the performance increase should be massive.
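The theoretical figure above is simple arithmetic, sketched below. It assumes perfectly linear scaling with SM count and core clock, which is an idealization rather than something the article claims. The 15 SM count for the GTX 480 follows from the 580 enabling a 16th SM, and the clock gain uses the article's rounded 10% figure. Function and constant names are illustrative only.

```python
# Idealized scaling estimate: assumes performance scales linearly with
# SM count and with core clock (an assumption, not a measured result).
SMS_GTX480 = 15    # GTX 480 ships with 15 SMs enabled
SMS_GTX580 = 16    # GTX 580 enables the 16th SM
CLOCK_GAIN = 0.10  # the article's rounded "10% higher clocks"


def sm_scaling_gain(old_sms, new_sms):
    """Fractional gain expected from the extra SMs alone."""
    return new_sms / old_sms - 1.0


def combined_gain(sm_gain, clock_gain):
    """Gains compound multiplicatively when both factors scale perfectly."""
    return (1.0 + sm_gain) * (1.0 + clock_gain) - 1.0


sm_gain = sm_scaling_gain(SMS_GTX480, SMS_GTX580)
total_gain = combined_gain(sm_gain, CLOCK_GAIN)

print(f"SM-only gain:    {sm_gain:.2%}")
print(f"SM + clock gain: {total_gain:.2%}")
```

Under these assumptions the ceiling works out to roughly 17%, so the measured 15% for the compute benchmarks sits a little below ideal scaling, as one would expect from real workloads.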


82 Comments

  • Nate0007 - Friday, November 12, 2010 - link

    The 6870 CF and 470 SLI are still the best bang for the $$$ here when you consider the performance and money spent.
    You get better performance than either AMD's or NVIDIA's top cards, and for less money than either single-card solution.
    Here's hoping that AMD's new cards drive down prices even more!
    Competition is great.
  • BoFox - Thursday, December 9, 2010 - link

    Hey Ryan Smith, I've been thinking about the normalized results...

    Since the 580 would be more bandwidth-bottlenecked than the 480 at same clock speeds, the bandwidth of the 580 should also be 6.67% higher than that of the 480, in order to keep it in line with 6.67% more shaders and TMU's.

    Then we could subtract exactly 6.67% from the overall performance gains, to compare it directly to GTX 480.

    I've added up an average of the normalized gains (excluding Civilization V since it's CPU-bound), and subtracted 6.67% from the average for the grand result of 3%, after subtracting 2% for the bandwidth penalty. The 2% penalty is just an estimate of how much it would have gained with 6.67% more bandwidth (with the memory running at 3942MHz to keep the bandwidth perfectly linear with the core muscle). If a 2% penalty is still too much, please also consider the ROP penalty, as there are still only 48 ROPs for the full 512sp, compared to 48 ROPs for 480sp. But then the penalty for ROPs should be very slight, given that 48 ROPs are already plentiful and hardly ever reach 100% usage with 4x AA.

    The grand result of 3% is a bit lackluster for doubled FP16 texturing power along with minor z-culling and other architecture optimizations.
