Compute & Tessellation Performance

As we mentioned in our look at the new Forceware 260 driver set, the drivers provided by NVIDIA for our testing of the GTS 450 have a broken OpenCL component, so we had to cut our compute benchmarking short by dropping our OpenCL benchmark. Furthermore, our pre-release version of Badaboom with Fermi support doesn’t work either, so that was dropped as well. What we offer instead is a much more abbreviated look at the GTS 450’s compute performance.

As was the case with GF104, GF106 is a superscalar design. With 2 warp schedulers, only 2 of the 3 banks of 16 CUDA cores in each SM can be put to use unless NVIDIA’s hardware can extract a degree of instruction level parallelism from the code being run. As a result, all of these GF104-derived GPUs have a wider range of compute performance than what we’re used to. In the best case scenario, where ILP can be extracted every clock, we achieve peak theoretical performance. The worst case scenario is 2/3rds of that performance. So the GTS 450 can perform like anything between a 128 CUDA core card and a 192 CUDA core card depending on the application in use.
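To make that range concrete, here is a minimal sketch of the arithmetic. The 3-banks-per-SM layout is inferred from the 2/3rds figure above, and the linear scaling between the two extremes is purely our simplifying assumption; `effective_cores` and `ilp_fraction` are hypothetical names used for illustration, not anything from NVIDIA.

```python
# Illustrative arithmetic only. The per-SM layout (3 banks of 16 cores fed by
# 2 warp schedulers) follows the description in the text; the linear model
# between the worst and best case is an assumption.
CORES_PER_BANK = 16
BANKS_PER_SM = 3       # 48 CUDA cores per SM
WARP_SCHEDULERS = 2    # without ILP, each scheduler keeps one bank busy


def effective_cores(total_cores, ilp_fraction):
    """Cores kept busy when the extra bank is fed on `ilp_fraction` of clocks."""
    if not 0.0 <= ilp_fraction <= 1.0:
        raise ValueError("ilp_fraction must be between 0 and 1")
    sm_count = total_cores // (CORES_PER_BANK * BANKS_PER_SM)
    extra_banks = BANKS_PER_SM - WARP_SCHEDULERS
    busy_banks = WARP_SCHEDULERS + extra_banks * ilp_fraction
    return sm_count * busy_banks * CORES_PER_BANK


GTS_450_CORES = 192
worst = effective_cores(GTS_450_CORES, 0.0)  # no ILP extracted
best = effective_cores(GTS_450_CORES, 1.0)   # ILP extracted every clock
print(f"GTS 450 effective CUDA cores: {worst:.0f} to {best:.0f}")
```

Real workloads will land somewhere between those two values, and where exactly depends on how much ILP the hardware can find in a given application.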

For our look at computing performance we once again have the CUDA version of Folding@Home. Using the Lambda work unit, we run a short benchmark that extrapolates the number of nodes per day the card would be able to process. All things considered, the GTS 450 does quite well here compared to the rest of the Fermi family thanks to its high clockspeed. It may only have around 57% as many CUDA cores as the GTX 460, but the higher clockspeed means that it delivers just shy of 70% of the GTX 460’s performance. Furthermore, we’ve already established that this benchmark isn’t L2 cache or memory bandwidth limited, so even though the GTS 450 isn’t using a “full” GF106 chip here, it isn’t penalized for the limitation.
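As a rough sanity check on those percentages, the sketch below scales core count by shader clock. The core counts are the cards’ published specifications; the shader clocks are the reference values as we recall them and are not stated in this section, so treat them as assumptions. `relative_throughput` is a hypothetical helper, and the model ignores ILP, cache and memory effects entirely.

```python
# Back-of-the-envelope estimate: throughput ~ CUDA cores x shader clock.
GTS_450_CORES = 192
GTX_460_CORES = 336

# Assumed reference shader clocks in MHz (not given in the text above).
GTS_450_SHADER_MHZ = 1566
GTX_460_SHADER_MHZ = 1350


def relative_throughput(cores_a, mhz_a, cores_b, mhz_b):
    """Theoretical throughput of card A as a fraction of card B."""
    return (cores_a * mhz_a) / (cores_b * mhz_b)


core_ratio = GTS_450_CORES / GTX_460_CORES
scaled = relative_throughput(
    GTS_450_CORES, GTS_450_SHADER_MHZ, GTX_460_CORES, GTX_460_SHADER_MHZ
)
print(f"Core count ratio:        {core_ratio:.1%}")
print(f"Clock-adjusted estimate: {scaled:.1%}")
```

Under these assumptions the clock-adjusted estimate comes out at roughly 66%, which is in the same neighborhood as the just-under-70% we measured in Folding@Home.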

Our other benchmark is a quick look at tessellation. With the DirectX 11 Detail Tessellation sample program, we’re primarily looking at whether we can throw a high enough tessellation load at the GPU to overwhelm its tessellation abilities and bring it to its knees. In this case we cannot, as the GTS 450 scales from tessellation factor 7 to tessellation factor 11 at only a little below the rate of the GTX 480 and GTX 460, achieving 63% of its performance at factor 11. This means that the GTS 450 should still have plenty of tessellation power for even this demanding sample, but of course this is heavily dependent on how much tessellation is used by future games.

One interesting thing is that because NVIDIA built its geometry units into its Polymorph Engines, its cards’ geometry abilities scale in a way that AMD’s don’t, thanks to AMD’s relatively constant fixed-function pipeline in the Radeon HD 5000 series. With the GTX 480 NVIDIA was advertising an 8-fold increase in geometry performance over the GTX 285, but with the GTS 450 NVIDIA is only talking about around a 2.4x increase over the GTS 250. This neatly showcases the much wider range of geometry performance in NVIDIA’s Fermi family. It also reinforces the fact that NVIDIA needs developers to fully utilize tessellation in order to maximize the geometry capabilities of the GTX 480. Otherwise, if a card like the GTS 450 is the geometry baseline, then scaling geometry capabilities through the Polymorph Engines will not have paid off.

66 Comments

  • FragKrag - Monday, September 13, 2010 - link

    Why isn't there a SC2 bench? :(

    I saw them on the laptop reviews and expected them here :(((((
    Reply
  • Ryan Smith - Monday, September 13, 2010 - link

    Because of the vast number of cards in our library, we only refresh the GPU test suite twice a year. It will get refreshed later this fall, and SC2 is a very likely candidate. Reply
  • Gomez Addams - Sunday, September 19, 2010 - link

    When you do your refresh please include older cards like the GTX285 and those of its era. I find it helpful to be able to evaluate whether a video card update will be of any value. So far, it would be of very little value. Reply
  • ronnybrendel - Monday, September 13, 2010 - link

    http://www.legitreviews.com/article/1408/11/ Reply
  • eanazag - Monday, September 13, 2010 - link

    Regretfully, I am patiently waiting for the test suite refresh. Reply
  • Leyawiin - Monday, September 13, 2010 - link

    Prior to Newegg pulling them there were lower priced ones at $130 going up to $140. Its true if you play the rebate game you can get an HD 5770 at that price, but what's coming out of your pocket on the day you buy is generally $140-150. HD 5750s are sitting at $120-140. GTX 460 768MBs are down to $170 (no rebate) so the pressure seems to be on the Radeons as much as the GTS 450. The HD 5750 is almost a useless purchase when the others are clustered so closely to its price point. Reply
  • iwodo - Monday, September 13, 2010 - link

    It is too bad that we wont get a 28nm die shrink of the Fermi soon. But it seems the logical plan for Nvidia is to work on Frequency and Bandwidth.

    You mention Nvidia has a relatively poor Memory Controller for GDDR5 and that is why it had to use 384bit MC where 256 from a ATI design would be enough.

    It we get a MC upgrade, + some better Frequency Headroom, and unlock the last bit of the SP, Nvidia should be able to counter the Northen Island coming in 2 - 3 months time.

    As that would be the best we can get with 40nm limit and respin of Fermi.
    Reply
  • DMisner - Monday, September 13, 2010 - link

    How does the 450 stand up to the 250 in power consumption and general gaming performance.

    Also, any word on how many PPD the GTS 450 will get in Folding@Home?
    Reply
  • lecaf - Monday, September 13, 2010 - link

    hey
    in all these benchmarks the 450 beats the 460 is that right ?
    I've took a look at tomshardware review and there the 460 wins.

    Did I miss-look at something or the figures are wrong?
    Reply
  • Ryan Smith - Monday, September 13, 2010 - link

    Where are you seeing the GTS 450 beating the GTX 460? Reply
