Compute & Tessellation Performance

As we mentioned in our look at the new Forceware 260 driver set, the drivers NVIDIA provided for our GTS 450 testing have a broken OpenCL component, so we had to cut our compute benchmarking short by dropping our OpenCL benchmark. Furthermore, our pre-release version of Badaboom with Fermi support doesn’t work either, so that was dropped as well. What we offer instead is a much more abbreviated look at the GTS 450’s compute performance.

As was the case with GF104, GF106 is a superscalar design. With 2 warp schedulers per SM, only 2 of the 3 banks of 16 CUDA cores in each SM can be put to use unless NVIDIA’s hardware can extract a degree of instruction level parallelism from the code. As a result, all of these GF104-derived GPUs have a wider range of compute performance than what we’re used to. In the best case scenario, where ILP can be extracted every clock, we achieve peak theoretical performance. The worst case scenario is 2/3rds of that performance. So the GTS 450 can perform anywhere between a 192 CUDA core card and a 128 CUDA core card depending on the application in use.
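The best-case and worst-case figures above can be reproduced with a little arithmetic. The following is a minimal sketch, not NVIDIA's scheduling model: it assumes 3 banks of 16 CUDA cores per SM (implied by the 2/3rds worst case) and treats ILP extraction as a simple fraction of clocks, and the name `effective_cores` is purely illustrative.

```python
# Rough model of how many CUDA cores a GF106-based GTS 450 can keep busy.
# Assumptions (not from any NVIDIA API): 3 banks of 16 cores per SM, and the
# third bank is only fed on clocks where ILP can be extracted.

CORES_PER_BANK = 16
BANKS_PER_SM = 3        # assumed: 48 cores per SM, matching the 2/3 worst case
WARP_SCHEDULERS = 2     # each scheduler is guaranteed to feed one bank
TOTAL_CORES = 192       # GTS 450 CUDA core count quoted in the article


def effective_cores(ilp_fraction: float) -> float:
    """Return the number of cores kept busy for a given ILP success rate.

    ilp_fraction is the share of clocks (0.0 to 1.0) on which the hardware
    finds an independent instruction for the extra bank.
    """
    if not 0.0 <= ilp_fraction <= 1.0:
        raise ValueError("ilp_fraction must be between 0.0 and 1.0")
    sm_count = TOTAL_CORES // (CORES_PER_BANK * BANKS_PER_SM)
    extra_banks = (BANKS_PER_SM - WARP_SCHEDULERS) * ilp_fraction
    return sm_count * CORES_PER_BANK * (WARP_SCHEDULERS + extra_banks)


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 1.0):
        print(f"ILP on {fraction:.0%} of clocks: {effective_cores(fraction):.0f} cores")
```

Real workloads will land somewhere between the two extremes, and the linear interpolation here is only a convenient simplification.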

For our look at computing performance we once again have the CUDA version of Folding@Home. Using the Lambda work unit, we run a short benchmark that extrapolates the number of nodes per day the card would be able to process. All things considered, the GTS 450 does quite well here compared to the rest of the Fermi family thanks to its high clockspeed. It may only have around 57% as many CUDA cores as the GTX 460, but the higher clockspeed means that it delivers just shy of 70% of the performance. Furthermore, we’ve already established that this benchmark isn’t L2 cache or memory bandwidth limited, so even though the GTS 450 isn’t using a “full” GF106 chip here, it isn’t penalized for the limitation.
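As a sanity check on those percentages, here is a small sketch that scales core count by clockspeed. The clocks (783MHz for the GTS 450, 675MHz for the GTX 460) and the GTX 460's 336 CUDA cores are reference specifications assumed here rather than taken from this page, and `relative_throughput` is a hypothetical helper name.

```python
# Naive throughput comparison: cores x clock, ignoring ILP, memory, and drivers.
# The clock values and the GTX 460 core count are assumed reference specs.

def relative_throughput(cores_a: int, clock_a_mhz: float,
                        cores_b: int, clock_b_mhz: float) -> float:
    """Return card A's theoretical shader throughput as a fraction of card B's."""
    return (cores_a * clock_a_mhz) / (cores_b * clock_b_mhz)


GTS_450 = (192, 783.0)   # (CUDA cores, core clock in MHz)
GTX_460 = (336, 675.0)

core_ratio = GTS_450[0] / GTX_460[0]
estimate = relative_throughput(*GTS_450, *GTX_460)

if __name__ == "__main__":
    print(f"Core count ratio: {core_ratio:.1%}")
    print(f"Cores x clock estimate: {estimate:.1%}")
```

This simple model lands in the mid-60% range, a little under the measured figure, so it should be read as a ballpark rather than a prediction of the Folding@Home result.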

Our other benchmark is a quick look at tessellation. With the DirectX 11 Detail Tessellation sample program, we’re primarily looking at whether we can throw a high enough tessellation load at the GPU to overwhelm its tessellation abilities and bring it to its knees. In this case we cannot, as the GTS 450 scales from tessellation factor 7 to tessellation factor 11 at only a little below the rate of the GTX 480 and GTX 460, achieving 63% of its performance at factor 11. This means that the GTS 450 should still have plenty of tessellation power for even this demanding sample, but of course this is heavily dependent on how much tessellation is used by future games.

One interesting thing is that because NVIDIA built its geometry units into its Polymorph Engines, their geometry abilities scale in a way that AMD’s don’t, thanks to AMD’s relatively constant fixed-function pipeline in the Radeon HD 5000 series. With the GTX 480 NVIDIA was advertising an 8-fold increase in geometry performance over the GTX 285, but with the GTS 450 NVIDIA is only talking about around a 2.4x increase over the GTS 250. This neatly showcases the much wider range of geometry performance in NVIDIA’s Fermi family. It also reinforces the fact that they need developers to fully utilize tessellation in order to maximize the geometry capabilities of the GTX 480. Otherwise, if a card like the GTS 450 is the geometry baseline, then scaling geometry capabilities through the Polymorph Engines will not have paid off.


66 Comments


  • just4U - Monday, September 13, 2010 - link

    Here in Canada I haven't really seen any 5850s priced under 300 yet and most are up in the 330 range.. The 1G 460 sits in the 220-240 (no price drops for us) so it's a tempting alternative for many (I think)

    I also believe the 5850 will be a $200 card sometime in the near future. It's been selling way above its suggested retail price (at launch) and when that happens it will be harder to consider the 460 as a viable alternative. I can't see it being sold at $150 (for the 1G variants) any time soon... so only fans of Nvidia would consider it if it's priced in the 5850's range.
  • jabber - Monday, September 13, 2010 - link

    Big thing is...who actually bought a 5830?

    When it came out everyone said it was a pointless card so big whoop, Nvidia's 460 beats a card that should never have been released in the first place.

    Bit like saying "our car out performed the Ford Edsel!"

    If you want middle of the road performance you get a 5770, if you want a better boost you get the 5850.
  • just4U - Monday, September 13, 2010 - link

    It was only a pointless card because of its price... Originally it should have been a lot cheaper but supply and demand has inflated the prices of most of AMD's 5X lineup. Sitting near $100 more than the 5770 is what made it a hard sell.
  • erple2 - Wednesday, September 15, 2010 - link

    Sure it did - the 768MB version of the 460 now gave you 5830 performance for > 10% less money. To me, that makes it sound like the 5830 was now "obsoleted" by the 460 series. The 1 GB card was significantly faster at the same price point, and the 768 MB version was just as fast, but significantly cheaper. Both with less power, noise and heat.

    Isn't that essentially what defines "obsoletes"?
  • SandmanWN - Monday, September 13, 2010 - link

    Throughout the entire test suite the 5850 is within 4-6 frames of the 470. In two it makes it to 8 and 10 frames more. Given you need an extra 100W on your power supply and the additional cost associated with that just to get that tiny fraction of output, the statement seems fanboyish. AT should be better than this.
  • just4U - Monday, September 13, 2010 - link

    What does the 470 have to do with this? Most of us agree that the 465/470/480 are all heat scores with insane power draws.. the 460 addressed that and came in at a price point that hit a sweet spot.. bringing a lot of the 470/480's strengths and none of their weaknesses to the table. Only real complaint I've seen for the 460 is the mini hdmi.
  • IceDread - Tuesday, September 14, 2010 - link

    You are missing my point.

    Saying "and it struck beautifully" implies that you are cheering for the nvidia team. It would be a different thing if he wrote "and it struck hard" or something like that.
  • adonn78 - Monday, September 13, 2010 - link

    I read another review, damn my cheating heart, that stated the SLI scaling on these cards was impressive. You got 2 GTS 450 cards but no SLI?
  • Stuka87 - Monday, September 13, 2010 - link

    Er, every single benchmark shows the GTS-450 SLI scores. They are marked in green (like the regular GTS-450).
  • marraco - Monday, September 13, 2010 - link

    Other web sites show that the 450 is slower than the 250.

    It's strange when the 250 has 128 shaders, and the 450 has 192.... and the 450 has DDR5 vs DDR3 in the 250.

    It looks like the texture units bottleneck this design.

    Even more strange is that I could not find the 250 in this article's charts.

    I don't see the 450 as price competitive with the radeons, except as an SLI setup. It would be more valuable if 3-way SLI were allowed, and I guess that is not the case, because the photos show only a single SLI connector.

    The SLI setups are unbeatable against the radeons price/performance. Maybe nVidia should design cheap, energy-efficient chips so a card manufacturer can pack 10 video chips on a single card.
