Compute Performance & Synthetics

While the GT 430 isn’t meant to be a computing monster, and you won’t see NVIDIA presenting it as such, it’s still a member of the Fermi family and possesses the family’s compute capabilities. This includes the Fermi cache structure, along with the 48 CUDA core SM that was introduced with GF104/GTX 460. This also means that its performance varies more than that of past-generation NVIDIA cards: because the SM has to extract ILP to keep all of its cores busy, the card performs like anything between a 64 CUDA core card and a 96 CUDA core card depending on the application.
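To put rough numbers on that range, here is a minimal Python sketch. It assumes, purely as an illustration and not as anything measured on this page, that each 48-core SM keeps only 32 cores busy when no ILP is found, and that the remaining 16 are used in proportion to how often a second independent instruction is available. The `ilp_hit_rate` parameter and the function name are hypothetical.

```python
# Illustrative model only; the 32-of-48 split is an assumption, not a measurement.
SM_COUNT = 2                  # 96 CUDA cores / 48 cores per SM
CORES_PER_SM = 48             # GF104-style SM, per the text
CORES_BUSY_WITHOUT_ILP = 32   # assumed: cores fed when no ILP can be extracted


def effective_cores(ilp_hit_rate: float) -> float:
    """Estimate busy CUDA cores for an ILP hit rate between 0.0 and 1.0."""
    if not 0.0 <= ilp_hit_rate <= 1.0:
        raise ValueError("ilp_hit_rate must be between 0.0 and 1.0")
    extra_cores = CORES_PER_SM - CORES_BUSY_WITHOUT_ILP
    per_sm = CORES_BUSY_WITHOUT_ILP + extra_cores * ilp_hit_rate
    return SM_COUNT * per_sm


for rate in (0.0, 0.5, 1.0):
    print(f"ILP hit rate {rate:.1f}: ~{effective_cores(rate):.0f} effective CUDA cores")
```

With no exploitable ILP the model bottoms out at 64 cores, and with perfect ILP it reaches the full 96, which matches the range described above.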

Meanwhile, being based on the GF104 SM, the GT 430 is FP64 capable at 1/12th FP32 speed (~20 GFLOPS FP64), a first for a card of this class.
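The FP64 figure can be sanity-checked with simple arithmetic. The sketch below assumes a 1400MHz shader clock and 2 FLOPs per CUDA core per clock (one fused multiply-add), neither of which is stated on this page; the 96 CUDA cores and the 1/12 ratio come from the text.

```python
# Peak-throughput arithmetic; clock and FLOPs-per-clock are assumptions.
CUDA_CORES = 96                # from the text
SHADER_CLOCK_GHZ = 1.4         # assumed GT 430 shader clock
FLOPS_PER_CORE_PER_CLOCK = 2   # assumed: one fused multiply-add per clock
FP64_RATE = 1 / 12             # FP64 runs at 1/12th FP32 speed, per the text


def peak_gflops(cores: int, clock_ghz: float, flops_per_clock: int) -> float:
    """Return theoretical peak GFLOPS for the given core count and clock."""
    return cores * clock_ghz * flops_per_clock


fp32_peak = peak_gflops(CUDA_CORES, SHADER_CLOCK_GHZ, FLOPS_PER_CORE_PER_CLOCK)
fp64_peak = fp32_peak * FP64_RATE

print(f"FP32 peak: {fp32_peak:.1f} GFLOPS")
print(f"FP64 peak: {fp64_peak:.1f} GFLOPS")
```

Under those assumptions the FP64 peak works out to a little over 22 GFLOPS, in the same ballpark as the ~20 GFLOPS quoted above.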

For our look at compute performance we’ll turn to our trusty benchmark copy of Folding @ Home. We’ve also included the GT 240, a last-generation card with 96 CUDA cores, just like the GT 430. This affords us an interesting opportunity to see how Fermi compares to GT21x with the same number of CUDA cores in play, although the GT 430 has a clockspeed advantage here that in theory gives it a higher level of performance.

The results are interesting, but also a bit distressing. The GT 430 trails the GTS 450 by quite a bit, but this is expected. The GT 240, however, manages to pull ahead of the GT 430 by nearly 17%, which is quite likely a manifestation of Fermi’s more variable performance. This makes the GT 220 comparison all the more appropriate: if Fermi’s CUDA cores are weaker on average, then the GT 430 can’t hope to keep pace with the GT 240.

To take a second look at CUDA core performance, we’ve also busted 3DMark Vantage out of the vault. As we’ve mentioned before, we’re not huge fans of synthetic tests like 3DMark, since they encourage non-useful driver optimizations for the benchmark instead of for real games, but the purely synthetic tests do serve a useful purpose when trying to get to the bottom of certain performance situations.

We’ll start with the Perlin Noise test, which is supposed to be computationally bound, similar to Folding @ Home.

Once more we see the GT 430 come in behind the GT 240, even though the GT 430 has the theoretical advantage due to clockspeed. The loss isn’t nearly as great as it was under Folding @ Home, but this lends more credence to the theory that Fermi’s CUDA cores are less efficient than GT21x CUDA cores. As a development card the GT 430 still has a number of advantages, such as the aforementioned FP64 support and C++ support in CUDA, but if we were trying to use it as a workhorse card it looks like it wouldn’t be able to keep up with the GT 240. Based on our gaming results earlier, this would seem to carry over to shader-bound games, too.

Moving on, we also used this opportunity to look at 3DMark Vantage’s color fill test, which is a ROP-bound test. With only 4 ROPs on the GT 430, this is the perfect synthetic test for seeing if having fewer ROPs really is an issue when we’re comparing GT 430 to older cards.

And the final verdict? A not very useful yes and no. GT 220 and GT 240 both have 8 ROPs, with GT 220 having the clockspeed advantage. This is why GT 220 ends up coming out ahead of GT 240 here by less than 100 MPixels/sec. But on the other hand, GT 430 has a clockspeed advantage of its own while possessing half the ROPs. The end result is that GT 430 is effectively tied with these previous-generation cards, which is actually quite a remarkable feat for having half the ROPs.
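For a sense of what the paper numbers say, the sketch below computes theoretical peak fill rate as ROPs times core clock. The ROP counts come from the text; the core clocks (625MHz for GT 220, 550MHz for GT 240, 700MHz for GT 430) are reference clocks assumed for illustration and may not match the cards tested here.

```python
# Theoretical peak pixel fill rate: one pixel per ROP per core clock.
# ROP counts are from the article; core clocks are assumed reference values.
CARDS = {
    "GT 220": {"rops": 8, "core_mhz": 625},
    "GT 240": {"rops": 8, "core_mhz": 550},
    "GT 430": {"rops": 4, "core_mhz": 700},
}


def peak_fill_mpixels(rops: int, core_mhz: int) -> int:
    """Return the theoretical peak fill rate in MPixels/sec."""
    return rops * core_mhz


fill = {name: peak_fill_mpixels(**spec) for name, spec in CARDS.items()}

for name, rate in fill.items():
    ratio = rate / fill["GT 430"]
    print(f"{name}: {rate} MPixels/sec on paper ({ratio:.2f}x GT 430)")
```

Under these assumed clocks the GT 430 should land at somewhere between 55% and 65% of the older cards' fill rate on paper, which is what makes the measured near-tie notable.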

NVIDIA worked on making the Fermi ROPs more efficient and it has paid off by letting them use 4 ROPs to do what took 8 in the last generation. With this data in hand, NVIDIA’s position that 4 ROPs is enough is much more defensible, as they’re at least delivering last-generation ROP performance on a die not much larger than GT216 (GT 220). This doesn’t provide enough additional data to clarify whether the ROPs alone are the biggest culprit in the GT 430’s poor gaming performance, but it does mean that we can’t rule out less efficient shaders either.

Do note, however, that while Fermi ROPs are more efficient than GT21x ROPs, it’s only a saving grace when doing comparisons to past-generation architectures. The GT 430 still only has ¼ the ROP power of the GTS 450, which definitely hurts the card compared to its more expensive sibling.

120 Comments

  • Ryan Smith - Monday, October 11, 2010

    None of the good product shots I have include passive GT 430s. However there are 2 in the collage on an NV slide: a Sparkle and a Zotac.
  • ranger203 - Monday, October 11, 2010

    I have a Geforce G210 for my media center and it runs blu-rays well. I am looking for a little more umpppff to allow some post processing of other videos. But I paid $35 for my card, and the only way I would replace it was if this one was under $50.

    But.... a quick froogle search found they want to go for around $75. Giga-byte offers a passive cooler, looks pretty bad ass.

    http://www.provantage.com/gigabyte-technology-gv-n...
  • Lolimaster - Monday, October 11, 2010

    Just buy a HD5550. You should've started buying an AMD IGP Mobo, that's enough.

    Nvidia on the low end is just worthless.
  • DMisner - Monday, October 11, 2010

    I noticed it has the same number of CUDA cores as the GT 240. Would this perform any better or about the same as the GT 240 for Folding@Home?
  • AznBoi36 - Monday, October 11, 2010

    It's slower than the GT240. Also the GT240 is a full height card, while this one is low profile. There is a reason why this card is called the GT430 and not a GT440.
  • mczak - Monday, October 11, 2010

    Hmm on page 1 it says roughly same die size as "Juniper GPU in the 5500/5600 families" - that should be redwood. Also, the "GT430 goes up against... GT430..." I guess that should be GT240?
    I'm really wondering how they attach 4 rops to two memory partitions btw. I believe that one quad rop per partition wouldn't really have required a lot of changes over one octo-rop per partition, but either the quad-rop block was split into 2 or it's actually attached to both MCs. Of course, for actual color fillrate, it doesn't really matter if there are 4 or 8 rops - pixel output is limited to 2 per SM anyway.
  • jsbiggs - Monday, October 11, 2010

    Additional correction:

    On the Power, Temperature, and Noise page, the line "Even at these low wattages where our 1200W power supply isn’t very efficiency". Probably should be "...isn't very efficient". Cheers.
  • Onferno - Monday, October 11, 2010

    Passively cooled and working as a PhysX card:

    http://www.hardwareheaven.com/reviews/1046/pg10/zo...
  • kmitty - Monday, October 11, 2010

    Depends on your definition of 'ultimate'. For me, just looking at it and seeing a fan meant it wasn't my ultimate HTPC card - absolute silence required!
  • mcnabney - Tuesday, October 12, 2010

    No kidding.

    An ideal HTPC card is:

    1. Silent
    2. Consumes very little power
    3. Can decode all current and pending video format/containers
    4. Fits half-height as well
    5. Has rock-solid driver support
    6. Cheap
    7. Bitstream audio and includes a full version of BluRay software to support it

    This card just doesn't measure up. Ideally the best HTPC card isn't a card, but is actually part of the motherboard.
