Compute & Tessellation

Moving on from our look at gaming performance, we have our customary look at compute performance, bundled with a look at theoretical tessellation performance. Unlike in our gaming benchmarks, where the architectural differences between GF114 and GF110 are largely irrelevant, those differences can become much more important in a compute-bound situation, depending on just how much ILP the GTX 560 Ti can extract from a given workload.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes.

Under our Civilization V compute benchmark we have a couple of different things going on even when we just look at the NVIDIA cards. Compared to the GTX 460 1GB, the GTX 560 enjoys a 31% performance advantage; this is less than the theoretical maximum of 39%, but not far off from the performance advantages we’ve seen in most games. Meanwhile the GTX 470 is practically tied with the GTX 560, even though on paper the GTX 560 has around a 15% theoretical performance advantage. This ends up being a solid case of where the limitations of ILP come into play, as clearly the GTX 560 isn’t maximizing the use of its superscalar shaders. Or to put it another way, it’s an example of why NVIDIA isn’t using a superscalar design on their Tesla products.
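The theoretical figures above fall out of simple shader-throughput arithmetic. Here is a minimal sketch; the CUDA core counts and shader clocks are the commonly published reference specs, which we're assuming here rather than taking from the benchmark results:

```python
# Rough theoretical shader throughput: CUDA cores x shader clock.
# The spec figures below are assumed reference values.
specs = {
    "GTX 460 1GB": (336, 1350),  # (CUDA cores, shader clock in MHz)
    "GTX 470":     (448, 1215),
    "GTX 560 Ti":  (384, 1644),
}

def throughput(card):
    cores, shader_mhz = specs[card]
    return cores * shader_mhz  # arbitrary units (core-MHz)

# GTX 560 Ti vs GTX 460 1GB: the ~39% theoretical maximum
adv_460 = throughput("GTX 560 Ti") / throughput("GTX 460 1GB") - 1
# GTX 560 Ti vs GTX 470: ~15-16% on paper, yet they tie in practice --
# GF114's superscalar shaders go underutilized when ILP runs short,
# while GF110's shaders are fully scheduled.
adv_470 = throughput("GTX 560 Ti") / throughput("GTX 470") - 1

print(f"vs GTX 460 1GB: +{adv_460:.0%}")  # +39%
print(f"vs GTX 470:     +{adv_470:.0%}")  # +16%
```

The gap between the 39% theoretical advantage and the 31% observed here is the cost of imperfect ILP extraction.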

Meanwhile this benchmark has always favored NVIDIA’s architectures, so in comparison to AMD’s cards there’s little to be surprised about. The GTX 560 Ti is well in the lead, with the only AMD card it can’t pass being the dual-GPU 5970.

Our second GPU compute benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. While it’s still in beta, SmallLuxGPU recently hit a milestone by implementing a complete ray tracing engine in OpenCL, allowing them to fully offload the process to the GPU. It’s this ray tracing engine we’re testing.

SmallLuxGPU is the other test in our suite where NVIDIA’s drivers significantly revised our numbers. Where this test previously favored raw theoretical performance, giving the vector-based Radeons an advantage, NVIDIA has now shot well ahead. Given the rough state of both AMD and NVIDIA’s OpenCL drivers, we’re attributing this to bug fixes or possibly enhancements in NVIDIA’s OpenCL driver, with the former seeming particularly likely. However NVIDIA is not alone when it comes to driver fixes, and AMD has seen a similar uptick with the newly released 6900 series. It’s not nearly the leap NVIDIA saw, but it’s good for around 25%-30% more rays/second under SLG. This appears to be attributable to further refinement of AMD’s VLIW4 shader compiler, which as we have previously mentioned stands to gain a good deal of performance as AMD works on optimizing it.

So where does SLG stack up after the latest driver enhancements? Having rocketed to the top, NVIDIA now easily dominates this benchmark. The GTX 560 Ti is now slightly ahead of the 6970, never mind the 6950 1GB, over which it has a 33% lead. Rather than being a benchmark that showed the advantage of having lots of theoretical compute performance, this is now a benchmark that seems to favor NVIDIA’s compute-inspired architecture.

Our final compute benchmark is a Folding @ Home benchmark. Given NVIDIA’s focus on compute for Fermi, cards such as the GTX 560 Ti can be particularly interesting for distributed computing enthusiasts, who are usually looking for a compute card first and a gaming card second.

Against the senior members of the GTX 500 series and even the GTX 480, the GTX 560 Ti is still well behind, but at the same time Folding @ Home does not appear to significantly penalize the GTX 560’s superscalar architecture.

At the other end of the spectrum from GPU computing performance is GPU tessellation performance, used exclusively for graphical purposes. With Fermi NVIDIA bet heavily on tessellation, and as a result they do very well at very high tessellation factors. With 2 GPCs the GTX 560 Ti can retire 2 triangles/clock, the same rate as the Radeon HD 6900 series, so this should be a good opportunity to look at theoretical architectural performance versus actual performance.
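As a back-of-the-envelope check on that geometry math, here's a hedged sketch; the core clocks (822MHz for the GTX 560 Ti, 880MHz for the 6970) are assumed reference values, not figures from this page:

```python
# Peak geometry throughput = triangles retired per clock x core clock.
# Both GPUs retire 2 triangles/clock; clocks are assumed reference values.
def peak_tri_rate(tris_per_clock, core_mhz):
    """Return peak triangle rate in millions of triangles per second."""
    return tris_per_clock * core_mhz

gtx_560_ti = peak_tri_rate(2, 822)  # 2 GPCs, 1 triangle/clock each
hd_6970    = peak_tri_rate(2, 880)  # dual graphics engines

print(f"GTX 560 Ti: {gtx_560_ti} Mtris/s")  # 1644 Mtris/s
print(f"HD 6970:    {hd_6970} Mtris/s")     # 1760 Mtris/s
```

On paper the two are within 10% of each other, which is why the interesting comparison is how each architecture holds that rate as tessellation factors climb.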

Against the AMD 5800 and 6800 series, the GTX 560 enjoys a solid advantage, as it’s able to retire twice as many triangles per clock as either architecture. And while it falls behind both the GTX 480 and GTX 580, the otherwise faster Radeon HD 6970 is close at times – at moderate tessellation the 6970 has quite the lead, but the two are neck-and-neck at extreme tessellation, where triangle throughput and the ability to efficiently handle high tessellation factors count for everything. Though since Heaven is a synthetic benchmark at the moment (the DX11 engine isn’t currently used in any games) we’re less concerned with performance relative to AMD’s cards and more concerned with performance relative to the other NVIDIA cards.

Microsoft’s Detail Tessellation sample program showcases NVIDIA’s bet on tessellation performance even more clearly. NVIDIA needs very high tessellation factors to shine compared to AMD’s cards. Meanwhile against the GTX 460 1GB our gains are a bit more muted; even though this is almost strictly a theoretical test, the GTX 560 only gains 30% on the GTX 460. Ultimately while the additional SM unlocks another tessellator on NVIDIA’s hardware, it does not unlock a higher triangle throughput rate, which is dictated by the GPCs.
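Since both GF104 and GF114 carry 2 GPCs, triangle setup should scale only with the core clock. A hedged calculation (the 675MHz and 822MHz reference clocks are our assumed figures) shows why the observed 30% lands between the raster-limited and shader-limited bounds:

```python
# Triangle setup is dictated by GPC count x core clock. Both the
# GTX 460 1GB and GTX 560 Ti have 2 GPCs, so raster throughput
# scales only with core clock (clocks assumed, not from the article).
clock_scaling = 822 / 675 - 1                      # ~22%: raster-limited bound
shader_scaling = (384 * 1644) / (336 * 1350) - 1   # ~39%: shader-limited bound

# The ~30% gain observed in the Detail Tessellation sample falls between
# these bounds: the extra SM adds another tessellator, but no extra
# raster engine.
print(f"clock-only: +{clock_scaling:.0%}, shader: +{shader_scaling:.0%}")
```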

Comments

  • Nimiz99 - Tuesday, January 25, 2011 - link

    One of my buddies has a C2D 8500 system OC'd to 3.5, I think. He got himself a 5870 (overclocked) to game. The problem we ran into was that the C2D is too slow to handle games like Civ5 that rely heavily on the CPU to keep up (you can still play the game, but it's literally wasting the 5870, with noticeable lag from the chip). Basically, he is upgrading now to a Sandy Bridge. I'd wager some of the older i7's or maybe even a Thuban (OC'd to 3.8 with a good HT overclock) could manage, but why bother when a new architecture is out from Intel (or AMD later in the year).
    So enjoy your new build ;),
    Nimiz
  • Beenthere - Tuesday, January 25, 2011 - link

    Over the last couple years Nvidia has really struggled and they may be on the ropes at this point. They have created a lot of their own problems with their arrogance so we'll see how it all plays out.
  • kilkennycat - Tuesday, January 25, 2011 - link

    eVGA GTX 560 Ti "Superclocked": Core 900MHz, Shader 1800MHz, Memory 4212MHz, $279.99

    ~ 10% factory-overclock for $20 extra, together with a lifetime warranty (if you register within 30 days) ain't too shabby....
  • Belard - Tuesday, January 25, 2011 - link

    Sure, the name shouldn't be a big deal... but every year or so, Nvidia comes up with a new marketing product name that is meaningless and confusing.

    Here is the full product name:

    GeForce GTX 560 Ti

    But in reality, the only part that is needed or makes ANY sense is:
    GeForce 560

    GTX / GTS / GT are worthless, unless there were a GTX 560, GTS 560 and GT 560, much like the older 8800 series.

    Ti only adds to this idiotic mess. Might as well bring back Ultra, Pro or MX... so perhaps Nvidia will come out with the "GT 520 MX"?

    The product itself is solid, why turn it into something stupid with your marketing department?

    AMD does it right (mostly): the "Radeon 6870", that's it. DUH.
  • omelet - Tuesday, January 25, 2011 - link

    Yeah. Not that it really matters. And while this might be what you meant by "mostly", note that AMD's naming was pretty confusing this generation, with the 68xx having lower performance than the 58xx.

    But I don't see why they readopted the Ti moniker.
  • Sufo - Wednesday, January 26, 2011 - link

    No, that's only a result of the 5xxx series being stupidly named. Using 5970 for a dual-chip part was the error; use an x2 suffix or something. AMD is back on track with the 6xxx naming convention... well, until we see what they do with the 6-series dual-chip card.
  • Belard - Thursday, January 27, 2011 - link

    The model numbers have been consistent since the 3000 series:

    x800 is top
    x700 is high-end mid-range (sub-$200)
    x600 is mid-range (sub-$150)
    x400~500 is low-end ($50~60)
    x200~300 is for desktop or HTPC cards.

    AMD said they changed because they didn't want people to confuse the 5750/5770 cards with the 6000 series. Which is completely stupid... so instead they confuse everyone with all the cards.

    If the 6800s were called 6700s - they would have been easily faster than any of the 5700s and at least somewhat equal to the 5800s (sometimes slower, others faster). Instead, we have "6850" that is slower than the 5850.

    The prices are still a bit high, yet far cheaper than the 5800 series, where a 5850 was $300+ and the 5870 was $400. But by all means, I'd rather spend $220 on a 6870 than $370 on today's 5870s.

    Anyways, I'm still using a 4670 in my main computer. When I do my next upgrade, I'll spend about $200 at the most and want at least 6870-level performance, which is still about 4x faster than what I have now. Noise & heat are very high on my list; I paid $15 extra for the version of my 4670 with the better cooler. Perhaps in 6 months, the AMD 7000 or GeForce 700 series will be out.
  • marraco - Tuesday, January 25, 2011 - link

    This is the first time I've seen a radiator geometrically aligned with the direction of the airflow from the fan.

    Obviously it increases the efficiency of the fan, increasing the flow of air across the radiator and reducing noise.

    It's such an obvious enhancement in air cooling that I don't understand why CPU coolers don't use it.
  • strikeback03 - Tuesday, January 25, 2011 - link

    I wouldn't be surprised if in some cases the increase in fin surface area (from having a bunch of straight fins packed more closely together) produces better cooling than having a cleaner airpath.
  • MeanBruce - Wednesday, January 26, 2011 - link

    You should check out the four Asus DirectCU II three-slot coolers that came out today for the GTX 580, 570, and the HD 6970 and 6950, each using two 100mm fans, five heatpipes and three slots of pure metal. They claim you can easily fit two of them on ATX for SLI and CB?
