Compute & Tessellation

Moving on from our look at gaming performance, we have our customary look at compute performance, bundled with a look at theoretical tessellation performance. Unlike our gaming benchmarks, where the architectural differences between GF114 and GF110 are largely irrelevant, those differences can become much more important in a compute-bound situation, depending on just how much ILP can be extracted from the workloads the GTX 560 Ti is running.

Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes.
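
Purely as an illustration of the kind of work this benchmark is timing, below is a minimal sketch of GPU block decompression in which one thread expands one texel of a 4x4 compressed block. To be clear, this is not Civ V's actual shader: it's written in CUDA rather than DirectCompute, it uses a simplified made-up block format (two RGBA8 endpoints plus sixteen 2-bit selectors) instead of a real BCn/DXT format, and every name in it is hypothetical.

    // Toy block-decompression kernel: NOT Civ V's real DirectCompute shader,
    // just the general pattern of expanding compressed texture blocks on the GPU.
    #include <cuda_runtime.h>
    #include <stdint.h>

    struct Block {
        uchar4   c0, c1;      // two endpoint colors
        uint32_t indices;     // 16 x 2-bit selectors, one per texel in the 4x4 block
    };

    __device__ uchar4 blend(uchar4 a, uchar4 b, int sel)
    {
        int w = sel * 85;     // sel 0..3 -> weights 0, 85, 170, 255
        return make_uchar4((a.x * (255 - w) + b.x * w) / 255,
                           (a.y * (255 - w) + b.y * w) / 255,
                           (a.z * (255 - w) + b.z * w) / 255,
                           255);
    }

    __global__ void decompress(const Block *blocks, uchar4 *texels,
                               int blocks_x, int blocks_y)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;   // texel coordinates
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= blocks_x * 4 || y >= blocks_y * 4) return;

        const Block &blk = blocks[(y / 4) * blocks_x + (x / 4)];
        int sel = (blk.indices >> (((y % 4) * 4 + (x % 4)) * 2)) & 0x3;
        texels[y * blocks_x * 4 + x] = blend(blk.c0, blk.c1, sel);
    }

    int main()
    {
        const int bw = 256, bh = 256;                    // a 1024x1024 texture
        Block *blocks; uchar4 *texels;
        cudaMallocManaged(&blocks, bw * bh * sizeof(Block));
        cudaMallocManaged(&texels, bw * 4 * bh * 4 * sizeof(uchar4));
        cudaMemset(blocks, 0, bw * bh * sizeof(Block));  // contents don't matter here
        dim3 threads(16, 16), grid((bw * 4 + 15) / 16, (bh * 4 + 15) / 16);
        decompress<<<grid, threads>>>(blocks, texels, bw, bh);
        cudaDeviceSynchronize();
        cudaFree(blocks); cudaFree(texels);
        return 0;
    }

The benchmark's score is essentially how many times per second a loop of this shape can be re-run over the textures for one of the game's leader scenes.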

Under our Civilization V compute benchmark we have a couple of different things going on even when we just look at the NVIDIA cards. Compared to the GTX 460 1GB, the GTX 560 Ti enjoys a 31% performance advantage; this is less than the theoretical maximum of 39%, but not far off from the performance advantages we've seen in most games. Meanwhile the GTX 470 is practically tied with the GTX 560 Ti even though on paper the GTX 560 Ti has around a 15% theoretical performance advantage. This ends up being a solid case of the limitations of ILP coming into play, as clearly the GTX 560 Ti isn't maximizing the use of its superscalar shaders. Or to put it another way, it's an example of why NVIDIA isn't using a superscalar design on their Tesla products.
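
To make the ILP point concrete: each GF114 SM pairs 48 CUDA cores with two warp schedulers that can each dual-issue, so roughly a third of the ALUs only get fed when a warp offers a second, independent instruction to co-issue. The CUDA sketch below is strictly our own illustration (nothing from Civ V or NVIDIA's compiler); it contrasts a dependency chain, which exposes almost no ILP, with four independent accumulators that give a superscalar scheduler something to work with.

    // Illustrative only: contrasting a dependency chain (little ILP) with
    // independent accumulators (ILP a superscalar scheduler can exploit).
    #include <cuda_runtime.h>

    // Every multiply-add depends on the previous one, so within a warp only
    // one instruction of this chain is ever ready at a time.
    __global__ void dependent_chain(const float *in, float *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        float a = in[i];
        for (int k = 0; k < 64; ++k)
            a = a * 1.0001f + 0.5f;           // serial dependency
        out[i] = a;
    }

    // Four accumulators are updated independently inside each iteration,
    // giving the hardware co-issuable instructions from the same warp.
    __global__ void independent_chains(const float *in, float *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        float a = in[i], b = a + 1.0f, c = a + 2.0f, d = a + 3.0f;
        for (int k = 0; k < 16; ++k) {
            a = a * 1.0001f + 0.5f;           // these four multiply-adds have
            b = b * 1.0002f + 0.5f;           // no dependencies on one another
            c = c * 1.0003f + 0.5f;
            d = d * 1.0004f + 0.5f;
        }
        out[i] = a + b + c + d;
    }

    int main()
    {
        const int n = 1 << 20;
        float *in, *out;
        cudaMalloc(&in, n * sizeof(float));
        cudaMalloc(&out, n * sizeof(float));
        cudaMemset(in, 0, n * sizeof(float));
        dependent_chain<<<(n + 255) / 256, 256>>>(in, out, n);
        independent_chains<<<(n + 255) / 256, 256>>>(in, out, n);
        cudaDeviceSynchronize();
        cudaFree(in); cudaFree(out);
        return 0;
    }

Both kernels issue the same number of multiply-adds per thread; the difference is whether adjacent instructions are independent, and workloads short on that kind of independence are exactly where GF114 behaves more like a 32-core-per-SM part.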

Meanwhile this benchmark has always favored NVIDIA’s architectures, so in comparison to AMD’s cards there’s little to be surprised about. The GTX 560 Ti is well in the lead, with the only AMD card it can’t pass being the dual-GPU 5970.

Our second GPU compute benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. While it's still in beta, SmallLuxGPU recently hit a milestone by implementing a complete ray tracing engine in OpenCL, allowing it to fully offload the process to the GPU. It's this ray tracing engine we're testing.
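
For a sense of what "fully offloading" a ray tracer means in practice, the sketch below shows the basic one-thread-per-ray pattern with a single ray-sphere intersection test. It is only a toy: SmallLuxGPU's actual engine is written in OpenCL (we use CUDA here for consistency with our other sketches), traverses acceleration structures over entire scenes, and handles sampling and shading, none of which appears here; all of the names are hypothetical.

    // Toy one-thread-per-ray kernel; SmallLuxGPU's actual engine is OpenCL
    // and vastly more complete. This only shows the offload pattern.
    #include <cuda_runtime.h>
    #include <math.h>

    struct Ray { float3 origin, dir; };          // dir assumed normalized

    // Distance to the nearest hit on a sphere, or -1.0f on a miss.
    __device__ float intersect_sphere(const Ray &ray, float3 center, float radius)
    {
        float3 oc = make_float3(ray.origin.x - center.x,
                                ray.origin.y - center.y,
                                ray.origin.z - center.z);
        float b = oc.x * ray.dir.x + oc.y * ray.dir.y + oc.z * ray.dir.z;
        float c = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - radius * radius;
        float disc = b * b - c;
        if (disc < 0.0f) return -1.0f;
        float t = -b - sqrtf(disc);
        return (t > 0.0f) ? t : -1.0f;
    }

    __global__ void trace_primary_rays(const Ray *rays, float *hit_t, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        // A real renderer would walk a BVH over the whole scene; a single
        // hard-coded sphere keeps the sketch short.
        hit_t[i] = intersect_sphere(rays[i], make_float3(0.f, 0.f, -5.f), 1.f);
    }

    int main()
    {
        const int n = 1 << 20;
        Ray *rays; float *hit_t;
        cudaMallocManaged(&rays, n * sizeof(Ray));
        cudaMallocManaged(&hit_t, n * sizeof(float));
        for (int i = 0; i < n; ++i)
            rays[i] = { make_float3(0.f, 0.f, 0.f), make_float3(0.f, 0.f, -1.f) };
        trace_primary_rays<<<(n + 255) / 256, 256>>>(rays, hit_t, n);
        cudaDeviceSynchronize();
        cudaFree(rays); cudaFree(hit_t);
        return 0;
    }

SLG's rays/second score is, at heart, how quickly a GPU can chew through millions of independent per-ray jobs of this shape.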

SmallLuxGPU is the other test in our suite where NVIDIA's drivers significantly revised our numbers. Where this test previously favored raw theoretical performance, giving the vector-based Radeons an advantage, NVIDIA has now shot well ahead. Given the rough state of both AMD's and NVIDIA's OpenCL drivers, we're attributing this to bug fixes or possibly enhancements in NVIDIA's OpenCL driver, with the former seeming particularly likely. NVIDIA is not alone when it comes to driver fixes, however; AMD has seen a similar uptick on the newly released 6900 series. It's not nearly the leap NVIDIA saw, but it's good for around 25%-30% more rays/second under SLG. This appears to be attributable to further refinement of AMD's VLIW4 shader compiler, which, as we have previously mentioned, stands to gain a good deal of performance as AMD works on optimizing it.

So where does SLG stack up after the latest driver enhancements? With NVIDIA rocketing to the top, they're now easily dominating this benchmark. The GTX 560 Ti is now slightly ahead of the 6970, never mind the 6950 1GB, over which it has a 33% lead. Rather than being a benchmark that showed the advantage of having lots of theoretical compute performance, this is now a benchmark that seems to favor NVIDIA's compute-inspired architecture.

Our final compute benchmark is Folding@Home. Given NVIDIA's focus on compute for Fermi, cards such as the GTX 560 Ti can be particularly interesting for distributed computing enthusiasts, who are usually looking for a compute card first and a gaming card second.

Against the senior members of the GTX 500 series, and even the GTX 480, the GTX 560 Ti is still well behind, but at the same time Folding@Home does not look like it significantly penalizes the GTX 560 Ti's superscalar architecture.

At the other end of the spectrum from GPU computing performance is GPU tessellation performance, used exclusively for graphical purposes. With Fermi NVIDIA bet heavily on tessellation, and as a result they do very well at very high tessellation factors. With 2 GPCs the GTX 560 Ti can retire 2 triangles/clock, the same rate as the Radeon HD 6900 series, so this should be a good opportunity to look at theoretical architectural performance versus actual performance.
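
As a quick back-of-the-envelope check on that theoretical figure, and assuming the reference core clocks (822MHz for the GTX 560 Ti, 880MHz for the Radeon HD 6970) with triangle setup running at the core clock on both parts, 2 triangles per clock works out to roughly:

    \[
    \begin{aligned}
    R_{\text{GTX 560 Ti}} &\approx 2\ \tfrac{\text{tris}}{\text{clock}} \times 822\,\text{MHz} \approx 1.64\ \text{billion tris/s}\\
    R_{\text{HD 6970}}    &\approx 2\ \tfrac{\text{tris}}{\text{clock}} \times 880\,\text{MHz} \approx 1.76\ \text{billion tris/s}
    \end{aligned}
    \]

Both numbers are upper bounds that assume the geometry front-end never stalls, which is precisely the assumption the synthetic tests that follow put to the test.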

Against the AMD 5800 and 6800 series the GTX 560 Ti enjoys a solid advantage, as it's able to retire twice as many triangles per clock as either architecture. And while it falls behind both the GTX 480 and GTX 580, it stays close to the otherwise faster Radeon HD 6970 at times: at moderate tessellation the 6970 has quite the lead, but the two are neck-and-neck at extreme tessellation, where triangle throughput and the ability to efficiently handle high tessellation factors count for everything. Though since Heaven (Unigine's DX11 benchmark) is a synthetic benchmark at the moment (the DX11 engine isn't currently used in any games), we're less concerned with performance relative to AMD's cards and more concerned with performance relative to the other NVIDIA cards.

Microsoft’s Detail Tessellation sample program showcases NVIDIA’s bet on tessellation performance even more clearly. NVIDIA needs very high tessellation factors to shine compared to AMD’s cards. Meanwhile against the GTX 460 1GB our gains are a bit more muted; even though this is almost strictly a theoretical test, the GTX 560 only gains 30% on the GTX 460. Ultimately while the additional SM unlocks another tessellator on NVIDIA’s hardware, it does not unlock a higher triangle throughput rate, which is dictated by the GPCs.

Comments

  • SolMiester - Wednesday, January 26, 2011

    Hi, can we please have some benchies on this feature... While Surround is not an option for the single nVidia card, 3D is, and it is hard to judge performance with it enabled. Readers need to know if 3D is a viable option with this card at perhaps 16x10 or 19x10.

    Ta
  • DarknRahl - Wednesday, January 26, 2011

    I agree with some of the other comments and find the conclusion rather odd. I didn't really get why all the comparisons were done with the 460 either; yes, this is the replacement card for the 460, but that isn't really too relevant as far as a purchasing decision goes at this moment in time. I found HardOCP's review far more informative, especially as they get down to brass tacks: price to performance. In both instances the 560 doesn't make too much sense.

    I still enjoy reading your reviews of other products, particularly power supplies and CPUs.
  • kallogan - Thursday, January 27, 2011

    Put those damn power connectors at the top of the graphics card, think about mini-ITX users!!! Though this one is 9 inches, it should fit in my Sugo SG05 anyway.
  • neo_moco - Thursday, January 27, 2011

    I really don't understand the conclusion:

    The Radeon 6950 wins hands down at 1920x and 2560x in almost all important games: Crysis, Metro, Bad Company 2, Mass Effect 2 and Wolfenstein.

    The GeForce only wins the games that not a lot of people play: Civ 5, BattleForge, HAWX, DiRT 2.

    Add to that other tests: the 6950 wins in the popular Call of Duty, and in Vantage.
    In 3DMark 11 the GeForce is 15% weaker (Guru3D), so the conclusion as I see it: the Radeon 6950 1GB is approx. 5-10% better, and not the other way around.
  • neo_moco - Thursday, January 27, 2011

    Fifteen months after the Radeon 5850 launch we get a card 10% better for the same price; I don't get the enthusiasm of some people over these cards; it's kind of lame.
  • HangFire - Thursday, January 27, 2011

    "By using a center-mounted fan, NVIDIA went with a design that recirculates some air in the case of the computer instead of a blower that fully exhausts all hot air."

    I don't think I've ever seen a middle or high end NVIDIA card that fully exhausted all hot air. Maybe it exists, I certainly haven't owned them all, perhaps in the customized AIB vendor pantheon there have been a few.

    This is not just a nitpick. When swapping out an 8800GT to an 8800 Ultra a few years back, I thought I was taking a step forwards in case temperatures, because the single-slot GT was internally vented and the Ultra had that second slot vent. I didn't notice the little gap, or the little slots in the reference cooler shroud.

    That swap began a comical few months of debugging and massive case ventilation upgrades. Not just the Ultra got hot, everything in that case got hot. Adding a second 120mm intake fan, another exhaust, and an across-the-PCI-slots 92mm fan finally got things under control (no side panel vent). Dropping it into a real gamer case later was even better. (No, I didn't pay $800 for the Ultra, either.)

    I'm not a fan of, um, graphics card fans that blow in two directions at once, I call them hard drive cookers, and they can introduce some real challenges in managing case flow. But I no longer run under the assumption that a rear-venting double slot cooler is necessarily better.

    I'd like to see some case-cooker and noise testing with the new GTX 560 Ti reference, some AIB variants, and similar wattage rear venting cards. In particular, I'd like to see what temps a hard drive forward of the card endures, above, along, and below.
  • WRH - Saturday, January 29, 2011

    I know that comparing on-CPU graphics with GPU cards like the ones on this list would be like comparing a Prius to a Ferrari, but I would like to see the test results just the same. Sandy Bridge's on-CPU GPUs come in two models (HD Graphics 2000 and 3000). It would be interesting to see just how far below the Radeon HD 4870 they are, and how the 2000 and 3000 compare to each other.
