Compute Performance

As always, we'll start with our DirectCompute game example, Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. While DirectCompute is used in many games, this is one of the few games with a benchmark that can isolate the use of DirectCompute and its resulting performance.

Compute: Civilization V

Our Civilization V compute benchmark is just that, a compute benchmark, so our results aren’t too surprising. This is one of the few compute tests NVIDIA does well at, so the GTX 650 Ti Boost is close to both Radeon cards, and not all that far behind the GTX 660 either.

Our next benchmark is LuxMark 2.0, the official benchmark of SmallLuxGPU 2.0. SmallLuxGPU is an OpenCL-accelerated ray tracer that is part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years as ray tracing maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.

Compute: LuxMark 2.0

Moving on to LuxMark, we quite frankly transition into a more normal compute benchmark pattern for NVIDIA, which sees Kepler flopping. The GTX 650 Ti Boost can’t get even remotely close to a 7770, let alone the 7850. On the NVIDIA side it doesn’t help that, since this is a compute benchmark, the GTX 650 Ti Boost gains fairly little over the GTX 650 Ti.

Our 3rd benchmark set comes from CLBenchmark 1.1. CLBenchmark contains a number of subtests; we’re focusing on the most practical of them, the computer vision test and the fluid simulation test. The former is a useful proxy for computer imaging tasks where systems are required to parse images and identify features (e.g. humans), while fluid simulations are common in professional graphics work and games alike.

Compute: CLBenchmark 1.1 Computer Vision

Compute: CLBenchmark 1.1 Fluid Simulation

CLBenchmark is much the same as LuxMark, with NVIDIA cards bringing up the rear. The fluid simulation ends up being the more painful of the two benchmarks for the GTX 650 Ti Boost, clocking in at less than 1/3rd the performance of the 7850.

Moving on, our 4th compute benchmark is FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance. Each precision has two modes, explicit and implicit, the difference being whether water atoms are included in the simulation, which adds quite a bit of work and overhead. This is another OpenCL test, as Folding @ Home is moving exclusively to OpenCL this year with FAHCore 17.

Compute: Folding @ Home: Explicit, Single Precision

Compute: Folding @ Home: Implicit, Single Precision

NVIDIA still struggles at compute with FAHBench – the move to OpenCL isn’t doing them any favors – but it’s not the blowout that was our last two benchmarks. Interestingly, explicit favors NVIDIA more than implicit, which may mean NVIDIA is handling the overhead better than AMD is. Still, any Folding @ Home users will be far better served by AMD than NVIDIA here.

Our 5th compute benchmark is Sony Vegas Pro 12, an OpenGL and OpenCL video editing and authoring package. Vegas can use GPUs in a few different ways, the primary uses being to accelerate the video effects and compositing process itself, and in the video encoding step. With video encoding being increasingly offloaded to dedicated DSPs these days we’re focusing on the editing and compositing process, rendering to a low CPU overhead format (XDCAM EX). This specific test comes from Sony, and measures how long it takes to render a video.

Compute: Sony Vegas Pro 12 Video Render

Vegas is another OpenCL benchmark, and another benchmark in which NVIDIA brings up the rear. Certainly the additional compute performance of the GTX 650 Ti Boost over the GTX 650 Ti is helping NVIDIA here, but it can’t make up for a gap of over 30 seconds.

Wrapping things up, our final compute benchmark is an in-house project developed by our very own Dr. Ian Cutress. SystemCompute is our first C++ AMP benchmark, utilizing Microsoft’s simple C++ extensions to allow the easy use of GPU computing in C++ programs. SystemCompute in turn is a collection of benchmarks for several different fundamental compute algorithms, as described in this previous article, with the final score represented in points. DirectCompute is the compute backend for C++ AMP on Windows, so this forms our other DirectCompute test.

Compute: SystemCompute v0.5.7.2 C++ AMP Benchmark

SystemCompute mixes things up a bit with its multiple sub-benchmarks, but it still doesn’t change the fact that Kepler and the GTX 650 Ti Boost just don’t do that well in most compute scenarios. 68K points is enough to tie the 6870 of all things, itself not a particularly good compute card. Otherwise the bar is set by AMD at over 100K points.
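To put the SystemCompute gap in proportion, the two scores quoted above can be turned into a relative figure. The snippet below is a minimal sketch and not part of SystemCompute itself: the helper name `relative_performance` is hypothetical, and the 100,000-point AMD figure is only the lower bound stated in the text, so the result should be read as an upper bound on where the GTX 650 Ti Boost stands.

```python
def relative_performance(score, baseline, higher_is_better=True):
    """Return score relative to baseline, where 1.0 means parity.

    For time-based results (lower is better) the ratio is inverted, so a
    value above 1.0 always means "faster than the baseline".
    """
    if score <= 0 or baseline <= 0:
        raise ValueError("scores must be positive")
    return score / baseline if higher_is_better else baseline / score


# Figures taken from the text: roughly 68K points for the GTX 650 Ti Boost,
# and "over 100K" for the AMD cards (100_000 is an assumed lower bound).
gtx_650_ti_boost = 68_000
amd_floor = 100_000

ratio = relative_performance(gtx_650_ti_boost, amd_floor)
print(f"GTX 650 Ti Boost reaches at most {ratio:.0%} of the AMD score")
```

The `higher_is_better` flag is there because this page mixes point-based results with time-based ones such as the Vegas render, where a smaller number is the better result.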


78 Comments


  • royalcrown - Thursday, March 28, 2013

    yeah, but the $$$ of a 660 is dropping every week, I just don't really see the point of the 650 ti when you have the 650 and 660 and they all have overclocked versions as well. a few places have the 2 gig 660 for $199.00
  • royalcrown - Thursday, March 28, 2013

    well, if the new 650 is 149, then I guess that'd be a great price performance vs the 660. I suppose it depends on what they cost in real life.
  • SAAB_340 - Tuesday, March 26, 2013

    Is it just me thinking the 1GB model might be a bad idea, given that these cards with the 192-bit memory bus have asymmetrical memory placement? The card only has 768MB of the memory at full bandwidth while the last 256MB will only give a 3rd of the bandwidth. (it's the same with the 2GB card but there 1.5GB has full bandwidth.) 768MB is not much by today's standards. Looking forward to the test showing how much that will impact performance.
  • Oxford Guy - Tuesday, March 26, 2013

    It's absurd, just like the AMD 1 GB card that was just announced. I've read that Skyrim with high resolution textures needs 2 GB at minimum and I doubt most people consider Skyrim a high-end game.
  • Parhel - Tuesday, March 26, 2013

    The high resolution texture pack didn't really affect memory usage that much when I installed it. It was below 1GB both before and after. That's at 2560x1600, no AA. Maybe with mods it's a different story, but I think if you're trying to show where 1GB hits a wall, you'd be better off starting with a different game.
  • mczak - Wednesday, March 27, 2013

    Personally I'd think it would make more sense to just have a 1.5GB card (at say right between the 149$ of the 1GB model and the 169$ of the 2GB model). All the same performance characteristics as the 2GB model (as you say those asymmetric configurations are a little dubious or at least suspect anyway) while being cheaper. But marketing doesn't like 1.5GB cards (and as intended competitor of 7850 2GB of course "looks" much better).
  • drew_afx - Wednesday, March 27, 2013

    How about Performance per dollar(retail) comparison for these very similarly spec'd cards?
    Make up some metric for 3d games(dx9/10/11), encoding/decoding, OpenCL, etc
    Because a lot of games are CPU intensive, for potential buyers, FPS comparison on a specific benchmarking setup is not going to reflect equally in real life.
    Also if a game can run 60+ min. fps & maybe 75fps avg., then the card is as good as it can get for average people. This comparison proves X is better than Y when used with top of the line CPU Mobo RAM combo, but that's it. Many don't go for $2000+ gaming computer setup and put sub $170 GPU in it. What about overclocking potential? It's like comparing non-K cpu to unlocked one (just to put it in perspective)
  • CiccioB - Wednesday, March 27, 2013

    Still, the game list is quite obsolete.
    Is it not time to replace Crysis: warhead with Crysis3 and Dirt: Showdown with Dirt3?
    And adding Skyrim? Last Tomb Raider?
    Gamers would like to know how today games run on these cards, not only if one GPU is faster than another playing ancient games with obsolete engines.

    This thing has already been pointed out during Titan's review. There someone suggested that games choice has been made to review games that are better on AMD rather than nvidia GPUs.
    However, no answer was made, either to give reasons on why so many old obsolete games or whether the list was going to be changed/enlarged.
    Still, new games are not considered for no apparent reason.
    After having spent so many efforts in upgrading the site's appearance, which I like very much, it would be nice also to spend a bit of time to make a new game benchmark suite. It's 2013 and many games have been published after Crysis: warhead and Dirt: showdown.
    Thanks in advance
  • Ryan Smith - Friday, March 29, 2013

    We'll be adding two more games next month (or whenever I can find the time to validate them). Crysis: Warhead isn't going anywhere since it's our one legacy title for comparing DX10 cards to. And DiRT: Showdown is newer than DiRT 3, not older. It was Showdown that we replaced 3 with. Skyrim was also removed, since it's badly CPU limited on higher-end cards.
  • medi01 - Wednesday, March 27, 2013

    Any reason 7850 and not 7790 (direct competitor) is marked black?
