Compute Performance

As always, we'll start with our DirectCompute game example, Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. While DirectCompute is used in many games, this is one of the only games with a benchmark that can isolate the use of DirectCompute and its resulting performance.

Compute: Civilization V

Our Civilization V compute benchmark is just that, a compute benchmark, so our results aren’t too surprising. This is one of the few compute tests NVIDIA does well at, so the GTX 650 Ti Boost is close to both Radeon cards, and not all that far behind the GTX 660 either.

Our next benchmark is LuxMark 2.0, the official benchmark of SmallLuxGPU 2.0. SmallLuxGPU is an OpenCL accelerated ray tracer that is part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years as ray tracing maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.
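
To illustrate why ray tracing maps so well to GPUs, here is a minimal sketch in Python. It is not taken from SmallLuxGPU; the single-sphere scene, the camera setup, and the `hit_sphere`, `trace_pixel` and `render` helpers are all assumptions made for illustration. The point is that each pixel's ray is computed without reference to any other pixel, which is the property that lets a GPU run one work-item per ray.

```python
import math

def hit_sphere(origin, direction, center, radius):
    """Return the distance along the ray to the nearest hit, or None for a miss."""
    # Solve |origin + t*direction - center|^2 = radius^2 for t (a quadratic in t).
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # the ray misses the sphere
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t > 0.0 else None

def trace_pixel(x, y, width, height):
    """Shade one pixel; the result depends only on (x, y), never on other pixels."""
    # Hypothetical camera at the origin looking down -z, image plane at z = -1.
    u = 2.0 * (x + 0.5) / width - 1.0
    v = 1.0 - 2.0 * (y + 0.5) / height
    t = hit_sphere((0.0, 0.0, 0.0), (u, v, -1.0), (0.0, 0.0, -3.0), 1.0)
    return 1 if t is not None else 0

def render(width, height):
    # On a GPU, every iteration of this double loop could run in parallel.
    return [[trace_pixel(x, y, width, height) for x in range(width)]
            for y in range(height)]

if __name__ == "__main__":
    for row in render(16, 8):
        print("".join("#" if p else "." for p in row))
```

A real renderer traces many bounces per ray and far more complex scenes, but the per-ray independence shown here carries over.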

Compute: LuxMark 2.0

Moving on to LuxMark, we quite frankly transition into a more normal compute benchmark pattern for NVIDIA, which sees Kepler flopping. The GTX 650 Ti Boost can’t get even remotely close to a 7770, let alone the 7850. On the NVIDIA side it doesn’t help that since this is a compute benchmark the GTX 650 Ti Boost gains fairly little over the GTX 650 Ti.

Our 3rd benchmark set comes from CLBenchmark 1.1. CLBenchmark contains a number of subtests; we’re focusing on the most practical of them, the computer vision test and the fluid simulation test. The former is a useful proxy for computer imaging tasks where systems are required to parse images and identify features (e.g. humans), while fluid simulations are common in professional graphics work and games alike.

Compute: CLBenchmark 1.1 Computer Vision

Compute: CLBenchmark 1.1 Fluid Simulation

CLBenchmark is much the same as LuxMark, with NVIDIA cards bringing up the rear. The fluid simulation ends up being the more painful of the two benchmarks for the GTX 650 Ti Boost, clocking in at less than 1/3rd the performance of the 7850.

Moving on, our 4th compute benchmark is FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance. Each precision has two modes, explicit and implicit, the difference being whether water atoms are included in the simulation, which adds quite a bit of work and overhead. This is another OpenCL test, as Folding @ Home is moving exclusively to OpenCL this year with FAHCore 17.
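
As a rough intuition for why including water atoms adds so much work, the sketch below counts atom pairs under a naive all-pairs interaction model. The atom counts are hypothetical values chosen only for illustration, and `pair_count` is our own helper, not anything from FAHBench. Real molecular dynamics codes use distance cutoffs and neighbor lists, and implicit solvent models carry their own cost, so treat this as an upper-bound illustration rather than a description of how FAHBench behaves.

```python
def pair_count(n_atoms):
    """Number of distinct atom pairs if every atom interacts with every other."""
    return n_atoms * (n_atoms - 1) // 2

# Hypothetical system sizes, not taken from FAHBench.
protein_atoms = 2_000
water_atoms = 20_000

implicit_pairs = pair_count(protein_atoms)                # protein only
explicit_pairs = pair_count(protein_atoms + water_atoms)  # protein plus water

print(f"implicit (protein only): {implicit_pairs:,} pairs")
print(f"explicit (with water):   {explicit_pairs:,} pairs")
print(f"ratio: {explicit_pairs / implicit_pairs:.1f}x")
```

Because the pair count grows roughly with the square of the atom count, an 11x increase in atoms turns into a pair count more than 100x larger in this naive model.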

Compute: Folding @ Home: Explicit, Single Precision

Compute: Folding @ Home: Implicit, Single Precision

NVIDIA still struggles at compute with FAHBench – the move to OpenCL isn’t doing them any favors – but it’s not the blowout that was our last two benchmarks. Interestingly, explicit favors NVIDIA more than implicit, which may mean NVIDIA is handling the overhead better than AMD is. Still, any Folding @ Home users will be far better served by AMD than NVIDIA here.

Our 5th compute benchmark is Sony Vegas Pro 12, an OpenGL and OpenCL video editing and authoring package. Vegas can use GPUs in a few different ways, the primary uses being to accelerate the video effects and compositing process itself, and in the video encoding step. With video encoding being increasingly offloaded to dedicated DSPs these days we’re focusing on the editing and compositing process, rendering to a low CPU overhead format (XDCAM EX). This specific test comes from Sony, and measures how long it takes to render a video.

Compute: Sony Vegas Pro 12 Video Render

Vegas is another OpenCL benchmark, and another benchmark NVIDIA brings up the rear with. Certainly the additional compute performance of the GTX 650 Ti Boost over the GTX 650 Ti is helping NVIDIA here, but it can’t make up for a gap of over 30 seconds.

Wrapping things up, our final compute benchmark is an in-house project developed by our very own Dr. Ian Cutress. SystemCompute is our first C++ AMP benchmark, utilizing Microsoft’s simple C++ extensions to allow the easy use of GPU computing in C++ programs. SystemCompute in turn is a collection of benchmarks for several different fundamental compute algorithms, as described in this previous article, with the final score represented in points. DirectCompute is the compute backend for C++ AMP on Windows, so this forms our other DirectCompute test.

Compute: SystemCompute v0.5.7.2 C++ AMP Benchmark

SystemCompute mixes things up a bit with its multiple sub-benchmarks, but it still doesn’t change the fact that Kepler and the GTX 650 Ti Boost just don’t do that well in most compute scenarios. 68K points is enough to tie the 6870 of all things, itself not a particularly good compute card. Otherwise the bar is set by AMD at over 100K points.


78 Comments


  • Bob Todd - Tuesday, March 26, 2013 - link

    More data points are usually a good thing, but can I ask what you'd use that for? Since you can't install one in the other, beyond the novelty of knowing how close a midrange desktop card is to a halo mobile part I'm curious to know what you want out of it. It seems like on the mobile side most parts are 2-3 rungs below the desktop part of the same name throughout the lineup.
  • Hrel - Tuesday, March 26, 2013 - link

    I base card recs on how long they intend to keep it. On a budget? Ok, get something that will run 1080p for a year or two. 1GB 650 ti or 7790. Idk, I'd have to look at those two to really know between that. But then after that point spend more on a 2GB card with a 256bit memory interface. (GTX660 192bit, WTF Nvidia?) BUT, if you have some more money and want to keep the card 4 years or more, get the 7850. The 7850 will be faster now, last longer. But whatever card you'd replace the 650 ti with in a year or two for the same price will be even faster than that.

    I really don't like where the GPU market is right now. It feels stagnant. Nothing is really a good deal. Like you guys said, there is no sweet spot. The 8800GT was the card to get after the price dropped below 150. Same for the GTX460. Now to get that level of performance they expect you to shell out 220 bucks. Fuck that. I say, for now, either keep your current card or buy the cheapest one you can possibly stomach. This market needs to straighten itself out again.

    I'm keeping my GTX460 until I literally can't run games anymore. Don't really care if I have to turn off AA in new titles. Neither company has given me a reason to upgrade. Sub 200 used to be competitive.
  • just4U - Wednesday, March 27, 2013 - link

    Well.. generally speaking the 660 and the 7870 are currently enjoying the sweet spot. Neither card breaks the bank and the trade off in the +$300 range isn't so great to be a game breaker.

    The 460 was a $240 card when it launched and both of the ones I mentioned can be had for $220 if you look around.. (not including mail in rebates etc or game bundles). On average they're 70% faster than the 460 but over the past few years there's been a focus on loading temperatures, power consumption, and other features.. which is something that got kick-started around the time of the 460. Right now it's not giant leaps forward but rather, several steps to the side with a few steps forward in performance.
  • just4U - Wednesday, March 27, 2013 - link

    Also.. if the 460 was your last purchase over the 8800 then you're buying every third generation.. For you that won't come up until the next line-up/fall refresh.
  • Calinou_ - Wednesday, March 27, 2013 - link

    Want a large memory bus for cheap? Get a used GTX 570, then you have a 320 bit memory bus for the price of a 650 Ti. Then deal with the 250W in full load. :D
  • Shadowmaster625 - Tuesday, March 26, 2013 - link

    The 7850 is a 10% larger die. So with all things being equal you can expect roughly 10% more performance. Both chips contain significant fused off sections but the amount that is fused off is roughly the same in percentage terms.

    The 7850 at stock is leaving an awful lot of performance on the table. You can tell this just by looking at the power consumption. Overclock the 7850 until its power consumption matches this new nvidia card, and how much performance differential are we now talking about? Well over 10% I'm sure...

    The 650 Ti Boost is clearly more aggressively timed and configured, to squeeze out more dollars out of the enthusiast's pocket and into Nvidia's. The fact that the review doesn't really mention any of this is kind of surprising. I would say the 7850 is a better deal, based on the assumption that it has more overclocking overhead. Given the same 28nm process vs the difference in die sizes, that is surely a safe assumption.
  • mczak - Wednesday, March 27, 2013 - link

    7850 doesn't have a larger die. Pitcairn is quoted as 212mm^2 whereas gk106 is actually 221mm^2 (though the difference might just be measuring differently, i.e. including the empty space at the edges or not if those are official, not measured, figures). Pitcairn does have 10% more transistors though (I guess for whatever reason amd could pack them more densely overall).
    But yes pitcairn is faster than gk106 overall. The reason the gtx660 loses to hd7870 but the gtx650ti boost is very close to hd7850 is of course that hd7850 is a hd7870 with 20% less shader units and 14% less clocks, whereas the 650ti boost is a 660 with 20% less shader units but same clocks. And yes this shows in overclocking potential and perf/w.
  • royalcrown - Tuesday, March 26, 2013 - link

    Too many cards, just buy a 660 !
  • Spunjji - Wednesday, March 27, 2013 - link

    Too many cards, just buy a 7870!

    FTFY ;)
  • silverblue - Tuesday, March 26, 2013 - link

    Nice card. Okay, the power draw is a bit of a downside, meaning we've got a 680-7970-esque comparison again between the 7850 and 650 Ti Boost where the former performs a little better in general whilst using less power, however considering the gap to the 660 isn't that big, is it worth the extra money?

    I can definitely see people buying these; that extra 1GB will certainly help in time.
