Applications of GF100’s Compute Hardware

Last but certainly not least are the changes to gaming afforded by the improved compute/shader hardware. NVIDIA believes that by announcing the compute abilities of the GF100 so far ahead of its gaming abilities, it has given potential customers the wrong idea about NVIDIA’s direction. Certainly they’re increasing their focus on the GPGPU market, but as they’re trying their hardest to point out, most of that compute hardware has a use in gaming too.

Much of this is straightforward: the compute hardware is what processes pixel and vertex shader commands, so the additional CUDA cores in the GF100 give it much more shader power than the GT200. We also have DirectCompute, which can use the compute hardware to quickly do some things that couldn’t be done quickly via shader code, such as Screen Space Ambient Occlusion in games like Battleforge, or to take an NVIDIA example, the depth-of-field effect in Metro 2033.

Perhaps the single biggest improvement for gaming that comes from NVIDIA’s changes to the compute hardware is the benefit afforded to compute-like tasks in games. PhysX plays a big part here, as along with DirectCompute it’s going to be one of the biggest uses of compute abilities when it comes to gaming.

NVIDIA is heavily promoting the idea that GF100’s concurrent kernels and fast context switching abilities are going to be of significant benefit here. With concurrent kernels, different PhysX simulations can start without waiting for other SMs to complete the previous simulation. With fast context switching, the GPU can switch from rendering to PhysX and back again while wasting less time on the context switch itself. The result is that there’s going to be less overhead in using the compute abilities of GF100 during gaming, be it for PhysX, Bullet Physics, or DirectCompute.
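To make the concurrent kernels argument concrete, here is a toy timeline model. It is our own simplification, not NVIDIA's scheduler: the SM count of 16, the block counts per simulation, and the 1 ms per block are all assumed values chosen for illustration, and context switch cost is ignored. Each "kernel" is a PhysX-style simulation split into blocks, with one block occupying one SM for one time step.

```python
import math

def serialized_time(kernel_blocks, num_sms, block_ms=1.0):
    """Time when each kernel must fully finish before the next one starts.

    A kernel with fewer blocks than SMs still occupies a whole time step,
    leaving the remaining SMs idle.
    """
    return sum(math.ceil(blocks / num_sms) * block_ms for blocks in kernel_blocks)

def concurrent_time(kernel_blocks, num_sms, block_ms=1.0):
    """Idealized time when blocks from all kernels can share the SM pool."""
    return math.ceil(sum(kernel_blocks) / num_sms) * block_ms

# Hypothetical workload: three small simulations of 4, 6, and 10 blocks.
NUM_SMS = 16                 # assumed SM count for this sketch
physx_kernels = [4, 6, 10]   # assumed block counts, not measured data

serial_ms = serialized_time(physx_kernels, NUM_SMS)
concurrent_ms = concurrent_time(physx_kernels, NUM_SMS)
print(f"serialized: {serial_ms} ms, concurrent: {concurrent_ms} ms")
```

In this model the three small simulations take three time steps when run one after another, because none of them can fill all of the SMs, but only two steps when their blocks are allowed to share the chip. Real hardware has more constraints than this, so treat the numbers as a picture of the idea rather than a performance estimate.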

NVIDIA is big on pushing specific examples here in order to entice developers into using these abilities, and a number of demo programs will be released along with GF100 cards to showcase them. Most interesting among these is a ray tracing demo that NVIDIA is showing off. Ray tracing is something even G80 could do (albeit slowly) but we find this an interesting way for NVIDIA to go since promoting ray tracing puts them in direct competition with Intel, who has been showing off ray tracing demos running on CPUs for years. Ray tracing nullifies NVIDIA’s experience in rasterization, so to promote its use is one of the riskier things they can do in the long term.


NVIDIA's car ray tracing demo

At any rate, the demo program they are showing off is a hybrid program that showcases the use of both rasterization and ray tracing for rendering a car. As we already know from the original Fermi introduction, GF100 is supposed to be much faster than GT200 at ray tracing, thanks in large part to the L1 cache architecture of GF100. The demo we saw of a GF100 card next to a GT200 card had the GF100 card performing roughly 3x as well as the GT200 card. This specific demo still runs at less than a frame per second (0.63 on the GF100 card) so it’s by no means true real-time ray tracing, but it’s getting faster all the time. For lower quality ray tracing, certainly this would be doable in real-time.
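For a sense of scale, the figures above can be turned into frame times with some quick arithmetic. The 0.63 fps figure and the "roughly 3x" ratio come from the demo; the 30 fps target is our own assumed threshold for "real-time", and the GT200 number is only as precise as that rough ratio.

```python
gf100_fps = 0.63           # frame rate of the GF100 card in the demo
speedup_over_gt200 = 3.0   # "roughly 3x", so only an approximation
target_fps = 30.0          # assumed real-time threshold, not from NVIDIA

gf100_frame_s = 1.0 / gf100_fps                 # seconds per frame on GF100
gt200_fps = gf100_fps / speedup_over_gt200      # implied GT200 frame rate
gt200_frame_s = 1.0 / gt200_fps                 # seconds per frame on GT200
needed_speedup = target_fps / gf100_fps         # factor still missing for 30 fps

print(f"GF100: {gf100_frame_s:.2f} s per frame")
print(f"GT200: about {gt200_fps:.2f} fps, {gt200_frame_s:.2f} s per frame")
print(f"Speedup needed for {target_fps:.0f} fps: about {needed_speedup:.0f}x")
```

Under these assumptions the GF100 card needs about 1.6 seconds per frame, the GT200 card closer to 5 seconds, and this particular demo at this quality level would need to run nearly 50 times faster to reach 30 fps.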


Dark Void's turbulence in action

NVIDIA is also showing off several other demos of compute for gaming, including a PhysX fluid simulation, the new PhysX APEX turbulence effect on Dark Void, and an AI pathfinding simulation that we did not have a chance to see. Ultimately PhysX is still NVIDIA’s bigger carrot for consumers, while the rest of this is to entice developers to make use of the compute hardware through whatever means they’d like (PhysX, OpenCL, DirectCompute). Outside of PhysX, heavy use of the GPU compute abilities is still going to be some time off.


115 Comments


  • SothemX - Tuesday, March 9, 2010 - link

    Well, let's just make it simple. I am an avid gamer...I WANT and NEED power and performance. I care only about how well my games play, how good they look, and the impression they leave with me when I am done.
    I own a PS3 and am thrilled they went with Nvidia- (smart move)
    I own a PC that utilizes the 9800GT OC card....getting ready to upgrade to the new GF100 when it releases, last thing that is on my mind is how the market share is, cost is not an issue.

    Hard-Core gaming requires Nvidia. Entry-level baby boomers use ATI.

    Nvidia is just playing with their food....it's a vulgar display of power- better architecture, better programming, better gaming.
  • StevoLincolnite - Monday, January 18, 2010 - link

    [quote]So why does NVIDIA want so much geometry performance? Because with tessellation, it allows them to take the same assets from the same games as AMD and generate something that will look better. With more geometry power, NVIDIA can use tessellation and displacement mapping to generate more complex characters, objects, and scenery than AMD can at the same level of performance.[/quote]

    Might I add to that, nVidia's design is essentially "modular": they can increase and decrease their geometry performance essentially by taking units out. This however will force programmers to program for the lowest common denominator, whilst AMD's iteration of the technology is the same across the board, so essentially you can have identical geometry regardless of the chip.
  • Yojimbo - Monday, January 18, 2010 - link

    just say the minimum, not the lowest common denominator. it may look fancy but it doesn't seem to fit.
  • chizow - Monday, January 18, 2010 - link

    The real distinction here is that Nvidia's revamp of fixed-function geometry units to a programmable, scalable, and parallel Polymorph engine means their implementation won't be limited to acceleration of tessellation in games. Their improvements will benefit every game ever made that benefits from increased geometry performance. I know people around here hate to claim "winners" and "losers" when AMD isn't winning, but I think it's pretty obvious Nvidia's design and implementation is the better one.

    Fully programmable vs. fixed-function, as long as the fully programmable option is at least as fast is always going to be the better solution. Just look at the evolution of the GPU from mostly fixed-function hardware to what it is today with GF100...a fully programmable, highly parallel, compute powerhouse.
  • mcnabney - Monday, January 18, 2010 - link

    If Fermi was a winner Nvidia would have had samples out to be benchmarked by Anand and others a long time ago.

    Fermi is designed for GPGPU with gaming secondary. Goody for them. They can probably do a lot of great things and make good money in that sector. But I don't know about gaming. Based upon the info that has gotten out and the fact that reality hasn't appeared yet I am guessing that Fermi will only be slightly faster than 5870 and Nvidia doesn't want to show their hand and let AMD respond. Remember, AMD is finishing up the next generation right now - so Fermi will likely compete against Northern Isles on AMDs 32nm process in the Fall.
  • dragonsqrrl - Monday, February 15, 2010 - link

    Firstly, did you not read this article? The gf100 delay was due in large part to the new architecture they developed, an architectural shift ATI will eventually have to make if they wish to remain competitive. In other words, similarly to the g80 enabling GPU computing features/unified shaders for the first time on the PC, Nvidia invested huge resources in r&d and as a result had a next generation, revolutionary GPU before ATI.

    Secondly, Nvidia never meant to place gaming second to GPU computing, as much as you ATI fanboys would like to troll about this subject. What they're trying to do is bring GPU computing up to the level GPU gaming is already at (in terms of accessibility, reliability, and performance). The research they're doing in this field could revolutionize research into many fields outside of gaming, including medicine, astronomy, and 'yes' film production (something I happen to deal with a LOT) while revolutionizing gaming performance and feature sets as well

    Thirdly, I would be AMAZED if AMD can come out with their new architecture (their first since the hd2900) by the 3rd quarter of this year, and on the 32nm process. I just can't see them pushing GPU technology forward in the same way Nvidia has given their new business model (smaller GPUs, less focus on GPU computing), while meeting that tight deadline.
  • chewietobbacca - Monday, January 18, 2010 - link

    "Winning" the generation? What really matters?

    The bottom line, that's what. I'm sure Nvidia liked winning the generation - I'm sure they would have loved it even more if they didn't lose market share and potential profits from the fight...
  • realneil - Monday, January 25, 2010 - link

    winning the generation is a non-prize if the mainstream buyer can only wish they had one. Make this kind of performance affordable and then you'll impress me.
  • chizow - Monday, January 18, 2010 - link

    Yes and the bottom line showed Nvidia turning a profit despite not having the fastest part on the market.

    Again, my point about G80'ing the market was more a reference to them revolutionizing GPU design again rather than simply doubling transistors and functional units or increasing clockspeeds based on past designs.

    The other poster brought up performance at any given point in time, I was simply pointing out the fact that being first or second to market doesn't really matter as long as you win the generation, which Nvidia has done for the last few generations since G80 and will again once GF100 launches.
  • sc3252 - Monday, January 18, 2010 - link

    Yikes, if it is more than the original GTX 280 I would expect some loud cards. When I saw those benchmarks of Far Cry 2 I was disappointed that I didn't wait, but now that it is using more than a GTX 280 I think I may have made the right choice. While right now I want as much performance as possible, eventually my 5850 will go into a secondary pc (why I picked 5850) with a lesser power supply. I don't want to have to buy a bigger power supply just because a friend might come over and play once a week.
