Applications of GF100’s Compute Hardware

Last but certainly not least are the changes to gaming afforded by the improved compute/shader hardware. NVIDIA believes that by announcing GF100's compute abilities so far ahead of its gaming abilities, potential customers have gotten the wrong idea about NVIDIA's direction. Certainly they're increasing their focus on the GPGPU market, but as they're trying their hardest to point out, most of that compute hardware has a use in gaming too.

Much of this is straightforward: the compute hardware is the same hardware that processes pixel and vertex shaders, so the additional CUDA cores in GF100 give it much more shader power than GT200. We also have DirectCompute, which can use the compute hardware to do things that couldn't be done quickly via traditional shader code, such as the Self Shadowing Ambient Occlusion in games like Battleforge or, to take an NVIDIA example, the depth-of-field effect in Metro 2033.
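
To make the idea concrete, here is a minimal sketch of the kind of screen-space pass a game might hand to the compute units. The games cited use DirectCompute (HLSL compute shaders); this example is written in CUDA instead, and the kernel, its name, and the focalDepth parameter are all invented for illustration. It is not the Metro 2033 implementation, just a simple depth-dependent blur of the sort that maps naturally onto compute rather than pixel-shader code.

```cuda
#include <cuda_runtime.h>

// Hypothetical depth-of-field pass: each pixel is blurred by an amount
// proportional to its distance from the focal plane. Single-channel image
// for brevity; a real game would work on full color buffers.
__global__ void depthOfFieldBlur(const float* color, const float* depth,
                                 float* out, int width, int height,
                                 float focalDepth)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int idx = y * width + x;
    // Blur radius grows with distance from the focal plane, capped at 4 pixels.
    int radius = min(4, (int)(fabsf(depth[idx] - focalDepth) * 8.0f));

    float sum = 0.0f;
    int count = 0;
    for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
            int nx = min(max(x + dx, 0), width - 1);
            int ny = min(max(y + dy, 0), height - 1);
            sum += color[ny * width + nx];
            ++count;
        }
    }
    out[idx] = sum / count;
}

int main()
{
    const int width = 1920, height = 1080;
    const size_t bytes = width * height * sizeof(float);
    float *color, *depth, *out;
    cudaMalloc(&color, bytes);
    cudaMalloc(&depth, bytes);
    cudaMalloc(&out, bytes);
    cudaMemset(color, 0, bytes);   // placeholder frame data
    cudaMemset(depth, 0, bytes);

    dim3 block(16, 16);
    dim3 grid((width + 15) / 16, (height + 15) / 16);
    depthOfFieldBlur<<<grid, block>>>(color, depth, out, width, height, 0.5f);
    cudaDeviceSynchronize();

    cudaFree(color); cudaFree(depth); cudaFree(out);
    return 0;
}
```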

Perhaps the single biggest improvement for gaming that comes from NVIDIA's changes to the compute hardware is the benefit afforded to compute-like tasks in games. PhysX plays a big part here; along with DirectCompute, it's going to be one of the biggest uses of the compute abilities when it comes to gaming.

NVIDIA is heavily promoting the idea that GF100's concurrent kernels and fast context switching abilities are going to be of significant benefit here. With concurrent kernels, one PhysX simulation can start on idle SMs without waiting for the rest of the GPU to finish the previous one. With fast context switching, the GPU can switch from rendering to PhysX and back again while wasting less time on the context switch itself. The result is that there's going to be less overhead in using the compute abilities of GF100 during gaming, be it for PhysX, Bullet Physics, or DirectCompute.
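
The concurrency NVIDIA is describing is exposed to developers through streams. What follows is a minimal CUDA sketch, with toy stand-in kernels for a physics step and a post-processing step (the names and workloads are invented for illustration), showing how two independent kernels are queued on separate streams so a Fermi-class GPU can execute them concurrently rather than back to back.

```cuda
#include <cuda_runtime.h>

// Toy stand-ins for two independent per-frame workloads: a physics
// integration step and a post-processing pass. The kernels themselves are
// trivial; the point is how they are queued.
__global__ void integrateParticles(float* pos, const float* vel, int n, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) pos[i] += vel[i] * dt;
}

__global__ void tonemap(float* pixels, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) pixels[i] = pixels[i] / (pixels[i] + 1.0f);  // simple Reinhard-style curve
}

int main()
{
    const int n = 1 << 20;
    float *pos, *vel, *pixels;
    cudaMalloc(&pos, n * sizeof(float));
    cudaMalloc(&vel, n * sizeof(float));
    cudaMalloc(&pixels, n * sizeof(float));
    cudaMemset(pos, 0, n * sizeof(float));    // placeholder data
    cudaMemset(vel, 0, n * sizeof(float));
    cudaMemset(pixels, 0, n * sizeof(float));

    // Work submitted to different streams has no implied ordering, which is
    // what lets a Fermi-class GPU run the two kernels at the same time when
    // there are idle SMs available.
    cudaStream_t physics, graphics;
    cudaStreamCreate(&physics);
    cudaStreamCreate(&graphics);

    const int block = 256;
    const int grid = (n + block - 1) / block;
    integrateParticles<<<grid, block, 0, physics>>>(pos, vel, n, 0.016f);
    tonemap<<<grid, block, 0, graphics>>>(pixels, n);

    cudaDeviceSynchronize();  // wait for both streams before tearing down

    cudaStreamDestroy(physics);
    cudaStreamDestroy(graphics);
    cudaFree(pos); cudaFree(vel); cudaFree(pixels);
    return 0;
}
```

The same code runs on older hardware, but without concurrent kernel support the two launches simply execute one after the other; GF100 is the first NVIDIA GPU able to overlap them.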

NVIDIA is big on pushing specific examples here in order to entice developers into using these abilities, and a number of demo programs will be released along with GF100 cards to showcase them. Most interesting among these is a ray tracing demo that NVIDIA is showing off. Ray tracing is something even G80 could do (albeit slowly), but we find this an interesting direction for NVIDIA to take, since promoting ray tracing puts them in direct competition with Intel, who has been showing off ray tracing demos running on CPUs for years. Ray tracing also nullifies NVIDIA's experience in rasterization, so promoting its use is one of the riskier things they can do in the long term.


NVIDIA's car ray tracing demo

At any rate, the demo program they are showing off is a hybrid renderer that uses both rasterization and ray tracing to draw a car. As we already know from the original Fermi introduction, GF100 is supposed to be much faster than GT200 at ray tracing, thanks in large part to GF100's L1 cache architecture. In the demo we saw, a GF100 card performed roughly 3x better than a GT200 card. This specific demo still runs at less than a frame per second (0.63 fps on the GF100 card), so it's by no means true real-time ray tracing, but it's getting faster all the time; lower-quality ray tracing would certainly be doable in real time.


Dark Void's turbulence in action

NVIDIA is also showing off several other demos of compute for gaming, including a PhysX fluid simulation, the new PhysX APEX turbulence effect in Dark Void, and an AI pathfinding simulation that we did not have a chance to see. Ultimately PhysX is still NVIDIA's biggest carrot for consumers, while the rest of this is meant to entice developers to make use of the compute hardware through whatever means they'd like (PhysX, OpenCL, DirectCompute). Outside of PhysX, heavy use of the GPU's compute abilities is still going to be some time off.

Comments

  • FlyTexas - Monday, January 18, 2010 - link

    I have a feeling that nVidia is taking the long road here...

    The past 6 months have been painful for nVidia, however I think they are looking way ahead. At its core, the 5000 series from AMD is really just a supersized 4000 series. Not a bad thing, but nothing new either (DX11 is nice, but that'll be awhile, and multiple monitors are still rare).

    Games have all looked the same for years now. CPU and GPU power have gone WAY up in the past 5 years, but too much is still developed for DX9 (X360/PS3 partly to blame, as is Vista's poor adoption), and I suspect that even the 5000 series is really still designed around DX9 and games meant for it with a few "enhancements".

    This new chip seems designed for DX11 and much higher detailed graphics. Polygon counts can go up with this, the number of new details can really shine, but only once games are designed from scratch for it. From that point, the 6 month wait isn't a big deal, it'll be another few years before games are really designed from scratch for DX11 ONLY. Otherwise you have DX9 games with a few "enhancements" that don't add to gameplay.

    It seems like we are really skipping DX10 here, partly due to Vista's poor adoption, partly due to XP not being able to use DX10. With Windows 7 being a success and DX11 backported to Vista, I think in the next 2-3 years you'll finally see most games come out that really require Vista/7 because they will require DX10/11.

    Of course, my 260GTX still runs everything I throw at it, so until games get more complex or something else changes, I see no reason to upgrade. I thought about a 5870 as an upgrade, but why? Everything already runs fast enough, what does it get me other than some headroom? If I was still on a 8800GT, it would make sense, but I'd rather wait for nVidia to launch so the prices come down.
  • PorscheRacer - Tuesday, January 19, 2010 - link

    Well, then there's the fact ATI designed their 2000 series (and 3000 and 4000 series) to comply with the full DirectX 10 specification. NVIDIA didn't have the chips required for this spec, and talked Microsoft into castrating DX10 by only adding in a few things. Tessellation was notably left out. ATI was hung out to dry, with performance and features wasted on die. They finally got DX10.1 later on, but the damage was done.

    Sure, people complained about Vista, mostly gamers as games ran slower, but I wonder how those games would have looked if DX10 had run at the full spec (which was only marginally below the DX11 of today)?
  • Scali - Wednesday, January 27, 2010 - link

    I think you need to read this, and reconsider your statement:
    http://scalibq.spaces.live.com/blog/cns!663AD9A4F9CB0661!194.entry
  • jimhsu - Monday, January 18, 2010 - link

    I made this post in another forum, but I think it's relevant here:

    ---

    Yes, I'm beginning to see this [games becoming less GPU-limited and more CPU-limited] with more mainstream games (to repeat, Crysis is NOT a mainstream game). FLOP-wise, a high-end video card (i.e. a 5970 at 5 TFLOPS) is something like 100 TIMES the performance of a high-end CPU (an i7 at 50 GFLOPS).

    In comparison, back in 2004 we had GPUs like the 6800 Ultra (54 GFLOPS) and P4s (6 GFLOPS) (historical data here: http://forum.beyond3d.com/showthread.php?t=51677). That's 9X the performance. We've gone from 9X to 100X in a matter of 5 years. No wonder few modern games are actually pushing modern GPUs (requiring people who want to "get the most" out of their high-powered GPUs to go for multiple screens, insane AA/AF, insane detail settings, complex shaders, etc.).

    I know this is a horrible comparison, but still, it gives you an idea of the imbalance in performance. This kind of reminds me of the whole hard drive capacity vs. transfer rate argument. Today's 2 TB monsters are actually not much faster than the few-GB drives at the turn of the millennium (and even less so latency-wise).

    Personally, I think the days of being GPU-bound (for mainstream discrete GPU computing) ended when NVIDIA's 8 series launched (the 8800 GTX is perhaps the longest-lived video card ever made), and in general when the industry adopted programmable compute units (aka DirectX 10).
  • AznBoi36 - Tuesday, January 19, 2010 - link

    Actually the Radeon 9700/9800 Pro had a pretty long life too. The 9700 Pro I bought in 2002/2003 lasted me all the way to early 2007, when I bought an 8800 GTS 640MB. 4 years is pretty good. It could have lasted longer, but I was itching for a new platform and needed to get a PCI-Express card (the Radeon was AGP).
  • RJohnson - Monday, January 18, 2010 - link

    Sorry you lost all credibility when you tried to spin this bullsh*#t "Today's 2 TB monsters are actually not much faster than the few GB drives at the turn of the millennium"
    Go try and run your new rig off one of those old drives, come back and post your results in 2 hours when your system finally boots.
  • jimhsu - Monday, January 18, 2010 - link

    A fun chart. Note the performance disparity.

    http://i65.photobucket.com/albums/h204/killer-ra/V...

  • jimhsu - Monday, January 18, 2010 - link

    Disclosure: I'm still on an 8800 GTS 512, and I am under no pressure to upgrade right now. While a 58xx would be nice to have, on a single monitor I really have no need to upgrade. I may look into going i7 though.
  • dentatus - Monday, January 18, 2010 - link

    If something works well for you then there is no real reason (or need) to upgrade.

    I still run an 8800 Ultra, and it still runs many games well on a 22-inch monitor. The GT200 was really only a 50% boost over the 8 series on average. For comparison, I bought a second-hand Ultra for $60 and transplanted both of them into an i7-based system, and this really produced a significant boost over a GTX 285 in the games I liked; about 25% more performance, roughly equivalent to an HD 5850, albeit not always as smooth.

    It would be good to upgrade to a single GPU that is more than double the performance of this kind of setup. But a HD5800 series card is not in that league, and it remains to be seen if the GF100 is.
  • dentatus - Monday, January 18, 2010 - link

    I agree this chip does seem designed around new or upcoming features. Many architectural shortcomings of the GT200 chip seem to have been addressed, with the design reworked toward getting usable performance out of new API features (like tessellation).

    Anyway, to be pragmatic about things, NVIDIA's history leaves much to be desired; the gap between performance promised and performance delivered varies a lot. HardOCP mentioned the 5800 Ultra launch as a con, but there is also the G80 launch on the flip side.

    A GPU's theoretical performance and the expectations hanging around it are nothing to make choices by; wait for the real proof. Anyone recall the launch of the 'monstrous' 2900XT? A toothless beast, that one.
