We always get very excited when we see a new GPU architecture come down the pipe from ATI or NVIDIA. For the past few years, we've really just been seeing reworked versions of old parts. NV40 evolved from NV30, G70 was just a step up from NV40, and the same is true with ATI as well. Fundamentally, not much has changed since the introduction of DX9 class hardware. But today, G80 ushers in a new class of GPU architecture that truly surpasses everything currently on the market. Changes like this only come along once every few years, so we will be sure to savor the joy that discovering a new architecture brings, and this one is big.

These massive architecture updates generally coincide with the release of a new DirectX, and guess what we've got? Thus we begin today's review not with pixel shaders and transistors, but with DirectX and what it will mean for the next generation of graphics hardware, including G80.

DirectX 10

There has been quite a lot of talk about what DirectX 10 will bring to the table, and what we can expect from DX10 class hardware. Well, the hardware is finally here, but much like the situation we saw with the launch of ATI's Radeon 9700 Pro, the hardware precedes the new API. In the meantime, we can only look at how our shiny new hardware performs under DX9. Of course, we get full DX9 support, encompassing everything we've come to know and love about the current generation of hardware.

Even though we won't get to see any of the new features of DX10 and Shader Model 4.0, the performance of G80 will shine through thanks to its unified shader architecture. This will allow developers to do more with SM3.0 and DX9 while we all wait for the transition to DX10. And we can certainly still talk about what the latest installment of Microsoft's pervasive graphics API will bring to the table.

More Efficient State and Object Management

One of the major performance improvements we will see from DX10 is a reduction in overhead. Under DX9, state changes and draw calls are made so often that the API itself can become the limiting factor in performance. DX10 introduces state objects, each of which holds all of the state information for a given pipeline stage. There are five state objects in DX10: InputLayout (vertex buffer layout), Sampler, Rasterizer, DepthStencil, and Blend. Each object can change all of its stage's state in a single call, rather than requiring a separate call per attribute.
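As a rough illustration, configuring the rasterizer stage under D3D10 looks something like the minimal sketch below. This is our own example, not code from Microsoft; it assumes an already created ID3D10Device, and the descriptor values are purely illustrative:

    #include <d3d10.h>

    // A minimal sketch, assuming an already created ID3D10Device.
    // Error handling is omitted for brevity.
    ID3D10RasterizerState* CreateAndBindRasterizerState(ID3D10Device* device)
    {
        D3D10_RASTERIZER_DESC desc = {};
        desc.FillMode = D3D10_FILL_SOLID;
        desc.CullMode = D3D10_CULL_BACK;
        desc.DepthClipEnable = TRUE;

        ID3D10RasterizerState* state = NULL;
        device->CreateRasterizerState(&desc, &state);

        // One call swaps the entire rasterizer stage's state, rather
        // than a separate SetRenderState-style call per attribute as
        // under DX9.
        device->RSSetState(state);
        return state;
    }

Because the state object is immutable and created up front, the driver can validate it once at creation time instead of re-checking state on every draw call.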

Constant buffers have also been added to hold data for use in shader programs. Each shader program has access to 16 buffers of 4096 constants, and each buffer can be updated in a single function call. This greatly reduces the overhead of feeding large amounts of input to shader programs. Along similar lines, texture arrays allow much more data to be stored for use with a shader program: a texture array can hold 512 equally sized textures, and each shader is allowed 128 texture arrays (as opposed to 16 textures in DX9). The combination of 8Kx8K texture sizes with all this texture storage space will offer a huge boost in texturing ability to DX10 based games and hardware.
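In code, filling and binding a constant buffer looks something like this sketch. The PerFrameConstants layout is a hypothetical example of ours, and device creation is again assumed:

    #include <d3d10.h>

    // Hypothetical constant block; the names here are illustrative.
    struct PerFrameConstants
    {
        float worldViewProj[16]; // 4x4 matrix, updated once per frame
        float lightDir[4];
    };

    // Assumes an already created ID3D10Device; error handling omitted.
    ID3D10Buffer* UploadPerFrameConstants(ID3D10Device* device,
                                          const PerFrameConstants& data)
    {
        D3D10_BUFFER_DESC desc = {};
        desc.ByteWidth = sizeof(PerFrameConstants); // multiple of 16 bytes
        desc.Usage = D3D10_USAGE_DEFAULT;
        desc.BindFlags = D3D10_BIND_CONSTANT_BUFFER;

        ID3D10Buffer* cb = NULL;
        device->CreateBuffer(&desc, NULL, &cb);

        // The whole block goes up in one call, rather than one
        // SetVertexShaderConstantF-style call per group of registers.
        device->UpdateSubresource(cb, 0, NULL, &data, 0, 0);

        // Bind to slot 0 of the 16 constant buffer slots a shader sees.
        device->VSSetConstantBuffers(0, 1, &cb);
        return cb;
    }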

A new construct called a "view" is being introduced in DX10, which allows a single resource to be interpreted as more than one type of data at the same time. For instance, a pixel shader could render vertex data to a texture, and then a vertex shader could use a view to interpret that data as a vertex buffer. Views essentially give developers an easier way to share resources between pipeline stages.
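As a simple illustration, the sketch below creates one texture and two views onto it, so one pass can write the resource while a later pass reads it. This is our own minimal example of the straightforward texture case, not the vertex buffer scenario described above:

    #include <d3d10.h>

    // Two views onto one resource, assuming an existing ID3D10Device.
    // Pass 1 writes the texture through a render target view; pass 2
    // reads the same memory through a shader resource view.
    void CreateTwoViews(ID3D10Device* device,
                        ID3D10RenderTargetView** rtv,
                        ID3D10ShaderResourceView** srv)
    {
        D3D10_TEXTURE2D_DESC desc = {};
        desc.Width = 1024;
        desc.Height = 1024;
        desc.MipLevels = 1;
        desc.ArraySize = 1;
        desc.Format = DXGI_FORMAT_R32G32B32A32_FLOAT;
        desc.SampleDesc.Count = 1;
        desc.Usage = D3D10_USAGE_DEFAULT;
        // The resource is bindable both ways; the views decide how it
        // is interpreted at any given moment.
        desc.BindFlags = D3D10_BIND_RENDER_TARGET |
                         D3D10_BIND_SHADER_RESOURCE;

        ID3D10Texture2D* tex = NULL;
        device->CreateTexture2D(&desc, NULL, &tex);
        device->CreateRenderTargetView(tex, NULL, rtv);
        device->CreateShaderResourceView(tex, NULL, srv);
    }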

There is also a DrawAuto call, which can redraw an object without a round trip back to the CPU. Combined with predicated rendering, this should cut down on the overhead and performance impact of the large numbers of draw calls currently used in DX9.
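A rough sketch of what DrawAuto enables: stream processed geometry into a buffer in one pass, then redraw it without the CPU ever querying how many vertices were written. The setup (device, shaders, and a buffer created with both stream output and vertex buffer bind flags) is assumed here:

    #include <d3d10.h>

    // Assumes soBuffer was created with D3D10_BIND_STREAM_OUTPUT and
    // D3D10_BIND_VERTEX_BUFFER, and that shaders are already bound.
    void DrawStreamedGeometry(ID3D10Device* device, ID3D10Buffer* soBuffer,
                              UINT vertexCount, UINT stride)
    {
        UINT offset = 0;

        // Pass 1: capture the processed geometry into soBuffer on the GPU.
        device->SOSetTargets(1, &soBuffer, &offset);
        device->Draw(vertexCount, 0);

        // Unbind the stream output target so the buffer can be re-read.
        ID3D10Buffer* nullBuffer = NULL;
        device->SOSetTargets(1, &nullBuffer, &offset);

        // Pass 2: redraw it. DrawAuto pulls the vertex count from the
        // GPU side, so the CPU never asks how much geometry was written.
        device->IASetVertexBuffers(0, 1, &soBuffer, &stride, &offset);
        device->DrawAuto();
    }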

Comments

  • aweigh - Friday, November 10, 2006 - link

You can just use the program DX Tweaker to enable triple buffering in any D3D game with negligible performance impact. So you can play with your VSYNC, a high res and AA as well. :)
  • aweigh - Friday, November 10, 2006 - link

    I'm gonna buy an 88 specifically to use 4x4 SuperSampling in games. Why bother with MSAA with a card like that?
  • DerekWilson - Friday, November 10, 2006 - link

    Supersampling can make textures blurry -- especially very detailed textures.

And the impact will be much greater with longer, more detailed pixel shaders (as the shaders must be evaluated at every sub-pixel when supersampling).

    I think transparency / adaptive AA are enough.

    On your previous comment, I don't think we're to the point where we can hit triple buffering, vsync, high levels of AA AND high resolution (2560x1600) without some input lag (triple buffering plus vsync with framerates less than your refresh rate can cause problems).

    If you're talking about enabling all these options on a lower resolution lcd panel, then I can definitely see that as a good use of the hardware. And it might be interesting to look at more numbers with these type of options enabled.

    Thanks for the suggestion.
  • aweigh - Saturday, November 11, 2006 - link

I never knew that about SuperSampling. Is it something similar to Quincunx blurring? And would using a negative LOD via RivaTuner/nHancer counteract the effect?

    How about NVIDIA's Digital Sharpness setting in Color Correction? I've found a smidge of sharpening can do wonders to improve overall clarity.

    By the way, when you said Adaptive AA, were you referring to ATI cards?
  • Unam - Friday, November 10, 2006 - link

    Derek,

Saw your comment regarding the rationale for the test resolution; while I understand your reasoning now, it still raises the question: how many of your readers have 30" LCD flat panels?
  • DerekWilson - Friday, November 10, 2006 - link

    There might not be many out there right now, but it's still the right test platform for G80. We did test down to 1600x1200, so people do have information if they need it.

    But it speaks to who should own an 8800 GTX right now. It doesn't make sense to spend that much money on a part if you aren't going to get anything out of it with your 1280x1024 panel.

Owners of a 2560x1600 panel will want an 8800 GTX. Owners of an 8800 GTX will want a 2560x1600 panel. Smooth framerates with the ability to enable 4xAA in every game that allows it is reason enough. People without a 2560x1600 panel should probably hold off on buying the card until prices come down or until games arrive that push the 8800 GTX harder.
  • Unam - Tuesday, November 14, 2006 - link

    Derek,

    A follow up to testing resolutions, the FPS numbers we see in your articles, are they maximum, minimum or average?
  • Unam - Friday, November 10, 2006 - link

    Who the heck runs 2560x1600? At 4XAA? Come on guys, real world benchmarks please!
  • DerekWilson - Friday, November 10, 2006 - link

    we did:

    1600x1200, 1920x1440, and even 1280x1024 in Oblivion
  • dragonsqrrl - Thursday, August 25, 2011 - link

    ....lol, owned.
