Shader Model 4.0 Enhancements

Aside from defining the capabilities and instructions that the different shaders must support, Microsoft also specifies attributes like precision, number of instructions that can make up a shader program, and the number of registers available to the programmer. Here's a table comparing DX9 and DX10 shader models.

Along with these changes, Microsoft has made some lower level adjustments. Until now, shaders have been exclusively floating point. This means that operations like memory addressing and array indexing (which use integer values) must be done carefully if interpolation is to be avoided. With DX10, integer and bitwise operations have been added to the mix. This means programmers can make use of traditional data structures and memory operations. Increasing the flexibility of the hardware and enabling programmers to employ methods commonly used on more general purpose hardware will certainly be helpful in creating a better platform for developers to create the effects they desire.

Floating point operations have also been enhanced, as Microsoft has placed tighter requirements on how to handle the numbers. IEEE 754 is a specification that defines all aspects of floating point interaction. Sticking to such a standard allows programmers to guarantee that operations will be consistent between different types of hardware. Because Microsoft hasn't been as strict in the past, we've seen some issues where ATI and NVIDIA don't provide the exact same result due to rounding and accuracy differences. This time around, DX10 has very nearly IEEE 754 requirements. There are certain aspects of IEEE 754 that are not desirable in graphics hardware. These aspects have to do with over and underflow and denorms. The special results that are usually returned in these cases under IEEE specifications aren't as useful as clamping the value of a calculation to either the smallest possible result or largest possible result. With DX10, we do see the addition of NaN and infinity as possible results, and along with a better specification of accuracy and precision, those interested in general purpose computing on graphics processors (GPGPU) should be very happy.

What are Geometry Shaders?

A whole new shader type has been added this time around as well: Geometry shaders. These shaders are similar to vertex shaders in that they operate on geometry before it has been projected on to screen space where pixel processing can take over. Rather than operating on single vertices, however, geometry shaders operate on larger blocks: meshes. These meshes (made up of vertices) can be manipulated in a myriad of ways. Working with an object containing vertices gives programmer the ability to manipulate those vertices in relation to each other more easily. Vertices can even be added or removed from a mesh. The ability to write out data from the geometry shaders (rather than simply sending it on for pixel processing) will also allow software to reprocess vertices that have been added or altered by the geometry shaders. As an extension to geometry instancing, we will have more flexibility in manipulating instanced geometry in order to avoid the cut and paste look. All of these new features mean we should see things like particle systems move completely off of the CPU and on to the GPU, and geometry may begin to play a larger role in graphics in the future.

In the beginning, increasing the number of triangles that could be rendered in a scene was a huge factor in performance. After a certain point, software, CPUs, buses, and overhead in general started to get in the way of how much difference adding more triangles made. Rather than having millions of really tiny triangles moving around, it became much faster to use textures to simulate geometry. Currently, per pixel lighting combined with uncompressed normal maps do a great job of simulating a whole lot of geometry at the expense of a lot of pixel power. With the new 8k*8k texture sizes and other DX10 enhancements, there is a lot of potential for using pixel processing to simulate geometry even better. But the combination of unified shaders and geometry shaders in new hardware should start to give developers a whole lot more flexibility in how they approach the problem of fine detail in geometry.

All GPUs are Created Equal: Say Goodbye to Cap Bits G80: A Mile High Overview
POST A COMMENT

111 Comments

View All Comments

  • aweigh - Friday, November 10, 2006 - link

    You can just use the program DX Tweaker to enable Triple Buffering in any D3D game and use your VSYNC with negligable performance impact. So you can play with your VSYNC, a high-res and AA as well. :) Reply
  • aweigh - Friday, November 10, 2006 - link

    I'm gonna buy an 88 specifically to use 4x4 SuperSampling in games. Why bother with MSAA with a card like that? Reply
  • DerekWilson - Friday, November 10, 2006 - link

    Supersampling can make textures blurry -- especially very detailed textures.

    And the impact will be much greater with the use of longer more detailed pixel shaders (as the shaders must be evaluated at every sub-pixel in supersample).

    I think transparency / adaptive AA are enough.

    On your previous comment, I don't think we're to the point where we can hit triple buffering, vsync, high levels of AA AND high resolution (2560x1600) without some input lag (triple buffering plus vsync with framerates less than your refresh rate can cause problems).

    If you're talking about enabling all these options on a lower resolution lcd panel, then I can definitely see that as a good use of the hardware. And it might be interesting to look at more numbers with these type of options enabled.

    Thanks for the suggestion.
    Reply
  • aweigh - Saturday, November 11, 2006 - link

    I never knew that about SuperSampling. Is it something similar to Quincux blurring? And would using a negative LOD via RivaTuner/nHancer counteract the effect?

    How about NVIDIA's Digital Sharpness setting in Color Correction? I've found a smidge of sharpening can do wonders to improve overall clarity.

    By the way, when you said Adaptive AA, were you referring to ATI cards?
    Reply
  • Unam - Friday, November 10, 2006 - link

    Derek,

    Saw your comment regarding the rationale for the test resolution, while I understand your reasoning now, it still begs the question how many of your readers have 30" LCD flat panels?
    Reply
  • DerekWilson - Friday, November 10, 2006 - link

    There might not be many out there right now, but it's still the right test platform for G80. We did test down to 1600x1200, so people do have information if they need it.

    But it speaks to who should own an 8800 GTX right now. It doesn't make sense to spend that much money on a part if you aren't going to get anything out of it with your 1280x1024 panel.

    Owners of a 2560x1600 panel will want an 8800 GTX. Owners of an 8800 GTX will want a 2560x1600 panel. Smooth framerates with the ability to enable 4xAA in every game that allowed it is reason enough. People without a 2560x1600 panel should probably wait until prices come down on the 8800 GTX or until games that are able to push the 8800 GTX harder to buy the card.
    Reply
  • Unam - Tuesday, November 14, 2006 - link

    Derek,

    A follow up to testing resolutions, the FPS numbers we see in your articles, are they maximum, minimum or average?
    Reply
  • Unam - Friday, November 10, 2006 - link

    Who the heck runs 2560x1600? At 4XAA? Come on guys, real world benchmarks please! Reply
  • DerekWilson - Friday, November 10, 2006 - link

    we did:

    1600x1200, 1920x1440, and even 1280x1024 in Oblivion
    Reply
  • dragonsqrrl - Thursday, August 25, 2011 - link

    ....lol, owned. Reply

Log in

Don't have an account? Sign up now