Shader Model 4.0 Enhancements

Aside from defining the capabilities and instructions that the different shaders must support, Microsoft also specifies attributes like precision, the maximum number of instructions that can make up a shader program, and the number of registers available to the programmer. Here's a table comparing DX9 and DX10 shader models.

Along with these changes, Microsoft has made some lower level adjustments. Until now, shaders have been exclusively floating point. This means that operations like memory addressing and array indexing (which naturally use integer values) had to be handled carefully in floating point if unintended interpolation and rounding were to be avoided. With DX10, integer and bitwise operations have been added to the mix, so programmers can make use of traditional data structures and memory operations. Increasing the flexibility of the hardware and letting programmers employ methods commonly used on more general purpose hardware should give developers a much better platform for creating the effects they want.
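As a rough illustration (plain C++ standing in for shader code, with made-up helper names), the difference looks something like this: before SM 4.0 an index had to ride along in a float and be truncated carefully, while native integer and bitwise operations make the lookup direct.

```cpp
#include <cstdio>

// Hypothetical lookup table; in a real shader this would live in constant memory.
static const float table[8] = {0.1f, 0.2f, 0.3f, 0.4f, 0.5f, 0.6f, 0.7f, 0.8f};

// Pre-DX10 style: the index arrives as a float and has to be truncated
// carefully before it can be used to address memory.
float lookupFloatIndex(float fIndex) {
    int i = (int)(fIndex + 0.5f);   // round to the nearest slot to avoid precision drift
    return table[i];
}

// SM 4.0 style: integer and bitwise operations are first-class, so ordinary
// data-structure code works directly.
float lookupIntIndex(unsigned int index) {
    return table[index & 7u];       // bitwise AND keeps the index in range
}

int main() {
    std::printf("%f %f\n", lookupFloatIndex(3.0f), lookupIntIndex(11u));
    return 0;
}
```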

Floating point operations have also been enhanced, as Microsoft has placed tighter requirements on how the numbers must be handled. IEEE 754 is the specification that defines floating point representation and arithmetic, and sticking to such a standard allows programmers to count on operations producing consistent results across different hardware. Because Microsoft hasn't been as strict in the past, we've seen cases where ATI and NVIDIA don't return exactly the same result due to rounding and accuracy differences. This time around, DX10's requirements are very nearly IEEE 754 compliant. Certain aspects of IEEE 754 aren't desirable in graphics hardware, though, namely its handling of overflow, underflow, and denorms: the special results the standard returns in these cases are often less useful than simply clamping the value of a calculation to the smallest or largest possible result. With DX10, we do see the addition of NaN and infinity as possible results, and along with a better specification of accuracy and precision, those interested in general purpose computing on graphics processors (GPGPU) should be very happy.
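For those unfamiliar with the special results in question, here is a small CPU-side C++ sketch of the IEEE 754 behavior being described; older GPUs typically clamped these cases, while DX10-class hardware can now return infinity and NaN.

```cpp
#include <cstdio>
#include <cmath>
#include <limits>

// Small CPU-side demonstration of IEEE 754 behavior on overflow, underflow,
// and invalid operations.
int main() {
    float huge = std::numeric_limits<float>::max();
    float tiny = std::numeric_limits<float>::min();   // smallest normalized float
    float zero = 0.0f;

    float overflow  = huge * 2.0f;    // IEEE 754: +infinity
    float underflow = tiny / 100.0f;  // IEEE 754: a denormal (unless flushed to zero)
    float invalid   = zero / zero;    // IEEE 754: NaN

    std::printf("overflow:  %g (isinf=%d)\n", overflow, (int)std::isinf(overflow));
    std::printf("underflow: %g\n", underflow);
    std::printf("invalid:   %g (isnan=%d)\n", invalid, (int)std::isnan(invalid));
    return 0;
}
```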

What are Geometry Shaders?

A whole new shader type has been added this time around as well: geometry shaders. These shaders are similar to vertex shaders in that they operate on geometry before it has been projected onto screen space, where pixel processing can take over. Rather than operating on single vertices, however, geometry shaders operate on larger blocks: whole primitives such as points, lines, or triangles, optionally with their adjacent vertices. These primitives can be manipulated in a myriad of ways, and working with a group of related vertices gives the programmer the ability to manipulate those vertices in relation to each other more easily. Vertices can even be added to or removed from the geometry. The ability to write data out from the geometry shader (rather than simply sending it on for pixel processing) will also allow software to reprocess vertices that have been added or altered. As an extension to geometry instancing, we will have more flexibility in manipulating instanced geometry in order to avoid the cut-and-paste look. All of these new features mean we should see things like particle systems move completely off the CPU and onto the GPU, and geometry may begin to play a larger role in graphics in the future.
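Conceptually (and this is only an illustrative C++ sketch, not actual shader code or any real API), the particle-system case boils down to amplification: the geometry shader receives one primitive and emits several new vertices.

```cpp
#include <vector>

// All names here are illustrative and not part of any real API.
struct Vertex { float x, y, z; };

// A geometry shader receives a whole primitive (here a single point
// representing a particle) and may emit new vertices -- for example,
// expanding the point into a camera-facing quad so the particle system
// never has to round-trip through the CPU.
std::vector<Vertex> expandPointToQuad(const Vertex& p, float halfSize) {
    return {
        {p.x - halfSize, p.y - halfSize, p.z},
        {p.x + halfSize, p.y - halfSize, p.z},
        {p.x - halfSize, p.y + halfSize, p.z},
        {p.x + halfSize, p.y + halfSize, p.z},
    };
}
```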

In the beginning, increasing the number of triangles that could be rendered in a scene was a huge factor in performance. After a certain point, software, CPUs, buses, and overhead in general started to limit how much difference adding more triangles made. Rather than pushing millions of really tiny triangles around, it became much faster to use textures to simulate geometry. Currently, per-pixel lighting combined with uncompressed normal maps does a great job of simulating a whole lot of geometry at the expense of a lot of pixel power. With the new 8Kx8K texture sizes and other DX10 enhancements, there is a lot of potential for using pixel processing to simulate geometry even better. But the combination of unified shaders and geometry shaders in new hardware should start to give developers a whole lot more flexibility in how they approach the problem of fine detail in geometry.
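The per-pixel lighting trick mentioned above reduces to something like the following sketch (again plain C++ rather than shader code, with the normal assumed to be decoded from the map already): a normal fetched from a texture drives the lighting math, so flat geometry appears to have fine detail it doesn't actually possess.

```cpp
#include <algorithm>

// Plain C++ standing in for shader code; the normal is assumed to already be
// decoded from the normal map into the [-1, 1] range and normalized.
struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Per-pixel diffuse term: one dot product per pixel stands in for the many
// tiny triangles that would otherwise be needed to capture the same detail.
float diffuseLighting(const Vec3& normalFromMap, const Vec3& lightDir) {
    return std::max(0.0f, dot(normalFromMap, lightDir));
}
```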

111 Comments

  • JarredWalton - Wednesday, November 8, 2006 - link

    Page 17:

    "The dual SLI connectors are for future applications, such as daisy chaining three G80 based GPUs, much like ATI's latest CrossFire offerings."

    Using a third GPU for physics processing is another possibility, once NVIDIA begins accelerating physics on their GPUs (something that has apparently been in the works for a year or so now).
  • Missing Ghost - Wednesday, November 8, 2006 - link

    So it seems like by subtracting the single 8800 GTX power usage result from the highest 8800 GTX SLI result, we can conclude that the card can use as much as 205W. Does anybody know if this number could increase when the card is used in DX10 mode?
  • JarredWalton - Wednesday, November 8, 2006 - link

    Without DX10 games and an OS, we can't test it yet. Sorry.
  • JarredWalton - Wednesday, November 8, 2006 - link

    Incidentally, I would expect the added power draw in SLI comes from more than just the GPU. The CPU, RAM, and other components are likely pushed to higher demand with SLI/CF than when running a single card. Look at FEAR as an example; here are the power differences (dual card minus single card) for the various cards. (Oblivion doesn't have X1950 CF numbers, unfortunately.)

    X1950 XTX: 91.3W
    7900 GTX: 102.7W
    7950 GX2: 121.0W
    8800 GTX: 164.8W

    Notice how in this case the X1950 XTX appears to use less power than the other cards, but that's clearly not the case in single GPU configurations, as it requires more than everything besides the 8800 GTX. Here are the Prey results as well:

    X1950 XTX: 111.4W
    7900 GTX: 115.6W
    7950 GX2: 70.9W
    8800 GTX: 192.4W

    So there, GX2 looks like it is more power efficient, mostly because QSLI isn't doing any good. Anyway, simple subtraction relative to dual GPUs isn't enough to determine the actual power draw of any card. That's why we presented the power data without a lot of commentary - we need to do further research before we come to any final conclusions.
  • IntelUser2000 - Wednesday, November 8, 2006 - link

    It looks like adding the second card for SLI uses about 170W more power. You can see how significant the video card is in terms of power consumption. It uses a couple of times more power than a Pentium D.
  • JoKeRr - Wednesday, November 8, 2006 - link

    Well, keep in mind PSU efficiency, generally around 80%, so as overall power draw increases, the marginal loss of power increases a lot as well. If you actually multiply by 0.8, it gives about 136W. I suppose the power draw was measured at the wall.
  • DerekWilson - Thursday, November 9, 2006 - link

    max TDP of G80 is at most 185W -- NVIDIA revised this to something in the 170W range, but we know it won't get over 185 in any case.

    But games generally don't enable a card to draw max power ... 3dmark on the other hand ...
  • photoguy99 - Wednesday, November 8, 2006 - link

    Isn't 1920x1440 a resolution that almost no one uses in real life?

    Wouldn't 1920x1200 apply to many more people?

    It seems almost all 23" and 24" displays, and many high end laptops, are 1920x1200.

    Yes, we could interpolate benchmarks, but why, when no one uses 1440 vertical?

  • Frallan - Saturday, November 11, 2006 - link

    Well, I have one more suggestion for a resolution. Full HD is 1920x1080 - that is sure to be found in a lot of homes in the future (after X-mas, any1 ;0) ) on large LCDs - I believe it would be a good idea to throw that in there as well. Especially right now, since loads of people will have to decide how to spend their money. The 37" Full HD is a given, but on what system will I be gaming: PS3/Xbox/PC... Pls advise.
  • JarredWalton - Wednesday, November 8, 2006 - link

    This should be the last time we use that resolution. We're moving to LCD resolutions, but Derek still did a lot of testing (all the lower resolutions) on his trusty old CRT. LOL
