A New Compression Scheme: 3Dc

3Dc isn't something that's going to make current games run better or faster. We aren't talking about a glamorous technology; 3Dc is a lossy compression scheme for use in 3D applications (as its name is supposed to imply). Bandwidth is a highly prized commodity inside a GPU, and compression schemes exist to try to help alleviate pressure on the developer to limit the amount of data pushed through a graphics card.

There are already a few compression schemes out there, but in their highest compression modes, they introduce some discontinuity into the texture. This is acceptable in some applications, but not all. The specific application ATI is initially targeting for use with 3Dc is normal mapping.

Normal mapping is used to make the lighting of a surface more detailed than its geometry. Usually, the normal vector at any given point is interpolated from the normal data stored at the vertex level. To increase the detail of lighting and texturing effects on a surface, normal maps can instead specify how normal vectors should be oriented across an entire surface at a high level of detail. If very large normal maps are used, enormous amounts of lighting detail can produce the illusion of geometry that isn't actually there.


Here's an example of how normal mapping can add the appearance of more detailed geometry

In order to work with these large data sets, we would want to use a compression scheme. But since we don't want discontinuities in our lighting (which could appear as flashy or jumpy lighting on a surface), we would like a compression scheme that maintains the smoothness of the original normal map. Enter 3Dc.


This is an example of how 3Dc can help alleviate continuity problems in normal map compression

In order to facilitate a high level of continuity, 3Dc divides textures into four-by-four blocks of vector4 data with 8 bits per component (512-bit blocks). For normal map compression, we throw out the z component, which can be calculated from the x and y components of the vector (all normal vectors in a normal map are unit vectors and fit the form x^2 + y^2 + z^2 = 1). After throwing out the unused 16 bits from each normal vector, we then calculate the minimum and maximum x and the minimum and maximum y for the entire 4x4 block. These four values are stored as is, and each individual x or y value is stored as a 3-bit value selecting one of 8 equally spaced steps between the minimum and maximum x or y values (inclusive).
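The per-block scheme described above can be sketched in a few lines of Python. This is only an illustration of the idea as the article states it, not ATI's actual bit layout or hardware behavior: the function names are hypothetical, and the mapping of 8-bit texel values onto the range [-1, 1] is an assumed convention.

```python
import math

def encode_channel(values):
    """Encode one channel (x or y) of a 4x4 block.

    Returns (lo, hi, indices): the block minimum and maximum, plus one
    3-bit index (0..7) per texel selecting one of 8 equally spaced steps
    between lo and hi, inclusive.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Flat block: every texel maps to the single stored value.
        return lo, hi, [0] * len(values)
    indices = [round((v - lo) * 7 / (hi - lo)) for v in values]
    return lo, hi, indices

def decode_channel(lo, hi, indices):
    """Rebuild approximate channel values from the endpoints and indices."""
    return [lo + (hi - lo) * i / 7 for i in indices]

def to_signed(v):
    """Assumed convention: map an 8-bit value 0..255 onto [-1, 1]."""
    return v / 255.0 * 2.0 - 1.0

def reconstruct_z(x, y):
    """Recover z from x and y using x^2 + y^2 + z^2 = 1 (z taken as >= 0)."""
    return math.sqrt(max(0.0, 1.0 - x * x - y * y))

# Example: the x components of the 16 normals in one block.
xs = [10, 12, 15, 20, 22, 25, 30, 33, 35, 40, 41, 45, 50, 52, 60, 80]
lo, hi, idx = encode_channel(xs)
approx = decode_channel(lo, hi, idx)
```

Because every decoded value snaps to the nearest of 8 steps between the block's own minimum and maximum, the error per texel is at most half a step, which is why smooth blocks (small range) come through almost untouched while noisy blocks (large range) lose more precision.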


The storage space required for a 4x4 block of normal map data using 3Dc

The resulting compressed data is 4 vectors * 4 vectors * 2 components * 3 bits + 32 bits (128 bits) large, giving a 4:1 compression ratio for normal maps with no discontinuities. Any two-channel or scalar data can be compressed fairly well via this scheme. When compressing data that is very noisy (or otherwise inherently discontinuous, not that this is often seen), accuracy may suffer, and the compression ratio falls off for data with more than two components (other compression schemes may be more useful in these cases).
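For readers who want to check the arithmetic, the block sizes quoted above work out as follows. The variable names are ours, chosen only for illustration.

```python
TEXELS_PER_BLOCK = 4 * 4

# Uncompressed: vector4 data, 8 bits per component.
uncompressed_bits = TEXELS_PER_BLOCK * 4 * 8

# Compressed: a 3-bit index for each of x and y per texel,
# plus min and max for x and for y at 8 bits each.
index_bits = TEXELS_PER_BLOCK * 2 * 3
endpoint_bits = 2 * 2 * 8
compressed_bits = index_bits + endpoint_bits

ratio = uncompressed_bits / compressed_bits
print(uncompressed_bits, compressed_bits, ratio)  # 512 128 4.0
```

Note that the 4:1 figure is measured against the full four-component source block. Measured against only the two 8-bit channels that are actually kept (256 bits per block), the same 128 bits amount to 2:1.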

ATI would really like this compression scheme to catch on much as S3TC and DXTC have. Of course, the fact that compression and decompression of 3Dc is built in to R420 (and not NV40) won't play a small part in ATI's evangelism of the technology. After all is said and done, future hardware support by other vendors will be based on the software adoption rate of the technology, and software adoption will likely also be influenced by hardware vendors' plans for future support.

As far as we are concerned, all methods of increasing apparent usable bandwidth inside a GPU in order to deliver higher quality games to end users are welcome. Until memory bandwidth surpasses the needs of graphics processors (which will never happen), innovative and effective compression schemes will be very helpful in applying all the computational power available in modern GPUs to very large sets of data.


95 Comments


  • Pumpkinierre - Wednesday, May 5, 2004 - link

    Sorry, scrub that last one. I couldnt help it. I will reform.
  • Pumpkinierre - Wednesday, May 5, 2004 - link

    So, which is better: a64 at 2Gig or P4 at 3.2?
  • jibbo - Wednesday, May 5, 2004 - link

    "Zobar is right; contra Jibbo, the increased flexibility of PS3 means that for many 2.0 shader programs a PS3 version can achieve equivalent results with a lesser performance hit."

    I think you're both still missing my point. There is nothing that says PS3.0 is faster than PS2.0. You are both correct that it has to potential to be faster, though you both assume that a first generation PS3.0 architecture will perform at the same level as a refined PS2.0 architechture.

    PS3.0 is one of the big reasons that nVidia's die size and transistor count are bigger than ATI's. The additional power drain (and consequently heat dissipation) of those 40M transistors also helps to limit the clock speeds of the 6800. When you're talking about ALU ops per second (which dominate math-intensive shaders), these clock speeds become very important. A lot of the 6800's speed for PS3.0 will have to be found in the driver optimizations that will compile these shaders for PS3.0. Left to itself, ATI's raw shader performance still slaughters nVidia's.

    They both made trade-offs, and it seems that ATI is banking that PS3.0 won't be a dealbreaker in 2004. Only time will tell....
  • Phiro - Wednesday, May 5, 2004 - link

    K, I found the $400M that the CEO claimed. He also claimed $400M for the NV3x core as well. It seemed more as a boast than anything, not particularly scientific or exact.

    In any case, ATI supposedly spent $165-180M last year (2003) on R&D, with an estimated increase of 100% for this year. How long has the 4xx core been in development?

    Regardless, ultimately we the consumers are the winners. Whether or not the R&D spent pans out will play out over the next couple years, as supposedly the nv4x core has a 24 month lifespan.

  • 413xram - Wednesday, May 5, 2004 - link

    If you watch nvidia's launch video on their site they mention the r&d costs for their new card.
  • RyanVM - Wednesday, May 5, 2004 - link

    What ever happened to using ePSXe as a video card benchmark?
  • Phiro - Wednesday, May 5, 2004 - link

    Well, Nvidia may have spent $400M on this (I've never seen that number before but we'll go with it I guess) but they paid themselves for the most part.

    ATI's cost can't be too trivialized - didn't they drop a product design or two in favor of getting this out the door instead? And any alteration in the architecture of something doesn't really qualify as a hardware "refresh" in my book - a hardware refresh for an OEM consists of maybe one speed notch increase in the RAM, new bios, larger default HD, stuff like that. MLK is what Dell used to call it - Mid Life Kick.
  • retrospooty - Wednesday, May 5, 2004 - link

    "Precisely. By the time 512mb is useful, the card will be too slow for it to matter, and you'd need a new card any way."

    True...

    Both cards perform great, both have wins and losses depending on the game. The deciding factor will be price and power requirements.

    Since prices will adjust downward, at a fairly equal rate, that leaves power. With Power requirements being so incredibly high with the NV40, that leans me toward ATI.

    413xram also has a good point above. For Nvidia, this is a 400 million dollar new chip design. For ATI, this was a refresh of an old design to add 16 pipes, and a few other features. After the losses NV took with the heavily flawed NV30 and 35 , they need a financial boom, and this isnt it.

  • mattsaccount - Wednesday, May 5, 2004 - link

    There are no games available today that use 256mb of video RAM, let alone 512mb. Even upper-high-end cards routinely come with 128mb (e.g. Geforce FX 5900, Radeon 9600XT). It would not make financial sense for a game developer to release a game that only a small fraction of the community could run acceptably.

    >> I have learned from the past that future possibilties of technology in hardware does nothing for me today.

    Precisely. By the time 512mb is useful, the card will be too slow for it to matter, and you'd need a new card any way.
  • 413xram - Wednesday, May 5, 2004 - link

    #64 Can you explain "gimmick"?
