A New Compression Scheme: 3Dc

3Dc isn't something that will make current games run better or faster. We aren't talking about a glamorous technology; 3Dc is a lossy compression scheme for use in 3D applications (as its name is supposed to imply). Bandwidth is a highly prized commodity inside a GPU, and compression schemes exist to ease the pressure on developers to limit the amount of data pushed through a graphics card.

There are already a few compression schemes out there, but in their highest compression modes they introduce some discontinuity into the texture. This is acceptable in some applications, but not all. The specific application ATI is initially targeting with 3Dc is normal mapping.

Normal mapping makes the lighting of a surface appear more detailed than its geometry actually is. Usually, the normal vector at any given point is interpolated from the normal data stored at the vertex level, but, in order to increase the detail of lighting and texturing effects on a surface, a normal map can specify how normal vectors should be oriented across an entire surface at a high level of detail. If very large normal maps are used, enormous amounts of lighting detail can produce the illusion of geometry that isn't actually there.
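To make the idea concrete, here is a minimal sketch (not ATI's implementation; all names and texel values are hypothetical) of per-pixel Lambertian lighting that fetches its normal from a normal map texel instead of using the interpolated vertex normal:

```python
import math

def decode_normal(r, g, b):
    """Map an 8-bit-per-channel texel to a unit normal; each channel
    in [0, 255] is remapped to [-1, 1], then the vector is normalized."""
    n = (r / 127.5 - 1.0, g / 127.5 - 1.0, b / 127.5 - 1.0)
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

def lambert(normal, light_dir):
    """Diffuse intensity: dot product of normal and light direction,
    clamped at zero so surfaces facing away from the light go dark."""
    d = sum(n * l for n, l in zip(normal, light_dir))
    return max(d, 0.0)

# A texel encoding a normal tilted toward +x is lit differently than a
# flat (0, 0, 1) normal would be -- the illusion of relief on a flat surface.
flat = decode_normal(128, 128, 255)    # roughly (0, 0, 1)
tilted = decode_normal(200, 128, 230)  # leans toward +x
light = (0.0, 0.0, 1.0)
```

With a high-resolution normal map, this per-texel variation in lighting is what sells geometric detail that the mesh itself doesn't have.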

Here's an example of how normal mapping can add the appearance of more detailed geometry

In order to work with these large data sets, we would want to use a compression scheme. But since we don't want discontinuities in our lighting (which could appear as flashy or jumpy lighting on a surface), we would like a compression scheme that maintains the smoothness of the original normal map. Enter 3Dc.

This is an example of how 3Dc can help alleviate continuity problems in normal map compression

In order to facilitate a high level of continuity, 3Dc divides textures into four by four blocks of vector4 data with 8 bits per component (512-bit blocks). For normal map compression, we throw out the z component, which can be recalculated from the x and y components (all normal vectors in a normal map are unit vectors, so x^2 + y^2 + z^2 = 1). After discarding the 16 unused bits from each normal vector, we calculate the minimum and maximum x and the minimum and maximum y for the entire 4x4 block. These four values are stored, and each x or y value is then stored as a 3-bit index selecting one of 8 equally spaced steps between the minimum and maximum x or y values (inclusive).
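The min/max-plus-3-bit-index idea can be sketched in a few lines. This is an illustration of the encoding described above, not ATI's actual bitstream layout; it assumes the 8-bit x and y components have already been extracted from the block:

```python
def compress_channel(values):
    """Encode 16 8-bit values as (min, max, sixteen 3-bit indices)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / 7.0 if hi > lo else 1.0
    # Each value snaps to the nearest of 8 equally spaced steps in [lo, hi].
    indices = [min(7, round((v - lo) / step)) for v in values]
    return lo, hi, indices

def decompress_channel(lo, hi, indices):
    """Reconstruct values from 8 equally spaced steps between lo and hi."""
    step = (hi - lo) / 7.0
    return [lo + i * step for i in indices]

def compress_block(xs, ys):
    """Compress one 4x4 block: two channels (x and y) of 16 values each.
    Storage: 2 channels * (8 + 8) endpoint bits + 2 * 16 * 3 index bits
    = 128 bits, versus 512 bits uncompressed."""
    return compress_channel(xs), compress_channel(ys)
```

Because the block's extremes are stored exactly and everything else lands within half a quantization step, smooth gradients survive compression without the banding that a cruder scheme would introduce.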

The storage space required for a 4x4 block of normal map data using 3Dc

The resulting compressed data is 4 x 4 texels * 2 components * 3 bits + 32 bits of min/max values = 128 bits, giving a 4:1 compression ratio for normal maps with no discontinuities. Any two-channel or scalar data can be compressed fairly well with this scheme. When compressing data that is very noisy (or otherwise inherently discontinuous, which is not often seen), accuracy may suffer, and the compression ratio falls off for data with more than two components (other compression schemes may be more useful in those cases).
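At decode time the discarded z component is rebuilt from the unit-length constraint mentioned above. A small sketch, assuming tangent-space normals with z >= 0 (normals point out of the surface) and x, y already decompressed into [-1, 1]:

```python
import math

def reconstruct_z(x, y):
    """Recover z from x^2 + y^2 + z^2 = 1, taking the non-negative root.
    The max() guard absorbs tiny quantization errors that could push
    x^2 + y^2 slightly past 1 after lossy compression."""
    return math.sqrt(max(0.0, 1.0 - x * x - y * y))
```

This is why 3Dc can afford to store only two of the three components: the third is a cheap per-pixel calculation in the shader.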

ATI would really like this compression scheme to catch on much as S3TC and DXTC have. Of course, the fact that compression and decompression of 3Dc is built into R420 (and not NV40) won't play a small part in ATI's evangelism of the technology. When all is said and done, future hardware support by other vendors will be based on the software adoption rate of the technology, and software adoption will likely also be influenced by hardware vendors' plans for future support.

As far as we are concerned, all methods of increasing apparent usable bandwidth inside a GPU in order to deliver higher quality games to end users are welcome. Until memory bandwidth surpasses the needs of graphics processors (which will never happen), innovative and effective compression schemes will be very helpful in applying all the computational power available in modern GPUs to very large sets of data.




  • TrogdorJW - Tuesday, May 4, 2004 - link

    Nice matchup we've got here! Just what we were all hoping for. Unfortunately, there are some disappointing trends I see developing....

    In ShaderMark 2.0, we see many instances where the R420 is about 25% faster than the NV40. Let's see... 520 MHz vs 400 MHz. 'Nuf said, I think. Too bad for Nvidia that they have 222 million transistors, so they're not likely to reach 500 MHz any time soon. (Or if they can, then ATI can likely reach 600+ MHz.)

    How about the more moderately priced card matchup? The X800 Pro isn't looking that attractive at $400. 25% more price gets you 33% more pipelines, which will probably help out in games that process a lot of pixels. And the Pro also has only 4 vertex pipelines compared to 6? The optimizations make it better than a 9800XT, but not by a huge margin. The X800 SE with 8 pipelines is likely going to be about 20% faster than a 9800XT. Hopefully, it will come in at a $200 price point, but I'm not counting on that for at least six months. (Which is why I recently purchased a $200 9800 Pro 128.)

    The Nvidia lineup is currently looking a little nicer. The 6800 Ultra, Ultra Special, and GT all come with 16 pipelines, and there's talk of a lower clocked card for the future. If we can get a 16 pipeline card (with 6 vertex pipelines) for under $250, that would be pretty awesome. That would be a lot like the 5900 XT cards. Anyone else notice how fast the 9800 Pro prices dropped when Nvidia released the 5900 XT/SE? Hopefully, we'll see more of that in the future.

    Bottom line has to be that for most people, ATI is still the choice. (OpenGL gamers, Linux users, and professional 3D types would still be better with Nvidia, of course.) After all, the primary benefit of NV40 over R420 - Shader Model 3.0 - won't likely come into play for at least six months to a year. Not in any meaningful way, at least. By then, the fall refresh and/or next spring will be here, and ATI could be looking at SM3.0 support. Of course, adding SM3 might just knock the transistor counts of ATI's chips up into the 220 million range, which would kill their clock speed advantage.

    All told, it's a nice matchup. I figure my new 9800 Pro will easily last me until the next generation cards come out, though. By then I can look at getting an X800 Pro/XT for under $200. :)
  • NullSubroutine - Tuesday, May 4, 2004 - link

    I forgot to ask if anyone else noticed a huge difference (almost double) between AnandTech's Unreal Tournament 2003 scores and those of Tom's Hardware?

    (It's not the CPU difference, because the A64 3200+ had a baseline score of ~278 and the 3.2 P4 had a ~247 in a previous section.)

    So what gives?
  • NullSubroutine - Tuesday, May 4, 2004 - link

    To the guy talking about the 400MHz and the 550MHz, I have this to say.

    I agree with the other guy about the transistor count.

    Don't forget that ATI's cards used to be more powerful per clock compared to Nvidia's a generation or two ago. So don't be babbling fanboy stuff.

    I would agree with that one guy (the # escapes me) about the fanboy stuff, but I said it first! On this thread anyways.
  • wassup4u2 - Tuesday, May 4, 2004 - link

    #30 & 38, I believe that while the ATI line is fabbed at TSMC, NVidia is using IBM for their NV40. I've heard also that yields at IBM aren't so good... which might not bode well for NVidia.
  • quanta - Tuesday, May 4, 2004 - link

    > #14, I think it has more to do with the fact those OpenGL benchmarks are based on a single engine that was never fast on ATI hardware to begin with.

    Not true. ATI's FireGL X2 and Quadro FX 1100 were evenly matched in workstation OpenGL tests[1], which do not use Quake engines. Considering the FireGL X2 is based on the Radeon 9800XT and the Quadro FX 1100 is based on the GeForce FX 5700 Ultra, such a result is unacceptable. If I were an ATI boss, I would have made sure the OpenGL driver team did not make such a blunder, especially when the R420 still sucks in most OpenGL games compared to GeForce 6800 Ultra cards.

    [1] http://www.tomshardware.com/graphic/20040323/index...
  • AlexWade - Tuesday, May 4, 2004 - link

    From my standpoint the message is clear: nVidia is no longer THE standard in graphics cards. Why do I say that? It's half the size, it requires less power, it has fewer transistors, and the performance is about the same. Even if the performance were slightly less, ATI would still be the winner. Anyway, whatever, it's not like these benchmarks will deter the hardcore gotta-have-it-now fanboys.

    It's not like I'm going to buy either. Maybe this will lower the prices of all the other video cards. $Dreams$
  • rsaville - Tuesday, May 4, 2004 - link

    If any 6800 users are wondering how to make their 6800 run the same shadows as the 5950 in the benchmark see this post:

    Also if you want to make your GeForceFX run the same shadows as the rest of the PS2.0 capable cards then find a file called driverConfig.lua in the homeworld2\bin directory and remove line 101 that disables fragment programs.
  • raskren - Tuesday, May 4, 2004 - link

    I wonder if this last line of AGP cards will ever completely saturate the AGP 8X bus. It would be interesting to see a true PCI-Express card compared to the same AGP 8X counterpart.

    Remember when Nvidia introduced the MX440 (or was it 460?) with an 8X AGP connector...what a joke.
  • sisq0kidd - Tuesday, May 4, 2004 - link

    That was the cheesiest line, #46, but very true...
  • sandorski - Tuesday, May 4, 2004 - link

    There is only 1 clear winner here, the Consumer!

    ATI and NVidia are running neck and neck.
