HSR Explained

On top of increased speed, various driver builds often introduce new features. A good video card company is always striving not only to improve driver speed but also to improve driver features. NVIDIA's Detonator series drivers provide a perfect example of this.

The Detonator drivers that support NVIDIA based cards always seem to be a work in progress. No matter what features are added or how much speed is gained, NVIDIA is always striving to produce more feature rich and powerful drivers. Take Detonator2 versus the newest Detonator3 drivers. Not only did NVIDIA increase speed dramatically with the latest drivers, they also introduced a host of new features, from Digital Vibrance Control to TwinView support.

Likewise, 3dfx seems to have followed in the footsteps of NVIDIA. Not only do the latest 1.04.01 drivers feature increased performance, they also contain a very exciting and often misunderstood "hidden" feature.

This feature is known as hidden surface removal, or HSR for short. Searching 3dfx's page reveals no information on the subject. In fact, even contacting 3dfx to discuss this new "feature" results in a "no comment" response. So what is this feature and what does it do?

3dfx's HSR driver setting does exactly what its name implies: it removes hidden (or overwritten) surfaces. Overdraw, the act of drawing polygons over existing polygons to render a scene, is a result of the immediate mode rendering that has been used since the 1960s to render a 3D scene. Let's take a look at how we described this form of rendering in our Imagination Technologies / STMicro PowerVR Series 3: Kyro review:

A traditional 3D accelerator processes each polygon as it is sent to the hardware, without any knowledge of the rest of the scene. Since there is no knowledge of the rest of the scene, every forward facing polygon must be shaded and textured. A z-buffer is used to store the depth of each pixel in the current back buffer. Each pixel of each polygon rendered must be checked against the z-buffer to determine if it is closer to the viewer than the pixel currently stored in the back buffer.

Checking against the z-buffer must be performed after the pixel is already shaded and textured. If a pixel turns out to be in front of the current pixel, the new pixel replaces (or is blended with, in the case of transparency) the current pixel in the back buffer and the z-buffer depth is updated. If the new pixel ends up behind the current pixel, the new pixel is thrown out and no changes are made to the back buffer (except where transparency requires blending). When pixels are shaded and textured only to be thrown out or overwritten, this is known as overdraw. Drawing the same pixel three times is equivalent to an overdraw of 3, which Imagination Technologies and STMicro claim is typical.
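The loop described above can be sketched in a few lines of Python. This is a simplified illustration rather than any vendor's actual pipeline: polygons are assumed to be opaque, axis-aligned rectangles at a constant depth, and the names `render_immediate`, `rect`, `depth` and `color` are invented for the example.

```python
def render_immediate(polygons, width, height):
    """Shade every pixel of every polygon, then test it against the z-buffer."""
    far = float("inf")
    zbuf = [[far] * width for _ in range(height)]    # depth of each stored pixel
    color = [[None] * width for _ in range(height)]  # the back buffer
    shaded = 0                                       # pixels shaded and textured
    for poly in polygons:
        x0, y0, x1, y1 = poly["rect"]                # hypothetical rectangle format
        for y in range(y0, y1):
            for x in range(x0, x1):
                pixel = poly["color"]                # stand-in for shading and texturing
                shaded += 1
                if poly["depth"] < zbuf[y][x]:       # closer than the stored pixel?
                    zbuf[y][x] = poly["depth"]
                    color[y][x] = pixel
                # otherwise the shaded pixel is simply thrown away
    return color, shaded


# Three full-screen layers on a 4x4 screen, submitted back to front.
layers = [
    {"rect": (0, 0, 4, 4), "depth": 3.0, "color": "far"},
    {"rect": (0, 0, 4, 4), "depth": 2.0, "color": "mid"},
    {"rect": (0, 0, 4, 4), "depth": 1.0, "color": "near"},
]
image, shaded = render_immediate(layers, 4, 4)
overdraw = shaded / (4 * 4)  # pixels shaded per pixel on screen
```

In this contrived scene every screen pixel is shaded three times and only the nearest layer survives, which corresponds to an overdraw of 3.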

With game complexity increasing and memory bandwidth in ever shorter supply, the overdraw encountered in traditional rendering systems is a very large problem. For example, an overdraw of 3 means that each pixel in a scene is drawn an average of three times, with the first two results overwritten (and thus hidden) before the final visible pixel is written. By eliminating polygons that are not visible to the user, not only is the memory bandwidth needed to render a scene decreased, but the graphics processor also has less work to do. This means that a given scene with any amount of overdraw could be rendered faster if this overdraw were eliminated.
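To put rough numbers on this, the arithmetic can be written out as below. The function names and the 300 megapixel per second fill rate are hypothetical values chosen only for illustration, and the model assumes that every drawn pixel costs the same.

```python
def wasted_fraction(overdraw):
    """Share of drawn pixels that never reach the screen."""
    if overdraw < 1:
        raise ValueError("overdraw cannot be lower than 1")
    return (overdraw - 1) / overdraw


def effective_fill_rate(raw_fill_rate, overdraw):
    """Visible pixels per second once hidden pixels are paid for."""
    return raw_fill_rate / overdraw


wasted = wasted_fraction(3)                # two out of every three pixels are hidden
visible = effective_fill_rate(300.0, 3)    # hypothetical 300 Mpixels/s part
```

Under these assumptions, an overdraw of 3 means two thirds of the pixel work is discarded, and a card with a raw fill rate of 300 megapixels per second delivers only 100 megapixels per second of visible pixels.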

There are a few ways of attacking the overdraw problem that exists in immediate mode rendering systems. One approach is the tile based rendering that PowerVR products have been using for quite some time now. Tile based rendering is described in depth in our Imagination Technologies / STMicro PowerVR Series 3: Kyro review, but essentially this rendering method eliminates overdraw by splitting a scene into groups known as display lists. Each item in the display list is then compared against the others in order to find which item is on top (using the z-buffer), and only the visible area is textured. This feature of tile based rendering systems eliminates the overdraw problem described above.
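The idea can be sketched as follows. This is a toy model under stated assumptions, not PowerVR's hardware algorithm: polygons are opaque, axis-aligned rectangles at a constant depth, the screen is cut into fixed-size tiles, and `render_tiled` and the dictionary keys are hypothetical names.

```python
def render_tiled(polygons, width, height, tile=2):
    """Resolve visibility per tile first, then shade each visible pixel once."""
    color = [[None] * width for _ in range(height)]
    shaded = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            tx1, ty1 = min(tx + tile, width), min(ty + tile, height)
            # Display list: every polygon that overlaps this tile.
            display_list = [
                p for p in polygons
                if p["rect"][0] < tx1 and p["rect"][2] > tx
                and p["rect"][1] < ty1 and p["rect"][3] > ty
            ]
            for y in range(ty, ty1):
                for x in range(tx, tx1):
                    nearest = None
                    for p in display_list:  # depth comparison only, no shading yet
                        x0, y0, x1, y1 = p["rect"]
                        if x0 <= x < x1 and y0 <= y < y1:
                            if nearest is None or p["depth"] < nearest["depth"]:
                                nearest = p
                    if nearest is not None:
                        color[y][x] = nearest["color"]  # shade only the winner
                        shaded += 1
    return color, shaded


# The same kind of scene that produces an overdraw of 3 in immediate mode.
layers = [
    {"rect": (0, 0, 4, 4), "depth": 3.0, "color": "far"},
    {"rect": (0, 0, 4, 4), "depth": 2.0, "color": "mid"},
    {"rect": (0, 0, 4, 4), "depth": 1.0, "color": "near"},
]
image, shaded = render_tiled(layers, 4, 4)
```

Because the depth comparison happens before any shading, each of the 16 screen pixels is shaded exactly once regardless of the order in which the layers were submitted.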

Although there are many benefits to tile based rendering, there is one major problem: this form of rendering takes a drastically different approach, one that requires not only new drivers but also a drastically different chip design. This makes the transition to a tile based rendering system extremely difficult to implement. Take PowerVR, for example, which is finally realizing success with its 3rd generation tile based renderer after previous generations fell short of expectations and suffered from various bugs.

The other approach to reduce overdraw and make scene rendering more efficient is to utilize a more efficient form of the immediate mode rendering that we are all used to. The potential gains from optimizations and the like are very large, as a feature that provides less overdraw and fewer memory hits could finally rid us of the memory bottlenecks that plague video cards currently.

One manufacturer, ATI, has chosen to include such optimizations in their most recent product. The ATI Radeon series cards all include what is called HyperZ in order to make more efficient use of the immediate mode rendering process. The entire process is described in detail in our ATI Radeon SDR review, but in essence the optimizations allow for more efficient memory access. ATI claims that the three parts of HyperZ (Hierarchical Z, Z-Compression and Fast Z-Clear) combine to increase effective memory bandwidth by up to 20%.

Hierarchical Z is the part of ATI's HyperZ technology that most resembles a tile based rendering system. Hierarchical Z basically allows the pixel being rendered to be checked against the z-buffer before it actually hits the pipeline. This means that, much like a tile based rendering system, ATI's Radeon core can throw out pixels before actually having to render them. In our tests this feature noticeably increased performance, and we measured a gain of 22%.
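A rough sketch of the early rejection idea follows. It is not ATI's implementation, whose details are not described here: the sketch assumes opaque, axis-aligned rectangles at a constant depth that lie within the screen, keeps one coarse "farthest depth" value per block of pixels, and uses the made-up names `render_hier_z`, `coarse` and `blocks_rejected`.

```python
def render_hier_z(polygons, width, height, block=2):
    """Test depth before shading, rejecting whole blocks via a coarse z value."""
    far = float("inf")
    zbuf = [[far] * width for _ in range(height)]
    color = [[None] * width for _ in range(height)]
    rows, cols = (height + block - 1) // block, (width + block - 1) // block
    coarse = [[far] * cols for _ in range(rows)]  # farthest depth in each block
    shaded = 0
    blocks_rejected = 0
    for p in polygons:
        x0, y0, x1, y1 = p["rect"]  # assumed to lie inside the screen
        for by in range(y0 - y0 % block, y1, block):
            for bx in range(x0 - x0 % block, x1, block):
                cy, cx = by // block, bx // block
                if p["depth"] >= coarse[cy][cx]:
                    blocks_rejected += 1  # hidden everywhere in this block
                    continue
                ys = range(by, min(by + block, height))
                xs = range(bx, min(bx + block, width))
                for y in ys:
                    for x in xs:
                        inside = x0 <= x < x1 and y0 <= y < y1
                        if inside and p["depth"] < zbuf[y][x]:
                            zbuf[y][x] = p["depth"]
                            color[y][x] = p["color"]  # shade only after the z test
                            shaded += 1
                coarse[cy][cx] = max(zbuf[y][x] for y in ys for x in xs)
    return color, shaded, blocks_rejected


front_to_back = [
    {"rect": (0, 0, 4, 4), "depth": 1.0, "color": "near"},
    {"rect": (0, 0, 4, 4), "depth": 2.0, "color": "mid"},
    {"rect": (0, 0, 4, 4), "depth": 3.0, "color": "far"},
]
_, shaded_ftb, rejected_ftb = render_hier_z(front_to_back, 4, 4)
_, shaded_btf, rejected_btf = render_hier_z(list(reversed(front_to_back)), 4, 4)
```

In this sketch the saving depends on submission order: drawn front to back, the two farther layers are rejected block by block and only 16 pixels are shaded, while drawn back to front nothing can be rejected and all 48 pixels are shaded.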

The latest 3dfx drivers, version 1.04.01, seem to include some form of optimization that reduces overdraw in a manner similar to ATI's solution. Called HSR, this setting may be enabled with a small registry tweak.
