Depth and Stencil with Hyper Z HD

In accordance with their "High Definition Gaming" theme, ATI is calling the R420's method of handling depth and stencil processing Hyper Z HD. Depth and stencil processing is handled at multiple points throughout the pipeline, but grouping all this hardware into one block can make sense as each step along the way will touch the z-buffer (an on die cache of z and stencil data). We have previously covered other incarnations of Hyper Z which have done basically the same job. Here we can see where the Hyper Z HD functionality interfaces with the rendering pipeline:

The R420 architecture implements a hierarchical and early z type of occlusion culling in the rendering pipeline.

With early z, as data emerges from the geometry processing portion of the GPU, it is possible to skip further rendering large portions of the scene that are occluded (or covered) by other geometry. In this way, pixels that won't be seen don't need to run through the pixel shader pipelines and waste precious resources.

Hierarchical z indicates that large blocks of pixels are checked and thrown out if the entire tile is occluded. In R420, these tiles are the very same ones output by the geometry and setup engine. If only part of a tile is occluded, smaller subsections are checked and thrown out if possible. This processing doesn't eliminate all the occluded pixels, so pixels coming out of the pixel pipelines also need to be tested for visibility before they are drawn to the framebuffer. The real difference between R3xx and R420 is in the number of pixels that can be gracefully handled.

As rasterization draws nearer, the ATI and NVIDIA architectures begin to differentiate themselves more. Both claim that they are able to calculate up to 32 z or stencil operations per clock, but the conditions under which this is true are different. NV40 is able to push two z/stencil operations per pixel pipeline during a z or stencil only pass or in other cases when no color data is being dealt with (the color unit in NV40 can work with z/stencil data when no color computation is needed). By contrast, R420 pushes 32 z/stencil operations per clock cycle when antialiasing is enabled (one z/stencil operation can be completed per clock at the end of each pixel pipeline, and one z/stencil operation can be completed inside the multisample AA unit).

The different approaches these architectures take mean that each will excel in different ways when dealing with z or stencil data. Under R420, z/stencil speed will be maximized when antialiasing is enabled and will only see 16 z/stencil operations per clock under non-antialiased rendering. NV40 will achieve maximum z/stencil performance when a z/stencil only pass is performed regardless of the state of antialiasing.

The average case for NV40 will be closer to 16 z/stencil operations per clock, and if users don't run antialiasing on R420 they won't see more than 16 z/stencil operations per clock. Really, if everyone begins to enable antialiasing, R420 will begin to shine in real world situations, and if developers embrace z or stencil only passes (such as in Doom III), NV40 will do very well. The bottom line on which approach is better will be defined by the direction the users and developers take in the future. Will enabling antialiasing win out over running at ultra-high resolutions? Will developers mimic John Carmack and the intensive shadowing capabilities of Doom III? Both scenarios could play out simultaneously, but, really, only time will tell.

The Pixel Shader Engine A New Compression Scheme: 3Dc
POST A COMMENT

95 Comments

View All Comments

  • raks1024 - Monday, January 24, 2005 - link

    free ati x800: http://www.pctech4free.com/default.aspx?ref=46670 Reply
  • Ritalinkid - Monday, June 28, 2004 - link

    After reading almost all of the video cards reviews posted on anandtech I start to get the feeling the anandtech has a grudge against nvidia. The reviews seem to put nvidia down no matter what area they excel in. With leading openGL support, ps3.0 support, and the 6850 shadowing the x800 in directX, its seems like nvidia should not be counted out as the "best card."
    I would love to see a review that tested all the features that both cards offered especially if showed the games that would benefit the most from each cards features (if they are available). Maybe then could I decide which is better, or which could benefit me more.
    Reply
  • BlackShrike - Saturday, May 08, 2004 - link

    Hey if anyone is gonna be buying one of these new cards, would anyone want to sell their 9700 pro or 9800 por/Xt for like 100-150 bucks? If you do contact me at POT989@hotmail.com. Thanks. Reply
  • DonB - Saturday, May 08, 2004 - link

    No TV tuner on this card either? Will there be an "All-In-Wonder" version soon that will include it? Reply
  • xin - Friday, May 07, 2004 - link

    (my bad, I didn't notice that I was on the first page of the posts, and replied to a message there heh)

    Well, since everyone else is throwing their preferences out there... I guess I will too. My last 3 cards have been ATI cards (9700Pro & 9500Pro, and an 8500 "Pro"), and I have not been let down. Right at this moment I lean towards the x800XT.

    However, I am not concerned about power since I am running a TruePower550, and I will be interested in seeing what happens with all of this between now and the next 4-6 weeks when these cards actually come to market... and I will make my decision then on which card to buy.
    Reply
  • xin - Friday, May 07, 2004 - link


    Besides that, even if it were true (which it isn't), there is a world of difference between have *some* level of support, and requiring it. (*some* meaning the intial application of PS3.0 technology to games, that will likely be as sloppy as your first time in the back of a car with your first girlfriend).

    Game makers will not require PS3.0 support for a long long long time... because it would alienate the vast majority of the people out there, or at least for the time being any person who doesn't have a NV40 card.

    Some games may implement it and look slightly better, or even still look the same only run faster while looking the same.... but I would put money down that by the time PS3.0 usage in games comes anywhere close to mainstream, both mfg's will have their new, latest and greatest cards out, probably a 2 generations or more past these cards.
    Reply
  • xin - Friday, May 07, 2004 - link


    first of all... "alot of the upcoming topgames will support PS3.0!" ??? They will? Which ones exactly?
    Reply
  • Z80 - Friday, May 07, 2004 - link

    Good review. Pretty much tells me that I can select either Nvidia or ATI with confidence that I'm getting alot of "bang for my buck". However, my buck bang for video cards rarely exceeds $150 so I'm waiting for the new low to mid range cards before making a purchase. Reply
  • xin - Friday, May 07, 2004 - link


    I love how a handful of stores out there feel the need to rip people off by charing $500+ for the x800PRO cards, since the XT isn't available yet.

    Anyway, something interesting I noticed today:

    http://www.compusa.com/products/product_info.asp?p...

    http://www.compusa.com/products/product_info.asp?p...

    Notice the "expected ship date"... at least they have their pricing right.
    Reply
  • a2y - Friday, May 07, 2004 - link

    Trog, I Also agree, the thing is.. its true i do not have complete knowledge of deep details of video cards.. u see my current video card is now 1 year old (Geforce4 mx440) which is terrible for gaming (50fps and less) and some games actually do not support it (like deusEX 2). I wanted a card that would be future proof, every consumer would go thinking this way, I do not spend everything i earned, but to me and some others $400-$500 is O.K. If it means its going to last a bit longer.
    I especially worry about the technology used more than the other specs of the cards, more technologies mean future games are going to support it. I DO NOT know what i'v just said actually means, but I fealt it during the past few years and have been affected by it right now (like the deus ex 2 problem!) it just doesn't support it, and my card performs TERRIBLY in all games

    now my system is relatively slow for hardcore gaming:
    P4 2.4GHz - 512MB RDRAM PC800 - 533MHz FSB - 512KB L2 Cache - 128MB Geforce4 mx440 card.

    I wanted a big jump in performance especially in gaming so thats why i wanted the best card currently available.
    Reply

Log in

Don't have an account? Sign up now