The R420 Vertex Pipeline

The point of the vertex pipeline in any GPU is to take geometry data, manipulate it if needed (with either fixed-function processing or a vertex shader program), and project all of the 3D data in a scene onto two dimensions for display. The vertex pipeline can also eliminate unnecessary data to cut out useless work (via view volume clipping and backface culling). After the vertex engine is done processing the geometry, all of the 2D projected data is sent to the pixel engine for further processing (such as texturing and fragment shading).
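
To make the transform-and-project step concrete, here is a minimal CPU-side sketch of the work the vertex engine conceptually performs for each vertex. The matrix values, vertex position, and 1024x768 viewport are illustrative assumptions, not R420 specifics.

```cpp
#include <cstdio>
#include <cmath>

// Minimal sketch of the conceptual per-vertex work a vertex engine performs:
// transform by a combined model-view-projection matrix, test against the view
// volume, perspective-divide to normalized device coordinates, then map to
// the screen. All values are purely illustrative.
struct Vec4 { float x, y, z, w; };

// Column-major 4x4 matrix times column vector.
Vec4 transform(const float m[16], const Vec4& v) {
    return {
        m[0]*v.x + m[4]*v.y + m[8]*v.z  + m[12]*v.w,
        m[1]*v.x + m[5]*v.y + m[9]*v.z  + m[13]*v.w,
        m[2]*v.x + m[6]*v.y + m[10]*v.z + m[14]*v.w,
        m[3]*v.x + m[7]*v.y + m[11]*v.z + m[15]*v.w
    };
}

int main() {
    // Identity "MVP" keeps the example trivially verifiable.
    float mvp[16] = {1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1};
    Vec4 object_space = {0.5f, -0.25f, -0.5f, 1.0f};

    Vec4 clip = transform(mvp, object_space);

    // View-volume test: a vertex outside -w..w on any axis can be clipped
    // (triangles straddling the boundary get clipped against the frustum).
    bool inside = std::fabs(clip.x) <= clip.w &&
                  std::fabs(clip.y) <= clip.w &&
                  std::fabs(clip.z) <= clip.w;

    // Perspective divide gives normalized device coordinates in -1..1.
    float ndc_x = clip.x / clip.w, ndc_y = clip.y / clip.w;

    // Viewport transform maps NDC to a hypothetical 1024x768 screen.
    float sx = (ndc_x * 0.5f + 0.5f) * 1024.0f;
    float sy = (ndc_y * 0.5f + 0.5f) * 768.0f;

    printf("inside=%d screen=(%.1f, %.1f)\n", inside, sx, sy);
    return 0;
}
```

Backface culling operates on whole triangles rather than single vertices (a signed-area/winding test on the projected positions), so it is omitted from this sketch.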

The vertex engine of R420 includes six vertex pipelines in total (R3xx has four). This gives R420 a 50% increase in peak vertex shader throughput per clock cycle.
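
As a rough back-of-the-envelope illustration of what the extra pipelines buy, assuming a hypothetical 500 MHz core clock and one vertex issued per pipeline per clock (neither of which is an ATI-published figure):

```cpp
#include <cstdio>

int main() {
    // Hypothetical figures for illustration only: one vertex per pipeline
    // per clock, at an assumed 500 MHz core clock.
    const double clock_hz   = 500e6;
    const int    r3xx_pipes = 4;
    const int    r420_pipes = 6;

    double r3xx_peak = r3xx_pipes * clock_hz;   // vertices per second
    double r420_peak = r420_pipes * clock_hz;

    printf("R3xx peak: %.0f Mverts/s\n", r3xx_peak / 1e6);
    printf("R420 peak: %.0f Mverts/s\n", r420_peak / 1e6);
    printf("Per-clock gain: %.0f%%\n",
           100.0 * (r420_pipes - r3xx_pipes) / r3xx_pipes);
}
```

Real throughput depends on shader length and how well the pipelines stay fed, so these are peak numbers only.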

Looking inside an individual vertex pipeline, not much has changed from R3xx. The pipeline is laid out exactly the same, including a 128-bit vector math unit and a 32-bit scalar math unit. The major upgrade from R3xx is that R420 can now execute a SINCOS instruction in a single clock cycle. Previously, if a developer requested the sine or cosine of a number in a vertex shader program, R3xx would compute a Taylor series approximation of the answer (which takes multiple cycles to complete). The adoption of a single-cycle SINCOS instruction by ATI is a smart move, as trigonometric computations are useful in implementing functionality and effects attractive to developers. As an example, developers could manipulate the vertices of a surface with SINCOS in order to add ripples and waves (such as those seen in bodies of water). Sine and cosine computations are also useful in more basic geometric manipulation. Overall, single-cycle SINCOS computation is a welcome addition to R420.
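
The difference is easy to illustrate. The first function below is the kind of truncated Taylor series a shader would otherwise evaluate across several multiply-add instructions, while the second shows the sort of per-vertex ripple displacement a single-cycle SINCOS makes cheap. The coefficients, amplitude, and frequency are illustrative assumptions.

```cpp
#include <cstdio>
#include <cmath>

// Truncated Taylor series for sin(x) around 0. Without a native SINCOS
// instruction, a shader needs several multiply-adds (and thus several
// cycles) to evaluate an approximation like this.
float sin_taylor(float x) {
    float x2 = x * x;
    // sin(x) ~= x - x^3/3! + x^5/5! - x^7/7!, in nested (Horner) form.
    return x * (1.0f - x2 / 6.0f * (1.0f - x2 / 20.0f * (1.0f - x2 / 42.0f)));
}

// Illustrative ripple: displace the height of a grid vertex with a sine wave
// based on its distance from the origin, the sort of effect a vertex shader
// could produce cheaply with a single-cycle SINCOS.
float ripple_height(float x, float z, float time) {
    float dist = std::sqrt(x * x + z * z);
    const float amplitude = 0.1f, frequency = 8.0f, speed = 2.0f;
    return amplitude * std::sin(frequency * dist - speed * time);
}

int main() {
    printf("sin_taylor(0.5) = %f, std::sin(0.5) = %f\n",
           sin_taylor(0.5f), std::sin(0.5f));
    printf("ripple at (1,1,t=0) = %f\n", ripple_height(1.0f, 1.0f, 0.0f));
    return 0;
}
```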

So how does ATI's new vertex pipeline layout compare to NV40? At a major hardware "black box" level, ATI lacks the vertex texture unit featured in NV40 that is required for Shader Model 3.0's vertex texturing support. Vertex texturing allows developers to easily implement any effect that benefits from letting texture data manipulate geometry (such as displacement mapping). The other major difference between R420 and NV40 is feature set support. As has been widely discussed, NV40 supports Shader Model 3.0 and all the bells and whistles that come along with it. R420's feature set can be described as an extended version of Shader Model 2.0, offering a few features above and beyond the R3xx line (including support for longer shader programs and more registers).
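
As a rough sketch of why vertex texturing matters, here is a CPU-side illustration of displacement mapping: each vertex samples a height value and is pushed along its normal. On Shader Model 3.0 hardware that lookup could be an actual texture fetch inside the vertex shader; the height function and scale factor below are stand-ins.

```cpp
#include <cstdio>
#include <cmath>

struct Vertex { float px, py, pz;   // position
                float nx, ny, nz;   // unit normal
                float u, v; };      // texture coordinates

// Stand-in for a height-map texture fetch; a real implementation would
// sample an actual texture, which is what SM3.0 vertex texturing allows
// directly from the vertex shader.
float sample_height(float u, float v) {
    return 0.5f + 0.5f * std::sin(6.28318f * u) * std::cos(6.28318f * v);
}

// Displacement mapping: push each vertex along its normal by the sampled
// height, scaled by an artist-chosen factor.
Vertex displace(Vertex in, float scale) {
    float h = sample_height(in.u, in.v) * scale;
    in.px += in.nx * h;
    in.py += in.ny * h;
    in.pz += in.nz * h;
    return in;
}

int main() {
    Vertex v = {0.0f, 0.0f, 0.0f,  0.0f, 1.0f, 0.0f,  0.25f, 0.0f};
    Vertex out = displace(v, 0.2f);
    printf("displaced y = %f\n", out.py);
    return 0;
}
```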

What all this boils down to is that we are only seeing what looks like a slight massaging of the hardware from R300 to R420. We would probably see many more changes if we were able to peer deeper under the hood. From a functionality standpoint, it is sometimes hard to see where performance comes from, but (as we will see even more from the pixel pipeline) as graphics hardware evolves into multiple tiny CPUs laid out in parallel, performance will be affected by factors traditionally only spoken of in CPU analysis and reviews. The total number of internal pipeline stages (rather than our high-level, functionality-driven pipeline), cache latencies, the size of the internal register file, the number of instructions in flight, the number of cycles an instruction takes to complete, and branch prediction will all come heavily into play in the future. In fact, this review marks the true beginning of where we will see these factors (rather than general functionality and "computing power") determine the performance of a generation of graphics products. But more on this later.

After leaving the vertex engine portion of R420, data moves into the setup engine. This section of the hardware takes the 2D projected data from the vertex engine, generates triangles and point sprites (particles), and partitions the output for use in the pixel engine. The triangle output is divided into tiles, each of which is sent to a block of four pixel pipelines (called a quad pipeline by ATI). These tiles are simply square blocks of projected pixel data and have nothing to do with "tile-based rendering" (front-to-back rendering of small portions of the screen at a time) as seen in PowerVR's Kyro series of GPUs.
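
The idea of partitioning screen space into tiles for the quad pipelines can be sketched as follows; the 16x16 tile size and checkerboard-style assignment are assumptions for illustration, not ATI's documented scheme.

```cpp
#include <cstdio>

int main() {
    // Illustrative only: carve a 1024x768 screen into 16x16-pixel tiles and
    // assign each tile to one of four quad pipelines in a checkerboard-style
    // pattern, so the work spreads evenly across the quads.
    const int width = 1024, height = 768, tile = 16, quads = 4;
    int tiles_x = width / tile, tiles_y = height / tile;

    int work_per_quad[quads] = {0};
    for (int ty = 0; ty < tiles_y; ++ty)
        for (int tx = 0; tx < tiles_x; ++tx) {
            int quad = (tx + ty) % quads;   // which quad pipeline gets this tile
            ++work_per_quad[quad];
        }

    for (int q = 0; q < quads; ++q)
        printf("quad pipeline %d: %d tiles\n", q, work_per_quad[q]);
    return 0;
}
```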

Now we're ready to see what happens on the per-pixel level.

Comments

  • NullSubroutine - Thursday, May 6, 2004 - link

    Trog, I agree with you for the most part, but there are some people who can use upgrades. I myself have bought expensive video cards in the past. I got the GeForce3 right when it came out (in a top-of-the-line Alienware system for 1400 bucks), and it lasted me 2-3 years. Now if someone spends 400-500 bucks on a video card that lasts them that long (2-3 years), it's no different than if someone buys a 200-buck video card every year. I am one of those people who likes to buy new components when computing speed doubles, and if I have the money I'll get what I can that will last me the longest. If I can't afford top of the line, I'll get something that will get me by (a 9500 Pro was the last card I bought, for 170 over a year ago).

    However, I do agree with you that people upgrading to the best every generation is silly.
  • TrogdorJW - Thursday, May 6, 2004 - link

    I'm sorry, but I simply have to laugh at anyone going on and on about how they're going to run out and buy the latest graphics cards from ATI or Nvidia right now. $400 to $500 for a graphics card is simply too much (and it's too much for a CPU as well). Besides, unless you have some dementia that requires you to run all games at 1600x1200 with 4xAA and 8xAF, there's very little need for either the 6800 Ultra or the X800 XT right now. Relax, take a deep breath, save some money, and forget about the pissing contest.

    So, is it just me, or is there an inverse relationship between how much a person spends on computer hardware and their actual knowledge of computers? I have a coworker who is always spending money on upgrading his PC, and he really has no idea what he's doing. He went from an Athlon XP 2800+ (OC'ed to 2.4 GHz) to a P4 2.8 OC'ed to 3.7 GHz. He also went from a 9800 Pro 256 to a 9800 XT. In the past, he also had a GeForce FX 5900 Ultra. He tries to overclock all of his systems; they sound like jet engines, and none of them are actually fully stable. In the last year, he has spent roughly $5000 on computer parts (although he has sold off some of the "old" parts like the 5900 Ultra). Performance of his system has probably improved by about 25% over the course of the year.

    Sorry for the rant, but behavior like that from *anybody* is just plain stupid. He's gone from 120 FPS in some games up to 150 FPS. Anyone here actually think he can tell the difference? I suppose it goes without saying that he's constantly crowing about his 3DMark scores. Now he's all hot to go out and buy the X800 XT cards, and he's been asking me when they'll be in stores. Like I care. They're nice cards, I'm sure, but why buy them before you actually have a game that needs the added performance?

    His current games du jour? Battlefield 1942 and Battlefield Vietnam. Yeah... those really need a high-performance DX9 card. The 80+ FPS of the 9800 XT he has just isn't cutting it.

    So, if you read my description of this guy and think I'm way off base, go get your head examined. Save your money, because some day down the road you will be glad that you didn't spend everything you earned on computer parts. Enjoy life, sure, but having a faster car, faster computer, bigger house, etc. than someone else is worth pretty much jack and shit when it all comes down to it.

    /Rant. :D
  • a2y - Thursday, May 6, 2004 - link

    If a new card is going to come out every few weeks, then how do you guys choose which to buy?

    ATI has a trade-up section for old cards; is that any good?
  • gxshockwav - Thursday, May 6, 2004 - link

    Um...what happened to the posting of new Ge6 6850 benchmark numbers?
  • NullSubroutine - Thursday, May 6, 2004 - link

    Trog, it's good to hear you were being nice, but I wasn't bashing THG, I love that site (besides this one) and I get a lot of my tech info from there.

    What I normally do, though, is take benchmarks from different sites, put them in Excel, make a little graph, and see the percentage-point differences between the tests. If you plan on buying a new video card, it's important to find out whether the Nvidia or ATI card is faster on your type of system.

    And from what I found, the AMD system from Atech performed better with Nvidia, and the Intel system performed better with ATI from THG (Farcry and Unreal2004 were the only ones with somewhat similar tests).

    #61 How much money did ATI spend developing the R3xx line? I would venture to say a decent amount... sometimes companies invest more money in a design, then refine it several times (at less cost) before starting from scratch again. ATI and Nvidia have done this for quite a while. Also, from what I've heard, the R3xx had the possibility of 16 pipes to begin with... is this true, anyone?

    Texture memory beyond 256MB doesn't really matter right now because of the insane bandwidth that 8x AGP has to offer; however, 512MB may come in handy after Doom 3 comes out, since it uses shitloads of high-res textures instead of high polygon counts for a lot of detail. I don't see 512MB cards coming out for a little while, especially with RAM prices.
  • deathwalker - Thursday, May 6, 2004 - link

    Well... once again, someone is lying through their teeth. What happened to the $399 entry price of the Pro model? The cheapest price on Pricewatch is $478. Someone trying to cash in on the new-buyer hysteria? I am impressed, though, with ATI's ability to step up to the plate and steal Nvidia's thunder.
  • a2y - Thursday, May 6, 2004 - link

    OMG OMG!! I almost went and built a new system with the latest specs and graphics card! I was going for the nVidia 6800 Ultra, until just now I decided to check for news from ATI and discovered their new card!

    Man, if ATI and nVidia are going to bring out a card every 2/3 weeks, then I'll never be able to build this system!!!

    Being a (pre-)fan of Half-Life 2, I guess I'm going to wait until it's released to buy a graphics card (meaning when we all die and go to hell).
  • remy - Wednesday, May 5, 2004 - link

    For the OpenGL vs D3D performance argument don't forget to take a look at Homeworld2 as it is an OpenGL game. ATI's hardware certainly seems to have come a long way since the 9700 Pro in that game!
  • TrogdorJW - Wednesday, May 5, 2004 - link

    NullSubroutine - It was meant as nice sarcasm, more or less. No offense intended. (I was also trying to head off this thread becoming a "THG sucks blah blah blah" tangent, as many in the past have done when someone mentions their reviews.)

    My basic point (without doing a ton of research) is that pretty much every hardware site has its own demos that it uses for benchmarking. Given that the performance difference between the ATI and Nvidia cards was relatively constant (I think), it's generally safe to assume that the levels, setup, bots, etc. are not the same when you see differing scores. Now, if you see two places using the same demo and the same system setup, and there's a big difference, then you can worry. I usually don't bother comparing benchmark numbers from two different sites since they are almost never from the same configuration.
