The Pipeline Overview

First, let us take a second to run through NVIDIA's architecture in general. DirectX or OpenGL commands and HLSL and GLSL shaders are translated and compiled for the architectures. Commands and data are sent to the hardware where we go from numbers, instructions and artwork to a rendered frame.

The first major stop along the way is the vertex engine where geometry is processed. Vertices can be manipulated using math and texture data, and the output of the vertex pipelines is passed on down the line to the fragment (or pixel) engine. Here, every pixel on the screen is processed based on input from the vertex engine. After the pixels have been processed for all the geometry, the final scene must be assembled based on color and z data generated for each pixel. Anti-aliasing and blending are done into the framebuffer for final render output in what NVIDIA calls the render output pipeline (ROP). Now that we have a general overview, let's take a look at the G70 itself.



The G70 GPU is quite a large IC. Weighing in at 302 million transistors, we would certainly hope that NVIDIA packed enough power in the chip to match its size. The 110nm TSMC process will certainly help with die size, but that is quite a few transistors. The actual die area is only slightly greater than NV4x. In fact, NVIDIA is able to fit the same number of ICs on a single wafer.



A glance at a block diagram of the hardware gives us a first look at the methods by which NVIDIA increased performance this time around.



The first thing to notice is that we now have 8 (up from 6) vertex pipelines. We still aren't vertex processing limited (except in the workstation market), but this 33% upgrade in vertex power will help to keep the extra pixel pipelines fed as well as handle any added vertex load developers try to throw at games in the near future. There are plenty of beautiful things that can be done with vertex shaders that we aren't seeing come about in games yet like parallax and relief mapping as well as extended use of geometry instancing and vertex texturing.

Moving on to pixel pipelines, we see a 50% increase in the number of pipelines packed under the hood. Each of the 24 pixel pipes is also more powerful than those of NV4x. We will cover just why that is a little later on. For now though, it is interesting to note that we do not see an increase in the 16 ROPs. These pipelines take the output of the fragment crossbar (which aggregates all of the pixel shader output) and finalizes the rendering process. It is here where MSAA is performed, as well as the color and z/stencil operations. Not matching the number of ROPs to the number of pixel pipelines indicates that NVIDIA feels its fill rate and ability to handle current and near future resolutions is not an issue that needs to be addressed in this incarnation of the GeForce. As NVIDIA's UltraShadow II technology is driven by the hardware's ability to handle twice as many z operations per clock when a z only pass is performed, this also means that we won't see improved performance in this area.

If NVIDIA is correct in their guess (and we see no reason they should be wrong), we will see increasing amounts of processing being done per pixel in future titles. This means that each pixel will spend more time in the pixel pipeline. In order to keep the ROPs busy in light of a decreased output flow from a single pixel pipe, the ratio of pixel pipes to ROPs can be increased. This is in accord with the situation we've already described.

ROPs will need to be driven higher as common resolutions increase. This can also be mitigated by increases in frequency. We will also need more ROPs as the number pixel pipelines are able to saturate the fragment crossbar in spite of the increased time a pixel spends being shaded.

Index No More Memory Bandwidth
POST A COMMENT

127 Comments

View All Comments

  • VIAN - Wednesday, June 22, 2005 - link

    "NVIDIA sees texture bandwidth as outweighing color and z bandwidth in the not too distant future." That was a quote from the article after saying that Nvidia was focusing less on Memory Bandwidth.

    Do these two statements not match or is there something I'm not aware of.
    Reply
  • obeseotron - Wednesday, June 22, 2005 - link

    These benchmarks are pretty clearly rushed out and wrong, or at least improperly attributed to the wrong hardware. SLI 6800 show up faster than SLI 7800's in many benchmarks, in some cases much more than doubling single 6800 scores. I understand NDAs suck with the limited amount of time to produce a review, but I'd rather it have not been posted until the afternoon than ignore the benchmarks section. Reply
  • IronChefMoto - Wednesday, June 22, 2005 - link

    #28 -- Mlittl3 can't pronounce Penske or terran properly, and he's giving out grammar advice? Sad. ;) Reply
  • SDA - Wednesday, June 22, 2005 - link

    QUESTION

    Okay, allcaps=obnoxious. But I do have a question. How was system power consumption measured? That is, was the draw of the computer at the wall measured, or was the draw on the PSU measured? In other words, did you measure how much power the PSU drew from the wall or how much power the components drew from the PSU?
    Reply
  • Aikouka - Wednesday, June 22, 2005 - link

    Wow, I'm simply amazed. I said to someone as soon as I saw this "Wow, now I feel bad that I just bought a 6800GT ... but at least they won't be available for 1 or 2 months." Then I look and see that retailers already have them! I was shocked to say the least. Reply
  • RyDogg1 - Wednesday, June 22, 2005 - link

    But my question was "who," was buying them. I'm a hardware goon as much as the next guy, but everyone knows that in 6-12 months, the next gen is out and price is lower on these. I mean the benches are presenting comparisons with cards that according to the article are close to a year old. Obviously some sucker lays down the cash because the "premium," price is way too high for a common consumer.

    Maybe this one of the factors that will lead to the Xbox360/PS3 becoming the new gaming standard as opposed to the Video Card market pushing the envelope.
    Reply
  • geekfool - Wednesday, June 22, 2005 - link

    What no Crossfire benchies? I guess they didn't wany Nvidia to loose on their big launch day. Reply
  • Lonyo - Wednesday, June 22, 2005 - link

    The initial 6800U's cost lots because of price gouging.
    They were in very limited supply, so people hiked up the prices.
    The MSRP of these cards is $600, and they are available.
    MSRP of the 6800U's was $500, the sellers then inflated prices.
    Reply
  • Lifted - Wednesday, June 22, 2005 - link

    #24: In the Wolfenstein graph they obviously reversed the 7800 GTX SLI with the Radeon.

    They only reveresed a couple of labels here and there, chill out. It's still VERY OBVIOUS which card is which just by looking at the performance!

    WAKE UP SLEEPY HEADS.
    Reply
  • mlittl3 - Wednesday, June 22, 2005 - link

    Derek,

    I know this article must have been rushed out but it needs EXTREME proofreading. As many have said in the other comments above, the results need to be carefully gone over to get the right numbers in the right place.

    There is no way that the ATI card can go from just under 75 fps at 1600x1200 to over 100 fps at 2048x1535 in Enemy Territory.

    Also, the Final Words heading is part of the paragraph text instead of a bold heading above it.

    There are other grammatical errors too but those aren't as important as the erroneous data. Plus, a little analysis of each of the benchmark results for each game would be nice but not necessary.

    Please go over each graph and make sure the numbers are right.
    Reply

Log in

Don't have an account? Sign up now