No More Shader Replacement

The secret is all in compilation and scheduling. Now that NVIDIA has had more time to work with scheduling and profiling code on an already efficient and powerful architecture, they have an opportunity. This generation, rather than build a compiler to fit hardware, they were able to take what they've learned and build their hardware to better fit a mature compiler already targeted to the architecture. All this leads up to the fact that the 7800 GTX with current drivers does absolutely no shader replacement. This is quite a big deal in light of the fact that, just over a year ago, thousands of shaders were stored in the driver ready for replacement on demand in NV3x and even NV4x. It's quite an asset to have come this far with hardware and software in the relatively short amount of time NVIDIA has spent working with real-time compilation of shader programs.

All these factors come together to mean that the hardware is busy more of the time. And getting more things done faster is what it's all about.

So, NVIDIA is offering a nominal increase in clock speed to 430MHz, just a little more memory bandwidth (a 256-bit memory bus running at a 1.2GHz data rate), 1.33x vertex pipelines, 1.5x pixel pipelines, and various increases in efficiency. These all work together to give us as much as double the performance in extreme cases. If the performance increase can actually be realized, we are looking at a pretty decent speed increase over the 6800 Ultra. Obviously, in the real world we won't be seeing a twofold performance increase in anything but a bad benchmark. In cases where games are CPU limited, we will likely see a much lower increase in performance, but performance double that of the 6800 Ultra is entirely possible in very shader limited games.
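To put those figures in perspective, here is a rough back-of-the-envelope sketch of what the quoted specs imply on paper. The bus width, data rate, clock speed and pipeline ratios come from the paragraph above; the 400MHz core clock for the 6800 Ultra is our assumption for the comparison, and the function names are ours. Treat the results as theoretical ceilings, not measurements.

```python
# Back-of-the-envelope figures from the numbers quoted above.
BUS_WIDTH_BITS = 256          # memory bus width, from the text
DATA_RATE_GHZ = 1.2           # effective memory data rate, from the text
CORE_CLOCK_MHZ = 430          # 7800 GTX core clock, from the text
PREV_CORE_CLOCK_MHZ = 400     # ASSUMPTION: 6800 Ultra reference core clock
VERTEX_PIPE_RATIO = 1.33      # vertex pipeline increase, from the text
PIXEL_PIPE_RATIO = 1.5        # pixel pipeline increase, from the text


def memory_bandwidth_gb_per_s(bus_width_bits, data_rate_ghz):
    """Peak bandwidth: bytes per transfer times billions of transfers per second."""
    return bus_width_bits / 8 * data_rate_ghz


def peak_scaling(pipe_ratio, clock_mhz, prev_clock_mhz):
    """Paper scaling from extra pipelines and a higher clock, ignoring efficiency."""
    return pipe_ratio * clock_mhz / prev_clock_mhz


bandwidth = memory_bandwidth_gb_per_s(BUS_WIDTH_BITS, DATA_RATE_GHZ)
pixel_scaling = peak_scaling(PIXEL_PIPE_RATIO, CORE_CLOCK_MHZ, PREV_CORE_CLOCK_MHZ)
vertex_scaling = peak_scaling(VERTEX_PIPE_RATIO, CORE_CLOCK_MHZ, PREV_CORE_CLOCK_MHZ)

print(f"Peak memory bandwidth: {bandwidth:.1f} GB/s")
print(f"Paper pixel scaling:   {pixel_scaling:.2f}x")
print(f"Paper vertex scaling:  {vertex_scaling:.2f}x")
```

Under these assumptions, extra pipelines and clock speed alone account for roughly a 1.6x gain in pixel throughput, so anything approaching double the performance would have to come from the efficiency improvements rather than from raw unit counts.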

In fact, Epic reports that under certain Unreal Engine 3 tests they currently see 2x to 2.4x improvements in framerate over the 6800 Ultra. Of course, UE3 is not finished yet and there won't be games out based on the engine for a while. We don't usually like reporting performance numbers from software that hasn't been released, but even if these numbers are higher than we will see in a shipping product, it seems that NVIDIA has at least gotten it right for one developer's technology. We are very interested in seeing how next generation games will perform on this hardware. If we can trust these numbers at all, it looks like the performance advantage will only get better for the GeForce 7800 GTX until Windows Graphics Foundation 2.0 comes along and inspires new techniques beyond SM3.0 capabilities.

Right now, for each triangle that gets fed through the vertex pipeline, there are many pixels inside that object that need processing.

Bringing It All Together

Why didn't NVIDIA build a part with unified shaders?

Every generation, NVIDIA evaluates alternative architectures, but at this time they don't feel that a unified architecture is a good match to the current PC landscape. We will eventually see a unified shader architecture from NVIDIA, but it will not likely be until DirectX itself is focused around a unified shader architecture. At this point, vertex hardware doesn't need to be as complex or intricate as the pixel pipeline. As APIs develop more and more complex functionality it will be advantageous for hardware developers to move towards a more generic and programmable shader unit that can easily adapt to any floating point processing need.

As pixel processing is currently more important than vertex processing, NVIDIA is separating the two in order to focus attention where it is due. Making hardware more generic usually makes it necessarily slower, but explicitly targeting a specific aspect of something can often improve performance a great deal.
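The trade-off described above can be sketched with a toy model. Everything here is hypothetical: the workloads are arbitrary work units rather than measurements, the 8/24 split of units and the 80% efficiency factor for generic units are our assumptions, and the function names are ours. The point is only to show how a dedicated design can win on pixel-heavy work while a unified design wins when the balance shifts.

```python
def split_frame_time(vertex_work, pixel_work, vertex_units, pixel_units):
    """Dedicated units: the slower of the two stages sets the frame time."""
    return max(vertex_work / vertex_units, pixel_work / pixel_units)


def unified_frame_time(vertex_work, pixel_work, total_units, efficiency):
    """Unified units share all work, scaled by a per-unit efficiency factor."""
    return (vertex_work + pixel_work) / (total_units * efficiency)


# HYPOTHETICAL workloads in arbitrary work units, not measurements.
pixel_heavy = {"vertex_work": 240, "pixel_work": 2400}
vertex_heavy = {"vertex_work": 1200, "pixel_work": 1200}

VERTEX_UNITS, PIXEL_UNITS = 8, 24   # ASSUMPTION: how 32 units are split
GENERIC_EFFICIENCY = 0.8            # ASSUMPTION: generic units run at 80% speed

for name, work in (("pixel-heavy", pixel_heavy), ("vertex-heavy", vertex_heavy)):
    split = split_frame_time(**work, vertex_units=VERTEX_UNITS,
                             pixel_units=PIXEL_UNITS)
    unified = unified_frame_time(**work, total_units=VERTEX_UNITS + PIXEL_UNITS,
                                 efficiency=GENERIC_EFFICIENCY)
    print(f"{name}: split={split:.1f}, unified={unified:.1f}")
```

With these made-up numbers, the dedicated design is slightly faster on the pixel-heavy workload and clearly slower on the vertex-heavy one, which is consistent with the argument that a split design makes sense only while pixel work dominates.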

When WGF 2.0 comes along and geometry shaders are able to dynamically generate vertex data inside the GPU, we will likely see an increased burden on vertex processing as well. Being able to programmatically generate vertex data will help to remove the burden on the system to supply all the model data to the GPU.


127 Comments


  • BenSkywalker - Wednesday, June 22, 2005 - link

    Derek-

    I wanted to offer my utmost thanks for the inclusion of 2048x1536 numbers. As one of the fairly sizeable group of owners of a 2070/2141 these numbers are enormously appreciated. As everyone can see 1600x1200x4x16 really doesn't give you an idea of what high resolution performance will be like. As far as the benches getting a bit messed up- it happens. You moved quickly to rectify the situation and all is well now. Thanks again for taking the time to show us how these parts perform at real high end settings.
  • blckgrffn - Wednesday, June 22, 2005 - link

    You're forgiven, by me anyway :) It is also the great editorial staff that makes Anandtech my homepage on every browser on all of my boxes!

    Nat
  • yacoub - Wednesday, June 22, 2005 - link

    #72 - Totally agree. Some Rome: Total War benchmarks are much needed - but primarily to see how the game's battle performance with large numbers of troops varies between AMD and Intel more so than NVidia and ATi, considering the game is highly CPU-limited currently, in my understanding.
  • DerekWilson - Wednesday, June 22, 2005 - link

    Hi everyone,

    Thank you for your comments and feedback.

    I would like to personally apologize for the issues that we had with our benchmarks today. It wasn't just one link in the chain that caused the problems we had, but there were many factors that led to the results we had here today.

    For those who would like an explanation of what happened to cause certain benchmark numbers not to reflect reality, we offer you the following. Some of our SLI testing was done forcing multi-GPU rendering on for tests where there was no profile. In these cases, the default multi-GPU mode caused a performance hit rather than the increase we are used to seeing. The issue was especially bad in Guild Wars and the SLI numbers have been removed from offending graphs. Also, on one or two titles our ATI display settings were improperly configured. Our Windows monitor properties, ATI "Display" tab properties, and refresh rate override settings were mismatched. As a result, rather than push the display at the pixel clock we expected, ATI defaulted to a "safe" mode where the game is run at the resolution requested, but only part of the display is output to the screen. This resulted in abnormally high numbers in some cases at resolutions above 1600x1200.

    For those of you who don't care about why the numbers ran the way they did, please understand we are NOT trying to hide behind our explanation as an excuse.

    We agree completely that the more important issue is not why bad numbers popped up, but that bad numbers made it into a live article. For this I can only offer my sincerest of apologies. We consider it our utmost responsibility to produce quality work on which people may rely with confidence.

    I am proud that our readership demands a quality above and beyond the norm, and I hope that that never changes. Everything in our power will be done to assure that events like this will not happen again.

    Again, I do apologize for the erroneous benchmark results that went live this morning. And thank you for requiring that we maintain the utmost integrity.

    Thanks,
    Derek Wilson
    Senior CPU & Graphics Editor
    AnandTech.com
  • Dmitheon - Wednesday, June 22, 2005 - link

    I have to say, while I am extremely pleased with nVidia doing a real launch, the product leaves me scratching my head. They priced themselves into an extremely small market, and effectively made their 6800 series the second tier performance cards without really dropping the price on them. I'm not going to get one, but I do wonder how this will affect the company's bottom line.
  • OrSin - Wednesday, June 22, 2005 - link

    I'm not trying to be a butthole, but can we get a benchmark that's an RTS game? I see 10+ game benchmarks and most are FPS; the few that are not might as well be. Those RPGs seem to use a similar type of engine.
  • stmok - Wednesday, June 22, 2005 - link

    To CtK's question: Nope, SLI doesn't work with dual-display. (Last I checked, Nvidia got 2D working, but NO 3D)... Rumours say it's a driver issue, and Nvidia is working on it.

    I don't know any more than that. I think I'd rather wait until Nvidia are actually demonstrating SLI with dual or more displays, before I lay down any money.
  • yacoub - Wednesday, June 22, 2005 - link

    #60 - it's already to the point where it's turning people off to PC gaming, thus damaging the company's own market of buyers. It's just going to move more people to consoles, because even though PC games are often better games and much more customizable and editable, that only means so much and the trade-off versus price to play starts to become too imbalanced to ignore.
  • jojo4u - Wednesday, June 22, 2005 - link

    What was the AF setting? I understand that it was set to 8x when AA was set to 4x?
  • Rand - Wednesday, June 22, 2005 - link

    I have to say I'm rather disappointed in the quality of the article. A number of apparently nonsensical benchmark results, with little to no analysis of most of the results.

    A complete lack of any low level theoretical performance results, no attempts to measure any improvements in efficiency or what may have caused such improvements.

    Temporal AA is only tested on one game with image quality examined in only one scene. Given how dramatically different games and genres utilize alpha textures, you're providing us with an awfully limited perspective of its impact.
