What's Wrong with NVIDIA?

Getting to the meat of the problem: how can it be that NVIDIA performs so poorly in the native DirectX 9 code path and does better, but not dramatically so, in its own special "mixed mode"? To understand why, we have to look at the modifications that Valve made to the NV3x code path. Taken directly from Gabe Newell's presentation, here are the three major changes that were made:

Special Mixed Mode for NV3x
- Uses partial-precision registers where appropriate
- Trades off texture fetches for pixel shader instruction count (this is actually backwards, read further to learn more)
- Case-by-case shader code restructuring

So the first change that was made is to use partial-precision registers where appropriate. Well, what does that mean? As we've mentioned in previous articles, NVIDIA's pixel shading pipelines can operate on either 16-bit or 32-bit floating point numbers, with the 32-bit floats providing greater precision. Just like on a CPU, the FPUs in the pixel shader units have a fixed number of local storage locations known as registers. Think of a register as nothing more than a place to store a number. With the NV3x architecture, each register can either hold one 32-bit floating point value or be used as two 16-bit floating point registers. Thus, when operating in 16-bit (aka partial precision) mode, you get twice as many usable registers as when you're running in 32-bit mode.
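As a rough illustration of the two points above (two 16-bit values fit in the space of one 32-bit value, and the 16-bit format is less precise), here is a minimal Python sketch. It uses the standard `struct` module's IEEE half-precision and single-precision formats as a stand-in; the helper names are ours, and nothing here models the actual NV3x register file.

```python
import struct

def round_trip(value, fmt):
    """Store a Python float in the given binary format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

def to_half(value):
    # "<e" is IEEE 754 half precision (16 bits).
    return round_trip(value, "<e")

def to_single(value):
    # "<f" is IEEE 754 single precision (32 bits).
    return round_trip(value, "<f")

# One 32-bit slot and two packed 16-bit values occupy the same 4 bytes.
one_full = struct.pack("<f", 0.1)
two_partial = struct.pack("<ee", 0.1, 0.7)
print(len(one_full), len(two_partial))

# The price of the smaller format is a larger rounding error.
print("fp16 error:", abs(to_half(0.1) - 0.1))
print("fp32 error:", abs(to_single(0.1) - 0.1))
```

Whether the lost precision is visible depends on the shader; that is presumably why the slide says "where appropriate."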

Note that using 32-bit floating point numbers doesn't increase the amount of memory bandwidth you're using. It simply means that you're cutting down the number of physical registers to which your pixel shader FPUs have access. What happens if you run out of registers? After running out of registers, the functional units (FPUs in this case) must swap data in and out of the graphics card's local memory (or caches), which takes a significantly longer time - causing stalls in the graphics pipeline or underutilization of the full processing power of the chip.
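The register-pressure argument can be sketched with a toy model. The register count and the number of live values below are made-up figures chosen only to show the effect; they are not NV3x specifications, and real hardware handles register shortages in ways this sketch does not capture.

```python
# Hypothetical figure for illustration only; not a real NV3x number.
PHYSICAL_REGISTERS = 16

def register_slots(physical_registers, bits):
    """Usable slots: one per register at 32 bits, two per register at 16 bits."""
    if bits not in (16, 32):
        raise ValueError("only 16-bit and 32-bit modes are modeled")
    return physical_registers * (32 // bits)

def spilled_values(live_values, physical_registers, bits):
    """How many live values do not fit in registers and must go elsewhere."""
    return max(0, live_values - register_slots(physical_registers, bits))

# A shader that keeps 20 temporary values alive at once (also an assumption).
for bits in (32, 16):
    slots = register_slots(PHYSICAL_REGISTERS, bits)
    spills = spilled_values(20, PHYSICAL_REGISTERS, bits)
    print(f"{bits}-bit mode: {slots} slots, {spills} values spilled")
```

In this model the same shader spills in 32-bit mode but fits entirely in 16-bit mode, which is the kind of difference that would show up as stalls in one mode and not the other.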

The fact that performance increased when moving to partial-precision (16-bit) registers suggests that NVIDIA's NV3x chips may have fewer usable physical registers than ATI's R3x0 series. If we're correct, this is a tradeoff that NVIDIA's engineers made to conserve die space. We're not here to criticize those engineers, only to explain NVIDIA's performance.

Next, Gabe listed the tradeoff in pixel shader instruction count for texture fetches. To sum this one up, the developers resorted to burning more texture (memory) bandwidth instead of putting a heavier computational load on the functional units. Note that this approach is much closer to the pre-DX9 method of game development, where we were mainly memory bandwidth bound instead of computationally bound. The fact that NVIDIA benefited from this sort of optimization suggests that the NV3x series may not have as much raw computational power as the R3x0 GPUs (whether that means it has fewer functional units or is pickier about what it can execute and when is anyone's guess).
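One common form of the tradeoff described above is replacing arithmetic with a precomputed lookup table, which on a GPU would be stored in a texture. The Python sketch below shows the idea with a specular-style power function. The table size, the exponent, and the function names are our own assumptions for illustration; this is not Valve's shader code.

```python
TABLE_SIZE = 256   # hypothetical lookup texture width
EXPONENT = 16.0    # hypothetical specular exponent

# Precomputed once, like baking a function into a 1D texture.
SPECULAR_TABLE = [(i / (TABLE_SIZE - 1)) ** EXPONENT for i in range(TABLE_SIZE)]

def clamp01(x):
    return min(1.0, max(0.0, x))

def specular_computed(n_dot_h):
    """Arithmetic path: spends shader instructions, no memory access."""
    return clamp01(n_dot_h) ** EXPONENT

def specular_fetched(n_dot_h):
    """Lookup path: one 'texture fetch' replaces the arithmetic."""
    index = round(clamp01(n_dot_h) * (TABLE_SIZE - 1))
    return SPECULAR_TABLE[index]

for x in (0.0, 0.5, 0.9, 1.0):
    print(x, specular_computed(x), specular_fetched(x))
```

The lookup path gives an approximate result limited by the table resolution, and it costs memory bandwidth instead of computation, which is why it only helps on hardware that is short on the latter.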

The final accommodation Valve made for NVIDIA hardware was some restructuring of shader code. There's not much that we can deduce from this other than the obvious - ATI and NVIDIA have different architectures.

111 Comments

  • Anonymous User - Friday, September 12, 2003 - link

    I previously posted this in the wrong place, so let me just shamelessly repost it here:

    Let me just get my little disclaimer out before I dive into being a devil's advocate - I own both a 9800pro and an fx5900nu and am not biased toward either ATi or nVidia.
    With that being said, let me take a shot at what Anand opted not to speculate about, and that is the question of the ATi/Valve collaboration and their present and future relationship.
    First of all, FX's architecture is obviously inferior to R3x0 in terms of native DX9, and that is not going to be my focus. I would rather debate a little about the business/financial side of the ATi/Valve relationship. That's my area of expertise, and looking at this situation from a financial angle might add another twist to this.
    What got my attention are Gabe Newell's presentation slides, which omitted small but significant things like "pro" behind r9600, and his statement of "optimizations going too far" without actually going into specifics, other than that the new Detonators don't render fog. Those are small but significant details that add a little oil to the very hot issue of "cheating" in regard to nVidia's "optimizations". But I spoke of the financial side of things, so let me get back to it. After clearly stating how superior ATi's hardware is to FX, and stating how much effort they have invested to make the game work on FX (which is absolutely commendable), I cannot help but notice that all this leads perfectly into the next great thing. A new line of ATi cards will be bundled with ATi cards (or vice versa), and ATi is just getting ready to offer a value DX9 line. Remember how it was the only area that they had not covered, and nVidia was selling truckloads of FX5200s in the meantime. After they have demonstrated how poorly the FX flagship performs, let alone the value parts, isn't it a perfect lead-in to selling shiploads of those bundled cards (games)? Add to that Gabe's shooting down of any optimization efforts on nVidia's part (simply insinuating "cheats") and things are slowly moving in the right direction. And to top it all off, Valve explicitly said that future additions will not be done for DX8 or the so-called mixed class but exclusively for DX9. What is Joe Consumer to do then? The only logical thing: get him/herself one of those bundles.
    That concludes my observations on this angle of this newly emerged attraction and I see only good things on the horizon for ATi stockholders.
    Feel free to debate, disagree and criticize, but keep in mind that I am not defending or bashing anybody, just offering my opinion on the part I considered equally as interesting as hardware performance is.
  • Anonymous User - Friday, September 12, 2003 - link

    Wow...I buy a new video card every 3 years or so..my last one was a GF2PRO....hehe...I'm so glad to have a 9800PRO right now.
    Snif..I'm proud to be Canadian ;-)
  • Anonymous User - Friday, September 12, 2003 - link

    How come the 9600 Pro hardly loses any performance going from 1024 to 1280? Shouldn't it be affected by only having 4 pipelines?
  • Anonymous User - Friday, September 12, 2003 - link

    MUHAHAHA!!! Go the 9600pros, i'd like to bitch slap my friends for telling me the 9600's will not run half-life 2. I guess i can now purchase an All-In-Wonder 9600pro.
  • Anonymous User - Friday, September 12, 2003 - link

    Man, I burst into a coughing/laughing spree when I saw an ad using nVidia's "The way it's meant to be played" slogan. Funny thing is, I first noticed the ad on the page titled "What's Wrong with Nvidia?"
  • Anonymous User - Friday, September 12, 2003 - link

    booyah, i hope my ti4200 can hold me over at 800x600 until i can switch to ATI! big up canada
  • Anonymous User - Friday, September 12, 2003 - link

    You can bet your house nvidia's 50 drivers will get closer performance, but they're STILL thoroughly bitchslapped... Ppl will be buying R9x00's by the ton. Nvidia better watch out, or they'll go down like, whatwassitsname, 3dfx ?
  • dvinnen - Friday, September 12, 2003 - link

    Hehe, I concur. Seeing a 9500 on there would have been nice. But what I really want to see is some AF turned on. I can live with no AA (ok, 2x AA) but I'll be damned if AF isn't going to be on.
  • Anonymous User - Friday, September 12, 2003 - link

    Anand, you guys rock. It's because of your in-depth reviews that I purchased the Radeon 9500 Pro. I noticed the oddity mentioned of the small performance gap between the 9700 Pro and the 9600 Pro at 1280x1024. I would really like to see how the 9500 Pro is affected by this (and all the other benchmarks). If you have a chance, could you run a comparison between the 9500 Pro and the 9600 Pro? (I guess what I really want to know is whether my 9500 Pro is better than a 9600 Pro for this game.)

    Arigato,
    The Internal
  • Pete - Friday, September 12, 2003 - link

    (Whoops, that was me above (lucky #13)--entered the wrong p/w.)