Let's talk Compilers...

Creating the perfect compiler is one of the more difficult problems in computing. Optimal compiler optimization and scheduling is an NP-complete problem, so we can't "solve" it; like chess, the search space is far too large to explore exhaustively. Compounding the issue is that the best compiled code comes from a compiler written specifically for a certain processor, one that knows it inside and out. If we use a standard compiler to produce generic x86 code, our program will run much more slowly than if we tell the compiler we have a P4 with SSE2 and all the goodies that go along with it. I know this all seems pretty obvious, but allow me to illustrate a little.

Since I've always been interested in 3D graphics, back in 1998 I decided to write a 3D engine with a friend of mine for a project in our C++ class. It only did software rendering, but we implemented a software z-buffer and did back-face culling with flat shading. Back then, my dad had a top-of-the-line PII 300, and I had acquired an AMD K6 200. Using a regular Borland C++ compiler with no real optimizations turned on, our little software 3D engine ran faster on my K6 than it did on my dad's PII. Honestly, I have no idea why that happened. But the point is that the standard output of the compiler ran faster on my slower platform while both systems were producing the same output. Now, if I had had a compiler from Intel optimized for the PII that knew what it was doing (or if I had hand-coded the program in assembly for the PII), my code could have run insanely faster on my dad's box.

So, there are some really important points here. Intel and AMD processors were built around the same ISA (Instruction Set Architecture) and had a great deal in common back in 1998. Yet performance varied in favor of the underpowered machine in my test. When you look at ATI and NVIDIA, their GPUs are completely different. Sure, they both have to be able to run OpenGL and DirectX 9, but this just means they are able to map OGL or DX9 function calls (via their drivers) to specific hardware routines (or even multiple hardware operations if necessary). It just so happens that the default Microsoft compiler generates code that runs faster on ATI's hardware than on NVIDIA's.

The solution NVIDIA has relied on is to sit down with developers and help them hand-tune code to run better on its hardware. Obviously this is an inelegant solution, and it has caused quite a few problems (*cough* Valve *cough*). NVIDIA's goal is to eliminate this extended development effort via its compiler technology.

Obviously, if NVIDIA starts "optimizing" its compiler to the point where the hardware is doing things not intended by the developer, we have a problem. I think it's very necessary to keep an eye on this, but it's helpful to remember that such things are not advantageous to NVIDIA. Over at Beyond3D, there is a comparison of the different compiler options (DX9 HLSL and NVIDIA Cg) for NVIDIA's shaders.

We didn't have time to delve into comparisons with the reference rasterizer for this article, but our visual inspections confirm Beyond3D's findings. Since going from game code to the screen is what this is all about, as long as image quality remains pristine, we think using the Cg compiler makes perfect sense. It is important to note that the Cg compiler doesn't improve performance (aside from a marginal gain when using AA), but it does improve image quality considerably over the 45.xx Detonators.


117 Comments

  • Anonymous User - Wednesday, October 8, 2003 - link

    Didn't anyone notice that ATI doesn't do dynamic glows in Jedi Academy with the 3.7 Cats!? Look at the lightsaber and it's clearly visible. They only work with the 3.6 Cats, and then they REALLY kill performance (it's barely playable at 800*600 here on my Radeon 9700 PRO)
  • Anonymous User - Wednesday, October 8, 2003 - link

    Funny to see that ATI fanboys can't believe that NVIDIA can release drivers without cheats. And nobody talks about the issues in TRAOD with ATI cards; really very nice...
  • Anonymous User - Wednesday, October 8, 2003 - link

    WTH did you benchmark one card with unreleased drivers (something you said you would never, ever do in the past) and use micro-sized pictures for IQ comparisons?
    You might as well have used 256 colors.

    The Catalyst 3.8s came out today - the 51.75 drivers will not be available for an indeterminate amount of time. Yet you bench with the Cat 3.7s and use a set of unreleased and unavailable drivers for the competition.

    I suggest you peruse this article:
    http://www.3dcenter.org/artikel/detonator_52.14/
    from 3DCenter (German) to learn just how one goes about determining how IQ differs at different settings with the NVIDIA 45s, 51s, and 52s.
    Needless to say, everyone else who has compared full-sized frames in a variety of games and applications has found that the 5X.XX NVIDIA drivers (all of them) do selective rendering, especially of lighting.

    And why claim the lack of shiny water in NWN is ATi's fault?

    Bioware programmed the game using an NVIDIA-exclusive instruction and did not bother to program for the generic case until enough ATI and other-brand users complained.
    This is the developer's fault, not a problem with the hardware or drivers.
  • Anonymous User - Wednesday, October 8, 2003 - link

    Nice article. I like that you benched so many games.

    Unfortunately, you missed that the Det 52.14 driver does no real trilinear filtering in *any* DirectX game, regardless of whether you're using anisotropic filtering or not. This often can't be seen in screenshots, only in motion. Please have a look here:

    http://www.3dcenter.de/artikel/detonator_52.14/

    There is *NO* way for a GeForceFX user to enable full trilinear filtering when using Det52.14. No wonder the performance increased...
  • Anonymous User - Wednesday, October 8, 2003 - link

    TR: AOD is a fine game, you just have to play it...

    Sure, there are some graphical issues on the later levels, but there's nothing wrong with the game as such, and considering that it has made its way into a lot of bundles (Sapphire and Creative Audigy 2 ZS to name two) I believe it will receive a fair share of gameplay.
  • Anonymous User - Wednesday, October 8, 2003 - link

    You guys need to stop talking about Gabe Newell... for such a supposedly good programmer, he sure needs to learn about network security. We all know he's got his head up ATI's rear end. The funny part is that they are bundling HL2 with the 9800 XT (a coupon) when it isn't coming out until April now. Who's to say who will have the better hardware then? Doom 3 will likely be out by then. In 4 months, when the new cards are out, you guys won't care who makes the better card; the 12-year-old fanboys will be up in arms in support of their company.

    I owned the 5900U and sold it on eBay after seeing the HL2 numbers. I then bought a 9800 Pro from Newegg, and on the 1st tried ordering the 9800 XT from ATI, which said it was in stock. 2 days later they told me my order was on backorder and hassled me when I wanted to cancel. One thing I'd point out is that War3 looks much better on the 5900U than on the 9800. It looks really dull on the 9800, where it's bright and cartoony (like it should be) on the GeForce. Either way, who knows what the future will hold for both companies, but let's hope they both succeed to keep our prices low...
  • Anonymous User - Wednesday, October 8, 2003 - link

    I took over #41's original post... I didn't like his tone :|
  • Anonymous User - Wednesday, October 8, 2003 - link

    The IQ part was crappy at best: small screenshots in open, not-so-detailed areas, and sometimes there was no option for a big one to check.

    You can call me what you want, but there are quite a few reviews out there that will disagree BIG time with what has been posted about IQ here. And it is impossible that all of them are wrong on this at the same time.

    Homeworld has shadow issues on ATI cards with Cat 3.7, yet that ain't shown there anyway... this goes both ways.

    If you ask me, NVIDIA got its DX9 wrapper to work fine this time.
  • Anonymous User - Wednesday, October 8, 2003 - link

    Um, what happened to post #41, where the guy detailed all the inconsistencies of the IQ comparisons? Please don't tell me you guys actually modded that post...

    I haven't had the chance to go through everything yet, but in those few I did, I definitely saw differences even in these minuscule caps (how about putting up some full-size links next time, guys?), particularly in the AA+AF ones. It's obvious there's still quite a difference in their implementations.

    I was also surprised at the number of shots that weren't even of the same frame. Honestly, how can you do an IQ test if you aren't even going to use the same frames? A split-second difference is enough to change the output because of data/buffer/angle differences, etc.

    Personally, I wonder what happened to the old-school 400% zoom IQ tests that Anand was promising, and I'm fairly disappointed despite the number of games in this article.

    That said, I am glad that Nvidia didn't botch up everything entirely and hopefully they'll have learned their lesson for NV4x.
  • Anonymous User - Wednesday, October 8, 2003 - link

    Where can I get the 52.14 drivers?
