Let's talk Compilers...

Creating the perfect compiler is one of the more difficult problems in computing. Compiler optimization and instruction scheduling are NP-complete problems (think chess), so we can't "solve" them. Compounding the issue is the fact that the best compiled code comes from a compiler written specifically for a certain processor, one that knows it inside and out. If we were to use a standard compiler to produce generic x86 code, our program would run much more slowly than if we told our compiler we have a P4 with SSE2 and all the goodies that go along with it. I know this all seems pretty obvious, but allow me to illustrate a little.

Since I've always been interested in 3D graphics, back in 1998 I decided to write a 3D engine with a friend of mine for a project in our C++ class. It only did software rendering, but we implemented a software z-buffer and did back-face culling with flat shading. Back then, my dad had a top-of-the-line PII 300, and I had acquired an AMD K6 200. Using a regular Borland C++ compiler with no real optimizations turned on, our little software 3D engine ran faster on my K6 than it did on my dad's PII. Honestly, I have no idea why that happened. But the point is that the standard output of the compiler ran faster on my slower platform while both systems were producing the same output. Now, if I had had a compiler from Intel optimized for the PII that knew what it was doing (or if I had hand coded the program in assembly for the PII), my code could have run insanely fast on my dad's box.

So, there are some really important points here. Intel and AMD processors were built around the same ISA (Instruction Set Architecture) and had a great deal in common back in 1998. Yet, performance varied in favor of the underpowered machine in my test. When you look at ATI and NVIDIA, their GPUs are completely different. Sure, they both have to be able to run OpenGL and DirectX 9, but this just means they are able to map OGL or DX9 function calls (via their drivers) to specific hardware routines (or even multiple hardware operations if necessary). It just so happens that the default Microsoft HLSL compiler generates code that runs faster on ATI's hardware than on NVIDIA's.

The solution NVIDIA has is to sit down with developers and help hand-code things to run better on their hardware. Obviously this is an inelegant solution, and it has caused quite a few problems (*cough* Valve *cough*). NVIDIA's goal is to eliminate this extended development effort via their compiler technology.

Obviously, if NVIDIA starts "optimizing" their compiler to the point where their hardware is doing things not intended by the developer, we have a problem. I think it's very necessary to keep an eye on this, but it's helpful to remember that such things are not advantageous to NVIDIA. Over at Beyond3D, there is a comparison of the different compiler options (DX9 HLSL and NV Cg) for NVIDIA's shaders.

We didn't have time to delve into comparisons with the reference rasterizer for this article, but our visual inspections confirm Beyond3D's findings. Since going from the game code to the screen is what this is all about, as long as image quality remains pristine, we think using the Cg compiler makes perfect sense. It is important to know that the Cg compiler doesn't improve performance (except for a marginal gain while using AA), but it does a lot for image quality over the 45.xx Detonator drivers.


117 Comments


  • Anonymous User - Wednesday, October 8, 2003 - link

    I'm impressed. I've never seen a review that actually has the games I play most frequently in it. I've been uninterested in FPS games since Quake II.

    In particular, I like Neverwinter Nights, C&C Generals, SimCity 4, and to some extent WarCraft III (and, by extension, their expansions). I was under the impression that SimCity 4 was CPU-bound under almost all circumstances; it's useful to have that shot down.

    I also like AA and AF. You can imagine the slideshows I play with my Athlon 2100+, 1GB DDR, and Radeon 64MB DDR (a.k.a. 7200)

    Now I just need to see the ATI AIW 9600 Pro reach general availability.
  • Anonymous User - Tuesday, October 7, 2003 - link

    Thank you so much for this review... the detail is spectacular. After reading and looking at all 60 pages... I am really tired. Thanks again for your dedication!
  • Anonymous User - Tuesday, October 7, 2003 - link

    Um, why are there no comparisons using two monitors with different cards running? Gabe of Valve said there is a set of drivers that detects when a screen shot is being taken. Or did Anand just get duped by NVIDIA?
  • Anonymous User - Tuesday, October 7, 2003 - link

    I would like to know:

    1. Why was fps left out of TRAOD?
    2. Why the weird, never-before-seen TRAOD PS2.0 percent-loss graph? How about giving us good ole fps, which is what we have been seeing for years and what we are used to? At least have both if you are going to introduce new graphs.

    3. How does the reviewer seem to know "NVIDIA is aware of it" while never seeming to know whether ATI is aware of problems? I mean, he would have had to talk to NVIDIA to know this. Did NVIDIA pre-read the review and then tell him they are aware of a problem and will fix it?

    4. What motivation do the reviewers have for helping NVIDIA, or at least seeming optimistic? What has NVIDIA done to earn this tip-toeing-around type of review? If anything, they have dug themselves a well-deserved hole. I'm talking about NVIDIA's horrid behaviour as a company in the past 6 months. Why would they reward a company that pulls the stunts they have lately? Do they feel sorry for them?

    All I can say is the tone of this review leads me to think there is more to this than meets the eye.



  • Anonymous User - Tuesday, October 7, 2003 - link

    #52 yeah, I'm sure people play games in windowed mode.
    How can you see the differences from such small screen shots? It's well known that NVIDIA hacks, or shall I say "optimises", for benchmarks, giving no thought to IQ. This article displays blatant NVIDIA @ss kissing. There was good reason Gabe didn't want his game benched with the Det 50.xx drivers; take a guess: more hackery from NVIDIA. Also, Anand mentions certain anomalies with the GeForce FX in certain games but does not try to explore what those errors are and assumes nothing's wrong. In Homeworld the FX isn't even doing FSAA. Geez, I wish the NVIDIA fanboys would get a clue and crawl out from under that rock they've been hiding under.
  • Anonymous User - Tuesday, October 7, 2003 - link

    This is the most interesting article I have read in some time. First of all, I agree with #41; I think including this many games in the benchmark keeps Anand/Derek from making a detailed analysis of each game. But there is something more interesting...

    It seems that Anand and Derek tried to put together an article that hides the problems with both cards.
    They also deliberately try to avoid favoring one company. In one sentence they claim ATI is best; in the next line they state otherwise.
    As for the IQ comparison, many of the screen captures are either dark or cannot reflect what AF+AA is intended to do. If I just check the small pictures, I would say that the IQ is really similar. However, more detailed analysis reveals other problems. Besides, the review of TRAOD is the worst I have ever seen. If they posted the frame rates, I am pretty sure everybody would be shocked to see the results. How could they not be? Think about it: the performance percentage loss of the FX5950 is 77.5% at 1024x768 noAA/AF. Even if the game runs at 50 fps with PS1.1, the frame rate would drop to around 11 fps when you switch to PS2.0 in this case. However, the reference to Beyond3D is interesting, because that site has very detailed benchmarks of both the 5900 and the 9800 with this game (I strongly recommend those articles to anyone who really wants to learn the actual performance of the 5900 and the 9800 in PS2.0 scenarios).

    But I totally disagree with Anand on one thing: TRAOD performance is a real indicator for future games that will use PS2.0 by default. The game's v49 patch also uses HLSL to compile directly to ps_2_x, which actually targets NVIDIA's NV30 architecture, and the compiled code runs faster than the Cg-compiled code. Even in this case, the 9800 Pro still runs much faster than the 5900 (I am talking about 70 fps vs. 35 fps).

    I guess nobody wants to see his/her $500 graphics card crawl in new games that use PS2.0 by default just one year after purchasing the card. And no, I am not an ATI fanboy; just a tech fan who cannot tolerate seeing how some sites really misdirect their readers because of their connections to the IHVs.
  • Anonymous User - Tuesday, October 7, 2003 - link

    Oh, come on, fanboys, stop yelling at Anand for not making nVidia look bad enough. His job is to benchmark, not to rant. Jesus Christ, you people annoy me. Try printing out the three images from any given test WITHOUT looking at which one's the Radeon.

    And no, I'm no nVidia fanboy, nor am I defending nVidia. I use a softmodded Radeon 9500 and I absolutely love it. I have never, ever put a GeForce FX in my system, and I'm happy to say this. But can't you people just let go?
  • Anonymous User - Tuesday, October 7, 2003 - link

    FIFA 2004!!! That alone makes this worthwhile!!!
  • Rogodin2 - Tuesday, October 7, 2003 - link

    You should use IL-2 Forgotten Battles with "perfect" detail settings (pixel-shaded water and a real system knee-bringer) for a simulation benchmark.

    rogo
  • Dasterdly - Tuesday, October 7, 2003 - link

    I could see IQ differences at the top left of the dune buggy shot. The ATI pic has better detail.

    Please add 3dmark benchmark.

    Good review so far; almost halfway through :)
