Let's talk Compilers...

Creating the perfect compiler is one of the more difficult problems in computing. Compiler optimization and instruction scheduling are NP-complete problems (think of the combinatorial explosion in a chess game), so we can't "solve" them. Compounding the issue is the fact that the best compiled code comes from a compiler written specifically for a certain processor, one that knows it inside and out. If we use a standard compiler to produce generic x86 code, our program will run much slower than it would if we told the compiler we have a P4 with SSE2 and all the goodies that go along with it. I know this all seems pretty obvious, but allow me to illustrate a little.
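
To make this concrete, here is a minimal C++ sketch (not from the original article; the compiler switches are illustrative GCC-style flags, not anything actually used here): the same source turns into very different machine code depending on how much we tell the compiler about the target.

    // Illustrative builds (assumed flags):
    //   generic x86:   g++ -O3 scale.cpp -o scale_generic
    //   P4 with SSE2:  g++ -O3 -march=pentium4 -msse2 scale.cpp -o scale_sse2
    #include <cstddef>

    // With SSE2 enabled, a target-aware compiler can vectorize this loop and
    // process four floats per instruction; the generic build runs on any x86
    // chip but leaves that speed on the table.
    void scale(const float* a, float k, float* out, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i)
            out[i] = a[i] * k;
    }

Same program, same output, but the build that knows about the processor gets to use everything the processor offers.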

Since I've always been interested in 3D graphics, back in 1998 I decided to write a 3D engine with a friend of mine for a project in our C++ class. It only did software rendering, but we implemented a software z-buffer and did back-face culling with flat shading. Back then, my dad had a top-of-the-line PII 300, and I acquired an AMD K6 200. Using a regular Borland C++ compiler with no real optimizations turned on, our little software 3D engine ran faster on my K6 than it did on my dad's PII. Honestly, I have no idea why that happened. But the point is that the standard output of the compiler ran faster on my slower platform, even though both systems were producing the same output. Now, if I had had a compiler from Intel optimized for the PII that knew what it was doing (or if I had hand-coded the program in assembly for the PII), my code could have run insanely faster on my dad's box.
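
For readers who haven't run into it, back-face culling boils down to something like this minimal sketch (illustrative C++, not the original 1998 code):

    // After a triangle is projected to screen space, its winding order tells
    // us whether it faces the viewer; triangles that wind the "wrong" way can
    // be skipped before any shading or rasterization work is done.
    struct Vec2 { float x, y; };

    bool isBackFacing(const Vec2& a, const Vec2& b, const Vec2& c)
    {
        // Signed area via the 2D cross product of two edges.
        float cross = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
        return cross <= 0.0f; // assumes front faces are wound counter-clockwise
    }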

So, there are some really important points here. Intel and AMD processors were built around the same ISA (Instruction Set Architecture) and had a great deal in common back in 1998. Yet in my test, performance favored the underpowered machine. When you look at ATI and NVIDIA, their GPUs are completely different. Sure, they both have to be able to run OpenGL and DirectX 9, but this just means they are able to map OGL or DX9 function calls (via their drivers) to specific hardware routines (or even multiple hardware operations if necessary). It just so happens that the default Microsoft compiler generates code that runs faster on ATI's hardware than on NVIDIA's.
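
Conceptually (and only conceptually; the class and function names below are invented for illustration), each vendor's driver exposes the same API entry point but maps it onto whatever hardware routines that particular GPU has:

    #include <vector>

    struct Triangle { /* vertex data */ };

    // The interface the API layer talks to.
    class GpuDriver {
    public:
        virtual ~GpuDriver() = default;
        virtual void drawTriangles(const std::vector<Triangle>& tris) = 0;
    };

    // Hypothetical vendor A: one native command covers the whole call.
    class VendorADriver : public GpuDriver {
        void drawTriangles(const std::vector<Triangle>& tris) override {
            submitNativeDraw(tris);
        }
        void submitNativeDraw(const std::vector<Triangle>&) { /* hardware-specific */ }
    };

    // Hypothetical vendor B: the same call takes two hardware operations.
    class VendorBDriver : public GpuDriver {
        void drawTriangles(const std::vector<Triangle>& tris) override {
            setupVertexEngine(tris);
            kickOffRasterizer();
        }
        void setupVertexEngine(const std::vector<Triangle>&) { /* hardware-specific */ }
        void kickOffRasterizer() { /* hardware-specific */ }
    };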

NVIDIA's current solution is to sit down with developers and help hand-code things to run better on their hardware. Obviously this is an inelegant solution, and it has caused quite a few problems (*cough* Valve *cough*). NVIDIA's goal is to eliminate this extended development effort via their compiler technology.

Obviously, if NVIDIA starts "optimizing" their compiler to the point where their hardware is doing things not intended by the developer, we have a problem. I think it's very necessary to keep an eye on this, but it's helpful to remember that such things are not advantageous to NVIDIA. Over at Beyond3D, there is a comparison of the different compiler options (DX9 HLSL and NV Cg) for NVIDIA's shaders.

We didn't have time to delve into comparisons with the reference rasterizer for this article, but our visual inspections confirm Beyond3D's findings. Since going from the game code to the screen is what this is all about, as long as image quality remains pristine, we think using the Cg compiler makes perfect sense. It is important to know that the Cg compiler doesn't improve performance (except for a marginal gain when using AA), but it does improve image quality considerably over the 45.xx Detonators.

117 Comments

  • Anonymous User - Tuesday, October 7, 2003

    You need to look at the FSAA each card employs... go back and look again at the screenies, this time looking at all the jaggies on each card... especially in F1, it doesn't even look like nVidia is using FSAA, while on the ATI it's smooth... I don't think it's a driver comparison, just the fact that ATI's FSAA is far better at doing the job... At least I think that's what he's talking about. It's hard to tell any IQ differences when the full-size screenies are not working, but poor FSAA kinda jumps out at you (if you're used to smooth FSAA).

    Also worth noting, nVidia made great jumps in performance in DX9, but nothing that actually used PS2.0 shaders : (
  • Anonymous User - Tuesday, October 7, 2003

    #14 Blurred? Are you not wearing your glasses or something? Nice and sharp for me...
  • Anonymous User - Tuesday, October 7, 2003

    of course you do, you're a fanATIc...
  • Anonymous User - Tuesday, October 7, 2003

    I like the way he discounts Tomb Raider, saying it is just not a good game. That's a matter of opinion. It almost seems like he tries to undermine that game before revealing any benches.

    And the benches for that game are not done in FPS, but as the percentage of performance lost with PS 2.0.

    On first inspection of the graphs, it appears that Nvidia is leading in Tomb Raider. But if you look at the blurred print on the graph, it does say "lower is better". Very clever!

    Why no FPS in that game?

    Nice information in this review but it almost seems that he is going out of his way to excuse Nvidia.

    I smell a rat in this review.

  • Anonymous User - Tuesday, October 7, 2003

    #3, #7: If you take the screens into Photoshop and observe the result of their 'difference', you'll see that there's a fairly significant difference between the 45's and 3.7's, but almost no difference whatsoever between the 52's and 3.7's. In most of those screenshots it's impossible to do this since the shots aren't necessarily from the exact same position each time. Try the UT2K3 ones, for example. Also, these are JPEGs, so there'll be a little fuzz due to the differences in compression.

    Also, if I need to take two screenshots into Photoshop to be able to discern any difference between them, that's really saying a lot. And since we can't refer to a reference software shot, it could be ATI's driver that's off for all we know.

    In any event, I'm pleasantly surprised with NVIDIA. Their IQ has definitely caught up, and their performance is quickly improving. Hopefully the Cat 3.8's will pull a similar stunt.
  • Anonymous User - Tuesday, October 7, 2003

    No he's just a "fanny"
  • AgaBooga - Tuesday, October 7, 2003

    They must have a reason for choosing those drivers. Anandtech has been around long enough for that :)

    The reason probably comes down to when they started this benchmarking: they did soooo many games, resolutions, AA and AF levels, multiplied by the number of different cards, etc. That takes quite some time. Had they waited for the newer ATI drivers, it might have delayed this article by one or even two weeks. Also, they did mention that they will do a follow-up article with the new drivers, so patience is the key here.
  • Anonymous User - Tuesday, October 7, 2003

    #8 seems like a fanboy himself
  • dvinnen - Tuesday, October 7, 2003

    Well #8, Nvidia was able to do it with the wonder driver; I don't see why ATI can't.
  • Anonymous User - Tuesday, October 7, 2003

    LOL, the ATI fanboys are already coming out of the woodwork. Listen #3 and #7, it's a fact, there is no IQ difference at all between the 50 Dets and the 3.7 CATs. And if you honestly believe you're going to see much of a difference with the CAT 3.8's....you're just stupid.
