Let's talk Compilers...

Creating the perfect compiler is one of the more difficult problems in computing. Core optimization tasks such as instruction scheduling and register allocation are NP-complete in their general form, so a compiler can't "solve" them outright and has to rely on heuristics instead. Compounding the issue, the best compiled code comes from a compiler that is written specifically for a certain processor and knows it inside and out. If we use a standard compiler to produce standard x86 code, our program can run much slower than if we tell the compiler we have a P4 with SSE2 and all the goodies that go along with it. I know this all seems pretty obvious, but allow me to illustrate a little.
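To make the "tell the compiler what you have" idea concrete, here is a minimal sketch. The function name `saxpy` and the GCC command lines in the comments are my own illustration and an assumed toolchain, not anything used for this article; the point is only that identical source can be built for generic x86 or for a specific processor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: y[i] = a * x[i] + y[i].
// The source is identical for every target; only the compiler flags change.
//
// Assumed GCC invocations (illustrative, not from the article):
//   generic x86 code:     g++ -O2 -m32 -march=i386 saxpy.cpp
//   tuned for P4 + SSE2:  g++ -O2 -m32 -march=pentium4 -msse2 -mfpmath=sse saxpy.cpp
//
// With the second set of flags the compiler is allowed to emit SSE2
// instructions for the loop below; with the first it must stay with
// instructions every x86 processor understands.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());  // both arrays must have the same length
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Whether the tuned build is actually faster depends on the compiler and the processor, which is exactly the point of the story that follows.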

Since I've always been interested in 3D graphics, back in 1998 I decided to write a 3D engine with a friend of mine for a project in our C++ class. It only did software rendering, but we implemented a software z-buffer and did back-face culling with flat shading. Back then, my dad had a top-of-the-line PII 300, and I acquired an AMD K6 200. Using a regular Borland C++ compiler with no real optimizations turned on, our little software 3D engine ran faster on my K6 than it did on my dad's PII. Honestly, I have no idea why that happened. But the point is that the standard output of the compiler ran faster on my slower platform while both systems were producing the same output. Now, if I had had a compiler from Intel optimized for the PII that knew what it was doing (or if I had hand-coded the program in assembly for the PII), my code could have run much faster on my dad's box.
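For readers who haven't run into the two techniques mentioned above, here is a rough sketch of what back-face culling and a software z-buffer boil down to. This is not the code from that class project; the names (`Vec3`, `isBackFacing`, `ZBuffer`) and the conventions (camera at the origin, counter-clockwise winding for front faces, smaller depth means nearer) are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Back-face culling (assumed conventions: view space, camera at the origin,
// front faces wound counter-clockwise). A triangle faces away from the
// camera when its normal does not point back toward the origin.
bool isBackFacing(Vec3 v0, Vec3 v1, Vec3 v2) {
    Vec3 normal = cross(sub(v1, v0), sub(v2, v0));
    return dot(normal, v0) >= 0.0f;  // v0 is the vector from camera to triangle
}

// Software z-buffer: one depth value per pixel, initialized to "infinitely far".
class ZBuffer {
public:
    ZBuffer(std::size_t width, std::size_t height)
        : width_(width),
          depth_(width * height, std::numeric_limits<float>::infinity()) {}

    // Returns true and stores the depth if this fragment is nearer than
    // whatever was drawn at (x, y) before; otherwise leaves the buffer alone.
    bool testAndSet(std::size_t x, std::size_t y, float depth) {
        assert(x < width_ && y * width_ + x < depth_.size());
        float& stored = depth_[y * width_ + x];
        if (depth < stored) {
            stored = depth;
            return true;
        }
        return false;
    }

private:
    std::size_t width_;
    std::vector<float> depth_;
};
```

A renderer would skip any triangle for which `isBackFacing` returns true, then call `testAndSet` for every pixel of the remaining triangles and write the flat-shaded color only when it returns true. Inner loops like these are where the quality of the compiler's output shows up.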

So, there are some really important points here. Intel and AMD processors were built around the same ISA (Instruction Set Architecture) and had a great deal in common back in 1998. Yet performance varied in favor of the underpowered machine for my test. When you look at ATI and NVIDIA, their GPUs are completely and totally different. Sure, they both have to be able to run OpenGL and DirectX 9, but this just means they are able to map OGL or DX9 function calls (via their drivers) to specific hardware routines (or even multiple hardware operations if necessary). It just so happens that the default Microsoft compiler generates code that runs faster on ATI's hardware than on NVIDIA's.

The solution NVIDIA has is to sit down with developers and help handcode stuff to run better on their hardware. Obviously this is an inelegant solution, and it has caused quite a few problems (*cough* Valve *cough*). The goal NVIDIA has is to eliminate this extended development effort via their compiler technology.

Obviously, if NVIDIA starts "optimizing" their compiler to the point where their hardware is doing things not intended by the developer, we have a problem. I think it's very necessary to keep an eye on this, but it's helpful to remember that such things are not advantageous to NVIDIA. Over at Beyond3d, there is a comparison of the different compiler options (DX9 HLSL and NV Cg) for NVIDIA's shaders.

We didn't have time to delve into comparisons with the reference rasterizer for this article, but our visual inspections confirm Beyond3d's findings. Since going from the game code to the screen is what this is all about, as long as image quality remains pristine, we think using the Cg compiler makes perfect sense. It is important to know that the Cg compiler doesn't improve performance (except for a marginal gain while using AA), but it does a lot for image quality compared with the 45.xx Detonator drivers.

117 Comments

  • Anonymous User - Tuesday, October 21, 2003

    Why are they using Flash to do this? I can't see the performance charts (or whatever they are).

  • Anonymous User - Tuesday, October 14, 2003

    What crap this article was. More games, sure. More data. But no brains to interpret it right, obviously.
  • Anonymous User - Monday, October 13, 2003

    #114

    It's more than just a difference in visuals. By removing some of the visuals the card will run faster. The Nvidia drivers, for example, do not do trilinear filtering in DX; they do some fake bilinear. That makes the card look better than it really is.

    The whining is about how the reviewers missed all this stuff. People are not getting the true story here.
  • Anonymous User - Monday, October 13, 2003

    Okay, now I'm new here and all, but DAMN do some of you whine! You act like any visual differences between the Nvidia cards and the ATI (which I can't see at all) are astronomically huge! They're not. This is the first and last time I post here; looks like half of the people here are fanboys!
  • Anonymous User - Monday, October 13, 2003

    I don't understand why the obvious differences in IQ in the Aquamark 3 4xAA/8xAF shots, for example, are totally ignored by the reviewer.
    Just look at the fuzziness in the plants surrounding the explosion in the nvidia shot.
  • Anonymous User - Sunday, October 12, 2003

    Well I hope it was worth it.

    You spend all that time on a review and you end up getting caught being in someone's back pocket.

    Guess you can't have your cake and eat it too :(
  • Anonymous User - Saturday, October 11, 2003

    Here is part of an addendum to the 3DCenter article directly addressing this comparison:

    "AnandTech made an extremely extensive article about the performance and image quality of the current high-end graphic cards like Radeon 9800XT and GeForceFX 5950 Ultra (NV38). Beside the game benchmarks with 18 games, the image quality tests made with each of those games are strongly worth to be mentioned. AnandTech uses the Catalyst 3.7 on ATi side and the Detonator 52.14 on the nVidia side to compare the image quality. In contrast to the statements of our youngest driver comparison, AnandTech didn’t notice any general differences of the image quality between the Detonator 52.14 and 45.23 and therefore AnandTech praises the new driver a little into the sky.

    This however not even absolutely contradicts itself with our realizations. The nVidia-"optimizations" of the anisotropic filter with texture stages 1 till 7 in Control panel mode (only a 2x anisotropic filter is uses, regardless if there were made higher settings) are only to find with proper searching for it, besides most image quality comparisons by AnandTech were concerned without the anisotropic filter and therefore it’s impossible to find any differences on those pictures. The generally forced "optimization" of the trilinear filter into a pseudo trilinear filter by the Detonator 52.14 is besides nearly not possible to see on fixed images of real games, because the trilinear filter was created in order to prevent nearly only the MIP-Banding which can be seen in motion.

    Thus it can be stated that the determined "optimizations" of the Detonator 52.14 won’t be recognized with the view of screenshots, if you do not look for them explicitly (why however AnandTech awards the driver 52.14 a finer filter quality than the driver 51.75 is a mystery for us, then the only difference between them is a correctly working Application mode of the Detonator 52.14). Thus the "optimizations" of nVidia are not to be really seen, whereby there is also a clear exception as for example Tron 2.0 (screenshots will follow). Whether this is now a reason to excuse the "optimizations" of nVidia about it, one can surely argue."

    All on-line computer journalists should strive to inform their viewing public like these folks do.

    Once again: 51/52.XX nvidia drivers do *not* apply trilinear filtering in D3D when AF is on. The 51.75 at least, applies trilinear to the first (0) stage, though not *AT ALL* to any other stage - the 52 series does not apply trilinear filtering to any stage in D3D, regardless.


    Bing! Bing! Try again!

    May I suggest the filtering tester used by 3dCenter, and perhaps a mipmap shading program (as used by everyone in the known universe), and rthdribl to discern *ACTUAL* image quality via high dynamic range light source rendering.

    http://www.daionet.gr.jp/~masa/rthdribl/index.html
  • Anonymous User - Saturday, October 11, 2003

    http://www.3dcenter.de/artikel/detonator_52.14/ind...

    Take a look at this if you think the 5x.xx drivers have the same image quality as the 45s.

  • Anonymous User - Saturday, October 11, 2003

    The only thing I give Anand credit for is allowing us to freely write about his review. I mean, he did not have to allow us to reply in an open forum.

    After reading it I am not at all surprised at the heat he is taking. I hope he was not either.

    The review had potential but was squandered.

    Today's cards are all fast enough to do DX8 games.
    The question is, can they do it with all the goodies turned on?

    The main reason to buy an ATI 9600-9800 or a 5900U is to clean up the graphics, but not at the expense of speed. If you don't care about image quality, stick with your GF4 or 8500, as they both are horrid vs the new gen cards.

    An old GeForce 4 kicks butt in many games so long as you don't have FSAA turned on.

    Most people know that the 5x.xx Detonator drivers do reduce image quality in many areas. This is not a driver bug; it's what Nvidia chose to do to keep pace. Image quality is much more subjective than FPS. People are not buying $400 video cards for the speed alone.

    Anand glossed over/hid quality issues, the one area where subtle reductions here and there add up to large FPS gains.

    People will say, so what, the XT gets recommended in the end, why bitch?

    Well, it's the principle. The review made the 5900 seem much closer to the XT than it actually is.

    When a driver (a beta one at that) improves speed that much, it deserves a much closer inspection than what Anand gave.

    Someone threw Anand a pass but he dropped the ball :(
  • Anonymous User - Saturday, October 11, 2003

    I didn't care for this review for the following reasons:

    Many comments on IQ in part 1, but no followup in part 2. There were so many comments that they needed to be mentioned, even if it was to say that they discovered it was some wrong setting and they fixed it.

    Small cropped compressed images used for IQ comparison. If the image is compressed how can we judge it? The only way to present IQ comparisons to the reader is to show them the exact images the reviewer saw, without compression or cropping.

    Apples to apples. All of the benchmarks for all games should have been done in the same format unless it was impossible to achieve certain settings on a given card at a given resolution. Changing the metric for TR:AOD was a bad idea. Both parts should also have been done on the same system. For all we know the ignored IQ issues from part 1 could have been due to the AGP implementation on the first board. We just don't know.

    Gunmetal is also a very poor DX9 benchmark, since it relies on VS 2.0 and PS 1.1 only. Since most of the benefits of DX9, and the controversies for that matter, revolve around PS 2.0 this benchmark is not a good exemplar of DX9 performance. I also find the fact that Gunmetal was co-developed by Nvidia something that needs examination. IHVs have no place in developing benchmarks, they should stick to technology demos.

    Now I don't know if the 52.14 drivers do what the article says they do or not. I know Digit-Life said they gave up to a 20% improvement in some cases, and some improvement is certainly credible. However, this article as written does not support the conclusion that the 52.14 drivers provide significant performance boosts with no IQ loss. I am not commenting on whether or not they perform as advertised, only that you cannot draw that conclusion from the article.