Let's Talk Performance

This section is likely to generate a lot of flames if left unchecked. First, though, we want to make it abundantly clear that raw, theoretical performance numbers (which is what is listed here) rarely manage to match real world performance figures. There are numerous reasons for this discrepancy, for example the game or application in use may stress different parts of the architecture. A game that pushes a lot of polygons with low resolution textures is going to stress the geometry engine, while a game that uses high resolution textures with lower polygon counts is more likely to stress the memory bandwidth. Pixel and Vertex Shaders are even more difficult to judge, as both ATI and NVIDIA are relatively tight-lipped about the internal layout of their pipelines. These functions are the most like an actual CPU, but they're also highly proprietary and the companies feel a need to protect their technology (probably with good cause). So while we know that AMD Athlon 64 chips have a 12 stage Integer/ALU pipeline and 17 stage FPU/SSE pipeline, we really have no idea how many stages are in the pixel and vertex pipelines of ATI and NVIDIA cards. In fact, we really don't have much more than a simplistic functional overview.

So why even bother talking about performance without benchmarks? In part, by looking at the theoretical performance and comparing it to the real world performance (you'll have to find such real world figures in another article), we can get a better idea of what went wrong and what worked well. More importantly, though, most people referring to a GPU Guide are going to expect some sort of comparison and ranking of the parts. It is by no means definitive, and for some people choosing a graphics card is akin to joining a religion. So, take these numbers with a grain of salt and know that they are not intentionally meant to make one card look better than another. Where performance seriously fails to match expectations, it will be noted.

There are numerous factors that can affect performance, other than the application itself. Drivers are a major one, and it is not unheard of for the performance of a particular card to increase by as much as 50% over its lifetime due to driver enhancements. In light of such examples (i.e. both Radeon and GeForce cards in Quake 3 performance increased dramatically over time), it is somewhat difficult to say that theoretical performance numbers are really that much worse than changing real world numbers. With proper optimization, real world numbers can usually approach theoretical numbers, but this really only occurs for the most popular applications. Features also play a part, all other things being equal, so if two cards have the same theoretical performance but one card is DX9 based and the other is DX8 based, the DX9 card is should be faster.

Speaking of drivers, we would be remiss if we didn't at least mention OpenGL support. Brought into the consumer segment with GLQuake back in 1997, OpenGL is a different platform and requires different drivers. NVIDIA and ATI both have full OpenGL drivers, but all evidence indicates that NVIDIA's drivers are simply better at this point in time. Doom 3 is the latest example of this. However, OpenGL is also used in the professional world, and again NVIDIA tends to lead in performance, even with inferior hardware. Part of the problem is that very few games other than id Software titles and their licensees use OpenGL, so it often takes a back seat to DirectX. However, ATI has vowed to improve their OpenGL performance since the release of Doom 3, and hopefully they can close the gap between their DirectX and OpenGL drivers.

So, how is overall performance determined - in other words, how will the tables be sorted? The three main factors are fill rate, memory bandwidth, and processing power. Fill rate and bandwidth have been used for a long time, and they are well understood. Processing power, on the other hand, is somewhat more difficult to determine, especially with DX8 and later Pixel and Vertex Shaders. We will use the vertices/second rating as am estimate of processing power. For the charts, each section will be normalized relative to the theoretically fastest member of the group, and equal weight will be given to the fill rate, bandwidth, and vertex rate. That's not the best way of measuring performance, of course, but it's a start, and everything is theoretical at this point anyway. If you really want a suggestion on a specific card, the forums and past articles are a better place to search. Another option is to decide which games (or applications) you are most concerned about, and then go find an article that has benchmarks with that particular title.

To reiterate, this is more of a historical perspective on graphics chips and not a comparison of real world performance. And with that disclaimer, let's get on to the performance charts.

The Way It's Meant to be Played Number nine… Number nine…


View All Comments

  • Neo_Geo - Tuesday, September 7, 2004 - link

    Nice article.... BUT....
    I was hoping the Quadro and FireGL lines would be included in the comparison.
    As someone who uses BOTH proffessional (ProE and SolidWorks) AND consumer level (games) software, I am interested in purchasing a Quadro or FireGL, but I want to compare these to their consumer level equivalent (as each pro level card generally has an equivalent consumer level card with some minor, but important, otomizations).

  • mikecel79 - Tuesday, September 7, 2004 - link

    The AIW 9600 Pros have faster memory than the normal 9600 Pro. 9600 Pro memory runs at 650Mhz vs the 600 on a normal 9600.

    Here's the Anandtech article for reference:
  • Questar - Tuesday, September 7, 2004 - link


    This list is not complete at all, it would be 3 times the size if it was from the last 5 or 6 years. It covers about the last 3, and is laden with errors

    Just another exampple of half-asssed job this site has been doing lately.
  • JarredWalton - Tuesday, September 7, 2004 - link

    #14 - Sorry, I went with desktop cards only. Usually, you're stuck with whatever comes in your laptop anyway. Maybe in the future, I'll look at including something like that.

    #15 - Good God, Jim - I'm a CS graduate, not a graphics artist! (/Star Trek) Heheh. Actually, you would be surprised at how difficult it can be to get everything to fit. Maximum width of the tables is 550 pixels. Slanting the graphics would cause issues making it all fit. I suppose putting in vertical borders might help keep things straight, but I don't like the look of charts with vertical separators.

    #20 - Welcome to the club. Getting old sucks - after a certain point, at least.
  • Neekotin - Tuesday, September 7, 2004 - link

    great read! wow! i didn't know there were so much GPUs in the past 5-6 years. its like more than all combined before them. guess i'm a bit old.. ;) Reply
  • JarredWalton - Tuesday, September 7, 2004 - link

    12/13: I updated the Radeon LE entry and resorted the DX7 page. I'm sure anyone that owns a Radeon LE already knows this, but you could use a registry hack to turn them into essentially a full Radeon DDR. (By default, the Hierarchical Z compression and a few other features were disabled.) Old Anandtech article on the subject:

  • JarredWalton - Monday, September 6, 2004 - link

    Virge... I could be wrong on this, but I'm pretty sure some of the older chips could actually be configured with either SDR or DDR RAM, and I think the GF2 MX series was one of those. The problem was that you could either have 64-bit DDR or 128-bit SDR, so it really didn't matter which you chose. But yeah, there were definitely 128-bit SDR versions of the cards available, and they were generally more common than the 64-bit DDR parts I listed. The MX200, of course, was 64-bit SDR, so it got the worst of both worlds. Heh.

    I think the early Radeons had some similar options, and I'm positive that such options existed in the mobile arena. Overall, though, it's a minor gripe (I hope).
  • ViRGE - Monday, September 6, 2004 - link

    Jarred, without getting too nit-picky, your data for the GeForce 2 MX is technically wrong; the MX used a 128bit/SDR configuration for the most part, not a 64bit/DDR configuration(http://www.anandtech.com/showdoc.aspx?i=1266&p... Note that this isn't true for any of the other MX's(both the 200 and 400 widely used 64bit/DDR), and the difference between the two configurations has no effect on the math for memory bandwidth, but it's still worth noting. Reply
  • Cygni - Monday, September 6, 2004 - link

    Ive been working with Adrian's Rojak Pot on a very similar chart to this one for awhile now. Check it out:

  • Denial - Monday, September 6, 2004 - link

    Nice article. In the future, if you could put the text at the top of the tables on an angle it would make them much easier to read. Reply

Log in

Don't have an account? Sign up now