Pixel Shader Performance Tests

ShaderMark v2.0 is a program designed to stress test the pixel shader performance of modern DX9 graphics hardware using Shader Model 2.0 programs written in HLSL and run on a couple of simple shapes in a scene.

We haven't used ShaderMark in the past because we don't advocate trying to predict the performance of real-world game code from a synthetic set of tests designed to push the hardware. Honestly, as we've said before, the only way to determine the performance of a certain program on specific hardware is to run that program on that hardware. As both software and hardware become more complex, the results of any given test become less and less generalizable, and games, graphics hardware, and modern computer systems are some of the most complex entities on earth.

So why are we using ShaderMark, you may ask? There are a couple of reasons. First, this is only meant as a ballpark test. ATI and NVIDIA both have architectures that should be able to push a lot of shader operations through. It is a fact that NV3x had a bit of a handicap when it came to shader performance. A cursory glance at ShaderMark should tell us whether that handicap carries over to the current generation of cards, and whether or not R420 and NV40 are on the same playing field. We don't want to make a direct comparison; we just want to get a feel for the situation. With that in mind, here are the benchmarks.

 

Shader   Radeon X800 XT PE   Radeon X800 Pro   GeForce 6800 Ultra   GeForce 6800 GT   GeForce FX 5950 U
2        310                 217               355                  314               65
3        244                 170               213                  188               43
4        238                 165               -                    -                 -
5        211                 146               162                  143               34
6        244                 169               211                  187               43
7        277                 160               205                  182               36
8        176                 121               -                    -                 -
9        157                 107               124                  110               20
10       352                 249               448                  410               72
11       291                 206               276                  248               54
12       220                 153               188                  167               34
13       134                 89                133                  118               20
14       140                 106               141                  129               29
15       195                 134               145                  128               29
16       163                 113               149                  133               27
17       18                  13                15                   13                3
18       159                 111               99                   89                17
19       49                  34                -                    -                 -
20       78                  56                -                    -                 -
21       85                  61                -                    -                 -
22       47                  33                -                    -                 -
23       49                  43                49                   46                -

(A dash indicates that no result was reported for that card on that shader.)

These benchmarks were run at fp32 on NVIDIA hardware and fp24 on ATI hardware. That isn't really an apples-to-apples comparison, but with some of the shaders used in ShaderMark, partial precision floating point causes error to accumulate (not surprising for a benchmark designed to stress shader performance).
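
To put the precision difference in concrete terms, here is a rough sketch of the problem (our own illustrative HLSL, not a shader taken from ShaderMark): fp16 carries a 10-bit mantissa versus 16 bits for ATI's fp24 and 23 bits for fp32, so every dependent operation in a long shader throws away a little more of the result, and the error compounds.

    // Illustrative only -- not ShaderMark source code.
    // A chain of dependent operations loses more precision per step when the
    // intermediates are stored as fp16 (half) instead of fp32 (float).
    float4 ps_full_precision(float2 uv : TEXCOORD0) : COLOR
    {
        float v = uv.x;
        for (int i = 0; i < 8; i++)        // unrolled by the compiler for ps_2_0
            v = frac(v * 37.0 + 0.123);    // high-frequency procedural term
        return float4(v, v, v, 1.0);
    }

    float4 ps_partial_precision(float2 uv : TEXCOORD0) : COLOR
    {
        half v = (half)uv.x;               // every intermediate truncated to fp16
        for (int i = 0; i < 8; i++)
            v = frac(v * 37.0 + 0.123);    // same math, fewer mantissa bits
        return float4(v, v, v, 1.0);
    }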

ShaderMark v2.0 clearly shows a huge increase in pixel shader performance from NV38 to either flavor of NV40. Even though the results can't really be compared apples to apples (because of the difference in precision), NVIDIA manages to keep up with the ATI hardware fairly well. In fact, the diffuse lighting and environment mapping, shadowed bump mapping, and water color shaders don't show ATI wiping the floor with NVIDIA.

Looking at data collected with the 60.72 version of the NVIDIA driver, we saw no change in frame rates, and a visual inspection of the images output by each driver raised no red flags.

We would like to stress again that these are not apples-to-apples numbers, but the relative performance of each GPU indicates that the ATI and NVIDIA architectures are very nearly comparable from a pixel shader standpoint (with each architecture favoring different types of shaders and operations).

In addition to getting a rough idea of performance, we can also look deeper into the heart of NV40 and see what kind of performance gains we get when we enable partial precision rendering mode. As we have stated before, there were a few image quality issues with the types of shaders ShaderMark runs, but this bit of analysis will stick only to how much work gets done in the same amount of time, without regard to the relative quality of that work.
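
For those curious what the partial precision path looks like at the source level, the sketch below is our own hypothetical example (not ShaderMark's actual code): converting a shader is mostly a matter of retyping its intermediates from float to half, after which the HLSL compiler marks the resulting instructions with the _pp modifier that NV40 executes at reduced precision.

    // Hypothetical diffuse lighting shader, not taken from ShaderMark.
    // Full precision version: all math carried out on fp32 values.
    float4 diffuse_full(float3 normal  : TEXCOORD0,
                        float3 toLight : TEXCOORD1) : COLOR
    {
        float3 n = normalize(normal);
        float3 l = normalize(toLight);
        float  d = saturate(dot(n, l));
        return float4(d, d, d, 1.0);
    }

    // Partial precision version: identical math, but the intermediates are fp16,
    // so the compiler emits _pp-modified instructions that NV40 can run faster.
    half4 diffuse_partial(half3 normal  : TEXCOORD0,
                          half3 toLight : TEXCOORD1) : COLOR
    {
        half3 n = normalize(normal);
        half3 l = normalize(toLight);
        half  d = saturate(dot(n, l));
        return half4(d, d, d, 1.0);
    }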

Shader   GeForce 6800 U PP   GeForce 6800 GT PP   GeForce 6800 U   GeForce 6800 GT
2        413                 369                  355              314
3        320                 283                  213              188
5        250                 221                  162              143
6        300                 268                  211              187
7        285                 255                  205              182
9        159                 142                  124              110
10       432                 389                  448              410
11       288                 259                  276              248
12       258                 225                  188              167
13       175                 150                  133              118
14       167                 150                  141              129
15       195                 173                  145              128
16       180                 161                  149              133
17       21                  19                   15               13
18       155                 139                  99               89
23       49                  46                   49               46

The most obvious thing to notice is that, overall, partial precision mode rendering increases shader rendering speed. Shaders 2 through 8 are lighting shaders (with 2 being a simple diffuse lighting shader). These lighting shaders (especially the point and spot light shaders) make heavy use of vector normalization. As we are running in partial precision mode, this should translate to a partial precision normalize, which is a "free" operation on NV40. Almost any time a partial precision normalize is needed, NV40 is able to schedule the instruction immediately. This is not the case with full precision normalization, so the many ~50% performance gains coming out of those lighting shaders are probably due to the partial precision normalization hardware built into each shader unit in NV40. The smaller performance gains (which, interestingly, occur on the shaders that have image quality issues) are most likely the result of decreased bandwidth requirements and decreased register pressure: a single internal fp32 register can hold two fp16 values, making scheduling and managing resources much less of a task for the hardware.
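
To illustrate why the lighting shaders benefit so much, here is a sketch of a per-pixel Blinn-Phong point light (again our own example rather than ShaderMark's shader): it needs several normalizations per pixel, each of which NV40 can issue on its free fp16 normalize hardware when the shader uses half types, and its fp16 temporaries are cheap to hold since two of them can share a single fp32 register.

    // Illustrative per-pixel point light (Blinn-Phong), not ShaderMark's shader.
    half4 point_light(half3 normal   : TEXCOORD0,
                      half3 toLight  : TEXCOORD1,
                      half3 toViewer : TEXCOORD2) : COLOR
    {
        half3 n = normalize(normal);       // fp16 normalize: effectively free on NV40
        half3 l = normalize(toLight);      // fp16 normalize: effectively free on NV40
        half3 v = normalize(toViewer);     // fp16 normalize: effectively free on NV40
        half3 h = normalize(l + v);        // Blinn half vector

        half diff = saturate(dot(n, l));
        half spec = pow(saturate(dot(n, h)), 32.0);   // arbitrary specular exponent

        // Two fp16 temporaries can share one internal fp32 register,
        // easing register pressure compared to the full precision version.
        half c = diff + spec;
        return half4(c, c, c, 1.0);
    }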

As we work on our image quality analysis of NV40 and R420, we will be paying close attention to shader performance in both full and partial precision modes (as we want to look at what gamers will actually see in the real world). We will likely bring ShaderMark back for these tests as well. This is a new benchmark for us, so please bear with us as we get used to its ins and outs.

Comments

  • TrogdorJW - Tuesday, May 4, 2004 - link

    Nice matchup we've got here! Just what we were all hoping for. Unfortunately, there are some disappointing trends I see developing....

    In ShaderMark 2.0, we see many instances where the R420 is about 25% faster than the NV40. Let's see... 520 MHz vs 400 MHz. 'Nuf said, I think. Too bad for Nvidia that they have 222 million transistors, so they're not likely to be able to reach 500 MHz any time soon. (Or if they can, then ATI can likely reach 600+ MHz.)

    How about the more moderately priced card matchup? The X800 Pro isn't looking that attractive at $400. 25% more price gets you 33% more pipelines, which will probably help out on games that process a lot of pixels. And the Pro also has 4 vertex pipelines compared to 6? The optimizations make it better than a 9800XT, but not by a huge margin. The X800 SE with 8 pipelines is likely going to be about 20% faster than a 9800XT. Hopefully, it will come in at a $200 price point, but I'm not counting on that for at least six months. (Which is why I recently purchased a $200 9800 Pro 128.)

    The Nvidia lineup is currently looking a little nicer. The 6800 Ultra, Ultra Special, and GT all come with 16 pipelines, and there's talk of a lower clocked card for the future. If we can get a 16 pipeline card (with 6 vertex pipelines) for under $250, that would be pretty awesome. That would be a lot like the 5900 XT cards. Anyone else notice how fast the 9800 Pro prices dropped when Nvidia released the 5900 XT/SE? Hopefully, we'll see more of that in the future.

    Bottom line has to be that for most people, ATI is still the choice. (OpenGL gamers, Linux users, and professional 3D types would still be better with Nvidia, of course.) After all, the primary benefit of NV40 over R420 - Shader Model 3.0 - won't likely come into play for at least six months to a year. Not in any meaningful way, at least. By then, the fall refresh and/or next spring will be here, and ATI could be looking at SM3.0 support. Of course, adding SM3 might just knock the transistor counts of ATI's chips up into the 220 million range, which would kill their clock speed advantage.

    All told, it's a nice matchup. I figure my new 9800 Pro will easily last me until the next generation cards come out, though. By then I can look at getting an X800 Pro/XT for under $200. :)
  • NullSubroutine - Tuesday, May 4, 2004 - link

    I forgot to ask if anyone else noticed a huge difference (almost double) between AnandTech's Unreal Tournament 2003 scores and those of Tom's Hardware?

    (It's not the CPU difference, because the A64 3200+ had a baseline score of ~278 and the 3.2 P4 had ~247 in a previous section.)

    So what gives?
  • NullSubroutine - Tuesday, May 4, 2004 - link

    To the guy talking about the 400MHz and the 550MHz, I have this to say.

    I agree with the other guy about the transistor count.

    Don't forget that ATi's cards used to be more powerful per clock compared to Nvidia's a generation or two ago. So don't be babbling fanboy stuff.

    I would agree with that one guy (the # escapes me) about the fanboy stuff, but I said it first! On this thread anyways.
  • wassup4u2 - Tuesday, May 4, 2004 - link

    #30 & 38, I believe that while the ATI line is fabbed at TSMC, NVidia is using IBM for their NV40. I've heard also that yields at IBM aren't so good... which might not bode well for NVidia.
  • quanta - Tuesday, May 4, 2004 - link

    > #14, I think it has more to do with the fact those OpenGL benchmarks are based on a single engine that was never fast on ATI hardware to begin with.

    Not true. ATI's FireGL X2 and Quadro FX 1100 were evenly matched in workstation OpenGL tests[1], which do not use Quake engines. Considering the FireGL X2 is based on the Radeon 9800XT and the Quadro FX 1100 is based on the GeForce FX 5700 Ultra, such a result is unacceptable. If I were an ATI boss, I would have made sure the OpenGL driver team would not make such a blunder, especially when R420 still sucks in most OpenGL games compared to GeForce 6800 Ultra cards.

    [1] http://www.tomshardware.com/graphic/20040323/index...
  • AlexWade - Tuesday, May 4, 2004 - link

    From my standpoint the message is clear: nVidia is no longer THE standard in graphics cards. Why do I say that? It's half the size, it requires less power, it has fewer transistors, and the performance is about the same. Even if the performance were slightly less, ATI would still be the winner. Anyway, whatever, it's not like these benchmarks will deter the hardcore gotta-have-it-now fanboys.

    It's not like I'm going to buy either. Maybe this will lower the prices of all the other video cards. $Dreams$
  • rsaville - Tuesday, May 4, 2004 - link

    If any 6800 users are wondering how to make their 6800 run the same shadows as the 5950 in the benchmark see this post:
    http://forums.relicnews.com/showthread.php?p=39462...

    Also if you want to make your GeForceFX run the same shadows as the rest of the PS2.0 capable cards then find a file called driverConfig.lua in the homeworld2\bin directory and remove line 101 that disables fragment programs.
  • raskren - Tuesday, May 4, 2004 - link

    I wonder if this last line of AGP cards will ever completely saturate the AGP 8X bus. It would be interesting to see a true PCI-Express card compared to the same AGP 8X counterpart.

    Remember when Nvidia introduced the MX440 (or was it 460?) with an 8X AGP connector...what a joke.
  • sisq0kidd - Tuesday, May 4, 2004 - link

    That was the cheesiest line, #46, but very true...
  • sandorski - Tuesday, May 4, 2004 - link

    There is only 1 clear winner here, the Consumer!

    ATI and NVidia are running neck and neck.
