Pixel Shader Performance Tests

ShaderMark v2.0 is a benchmark designed to stress test the pixel shader performance of modern DirectX 9 graphics hardware, using Shader Model 2.0 programs written in HLSL and applied to a handful of simple shapes in a scene.

We haven't used ShaderMark in the past because we don't advocate trying to predict the performance of real-world game code using a synthetic set of tests designed to push the hardware. Honestly, as we've said before, the only way to determine the performance of a certain program on specific hardware is to run that program on that hardware. As both software and hardware get more complex, the results of any given test become less and less generalizable, and games, graphics hardware, and modern computer systems are some of the most complex entities on earth.

So why are we using ShaderMark, you may ask? There are a couple of reasons. First, this is only a ballpark test. ATI and NVIDIA both have architectures that should be able to push a lot of shader operations through. It is a fact that NV3x had a bit of a handicap when it came to shader performance. A cursory glance at ShaderMark should tell us whether that handicap carries over to the current generation of cards, and whether or not R420 and NV40 are on the same playing field. We don't want to make a direct comparison; we just want to get a feel for the situation. With that in mind, here are the benchmarks.

 

| Shader | Radeon X800 XT PE | Radeon X800 Pro | GeForce 6800 Ultra | GeForce 6800 GT | GeForce FX 5950 U |
|--------|-------------------|-----------------|--------------------|-----------------|-------------------|
| 2      | 310               | 217             | 355                | 314             | 65                |
| 3      | 244               | 170             | 213                | 188             | 43                |
| 4      | 238               | 165             | n/a                | n/a             | n/a               |
| 5      | 211               | 146             | 162                | 143             | 34                |
| 6      | 244               | 169             | 211                | 187             | 43                |
| 7      | 277               | 160             | 205                | 182             | 36                |
| 8      | 176               | 121             | n/a                | n/a             | n/a               |
| 9      | 157               | 107             | 124                | 110             | 20                |
| 10     | 352               | 249             | 448                | 410             | 72                |
| 11     | 291               | 206             | 276                | 248             | 54                |
| 12     | 220               | 153             | 188                | 167             | 34                |
| 13     | 134               | 89              | 133                | 118             | 20                |
| 14     | 140               | 106             | 141                | 129             | 29                |
| 15     | 195               | 134             | 145                | 128             | 29                |
| 16     | 163               | 113             | 149                | 133             | 27                |
| 17     | 18                | 13              | 15                 | 13              | 3                 |
| 18     | 159               | 111             | 99                 | 89              | 17                |
| 19     | 49                | 34              | n/a                | n/a             | n/a               |
| 20     | 78                | 56              | n/a                | n/a             | n/a               |
| 21     | 85                | 61              | n/a                | n/a             | n/a               |
| 22     | 47                | 33              | n/a                | n/a             | n/a               |
| 23     | 49                | 43              | 49                 | 46              | n/a               |

These benchmarks were run with fp32 on NVIDIA hardware and fp24 on ATI hardware. It isn't quite an apples-to-apples comparison, but with some of the shaders used in ShaderMark, partial precision (fp16) floating point causes error accumulation (not surprising, given that this is a benchmark designed to stress shader performance).
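To illustrate why a shorter floating point format accumulates error over a long computation, here is a small sketch. This is not ShaderMark code; it simulates fp16, fp24, and fp32 significand widths (11, 17, and 24 bits) by rounding every intermediate result, using only the Python standard library:

```python
import math

# Sketch (not ShaderMark code): simulate a shorter floating point
# significand by rounding every intermediate result to a fixed number
# of mantissa bits. fp16 carries an 11-bit significand, fp24 a 17-bit
# one, and fp32 a 24-bit one.
def quantize(x, sig_bits):
    # Round x to sig_bits significant binary digits.
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)  # x = m * 2**e, with 0.5 <= |m| < 1
    m = round(m * (1 << sig_bits)) / (1 << sig_bits)
    return math.ldexp(m, e)

def accumulate(sig_bits, steps=1000, term=0.001):
    # Repeatedly add a small term, as a long shader might when layering
    # many lighting contributions; rounding error compounds every step.
    acc = 0.0
    t = quantize(term, sig_bits)
    for _ in range(steps):
        acc = quantize(acc + t, sig_bits)
    return acc

for name, bits in [("fp16", 11), ("fp24", 17), ("fp32", 24)]:
    result = accumulate(bits)
    print(f"{name}: {result:.6f} (error vs 1.0: {abs(result - 1.0):.2e})")
```

In this sketch the fp16 run drifts visibly from the exact sum of 1.0, while fp24 and fp32 stay far closer, which mirrors why partial precision was avoided for the comparison above.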

ShaderMark v2.0 clearly shows a huge increase in pixel shader performance from NV38 to either flavor of NV40. Even though the results can't really be compared apples to apples (because of the difference in precision), NVIDIA manages to keep up with the ATI hardware fairly well. In fact, the diffuse lighting and environment mapping, shadowed bump mapping, and water color shaders don't show ATI wiping the floor with NVIDIA.
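As a quick sanity check on that NV38-to-NV40 claim, the GeForce 6800 Ultra and FX 5950 Ultra columns of the table above can be divided shader by shader. The scores below are copied from the table (shaders with results for both cards); the dictionary layout is just for illustration:

```python
# Shader number -> score, copied from the results table above.
nv40_6800u = {2: 355, 3: 213, 5: 162, 6: 211, 7: 205, 9: 124, 10: 448,
              11: 276, 12: 188, 13: 133, 14: 141, 15: 145, 16: 149,
              17: 15, 18: 99}
nv38_5950u = {2: 65, 3: 43, 5: 34, 6: 43, 7: 36, 9: 20, 10: 72,
              11: 54, 12: 34, 13: 20, 14: 29, 15: 29, 16: 27,
              17: 3, 18: 17}

# Per-shader speedup of NV40 (6800 Ultra) over NV38 (5950 Ultra).
ratios = {s: nv40_6800u[s] / nv38_5950u[s] for s in nv40_6800u}
for s, r in sorted(ratios.items()):
    print(f"shader {s:2d}: NV40 is {r:.1f}x NV38")
print(f"range: {min(ratios.values()):.1f}x to {max(ratios.values()):.1f}x")
```

Every shader lands in roughly the 4.8x to 6.7x range, which is what "huge increase" means in concrete terms here.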

Looking at data collected with the 60.72 version of the NVIDIA driver, no frame rates changed, and a visual inspection of the images output by each driver raised no red flags.

We would like to stress again that these are not apples-to-apples numbers, but the relative performance of each GPU indicates that the ATI and NVIDIA architectures are very nearly comparable from a pixel shader standpoint (with each architecture favoring different types of shaders and operations).
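That back-and-forth is easy to see by comparing the Radeon X800 XT PE and GeForce 6800 Ultra columns shader by shader. The scores below are copied from the table above (shaders with results for both cards); the layout is just for illustration:

```python
# Shader number -> score, copied from the results table above.
x800_xt_pe = {2: 310, 3: 244, 5: 211, 6: 244, 7: 277, 9: 157, 10: 352,
              11: 291, 12: 220, 13: 134, 14: 140, 15: 195, 16: 163,
              17: 18, 18: 159, 23: 49}
gf_6800_u  = {2: 355, 3: 213, 5: 162, 6: 211, 7: 205, 9: 124, 10: 448,
              11: 276, 12: 188, 13: 133, 14: 141, 15: 145, 16: 149,
              17: 15, 18: 99, 23: 49}

# Ratio > 1 means ATI leads on that shader, < 1 means NVIDIA leads.
for s in sorted(x800_xt_pe):
    r = x800_xt_pe[s] / gf_6800_u[s]
    leader = "ATI" if r > 1 else ("NVIDIA" if r < 1 else "tie")
    print(f"shader {s:2d}: X800 XT PE / 6800 Ultra = {r:.2f} ({leader})")
```

Neither column dominates: NVIDIA leads on shaders 2, 10, and 14, ATI on most of the rest, shader 23 is a dead tie, and nothing is an order of magnitude apart.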

In addition to getting a small idea of performance, we can also look deep into the heart of NV40 and see what happens, in terms of performance gains, when we enable partial precision rendering mode. As we have stated before, there were a few image quality issues with the types of shaders ShaderMark runs, but this bit of analysis will stick only to how much work gets done in the same amount of time, without regard to the relative quality of the work.

| Shader | GeForce 6800 U PP | GeForce 6800 GT PP | GeForce 6800 U | GeForce 6800 GT |
|--------|-------------------|--------------------|----------------|-----------------|
| 2      | 413               | 369                | 355            | 314             |
| 3      | 320               | 283                | 213            | 188             |
| 5      | 250               | 221                | 162            | 143             |
| 6      | 300               | 268                | 211            | 187             |
| 7      | 285               | 255                | 205            | 182             |
| 9      | 159               | 142                | 124            | 110             |
| 10     | 432               | 389                | 448            | 410             |
| 11     | 288               | 259                | 276            | 248             |
| 12     | 258               | 225                | 188            | 167             |
| 13     | 175               | 150                | 133            | 118             |
| 14     | 167               | 150                | 141            | 129             |
| 15     | 195               | 173                | 145            | 128             |
| 16     | 180               | 161                | 149            | 133             |
| 17     | 21                | 19                 | 15             | 13              |
| 18     | 155               | 139                | 99             | 89              |
| 23     | 49                | 46                 | 49             | 46              |

The most obvious thing to notice is that, overall, partial precision mode increases shader rendering speed. Shaders 2 through 8 are lighting shaders (with 2 being a simple diffuse lighting shader). These lighting shaders (especially the point and spot light shaders) make heavy use of vector normalization. Since we are running in partial precision mode, this should translate to a partial precision normalize, which is a "free" operation on NV40: almost any time a partial precision normalize is needed, NV40 can schedule the instruction immediately. This is not the case for full precision normalization, so the many ~50% performance gains coming out of those lighting shaders are probably due to the partial precision normalization hardware built into each shader unit in NV40. The smaller performance gains (which, interestingly, occur on the shaders that have image quality issues) are most likely the result of decreased bandwidth requirements and decreased register pressure: a single internal fp32 register can hold two fp16 values, making scheduling and resource management much less of a task for the hardware.
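A quick way to see which shaders benefit is to divide the 6800 Ultra's partial precision scores by its full precision scores from the table above. The numbers are copied from the table; the layout is just for illustration:

```python
# GeForce 6800 Ultra scores, shader number -> score, copied from the
# partial precision results table above.
full_precision = {2: 355, 3: 213, 5: 162, 6: 211, 7: 205, 9: 124,
                  10: 448, 11: 276, 12: 188, 13: 133, 14: 141, 15: 145,
                  16: 149, 17: 15, 18: 99, 23: 49}
partial_precision = {2: 413, 3: 320, 5: 250, 6: 300, 7: 285, 9: 159,
                     10: 432, 11: 288, 12: 258, 13: 175, 14: 167,
                     15: 195, 16: 180, 17: 21, 18: 155, 23: 49}

# Percentage gain from enabling partial precision on each shader.
for s in sorted(full_precision):
    gain = partial_precision[s] / full_precision[s] - 1.0
    print(f"shader {s:2d}: {gain:+.0%}")
```

The lighting shaders 3, 5, and 6 gain roughly 40-55%, consistent with the free fp16 normalize, while shader 10 actually loses a few percent and shader 23 doesn't move at all.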

As we work on our image quality analysis of NV40 and R420, we will be paying close attention to shader performance in both full and partial precision modes (as we want to look at what gamers will actually be seeing in the real world). We will likely bring ShaderMark back for these tests as well. This is a new benchmark for us, so please bear with us as we get used to its ins and outs.

Comments

  • 413xram - Wednesday, May 5, 2004 - link

    They announced they were going to in their release anyway, later this summer. Why not now?
  • jensend - Wednesday, May 5, 2004 - link

    #61- nuts. 512 mb ram will pull loads more power, put out a lot more heat, cost a great deal more (especially now, since ram prices are sky-high), and give negligible if any performance gains. Heck, even 256 mb is still primarily a marketing gimmick.
  • 413xram - Wednesday, May 5, 2004 - link

    They (ATI) are using the same technology that their previous cards are using. They pretty much just added more transistors to perform more functions at a higher speed. I am willing to bet my paycheck that they spent nowhere close to 400 million dollars to run neck and neck with nvidia in performance. I guess "virtually nothing" is an overstatement. My apologies.
  • Phiro - Wednesday, May 5, 2004 - link

    Where do you get your info that ATI spent "virtually nothing"?
  • 413xram - Wednesday, May 5, 2004 - link

    Both cards perform brilliantly. They are truly a huge step in graphics processing. One problem I foresee, though, is that Nvidia spent 400 million dollars on development of their new nv40 technology, while ATI spent virtually nothing to have the same performance gains. Economically that is a hard pill for Nvidia to swallow.

    It is true that Nvidia's card has the 3.0 pixel shading; unfortunately, though, they are banking on hardware that is not supported upon release of the card. In dealing with video cards from a consumer's standpoint that is a hard sell. I have learned from the past that future possibilities of technology in hardware do nothing for me today. Not to mention the power supply issue, which does not help either.

    Nvidia must find a way to get better performance out of their new card (I can't believe I'm saying that after seeing the specs that it already performs at), or it may be a long, HOT, and expensive summer for them.

    P.S. Nvidia, a little advice: speed up the release of your 512 mb card. That would definitely sell me. Overclocking your 6800 is something that 90% of us in this forum would do anyway.
  • theIrish1 - Wednesday, May 5, 2004 - link


    heh, whatever.. whatever, and whatever. I love the fanboyisms....

    I admit I am a fan of ATI cards. I bought a 9700pro and a 9500pro (in my secondary gaming rig) when they first came out, and an 8500 "pro" before that... but now I want to upgrade again. I am keeping an open mind. After looking at benchmarks, it is clear that both cards have their wins and losses depending on the test. I don't think there is a clear-cut winner. nVidia got there by new innovation/technology. ATI got there by optimizing "older" technology.

    At this point, with pricing being the same... I think I still have to lean to the ATI cards. Main reasons being heat & power consumption. If the 6800U was $75 or $100 cheaper, I would probably go with that. It will be interesting to see where the 6850 falls benchmark-wise, and also in pricing. If the 6850 takes the $500 price point, where will that leave the 6800U? $450? Or will the 6850 be $550?

    Something else about the x800Pro (which, by the way, a lot of the readers/posters seem to be getting confused about, mixing up the Pro and XT models). Anyway, there are a few online stores out there still taking pre-orders for the x800PRO... for $500+. I thought the Pro was going to go at $400 and the XT at $500...?!?
  • Pumpkinierre - Wednesday, May 5, 2004 - link

    On the fabrication of the two GPUs, The Tech Report writes:

    "Regardless, transistor counts are less important, in reality, than die size, and we can measure that. ATI's chips are manufactured by TSMC on a 0.13-micron, low-k "Black Diamond" process. The use of a low-capacitance dielectric can reduce crosstalk and allow a chip to run at higher speeds with less power consumption. NVIDIA's NV40, meanwhile, is manufactured by IBM on its 0.13-micron fab process, though without the benefit of a low-k dielectric."

    The extra transistors of the 6800U might be taken up by the cinematic encoding/rendering embedded chip. Although ATI claims encoding in their X800p/XT blurb, I haven't seen much yet to distinguish it from the 9800p in this field. The Tech Report checked power consumption at the wall for their test systems, and the 6800s ramp up the power a lot quicker with GPU speed, so I'm not too hopeful about the overclock to 520MHz and 6800U Extreme GPU yields. Still, maybe a new stepping or 90nm SOI shrink might help (I noticed both manufacturers shied away from 90nm).

    Anyway brilliant video cards from North America. Congratulations ATI and Nvidia!

  • NullSubroutine - Wednesday, May 5, 2004 - link

    If it was nice sarcasm, I can laugh; if it was nasty sarcasm, you can back off. I can see it would be simple for me to overlook the map used; however, there is no indication of what Atech used. One could assume, or someone could ask for the real answer, and if they are really lucky they will get a smart-ass remark.

    After checking through 10 different reviews I found similar results to Atech when they had 25 bots, THG had none.

    Next time save us both the hassle and just say THG didn't use bots, and Atech probably did.
  • TrogdorJW - Tuesday, May 4, 2004 - link

    #54 - Think about things for a minute. Gee... I wonder why THG and AT got such different scores on UT2K4.... Might it be something like the selection of map and the demo used? Nah, that would be too simple. /sarcasm

    From THG: "For our tests in UT2004 we used our own timedemo on the map Assault-Torlan (no bots). All quality options are set to maximum."

    No clear indication of what was used for the map or demo on AT, but I'm pretty sure that it was also a home-brewed demo, and likely on a different map and perhaps with a different number of players. Clearly, though, it was not the same demo as THG used... unless THG is in the habit of giving their benchmarking demos out? Didn't think so.

    I see questions like this all the time. Unless two sites use the exact same settings, it's almost impossible to directly compare their scores. There is no conspiracy, though. Both sites pretty much say the same thing: close match, with the edge going to ATI right now, especially in DX9, while NV still reigns supreme in OGL.
