Original Link: http://www.anandtech.com/show/535



For the past few months the focus of the industry has pretty much been exclusively on the desktop microprocessor market.  With the megahertz and, just recently, the gigahertz wars occupying most of the publication space online, it is refreshing to see that die down for a bit and the graphics war heat up yet again. 

Just about one year ago, the furious 3dfx versus NVIDIA debates began erupting because of the release of the Voodoo3 and the TNT2.  While 3dfx chose to focus on fill rate and the resulting frame rate, NVIDIA chose to focus on adding playable 32-bit color support to their TNT (while the TNT supported 32-bit color rendering, a lack of memory bandwidth kept it from being a truly playable solution).  The lack of any support for 32-bit color rendering earned 3dfx quite a bit of flak from NVIDIA supporters even though, at the time, most games didn’t really benefit from switching to 32-bit color rendering.

Six months later, both 3dfx and NVIDIA were scheduled to release their next-generation parts, and once again, there was a fine dividing line between what 3dfx’s goals were and what NVIDIA’s goals were in terms of implementing features.  NVIDIA felt that it was time for the graphics card to take some of the load off of the host CPU by performing all of the transforming & lighting calculations on-board instead of on the host CPU, while 3dfx felt that hardware T&L wasn’t worth focusing on because there was still a need for greater fill rates and other features. 

Today, no one can honestly say that NVIDIA’s hardware T&L really made a difference in their gaming experience because the games that would truly take advantage of a hardware T&L engine were not out at the time the GeForce was released and are only now beginning to appear.  Nevertheless, it won’t be until later this year that a large number of games will begin to take advantage of hardware T&L.

Chances are that if 3dfx had released their next-generation Voodoo4/5 parts alongside NVIDIA’s GeForce as planned, the GeForce wouldn’t have grown to become the gaming card of choice.  But, although 3dfx was adamantly denouncing hardware T&L as an unnecessary feature for the time being, the GeForce was racking up sales and the Voodoo4/5 had yet to be seen.

As painful as it was for them to admit, 3dfx had no part that could compete with the GeForce.  Things have changed quite a bit during the past 6 months, and while the market eagerly awaits the successor to the current gaming card of choice from NVIDIA, 3dfx is finally ready to bring the Voodoo4/5 to the table. 

It will be a couple more weeks until you will see a final review of the Voodoo4/5, and although we usually don’t like doing too many previews on a single product, this time around we are armed with much more to show you than a few screenshots and promises of an amazing product.

We’ll save in-depth talk about T-Buffer and the features of the Voodoo4/5 for the final review of the product and make this preview more of an indication of what to expect from the Voodoo4/5.



The Boards

We’ve known since last November that 3dfx would be producing a total of five different boards based on their VSA-100 (Voodoo Scalable Architecture) chip.  Those boards are the Voodoo4 4500 PCI, Voodoo4 4500 AGP, Voodoo5 5000 PCI, Voodoo5 5500 AGP and the Voodoo5 6000 AGP. 

Here we have a quick refresher on what distinguishes each one of these boards from the rest:

Voodoo4 4500 AGP & PCI

  • Single 3dfx VSA-100
  • 32MB memory
  • 2 pixels per clock rendered
  • 333-367 megapixels/s
  • 2 sample full-scene anti-aliasing
  • $179 US

The Voodoo4 4500 is targeted at "mainstream consumers" and is thus the more cost effective single VSA-100 product. The Voodoo4 4500 will be available in both PCI and AGP versions at $179. Think of the Voodoo4 4500 as the Voodoo3 3000 with 32-bit rendering, large texture support, and 32MB of memory. Expect performance similar to the Voodoo3 3000, but with greatly enhanced image quality thanks to these new features.

Voodoo5 5000 PCI

  • Dual 3dfx VSA-100 SLI
  • 32MB memory
  • 4 pixels per clock rendered
  • 667-733 megapixels/s
  • Real-time full-scene anti-aliasing (2/4 sample)
  • T-Buffer digital cinematic effects
  • $229 US

The entry level for the Voodoo5 line, the 5000 PCI, is actually just $50 more than the Voodoo4 4500. You get quite a lot for that $50 though, including double the fillrate and T-Buffer effects thanks to a second VSA-100 chip. However, the 32MB of memory is slightly less effective here since texture data will be duplicated in memory thanks to the dual chip configuration. Also note that the Voodoo5 5000 is PCI only at this point. Performance is theoretically double the Voodoo4 4500 without full scene anti-aliasing enabled, or approximately the same as the 4500 with it enabled.



Voodoo5 5500 AGP

  • Dual 3dfx VSA-100 SLI
  • 64MB memory
  • 4 pixels per clock rendered
  • 667-733 megapixels/s
  • Real-time full-scene anti-aliasing (2/4 sample)
  • T-Buffer digital cinematic effects
  • $299 US

The first AGP card in the Voodoo5 line up is the 5500, which is much like the 5000 PCI with an additional 32MB of memory and an AGP interface. The increased bus transfer rate and onboard RAM serve to enhance performance as game complexity increases.

All this will cost you $70 more than the 5000 PCI ($299 versus $229), primarily to pay for the additional RAM. If RAM prices drop, expect the cost difference between the boards to also drop.



Voodoo5 6000 AGP

  • Quad 3dfx VSA-100 SLI
  • 128MB memory
  • 8 pixels per clock rendered
  • 1.33 - 1.47 gigapixels/s
  • Real-time full-scene anti-aliasing (2/4 sample)
  • T-Buffer digital cinematic effects
  • $599 US

The Voodoo5 6000 is definitely the mother of all graphics cards with easily the highest fillrate of anything available at its launch. With 128MB of RAM, texture space should not be a problem as this card will have more RAM than many systems have. 3dfx is shooting for 85 fps at 1024x768x32 in Quake 3 with full scene anti-aliasing enabled - not too shabby.

The price is quite high at $599 and is clearly targeted at the hardcore gamer. We know some people will buy it because quite a few people paid about $600 for a Voodoo2 SLI setup when it was released. The 6000 AGP will feature an external 100W power supply that hooks up to the board via a connector on the card’s back plate.



Our Board

3dfx sent us what they call a beta revision of the final Voodoo5 5500 AGP card.  The revision of the VSA-100 silicon on this board is internally referred to as A2 and is running at 166MHz.  While 3dfx made it a point to tell us that this was not a final production sample, common sense says that 3dfx has to have production quality silicon ready now, or by the beginning of May at the latest, if they expect these boards to be shipping at the end of May.





Our take on the situation is that you can’t expect the hardware to get much faster than what we have here now; the main points for improvement will be in the drivers.  At this point, it is highly unlikely that the shipping VSA-100 chips will feature a much higher clock speed than 166MHz unless 3dfx has been hiding something from us. 

The board we were provided with, as mentioned above, was a Voodoo5 5500 AGP, which features two VSA-100 chips and a total of 64MB of SDRAM also clocked at 166MHz.  Because the two chips are working together in SLI (Scan Line Interleave) mode, the 64MB of memory is split evenly between the two, and since they are essentially independent of one another, the textures in any scene must be duplicated in each set of 32MB of SDRAM.  This means that if you have a scene with 10MB of textures, it occupies a total of 20MB of memory out of the 64MB on board since each chip requires those 10MB of textures to be available to it locally. 
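The texture duplication described above can be sketched in a few lines of Python. This is only an illustration: the function name `sli_texture_budget` is made up for this example, and it ignores frame buffer and Z-buffer storage, which also come out of the same memory.

```python
def sli_texture_budget(total_mb, chips, texture_mb):
    """Return (MB consumed by textures, MB left over) on an SLI board.

    Assumption taken from the text: memory is split evenly between the
    chips and every chip keeps its own full copy of the textures.
    """
    per_chip_mb = total_mb / chips
    if texture_mb > per_chip_mb:
        raise ValueError("textures do not fit in one chip's local memory")
    consumed_mb = texture_mb * chips
    return consumed_mb, total_mb - consumed_mb


# Voodoo5 5500 AGP: 64MB total, two VSA-100 chips, 10MB of textures
consumed, left = sli_texture_budget(total_mb=64, chips=2, texture_mb=10)
print(f"{consumed}MB used by textures, {left}MB left")
# prints "20MB used by textures, 44MB left"
```

The same call with `total_mb=128, chips=4` shows why the quad chip board needs so much memory: the same 10MB of textures would occupy 40MB there.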

Each chip has its own 128-bit pathway to its 32MB of SDRAM, meaning that each chip has the bandwidth of an SDR GeForce 256.  When put together, the card theoretically has about 5.3GB/s of available memory bandwidth, but you have to take into account that textures must be duplicated in both 32MB sets, meaning that some of that bandwidth is wasted (although the amount wasted should be very little). 
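For readers who want to check the 5.3GB/s figure, here is a minimal sketch of the arithmetic. It assumes one transfer per clock (plain SDR memory), counts 1GB as 10^9 bytes, and uses the nominal 166MHz clock; the function name is ours, not 3dfx's.

```python
def peak_bandwidth_gb_s(bus_width_bits, clock_mhz, chips=1):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    Assumes one transfer per clock (SDR memory) and that each chip's
    memory bus operates independently of the others.
    """
    bytes_per_clock = bus_width_bits / 8
    return bytes_per_clock * clock_mhz * 1e6 * chips / 1e9


per_chip = peak_bandwidth_gb_s(128, 166)        # one VSA-100
board = peak_bandwidth_gb_s(128, 166, chips=2)  # Voodoo5 5500 AGP
print(f"{per_chip:.2f} GB/s per chip, {board:.2f} GB/s for the board")
```

This is a peak number only; it says nothing about how much of that bandwidth a real game can actually use.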

Each chip can render two single textured pixels per clock or one dual textured pixel per clock.  This gives a single VSA-100 chip a fill rate of 333 megapixels per second when dealing with a single textured game, or 166 megapixels per second when running a dual textured game.  For the Voodoo5 5500 AGP, this results in a fill rate of 667 megapixels per second for a single textured game or 333 megapixels per second for a dual textured game.  Seemingly ages ago, when single textured games were the only things available, this sort of a fill rate made the most sense but as most of today’s games are dual textured, this sort of flexibility is not as useful as it once was. 
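The fill rate figures above follow directly from the clock speed and the number of chips. The sketch below reproduces them; note that the 166.67MHz value is our assumption for the exact clock behind the nominal "166MHz", chosen because it matches the rounded numbers quoted for these boards.

```python
def fill_rate_mpixels(clock_mhz, chips, textures_per_pixel):
    """Peak fill rate in megapixels per second for VSA-100 based boards.

    Per the text, each chip renders two single textured pixels per clock
    or one dual textured pixel per clock.
    """
    if textures_per_pixel not in (1, 2):
        raise ValueError("only single and dual texturing are modeled")
    pixels_per_clock = 2 // textures_per_pixel
    return clock_mhz * pixels_per_clock * chips


CLOCK_MHZ = 166.67  # assumed exact value behind the nominal "166MHz"

for name, chips in [("Voodoo4 4500", 1), ("Voodoo5 5500", 2), ("Voodoo5 6000", 4)]:
    single = fill_rate_mpixels(CLOCK_MHZ, chips, textures_per_pixel=1)
    dual = fill_rate_mpixels(CLOCK_MHZ, chips, textures_per_pixel=2)
    print(f"{name}: {single:.0f} MP/s single textured, {dual:.0f} MP/s dual textured")
```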

Don’t you find it interesting that the most talked about fill rate on the Voodoo3 3500 was its 366 megatexels per second fill rate and not its 183 megapixels per second fill rate, whereas the exact opposite exists today with the Voodoo4/5?



Unlike the Voodoo3 series, which featured varying clock speeds depending on the board (the Voodoo3 2000 had a 143MHz clock, the Voodoo3 3000 had a 166MHz clock and the Voodoo3 3500 had a 183MHz clock), the Voodoo4/5 boards are all based on the same VSA-100 chip clocked at 166MHz.

The boards differentiate themselves by featuring more memory (but still 32MB per chip) and more VSA-100 chips.  This means that 3dfx doesn’t have to worry about yield problems holding them back from releasing the faster boards; if they can make a Voodoo4 4500, they can also make a Voodoo5 6000.  The only thing holding them back would be the cost/availability of memory and the availability of VSA-100 chips because they could make twice as many Voodoo5 5500s from a batch of 100 VSA-100 chips as they could Voodoo5 6000s. 
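The chip supply argument is simple division, sketched here with the chip counts from the board descriptions. The helper name `boards_from_batch` is hypothetical, and the sketch ignores memory supply, which the text names as the other constraint.

```python
# VSA-100 chips per board, as listed in the board descriptions
CHIPS_PER_BOARD = {
    "Voodoo4 4500": 1,
    "Voodoo5 5000": 2,
    "Voodoo5 5500": 2,
    "Voodoo5 6000": 4,
}


def boards_from_batch(chip_count, board):
    """Number of complete boards that a batch of working chips can populate."""
    return chip_count // CHIPS_PER_BOARD[board]


for board in CHIPS_PER_BOARD:
    print(f"{board}: {boards_from_batch(100, board)} boards from 100 chips")
```

A batch of 100 chips yields 50 Voodoo5 5500s but only 25 Voodoo5 6000s, which is the two-to-one ratio mentioned above.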

The beauty of the Voodoo5 5500 AGP as an evaluation sample is that, by disabling one of the chips, we essentially have a 32MB Voodoo4 4500 AGP card that we can also use to illustrate the performance we can expect out of that solution.

3dfx outfitted our evaluation board with eight 8MB 6ns SDRAM chips manufactured by Hyundai.  The 6ns rating means that these chips should be able to work at 166MHz (which is what they’re clocked at) and not much higher.  However, SDRAM chips are generally rated pretty liberally, meaning that a chip rated at 166MHz might be able to hit 183MHz. 
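The relationship between an SDRAM chip's nanosecond rating and its clock speed is a plain reciprocal, shown here as a small sketch (the function names are ours):

```python
def rated_clock_mhz(cycle_time_ns):
    """Convert an SDRAM cycle time rating in nanoseconds to a clock in MHz."""
    return 1000.0 / cycle_time_ns


def cycle_time_ns(clock_mhz):
    """Convert a clock speed in MHz to the cycle time it requires in ns."""
    return 1000.0 / clock_mhz


print(f"6ns rating -> {rated_clock_mhz(6):.1f} MHz")
print(f"183MHz     -> {cycle_time_ns(183):.2f} ns per cycle")
```

Running 6ns memory at 183MHz would therefore mean asking it to cycle in roughly 5.5ns, a little faster than its rating.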

Below you’ll see a picture of the board we received from 3dfx.  As you probably already know, it draws its power from the +5V power rail of your power supply courtesy of the 4-pin power connector present on the board. 

The board is incredibly long because all the components required to regulate its power must be present on the board itself, instead of relying on the AGP slot to provide the power and the motherboard to regulate it.

While we didn’t have any problems with the Voodoo5 5500 card in most of our test beds, the card did provide us with some problems when used on the ASUS P3V4X motherboard using the Apollo Pro 133A chipset.  The system would POST, but after detecting the installed drives, the system would almost always hang.  We can’t explain the issue, but it is most likely because of the pre-release nature of the hardware. 

Other than this one issue, we had no problems with the hardware. 



The Drivers

The drivers are where the biggest performance improvement will lie.  It is quite obvious that the drivers we were provided with weren’t optimized for performance across the board, since the performance of the Voodoo5 at lower resolutions such as 640 x 480 was sub-par.

At such a low resolution, there are very few factors, such as the CPU and the graphics card’s drivers, that limit the performance of a graphics card since the fill rate of the card isn’t close to being reached. 

We can expect the performance at resolutions below 1280 x 1024 to improve with updated drivers (the bulk of the improvements will occur at 640 x 480); however, at 1280 x 1024 and above, the performance should generally remain the same because we are beginning to hit the fill rate limitations of the Voodoo5 itself.
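The reasoning above can be illustrated with a toy bottleneck model in which the frame rate is set by whichever limit is lower: a resolution-independent ceiling imposed by the CPU and drivers, or the fill rate of the card. Everything here is an assumption made for illustration: the function name, the overdraw factor of 3, and the 70 and 100 fps ceilings are hypothetical inputs, not measurements from our tests.

```python
def estimated_fps(width, height, fill_rate_mpixels, cpu_driver_cap_fps, overdraw=3.0):
    """Toy bottleneck model: the slower of the two limits sets the frame rate.

    overdraw is a hypothetical average number of times each screen pixel
    is drawn per frame; cpu_driver_cap_fps is a hypothetical ceiling set
    by the CPU and drivers, independent of resolution.
    """
    pixels_per_frame = width * height * overdraw
    fill_limited_fps = fill_rate_mpixels * 1e6 / pixels_per_frame
    return min(cpu_driver_cap_fps, fill_limited_fps)


# Illustrative inputs only, not measured results
for width, height in [(640, 480), (1024, 768), (1280, 1024), (1600, 1200)]:
    before = estimated_fps(width, height, 333, cpu_driver_cap_fps=70)
    after = estimated_fps(width, height, 333, cpu_driver_cap_fps=100)
    print(f"{width} x {height}: {before:.0f} fps -> {after:.0f} fps with a higher driver ceiling")
```

With these made-up inputs, raising the driver ceiling helps at the low resolutions and does nothing at 1600 x 1200, where the fill rate term is already the smaller of the two.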

By editing a registry key, we could enable an overclocking utility in the 3dfx drivers, but our board was unable to go significantly higher than its default 166MHz clock speed. 



Benchmarking the card

We split up the benchmarks into three specific sections.  The first section is the performance of the Voodoo4 4500 AGP and the Voodoo5 5500 AGP in comparison to the rest of the cards out there, the second section is a comparison of performance including FSAA benchmarks of both the Voodoo4/5 and the GeForce using its software FSAA, and the final performance section is a visual performance comparison consisting of screenshots comparing the various incarnations of FSAA and their impacts on the gaming experience.

Our original intent was to show off the performance of the Voodoo4/5 in as many games as possible, but it quickly became apparent that to do so would not be the best approach.  Many readers suggested we use flight simulators in addition to our usual set of first person shooter benchmarks, but from our experiences with flight simulators, the limiting factor there is CPU power and not the fill rate of a video card. 

It is for this reason that a TNT2 and a Voodoo3 would be just as desirable as a GeForce to a gamer that only plays flight simulators; you are better off getting a TNT2 or a Voodoo3 and a faster CPU than shelling out the big bucks for a GeForce. 

Racing games such as Need for Speed 5: Porsche Unleashed are also not very demanding when it comes to having a video card with high fill rates.  Compared to something like Quake III Arena, NFS5 is a very simple game that requires only a powerful CPU and a decent graphics card.  Once again, it makes more sense to go after something that performs along the lines of a TNT2 or a Voodoo3 and get a faster CPU than to get something that performs like a GeForce.  You’ll only end up buying yourself a few more fps at the cost of around $200 - $300. 

Both of the aforementioned cases are areas where the Voodoo4/5 would excel, not because of its extremely high fill rate, but because these games don’t depend on one.  They will still perform quite well when 4-sample FSAA is turned on, which would reduce the Voodoo5 5500’s performance to about that of a Voodoo3 3000 (in a 16-bit color, single textured situation, though), which is just fine for both types of games.

Performance really becomes an issue with first person shooters such as Quake III Arena where a mid-range CPU is capable of driving the graphics card, but the performance of the setup hits the fill rate limitation of the graphics card before the CPU can really become a limiting factor.

Quake III Arena is still the best gaming benchmark because it scales properly with CPU speed as well as the resolution it is run at.  It also implements most of the features that upcoming games (first person shooters) will be using and thus provides an excellent metric for card performance under Quake III Arena, as well as the performance of the card in general. 

Unfortunately, there is no Direct3D equivalent of Quake III Arena in terms of a good benchmark, as UnrealTournament, while it is a great game, is a horrible benchmark.  Results in UnrealTournament vary greatly and the game does not scale very well with CPU speed or with resolution.  We included benchmarks using our own UnrealTournament benchmark, but the results aren’t nearly as reliable as those from Quake III Arena. 

In general, the performance of UnrealTournament on a system is just fine with a TNT2/Voodoo3 at resolutions of 1024 x 768 x 16 and below; once you get above that mark, you begin to hit the fill rate limitations of the TNT2/Voodoo3. 

In the end, the benchmarks you should pay the most attention to are the Quake III Arena benchmarks, because those say the most about the performance of the card.  If you’re a big UT fan, you should be fine with something that’s around TNT2 speed as long as you’re going to keep the resolution below 1024 x 768.  If you go above that, you’ll need something that has a higher fill rate than a TNT2 (i.e. GeForce or Voodoo4/5).  If you’re going to draw any conclusions from the UnrealTournament benchmarks, be sure to pay the most attention to the scores above 800 x 600 because the game is limited by more than one factor at lower resolutions. 



The Test

We chose three systems to measure the performance of these video cards.  Remember that this is a comparison of the performance of video cards, not of CPUs or motherboard platforms.

For our High End testing platform, we picked an Athlon 750 running on a KX133 motherboard.  The Athlon 750 is fast enough that it won’t be a limiting factor in the benchmarks and should also provide a good estimate of how all of the cards compared would perform on a 600 – 800MHz Athlon or Pentium III system (it will at least tell you which card would be faster).

For our Low End testing platform we picked a Pentium III 550E running on a BX motherboard.  Although this isn’t a very “low-end” processor, it is fast enough to show a performance difference between video cards without the processor stepping in as a huge limitation.  If we used something like a Celeron 466, the performance of virtually all the cards would be identical at the lower resolutions because the CPU and FSB/memory buses would be the limiting factors.  Once again, this is a test of graphics cards, not of CPU/platform performance.

For our FSAA testing, we picked a 1GHz Pentium III running on an i820 motherboard with RDRAM.  We picked this platform (we are aware that it isn’t widely available) because it eliminates virtually all bottlenecks that would otherwise be present and allows us to illustrate the biggest performance hit that enabling FSAA would result in.  Slower setups would take smaller performance hits because they have more bottlenecks.

Windows 98 SE Test System

Hardware

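CPU(s)

Intel Pentium III 550E
Intel Pentium III 1.0EB
AMD Athlon 750

Motherboard(s)

ABIT BE6 (Pentium III 550E)
AOpen AX6C (Pentium III 1.0EB)
ASUS K7V-RM (Athlon 750)

Memory

128MB PC133 Corsair SDRAM (Pentium III 550E)
128MB PC800 Samsung RDRAM (Pentium III 1.0EB)
128MB PC133 Corsair SDRAM (Athlon 750)
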
Hard Drive

IBM Deskstar DPTA-372050 20.5GB 7200 RPM Ultra ATA 66

CDROM

Philips 48X

Video Card(s)

3dfx Voodoo5 5500 AGP 64MB
3dfx Voodoo4 4500 AGP 32MB
3dfx Voodoo3 3000 AGP 16MB

ATI Rage 128 Pro 32MB

ATI Rage Fury MAXX 64MB

Matrox Millennium G400MAX 32MB (would not run on Athlon platform)

NVIDIA GeForce 256 64MB DDR (default clock - 120/150 DDR)
NVIDIA GeForce 256 32MB DDR (default clock - 120/150 DDR)
NVIDIA GeForce 256 32MB SDR (default clock - 120/166)

NVIDIA Riva TNT2 Ultra 32MB (default clock - 150/183)

S3 Diamond Viper II 32MB

Ethernet

Linksys LNE100TX 100Mbit PCI Ethernet Adapter

Software

Operating System

Windows 98 SE

Video Drivers

3dfx Voodoo5 5500 AGP 64MB - beta drivers v1.00.00
3dfx Voodoo4 4500 AGP 32MB - beta drivers v1.00.00
3dfx Voodoo3 3000 AGP 16MB - beta drivers v1.04.07

ATI Rage 128 Pro 32MB - 6.31CD25

ATI Rage Fury MAXX 64MB - A6.32CD48

Matrox Millennium G400MAX 32MB - 5.52.015

NVIDIA GeForce 256 64MB DDR (default clock - 120/150 DDR) - Detonator 5.16
NVIDIA GeForce 256 32MB DDR (default clock - 120/150 DDR) - Detonator 5.16
NVIDIA GeForce 256 32MB SDR (default clock - 120/166) - Detonator 5.16
NVIDIA Riva TNT2 Ultra 32MB (default clock - 150/183) - Detonator 5.16

S3 Diamond Viper II 32MB - 4.12.01.9002-9.10.30

Benchmarking Applications

Gaming

GT Interactive Unreal Tournament 4.04 AnandTech.dem
idSoftware Quake III Arena demo001.dm3
idSoftware Quake III Arena quaver.dm3



This is a perfect example of the immaturity of 3dfx's drivers. With any hope, shortly after the Voodoo4/5's release these scores should be up around the GeForce scores at 640 x 480, since the card is far from being fill rate limited.

As the fill rate advantage of the Voodoo5 begins to kick in, it jumps up a bit in the chart. However, the drivers are still holding it back, and NVIDIA's hardware T&L is also giving the GeForce an advantage at this low of a resolution.

Updated drivers should help performance here a little bit as well.

At 1024 x 768 we are seeing fill rate become more of an issue, which is why the 5500 creeps up to the performance of the GeForce DDR and its 64MB counterpart. One thing must be explained here: the 64MB DDR GeForce is actually performing below the 32MB DDR GeForce because we are testing with the new 5.16 drivers (which will be used with the GeForce 2 GTS). These drivers enable S3TC on the GeForce cards, which makes 32MB more than enough for texture storage in Quake III Arena. This, combined with the fact that the 32MB DDR cards feature DDR SGRAM, which is (slightly) faster than the DDR SDRAM on the 64MB cards, explains the performance differences that give the 32MB card a small advantage in most cases.

When switching to 32-bit color, the DDR GeForce is fill rate limited at 1024 x 768 x 32 whereas the Voodoo5 5500 can still pull ahead courtesy of its higher fill rate.

The 4500 is pretty disappointing; it is going to have to be much cheaper than a GeForce in order for it to be a contender this late in the game.



With the GeForce's fill rate limitations kicking in, the Voodoo5 5500 makes 1280 x 1024 very playable in 16-bit color but still lacks the memory bandwidth necessary to make 32-bit color faster than 35 fps, which isn't too bad at all. Resolutions above 1024 x 768 are where the Voodoo5 will really begin to shine, and for that matter, so will the upcoming GeForce 2 GTS.

The Voodoo4 4500 comes in below the SDR GeForce once again. The price of the card will be its only aid; with the cheapest SDR GeForce cards going for under $150 now, the Voodoo4 4500 had better be priced at around $100 - $125 in order for it to remain competitive.

1600 x 1200 is almost reasonable at 40 fps, although we'd still like to see it at 60 fps before it truly becomes a playable resolution. At the same time, a large number of users don't have monitors that can run clearly at that high of a resolution in the first place so it's not a major issue. The point is that the Voodoo5 has the fill rate power to run at such a high resolution at a reasonable frame rate.

The DDR GeForce isn't too far behind at 34.5 fps, but it lacks the fill rate power to keep up with the V5.



Shifting to the "slower" CPU we get the same start for the Voodoo4/5, an indication of drivers that definitely need work. Keep in mind that these are prerelease drivers so this was expected.

Once again we see the Voodoo5 climbing up as its fill rate power outweighs its current driver limitations.

At 1024 x 768 the Voodoo5 is essentially a DDR GeForce, but in this case, while fill rate is holding the DDR GeForce back, the CPU is holding the Voodoo5 back.



Here the Voodoo5 steps into the lead, although the performance advantage isn't as dramatic in 32-bit color mode.

Once again we have a small lead over the DDR GeForce at 1600 x 1200. The Voodoo5 6000 will surely make this resolution even more of a viable option for gamers that have a 19"+ monitor, since it will feature double the fill rate of the Voodoo5 5500.



Quake III Arena Quaver

It is almost impossible to push a video card to its limits using the built-in Quake III Arena demos. The level is not only low on the texture front, it also does not contain as many rooms as other levels. The resolutions and color depths at which the demo plays flawlessly are often too high for playing the full game.

In order to fully test how memory bandwidth and RAM size affect video card performance, a separate and more stressful benchmark is needed. For the purposes of this test, we turned to Anthony "Reverend" Tan's Quaver benchmarking demo. It is a well known fact that some levels of Quake III Arena push video cards harder than others. The most notorious of these levels is the Q3DM9 map, with a reported 30 plus megabytes of textures.



Driver limitations push the Voodoo4/5 down below the previous generation Voodoo3. Once the final drivers are released, we expect performance to be at least on par with the SDR GeForce.

Both the Voodoo4 and Voodoo5 jump up in performance here, and are virtually identical in 16-bit color performance but the added memory bandwidth of the 64MB 5500 gives it more than a 33% improvement over the 32MB 4500 in 32-bit color.

The Voodoo5 5500 is struggling to keep up with the GeForce in 16-bit color, but because of its slight memory bandwidth advantage and greater fill rate, we see it pull ahead in 32-bit color which is what matters in the end.

The Rage Fury MAXX still seems to suffer from whatever problem plagued it in our initial review of the card at higher resolutions. Some rumors seem to point to a poorly implemented AGP interface on the card; whatever it is, the card didn't do well in our memory bandwidth intensive tests. If you consider Quaver to be the "crusher" of the Quake III benchmarks, then you'll want to stay away from the MAXX, at least if you plan on playing in 32-bit color mode (which is very desirable in Quake III Arena).



Once again the Voodoo5 takes the lead at 1280 x 1024. It is interesting to note how much of a benefit the 32MB DDR GeForce gains from S3TC, which is enabled in the 5.16 drivers we used in this test. It makes the 64MB DDR GeForce utterly useless; maybe that's why you don't see any 64MB DDR GeForces around...



We'll let this slide because these are prerelease drivers, but the die-hard Quake players (you know, the ones that run at the lowest resolutions with every feature turned off) won't be very happy if the release drivers are performing like this at 640 x 480.

The same scenario plays out once again: the Voodoo4 and 5 creep up the chart as the resolution increases, with the 4500 matching the 5500 in 16-bit color but dropping noticeably in 32-bit color because the card essentially has half the memory bandwidth of the 5500.



The 550E standings are pretty much identical to the Athlon 750 standings; the performance numbers are simply lower, as you would expect.



UnrealTournament Performance

While UnrealTournament does offer native support for Glide, we refrained from testing the Voodoo5 in Glide. Why? Have a look at the performance numbers taken from a Voodoo5 running UnrealTournament in Glide vs Direct3D:

The first thing to notice is that there is no performance difference between 16 and 32-bit color when running in Glide. This is most likely due to UnrealTournament not allowing 32-bit color/textures when running in Glide, especially since the scores were perfectly identical between the 16 and 32-bit color modes.

Secondly, as the resolution increases, the performance of the Voodoo5 running in Glide mode drops below that of Direct3D. One possible explanation is that Glide wasn't meant to be run at such high resolutions.

Regardless, the UnrealTournament scores were already difficult to explain because of the number of limitations acting on the UnrealTournament engine (look back at our reasons that UnrealTournament isn't a good benchmark) and adding Glide scores wouldn't do much good other than adding two more lines to the graphs.



This is proof that UnrealTournament is extremely limited initially as a benchmark. At 640 x 480 there should (theoretically) be no limitations acting on the game from a video card standpoint, yet the performance of all of the cards generally falls in a very small range (with the exception of a few outliers such as the Viper II running in D3D mode which is just due to bad D3D drivers).

Once again, there is very little of a performance difference between the previous generation TNT2 Ultra/Voodoo3 and the Voodoo4/5 until you move to 32-bit color which doesn't make a huge difference in UT in the first place.

Even at 1024 x 768 you can get away with running a TNT2 and having a fun UnrealTournament experience.



At 1280 x 1024 we start seeing the cards with the more powerful fill rates pull ahead although for some reason the entire GeForce line is crippled in 32-bit color modes leaving the Voodoo5 to come in and play clean up.



The same issue exists here, where even at 640 x 480 there seems to be a hidden limitation of the UnrealTournament engine acting on the test systems.





Full Scene Anti-Aliasing

Before we dive into the FSAA support of the Voodoo5 we tested, let's take a look at what FSAA is and how it is accomplished on the Voodoo5 courtesy of its "T-Buffer."

We've all probably seen aliasing rear its ugly head, even if you don't use any 3D graphics, because it's a problem even in the world of 2D computer graphics. This can be seen in the "jaggies" found in computer graphics around diagonal lines and round edges as shown below. This is usually what comes to mind when thinking about aliasing issues, and it's just as much of a problem in the 3D world, if not more. To get technical, this is known as spatial aliasing, where, as the name implies, the problem occurs in space.


Image courtesy 3dfx Interactive

Anti-aliasing is a technique that removes these "jaggies" by filling in with intermediate shades to smooth things out. This is relatively easy to implement in 2D and is even available from Windows 98 display properties for screen fonts. But in 3D, things become exponentially more complicated and no consumer solution can implement true anti-aliasing in hardware. Further, in 3D there's an additional problem: certain distant objects end up being less than a pixel wide on screen and are sometimes shown, but other times not. This is known as pixel "popping," and is a potentially larger problem than just "jaggies."

Images Copyright 1998, Mango Grits, Inc.

Many cards claim support for anti-aliasing by implementing "edge" anti-aliasing or anti-aliasing through "oversampling." Edge anti-aliasing is accomplished by tagging which polygons are an edge and then going back and letting the CPU perform anti-aliasing on these edges after the scene is rendered. In order for a game to support this, it has to be designed with this in mind as the edges have to be tagged. The extra steps cause serious latency issues and suck up all the CPU power. Oversampling is simply rendering a scene at a higher resolution than the final output and then scaling it down. This technique is implemented by the PowerVR architecture. Of course, it takes a lot more power to render at 1600x1200 and then scale down to 800x600. In other words, both techniques are useless for games, but are implemented for OEM "checklists" and improving 3D Winbench quality scores.
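The oversampling approach described above, rendering at a higher resolution and scaling down, can be sketched with a simple box filter on a grayscale image. This is a generic illustration of the idea, not the filter any particular card uses; the function name and the tiny 4 x 4 test image are made up for the example.

```python
def downsample(image, factor):
    """Scale a grayscale image down by averaging factor x factor blocks.

    image is a list of rows of pixel intensities whose dimensions are
    assumed to be multiples of factor.
    """
    out_height = len(image) // factor
    out_width = len(image[0]) // factor
    result = []
    for y in range(out_height):
        row = []
        for x in range(out_width):
            block = [
                image[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


# A hard diagonal edge rendered at 4 x 4 (white = 255, black = 0)
hi_res = [
    [255, 0, 0, 0],
    [255, 255, 0, 0],
    [255, 255, 255, 0],
    [255, 255, 255, 255],
]
print(downsample(hi_res, 2))
# prints [[191.25, 0.0], [255.0, 191.25]]
```

The two output pixels that straddle the edge come out as an intermediate shade (191.25) instead of pure black or white, which is exactly the smoothing effect anti-aliasing is after. The cost is also visible: four pixels had to be rendered for every one that is displayed.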

The T-Buffer provides true full scene anti-aliasing that solves both pixel popping and jaggies. Perhaps the best thing about the T-Buffer is that it is simply turned on in the driver and is then automatically applied to any game ever written for any API. As a complete hardware solution, there is no software or driver overhead.



FSAA Performance

The one thing that 3dfx failed to mention when they first started talking about FSAA was the performance hit. The Voodoo5 offers two forms of FSAA, 2 sample and 4 sample (the Voodoo4 will only offer 2 sample FSAA for performance reasons). The 2 sample FSAA essentially renders the scene twice and blends the two scenes in order to remove some of the "jaggies" while 4 sample FSAA renders the scene four times in order to remove most of the "jaggies" present in that particular scene.

Because it is simply re-rendering a scene x-number of times, 2 sample FSAA will reduce the fill rate to 1/2 of what it was without FSAA enabled and 4 sample FSAA will reduce the fill rate to 1/4 of what it was without FSAA. Now if you're running a game at a resolution that isn't hitting the peak fill rate of the Voodoo5, then the performance hit caused by moving to 2 or 4 sample FSAA should be noticeable but will still allow your game to play smoothly.
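As a quick sketch of that arithmetic, the effective fill rate is just the base fill rate divided by the number of samples. The function name is ours, and the 667 MP/s input is the Voodoo5 5500's single textured figure quoted earlier.

```python
def fsaa_fill_rate(base_mpixels, samples):
    """Effective fill rate when every pixel is rendered `samples` times."""
    if samples not in (1, 2, 4):
        raise ValueError("only no FSAA, 2 sample and 4 sample modes are modeled")
    return base_mpixels / samples


BASE_MPIXELS = 667  # Voodoo5 5500, single textured, in MP/s

for samples in (1, 2, 4):
    rate = fsaa_fill_rate(BASE_MPIXELS, samples)
    print(f"{samples} sample(s): {rate:.1f} MP/s")
```

With 4 sample FSAA the 5500 is left with roughly 167 MP/s, which is why its performance in that mode ends up in Voodoo3 3000 territory.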

Racing games and flight simulators fall into this category, since they generally aren't fill rate limited.

However, if you have a game that is beginning to expose the fill rate limitations of your card, such as Quake III Arena, then enabling 2 or 4 sample FSAA could be deadly to your frame rate and render your game virtually unplayable.

We chose Quake III Arena's demo001 benchmark in order to illustrate the worst case scenario effects of enabling 2 and 4 sample FSAA and what kind of performance hit you'll be taking because of it. Keep in mind that a first person shooter isn't the best place for FSAA, but because a first person shooter is also the most fill rate demanding type of game it represents the largest performance hit you can expect when enabling 2/4 sample FSAA.

As you can see, the performance hit is incredible in Quake III Arena, which makes enabling FSAA an unrealistic option for Quake III or any other first person shooter. At the same time, FSAA doesn't make that big of a difference in first person shooters, because the action is so fast paced that you don't have time to notice whether the bloody steps you're walking on are jagged.

Switching to 32-bit color only worsened the performance hit caused by 2/4 sample FSAA. The tests wouldn't even complete in 4 sample FSAA mode at some of the higher resolutions in 32-bit color mode.

The games that FSAA truly shines in don't need a video card with a 667MP/s fill rate; they don't even need a card with a 480MP/s fill rate. They run just fine on something like a Voodoo3 or a TNT2, which makes the performance hit caused by 2 or 4 sample FSAA much easier to bear.

We played Need for Speed 5: Porsche Unleashed at 800 x 600 with 4 sample FSAA enabled at very reasonable frame rates. We estimated the performance at 60 - 80fps at 800 x 600 x 16 with 4 sample FSAA enabled. Unfortunately, moving over to 32-bit color dropped the performance noticeably; we estimated it at around 25 - 45fps at 800 x 600 x 32.

Is it worth it? We think so, at least in games like NFS5, but because the final decision is up to you, we put together some screenshot comparisons for you all to have a look at.



FSAA Screenshots - Quake III Arena

All screenshots were taken using Hypersnap-DX available at http://www.hyperionics.com

We've highlighted the key points to look for, but in order for you to see the true difference in the images you'll want to download the uncompressed Targa files which vary in size from 900KB up to 5.5MB for the 1600 x 1200 shots.


 

2 Sample FSAA

 

4 Sample FSAA

Original Files (*.tga)

640 x 480 - No FSAA - 2X FSAA - 4X FSAA
800 x 600 - No FSAA - 2X FSAA - 4X FSAA
1024 x 768 - No FSAA - 2X FSAA - 4X FSAA
1280 x 1024 - No FSAA
1600 x 1200 - No FSAA



FSAA Screenshots - NFS5

Click the images to download the original files (*.tga)

No FSAA

 

2 Sample FSAA

 

4 Sample FSAA



More FSAA Screenshots - NFS5

No FSAA


Notice the spoiler on the GT2; the "jaggies" really ruin the look.

2 Sample FSAA


Notice how most of the "jaggies" are gone but some still remain

4 Sample FSAA


Now they've almost completely vanished



Conclusion

Only so much can be said about a prerelease product, but considering that the Voodoo5 is due out in stores in the next 3 - 5 weeks, it is safe to say that at least some of the performance numbers you've seen here today (those above 800 x 600) are indicative of the performance of the shipping Voodoo4 4500 and Voodoo5 5500.

To some that may seem pretty disappointing, but it all depends on what you were expecting from the card. In most cases, the Voodoo5 5500 manages to outperform a DDR GeForce, though not by an incredible amount, and the Voodoo4 4500 has quite a bit of difficulty keeping up with an SDR GeForce.

Whether the FSAA effects are worthwhile is up to the individual gamer to decide. If all you play are first person shooters, then you probably won't get any real benefit from the FSAA support of the Voodoo5. On the other hand, if you're really into racing games or flight simulators, then FSAA may come in handy.

Although the Voodoo4 will support 2 sample FSAA, its performance isn't high enough for you to truly enjoy the experience with it enabled. You'll probably want to stay away from it unless you don't mind playing at performance levels lower than those of a Voodoo3 3000.

The argument that running a game at 1280 x 1024 or 1600 x 1200 looks just as good as 2 or 4 sample FSAA is not entirely true. While a game running at 800 x 600 with 2 sample FSAA may remove just as many jagged edges as running it at 1600 x 1200, there is a clear difference between running at 1600 x 1200 without FSAA enabled and at 800 x 600 with 4 sample FSAA enabled. It is really a subjective question, but the difference is noticeable; whether it is worth the performance penalty is another matter.
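For a rough sense of the performance side of that trade-off, the Python sketch below simply counts the pixels the card has to render per frame in each mode. It assumes the cost of FSAA scales directly with the sample count and ignores memory bandwidth and everything else; the function name is ours.

```python
def pixels_rendered(width, height, fsaa_samples=1):
    """Pixels rendered per frame, assuming each sample is a full extra render."""
    return width * height * fsaa_samples


low_res_2x = pixels_rendered(800, 600, 2)
low_res_4x = pixels_rendered(800, 600, 4)
high_res = pixels_rendered(1600, 1200)

print(f"800 x 600, 2 sample FSAA: {low_res_2x:,} pixels")
print(f"800 x 600, 4 sample FSAA: {low_res_4x:,} pixels")
print(f"1600 x 1200, no FSAA:     {high_res:,} pixels")
```

By this count, 800 x 600 with 4 sample FSAA renders exactly as many pixels as 1600 x 1200 without FSAA, while 800 x 600 with 2 sample FSAA renders half as many.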

Unfortunately for 3dfx, more than one manufacturer will be offering a form of FSAA in their next-generation product and the GeForce already offers a software driven FSAA using the 5.xx drivers. Later this week, we'll compare 3dfx's T-Buffer solution to other competing solutions and find out if 3dfx really has something all that unique with the Voodoo5. Until then, eat up that eye candy.
