Original Link: http://www.anandtech.com/show/641
It was close to one year ago that 3dfx invited us all to Madame Tussaud’s Wax Museum at the Venetian Hotel in Las Vegas. Among the more than 100 wax statues, 3dfx announced what would most definitely be their rendition of the comeback kid, the Voodoo Scalable Architecture. In its first incarnation, the VSA-100 chip proved to be a solution that addressed all of 3dfx’s shortcomings with the Voodoo3. While still not offering AGP texturing support, a feature we have yet to see truly influence performance, the VSA-100 gave 3dfx the 32-bit color support that its products had been lacking all along. With 32-bit color rendering slowly becoming an accepted standard, its support was deemed necessary for 3dfx’s upcoming line to even have a chance of being accepted by the market.
Just like they had done at the previous year’s Fall Comdex, 3dfx quietly stole the limelight with a very impressive product lineup. However this time around, there was much more than three different clock speeds of a Voodoo3 to talk about. Instead, 3dfx showed off the true meaning of the word “Scalable” by introducing a line of VSA-100 based cards that featured one, two or even four VSA-100 chips. These cards were named the Voodoo4 4500, Voodoo5 5000, Voodoo5 5500 and the Voodoo5 6000. Both the Voodoo4 4500 and the Voodoo5 5500 were supposed to have PCI counterparts, while the 5000 was supposed to be introduced as a PCI-only card.
But if we fast-forward to the present day, you’ll quickly realize that only two of the four models made it to the market. The Voodoo4 4500 and the Voodoo5 5500 are currently the only VSA-100 based cards that are available; fortunately, both are available in either AGP or PCI form. The card we were originally looking forward to was the Voodoo5 5000, since it was supposed to be marginally more expensive than the Voodoo4 4500 yet boast the same fillrate as the Voodoo5 5500, the only downside being its smaller 32MB frame buffer.
Unfortunately, the 5000 never made it to market, leaving 3dfx’s product line with only a high-end and a low-end offering. We’ve already seen what their flagship Voodoo5 5500 can do, in both AGP and PCI flavors; now it’s time to take a look at the Voodoo4 4500 in greater depth.
This is the card that will undoubtedly be compared to NVIDIA’s GeForce2 MX and ATI’s recently announced Radeon SDR, since these are the cards in its price range; however, with the Radeon 32DDR selling for $180 with a $30 rebate, it may be too difficult for the Voodoo4 4500 to compete this late in the game.
Let’s take a look at those chip specs and benchmarks before we decide the fate of 3dfx’s latest release.
As we mentioned earlier, the VSA comes from the name “Voodoo Scalable Architecture” with the stress placed on the word “Scalable.” The “Scalable” part of the VSA architecture is in the fact that up to 32 VSA-100 chips can be linked together in “SLI” mode, which divides the task of rendering and displaying the lines on your screen among the chips. We have already seen this taken advantage of with the Voodoo5 5500, which features two VSA-100 chips, and we may see its return yet again with the Voodoo5 6000, which should boast four total VSA-100 chips. However, when it comes to being “Scalable,” you’re not talking about making a low-cost product, and thus the Voodoo4 4500 leaves this part of 3dfx’s VSA unused as the card features a single VSA-100 chip.
Compared to the Voodoo3, the VSA-100 adds support for 32-bit color rendering, 32-bit textures, 32/24-bit Z & W, and an 8-bit stencil buffer. Furthermore, the VSA-100 can also render two single-textured pixels per clock or one dual-textured pixel per clock. Support for 2048 x 2048 textures has now been implemented into the VSA-100, thus the VSA-100 offers essentially everything the Voodoo3 lacked.
The chip is an AGP 4X part, with support for AGP 2X, AGP 1X and PCI operating modes. In spite of this, the VSA-100 does not support AGP texturing. 3dfx still feels that AGP texturing is not truly beneficial and thus there is no reason to pursue support for it with their products. The chip itself is composed of 14 million transistors, a little more than half the count of the original GeForce, and is manufactured on an enhanced 0.25-micron, 6-layer metal process. The "enhanced" 0.25-micron process just means that it takes advantage of shorter gate lengths, which allow for faster switching, thus allowing for higher frequencies and greater yields at those frequencies.
At the launch, 3dfx claimed that they would get better yields out of the tried and true 0.25 micron process than they would by moving to a 0.22 or 0.18 micron process like their competitors. Thanks to the long delays in getting the VSA-100 products to market, this strategy has more or less backfired on 3dfx, leaving them with a slower, hotter, more expensive chip. To compound things, they're apparently not able to get enough chips out of their plant, TSMC in Taiwan, even though that's the same plant that NVIDIA uses.
The VSA-100 supports all T-Buffer effects, Full Screen Antialiasing, FXT1/DXTC texture compression and all of the other features 3dfx has been talking about for the past few months. For more information on those technologies, read our in-depth coverage of the T-Buffer.
The VSA-100 supports anywhere from 4MB to 64MB of memory per chip, whose clock is synchronized with the core clock, just like the Voodoo3. The memory bus is 128 bits wide and will offer 2.7GB/s of memory bandwidth per chip. The excellent 350MHz RAMDAC of the Voodoo3 is carried over to the VSA-100, so 2D image quality is up there with the best.
From the above description, the VSA-100 doesn’t appear to be much more than a Voodoo3 with support for a few new visual features and 32-bit color rendering support, but the chip’s support for up to 32-way SLI scalability (hence the name Voodoo Scalable Architecture) is what truly defines it and sets it apart from the Voodoo3.
The single VSA-100 chip on the Voodoo4 4500 features the same 128-bit path to its local memory as each individual chip on the Voodoo5 5500 does, the only difference here being that the Voodoo4 4500 features exactly half the memory as its older brother. With 32MB of SDRAM dedicated to the single VSA-100 chip, the Voodoo4 4500 has exactly half the memory storage and memory bandwidth of the 5500 since each chip on the 5500 gets a dedicated 128-bit path to 32MB of the total 64MB on-board. At the same time, since there is only a single VSA-100 chip on the Voodoo4 4500, it gets all of the memory bandwidth to itself.
The VSA-100 chip is clocked at 166MHz and since the memory clock is synchronous with the core clock, the 32MB of SDRAM is also clocked at 166MHz. Our evaluation card featured four 8MB 6ns Toshiba SDRAM chips. With a 6ns rating, these chips carry a 166MHz operating frequency, meaning that you shouldn’t expect to be able to push them too much higher than their original 166MHz clock frequency. This also means that at 166MHz, the Voodoo4 4500 has a total of 2.7GB/s of peak available memory bandwidth, which is equivalent to the memory bandwidth on the Radeon SDR (almost), the GeForce 256 (SDR) and the GeForce2 MX.
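The 2.7GB/s figure follows directly from the bus width and the memory clock. As a rough sketch of that arithmetic (the function names are our own for illustration, and we assume single data rate SDRAM and decimal gigabytes):

```python
def rated_clock_mhz(cycle_time_ns: float) -> float:
    """Convert an SDRAM cycle-time rating in nanoseconds to a clock in MHz."""
    return 1000.0 / cycle_time_ns


def peak_bandwidth_gb_s(bus_width_bits: int, clock_mhz: float,
                        transfers_per_clock: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_per_clock is 1 for SDR memory; 2 would model DDR.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9


clock = rated_clock_mhz(6.0)                 # 6ns chips -> about 166.7 MHz
bandwidth = peak_bandwidth_gb_s(128, clock)  # 128-bit bus, SDR
print(f"{clock:.1f} MHz, {bandwidth:.2f} GB/s")
```

The result is just under 2.67GB/s, which rounds to the 2.7GB/s quoted above. This is a theoretical peak; real-world throughput will be lower.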
As we mentioned before, the VSA-100 can render two single textured pixels per clock or one dual textured pixel per clock, the latter being the more informative number when dealing with today’s games. The 166MHz core clock frequency translates into a dual textured fill rate of 166 megapixels per second or 333 megatexels per second. Since the Voodoo4 4500 only features one of these VSA-100 chips, it takes the crown as having the lowest fillrate out of all other boards in its class, with the GeForce 256 SDR boasting a 480 megatexels/s fillrate and the Radeon SDR boasting a 1 gigatexel/s fill rate. At the same time, you’ll have to realize that memory bandwidth limitations will kick in before either of those theoretical maximum fill rates will kick in during real world gameplay.
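The fill rate numbers for the VSA-100 can be reproduced with a few lines of arithmetic. This is only a sketch of the theoretical peaks; the helper name is ours, and we assume the nominal 166MHz clock is really 166.67MHz (1000/6):

```python
def fill_rates(core_mhz: float, pixels_per_clock: int,
               textures_per_pixel: int) -> tuple[float, float]:
    """Return theoretical peak (megapixels/s, megatexels/s)."""
    mpixels = core_mhz * pixels_per_clock
    mtexels = mpixels * textures_per_pixel
    return mpixels, mtexels


CORE_MHZ = 1000 / 6  # assumption: nominal 166MHz is 166.67MHz

# Two single-textured pixels per clock, or one dual-textured pixel per clock.
single = fill_rates(CORE_MHZ, pixels_per_clock=2, textures_per_pixel=1)
dual = fill_rates(CORE_MHZ, pixels_per_clock=1, textures_per_pixel=2)

print(f"single-textured: {single[0]:.0f} Mpixels/s, {single[1]:.0f} Mtexels/s")
print(f"dual-textured:   {dual[0]:.0f} Mpixels/s, {dual[1]:.0f} Mtexels/s")
```

Either way the chip tops out at roughly 333 megatexels per second; only the pixel rate changes between the two modes.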
Unlike its larger brother, the Voodoo4 4500 does not feature an external power connector and is powered completely from the AGP or PCI slot. The same AAVID heatsink/fan that is found on each of the two VSA-100s on the Voodoo5 5500 is found on the single VSA-100 on the Voodoo4. It is attached to the chip using thermal glue.
So what you’re essentially getting with the Voodoo4 4500 is a Voodoo5 5500 with one chip disabled (which also disables half of the memory on the card). However, 3dfx has made a few changes to the Voodoo4 that separate it from the Voodoo5 5500. We already mentioned that there is no +5V power connector on the card itself since it’s only using a single VSA-100 chip. This helps to reduce cost, as does the fact that the board is physically smaller than the Voodoo5 5500. The only other thing to change with the Voodoo4 is the fact that 3dfx only allows up to 2-sample FSAA. The reason for this is simple: by enabling 2-sample FSAA, you already cut the Voodoo4 4500’s fill rate in half; by allowing users to enable 4-sample FSAA, the fill rate would be cut to a fourth and very few games would be playable. While there is most likely a hack to get around this limitation and enable 4-sample FSAA, there’s really no reason to use it, as the Voodoo4 4500 simply doesn’t have the fill rate to make that a viable option.
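To put numbers on the FSAA penalty, here is a minimal sketch that assumes effective fill rate scales inversely with the number of samples, as described above. The function name and the 1000/6 MHz base clock are our own assumptions:

```python
def fsaa_fill_rate(base_fill_rate: float, samples: int) -> float:
    """Effective fill rate when every pixel is rendered `samples` times."""
    if samples < 1:
        raise ValueError("samples must be at least 1")
    return base_fill_rate / samples


# Dual-textured peak of a single VSA-100, in megapixels/s (assumed 166.67MHz).
BASE_MPIXELS = 1000 / 6

for samples in (1, 2, 4):
    rate = fsaa_fill_rate(BASE_MPIXELS, samples)
    print(f"{samples}-sample: {rate:.1f} megapixels/s")
```

With 4-sample FSAA a single chip would be left with only around 42 dual-textured megapixels per second, which helps explain why 3dfx capped the card at 2 samples.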
For more information on the various FSAA settings and how they compare among 3dfx, NVIDIA and ATI, please read our FSAA Comparison, as we will not focus on FSAA in this review.
Windows 98 SE Test System
|CPU(s)||AMD Athlon (Thunderbird) 1.1GHz|
|Memory||128MB PC133 Corsair SDRAM (Micron -7E Chips)|
|Hard Drive||IBM Deskstar DPTA-372050 20.5GB 7200 RPM Ultra ATA 66|
|Video Card(s)||3dfx Voodoo4 4500 AGP 32MB, ATI Radeon 64MB DDR, GeForce2 MX 32MB SDR|
|Ethernet||Linksys LNE100TX 100Mbit PCI Ethernet Adapter|
|Operating System||Windows 98 SE|
|Benchmark||Quake III Arena demo001.dm3|
At 640 x 480 x 32 there are three major factors that play into the performance of a graphics card that isn't fill rate limited at such a low resolution: driver maturity, hardware T&L performance and CPU performance. Since we're using the same CPU for all of these cards, and since the Voodoo4 4500 is using pretty mature drivers now, the main reason that the Voodoo4 4500 is lagging behind the rest of the competition is its lack of a hardware T&L unit. The same holds true for the Voodoo5 5500, since the rest of the cards in this list have their own hardware T&L engines.
Because of this, as well as the much lower fill rate of the card, the Voodoo4 4500 is almost 33% slower than the GeForce2 MX and 26% slower than the Radeon SDR.
Normally we make the argument that 640 x 480 x 32 isn't the resolution most people are playing at so it doesn't make sense to pay attention to these scores other than to draw conclusions about drivers or T&L performance. However, in the case of the Voodoo4 4500, as you're about to see, performance at 1024 x 768 x 32 isn't exactly ideal, making 640 x 480 and 800 x 600 two resolutions that you may find yourself playing at with this card.
Whereas the Radeon SDR and the GeForce2 MX are hovering around the 60 fps mark at 1024 x 768 x 32, the Voodoo4 4500 is feeling the pain of its fill rate limitation and comes in very close to 50% slower than its two major competitors.
While the Radeon SDR and the GeForce2 MX are both memory bandwidth limited in this case and could improve in performance with a boost in memory clock, the Voodoo4 4500 simply doesn't have the fill rate to compete with those two.
None of the cards in this market segment are truly reasonable solutions for running at 1600 x 1200 x 32. While 30 fps is a decent number to shoot for, it's definitely not as smooth and as desirable as 60 fps which is what we've become used to in recent times.
As we switch game engines to MDK2 we get another illustration of what the hardware T&L units from ATI and NVIDIA are buying them in terms of low resolution performance.
The GeForce2 MX extends a 57% lead over the Voodoo4 4500 while the Radeon SDR holds a 37% advantage. But once again, this is at 640 x 480 x 32, and most users would like to crank up the resolution just a bit. Let's see how the Voodoo4 4500 handles the higher resolution in a game that isn't as complicated as Quake III Arena...
While the GeForce2 MX is able to remain closer to 70 fps, and the Radeon SDR sticks its head up above 60 fps, being fill rate limited the Voodoo4 4500 finds itself yet again at the 40 fps mark. While this isn't horrible performance, it's probably going to force you to kick the resolution down to 800 x 600 x 32 in a situation like this.
Again at 1600 x 1200 we don't see any reason to even compare these cards. The Radeon SDR's higher fill rate and HyperZ features help give it a 24% lead over the GeForce2 MX and a 73% lead over the Voodoo4 4500. However, even with that, at 26 fps you're probably not going to be enjoying the benefits of 1600 x 1200 x 32 for too long.
As we've seen before, it's going to take something with incredible fill rate and memory bandwidth, such as the GeForce2 Ultra, to make 1600 x 1200 x 32 a playable resolution. Too bad the Ultra is way out of this price range.
In UnrealTournament, the Minimum Frame Rate at 640 x 480 is mostly CPU limited for the contenders here since the game doesn't take advantage of any of the competing hardware T&L engines.
Without the inclusion of the Voodoo4 4500 there's a 10% spread of frame rates, however including the Voodoo4 increases that spread to 20% which is an indication of the performance to come...
Once again, in the average frame rate score at 640 x 480 x 32, the Voodoo4 4500 falls noticeably behind the competition. Outside of the Voodoo4, the slowest card here is only 5% slower than the fastest Voodoo5 5500. But if you look at the Voodoo4 4500 you're looking at a performance hit of over 20% when compared to the next to last contender.
This is still only at 640 x 480 x 32, and while 76 fps is definitely a playable frame rate let's see how bad things get as the resolution increases...
The Voodoo4 4500 really suffers under the worst case scenario here.
While we've historically knocked UT for being a bad benchmark, combined with a solid demo we can really get quite a bit of information out of it. Case in point being the performance of the Voodoo4 4500 under the worst possible conditions, and as you can see the GeForce2 MX is no less than twice as fast as the Voodoo4 here.
The Radeon SDR benefits from its HyperZ technology and pulls even further ahead of the Voodoo4, surpassing the GeForce2 MX in performance and ending up only 6% slower than the Voodoo5 5500.
UnrealTournament, being a very texture hungry game, makes very good use of the Radeon's efficient memory bandwidth management techniques (HyperZ) and allows it to perform virtually on par with the Voodoo5 5500.
The GeForce2 MX suffers a bit because of its memory bandwidth limitations, but the Voodoo4 4500, with the same amount of memory bandwidth as the GeForce2 MX but not nearly as high a fill rate, comes in around 56% slower once again.
UnrealTournament is actually pretty well behaved at higher resolutions. While it has a problem with a lot of our cards running at 1600 x 1200 x 32, thus forcing us to run the benchmark at 1600 x 1200 x 16, the fact that there is no real performance difference between the two settings in UT helps us use these scores to continue our analysis.
The minimum frame rates are much closer as the GeForce2 MX and Radeon SDR are crippled by memory bandwidth limitations here. Percentage-wise, the Radeon SDR is still 55% faster, and the GeForce2 MX isn't too far behind; however, when you look at the frame rates, they're all pretty bad. So let's take a look at how the averages are doing...
While 1600 x 1200 x 16 may bring all of our competing cards to their knees, when they are allowed to shine they really do. The Radeon SDR truly knows how to manage its memory bandwidth as it performs dangerously close to its DDR brothers while extending a 74% lead over the Voodoo4 4500. The GeForce2 MX is much closer to the Voodoo4's performance with only a 33% lead.
16-bit vs 32-bit Performance
We recently switched our testing methodology to a 32-bit only test suite for card performance; however, it is important to investigate 16-bit performance as well, and thus we have put together a section on the card's 16-bit performance in comparison to its 32-bit performance.
Since the Voodoo4 and the GeForce2 MX have identical memory bandwidth figures, it makes sense that they both take a similar hit when going from 16-bit color to 32-bit color rendering.
With the Voodoo4 4500, just like the GeForce2 MX, you may find yourself wanting to turn off 32-bit color just to get that increased frame rate. Although you will notice that the 16-bit performance of the GeForce2 MX is much greater than that of the Voodoo4 4500.
Once again, because of the similarities in memory bandwidth the performance drops are virtually identical. For the Voodoo4, its 16-bit rendering performance at worst is 73% faster than its 32-bit rendering performance and for the GeForce2 MX we have a very close 71% difference.
CPU Scaling Performance
We purposefully benchmark all of the cards on a fairly high performance platform, in this case a 1.1GHz Thunderbird, in order to gain an understanding of the uncapped performance of the solutions. However, not everyone has a 1.1GHz processor, so in order to get a better idea of what CPU speeds are necessary to get the most out of the card, we look to a CPU scaling graph to see how the card performs with various CPUs.
The Voodoo4 4500 definitely likes the faster CPUs. The performance increase you see when going from a Celeron 433 with the Voodoo4 to a Pentium III 500E is almost 20% more than the performance improvement we saw with the ATI Radeon SDR, although the Radeon SDR came out faster.
After the Pentium III 500E however the Voodoo4 4500 scales much like the Radeon SDR, with the jump to a 750MHz Duron yielding an 18% performance improvement.
Depending on the games you are looking to play, CPU may or may not be a very important factor. It's obvious from the CPU Scaling graph that the sweet spot for the Voodoo4 4500, like the Radeon SDR, is around the 600 - 750MHz range. An Athlon, Duron or Pentium III running at those speeds would be perfect for most first person shooters. The only time you're possibly going to need more CPU power is if you're playing a flight simulator or a complex 3D adventure/RPG game, in which case you might want to pay more attention to getting a faster CPU.
Windows 2000 Driver Performance
While not always as bad as ATI has been in the past, 3dfx hasn't exactly had the best driver support under Windows NT. They have recently started to get their act together and while their drivers have generally been solid under Windows 9x, we have yet to be overly impressed with them under NT. Now with Windows 2000 growing in popularity and more people using it as a home/office, professional and gaming OS, it is very important to see how a manufacturer's drivers stack up under this OS.
We used the latest available drivers from 3dfx under Windows 2000; they are dated September 29, 2000 and are version 1.03.00.
Once again we see a manufacturer that has not given the same amount of attention to their Windows 2000 drivers as their Windows 98 drivers. For 3dfx this does make some sense since the bulk of their customers are running Windows 98; however, there is still no excuse for Windows 2000 drivers being any slower at all.
At 640 x 480 x 32 the Windows 98 driver is 33% faster, but the lead decreases as the resolution increases. At 1024 x 768 x 32 the difference is "only" 20%, which is still pretty bad.
For comparison's sake, the GeForce2 MX produces the exact same benchmark numbers under Windows 2000 as it does under Windows 98 using NVIDIA's officially released Detonator3 6.31 drivers. The only exception is at 640 x 480 x 32, where the GeForce2 MX is 3% faster under Windows 2000. That just goes to show that it can be done; it's up to the manufacturers to get it right.
It does make sense that Windows 2000 performance should be equivalent to Windows 98 performance, and at lower resolutions it should actually be faster since Windows 2000 is a more robust OS.
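Since this review quotes some gaps as "X% faster" and others as "Y% slower," it helps to remember that the two are not interchangeable. The sketch below converts one into the other; the function name is ours, and we are assuming the Windows 98 leads of 33% and 20% are both expressed relative to the Windows 2000 scores:

```python
def faster_to_slower(percent_faster: float) -> float:
    """If A is `percent_faster` faster than B, return how much slower B is than A."""
    ratio = 1 + percent_faster / 100   # A's score divided by B's score
    return (1 - 1 / ratio) * 100


for lead in (33.0, 20.0, 100.0):
    print(f"{lead:.0f}% faster  ->  {faster_to_slower(lead):.1f}% slower")
```

Under that assumption, a 33% lead for Windows 98 means Windows 2000 is about 25% slower, a 20% lead works out to roughly 17% slower, and a card that is "twice as fast" leaves its competitor 50% slower.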
Video Features & Playback
The VSA-100 features no Hardware Motion Compensation or other unique video playback features, and in some of our DVD quality tests the image quality of the Voodoo4 4500 seemed a bit more washed out than the competition. For more information on the Voodoo4/5's video playback quality and performance visit our October 2000 Video Card Roundup entitled: DVD Quality, Features & Performance.
Just like the Voodoo5 5500, the Voodoo4 4500 has no video input or output features.
FSAA Performance and Image Quality
While we have come to the conclusion before that 3dfx's 2-sample FSAA was the best overall FSAA setting to choose, the Voodoo4 4500 simply doesn't have the fill rate to cope with the 50% fill rate decrease associated with enabling 2-sample FSAA. While it may be bearable for some, in most situations you will probably opt to leave it turned off; not to say that there aren't situations where it may come in handy, it's just that those that do are few and far between.
As far as performance is concerned, you can realistically expect to lose very close to half your performance by enabling FSAA on the Voodoo4 4500.
For more information on 3dfx's FSAA quality and performance, as well as how it stacks up to competing offerings from ATI and NVIDIA please read our FSAA Comparison.
3dfx's Overclocking Tool, which can be downloaded from their website, is nothing more than a manipulated registry key that enables the "tool" in the drivers. This allows you to adjust your core/memory clock speed (since they are dependent on one another you can't set them independently); however, 3dfx clearly states in the utility that they don't recommend overclocking more than 10% higher than the default clock.
Considering that the highest we could reach was 175MHz, we didn't even get a chance to push 3dfx's suggested limit of 10%. At 183MHz there was already corrupted data present on the screen, and it wasn't until we backed down to 175MHz that things became stable. For more information on how to check your video card for stability while overclocked, including what to look for, read our quick one-page guide on how to overclock your video card.
The performance improvement was pretty much what we expected. Since the Voodoo4 is fill rate limited in many cases, it desperately needs the extra clock speed. And at 1024 x 768 the 5% increase in clock speed yields a 5% increase in performance. It is unfortunate that we could not go any higher, however there is a possibility that other cards will be able to push the envelope even further.
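For reference, the overclocking percentages work out as follows. This is a quick sketch with a helper name of our own, again assuming the default 166MHz clock is really 166.67MHz (1000/6):

```python
def overclock_percent(default_mhz: float, target_mhz: float) -> float:
    """Percentage increase of target_mhz over default_mhz."""
    return (target_mhz / default_mhz - 1) * 100


DEFAULT_MHZ = 1000 / 6  # assumption: nominal 166MHz is 166.67MHz

for target in (175.0, 183.0):
    gain = overclock_percent(DEFAULT_MHZ, target)
    print(f"{target:.0f} MHz is {gain:.1f}% over the default clock")

# 3dfx's recommended ceiling of 10% over the default clock:
print(f"10% limit: {DEFAULT_MHZ * 1.10:.1f} MHz")
```

So 175MHz is a 5% overclock, while the unstable 183MHz setting sits right at 3dfx's recommended 10% ceiling.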
With a street price of around $150, the Voodoo4 4500 is going to be a very hard sell. As a $100 or sub-$100 card, the Voodoo4 4500 does have some justification behind it; unfortunately, in many cases it is simply too little too late for 3dfx.
The Voodoo4 4500 would have been perfect around the introduction of the Voodoo3 3500TV or before NVIDIA brought the GeForce2 MX to market. However, now that you can pick up a GeForce2 MX with TwinView support for around the same price, or one without the feature for a little less, the Voodoo4 4500 definitely loses its appeal.
While we criticized ATI's introduction of the Radeon SDR at the $150 price point, it definitely offers more bang for your buck than the Voodoo4 4500 does.
The problem with the Voodoo4 4500 is mainly that it's lacking in the fill rate department, and while overclocking would help solve that, we weren't able to push the card far enough, as a 5% overclock was the highest we could achieve. It would take much more to get the Voodoo4 4500 to the point where it could compete in the fill rate department.
While 3dfx could theoretically outfit the Voodoo4 with hand picked 183MHz VSA-100 chips, it doesn't make sense for them to do that now. Their concentration should most definitely be on getting the next product out while keeping their head above water for now. It's definitely been a very bumpy ride for the company that at one point was the undisputed king of the 3D accelerator world.