Original Link: http://www.anandtech.com/show/537
NVIDIA is truly a changed company. Very few remember NVIDIA in their pre-Riva 128 days, and few remember their Riva 128 and TNT days very fondly, since both products were plagued with the same problems that their competitors faced. The Riva 128 was a very fast solution that paled in image quality next to the competing products from 3dfx and Rendition, and the TNT had the potential to be an instant success had it not been for the fact that the TNT’s drivers still had quite a bit of maturing to do (which they eventually did).
Let’s fast forward a bit and take a look at the launch of the TNT2. While the market still would not let NVIDIA forget that they failed to release the TNT at 125MHz as promised, NVIDIA was given a second chance to deliver as promised. Instead of hyping up a clock speed that they could not deliver on, they instead decided to give the card manufacturers the flexibility to set their own clock frequencies at their own risk and ended up releasing the TNT2 as a higher yield TNT2 part and a lower yield TNT2 Ultra part that featured a higher clock speed.
Companies such as the now defunct Hercules took it upon themselves to take the TNT2 core to the next level, hand-picking solutions capable of running at an amazing 175MHz, a 17% increase over the TNT2 Ultra’s default 150MHz clock speed.
The TNT2 was launched with near perfect drivers, since the TNT2 core was still relatively unchanged from the TNT, whose drivers had been maturing all this time. Although the Voodoo3 versus TNT2 debate raged on in newsgroups, in the end, the TNT2’s raw performance, superior driver support and support for 32-bit color rendering managed to pull it ahead of the competition.
By this time, the market was just getting used to NVIDIA’s 6-month product cycles, which, when executed properly, would result in a new technology being launched every fall followed by a “spring refresh,” which would boast the move to higher clock speeds and an upgraded feature set to keep the current product generation alive until the fall, when NVIDIA would again introduce a new technology.
This 6-month product cycle and NVIDIA’s ability to adhere to the short release intervals is actually what led to the virtual elimination of any competition at the release of their GeForce product last fall. Unable to release their Voodoo4/5 around the time of NVIDIA’s GeForce launch, 3dfx was rendered hopeless in competing with the GeForce, since even their fastest Voodoo3 3500 didn’t have a hope of reaching the fill rates the GeForce was capable of pushing. S3 had attempted to compete with the GeForce with their Savage2000, which was launched at last year’s Fall Comdex, but unfortunately, the problems that we’ve come to expect from S3 popped up once again, leaving the GeForce alone yet again.
Let’s fast-forward one more time to the present day, approximately 6 months after the release of NVIDIA’s GeForce. It is time for their “spring refresh,” which, according to the NVIDIA product release model, should consist of a higher clock speed part as well as an expanded feature set to tide the market over until this fall.
Take the GeForce, combine it with a new 0.18-micron fabrication process, pump up the clock rate, and add a boatload of features designed to keep NVIDIA on top for another 6 months and you have the GeForce 2 GTS – what NVIDIA likes to call the “world’s first Giga Texel Shader [GTS].”
The GeForce 2 GTS is a 0.18-micron version of the original GeForce. It still features 4 rendering pipelines, a.k.a. the GeForce’s QuadPipe Rendering Engine, but unlike the original GeForce, the GeForce 2 GTS is capable of processing two textures per pipeline in a single clock versus the one texture per pipeline on the original GeForce. This translates into twice the texturing throughput of the GeForce at the same clock speed.
Since the GeForce 2 GTS received a die shrink down to a 0.18-micron process, NVIDIA was also able to increase the core clock speed of the solution from the 120MHz of the 0.22-micron GeForce up to 200MHz of the GeForce 2 GTS. Even the most insane overclockers couldn’t get their GeForces up beyond 170MHz, so hitting 200MHz on this 0.18-micron process is definitely an improvement over the original GeForce.
Combining these two bits of information, we get a pixel fill rate of 800 megapixels per second for the GeForce 2 GTS, which is up from 480 megapixels per second for the GeForce, and we get a texel fill rate of 1600 megatexels per second or 1.6 gigatexels per second, which is where part of the GTS acronym comes from – Giga Texel Shader.
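The arithmetic behind those figures can be sketched in a few lines of Python. This is only a peak-rate model built from the numbers quoted above (pipelines, core clock, textures per pipeline per clock); the function name `fill_rates` is our own, and real-world throughput will be lower once memory bandwidth comes into play.

```python
def fill_rates(pipelines, core_mhz, textures_per_pipe):
    """Return peak (megapixels/s, megatexels/s) for a simple fill-rate model."""
    # One pixel per pipeline per clock.
    mpixels = pipelines * core_mhz
    # Each pipeline can apply `textures_per_pipe` textures to that pixel per clock.
    mtexels = mpixels * textures_per_pipe
    return mpixels, mtexels


geforce = fill_rates(pipelines=4, core_mhz=120, textures_per_pipe=1)
geforce2_gts = fill_rates(pipelines=4, core_mhz=200, textures_per_pipe=2)

print(geforce)       # (480, 480)
print(geforce2_gts)  # (800, 1600)
```

The second result is the 800 megapixels and 1.6 gigatexels per second quoted for the GeForce 2 GTS.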
Until 3dfx officially releases the Voodoo5 6000 (which will be after the GeForce 2 GTS hits store shelves), NVIDIA will have the highest pixel and texel fill rates in the industry, exceeding even those of the Voodoo5 5500, which currently boasts a fill rate of 667 megapixels and 667 megatexels per second.
Even the ATI Radeon 256, which we just recently previewed, won’t be able to compete with the GeForce 2 GTS in terms of raw fill rate power, as its specs at a 200MHz core clock speed yield a fill rate of 400 megapixels per second and 1.5 gigatexels per second.
Only the Voodoo5 6000, which should debut at close to twice the cost of the GeForce 2 GTS, will be able to offer a superior fill rate of 1.33 gigapixels/s and 1.33 gigatexels/s.
If that were the only improvement NVIDIA offered over the original GeForce, they would already have a very powerful product, but fortunately for the sake of competition and furthering the industry, they have much more to offer from this chip.
Another benefit of the die shrink is that the GeForce 2 GTS consumes close to half the power of the original GeForce, putting it at between 8 – 9W versus the 16W for the GeForce. This brings the GeForce 2 GTS closer to the point where it could be used in a mobile solution, although 8 – 9W is still far from the target mark for a mobile product.
The chip supports up to 128MB of SDR/DDR SDRAM/SGRAM. However, you will see the first boards ship in 32MB configurations, though some board manufacturers may take it upon themselves to ship 64MB GeForce 2 GTS parts. As we’ve already seen from our recent investigations involving the 64MB DDR GeForce versus its 32MB counterpart, the performance advantage of having more memory is negligible when the GeForce’s support for texture compression kicks in, at least under Quake III Arena.
Although the GeForce 2 GTS does support SDR SDRAM, all of the boards available at first will be using DDR SGRAM. There is a possibility that SDR SDRAM boards will emerge but only as OEM solutions and definitely not as a solution directed at the enthusiast market since SDR SDRAM would most definitely cripple the performance of the GeForce 2 GTS. Considering NVIDIA’s past history with releasing cheaper parts outfitted with slower memory solutions, we may eventually see something like an M64 with a GeForce 2 GTS core down the line featuring a 128-bit SDR SDRAM memory bus.
The GeForce 2 GTS still features a 128-bit memory bus and will be coupled with 166MHz DDR SGRAM (effectively running at 333MHz) for approximately 5.3GB/s of available memory bandwidth, which is up from the 4.8GB/s on the DDR GeForce and 2.7GB/s on the SDR GeForce. This amount is also equal to the 5.3GB/s of available memory bandwidth on the Voodoo5 5500, which should definitely help the GeForce 2 GTS out in high resolution/color depth rendering situations as it did when the DDR GeForce made its way into the market.
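For readers who want to check the bandwidth figures, here is a rough sketch of the calculation. The 128-bit bus and the 166MHz DDR clock come from the text above; the 150MHz DDR and 166MHz SDR clocks used for the two original GeForce boards are our assumption, chosen because they reproduce the 4.8GB/s and 2.7GB/s figures. The function name is hypothetical.

```python
def memory_bandwidth_gb_s(bus_bits, clock_mhz, transfers_per_clock):
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9


# GeForce 2 GTS: 128-bit bus, 166MHz DDR (two transfers per clock)
print(round(memory_bandwidth_gb_s(128, 166, 2), 1))  # 5.3
# DDR GeForce: assumed 150MHz DDR
print(round(memory_bandwidth_gb_s(128, 150, 2), 1))  # 4.8
# SDR GeForce: assumed 166MHz SDR (one transfer per clock)
print(round(memory_bandwidth_gb_s(128, 166, 1), 1))  # 2.7
```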
And taking a page from 3dfx’s book to success, NVIDIA will be offering the GeForce 2 GTS in both AGP and PCI configurations, which is important because more and more systems are being pre-built using i810E motherboards that feature integrated AGP graphics, leaving no AGP slot for a video card upgrade.
NVIDIA Shading Rasterizer
In addition to the improvements made to the core of the chip, the GeForce 2 GTS also boasts a new engine that the original GeForce didn’t have. NVIDIA calls this their Shading Rasterizer (NSR), and its function is much like that of ATI’s Pixel Tapestry Architecture – to allow for a number of per pixel shading effects to be performed in hardware.
The NSR allows for 7 pixel operations to be performed in a single pass, in hardware. These operations include shadow maps, bump mapping (EMBM, Dot Product 3, and embossed), shadow volumes, volumetric explosion, elevation maps, vertex blending, waves, refraction and specular lighting all on a per-pixel basis.
Once again, this is much like the feature set provided by ATI’s Pixel Tapestry Architecture, which we talked about in greater depth in our Radeon 256 preview.
Improved Hardware T&L Engine
Another improvement the GeForce 2 GTS offers is a more powerful Hardware T&L engine. Like the ATI Charisma Engine, NVIDIA’s solution now supports the processing of all transformation, clipping and lighting calculations on the chip itself instead of offloading them onto the host CPU.
The improved engine is also capable of processing 25 million triangles per second at peak operation. This is definitely an improvement from the 10 – 15 million triangles per second of the original GeForce. In spite of these improvements, we have yet to see any games truly take advantage of hardware T&L, but if NVIDIA and ATI are correct, by the end of this year we will no longer be able to make that statement.
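To put the peak triangle rates in perspective, the short sketch below converts them into a per-frame polygon budget. The 60 fps target is our own assumption for illustration, and these are theoretical peaks rather than anything a game would sustain.

```python
def triangles_per_frame(triangles_per_second, target_fps):
    """Peak triangle budget per frame at a given frame rate."""
    return triangles_per_second // target_fps


TARGET_FPS = 60  # assumed target frame rate, not a figure from NVIDIA

print(triangles_per_frame(25_000_000, TARGET_FPS))  # 416666 (GeForce 2 GTS peak)
print(triangles_per_frame(15_000_000, TARGET_FPS))  # 250000 (original GeForce, upper figure)
print(triangles_per_frame(10_000_000, TARGET_FPS))  # 166666 (original GeForce, lower figure)
```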
NVIDIA supplied us with a custom made Quake III Arena level that featured an extremely high polygon count in order to test the power of NVIDIA's hardware T&L engine in an actual real world situation. Below we have a screenshot from the level:
NVIDIA's Hi-Poly Quake III Arena level didn't seem to exhibit any performance difference between the GeForce 2 and the original GeForce, although as the resolution increases the GeForce 2 separates itself from the original GeForce because of its more powerful fill rate.
The Voodoo5 5500 is easily outperformed at all resolutions below 1280 x 1024 x 32, but at higher resolutions it begins to outperform the 64MB DDR GeForce as the fill rate becomes the limiting factor instead of the T&L capabilities of the card.
Like the ATI Radeon 256, the GeForce 2 GTS features an integrated TMDS transmitter for outputting directly to DVI flat panels at resolutions up to 1280 x 1024 (ATI's solution supports up to 1600 x 1200).
The GeForce 2 GTS also features an integrated High Definition Video Processor (HDVP) which supports all ATSC formats, just like ATI's solution.
The drivers we were supplied for use with the GeForce 2 GTS were the Detonator 5.16 drivers, and they offer a couple of improvements over the other 5.xx drivers that have been leaked; however, their biggest improvements are those held over the 3.7x driver series that most GeForce users are currently running.
The 5.16 drivers enable S3TC support under OpenGL games that take advantage of it, such as Quake III Arena. With S3TC enabled, a 32MB GeForce came close to the performance of a 64MB GeForce, and since the GeForce 2 GTS benefits from the same S3TC support, there isn’t a real need for a 64MB GeForce 2 GTS board just yet.
At the same time, NVIDIA just recently released beta drivers for XFree86 4.0 with support for OpenGL acceleration. While these drivers are still in beta, it is a start for future support for the ever-growing Linux user base from NVIDIA.
Benchmarking the card
Taken from our Voodoo4 4500 & Voodoo5 5500 Preview
We split up the benchmarks into three specific sections. The first section is the performance of the GeForce 2 GTS in comparison to the rest of the cards out there, the second section is a comparison of performance including FSAA benchmarks of both the GeForce 2 GTS and the Voodoo4/5, and the final performance section is a visual performance comparison consisting of screenshots comparing the various incarnations of FSAA and their impacts on the gaming experience.
Our original intent was to show off the performance of the GeForce 2 GTS in as many games as possible, but it quickly became apparent that to do so would not be the best approach. Many readers suggested we use flight simulators in addition to our usual set of first person shooter benchmarks, but from our experiences with flight simulators, the limiting factor there is CPU power and not the fill rate of a video card.
It is for this reason that a TNT2 and a Voodoo3 would be just as desirable as a GeForce to a gamer that only plays flight simulators; you are better off getting a TNT2 or a Voodoo3 and a faster CPU than shelling out the big bucks for a GeForce.
Racing games such as Need for Speed 5: Porsche Unleashed are also not very demanding when it comes to having a video card with high fill rates. Compared to something like Quake III Arena, NFS5 is a very simple game that requires only a powerful CPU and a decent graphics card. Once again, it makes more sense to go after something that performs along the lines of a TNT2 or a Voodoo3 and get a faster CPU than to get something that performs like a GeForce. You’ll only end up buying yourself a few more fps at the cost of around $200 - $300.
Both of the aforementioned cases are areas where the Voodoo4/5 would excel. This is not because of its extremely high fill rate, since those games don’t require one. Rather, because they don’t depend on hardware with extremely high fill rates, they will perform quite well when 4-sample FSAA is turned on. Doing so would reduce the Voodoo5 5500’s performance to about that of a Voodoo3 3000 (in a 16-bit color, single-textured situation, though), which is just fine for both types of games.
Performance really becomes an issue with first person shooters such as Quake III Arena where a mid-range CPU is capable of driving the graphics card, but the performance of the setup hits the fill rate limitation of the graphics card before the CPU can really become a limiting factor.
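One way to picture the CPU-versus-fill-rate argument is a first-order bottleneck model in which the frame rate is capped by whichever limit is lower. Everything in this sketch is an assumption for illustration: the function name, the 100 fps CPU ceiling, and the overdraw factor of 3 are made up, and the model ignores memory bandwidth entirely, so its outputs are not benchmark predictions.

```python
def estimated_fps(cpu_fps, fill_mpixels, width, height, overdraw):
    """Frame rate capped by the lower of the CPU limit and the fill-rate limit."""
    pixels_drawn_per_frame = width * height * overdraw
    fill_fps = fill_mpixels * 1e6 / pixels_drawn_per_frame
    return min(cpu_fps, fill_fps)


CPU_FPS = 100   # hypothetical CPU-limited frame rate
OVERDRAW = 3    # hypothetical average number of times each pixel is drawn

for width, height in [(640, 480), (1024, 768), (1600, 1200)]:
    fps = estimated_fps(CPU_FPS, fill_mpixels=480, width=width,
                        height=height, overdraw=OVERDRAW)
    limit = "CPU" if fps == CPU_FPS else "fill rate"
    print(f"{width} x {height}: limited by {limit}")
```

At low resolutions the CPU ceiling is the cap, and as the resolution climbs the fill-rate term takes over, which is the behavior described above for first person shooters.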
Quake III Arena is still the best gaming benchmark because it scales properly with CPU speed as well as the resolution it is run at. It also implements most of the features that upcoming games (first person shooters) will be using and thus provides an excellent metric not only for card performance under Quake III Arena, but for the performance of the card in general.
Unfortunately, there is no Direct3D equivalent of Quake III Arena in terms of a good benchmark, as UnrealTournament, while it is a great game, is a horrible benchmark. Results in UnrealTournament vary greatly and the game does not scale very well with CPU speed or with resolution. We included benchmarks using our own UnrealTournament benchmark, but the results aren’t nearly as reliable as those from Quake III Arena.
In general, the performance of UnrealTournament on a system is just fine with a TNT2/Voodoo3 at resolutions of 1024 x 768 x 16 and below; once you get above that mark, you begin to hit the fill rate limitations of the TNT2/Voodoo3.
In the end, the benchmarks you should pay the most attention to are the Quake III Arena benchmarks, because those say the most about the performance of the card. If you’re a big UT fan, you should be fine with something that’s around TNT2 speed as long as you’re going to keep the resolution below 1024 x 768. If you go above that, you’ll need something that has a higher fill rate than a TNT2 (i.e. GeForce or Voodoo4/5). If you’re going to draw any conclusions from the UnrealTournament benchmarks, be sure to pay the most attention to the scores above 800 x 600 because the game is limited by more than one factor at lower resolutions.
We chose three systems to measure the performance of these video cards. Remember that this is a comparison of the performance of video cards, not of CPUs or motherboard platforms.
For our High End testing platform, we picked an Athlon 750 running on a KX133 motherboard. The Athlon 750 is fast enough that it won’t be a limiting factor in the benchmarks and should also provide a good estimate of how all of the cards compared would perform on a 600 – 800MHz Athlon or Pentium III system (it will at least tell you which card would be faster).
For our Low End testing platform we picked a Pentium III 550E running on a BX motherboard. Although this isn’t a very “low-end” processor, it is fast enough to see a performance difference between video cards without the processor stepping in as a huge limitation. If we used something like a Celeron 466, the performance of virtually all the cards would be identical at the lower resolutions because the CPU and FSB/memory buses are limiting factors. Once again, this is a test of graphics cards, not of CPU/platform performance.
For our FSAA testing, we picked a 1GHz Pentium III running on an i820 motherboard with RDRAM. We picked this platform (we are aware that it isn’t widely available) because it eliminates virtually all bottlenecks that would be present and allows us to illustrate the biggest performance hit enabling FSAA would result in. Slower setups would have lesser performance hits because they have more bottlenecks.
Windows 98 SE Test System
|CPU(s)||Intel Pentium III 550E||Intel Pentium III 1.0EB||AMD Athlon 750|
|Motherboard(s)||ABIT BE6||AOpen AX6C||ASUS K7V-RM|
|Memory||128MB PC133 Corsair SDRAM||128MB PC800 Samsung RDRAM||128MB PC133 Corsair SDRAM|
|Hard Drive||IBM Deskstar DPTA-372050 20.5GB 7200 RPM Ultra ATA 66|
|Video Card(s)||Voodoo5 5500 AGP 64MB; Rage 128 Pro 32MB; ATI Rage Fury MAXX 64MB; Matrox Millennium G400MAX 32MB (would not run on Athlon platform); GeForce 2 GTS 32MB DDR (default clock - 200/166 DDR); S3 Diamond Viper II 32MB|
|Ethernet||Linksys LNE100TX 100Mbit PCI Ethernet Adapter|
|Operating System||Windows 98 SE|
|Benchmarking Application||Interactive Unreal Tournament 4.04 AnandTech.dem|
As we should all know by now, 640 x 480 is the perfect resolution for picking out any driver performance issues as well as noticing the effects of any hardware T&L engines at play.
Since the GeForce 2 GTS is essentially using the GeForce drivers, its performance is as expected, although the fact that the GeForce 2 GTS is still a tad slower than the DDR GeForce in this test shows that some driver optimizations are still necessary.
At this point, the GeForce 2 GTS is in a much better situation than the Voodoo4/5 are in terms of drivers.
It doesn't take long for the GeForce 2 GTS to jump up to the first place position which is where it will remain for the rest of our tests.
The added memory bandwidth present on the GeForce 2 GTS gives it a slight advantage in 32-bit color over the DDR GeForce, and the S3TC support of the 5.16 drivers makes the extra memory on the 64MB DDR GeForce virtually useless.
At 1024 x 768 the GeForce 2 GTS is almost as fast in 32-bit color as the DDR GeForce was in 16-bit color which is pretty impressive.
In 16-bit color mode, there is virtually no drop in performance from 800 x 600; what we're seeing here is the incredible fill rate power of the GeForce 2 GTS starting to kick in.
The Voodoo5 5500 was at the top of our charts just two days ago; now it's time for NVIDIA to take over. Then again, this isn't a very fair comparison because the GeForce 2 GTS will be priced around $100 or more above the Voodoo5 5500. At the same time, you can't argue with ~88 fps at 1280 x 1024 x 16, although the 44.9 fps in 32-bit color mode indicates that even the 5.3GB/s of memory bandwidth on the GeForce 2 GTS isn't enough.
We've always talked about 60 fps at 1600 x 1200. Well, is 57.6 fps close enough? While the sweet spot for the GeForce 2 GTS is still at 1024 x 768 x 32 or 1280 x 1024 x 16, the 1600 x 1200 scores once again illustrate exactly how powerful the GeForce 2 GTS is.
The only thing that could beat it here is the Voodoo5 6000.
Starting all over again at 640 x 480, we see the benefits of NVIDIA's unified driver support since the GeForce 2 GTS drivers were ready long before the actual release of the card.
Until the Voodoo5 6000 comes along, you can keep on expecting the GeForce 2 GTS to come up at the top of these performance charts.
Once again, at 1024 x 768 you can start to see a real difference in fill rate between the GeForce 2 GTS and its older brother, the GeForce. At 1024 x 768 x 32 the GeForce 2 is almost as fast as the GeForce is at the same resolution running in 16-bit color mode.
In 16-bit color mode, the GeForce 2 GTS sees no match but when switching to 32-bit color there is a very noticeable drop in performance because of a lack of memory bandwidth. At this point the powerful GeForce 2 GTS is reduced to being only a few fps faster than a Voodoo5 5500.
At 1600 x 1200 we get very playable performance in 16-bit color from the GeForce 2 GTS, but switching to 32-bit color results in the performance dropping to slightly above that of a Voodoo5 5500.
Quake III Arena Quaver
Substantially pushing a video card to its limits using the built-in Quake III Arena demos is almost impossible. The level is not only low on the texture front, it also does not contain as many rooms as other levels. The resolutions and color depths at which the demo plays flawlessly are often too high for playing through the full game.
In order to fully test how memory bandwidth and RAM size affect video card performance, a separate and more stressful benchmark is needed. For the purposes of this test, we turned to Anthony "Reverend" Tan's Quaver benchmarking demo. It is a well-known fact that some levels of Quake III Arena push video cards harder than others. The most notorious of these levels is the Q3DM9 map, with a reported 30 plus megabytes of textures.
The GeForce 2 starts off in the familiar position at 640 x 480; its drivers could use some tweaking, but nothing major.
At 800 x 600 the performance of the GeForce 2 begins to separate the new solution from the pack, especially in 32-bit color mode.
Courtesy of the increased memory bandwidth combined with the increased fill rate, the GeForce 2 in 32-bit color outperforms the DDR GeForce in 16-bit color.
The GeForce 2 does distance itself from the Voodoo5 5500 in the 32-bit color tests, but it is still clearly limited by its memory bandwidth here. Unfortunately, 200MHz DDR SGRAM wasn't an option for use with the GeForce 2 at the time of launch.
It's worth pointing out yet again that the Rage Fury MAXX is still suffering from whatever issues plagued the card when we first reviewed it in that its 32-bit color performance at high resolutions is horrible.
While UnrealTournament does offer native support for Glide, we refrained from testing the Voodoo5 in Glide. Why? Have a look at the performance numbers taken from a Voodoo5 running UnrealTournament in Glide vs Direct3D:
The first thing to notice is that there is no performance difference between 16 and 32-bit color when running in Glide. This is most likely due to UnrealTournament not allowing 32-bit color/textures when running in Glide, especially since the scores were perfectly identical between 16 and 32-bit color modes.
Secondly, as the resolution increases, the performance of the Voodoo5 running in Glide mode drops below that of Direct3D. One possible explanation is that Glide mode wasn't meant to be run at such high resolutions.
Regardless, the UnrealTournament scores were already difficult to explain because of the number of limitations acting on the UnrealTournament engine (look back at our reasons that UnrealTournament isn't a good benchmark) and adding Glide scores wouldn't do much good other than adding two more lines to the graphs.
Unreal Tournament is a very texture intensive game, causing the 64MB DDR GeForce (with more memory to store textures in) to float to the top of the performance chart while the GeForce 2 GTS is left below the Voodoo4 4500. But if you actually look at the performance numbers, the difference between the fourth place GeForce 2 GTS and the first place 64MB DDR GeForce is 1.5 fps in 16-bit color and 4.3 fps in 32-bit color; we're not talking big numbers here at all.
If you're set on running UnrealTournament as fast as possible, you'd be fine with just about any of these cards with the exception of the Viper II whose performance was noticeably lower than the competition.
Because of its larger memory, the 64MB DDR GeForce can store more textures and is thus less susceptible to texture thrashing than even the new GeForce 2 GTS. UnrealTournament is not very fill rate dependent, rather it is more dependent on a fast memory bus on your graphics card, a fast CPU, and a fast system memory bus.
Once again, the performance range between the cards isn't significant enough to draw any major conclusions from the numbers.
At 1280 x 1024 the fill rate of these cards becomes more of an issue, yet it's still not great enough to push the GeForce 2 up any further in the standings. Any of these cards would be great UT performers, but the fastest seems to be the 64MB DDR GeForce by a hair and the V5 5500 in 32-bit color performance.
Even at 1600 x 1200 the fill rate of the GeForce 2 isn't stressed enough to bring it to the top of the charts, rather the 64MB DDR GeForce benefits from its larger texture memory as does the Voodoo5 5500.
We get a similar situation with the Pentium III 550E at the higher resolutions as we did with the Athlon 750; the scores are simply lower.
Full Scene Anti-Aliasing
NVIDIA’s GeForce 2 GTS does support Full Scene Anti-Aliasing (for more information on FSAA, read our explanation of the technology) in hardware, but it approaches FSAA in a slightly different way than 3dfx’s T-buffer solution, which is essentially like an Accumulation buffer that any company is capable of implementing.
NVIDIA’s FSAA works by using a method called Supersampling. The way supersampling works is that the scene is actually rendered at a higher resolution and then scaled down to the desired resolution before being displayed. As you already know, the higher the resolution you run at, the fewer aliasing effects are present since you have more pixels on the screen to naturally remove those effects.
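To make the idea concrete, here is a minimal sketch of the scale-down step. It assumes a simple box filter (each output pixel is the average of a block of rendered pixels); NVIDIA has not told us which filter the hardware actually uses, so treat the filter choice and the `downsample` helper as illustrative only.

```python
def downsample(image, factor):
    """Average each factor x factor block of a 2D grayscale image (list of rows)."""
    out_height = len(image) // factor
    out_width = len(image[0]) // factor
    result = []
    for y in range(out_height):
        row = []
        for x in range(out_width):
            # Gather the block of high-resolution pixels behind this output pixel.
            block = [image[y * factor + dy][x * factor + dx]
                     for dy in range(factor)
                     for dx in range(factor)]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


# A hard diagonal edge rendered at 4 x 4 (0 = black, 255 = white)
high_res = [
    [255, 255, 255, 0],
    [255, 255, 0, 0],
    [255, 0, 0, 0],
    [0, 0, 0, 0],
]

print(downsample(high_res, 2))  # [[255.0, 63.75], [63.75, 0.0]]
```

The pixels along the edge come out as intermediate grays instead of pure black or white, which is the smoothing effect supersampling is after.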
Under OpenGL applications, NVIDIA’s FSAA can work in one of three modes (much like how 3dfx’s FSAA can be run in either 2-sample or 4-sample mode). The three FSAA settings are as follows:
· 1.5x screen resolution (lowest quality)
· 2x screen resolution, with LOD’s (MIPMaps) at the native game resolution
· 2x screen resolution with MIPMaps at the 2x resolution. (highest quality)
So if you’re running a game at 640 x 480, the first FSAA option will render the scene at 960 x 720 (640 * 1.5 x 480 * 1.5) and then scale it back down to 640 x 480 for displaying.
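The same calculation for any resolution and scale factor can be sketched as follows. The helper names are ours, and the pixel-cost figure is just the square of the scale factor, since both axes are scaled.

```python
def supersampled_resolution(width, height, scale):
    """Resolution the scene is rendered at before being scaled back down."""
    return int(width * scale), int(height * scale)


def pixel_cost(scale):
    """How many times more pixels are rendered compared to the native resolution."""
    return scale * scale


print(supersampled_resolution(640, 480, 1.5))  # (960, 720)
print(supersampled_resolution(640, 480, 2))    # (1280, 960)
print(pixel_cost(1.5))                         # 2.25
print(pixel_cost(2))                           # 4
```

So even the lowest quality setting more than doubles the number of pixels the card has to fill, and the 2x settings quadruple it.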
Depending on who you ask (3dfx or NVIDIA) you will get varying responses as to exactly what each of these modes looks like. 3dfx will have you believe that their 4 sample FSAA looks better than all of these modes while NVIDIA will have you believe that their third FSAA setting looks the best.
The final decision is up to you, but unfortunately, the 5.16 drivers we used for our GeForce 2 GTS tests would only allow us to enable the first setting, which renders each scene at 1.5x the screen resolution so we could only provide you with screen shots at this setting and not the other two settings.
The FSAA situation in Direct3D is a bit more flexible. Here are the options NVIDIA’s FSAA offers in D3D:
· 2x supersampling with MIPMaps at the native game resolution
· 2x supersampling with MIPMaps at the backbuffer (higher) resolution.
· 3x supersampling with MIPMaps at the native game resolution
· 3x supersampling with MIPMaps at the backbuffer (higher) resolution
· 4x supersampling with MIPMaps at the native game resolution
· 4x supersampling with MIPMaps at the backbuffer (higher) resolution
What’s the catch? Unfortunately, FSAA in Direct3D will not work in all Direct3D games. Instead, because of the way NVIDIA implemented the FSAA, most of today’s games can’t take advantage of the GeForce 2 GTS’ FSAA.
Future DirectX 8 games should be able to do just fine, as NVIDIA’s solution will be fully implemented in DX8, but until then, you’ll have to live without FSAA in Direct3D on the GeForce 2 GTS or go after the Voodoo5, which does support it now.
This is a bit of a disappointment since most racing games, flight simulators, and RPGs are in fact Direct3D games not designed with NVIDIA’s FSAA implementation in mind. While their FSAA works just fine under OpenGL, the majority of games that are OpenGL-only are first person shooters such as Quake III Arena, which, although they benefit just fine from FSAA, aren’t really worth the performance hit associated with enabling the mode.
Update (4/26/00): We just received some new information in from NVIDIA regarding their FSAA implementation. The above OpenGL FSAA options are, in fact, the options that will be available on the GeForce 2 GTS, but are not all available using the 5.16 drivers that came with our reference card. Currently, turning on FSAA under OpenGL enables the highest quality 2x screen resolution FSAA, which is what is used in the screenshots on the next page. Driver versions 5.17 and above will offer the other OpenGL FSAA settings mentioned above.
The situation we originally posed for Direct3D is also not 100% correct. While it is true that the current 5.16 drivers do not enable FSAA in every Direct3D game, it is NVIDIA's intent to have future driver revisions that do enable FSAA for any D3D game. Further, NVIDIA actually has two additional modes for FSAA under D3D, which include:
· 1 x 2 (sampled at 1x in the horizontal direction and 2x in the vertical direction).
· 2x supersampling with a special, more complex filtering algorithm that should produce higher quality images than the other 2x modes mentioned above.
FSAA Image Quality - Quake III Arena
All screenshots were taken using Hypersnap-DX available at http://www.hyperionics.com
We've highlighted the key points to look for, but in order for you to see the true difference in the images you'll want to download the uncompressed Targa files which vary in size from 900KB up to 5.5MB for the 1600 x 1200 shots
Software FSAA - GeForce 256
Hardware FSAA - GeForce 2 GTS
2 Sample FSAA - Voodoo5 5500
4 Sample FSAA - Voodoo5 5500
FSAA Performance - Quake III Arena
First let's look at the performance penalty NVIDIA's GeForce 2 receives when its supersampling FSAA is enabled. The performance hit varies greatly depending on the resolution because of the method in which NVIDIA's FSAA works (it samples a multiple of the screen resolution then scales it down).
The performance hit is pretty significant, over 50% in some cases. One thing that must be mentioned is that at resolutions above 1280 x 1024 x 16, the GeForce 2 does not have enough memory to continue rendering in FSAA mode, so the drivers shut off FSAA and performance returns to normal, which is why the last three sets of bars jump up suddenly in performance.
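A back-of-the-envelope estimate suggests why the cutoff lands where it does on a 32MB board. The sketch below is built entirely on our own assumptions, not on anything NVIDIA has confirmed: it supposes 2x scaling on each axis, a supersampled back buffer plus a depth buffer of the same bit depth, a native-resolution front buffer, and it ignores texture memory altogether.

```python
CARD_MEMORY_BYTES = 32 * 1024 * 1024  # 32MB GeForce 2 GTS board


def fsaa_buffer_bytes(width, height, bits_per_pixel, scale=2):
    """Estimated frame buffer memory with supersampling (assumed buffer layout)."""
    bytes_per_pixel = bits_per_pixel // 8
    supersampled_pixels = (width * scale) * (height * scale)
    back_buffer = supersampled_pixels * bytes_per_pixel
    # Assumption: the depth buffer uses the same bit depth as the color buffer.
    depth_buffer = supersampled_pixels * bytes_per_pixel
    # Assumption: the displayed front buffer stays at the native resolution.
    front_buffer = width * height * bytes_per_pixel
    return back_buffer + depth_buffer + front_buffer


def fsaa_fits(width, height, bits_per_pixel):
    return fsaa_buffer_bytes(width, height, bits_per_pixel) <= CARD_MEMORY_BYTES


for mode in [(1024, 768, 32), (1280, 1024, 16), (1280, 1024, 32),
             (1600, 1200, 16), (1600, 1200, 32)]:
    print(mode, "fits" if fsaa_fits(*mode) else "does not fit")
```

Under these assumptions 1280 x 1024 x 16 is the last mode that fits, and the three modes above it do not, which lines up with the three sets of bars where FSAA is shut off. The real driver's buffer layout may well differ.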
In comparison to the Voodoo5 5500, we find that the GeForce 2 with its 2x FSAA enabled features performance that lies between the 5500 using 2 sample FSAA and 4 sample FSAA. You can be the judge as to the quality of NVIDIA's FSAA, but unfortunately, as we mentioned in the section on FSAA, the majority of the games that would benefit directly from FSAA now are Direct3D games that won't work with NVIDIA's AA method.
The GeForce 2 GTS actually fits in the perfect spot between the Voodoo5 5500 and what we can expect from the Voodoo5 6000. With a price tag of around $349 at its launch, the GeForce 2 will obviously be more expensive than the sub $300 Voodoo5 5500 but it also outperforms the 5500. At the same time, the V5 6000, judging entirely by its clearly superior fill rate, will most likely offer a huge performance increase over the GeForce 2 but at the same time will carry a price tag close to $600. This leaves the GeForce 2 in the middle of the two as an expensive, yet powerful solution.
If you were counting on NVIDIA supporting FSAA with the GeForce 2 then you're in luck...sort of. Under OpenGL, the GeForce 2's FSAA looks perfectly fine. Unfortunately, the games that you'll want to use FSAA in are primarily Direct3D games, which is where the GeForce 2's FSAA doesn't exactly work, for now at least. Future NVIDIA driver releases should enable FSAA for all Direct3D games, but this is definitely a work in progress.
As far as availability goes, you can expect to see GeForce 2 cards real soon. Depending on how companies play their cards (no pun intended), the GeForce 2 may end up hitting store shelves before the Voodoo5.
At close to $350 it will be a tough call. If you currently have a DDR GeForce, you'll probably want to hang on to your investment. If you have an SDR GeForce, TNT2, or Voodoo3, you may want to go for the cheaper Voodoo5 5500 solution, which does offer FSAA in Direct3D in today's games. Then again, if FSAA was never really your game, or you're mainly into first person shooters, the GeForce 2 is probably the card you'll want to get.
For now, NVIDIA has the fastest solution available. That will change upon the release of the Voodoo5 6000, but there are only so many people that are willing to spend $600 on a video card, so the GeForce 2 may end up being a more practical solution.
There is always the option of waiting. In 6 months we'll see NV20 from NVIDIA and, with any luck, ATI's Radeon 256, and then we'll be able to run through this whole video card competition all over again.
For another view on the GeForce 2 GTS be sure to check out SharkyExtreme's GeForce 2 GTS Review