Original Link: http://www.anandtech.com/show/601

NVIDIA GeForce2 Ultra

by Anand Lal Shimpi on August 14, 2000 9:01 AM EST


NVIDIA has been executing perfectly ever since the release of their TNT back in 1998.  It was in October 1998 that the first TNT based cards began shipping from Diamond. Just about one year later, NVIDIA successfully executed the launch of their TNT2 and TNT2 Ultra based products, which eventually overshadowed the Voodoo3 that preceded them. 

The TNT2 Ultra release was the last product NVIDIA brought to market before they switched to a 6-month product cycle.  This new cycle truly strained their competitors since 3dfx was unable to produce the Voodoo4/5 in time to compete with NVIDIA’s next product, the GeForce, which was released 6 months after the TNT2 Ultra. 

The GeForce was an instant hit; there was nothing available that could possibly compete with it, and as more mature drivers were released for the card, its performance did nothing but improve.  3dfx, ATI, Matrox and the now defunct graphics division of S3 had no way of competing with the GeForce and the later versions of the card that featured Double Data Rate SDRAM (DDR).  If they couldn’t compete with the GeForce, there was no way they would be able to catch up in time for the launch of the GeForce2 GTS just 6 months later. 

However, the GeForce2 GTS was met with some competition as 3dfx’s Voodoo5 5500 was launched at around the same time and speculation began to form about ATI’s Radeon chip, but even then, we all knew that 6 months after the GTS’ release, NVIDIA would have yet another product that would help them to distance themselves from the competition yet again.

This brings us up to the present day.  While we were patiently waiting for the elusive ‘NV20’ from NVIDIA, NVIDIA has been shipping 32MB and 64MB GeForce2 GTS cards to make their current customer base happy.  With the release of the GeForce2 MX as well as the new Quadro2, it is clear that NVIDIA has really got their act together, and that it has also built up the expectations we had for NV20, the code name of their next product.

We expected NV20 to literally blow everything away; it would mark a departure from the standard GeForce2 GTS core and present us with NVIDIA’s equivalent of ATI’s HyperZ technology that allows for very efficient memory bandwidth usage.  Rumors began hitting the message boards and newsgroups, which speculated on the NV20’s incredible specifications.  Everyone expected the NV20 to have a 300MHz+ core clock, incredibly fast DDR memory, and an insane amount of memory bandwidth which would be courtesy of its ‘borrowing’ some techniques from tile-based rendering architectures. 

Using our trusty calendar skills, and NVIDIA’s promise to stick to a 6-month product cycle, this put the release of the NV20 in September 2000, under one month away.  With ATI’s Radeon only able to beat a GeForce2 GTS by 10 – 20%, the NV20 would only have to be that much faster in order to beat ATI’s latest creation, and NVIDIA’s closest competitor.  We already assumed the NV20 would be much faster than the Radeon right off the bat.

The specifications we were all expecting were amazing, but guess what guys ‘n gals?  The wonderful NV20 won’t be here until next year.  That’s right, Spring 2001 is when you can expect to see the NV20, but NVIDIA won’t be departing from their 6-month product cycle schedule; they are simply changing what they define as a “product.” 

Originally, NVIDIA’s plans were to release a new chip every Fall and another version of the product every Spring, a sort of “Spring Refresh,” as they liked to call it.  Now, the GeForce has already gotten its “Spring Refresh,” the GeForce2 GTS, but apparently the GeForce2 GTS isn’t feeling very “fresh,” and NVIDIA has decided to give it another refresh, this time in the Fall.

Update 8/17/2000: There have been reports that, contrary to what we've published here, the NV20 won't be delayed and will be released on time. We met with NVIDIA in person and asked them their stance on the issue; according to NVIDIA, the NV20 will be out 4 to 8 months after the release of the GeForce2 Ultra (September). This places the release of the NV20 about 6 months from when the Ultra hits the streets, which can be as early as January or as late as May. If you take the midpoint of that range, you get a March release, which does fall in line with our statement of a Spring 2001 launch.
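The month arithmetic in the update above is easy to check with a minimal sketch. The helper name `add_months` and the choice of the first of the month as a placeholder day are our own assumptions for illustration; NVIDIA only gave us a range in months.

```python
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the first day of the month that falls `months` after `start`."""
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


# GeForce2 Ultra availability: September 2000 (day of month is a placeholder).
ultra_launch = date(2000, 9, 1)

earliest = add_months(ultra_launch, 4)             # 4 months after the Ultra
latest = add_months(ultra_launch, 8)               # 8 months after the Ultra
midpoint = add_months(ultra_launch, (4 + 8) // 2)  # middle of the range

for label, when in (("earliest", earliest), ("latest", latest), ("midpoint", midpoint)):
    print(label, when.year, when.month)
```

The midpoint lands in March 2001, with January and May 2001 as the two ends of the window.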

So what is this ultra-fresh GeForce2 going to be called?  None other than the GeForce2 Ultra of course.

We’ll let the shock set in before moving on to the specs of this chip…



The Chip

The architecture of the GeForce2 Ultra is identical to that of the GeForce2 GTS.  It features the same four rendering pipelines and the ability to process two textures per pipeline per clock. 

The only difference that exists is that the GeForce2 Ultra uses an “advanced” 0.18-micron process allowing it to reach a higher operating clock speed.  The GeForce2 GTS runs at 200MHz by default, and the average GeForce2 GTS can overclock to approximately 238MHz with the best of the best able to hit 250MHz.

The GeForce2 Ultra ships, by default, at 250MHz, which proves one of two things: NVIDIA either 1) did improve the 0.18-micron process thus allowing for higher clock speeds to be obtained (since we could overclock far beyond 250MHz as well) or 2) they are simply hand-picking the chips that can hit the higher core clocks. 

The latter seems to be true, as it would make the most sense for them; it is also what they did with the original TNT2 Ultra in comparison to the TNT2.  The parts that made it to the higher core clock were marked as Ultra parts and those that didn’t were simply regular TNT2s.  This is most likely the case with the GeForce2 GTS and the GeForce2 Ultra: the higher clocked Ultras are probably the higher yielding GeForce2 chips. 

At 250MHz, the GeForce2 Ultra has a pixel fill rate of 1000 Megapixels per second, or 1 Gigapixel/s; this is a 25% increase in pixel fill rate over the GeForce2 GTS.  And since the GeForce2 Ultra, like its predecessor, can process two textures per clock, the Ultra has a texel fill rate of 2000 Megatexels/s or 2 Gigatexels/s.  This is a very impressive amount of fill rate, which is even greater than that of the unreleased Voodoo5 6000 (1.33 Gigapixels/Gigatexels per second), but as you’ll remember from our GeForce2 GTS review, the biggest problem for the chip isn’t fill rate, but memory bandwidth.
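The fill rate figures above follow directly from the clock speed and the pipeline layout described earlier (four pipelines, two textures per pipeline per clock). The sketch below simply reproduces that arithmetic; the function and constant names are ours, not anything from NVIDIA.

```python
PIPELINES = 4          # rendering pipelines on both the GTS and the Ultra
TEXTURES_PER_PIPE = 2  # textures each pipeline can process per clock


def pixel_fill_mpixels(core_mhz: float) -> float:
    """Peak pixel fill rate in Megapixels/s: one pixel per pipeline per clock."""
    return PIPELINES * core_mhz


def texel_fill_mtexels(core_mhz: float) -> float:
    """Peak texel fill rate in Megatexels/s."""
    return pixel_fill_mpixels(core_mhz) * TEXTURES_PER_PIPE


gts_pixels = pixel_fill_mpixels(200)    # GeForce2 GTS at its default 200MHz
ultra_pixels = pixel_fill_mpixels(250)  # GeForce2 Ultra at its default 250MHz
increase = (ultra_pixels - gts_pixels) / gts_pixels

print(f"GTS:   {gts_pixels:.0f} Mpixels/s, {texel_fill_mtexels(200):.0f} Mtexels/s")
print(f"Ultra: {ultra_pixels:.0f} Mpixels/s, {texel_fill_mtexels(250):.0f} Mtexels/s")
print(f"Pixel fill rate increase: {increase:.0%}")
```

These are theoretical peaks only; as noted, the real limit on this architecture is memory bandwidth rather than fill rate.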

Unlike the GeForce2 GTS, which could be outfitted with either 32MB or 64MB of DDR SGRAM/SDRAM, the GeForce2 Ultra will only be available as a 64MB card.  While the chip supports up to 128MB of memory, it is unlikely that we will see a 128MB configuration this year; if need be, there is always the possibility that we will see boards with more memory as time goes on and as memory prices (hopefully) drop. 



NVIDIA attempted to lessen the chip’s memory bandwidth bottleneck by using faster memory and pumping up the memory clock from 166MHz DDR (effectively 333MHz) to 230MHz DDR, yielding an effective 460MHz memory clock.  This is a hefty increase over the original GeForce2 GTS, and the 39% increase in actual memory clock speed results in an incredible 7.36GB/s of peak available memory bandwidth.  The only card that could possibly offer more memory bandwidth will be the Voodoo5 6000, which should have twice the available memory bandwidth of the 5500 (5.3GB/s).  However, since the 6000 will be a quad chip solution, that massive amount of memory bandwidth will be cut into by the texture duplication that is necessary for all four chips.
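For those who want to check the numbers, here is a rough sketch of the peak bandwidth calculation. It assumes a 128-bit memory bus, which is not stated in this section, and it uses the rounded 166MHz and 230MHz clocks quoted above; the function name is ours.

```python
def peak_bandwidth_gbs(mem_clock_mhz: float, bus_bits: int = 128, ddr: bool = True) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 10^9 bytes).

    bus_bits=128 is an assumption about the memory interface width.
    """
    transfers_per_clock = 2 if ddr else 1
    bytes_per_transfer = bus_bits / 8
    return mem_clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


gts_bw = peak_bandwidth_gbs(166)    # GeForce2 GTS: 166MHz DDR
ultra_bw = peak_bandwidth_gbs(230)  # GeForce2 Ultra: 230MHz DDR
clock_increase = (230 - 166) / 166

print(f"GTS:   {gts_bw:.2f} GB/s")
print(f"Ultra: {ultra_bw:.2f} GB/s")
print(f"Memory clock increase: {clock_increase:.0%}")
```

With those assumptions the Ultra works out to 7.36GB/s against roughly 5.3GB/s for the GTS, and the clock increase rounds to 39%.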

Because of the increase in memory bandwidth, the biggest advantage the GeForce2 Ultra will have over the GeForce2 GTS is at higher resolutions, in 32-bit color modes especially.  This is a very brute force method of achieving higher frame rates (unlike ATI’s HyperZ technology, which has more finesse), but it does get the job done.  The big question is, at what cost?



The Cost

With very fast memory, high yield chips, and a minimum of 64MB of SDRAM on the cards, how expensive do you think the new GeForce2 Ultra will be?  It’s definitely not going to be at the $400 mark the 64MB GeForce2 GTS cards began selling at; nope, you’re looking at around $500 for a GeForce2 Ultra.

We whined and moaned when we realized how expensive 3dfx’s forthcoming Voodoo5 6000 would be (approximately $600), and we’re going to do the exact same for NVIDIA’s GeForce2 Ultra. 

At $500, it’s going to be very difficult to justify purchasing the GeForce2 Ultra; while it will obviously be the fastest thing available (as you will soon see by the benchmarks), the cost of that performance is, for most users, entirely too much.  With a fairly well equipped PC falling in the $1500 - $2000 range, spending 1/3 or 1/4 of your total computer cost on a video card will be a stretch for most wallets. 
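The 1/3 and 1/4 figures come straight from dividing the card's price by the system prices quoted above. A tiny sketch, with names of our own choosing, makes that explicit:

```python
from fractions import Fraction

CARD_PRICE = 500  # approximate GeForce2 Ultra price in dollars


def card_share(system_price: int) -> Fraction:
    """Fraction of the total system price taken up by the video card."""
    return Fraction(CARD_PRICE, system_price)


for system_price in (1500, 2000):
    print(f"${system_price} system: video card is {card_share(system_price)} of the total")
```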

Especially with new products from 3dfx, ATI and Matrox on the way, at least one of which will be released in the very near future, $500 spent now (or when the GeForce2 Ultra is actually available in 30 – 45 days) may earn you a self-inflicted kick in the pants later on if any of the aforementioned companies can execute properly and deliver a superior product for much less. 

The Memory

The memory is obviously what makes the GeForce2 Ultra what it is.  The reference board we received made use of 4ns ESMT SDRAM (M13L641664 4T), which is rated at 250MHz DDR (500MHz).  This should raise a red flag, since the Ultra is rated at only a 460MHz memory clock, which should be attainable using 4.5ns memory. 

According to NVIDIA, the memory clock on the card is lower than the memory's rating because, by decreasing the memory clock to 460MHz, they could get the best overall balance of performance and yield on their boards.  This also means that the memory overclocking potential for these cards should be quite good, with a 500MHz clock not too far-fetched an idea. 

NVIDIA quite possibly refrained from positioning the Ultra as having a 500MHz memory clock because of yields on the 4ns DDR SDRAM, since in our tests it would not go above 505MHz even though it is rated for 500MHz operation.  Since 4ns DDR SDRAM hasn’t been in production for that long, it can be expected that the yields on the chips are not as great as those of some of the “slower” chips.  In the future this may change, but for now it makes sense for NVIDIA to sacrifice some memory performance for higher overall yields on cards. 
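To put the memory rating in perspective, the sketch below converts a cycle time in nanoseconds into a rated clock and then works out the headroom between the 4ns rating and the 460MHz the Ultra ships at. The function names are ours, and the result is a theoretical margin rather than a promise of what any given board will reach.

```python
def rated_clock_mhz(cycle_time_ns: float) -> float:
    """Rated clock in MHz for a given cycle time: f = 1 / t."""
    return 1000.0 / cycle_time_ns


def effective_ddr_mhz(cycle_time_ns: float) -> float:
    """Effective DDR rate in MHz: two transfers per clock."""
    return 2 * rated_clock_mhz(cycle_time_ns)


SHIPPING_EFFECTIVE_MHZ = 460  # GeForce2 Ultra default memory clock (effective)

rated = effective_ddr_mhz(4.0)  # 4ns parts on the reference board
headroom = (rated - SHIPPING_EFFECTIVE_MHZ) / SHIPPING_EFFECTIVE_MHZ

print(f"4ns memory: {rated_clock_mhz(4.0):.0f}MHz DDR ({rated:.0f}MHz effective)")
print(f"Headroom over the shipping clock: {headroom:.1%}")
```

That is a little under 9% of rated headroom before the memory is pushed past its specification.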



Hot Memory

The reference board we received looked very similar to the reference GeForce2 GTS design with a few very notable exceptions. 

The most noticeable difference is that the GeForce2 Ultra reference design makes use of heatsinks on its memory chips.  While we have seen this adopted by a handful of manufacturers in the past on their GeForce2 GTS designs, such as Absolute Multimedia and Hercules (Guillemot), we discovered that the heatsinks used on the Ultra’s memory actually made a difference. 

The 460MHz (230MHz DDR) memory on the GeForce2 Ultra board we tested ran hot enough to make the heatsinks placed upon them more than just warm to the touch.  While the heatsink placed on the GeForce2 Ultra chip itself wasn’t very hot at all, the memory heatsinks were the exact opposite.  During our tests, the memory reached temperatures as high as 109F (42.8C); this was measured by placing a thermistor on the opposite side of the PCB, directly behind a memory chip.  We compared this to the operating temperature of memory on a 64MB GeForce2 GTS, which turned out to be 102.8F (39.3C).  The small difference in temperature can be accounted for by the heatsinks on the memory actually doing their job.  So what happens when we take off the heatsinks?

Without the heatsinks, the back of the memory reached 112.4F (44.7C), which isn’t much warmer than with the heatsinks, so are they necessary?  After approximately 20 minutes of running a Quake III Arena loop with only one heatsink off (there are two, one on each set of 4 SDRAM devices), quite a few artifacts appeared on the screen; eventually the demo loop crashed. 

So it seems as if, at least on our board, the heatsinks were necessary for stable operation at such a high frequency. 
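For reference, the three temperature readings quoted in this section can be converted and compared with a short sketch. The dictionary labels are ours; the Fahrenheit values are the measurements reported above.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


# Readings taken on the back of the PCB, directly behind a memory chip.
readings_f = {
    "GeForce2 Ultra, heatsinks on": 109.0,
    "64MB GeForce2 GTS": 102.8,
    "GeForce2 Ultra, heatsink off": 112.4,
}

for label, temp_f in readings_f.items():
    print(f"{label}: {temp_f}F = {fahrenheit_to_celsius(temp_f):.1f}C")

# How much warmer the Ultra's memory ran once the heatsink was removed.
delta_f = readings_f["GeForce2 Ultra, heatsink off"] - readings_f["GeForce2 Ultra, heatsinks on"]
print(f"Removing the heatsink added about {delta_f:.1f}F")
```

The difference with and without the heatsink is only a few degrees at the back of the board, which makes the instability we saw all the more notable.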



Digital Flat Panel Support

The next difference we noticed between the GeForce2 Ultra and the GTS reference designs was that the former features an external TMDS transmitter manufactured by Silicon Image. 

If you recall, one of the features the GeForce2 GTS offered over its predecessor was an integrated TMDS transmitter that allowed for outputting directly to DVI flat panels at resolutions up to 1280 x 1024.  Unfortunately, according to some users who have written to us, the GeForce2 GTS is incapable of driving certain DVI flat panels, including those made by Viewsonic. 

According to NVIDIA, the external TMDS transmitter is used for supporting resolutions of 1600 x 1200 and above; however, it may also correct some of the compatibility problems DVI flat panel users have been experiencing.  We haven’t been able to confirm this yet, but it is quite possible. 



Faster T&L Means...?

With the higher clock speed comes a higher performing T&L engine; this time around it weighs in at 31 million triangles per second. While NVIDIA has definitely been pushing the T&L bandwagon for quite some time, we have yet to see a game that takes advantage of T&L to the point where it looks or performs noticeably worse without hardware T&L support.

One example of a game with T&L support is Shiny's Sacrifice, which is due out later this year.

We were lucky enough to get a pre-release copy of it and managed to take some screenshots with T&L enabled and without it:

T&L Off

T&L On


As you can probably tell, there is virtually no difference between the above two screenshots; the game looks just as good with T&L off as it does with T&L on. The performance is pretty similar, although on slower machines having T&L on definitely helps the frame rate. It seems the impact of T&L in games is still not as dramatic as NVIDIA and ATI would like it to be, not yet at least.

We paid a visit to Epic not too long ago to take a look at what they're working on for their upcoming follow-up to UnrealTournament, and let's just say that with the number of polygons they are planning to use in the characters, T&L will definitely be helpful.

For now, it's still not a necessity.



NVIDIA's Testing "Suggestion"

ATI gave NVIDIA a nice surprise with the performance of their Radeon in comparison to their flagship GeForce2 GTS product.  In an attempt to “level the playing field”, NVIDIA suggested that if we were to test using MDK2, we should be aware of the fact that the GeForce2 and Radeon use different default texture color depths.  They further suggested that if we wanted to produce an “apples to apples” comparison, then we should run the GeForce2 GTS in 16-bit color mode.  This obviously would create a problem for the ATI Radeon whose 16-bit performance is not nearly as good as NVIDIA’s, but what it doesn’t do is make for a true “apples to apples” comparison.

NVIDIA made this suggestion because there is an option in ATI’s drivers (and there always has been) to convert 32-bit textures to 16-bit.  But by placing the GeForce2 GTS and the GeForce2 Ultra in 16-bit color mode you immediately place the Radeon at a severe disadvantage because of its poor 16-bit color performance.  You buy a card like a Radeon or a GeForce2 GTS, and definitely a GeForce2 Ultra, in order to run in 32-bit color, and benchmarking solely in 16-bit color doesn’t make much sense at all.


The proper way to level the playing field: disable ATI's texture conversion

While NVIDIA’s suggestions are one method of approaching benchmarking, we thought of a better idea: simply disable ATI’s “convert 32-bit textures to 16-bit” option, which is what we’ve always done when benchmarking ATI cards, and the playing field is now leveled.  This is what we did for our comparison; while we didn’t use MDK2, this applies for all benchmarks and we are disappointed that NVIDIA would suggest such a thing in order to produce an “apples to apples” comparison. 



The Test

For the testing, we used the same systems as were used for the GeForce 2 GTS review, with updated drivers. In the case of the Radeon, we tested with the shipping drivers with V-sync disabled as well as "Convert 32-bit textures to 16-bit" turned off. We only tested on one platform because we are comparing the performance of the drivers not the video cards themselves.

We left out the 800 x 600 scores since they didn't really show anything with these cards other than a small drop in performance when compared to 640 x 480 frame rates.

Windows 98 SE Test System

Hardware

CPU(s): Intel Pentium III 550E; AMD Athlon (Thunderbird) 1GHz
Motherboard(s): AOpen AX6BCPro Gold; ABIT KT7-RAID
Memory: 128MB PC133 Corsair SDRAM (Micron -7E Chips)
Hard Drive: IBM Deskstar DPTA-372050 20.5GB 7200 RPM Ultra ATA 66
CDROM: Philips 48X
Ethernet: Linksys LNE100TX 100Mbit PCI Ethernet Adapter

Video Card(s):
3dfx Voodoo5 5500 AGP 64MB
3dfx Voodoo5 4500 AGP 32MB
ATI Radeon 64MB DDR
ATI Rage Fury MAXX 64MB
Matrox Millennium G400MAX 32MB
NVIDIA GeForce 2 MX 32MB SDR (default clock - 175/166)
NVIDIA GeForce 2 GTS 32MB DDR (default clock - 200/166 DDR)
NVIDIA GeForce 256 32MB DDR (default clock - 120/150 DDR)
NVIDIA GeForce 256 32MB SDR (default clock - 120/166)
NVIDIA Riva TNT2 Ultra 32MB (default clock - 150/183)
S3 Diamond Viper II 32MB

Software

Operating System: Windows 98 SE

Video Drivers:
3dfx Voodoo5 5500 AGP 64MB - final drivers v1.00.01
3dfx Voodoo5 4500 AGP 32MB - final drivers v1.00.01
ATI Rage Fury MAXX 64MB - A6.40CD06
ATI Radeon 64MB DDR - 4.12_3050
Matrox Millennium G400MAX 32MB - 6.00.010 Beta
NVIDIA GeForce2 MX 32MB SDR - Detonator3 6.17
NVIDIA GeForce2 GTS 32MB DDR - Detonator3 6.17
NVIDIA GeForce 256 32MB DDR - Detonator3 6.17
NVIDIA GeForce 256 32MB SDR - Detonator3 6.17
NVIDIA Riva TNT2 Ultra 32MB - Detonator3 6.17
S3 Diamond Viper II 32MB - 4.12.01.9006-9.51.01

Benchmarking Applications

Gaming:
GT Interactive Unreal Tournament 4.04 AnandTech.dem
idSoftware Quake III Arena demo001.dm3



As we discovered in our Detonator3 investigation, the new drivers decrease performance at lower resolutions where memory bandwidth limitations don’t kick in quite yet.  Because of this, the ATI Radeon jumps to the very top of the chart with the rest of the NVIDIA cards trailing behind followed by the 3dfx Voodoo5 5500. 

The 64MB GeForce2 Ultra actually comes out slightly behind the 32MB GeForce2 GTS because the latter features DDR SGRAM while all 64MB cards use DDR SDRAM.  The differences between SGRAM and SDRAM vanish at the higher resolutions however.

At 1024 x 768 the GeForce2 Ultra immediately begins to take over, boasting frame rates above 100fps in both 16-bit and 32-bit color.  Because of its 7.3GB/s of memory bandwidth, the Ultra laughs at what we once considered to be a memory bandwidth limited setting.

You definitely get what you’re paying for with the GeForce2 Ultra; at 1280 x 1024 x 32 the card’s 81.8 fps is definitely playable and almost as fast as the GeForce2 GTS is in 16-bit color. 

Nothing comes close to touching the Ultra at 1600 x 1200 x 32; the card is just barely capable of hitting 60 fps at that resolution, which is definitely playable.  While some may not be interested in running at that high a resolution, that kind of performance does make running at 640 x 480 or 800 x 600 (32-bit) with 2X FSAA enabled very playable. 



With the slower Pentium III 550E, we see that the Radeon no longer holds its own at the top of the chart.  The GeForce2 GTS once again comes out on top of the Ultra because of its slightly faster DDR SGRAM. 

Once again, the GeForce2 Ultra picks up the pace and finds itself at the top of the performance charts. Because of the slower CPU, the performance difference between the GeForce2 Ultra and the regular GeForce2 GTS isn't too great as the CPU begins to become a limitation.

At 1280 x 1024 the GeForce2 Ultra redefines playable as 80 fps because that's what it's able to deliver: a full 41% faster than the GeForce2 GTS, but also a full 50% more expensive.

Even on slower 100MHz FSB CPUs 1600 x 1200 is still very playable. Your friend down the street with a 1GHz Thunderbird and a GeForce2 GTS can't even reach your frame rate with the Ultra.



Once again, UnrealTournament proves to appreciate no card other than the Voodoo5. The Radeon seems to love the 1GHz Thunderbird and jumps up right behind the Voodoo5 and even surpasses it in 32-bit color. The GeForce2 Ultra is clearly behind the Radeon. We managed to talk to Epic about this and according to them, NVIDIA's drivers don't handle textures as well as ATI's and 3dfx's, and since UT is a very texture intensive game this could explain why all the NVIDIA cards are behind by such a large amount.

The standings remain the same as the resolution goes up.

At 1600 x 1200 the Ultra finally comes through with a very small advantage because of its incredible memory bandwidth and fill rate.



The scores are much more bunched together with the Pentium III 550E because of UT's heavy CPU dependency.

The Radeon falls a few places in the charts but remains within a few fps of the top performers as the benchmark becomes CPU limited.

This time memory size (64MB) coupled with fill rate potential comes in handy as two 64MB NVIDIA cards tie for first place with a very small margin over the 32MB GeForce2 GTS.

The Voodoo5 5500 gives up its leading position as it heads towards the bottom of the graph.



Overclocked Performance

As we mentioned earlier in the review, the GeForce2 Ultra should be a great overclocker since it uses 4ns (250MHz) DDR SDRAM. Needless to say, we had no problem getting the Ultra up to a 285MHz core clock and a 500MHz memory clock, and we benchmarked it using Quake III Arena (UT isn't responsive to overclocking because it isn't fill rate/memory bandwidth limited). So how fast did that make the already blazingly fast card?
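As a rough guide to how big that overclock is, the sketch below expresses the 285MHz core and 500MHz memory clocks as percentage gains over the Ultra's defaults. The helper name is ours, and these are clock gains only; frame rates will not necessarily scale by the same amount.

```python
def percent_gain(default_mhz: float, overclocked_mhz: float) -> float:
    """Relative clock increase from default to overclocked, as a fraction."""
    return (overclocked_mhz - default_mhz) / default_mhz


core_gain = percent_gain(250, 285)    # core: 250MHz default, 285MHz overclocked
memory_gain = percent_gain(460, 500)  # memory: 460MHz default, 500MHz overclocked (effective)

print(f"Core clock gain:   {core_gain:.1%}")
print(f"Memory clock gain: {memory_gain:.1%}")
```

The core gets the larger boost at 14%, while the memory, which matters most at high resolutions, gains a little under 9%.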

The below scores were taken on a 1GHz Thunderbird to minimize CPU bottlenecks.

Even at 640 x 480, the overclocked GeForce2 Ultra takes an early lead over the competition, including a hefty lead over the non-overclocked Ultra.

With a 500MHz effective memory clock, it isn’t surprising that the overclocked Ultra can pull over 125 fps at 1024 x 768 x 32. 

The Ultra gives us a nice 10 fps boost when overclocked at 1280 x 1024, almost putting it at 100 fps at this high of a resolution in 32-bit color. 

The overclocked Ultra is the first to break the 60 fps barrier at 1600 x 1200 x 32, very impressive but very expensive.



FSAA Image Quality & Performance

The image quality of the Ultra's FSAA is identical to that of the GeForce2 GTS; visit our FSAA comparison for a closer look at image quality.

The GeForce2 Ultra makes NVIDIA's 2x2 FSAA plausible at 640 x 480 x 32 and 800 x 600 x 32 without dropping too far below 60 fps, and in the case of 640 x 480 x 32 you're still running at above 80 fps.
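One way to see why these FSAA numbers look the way they do: assuming 2x2 FSAA is implemented as supersampling (rendering at twice the width and twice the height, then filtering down, which is our assumption here rather than something stated in this review), the card has to fill four times as many pixels. The function name below is ours.

```python
def supersampled_pixels(width: int, height: int, factor_x: int = 2, factor_y: int = 2) -> int:
    """Pixels rendered per frame when supersampling by factor_x x factor_y."""
    return (width * factor_x) * (height * factor_y)


for width, height in ((640, 480), (800, 600)):
    pixels = supersampled_pixels(width, height)
    print(f"{width} x {height} with 2x2 FSAA renders {pixels:,} pixels per frame")

# For comparison, the pixel counts of the high resolutions tested without FSAA.
print(f"1280 x 1024 without FSAA: {1280 * 1024:,} pixels")
print(f"1600 x 1200 without FSAA: {1600 * 1200:,} pixels")
```

Under that assumption, 800 x 600 with 2x2 FSAA is the same pixel load as 1600 x 1200, and 640 x 480 with 2x2 FSAA is slightly less work than 1280 x 1024, which lines up with the frame rates we saw at those resolutions.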

With 2x2 FSAA enabled, the GeForce2 Ultra is just barely slower than the Radeon with no FSAA enabled.

With the GeForce2 Ultra's incredible fill rate and memory bandwidth power combined with its 64MB of DDR SDRAM, it is truly the fastest FSAA solution on the market. Is it worth the money though? We don't think so.



Final Words

It's amazing how things can completely turn around for a company with a single product release. To those of you that have any doubt in your mind, as sad as it is to say, NVIDIA's NV20 will not be here until Spring 2001. Instead, we are to live with the $500 GeForce2 Ultra and NVIDIA is going to hope that neither 3dfx nor ATI and not even Matrox are able to execute properly this fall, otherwise NVIDIA will most definitely be dethroned as the 3D graphics performance king.

This is the opportunity for the competitors that haven't been able to compete thus far to step forward and push for the release of their products as soon as possible. The GeForce2 Ultra will be out in 30 - 45 days, and the competition already knows what to expect from NVIDIA for the next 6 months, if they can beat that, then they'll have the upper hand, for now at least.

NVIDIA's biggest fear should be ATI at this point. ATI's Radeon is already dangerously close in performance to the GeForce2 GTS. And provided that ATI can produce a Radeon MAXX in time, it will definitely give the GeForce2 Ultra a run for its money. Speaking of money, it most definitely won't debut at $500.

ATI's Radeon is currently available and is priced very competitively with the GeForce2 GTS (32MB). Our biggest fear has been driver support and quality, and in order to give a more accurate recommendation we are going to be stress testing the ATI Radeon drivers and will report on their successes and/or failures in an upcoming article a week from now.

At a $500 price point the GeForce2 Ultra most likely won't see the same popularity that even the $300 GeForce2 GTS and the $400 64MB GeForce2 GTS cards saw since their release. The price of the Ultra is heavily dependent on the availability of its ultra-fast 4ns DDR SDRAM, and there is no sign of that cost dropping anytime soon. It will be interesting to see how much Ultra cards drop in price over the next couple of months if any at all.

Performance-wise, you can't argue that the GeForce2 Ultra is the fastest thing to hit the streets thus far. We're not arguing with the performance of the card at all, but for most users, we can't recommend spending $500 on a card that will most likely be outperformed by a sub $300 card in a couple of months. If you want the fastest thing now and don't care about the cost, there's no question that the Ultra is for you.

While the delay of the NV20 isn't great enough to completely throw NVIDIA off course, they need to make sure that when NV20 does come along that it is powerful enough and attractive enough to shadow the competition once again. If not, then NVIDIA might be facing much bigger problems. For now, this is 3dfx's, ATI's and Matrox's chance to step forward and restore some competition to this industry.

To the big three out there, this is your chance, don't screw it up.
