Original Link: https://www.anandtech.com/show/1967



Introduction

Today marks the launch of NVIDIA's newest graphics cards: the 7900 GTX, 7900 GT and the 7600 GT. These cards are all based on an updated version of the original G70 design and offer higher performance for the dollar. Today we will see just how much faster the new NVIDIA flagship part is. But first let's take a look at what makes it different.

At the heart of this graphics launch is a die shrink. The functionality of the new parts NVIDIA is introducing is identical to that of the original G70 based lineup. Of course, to say that this is "just a die shrink" would be selling NVIDIA a little short. In the future, if either NVIDIA or ATI decides to move to TSMC's newly introduced 80nm half-node process, all that would be involved is a simple lithographic shrink. Sure, things might get tweaked a little here and there, but the move from 90nm to 80nm doesn't involve any major change in the design rules. Moving from 110nm to 90nm, by contrast, required NVIDIA to change quite a bit about their register transfer logic (RTL), update the layout of the IC, and verify that the new hardware works as intended.

The basic design rules used to build ICs must be updated between major process shrinks because the characteristics of silicon circuits change at smaller and smaller sizes. As transistors and wires get smaller, things like power density and leakage increase. Design tools often employ standard components tailored to a fab process, and sometimes it isn't possible to drop in a simple replacement that fits new design rules. These and other issues make it so that parts of the design and layout need to change in order to make sure signals get from one part of the chip to another intact and without interfering with anything else. Things like clock routing, power management, avoiding hot spots, and many other details must be painstakingly reworked.

In the process of reworking the hardware for a new process, a company must balance what they want from the chip with what they can afford. At smaller geometries, yield concerns increasingly influence a circuit's RTL, and even its high-level design can play a part. Decisions that improve speed and performance can hurt yield, die size, and power consumption. Conversely, maximizing yield, minimizing die size, and keeping power consumption low can hurt performance. It isn't enough to come up with a circuit that just works: an IC design must work efficiently. Not only has NVIDIA had the opportunity to further balance these characteristics in any way they see fit, but the rules for how this must be done have changed from the way it was done on 110nm.

After the design of the IC is updated, it still takes quite a bit of time to get from the engineers' desks to a desktop computer. After the first spin of the hardware comes back from the fab, it must be thoroughly tested. If any performance, power, or yield issues are noted from this first run, NVIDIA must tweak the design further until they get what they need. Throughout this entire process, NVIDIA must work very closely with TSMC in order to ensure that everything they are doing will work well with the new fab process. As microelectronic manufacturing technology progresses, fabless design houses will have to continue to work more and more closely with the manufacturers that produce their hardware in order to get the best balance of performance and yield.

We have made quite a case for the difficulty involved in making the switch to 90nm. So why go through all of this trouble? Let's take a look at the benefits NVIDIA is able to enjoy.



NVIDIA's Die Shrink: The 7900 and 7600

The first major benefit to NVIDIA comes in the form of die size. The original G70 is a 334mm^2 chip, while the new 90nm GPUs are 196mm^2 and 125mm^2 for G71 and G73 respectively. Compare this to the G72 (the chip used in the 7300) at 77mm^2 and the R580 at 353mm^2 for a good idea of the current range of sizes for 90nm GPUs, and it becomes clear that NVIDIA's hardware is generally much smaller than ATI's. The reasons behind the difference in die size between the high end ATI and NVIDIA hardware come down to the design decisions each company made. ATI decided to employ full-time fp32 processing with very good loop granularity, floating point blend with anti-aliasing, a high quality anisotropic filtering option, and the capability to support more live registers at full speed in a shader program. These are certainly desirable features, but NVIDIA has flat out told us that, given current and near term games and the poor performance characteristics of code that uses these features, they don't believe most of them have a place in hardware yet.

Of course, in graphics there are always chicken and egg problems. We would prefer it if all companies could offer all features at high performance for a low cost, but this just isn't possible. We applaud ATI's decision to stick their neck out and include some truly terrific features at the expense of die size, and we hope it inspires some developers out there to really take advantage of what SM3.0 has to offer. At the same time, a hardware feature that goes unused is useless (hardware is only as good as the software it runs allows it to be). If NVIDIA is right about the gaming landscape, a smaller die size with great performance in current and near term games does give NVIDIA a clear competitive edge. Also note that NVIDIA has been an early adopter of features in the past that went largely unused (e.g. fp32 in the FX line), so perhaps they've learned from past experience.

The smaller the die, the more chips can fit on one silicon wafer. As the wafer costs the same to manufacture regardless of the number of ICs or yield, a small IC and high yield decrease the cost per die to NVIDIA. Lower cost per die is a huge deal in the IC industry, especially in the GPU segment. Not only does a lower cost to NVIDIA mean the opportunity for higher profit margins, it also gives them the ability to be very aggressive with pricing while still running in the black. With ATI's newest lineup offering quite a bit of performance and more features than NVIDIA hardware, this is all good news for consumers. ATI has a history of being able to pull out some pretty major victories when they need to, but with NVIDIA's increased flexibility we hope to see more bang for your buck across the board.
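To get a feel for why die size matters so much, here is a rough back-of-the-envelope sketch. The die areas are the ones quoted above; the 300mm wafer size is standard for these processes, but the wafer cost and yield figures are made-up placeholders (real numbers are closely guarded), so treat the dollar outputs as illustrative only.

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """Classic gross-die estimate: wafer area over die area,
    minus a correction for partial dies lost at the wafer edge."""
    r = wafer_diameter_mm / 2
    return int(math.pi * r ** 2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

WAFER_COST = 5000.0  # hypothetical cost per processed wafer, dollars
YIELD = 0.6          # hypothetical fraction of dies that are good

for name, area in [("G70 (110nm)", 334), ("G71 (90nm)", 196), ("G73 (90nm)", 125)]:
    gross = dies_per_wafer(area)
    cost_per_good_die = WAFER_COST / (gross * YIELD)
    print(f"{name}: ~{gross} gross dies, ~${cost_per_good_die:.0f} per good die")
```

Even with identical (hypothetical) wafer cost and yield, the shrink from 334mm^2 to 196mm^2 nearly doubles the number of candidate dies per wafer, which is exactly the pricing flexibility discussed above.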

We haven't had the opportunity to test an X1800 GTO for this launch. We requested a board from ATI, but they were apparently unable to ship us one before the launch. Whether ATI can sustain this product as well as it did the X800 GTO is also questionable (after all, the X800 GTO could be built from any of three different GPUs from different generations, while the X1800 GTO has significantly fewer options). However, we are hopeful that the X1800 GTO will be a major price/performance leader that will put pressure on NVIDIA to drop the price of their newest parts even lower than they already are. After all, in the end we are our readers' advocates: we want to see what is best for our community, and a successful X1800 GTO and the flexibility of NVIDIA after this die shrink would certainly be advantageous for all enthusiasts. But we digress.

The end result of this die shrink, regardless of where the prices on these parts begin to settle, is two new series in the GeForce line: the 7900 at the high end and 7600 at the midrange.

The Newest in High End Fashion

The GeForce 7900 Series is targeted squarely at the top. The 7900 GTX assumes its position at the top of the top, while the 7900 GT specs out very similarly to the original 7800 GTX. Due to the 90nm design, NVIDIA was able to target power and thermal specifications similar to the 7800 GTX 512 with the new 7900 GTX and get much higher performance. In the case of the 7900 GT, performance levels on the order of the 7800 GTX can be delivered in a much smaller, cooler package that pulls less power.

While the 7900 GTX will perform beyond anything else NVIDIA has on the table now, the 7900 GT should give NVIDIA a way to provide a more cost effective and efficient solution to those who wish to achieve 7800 GTX level performance. The specifics of the new lineup are as follows:

7900 GTX:
8 vertex pipes
24 pixel pipes
16 ROPs
650 MHz core clock
1600 MHz memory data rate
512MB of memory on a 256bit bus
$500 +

7900 GT:
8 vertex pipes
24 pixel pipes
16 ROPs
450 MHz core clock
1320 MHz memory data rate
256MB of memory on a 256bit bus
$300 - $350
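The spec lists above translate directly into theoretical peak throughput. A quick sketch (the figures come straight from the lists; "pixel fill rate" uses the ROP count, texture rate uses the pixel pipe count, and memory bandwidth uses the effective data rate times bus width):

```python
def peak_numbers(core_mhz, mem_data_rate_mhz, bus_bits, rops, pixel_pipes):
    """Theoretical peaks from the published specs."""
    fill_mpix = rops * core_mhz          # Mpixels/s, ROP-limited
    tex_mtex = pixel_pipes * core_mhz    # Mtexels/s, one texel per pipe per clock
    bw_gbs = mem_data_rate_mhz * 1e6 * bus_bits / 8 / 1e9  # GB/s
    return fill_mpix, tex_mtex, bw_gbs

cards = {
    "7900 GTX": (650, 1600, 256, 16, 24),
    "7900 GT":  (450, 1320, 256, 16, 24),
}
for name, specs in cards.items():
    fill, tex, bw = peak_numbers(*specs)
    print(f"{name}: {fill/1000:.1f} Gpix/s, {tex/1000:.1f} Gtex/s, {bw:.1f} GB/s")
```

Running the numbers, the 7900 GTX comes out to 10.4 Gpixels/s, 15.6 Gtexels/s, and 51.2 GB/s of memory bandwidth; the GT's lower clocks put it at roughly 70-80% of the GTX on every axis despite having the same pipeline configuration.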

Image-wise, the 7900 GTX takes on the same look as the 7800 GTX 512 with its massive heatsink and large PCB. In sharp contrast to the more powerful 7900 GTX and its 110nm brother, the 7800 GTX 512, the 7900 GT sports a rather lightweight heatsink/fan solution. Take a look at the newest high end cards to step onto the stage:





Midrange Chic

With the introduction of the 7600 GT, NVIDIA is hoping they have an X1600 XT killer on their hands. Not only is this part designed to perform better than a 6800 GS, but NVIDIA is hoping to keep it price competitive with ATI's upper midrange. Did we mention it also requires no external power?

In our conversations with NVIDIA about this launch, they really tried to drive home the efficiency message. They like to claim that their parts have fewer transistors and provide performance similar to or greater than competing ATI GPUs (ignoring the fact that the R5xx GPU actually has more features than the G70 and processes everything at full precision). When sitting in on a PR meeting, it's easy to dismiss such claims as hype and fluff, but seeing the specs and performance of the 7600 GT coupled with its lack of power connector and compact thermal solution opened up our eyes to what efficiency can mean for the end user. This is what you get packed into this sleek midrange part:

7600 GT
5 vertex pipes
12 pixel pipes
8 ROPs
560 MHz core clock
1400 MHz memory data rate
256MB of memory on a 128bit bus
$180 - $230

And as NVIDIA wants this card to evolve into the successor to the 6600 GT, we get all of that in a neat little package:



Now that we've taken a look at what NVIDIA is offering this time around, let us take a step back and absorb the competitive landscape.



The Competition

ATI has been very aggressive as of late, and we have been quite happy with what we have seen so far. After their circuit design setback with the X1800 last year, ATI really turned things around and offered the X1900 lineup in rather quick succession. Before today, the X1900 was clearly the king of the hill in all things graphics. With the new RD580 chipset from ATI offering 2x x16 PCI Express slots, CrossFire is looking better than ever as well. The comparison at the high end is very exciting: there has never been a better time to be a graphics enthusiast with tons of excess money.

At the same time, the midrange is heating up as well. With prices on the X1600 looking good, the new pressure on NVIDIA from ATI's upcoming X1800 GTO (which we unfortunately don't have), and solid products like the 6800 GS and 7800 GT already out there, the 7600 GT is a welcome addition in price/performance.

So we can get a good idea of what we will be working with, we are providing tables comparing the features of the high end cards and mid range cards we will be testing from NVIDIA and ATI. CrossFire and SLI will be looked at as well.

NVIDIA Graphics Card Specifications
Card Vert Pipes Pixel Pipes Raster Pipes Core Clock (MHz) Mem Clock (MHz) Mem Size (MB) Mem Bus (bits) Price
GeForce 7900 GTX 8 24 16 650 800 512 256 ~$500+
GeForce 7900 GT 8 24 16 450 660 256 256 ~$325
GeForce 7800 GTX 512 8 24 16 550 850 512 256 $600+
GeForce 7800 GTX 8 24 16 430 600 256 256 $450
GeForce 7800 GT 8 20 16 400 500 256 256 $300
GeForce 7600 GT 5 12 8 560 700 256 128 ~$200
GeForce 6800 GS 5 12 8 425 500 256 256 $180


ATI Graphics Card Specifications
Card Vert Pipes Pixel Pipes Raster Pipes Core Clock (MHz) Mem Clock (MHz) Mem Size (MB) Mem Bus (bits) Price
Radeon X1900 XTX 8 48 16 650 775 512 256 $600+
Radeon X1900 XT 8 48 16 625 725 512 256 $500
Radeon X1600 XT 5 12 4 590 690 256 128 $150


We will also be including SLI and CrossFire setups for these cards in all cases but for the X1600 XT. Unfortunately, during testing one of our X1600 cards decided to roll over and die (such is the price of working with engineering samples and prerelease products). The other card we would love to have included is the X1800 GTO which has 12 pixel pipes and is clocked similarly to the X1800 XL. As we mentioned previously, ATI didn't get a card to us for testing.

For our comparison, we have decided to test all applications with 4xAA and 8xAF in all tests but Splinter Cell: Chaos Theory. For Splinter Cell we are testing with SM3.0 options enabled and AA disabled as the game doesn't allow both to be set while playing. With all of this power available, our opinion is that AA is worth enabling in just about any situation. The visual quality benefit, even at high resolutions, is well worth it.



The Test and Power

For our test setup, we are using two different 2x x16 ASUS boards: one based on NVIDIA and the other ATI core logic. For all tests with NVIDIA GPUs we used the NVIDIA motherboard, and for all tests with ATI GPUs we employed the ATI based motherboard. All power tests were performed using the same motherboard (the RD580 board).

In an attempt to keep everything readable and manageable, we have split up the high end and mid range comparisons. Our high end parts will be compared at 1280x1024, 1600x1200, 1920x1440, and 2048x1536. The mid range comparison will look at 1024x768, 1280x1024, and 1600x1200. All SLI and CrossFire tests will be included with the high end data.

Test Setup:
ASUS A8N32 NVIDIA nForce 4 X16 SLI Motherboard
ASUS A8R32 ATI RD580 Motherboard
AMD Athlon 64 FX-57
2GB OCZ 2.5:3:3:8 DDR400 RAM
160GB Seagate 7200.7 Hard Drive
600W OCZ PowerStream PSU

Drivers:
NVIDIA ForceWare 84.17 (Beta)
ATI Catalyst 6.2

For power consumption, we once again take a look at the power draw of the system at the wall using our trusty Kill-A-Watt. Power load was measured while running the 3DMark06 feature tests, as they tend to provide something near a worst case power load. What we see in games is usually a handful of watts lower than this. For idle power, we don't see that much difference between the high end cards, and the 7900 GT is similar in power to the 6800 GS. The 7600 GT seems to be on par with the X1600 XT for idle power.

Idle Power

When it comes to load, the new NVIDIA parts simply clean up. The performance per watt leader in this contest is hands down NVIDIA. The 7600 GT and 7900 GT both come in at a lower power than the 6800 GS and the 7900 GTX pulls the same wattage as the much lower clocked 7800 GTX.
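Since we measure system power at the wall, one reasonable way to approximate a performance-per-watt figure is to take frame rate over the load-minus-idle delta, which roughly isolates the graphics card's contribution. A small sketch; the fps and wattage inputs below are made-up placeholders for illustration, not our measured results:

```python
def perf_per_watt(fps, system_load_w, system_idle_w):
    """Approximate GPU efficiency: frames per second per watt of
    estimated GPU draw (load wall power minus idle wall power)."""
    return fps / (system_load_w - system_idle_w)

# Hypothetical example numbers only:
efficiency = perf_per_watt(fps=80, system_load_w=260, system_idle_w=160)
print(f"{efficiency:.2f} fps per GPU watt")
```

Note that this delta still includes PSU inefficiency and extra CPU/chipset work under load, so it overstates the GPU's actual draw somewhat; it is best used for comparing cards on the same testbed rather than as an absolute figure.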

Load Power



7900 GT: Just Another 7800 GTX?

Rather than clutter up our performance tests with a full set of numbers from another card, we decided to sum up the 7900 GT's performance first. As we saw in our card comparison charts, the 7900 GT has specifications nearly identical to the 7800 GTX. Here we take a look at the stock 7800 GTX vs. the stock 7900 GT. There are plenty of overclocked 7800 GTX cards around that will perform on par with, or very slightly above, the 7900 GT. But as our performance tests at 1920x1440 clearly show, the 7900 GT only marginally edges out the stock 7800 GTX in performance.

7900 GT vs. 7800 GTX


In the following performance tests, we included the 7800 GTX 512 in order to compare the performance improvement the 7900 GTX offers over the previous flagship. For a comparison between the original 7800 GTX and the 7900 GTX, simply imagine that the 7900 GT has a different label.



Battlefield 2 High-End Performance

We have finally ironed out our issues with BF2 and SLI, so this time around we get to compare ATI and NVIDIA in multi GPU configurations. DICE has said in the past that results over 100 fps are not always reliable. It will suffice to say that CrossFire leads SLI at the two lower resolutions; putting a finer point on it goes against what we know of the benchmark's behavior. Interestingly, at higher resolutions (above 1600x1200), while the 7900 GTX and 7900 GT fall further behind ATI's single card solutions, SLI is able to take the lead from CrossFire. This would seem to indicate that SLI has a bit more of a CPU limitation at the low end than CrossFire, but that it is ultimately much more efficient in BF2.

Battlefield 2 High-End Performance


Battlefield 2 High-End Performance


Battlefield 2 High-End Performance


Battlefield 2 High-End Performance




F.E.A.R. High-End Performance

F.E.A.R. High-End Performance


F.E.A.R. High-End Performance


F.E.A.R. High-End Performance


F.E.A.R. High-End Performance




Quake 4 High-End Performance

Quake 4 High-End Performance


Quake 4 High-End Performance


Quake 4 High-End Performance


Quake 4 High-End Performance




Splinter Cell: Chaos Theory High-End Performance

Splinter Cell: Chaos Theory High-End Performance


Splinter Cell: Chaos Theory High-End Performance


Splinter Cell: Chaos Theory High-End Performance


Splinter Cell: Chaos Theory High-End Performance




Battlefield 2 Mid-Range Performance

Battlefield 2 Mid-Range Performance


Battlefield 2 Mid-Range Performance


Battlefield 2 Mid-Range Performance




F.E.A.R. Mid-Range Performance

F.E.A.R. Mid-Range Performance


F.E.A.R. Mid-Range Performance


F.E.A.R. Mid-Range Performance




Quake 4 Mid-Range Performance

Quake 4 Mid-Range Performance


Quake 4 Mid-Range Performance


Quake 4 Mid-Range Performance




Splinter Cell: Chaos Theory Mid-Range Performance

Splinter Cell: Chaos Theory Mid-Range Performance


Splinter Cell: Chaos Theory Mid-Range Performance


Splinter Cell: Chaos Theory Mid-Range Performance




Quad SLI and Purevideo

Today NVIDIA is also putting its Quad SLI initiative into action. Unfortunately, it doesn't look like they will be selling add-in Quad SLI based cards in the near future, but those in need of such a setup will be able to find one from various system builders. Obviously this is a little at odds with the enthusiast community, who prefer to build their extreme rigs themselves, but NVIDIA cites thermal, space, and power concerns not easily addressable by the individual as reasons for pushing out this hardware to system builders first. How many power supplies out there can provide enough power for SLI and CrossFire, let alone have the headroom to support four GPUs? Similarly, thermal issues could definitely be a problem in a case without good air flow.

NVIDIA would not commit to any timeframe for bringing Quad SLI to the add-in market, but they did indicate that they want the requirements for Quad SLI to be clear and readily obtainable by an individual. The landscape does have to be ready for something like this. Even if an enthusiast could put together a thermal solution to support Quad SLI, most of us don't dabble in power supply design and manufacture on the side. Unfortunately, we don't have any Quad SLI cards to test out either, but we are certainly looking into getting our hands on a system. We will have benchmarks as soon as we are able.

Quad SLI will provide a few new modes that are basically extensions of what the current SLI technology offers. Split frame rendering (SFR) will now split the frame 4 ways, and alternate frame rendering (AFR) will support one GPU rendering every 4th frame. The latter mode will provide the most benefit in games that support it as geometry processing will be well divided among the GPUs. Additionally, AFR of SFR will take each frame and split it among a pair of GPUs. Each pair then renders every other frame. This mode will be compatible with all titles that currently support AFR. Additional SLI AA modes will also be added to take advantage of up to 32x AA.
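The scheduling ideas behind these modes can be sketched in a few lines. This is purely an illustration of how the frame-to-GPU assignments described above work out, not NVIDIA's driver logic; the function names are our own:

```python
def afr(frame, num_gpus=4):
    """Alternate frame rendering: each GPU renders every 4th frame."""
    return frame % num_gpus

def afr_of_sfr(frame):
    """AFR of SFR: GPU pairs alternate frames, and each pair splits
    its frame between its two GPUs (e.g. top half / bottom half)."""
    pair = frame % 2                  # pair 0 takes even frames, pair 1 odd
    return [2 * pair, 2 * pair + 1]  # the two GPUs sharing this frame

for frame in range(4):
    print(f"frame {frame}: AFR -> GPU {afr(frame)}; "
          f"AFR-of-SFR -> GPUs {afr_of_sfr(frame)}")
```

The appeal of AFR of SFR is visible in the assignment pattern: each frame's geometry is processed by only two GPUs instead of being duplicated across all four, while the frame alternation still keeps all four busy, which is why it remains compatible with titles that already support plain AFR.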

Quad SLI is set up using what NVIDIA calls an x48 PCIe interconnect. This takes the x16 connections from the motherboard and both video cards and manages all three at full speed. In this way, full use can be made of available PCI Express bandwidth, both to and from the system and between GPUs.
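For perspective on what "full speed" means here, each of those x16 links carries first-generation PCIe signaling: 2.5 GT/s per lane with 8b/10b encoding, or 250 MB/s of usable bandwidth per lane per direction. A trivial sketch:

```python
def pcie1_bandwidth_gbs(lanes):
    """Per-direction usable bandwidth of a first-gen PCIe link.
    2.5 GT/s per lane, 8b/10b encoding -> 250 MB/s per lane."""
    return lanes * 250 / 1000  # GB/s

# Each x16 link the bridge manages:
print(pcie1_bandwidth_gbs(16))  # 4.0 GB/s per direction
```

So each card pair has a dedicated 4 GB/s per direction to the bridge, rather than the two GPUs on a board having to share a single x16 link to the motherboard.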

Last week NVIDIA also launched an update to their Purevideo driver which is supposed to deliver increased performance and support, specifically for H.264 video playback. We will also be looking into updating our video quality and performance tests with the new Purevideo driver as soon as possible. The biggest change we would like to see from Purevideo is a free download from NVIDIA. Requiring users to purchase additional software in order to get the full functionality of advertised features from their hardware is more than a little disappointing.



Final Words

Even though we didn't test as many games as we usually do, there is quite a bit of data to digest. On the high end, the 7900 GTX generally performs around the X1900 XT and X1900 XTX. This isn't a blow out victory for either NVIDIA or ATI as far as performance goes, and it looks like we have some very good competition here.

In general, SLI edges out CrossFire in most cases. Under F.E.A.R., Quake 4 and BF2 at high resolutions, SLI shows a larger performance increase than CrossFire. Splinter Cell does do a good job of showing the potential of Crossfire, but as of now we don't see as many games scaling as well with CrossFire as they do with SLI.

While the 7900 GT generally spent its time at the bottom of our high end tests, remember that it performs slightly better than a stock 7800 GTX. This puts it squarely at or better than the X1800 XL and X1800 XT. We didn't include these cards as ATI seems to be backing away from the X1800 lineup with the exception of the X1800 GTO that we were unable to obtain for this launch. As the X1800 GTO looks like a cut down X1800 XL, we can certainly expect the 7900 GT to outperform it as well.

The 7600 GT does quite a good job of splitting the performance difference between the 6800 GS and the 7800 GT. NVIDIA is hoping that we will concentrate on how well the 7600 GT does in comparison to the X1600 XT, but unless the price of the 7600 GT falls to about $150 very quickly, the comparison isn't really fair. The 6800 GS already performs better than the X1600 and can be found for about $170. It's clear the 7600 GT needs to be positioned against a faster offering from ATI, and the next step up in ATI's lineup after the X1600 XT will be the upcoming X1800 GTO. With the X1800 GTO poised to come in at between $250 and $300, however, we would expect it to compete more with the 7900 GT, which will land somewhere between $300 and $350. We need to take all of this into consideration when looking at the 7600 GT, even though it should be less expensive than the ATI part.

The bottom line here is that it all comes down to price. With the close competition at the high end, we still really can't recommend the X1900 XTX, which generally comes in between $580 and $650. In order for the 7900 GTX to really look good compared to the X1900 XT, its price will have to push below the $500 mark. NVIDIA has positioned the 7900 GTX as a $500 part, but we can already find X1900 XT cards for about $475; with the tight competition, we would really like to see NVIDIA take advantage of their cost saving die sizes and bring prices down.

The NVIDIA solutions use less power, generate less heat, and are cheaper to produce, but what matters in the end is the performance the end user gets for the price he or she pays. Yes, the 7900 GTX performs on par with the X1900 XT and XTX. With ATI's additional features, will NVIDIA's street prices be low enough to entice gamers? We'll have to wait and see.
