Original Link: https://www.anandtech.com/show/1930



Introduction

Take all the clichés used to describe a long overdue event or the unexpected fulfillment of a promise (hot places freezing, heavy animals soaring through the air, etc...) and you still couldn't say enough to fully proclaim the news that ATI has finally properly hard launched a product. That's right, looking around the internet this morning has provided us with the joyous realization that the Radeon X1900XT, XTX, and CrossFire parts are available for purchase. We've tried to keep an eye on the situation and it's been quite easy to see that ATI would be able to pull it off this time. Some sites started taking preorders earlier in the week saying their X1900 parts would ship in one to two days, putting the timeframe right on the mark. There were no missing dongles, no problems with customs, and ATI told us last week that thousands of parts had already been delivered to manufacturers.

And if that isn't enough to dance about, ATI has delivered a hugely powerful part with this launch. The Radeon X1900 series is no joke, and every card bearing the name is a behemoth. With triple the pixel shader units of the X1800 XT and a general increase in supporting hardware throughout the pixel processing engine, ATI's highly clocked, 384 million transistor GPU is capable of crunching enormous volumes of data very quickly. Fill rate isn't increased very much, because the X1900 series still only draws 16 pixels to the screen per clock cycle, but power is delivered where it is needed most. With longer and more complex shader programs, pixels need to stay in the shader engine longer, which shifts the performance burden further away from theoretical maximum fill rate and toward shader throughput.
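To put rough numbers on that shift, here's a back-of-the-envelope comparison built from nothing but the published specs; treating relative shader throughput as units times clock is our own simplification, not an ATI figure.

```python
# Back-of-the-envelope throughput comparison: X1800 XT vs. X1900 XTX.
# Published specs only; "shader index" (units x clock) is a simplification.

def gpixels_per_sec(rops: int, clock_mhz: float) -> float:
    """Theoretical fill rate in gigapixels per second."""
    return rops * clock_mhz / 1000.0

gpus = {
    "X1800 XT":  {"rops": 16, "pixel_shaders": 16, "clock_mhz": 625},
    "X1900 XTX": {"rops": 16, "pixel_shaders": 48, "clock_mhz": 650},
}

for name, gpu in gpus.items():
    fill = gpixels_per_sec(gpu["rops"], gpu["clock_mhz"])
    shader_index = gpu["pixel_shaders"] * gpu["clock_mhz"]
    print(f"{name}: {fill:.1f} Gpixels/s fill rate, shader index {shader_index}")

# X1800 XT:  10.0 Gpixels/s fill rate, shader index 10000
# X1900 XTX: 10.4 Gpixels/s fill rate, shader index 31200
```

In other words, fill rate climbs about 4% while raw ALU throughput roughly triples.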

NVIDIA would like us to compare the X1900's increase in ALU (arithmetic logic unit) power to what they did with the FX 5900 after NV30 tanked. Certainly, increasing the math power (and increasing memory bandwidth) helped NVIDIA, but fortunately for ATI the X1900 is not derived from a fundamentally flawed GPU design. The X1800 series are certainly not bad parts, even if they are being completely replaced by the X1900 in ATI's lineup.



I'll spoil the results and make it clear that the X1900XT and XTX are hands down the best cards out there right now. But all positives aside, ATI needed this card to hard launch with good availability, perform better than anything else, and look good doing it. There have been too many speed bumps in ATI's way for there to be any room for a slip up on this launch, and it looks like they've pulled it off. The launch of the X1900 series not only puts ATI back on top, but (much more importantly) it puts them back in the game. Let's hope that both ATI and NVIDIA can keep up the good fight.

But let's not forget why we're here. The first thing we are going to do is talk about what makes the R580 GPU that powers the X1900 series so incredibly good at what it does.



R580 Architecture

The architecture itself is not that different from the R520 series. A couple of tweaks found their way into the GPU, but these consist mainly of the same improvements made to the RV515 and RV530 over the R520, thanks to their longer lead time (the only reason all three parts arrived at nearly the same time was a bug that delayed the R520 by a few months). For a quick look at what's under the hood, here's the R520 and R580 vertex pipeline:



and the internals of each pixel quad:



The real feature of interest is the ability to load and filter four values from a single channel texture map in one lookup. Textures which describe color generally have four components at every location in the texture, and normally the hardware will load an address from a texture map, split the 4 channels, and filter them independently. In cases where single channel textures are used (ATI likes to use the example of a shadow map), the R520 will look up the appropriate address and filter the single channel, letting the hardware's ability to filter 3 other components go to waste. In what ATI calls its Fetch4 feature, the R580 is capable of loading 3 other adjacent single channel values from the texture and filtering these at the same time. This effectively loads and filters four times the texture data when working with single channel formats. Traditional color textures, or textures describing vector fields (which make use of more than one channel per position in the texture), will not see any performance improvement, but for some soft shadowing algorithms the performance increase could be significant.
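To make the mechanism concrete, here's a rough sketch in Python of what Fetch4 buys you. This is our own illustration of the concept, not ATI's implementation: the texture, the integer texel addressing, and the bilinear weights are all simplified stand-ins.

```python
# A sketch of the Fetch4 idea: for a single-channel texture (e.g. a shadow
# map), one lookup returns the 2x2 block of adjacent texels, which feeds the
# filter in one pass instead of requiring four separate lookups.

shadow_map = [
    [0.0, 0.2, 0.4, 0.6],
    [0.1, 0.3, 0.5, 0.7],
    [0.2, 0.4, 0.6, 0.8],
    [0.3, 0.5, 0.7, 0.9],
]

def fetch1(tex, x, y):
    """Classic path: one address, one single-channel value."""
    return tex[y][x]

def fetch4(tex, x, y):
    """Fetch4-style path: one address, the whole 2x2 neighborhood at once."""
    return (tex[y][x], tex[y][x + 1], tex[y + 1][x], tex[y + 1][x + 1])

def bilinear(samples, fx, fy):
    """Filter the four taps with bilinear weights."""
    t00, t10, t01, t11 = samples
    top = t00 * (1 - fx) + t10 * fx
    bottom = t01 * (1 - fx) + t11 * fx
    return top * (1 - fy) + bottom * fy

# Classic path: four separate single-channel lookups.
classic = (fetch1(shadow_map, 1, 1), fetch1(shadow_map, 2, 1),
           fetch1(shadow_map, 1, 2), fetch1(shadow_map, 2, 2))
# Fetch4 path: one lookup returns the same 2x2 neighborhood.
assert classic == fetch4(shadow_map, 1, 1)

print(bilinear(fetch4(shadow_map, 1, 1), 0.5, 0.5))  # 0.45
```

The point is simply that the classic path issues four fetch1 lookups to feed the same filter that a single Fetch4-style lookup feeds in one go.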

That's really the big news in feature changes for this part. The actual meat of the R580 comes in something Tim Allen could get behind with a nice series of manly grunts: More power. More power in the form of a 384 million transistor 90nm chip that can push 12 quads (48 pixels) worth of data around at a blisteringly fast 650MHz. Why build something different when you can just triple the hardware?



To be fair, it's not a straight tripling of everything, and the result looks more like four X1600 parts than three X1800 parts. The proportions match what we see in the current midrange part: all you need for efficient processing of current games is a three to one ratio of pixel pipelines to render backends or texture units. When the X1000 series initially launched, we did look at the X1800 as a part that had as much crammed into it as possible, while the X1600 was a little more balanced. Focusing on pixel horsepower makes more efficient use of texture and render units when processing complex and interesting shader programs. If a shader program does more math than texture loads, we don't need enough hardware to load a texture every single clock cycle for every pixel; requests can be queued up and aggregated to keep the available resources consistently busy. And since texture loads must already be scheduled around latency (even a trip to local video memory isn't instantaneous yet), the hardware is already equipped to handle this situation.
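The ratios are easy to check against the numbers in the spec table below:

```python
# ALU-to-texture ratios across ATI's lineup (figures from the spec table).
parts = {
    "X1800 XT": {"pixel_pipes": 16, "texture_units": 16},
    "X1600":    {"pixel_pipes": 12, "texture_units": 4},
    "X1900":    {"pixel_pipes": 48, "texture_units": 16},
}

for name, p in parts.items():
    ratio = p["pixel_pipes"] / p["texture_units"]
    print(f"{name}: {ratio:.0f}:1")

# X1800 XT: 1:1, X1600: 3:1, X1900: 3:1 -- the X1900 keeps the midrange balance.
```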

Other than keeping the number of texture and render units the same as the X1800 (giving the X1900 the same ratio of math power to texture/fill rate power as the X1600), there isn't much else to say about the new design. Yes, they increased the number of registers in proportion to the increase in pixel power. Yes, they increased the width of the dispatch unit to compensate for the added load. Unfortunately, ATI declined to let us post the HDL code for their shader pipeline, citing some ridiculous notion that their intellectual property has value. But we can forgive them for that.

This handy comparison page will have to do for now.



Details of the Cards

There are actually 4 products being launched today, three of which we were able to get our hands on for this article. We have spotted all three of the cards we tested around the internet today, so availability is immediate, and we couldn't be happier. As for pricing, ATI's MSRPs are as follows:

Radeon X1900 XTX -- $650
Radeon X1900 CrossFire Edition -- $600
Radeon X1900 XT -- $550

The CrossFire Edition of the X1900 is clocked the same as the X1900 XT and differs only in its I/O connectors and compositing engine. The X1900 XT weighs in with some very high clock speeds, especially for the number of pixel pipelines it supports. If you are worried about the CrossFire card holding back the XTX, don't be: the XTX only sees about a 4% increase in core clock speed and a 7% increase in memory clock speed over the stock X1900 XT.

ATI X1000 Series Features

                     X1900 XT(X)       X1600      X1800 XL   X1800 XT
Vertex Pipelines     8                 5          8          8
Pixel Pipelines      48                12         16         16
Core Clock           625 (650) MHz     590 MHz    500 MHz    625 MHz
Memory Size          512MB             256MB      256MB      512MB
Memory Data Rate     1.45 (1.55) GHz   1.38GHz    1GHz       1.5GHz
Texture Units        16                4          16         16
Render Backends      16                4          16         16
Z Compare Units      16                8          16         16
Maximum Threads      512               128        512        512


So, while the price gap between the XTX, XT, and CrossFire versions of the card would seem to indicate sizeable performance differences, we can definitively say that this is not the case. The XTX is only marginally faster even on paper, and, as we will see, the real-world gap is even smaller. Our advice is to save your money and go with the cheaper XT: 18% more cost for at best 7% more performance is all that the XTX gives.
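As a quick sanity check on those percentages, using nothing but the MSRPs and clock speeds listed above:

```python
# XTX premium over the XT, from the MSRPs and clocks in the table above.
xt  = {"price": 550, "core_mhz": 625, "mem_ghz": 1.45}
xtx = {"price": 650, "core_mhz": 650, "mem_ghz": 1.55}

price_premium = (xtx["price"] / xt["price"] - 1) * 100        # ~18.2%
core_gain     = (xtx["core_mhz"] / xt["core_mhz"] - 1) * 100  # 4.0%
mem_gain      = (xtx["mem_ghz"] / xt["mem_ghz"] - 1) * 100    # ~6.9%

print(f"{price_premium:.0f}% more money for {core_gain:.0f}% core "
      f"and {mem_gain:.0f}% memory clock")
# 18% more money for 4% core and 7% memory clock
```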



One Last Thing: There's an All-In-Wonder Version Too

The All-In-Wonder version of this card isn't lagging too far behind this time around. Previous AIW launches have seen at least a little gap between the launch of the card the AIW is based on and its announcement. This time, ATI is being proactive and bringing out an AIW version of the X1900 immediately.

The card is a single slot solution, clocked a little lower, and aside from being the cheapest X1900 around, it also features all the bells and whistles AIW users have come to know and love. The price tag doesn't exactly scream bargain, but considering all the features packed into this part, it's obviously not going to be a slouch. While the specs for the part are a considerable cut from the faster cards in the series, the combination of positives on this card is compelling. Here's a breakdown, followed by a quick look at how far those clocks sit below the X1900 XT's:

All-In-Wonder X1900:
Core clock speed: 500 MHz
Memory clock speed: 960 MHz
Price (MSRP): $500
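Again, nothing but spec-sheet arithmetic, comparing the AIW's clocks to the X1900 XT's:

```python
# How far the AIW X1900's clocks sit below the X1900 XT's (spec sheet math).
aiw = {"core_mhz": 500, "mem_mhz": 960}
xt  = {"core_mhz": 625, "mem_mhz": 1450}

core_cut = (1 - aiw["core_mhz"] / xt["core_mhz"]) * 100  # 20.0
mem_cut  = (1 - aiw["mem_mhz"] / xt["mem_mhz"]) * 100    # ~33.8

print(f"core clock {core_cut:.0f}% lower, memory clock {mem_cut:.0f}% lower")
```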

Though the All-In-Wonder series is always sold first in North America (all AIW parts bought here are built by ATI), we haven't seen much in the way of availability for this part today. The card is listed at ATI's own store as out of stock and will ship when available. While the focal point of the launch is on the three main products we tested today, we would have preferred that ATI hold off on the announcement of this part until volume was available. We are more inclined to believe ATI's promise that the AIW will be available in the next couple weeks now that we've seen them deliver so well on this hard launch, and we'll try to test one as soon as possible to see how the reduced clocks affect real world performance.



Not Quite Ready: The Ultimate Gamer Platform, RD580

Today's launch was actually supposed to be even more impressive, as ATI had planned to debut their brand new chipset, the RD580 (which we previewed here), to serve as the ultimate gamer's platform on which to run the new R580 GPU. However, thanks to ATI's newfound focus on availability, the RD580 launch was pushed back to coincide with actual availability of product. So while today's tests are done on an RD480 motherboard, the real platform of choice, especially for CrossFire users, is supposed to be the RD580.





We are expecting RD580 samples in the coming months, and you can expect retail availability around the same time; until then, we will have to make do with last year's platform.



Hardware Features and Test Setup

We're talking about features and tests today because we are going to be trying something a bit different this time around. In addition to our standard noAA/4xAA tests (both of which always have 8xAF enabled), we are including a performance test at maximum image quality on each architecture. This won't give us directly comparable numbers in terms of performance, but it will give us an idea of playability at maximum quality.

These days, we are running out of ways to push our performance tests. Plenty of games out there are CPU limited, and for what purpose is a card as powerful as an X1900 XTX or 7800 GTX 512 purchased except to be pushed to its limit and beyond? Certainly, a very interesting route would be for us to purchase a few Apple Cinema Displays and possibly an old IBM T221 and go insane with resolution. And maybe we will at some point. But for now, most people don't have 30" displays (though the increasing power of today's graphics cards is certainly a compelling argument for such an investment). For now, people can push their high end cards by enabling insane features and getting the absolute maximum eye candy possible out of all their games. Flight and space sim nuts now have angle independent anisotropic filtering on ATI hardware, adaptive antialiasing for textured surfaces helps in games with lots of fences, wires, and tiny detail work, and 6xAA combined with 16xAF means you'll almost never have to look at a blurry texture with jagged edges again. It all comes at a price, of course, but is it worth it?

In our max quality tests, we will compare ATI parts (16xAF, 6xAA, adaptive AA, high quality AF, and as little Catalyst AI as possible) to NVIDIA parts (16xAF, 4x or 8xS AA depending on reasonable support in the application, transparency AA, and no optimizations, i.e. high quality). In all cases, ATI will have the image quality advantage with angle independent AF and 6x MSAA. Some games with in-game AA settings didn't have an option for 8xAA and didn't play well when we forced it in the driver, so we opted to go with the highest in-game AA setting most of the time (which, again most of the time, matches the highest MSAA level supported in hardware). We tend to like NVIDIA's transparency SSAA a little better than ATI's adaptive AA, but that may just come down to opinion, and it still doesn't make up for the quality advantages the X1900 holds over the 7800 GTX lineup.

Our standard tests should look pretty familiar, and here is all the test hardware we used. Multiple systems were required in order to test both CrossFire and SLI, but all single card tests were performed on the ATI reference RD480 board.

ATI Radeon Express 200 based system
NVIDIA nForce 4 based system
AMD Athlon 64 FX-57
2x 1GB DDR400 2:3:2:8
120 GB Seagate 7200.7 HD
600 W OCZ PowerStream PSU

First up is our apples-to-apples testing, with NVIDIA and ATI set up to produce comparable image quality with 8xAF and either no AA or 4xAA. The resolutions we will look at run from 1280x960 (or 1280x1024) through 2048x1536.



The Performance Breakdown

Here we're going to take a quick look at the overall performance of the X1900 XTX compared to the X1800 XT and the 7800 GTX 512. This will give us a good idea at the outset of what we are going to see in terms of performance from ATI's new part. Obviously, individual numbers for multiple resolutions over multiple settings are more conducive to proper analysis of the performance characteristics of the hardware, but for those who just want the bottom line, here it is: a look at 2048x1536 with 4xAA, as a snapshot of performance under high stress.






The resounding victory of the X1900 XTX over the 7800 GTX 512 in almost every performance test clearly shows how powerful a part we are playing with. NVIDIA has been dethroned and will have a difficult time regaining its performance lead. But the more these two companies can leap-frog each other, the happier we get.

The only real loss the X1900 suffers to the 7800 GTX 512 is in Black and White 2. We've complained about the poor performance of BW2 on ATI hardware for months now, and apparently ATI has located a bug in the application causing the performance issue. They have a patch, which we are currently evaluating, that improves performance. ATI says that Lionhead will be including this fix in an upcoming game patch, and we are excited to see something finally being done about this issue.

Of course, with the BW2 results called into question, that puts the ATI Radeon X1900 XTX firmly and without question in place as the world's fastest consumer level graphics product.



Battlefield 2 Performance

Battlefield 2 has been a standard for performance benchmarks here in the past, and it's probably one of our most important tests. Thanks to its impressive engine, this game still stands out as one of those that make the best use of the graphics hardware available right now.

One of the first things to note here is a theme that runs throughout all of our performance tests in this review: the X1900 XTX and X1900 XT perform very similarly to each other, in some places differing by only a couple of frames per second. This is significant considering that the X1900 XTX costs about $100 more than the X1900 XT.

Below we have two sets of graphs for three different settings: no AA, 4xAA/8xAF, and maximum quality (higher AA and AF settings in the driver). Note that our benchmark for BF2 had problems with NVIDIA's SLI, so we were forced to omit those numbers. Both with and without AA, the ATI cards perform very similarly to each other, as do the NVIDIA cards. Generally, though, since ATI tends to do a little better with AA than NVIDIA, they hold a slight edge here. With the maximum quality settings, we see a large reduction in performance, which is expected. Something to keep in mind is that in the driver options, NVIDIA can enable AA up to 8x while ATI can only go up to 6x, so these numbers aren't directly comparable.

Battlefield 2 - No AA

Battlefield 2 - 4X AA

Battlefield 2 - Maximum Quality



Black and White 2 Performance

Black and White 2 is a god sim with a very sophisticated graphics and physics engine. One very interesting thing about this game is its advanced in-game anti-aliasing option. While only offering "low", "medium", and "high" AA settings, the game's AA looks surprisingly good, as we will show in the image quality section.

Here with Black and White 2, NVIDIA does quite a bit better than ATI. It would be an understatement to say that this game favors NVIDIA over ATI, due to the aforementioned problem ATI has with this game. Not only do the X1900s perform much worse than the GTX (without AA), but performance actually drops further when CrossFire is enabled. Keep in mind, though, that ATI has promised a patch, and this issue will hopefully be resolved soon.

Also, Black and White 2 happens to be possibly the most graphically intensive game in this review, so NVIDIA's parts also struggle significantly at high resolutions and with AA. This game appears to put even the mighty 7800 GTX 512 SLI setup to the test, but we still see a playable framerate at the highest resolution with AA enabled. Note that we did not include maximum quality tests here, because the in-game AA did a far better job on image quality without nearly the same drop in performance.

Black and White 2 - No AA

Black and White 2 - High AA





Day of Defeat Performance

Day of Defeat uses Valve's new HDR technology on the Half-Life 2 engine, which makes this game a good performance benchmark. One of the most interesting things to note here is how much of a performance hit NVIDIA takes when maximum quality settings are enabled in the control panel. Specifically, the 7800 GTX 512 gets roughly half the framerate with the max settings enabled as without.

With this game, we've omitted tests without AA enabled because higher-end cards tend to run into a CPU limitation. Notice that while ATI gets only slightly better scores than NVIDIA with AA enabled, when maximum quality is enabled in the driver, the gap widens considerably and ATI does a much better job across resolutions. ATI gets playable framerates at the highest resolutions with maximum quality enabled, but without an SLI setup, NVIDIA can't really manage similar settings (18.9 fps at 2048x1536 and 22 fps at 1920x1440 with max quality).

Day of Defeat - 4X AA

Day of Defeat - Maximum Quality





Far Cry Performance

Far Cry is an older game with graphics that, while still good, are starting to look dated. However, this game still provides us with a good performance test as it offers lots of graphics options to bump up the stress on any GPU.

Something interesting we see here is how Far Cry favors ATI consistently until the maximum quality settings are enabled. With max quality, the 7800 GTX 512 SLI setup dominates the other cards, including the X1900 XTX CrossFire (48 fps at 2048x1536 versus 16 fps). Without AA enabled, the results are very similar between the ATI and NVIDIA cards, but when 4xAA is enabled, ATI does noticeably better.

Far Cry - No AA

Far Cry - 4X AA

Far Cry - Maximum Quality





F.E.A.R. Performance

Notoriously demanding on GPUs, F.E.A.R. has the ability to put a very high strain on graphics hardware, and is therefore another great benchmark for these ultra high-end cards. The graphical quality of this game is high, and it's highly enjoyable to watch these cards tackle the F.E.A.R. demo.

With F.E.A.R. we have a situation similar to Black and White 2, in that the game does so much better with its own maximum quality settings that using the driver's max settings would be impractical. We find that this game favors ATI hardware overall, both with and without 4xAA enabled. It looks like just the kind of thing the X1900 XTX CrossFire setup was made for, as it handles 2048x1536 with 4xAA with ease (49 fps).

F.E.A.R. - No AA

F.E.A.R. - 4X AA





Splinter Cell: Chaos Theory Performance

Splinter Cell: Chaos Theory provides us with a benchmark for a different type of game. Stealth and patience are the themes here, giving us an alternative to fast-paced, action-oriented games. Variety is important when considering graphics performance, and games like this offer us just that.

Because this game is somewhat slow paced, a framerate of around 19 or 20 fps isn't that bad. This is a good thing, because even with the highest settings, none of these cards drop below this number. For the most part, the CrossFire and SLI setups do very well, with ATI doing better than NVIDIA both with and without 4xAA.

Splinter Cell: Chaos Theory - No AA

Splinter Cell: Chaos Theory - 4X AA

Splinter Cell: Chaos Theory - Maximum Quality





Quake 4 Performance

Quake 4 gives us a good performance benchmark based on the Doom 3 engine. The game looks a bit better than Doom 3 but, interestingly, performs about the same. Without AA enabled, NVIDIA does a little better than ATI, especially at the higher resolutions. With 4xAA, NVIDIA is still a bit better in general, but the gap closes significantly, and the CrossFire setups come out ahead of the 7800 GTX 512 SLI.

Quake 4 - No AA

Quake 4 - 4X AA

Quake 4 - Maximum Quality





Image Quality, Feature Tests, and Power

Something we'd like to look at a bit more in-depth for this review is image quality. It's no secret that due to ATI and NVIDIA's differences in rendering graphics, there is always going to be some variation in the look of the graphics from one brand to another. Most times this variation is too subtle to notice, but upon closer inspection, certain patterns tend to emerge.



Image quality comparison screenshots: ATI vs. NVIDIA (full-resolution images).


Battlefield 2 gives us a good view of how the maximum quality settings in the control panel (specifically transparency AA) fix certain graphical problems in games. Fences in particular have a tendency to render inaccurately, especially when looking through them at certain angles. While you can see that the in-game AA without adaptive or transparency AA cleans up a lot of jagged edges (the flag pole for instance), it still has trouble with parts of the fence.

As for power, we ran the multitexturing and pixel shader feature tests under 3DMark06 and measured the maximum power load with our trusty Kill-A-Watt. This measures power at the wall, before the PSU, so it reflects the whole system rather than just the graphics cards.

We can see the CrossFire and SLI systems pull insane amounts of power, but even as a single card, the X1900 XTX is a very hungry part.
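One caveat worth quantifying: since the Kill-A-Watt reads AC power at the wall, PSU losses are baked into every number. Here's a rough sketch of how to back out an estimated DC-side load; the 75% PSU efficiency figure and the 400 W example reading are our assumptions for illustration, not something we measured.

```python
# Estimate DC-side load from a wall reading (AC watts from the Kill-A-Watt).
# The 75% PSU efficiency here is an assumed figure, not a measured one.

def dc_estimate(wall_watts: float, psu_efficiency: float = 0.75) -> float:
    """Approximate DC load delivered by the PSU for a given wall draw."""
    return wall_watts * psu_efficiency

# A hypothetical 400 W wall reading would imply roughly 300 W of DC load.
print(dc_estimate(400.0))  # 300.0
```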

Load Power




Final Words

The numbers really do speak for themselves: the X1900 XTX is an incredible part. In the end, the difference in performance between the X1900 XT and XTX was so small that it's hard for us to see how anyone could justify spending another hundred dollars to have someone at a factory eke out that extra little bit of performance. ATI's justification for the X1900 XTX is that it is a pre-overclocked X1900 XT with a $100 manufacturer's stamp of approval.

ATI stands behind its position that the X1900 XTX isn't going to be another X800 XT PE, but will be a full production part with plenty of availability throughout its lifetime. Our first reaction is, with the voice of Chris Rock echoing in our ears: "what do you want, a cookie?" But then reality sets in, and we are happy to take what we can get... as long as ATI actually delivers on its promises.

But what an excellent position from which to start following through on everything: the R580 launch is a resounding success in our eyes. Availability at launch, four new parts based on a huge and powerful chip, a triumphant return to the top with the new fastest graphics card available, and enough power to make the high quality features of the architecture more than usable. ATI couldn't have asked for anything better, and they certainly would not have been in a good position had they come up with anything less.

There was some question over whether the X1900 CrossFire would be a letdown with its XT clock speeds, but the difference between reality and the theoretical performance of two X1900 XTX parts in CrossFire is even smaller than the difference between an X1900 XTX and an X1900 XT. If there's anything worth seriously questioning, it is why anyone would think a 4% core overclock combined with a 7% memory overclock is worth $100.

One of the interesting non-performance related aspects of this launch is that ATI is phasing out the X1800 series. Their future roadmaps seem to leave a gap in the price range from $200 to $500, so it will be quite interesting to see what ATI fills the hole with this time around. Maybe we'll see some X1600 GTO parts with unlockable R520/R580 cores. Or maybe we'll see another product launch. Only time will tell.
