R580 Architecture

The architecture itself is not that different from the R520 series. A couple of tweaks found their way into the GPU, but these consist mainly of the same improvements the RV515 and RV530 made over the R520, thanks to their longer lead times (the only reason all three parts arrived at nearly the same time was a bug that delayed the R520 by a few months). For a quick look at what's under the hood, here's the R520 and R580 vertex pipeline:

[Diagram: R520 and R580 vertex pipeline]

and the internals of each pixel quad:

[Diagram: internals of a pixel quad]

The real feature of interest is the ability to load and filter four values from a single-channel texture map at once. Textures that describe color generally have four components at every location, and the hardware will normally load a texel, split its four channels, and filter them independently. Where single-channel textures are used (ATI likes to use the example of a shadow map), the R520 looks up the appropriate address and filters that one channel, letting the hardware's ability to filter three other components go to waste. With what ATI calls its Fetch4 feature, the R580 is able to load three other adjacent single-channel values from the texture and filter them at the same time. This effectively loads and filters four times the texture data per clock when working with single-channel formats. Traditional color textures, or textures describing vector fields (which use more than one channel per position in the texture), will not see any performance improvement, but for some soft shadowing algorithms the performance increase could be significant.
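
For illustration's sake, here's a quick CPU-side sketch of the idea (ours, not ATI's hardware interface; all names and numbers here are made up) contrasting a one-texel-at-a-time lookup with a Fetch4-style gather feeding a percentage-closer shadow filter:

    // Contrast a naive single-channel lookup with a Fetch4-style gather.
    // Hypothetical CPU-side sketch; this is not ATI's actual interface.
    #include <cstdio>

    struct Texel4 { float v[4]; };  // a 2x2 neighborhood of one-channel texels

    // One texel at a time: filter hardware built for 4 channels sits idle.
    float fetch1(const float* map, int w, int x, int y) {
        return map[y * w + x];
    }

    // Fetch4-style: one request returns four adjacent values, keeping the
    // four-channel filter units fully fed on single-channel data.
    Texel4 fetch4(const float* map, int w, int x, int y) {
        return { { fetch1(map, w, x,     y),
                   fetch1(map, w, x + 1, y),
                   fetch1(map, w, x,     y + 1),
                   fetch1(map, w, x + 1, y + 1) } };
    }

    // Percentage-closer filtering, the soft-shadow case ATI cites: compare
    // each neighbor's stored depth against the pixel's depth, then average.
    float pcf(const float* shadowMap, int w, int x, int y, float depth) {
        Texel4 t = fetch4(shadowMap, w, x, y);
        float lit = 0.0f;
        for (float s : t.v) lit += (depth <= s) ? 1.0f : 0.0f;
        return lit * 0.25f;  // fraction of the 2x2 footprint that is lit
    }

    int main() {
        // Tiny 2x2 shadow map: two texels block the light (depth 0.2).
        float shadowMap[4] = { 0.2f, 0.9f,
                               0.9f, 0.2f };
        std::printf("lit fraction: %.2f\n", pcf(shadowMap, 2, 0, 0, 0.5f));
        return 0;
    }

The point is simply that the naive path needs four round trips through the sampler for what Fetch4 gets in one.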

That's really the big news in feature changes for this part. The actual meat of the R580 comes in something Tim Allen could get behind with a nice series of manly grunts: More power. More power in the form of a 384 million transistor 90nm chip that can push 12 quads (48 pixels) worth of data around at a blisteringly fast 650MHz. Why build something different when you can just triple the hardware?



To be fair, it's not a straight tripling of everything, and the result looks more like four X1600 parts than three X1800 parts. The proportions match what we see in the current midrange part: for efficient processing of current games, a three-to-one ratio of pixel pipelines to render back ends and texture units is all you need. When the X1000 series initially launched, we looked at the X1800 as a part with as much crammed into it as possible, while the X1600 was a little more balanced. Focusing on pixel horsepower makes more efficient use of the texture and render units when processing complex and interesting shader programs: if a shader program does more math than texture loading, there's no need for enough hardware to load a texture every single clock cycle for every pixel, because requests can be queued and aggregated to keep the available resources busy more consistently. And since texture loads already have to be queued to hide latency (even a trip to local video memory isn't instantaneous yet), the hardware to handle this situation is already in place.
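
To put a rough number on that balance argument, here's a toy model (ours, not ATI's scheduler; the 9:3 shader mix is purely illustrative) where per-pixel throughput is set by whichever unit type is the bottleneck:

    // Toy throughput model: the busier unit type sets the pace.
    #include <algorithm>
    #include <cstdio>

    // Cycles per pixel when math and texture work proceed in parallel.
    double cyclesPerPixel(double mathOps, double texOps,
                          double aluUnits, double texUnits) {
        return std::max(mathOps / aluUnits, texOps / texUnits);
    }

    int main() {
        // A hypothetical shader with 9 math ops and 3 texture fetches per
        // pixel: a 1:1 design stalls on math, a 3:1 design keeps both unit
        // types busy, and going past 3:1 buys nothing for this mix.
        std::printf("1:1 -> %.0f cycles/pixel\n", cyclesPerPixel(9, 3, 1, 1));
        std::printf("3:1 -> %.0f cycles/pixel\n", cyclesPerPixel(9, 3, 3, 1));
        std::printf("4:1 -> %.0f cycles/pixel\n", cyclesPerPixel(9, 3, 4, 1));
        return 0;
    }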

Other than keeping the number of texture and render units the same as the X1800 (giving the X1900 the same ratio of math to texture/fill rate power as the X1600), there isn't much else to say about the new design. Yes, they increased the number of registers in proportion to the increase in pixel power. Yes, they increased the width of the dispatch unit to handle the added load. Unfortunately, ATI declined to let us post the HDL code for their shader pipeline, citing some ridiculous notion that their intellectual property has value. But we can forgive them for that.
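
As a back-of-the-envelope illustration of why the register file has to grow along with the shader count (our own model with made-up numbers, not anything ATI disclosed), latency tolerance comes down to how many pixels fit in the register file at once:

    // Made-up numbers, ours not ATI's: the pixels in flight per shader
    // unit bound how much texture latency the core can hide.
    #include <cstdio>

    int pixelsInFlight(int registerFileEntries, int registersPerPixel) {
        return registerFileEntries / registersPerPixel;
    }

    int main() {
        const int perPixel = 8;     // hypothetical registers per pixel
        const int baseFile = 4096;  // hypothetical baseline register file

        int base = pixelsInFlight(baseFile, perPixel);
        std::printf("1x units, 1x registers: %d pixels/unit\n", base);
        // Triple the shader units without growing the file and each unit
        // has a third as many threads to switch between while loads are
        // outstanding...
        std::printf("3x units, 1x registers: %d pixels/unit\n", base / 3);
        // ...so the file scales with the pixel power, as on the R580.
        std::printf("3x units, 3x registers: %d pixels/unit\n",
                    pixelsInFlight(3 * baseFile, perPixel) / 3);
        return 0;
    }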

This handy comparison page will have to do for now.

Comments

  • tuteja1986 - Tuesday, January 24, 2006

    Wait for the FiringSquad review then :) if you want 8xAA.
  • beggerking - Tuesday, January 24, 2006

    Did anyone else notice? The breakdown graphs don't quite reflect the actual data...

    The breakdown shows the 1900 XTX being much faster than the 7800 GTX 512, but in the actual performance graphs the 1900 XTX is sometimes outpaced by the 7800 GTX 512...
  • SpaceRanger - Tuesday, January 24, 2006

    All the second-to-last section describes is the Image Quality. There was no explanation of power consumption at all. Was this an accidental omission or something else?
  • Per Hansson - Tuesday, January 24, 2006

    Yes, please show us the power consumption ;-)

    A few things I would like to see done: put a low-end PCI graphics card in the computer, boot it, and record the power consumption; then leave that card in and run your normal tests with a single X1900 and then dual, so we get a real data point on how much power they consume...

    Also, please clarify exactly what PSU was used and how the consumption was measured, so we can figure out more accurately how much power the card really draws (when factoring in the (in)efficiency of the PSU, that is)...
  • peldor - Tuesday, January 24, 2006

    That's a good idea for isolating the power draw of the video card.

    From the other reviews I've read, the X1900 cards are seriously power hungry. In the neighborhood of 40-50W more than the X1800XT cards. The GTX 512 (and GTX of course) are lower than the X1800XT, let alone the X1900 cards.
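
The subtraction method suggested in the two comments above is easy enough to sketch (our illustration; the wall readings and the efficiency figure below are made up):

    // Sketch of the card-isolation method suggested above. All readings
    // are hypothetical; a real test would use a wall meter and the PSU's
    // measured efficiency at that load, not a single constant.
    #include <cstdio>

    // The wall meter sees DC load divided by PSU efficiency, so the extra
    // AC draw overstates the card's true draw by 1/efficiency.
    double cardPowerWatts(double wallWithCard, double wallBaseline,
                          double psuEfficiency) {
        return (wallWithCard - wallBaseline) * psuEfficiency;
    }

    int main() {
        // Hypothetical: 310W at the wall under load with the X1900 added,
        // 190W with only the low-end baseline card, PSU ~80% efficient.
        std::printf("Estimated card draw: %.0f W\n",
                    cardPowerWatts(310.0, 190.0, 0.80));
        return 0;
    }
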
  • vaystrem - Tuesday, January 24, 2006

    Anyone else find this interesting?

    Battlefield 2 @ 2048x1536 Max Detail
    7800 GTX 512: 33 FPS
    ATI 1900 XTX: 32.9 FPS
    ATI 1900 XTX CrossFire: 29 FPS
    -------------------------------------
    Day of Defeat
    7800 GTX 512: 18.93 FPS
    ATI 1900 XTX: 35.5 FPS
    ATI 1900 XTX CrossFire: 35 FPS
    -------------------------------------
    F.E.A.R.
    7800 GTX 512: 20 FPS
    ATI 1900 XTX: 36 FPS
    ATI 1900 XTX CrossFire: 49 FPS
    -------------------------------------
    Quake 4
    7800 GTX 512: 43.3 FPS
    ATI 1900 XTX: 42 FPS
    ATI 1900 XTX CrossFire: 73.3 FPS


  • DerekWilson - Tuesday, January 24, 2006

    Be careful here... these max detail settings enabled SuperAA modes, which really killed performance, especially with all the options flipped on for quality.

    We're working on getting some screens up to show the IQ difference, but suffice it to say that the max detail settings are very apples-to-oranges.

    We would have seen performance improvements if we had simply kept using 6xAA...
  • DerekWilson - Tuesday, January 24, 2006

    To further clarify: F.E.A.R. didn't play well when we set AA outside the game, so its max quality ended up using the in-game 4xAA setting; thus we see a performance improvement.

    For Day of Defeat, forcing AA/AF through the control panel works well, so we were able to crank up the quality there.

    I'll try to go back and clarify this in the article.
  • vaystrem - Wednesday, January 25, 2006

    I'm not sure how that justifies what happens. Your argument is that because these are the VERY highest settings, it's OK for the 'dual' 1900 XTX to have lower performance than a single-card alternative? That doesn't seem to make sense, and it speaks poorly of the ATI implementation.
  • Lonyo - Tuesday, January 24, 2006

    The XTX, especially in CrossFire, does seem to give a fair boost over the XT and XT CrossFire in a number of tests.
