Texturing, Caches, and Memory

Texturing

R600 features less texture hardware than we would expect to see, though AMD stands by the argument that compute power will come out on top when it matters. At the same time, we can't compute anything if we don't have any data to work with. So let's take a look at what AMD has done with their texture units.

There are four texture units in R600, one for each SIMD unit. These units don't share resources with the hardware in the SIMD units and are independently scheduled by AMD's dispatch processor. The dispatch processor is able to determine what data will be needed for threads about to execute and can handle setting up the texture units without waiting for the SIMD unit to request data and come up empty.

Texture units on the R600 are able to make both filtered and unfiltered texture requests no matter what shader is running. Unfiltered textures are useful with non-image-based texture data like vertex textures, normal maps, and generic blocks of data. Filtered requests will generally be for image data to be used in determining the color of a pixel. R600 can address one unfiltered texture per clock per texture unit and one filtered textures per clock per texture unit. Filtered units can be used to request unfiltered textures if necessary, providing an extra four unfiltered textures in place of one filtered texture.

The unfiltered texture requests will come back through four fp32 texture samplers (one per component), while the filtered requests will return 16 data points which will be run through the texture filtering hardware resulting in four filtered texture samples. The hardware can at best produce 32 single component fp16 unfiltered results per texture unit per clock. More practically, each texture unit can produce four bilinear filtered four component fp16 samples per clock alongside four unfiltered results. For textures with fp32 components, two clocks would be required to complete a bilinear filter process, as only half the data is loaded at a time to conserve bandwidth.

This is definitely a step up for R600, as R5xx hardware doesn't have texture filtering hardware for floating point textures. All told, with each of its four texture units working, R600 can consume up to 32 unfiltered textures or 16 unfiltered textures plus 16 filtered textures (as long as they're fp16 or fewer bits and we're only using bilinear filtering).

G80 is built with four texture address units and eight texture filters per block of 16 SPs. In total, this means NVIDIA's hardware can produce 32 filtered texture samples per clock (again these are fp16 and bilinear filtered). Of course, NVIDIA is operating on twice as many threads per clock, so it is conceivable that they would benefit more from having the extra filtered data.

We will have to wait and see if AMD's approach of providing unfiltered and filtered texture access in parallel pays off. For the general case on pixel shaders, we would want to see more filtered textures per clock, but with vertex and geometry shaders coming into the mix this could be a good way to save hardware space while offering more texturing power. On a final texturing note, AMD implemented "percentage closer" filter hardware for depth stencil textures. This will allow developers to implement fast soft shadows. The details of the implementation weren't indicated though.

Next Up: NVIDIA's G80 Finally: A Design House Talks Cache Size
Comments Locked

86 Comments

View All Comments

  • GoatMonkey - Monday, May 14, 2007 - link

    That's obviously BS. This IS their high end part, it just doesn't perform as well as nVidia's high end part, so it is priced accordingly.
  • poohbear - Monday, May 14, 2007 - link

    sweet review though! thanks for including all the important and pertinent cards in your roundup (the 8800gts 320mb inparticular). also love how neutral Anand is in their reviews, unlike some other sites.:p
  • Creig - Monday, May 14, 2007 - link

    The R600 is finally here. I'm sure the overall performance is not what AMD was hoping for. Nobody ever shoots to have their newest product be the 2nd best. But pricing it at $399 and including a very nice game bundle will make the HD 2900 XT a VERY worthwhile purchase. I also have the feeling that there is a significant amount of performance increase to be realized through future driver releases ala X1800XT.
  • shady28 - Tuesday, May 15, 2007 - link


    Nvidia has gone over the cliff on pricing.

    I know of no one personally who has an 88xx series card. I know one who recently picked up an 8600 of some kind, that's it. I have the best GPU of anyone I know.

    It's a real shame that there is so much focus on graphics cards that virtually no one buys. These are niche products folks - yet 'who is best' seems to be totally dependent on these niche products. That's patently ridiculous.

    It's like saying, since IBM makes the fastest computers in the world (they do), they're the best and you should be buying IBM (or now, lenovo) laptops and desktops.

    No one ever said that sort of thing because it's patently ridiculous. Why do people say it now for graphics cards? The fact that they do says a lot about the mentality of sites like AT.
  • DerekWilson - Tuesday, May 15, 2007 - link

    We don't say what you are implying, and we are also very upset with some of NVIDIA's pricing (specifically the 8800 ultra)

    the 8800 gts 320mb is one of the best values for your money anywhere and isn't crazy expensive -- it's actually the card I'd recommend to anyone who cares about graphics in games and wants good quality and performance at 1600x1200.

    I would never tell anyone to buy an 8600 gts because nvidia has the fastest high end card. In fact, in this article, I hope I made it clear that AMD has the opportunity to capitalize on the huge performance gap nvidia left between the 8600 and 8800 series ... If AMD builds a part that performs in this range is priced competitively, they'll have our recommendation in a flash.

    Recommending parts based on value at each price or performance segment is something we take pride in and will always do, no matter who has the absolute fastest hardware out there.

    The reason our focus was on AMD's fastest part is because they haven't given us any other hardware to test. We will absolutely be talking a lot and in much depth about midrange and budget hardware when AMD makes these parts available to us.
  • yacoub - Monday, May 14, 2007 - link

    $400 is a lot of money. Not terribly long ago the highest end GPU available didn't cost more than $400. Now they hit $750 so you start to think $400 sounds cheap. It's really not. It's a heck of a lot of money for one piece of hardware. You can put together a 650i SLI rig with 2GB of DDR2 6400 and an E4400 for that much money. I know because I just did that. I kept my 7900GT from my old rig because I wanted to see how R600 did before purchasing an 8800GTS 640MB. Now that we've seen initial results I will wait to see how R600 does with more mature drivers and also wait to see the 640MB GTS price come down even more in the meantime.
  • vijay333 - Monday, May 14, 2007 - link

    http://www.randomhouse.com/wotd/index.pperl?date=1...">http://www.randomhouse.com/wotd/index.pperl?date=1...

    "the expression to call a spade a spade is thousands of years old and etymologically has nothing whatsoever to do with any racial sentiment."

  • yacoub - Monday, May 14, 2007 - link

    Yes, a spade was a shovel long before muslims enslaved europeans to do hard labor in north africa and europeans enslaved africans to do hard labor in the 'new world'.
  • vijay333 - Monday, May 14, 2007 - link

    whoops...replied to the wrong one.
  • rADo2 - Monday, May 14, 2007 - link

    It is not 2nd best (after 8800ULTRA), not 3rd best (after 8800GTX), not 4th best (after 8800GTX-640), but 5th best (after 8800GTS-320), or even worse ;)

    Bad performance with AA turned on (everybody turns on AA), huge power consumption, late to the market.

    A definitive failure.

Log in

Don't have an account? Sign up now