R600 Overview

From a very high level, we have the same capabilities we saw in the G80, where each step in the pipeline runs on the same hardware. There are a lot of similarities when stepping way back, as the same goals need to be accomplished: data comes into the GPU, gets set up for processing, shader code runs on the data, and the result either heads back up for another pass through the shaders or moves on to be rendered out to the framebuffer.

The obvious points are that R600 is a unified architecture that supports DX10. The set of requirements for DX10 is very firm this time around, so we won't see any variations in feature support on a basic level. AMD and NVIDIA are free to go beyond the DX10 spec, but these features might not be exposed through the Microsoft API without a little tweaking. AMD includes one such feature, a tessellator unit, which we'll talk about more later. For now, let's take a look at the overall layout of R600.

Our first look shows a huge amount of stream processing power: 320 SPs all told. These are a little different than NVIDIA's SPs, and over the next few pages we'll talk about why. Rather than a small number of SPs spread across eight groups, our block diagram shows R600 has a high number of SPs in each of four groups. Each of these four groups is connected to its own texture unit, while they share a connection to shader export hardware and a local read/write cache.

All of this is built on an 80nm TSMC process and uses in the neighborhood of 720 million transistors. All other R6xx parts will be built on a 65nm process with many fewer transistors, making them much smaller and more power efficient. Core clock speed is on the order of 740MHz for R600, with memory running at 825MHz.

Memory is clocked lower this time around but delivers higher bandwidth, as R600 implements a 512-bit memory bus. While we're speaking about memory, AMD has revised their Ring Bus architecture for this round, which we'll delve into later. Unfortunately, we won't really be able to compare it to NVIDIA's implementation, as they won't go into any detail with us on internal memory buses.
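To make the clock-versus-width tradeoff concrete, here is a rough back-of-the-envelope sketch of peak memory bandwidth. The 512-bit bus and 825MHz memory clock come from the figures above; the function name, the assumption of two transfers per clock (double data rate memory), and the narrower 256-bit bus used for comparison are ours, purely for illustration.

```python
def peak_bandwidth_gb_s(bus_width_bits: int,
                        memory_clock_mhz: float,
                        transfers_per_clock: int = 2) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=2 is an assumption (double data rate memory),
    not a figure taken from the article.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# R600 figures quoted above: 512-bit bus, 825MHz memory clock.
r600 = peak_bandwidth_gb_s(512, 825)

# Hypothetical 256-bit bus at the same clock, for comparison only.
half_width = peak_bandwidth_gb_s(256, 825)

print(f"512-bit @ 825MHz: {r600:.1f} GB/s")
print(f"256-bit @ 825MHz: {half_width:.1f} GB/s")
```

Under these assumptions, doubling the bus width doubles the peak figure at the same clock, which is how a lower memory clock can still end up with more bandwidth. Real-world throughput will be lower than this theoretical peak.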

And speaking of things NVIDIA won't go into detail on, AMD was good enough to share very low level details, including information on cache sizes and shader hardware implementation. We will be very happy to spend time talking about this, and hopefully AMD will inspire NVIDIA to start opening up a little more and going deeper into their underlying architecture.

To hit the other hot points, R600 does have some rather interesting unique features to back it up. Aside from including a tessellation unit, they have also included an audio processor on their hardware. This will accept audio streams and send them out over their DVI port through a special converter to integrate audio with a video stream over HDMI. This is unique, as current HDMI converters only work with video. AMD also included a programmable AA resolve feature that allows their driver team to create new ways of filtering subsample data.

R600 also features an independent DMA engine that can handle moving and managing all memory to and from the GPU, whether it's over the PCIe bus or local memory channels. This, combined with huge amounts of memory bandwidth, should really assist applications that require large amounts of data. With DX10 supporting up to 8k x 8k textures, we are very interested in seeing these limits pushed in future games.
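For a sense of scale, the sketch below estimates how much memory a single 8k x 8k texture could occupy. The 8192 x 8192 dimensions come from the DX10 limit mentioned above; the uncompressed 4-bytes-per-texel format, the optional full mipmap chain, and the function name are our assumptions for illustration, and real games would often use compressed formats.

```python
def texture_bytes(width: int, height: int,
                  bytes_per_texel: int = 4,
                  mipmaps: bool = False) -> int:
    """Approximate footprint of a 2D texture in bytes.

    bytes_per_texel=4 assumes an uncompressed 32-bit format
    (an illustrative assumption, not a figure from the article).
    With mipmaps=True, each level halves both dimensions down to 1x1.
    """
    total = 0
    w, h = width, height
    while True:
        total += w * h * bytes_per_texel
        if not mipmaps or (w == 1 and h == 1):
            break
        w, h = max(1, w // 2), max(1, h // 2)
    return total


MIB = 1024 ** 2
base = texture_bytes(8192, 8192)
with_mips = texture_bytes(8192, 8192, mipmaps=True)

print(f"8k x 8k, no mipmaps:   {base / MIB:.0f} MiB")
print(f"8k x 8k, with mipmaps: {with_mips / MIB:.1f} MiB")
```

Under these assumptions a single uncompressed 8k x 8k texture takes 256 MiB before mipmaps, which suggests why a dedicated DMA engine and plenty of bandwidth matter for data of this size.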

That's enough of a general description to whet your appetite: let's dig down under the surface and find out what makes this thing tick.

86 Comments

  • dragonsqrrl - Thursday, August 25, 2011 - link

    You forgot c).

    -if you're an ATI fanboy
  • vijay333 - Monday, May 14, 2007 - link

    http://www.randomhouse.com/wotd/index.pperl?date=1...

    "the expression to call a spade a spade is thousands of years old and etymologically has nothing whatsoever to do with any racial sentiment."
  • strikeback03 - Wednesday, May 16, 2007 - link

    What about in Euchre, where a spade can be a club (and vice versa)?
  • johnsonx - Monday, May 14, 2007 - link

    Just wait until AT refers to AMD's marketing budget as 'niggardly'...
  • bldckstark - Monday, May 14, 2007 - link

    What do shovels have to do with race?
  • Stan11003 - Monday, May 14, 2007 - link

    My big hope out of all of this is that the ATI part forces the Nvidia parts lower so I can use my upgrade option from EVGA to get a nice 8800 GTX instead of my 8800 GTS ACS3 320. However, with a quad core and a decent 2GB I have no gaming issues at all. I play at 1600x1200 (when did that become low res?) and everything is butter smooth. Without newer titles all this hardware is a waste anyway.
  • Gul Westfale - Monday, May 14, 2007 - link

    the article says that the part is not a failure, but i disagree. i switched from a radeon 1950pro to an nvidia geforce 8800GTS 320MB about a month ago, and i paid only $350US for it. now i see that it still outperforms the new 2900...

    one of my friends wanted to wait to buy a new card, he said he hoped that the ATI part was going to be faster. now he says he will just buy the 8800GTS 320, since ATI have failed.

    if they can bring out a part that competes well with the 8800GTS and price it similarly or lower then it would be worth buying, but until then i will stick with nvidia. better performance, better price, and better drivers... why would anyone buy the ATI card now?
  • ncage - Monday, May 14, 2007 - link

    My conclusion is to wait. All of the recent GPUs do great with dx9...the question is how will they do with dx10? I think it's best to wait for dx10 titles to come out. I think crysis would be a PERFECT test.
  • wingless - Monday, May 14, 2007 - link

    I agree with you. Crysis is going to be the benchmark for these DX10 cards. Its hard to tell both Nvidia and AMD's DX10 performance with these current, first generation DX10 titles (most of which have a DX9 version) because they don't fully take advantage of all the power on both the G80 or R600 yet. Its true that Crysis will have a DX9 version as well but the developer stated there are some big differences in code. I'm an Nvidia fanboy but I'm disappointed with the Pure Video and HDMI support on the 8800 series cards. ATI got this worked out with their great AVIVO and their nice HDMI implementation but for now Nvidia is still the performance champ with "simpler" hardware. The G80 and R600 continue the traditions of their manufacturers. Nvidia has always been about raw power and all out speed with few bells and whistles. ATI is all about refinement, bells and whistles, innovations, and unproven new methods which may make or break them.

    All I really want to wait for is to see how developers embrace CUDA or ATI's setup for PHYSICS PROCESSING! Both companies seem to have well thought out methods to do physics and I cant wait to see that showdown. AGEIA and HAVOK need to hop on-board and get some software support for all this good hardware potential they have to play with. Physics is the next big gimmick and you know how much we all love gimmicks (just like good 'ole 3D acceleration 10 years ago).
  • poohbear - Monday, May 14, 2007 - link

    they dont make a profit from high end parts that's why they're not bothering w/ it? that's AMD's story? so why bother having an FX line w/ their cpus?
