Different Types of Stream Processors

The first thing we need to do when looking at the R600 shader core is to define our terms. AMD and NVIDIA build and refer to their Stream Processors (SPs) differently, and that makes counting them a little more difficult. Throughout our explanation, it will help to remember from our G80 coverage that a thread refers to a vertex, primitive or pixel, and not to a stream of instructions as it would on a CPU.

Stream Processors: The NVIDIA Way

G80 has 128 SPs (for the 8800 GTX; there are 96 SPs on the 8800 GTS models) that are capable of doing a very small number of things at the same time. They can do either standard FP operations (like a MADD), a special function operation (like sine), or an integer operation. There are some cases where they can squeeze out an extra MUL, but more often than not this MUL isn't accessible. Each of these SPs operates on an individual thread (be it a vertex, primitive or pixel).

This gives us a total of up to 128 threads being processed per clock. It is important to realize that the 128 SPs aren't entirely independent of one another. That is, we can't run 128 different instructions in one clock, even though we can run several instructions across 128 different threads. We'll delve a little deeper into this shortly, but depending on the type of shader running, the same instruction must be running on multiple threads.

For NVIDIA hardware, the minimum number of threads that must be processed using the same instruction is 16 (for vertex threads). NVIDIA's block diagrams show that each group of 16 SPs shares texture, register, and cache resources, so this makes sense. Pixel shaders, which are more important from a performance perspective, must run one instruction on 32 pixels at a time. What we can extrapolate from this is that NVIDIA can issue up to eight separate instructions across all of its 128 SPs (only four if working on pixels) per clock.

128 SPs / 16 Threads per Instruction per Clock = 8 Vertex Instructions per Clock

128 SPs / 32 Threads per Instruction per Clock = 4 Pixel Instructions per Clock
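The two divisions above can be restated as a short Python sketch. The function and constant names are ours, chosen only for illustration; the figures (128 SPs, batches of 16 vertex threads or 32 pixel threads) come from the text.

```python
def instructions_per_clock(total_sps: int, threads_per_instruction: int) -> int:
    """Distinct instructions issued per clock when every group of
    `threads_per_instruction` SPs must share a single instruction."""
    return total_sps // threads_per_instruction


G80_SPS = 128       # 8800 GTX, per the article
VERTEX_BATCH = 16   # minimum vertex threads sharing one instruction
PIXEL_BATCH = 32    # pixels that must share one instruction

vertex_ipc = instructions_per_clock(G80_SPS, VERTEX_BATCH)
pixel_ipc = instructions_per_clock(G80_SPS, PIXEL_BATCH)

print(f"vertex instructions per clock: {vertex_ipc}")
print(f"pixel instructions per clock:  {pixel_ipc}")
```

Swapping in 96 for `G80_SPS` gives the same estimate for the 8800 GTS, assuming its batch sizes are unchanged.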

Stream Processors: AMD's R600

Things are a little different on R600. AMD tells us that there are 320 SPs, but these aren't directly comparable to G80's 128. First of all, most of the SPs are simpler and aren't capable of special function operations. In every block of five SPs, only one can handle special function operations in addition to regular floating point operations; the other four are limited to regular floating point work. The special function SP is also the only one able to handle integer multiply, while the other SPs can perform simpler integer operations.

This isn't a huge deal because straight floating point MAD and MUL performance is by far the limiting factor in shader performance today. The big difference is that AMD only executes one thread (vertex, primitive or pixel) across a group of five SPs.

What this means is that each of the five SPs in a block must run instructions from one thread. While AMD can run up to five scalar instructions from that thread in parallel, these instructions must be completely independent from one another. This can place a heavy burden on AMD's compiler to extract parallel operations from shader code. While AMD has gone to great lengths to make sure every block of five SPs is always busy, it's much harder to ensure that every SP within each block is always busy.
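To see why independence matters, here is a toy Python sketch of packing one thread's scalar instructions into five-wide bundles. It is a simple greedy scheduler written for illustration, not AMD's compiler; the function names and the example instruction lists are hypothetical.

```python
def pack_bundles(instructions, width=5):
    """Greedily pack scalar instructions into bundles of up to `width` slots.

    `instructions` is a list of (name, deps) pairs in program order, where
    `deps` is the set of instruction names whose results are needed. An
    instruction can join a bundle only if all of its dependencies finished
    in an earlier bundle, so everything inside a bundle is independent.
    """
    done = set()
    pending = list(instructions)
    bundles = []
    while pending:
        bundle = []
        for name, deps in pending:
            if len(bundle) < width and deps <= done:
                bundle.append(name)
        if not bundle:
            raise ValueError("circular or unsatisfiable dependency")
        done.update(bundle)
        pending = [(n, d) for n, d in pending if n not in done]
        bundles.append(bundle)
    return bundles


def utilization(bundles, width=5):
    """Fraction of SP slots that did useful work across all bundles."""
    return sum(len(b) for b in bundles) / (len(bundles) * width)


# Five independent operations: all five SPs are busy in a single bundle.
independent = [(n, set()) for n in ("a", "b", "c", "d", "e")]

# A strict dependency chain: each operation needs the previous result,
# so only one SP in the block does useful work per bundle.
chain = [("a", set()), ("b", {"a"}), ("c", {"b"}), ("d", {"c"}), ("e", {"d"})]

for label, prog in (("independent", independent), ("chain", chain)):
    bundles = pack_bundles(prog)
    print(label, bundles, f"{utilization(bundles):.0%}")
```

Real shader code falls somewhere between these two extremes, which is why the quality of the compiler matters so much for this design.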

If we take a step back, we can determine how many threads AMD is able to work on per clock. With 320 total SPs grouped into blocks of five SPs per thread, we get 320 / 5 = 64 threads per clock. And here's where it starts to get complicated. Before we go back and compare this to NVIDIA's architecture, let's go a little deeper into the implementation.
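As a rough sketch of that arithmetic in Python: the thread count follows from the article's figures, while `busy_sps` is our own back-of-the-envelope helper that assumes every thread offers the same number of independent scalar operations each clock.

```python
R600_SPS = 320       # total SPs, per AMD
SPS_PER_THREAD = 5   # one thread spans a block of five SPs

threads_per_clock = R600_SPS // SPS_PER_THREAD


def busy_sps(independent_ops_per_thread: int) -> int:
    """SPs doing useful work per clock if every thread supplies this many
    independent scalar operations (capped at the block width)."""
    ops = min(independent_ops_per_thread, SPS_PER_THREAD)
    return threads_per_clock * ops


print(f"threads per clock: {threads_per_clock}")
for ops in range(1, SPS_PER_THREAD + 1):
    share = busy_sps(ops) / R600_SPS
    print(f"{ops} independent op(s) per thread -> {busy_sps(ops)} busy SPs ({share:.0%})")
```

The spread between the first and last rows is the gap the compiler has to close.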


86 Comments


  • Roy2001 - Tuesday, May 15, 2007 - link

    The reason is, you have to pay extra $ for a power supply. No, most probably your old PSU won't have enough milk for this baby. I will stick with nVidia in future. My 2 cents.
  • Chaser - Tuesday, May 15, 2007 - link

    quote:


    While AMD will tell us that R600 is not late and hasn't been delayed, this is simply because they never actually set a public date from which to be delayed. We all know that AMD would rather have seen their hardware hit the streets at or around the time Vista launched, or better yet, alongside G80.

    First, they refuse to call a spade a spade: this part was absolutely delayed, and it works better to admit this rather than making excuses.



    Such a revealing tech article. Thanks for other sources Tom.
  • archcommus - Tuesday, May 15, 2007 - link

    $300 is the exact price point I shoot for when buying a video card, so that pretty much eliminates AMD right off the bat for me right now. I want to spend more than $200 but $400 is too much. I'm sure they'll fill this void eventually, and how that card will stack up against an 8800 GTS 320 MB is what I'm interested in.
  • H4n53n - Tuesday, May 15, 2007 - link

    Interesting enough in some other websites it wins from 8800 gtx in most games,especially the newer ones and comparing the price i would say it's a good deal?I think it's just driver problems,ati has been known for not having a very good driver compared to nvidia but when they fixed it then it'll win
  • dragonsqrrl - Thursday, August 25, 2011 - link

    lol...fail. In retrospect it's really easy to pick out the EPIC ATI fanboys now.
  • Affectionate-Bed-980 - Tuesday, May 15, 2007 - link

    I skimmed this article because I have a final. ATI can't hold a candle to NV at the moment it seems. Now while the 2900XT might have good value, I am correct in saying that ATI has lost the performance crown by a buttload (not even like X1800 vs 7800) but like they're totally slaughtered right?

    Now I won't go and comment about how the 2900 stacks up against competition in the same price range, but it seems that GTSes can be acquired for cheap.

    Did ATI flop big here?
  • vailr - Monday, May 14, 2007 - link

    I'd rather use a mid-range older card that "only" uses ~100 Watts (or less) than pay ~$400 for a card that requires 300 Watts to run. Doesn't AMD care about "Global Warming"?
    Al Gore would be amazed, alarmed, and astounded !!
  • Deusfaux - Monday, May 14, 2007 - link

    No they dont and that's why the 2600 and 2400 don't exist
  • ochentay4 - Monday, May 14, 2007 - link

    Let me start with this: i always had a nvidia card. ALWAYS.

    Faster is NOT ALWAYS better. For the most part this is true, for me, it was. One year ago I boght a MSI7600GT. Seemed the best bang for the buck. Since I bought it, I had problems with TVout detection, TVout wrong aspect ratios, broken LCD scaling, lot of game problems, inexistent support (nv forum is a joke) and UNIFIED DRIVER ARQUITECTURE. What a terrible lie! The latest official drivers is 6 months ago!!!

    Im really demanding, but i payed enough to demand a 100% working product. Now ATi latest offering has: AVIVO, FULL VIDEO ACC, MONTHLY DRIVER UPDATES, ALL BUGS I NOTICED WITH NVIDIA CARD FIXED, HDMI AND PRICE. I prefer that than a simple product, specially for the money they cost!

    I will never buy a nvidia card again. I'm definitely looking forward ATis offering (after the joke that is/was 8600GT/GTS).

    Enough rant.
    Am I wrong?
  • Roy2001 - Tuesday, May 15, 2007 - link

    Yeah, you are wrong. Spend $400 on a 2900XT and then $150 on a PSU.
