Different Types of Stream Processors

The first thing we need to do when looking at the R600 shader core is to define our terms. AMD and NVIDIA build and refer to their Stream Processors (SPs) differently, and that makes counting them a little more difficult. Throughout our explanation, it will help to remember from our G80 coverage that a thread refers to a vertex, primitive or pixel, not to a stream of instructions as it would on a CPU.

Stream Processors: The NVIDIA Way

G80 has 128 SPs (for the 8800 GTX; there are 96 SPs on the 8800 GTS models) that are capable of doing a very small number of things at the same time. They can do either standard FP operations (like a MADD), a special function operation (like sine), or an integer operation. There are some cases where they can squeeze out an extra MUL, but more often than not this MUL isn't accessible. Each of these SPs operates on an individual thread (be it a vertex, primitive or pixel).

This gives us a total of up to 128 threads being processed per clock. It is important to realize that the 128 SPs aren't entirely independent of one another. That is, we can't run 128 different instructions in one clock, even though we can run a number of instructions on 128 different threads. We'll delve a little deeper into this shortly, but depending on the type of shader running, the same instruction must run on a minimum number of threads at once.

For NVIDIA hardware, the minimum number of threads that must be processed using the same instruction is 16 (for vertex threads). NVIDIA's block diagrams show that each group of 16 SPs shares texture, register, and cache resources, so this makes sense. Pixel shaders, which are more important from a performance perspective, must run one instruction on 32 pixels at a time. What we can extrapolate from this is that NVIDIA can issue up to eight separate instructions across all of its 128 SPs (only four if working on pixels) per clock.

128 SPs / 16 Threads per Instruction per Clock = 8 Vertex Instructions per Clock

128 SPs / 32 Threads per Instruction per Clock = 4 Pixel Instructions per Clock
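To make the two divisions above concrete, here is a minimal Python sketch. The function name and constants are our own, chosen for illustration; the only inputs are the SP counts and thread granularities quoted in the text, and the result is a per-clock upper bound rather than a measured figure.

```python
def instructions_per_clock(total_sps: int, threads_per_instruction: int) -> int:
    """Upper bound on distinct instructions issued per clock.

    Assumes one thread per SP and that every group of
    `threads_per_instruction` threads must share a single instruction.
    """
    if threads_per_instruction <= 0:
        raise ValueError("threads_per_instruction must be positive")
    if total_sps % threads_per_instruction != 0:
        raise ValueError("SP count must be a multiple of the thread granularity")
    return total_sps // threads_per_instruction


G80_SPS = 128            # 8800 GTX, as stated in the text
VERTEX_GRANULARITY = 16  # threads that must share one vertex instruction
PIXEL_GRANULARITY = 32   # threads that must share one pixel instruction

vertex_instructions = instructions_per_clock(G80_SPS, VERTEX_GRANULARITY)
pixel_instructions = instructions_per_clock(G80_SPS, PIXEL_GRANULARITY)

print(vertex_instructions, pixel_instructions)
```

The same function applies to the 96-SP 8800 GTS by passing 96 as the first argument.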

Stream Processors: AMD's R600

Things are a little different on R600. AMD tells us that there are 320 SPs, but these aren't directly comparable to G80's 128. First of all, most of the SPs are simpler and aren't capable of special function operations. For every block of five SPs, only one can handle either a special function operation or a regular floating point operation. The special function SP is also the only one able to handle integer multiply, while other SPs can perform simpler integer operations.

This isn't a huge deal because straight floating point MADD and MUL performance is by far the limiting factor in shader performance today. The big difference is that AMD only executes one thread (vertex, primitive or pixel) across a group of five SPs.

What this means is that each of the five SPs in a block must run instructions from one thread. While AMD can run up to five scalar instructions from that thread in parallel, these instructions must be completely independent from one another. This can place a heavy burden on AMD's compiler to extract parallel operations from shader code. While AMD has gone to great lengths to make sure every block of five SPs is always busy, it's much harder to ensure that every SP within each block is always busy.
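To illustrate why filling all five SPs is hard, here is a toy Python sketch of the packing problem. It is not AMD's compiler: the `ScalarOp` type, the register names, and the simple in-order greedy strategy are all assumptions made for illustration. It only shows how dependencies between scalar instructions limit how many of the five slots can be used in a given clock.

```python
from dataclasses import dataclass
from typing import List, Tuple

SLOTS_PER_BLOCK = 5  # five SPs share one thread, per the description above


@dataclass(frozen=True)
class ScalarOp:
    """Hypothetical scalar instruction: one destination, some source registers."""
    dest: str
    sources: Tuple[str, ...]


def is_independent(op: ScalarOp, bundle: List[ScalarOp]) -> bool:
    """True if op neither reads nor overwrites anything the bundle touches."""
    for other in bundle:
        if other.dest in op.sources:   # op needs a result computed in this bundle
            return False
        if op.dest == other.dest:      # both write the same register
            return False
        if op.dest in other.sources:   # op overwrites an input still being read
            return False
    return True


def pack(ops: List[ScalarOp], slots: int = SLOTS_PER_BLOCK) -> List[List[ScalarOp]]:
    """Greedy in-order packing: start a new bundle whenever an op cannot join."""
    bundles: List[List[ScalarOp]] = []
    current: List[ScalarOp] = []
    for op in ops:
        if len(current) < slots and is_independent(op, current):
            current.append(op)
        else:
            bundles.append(current)
            current = [op]
    if current:
        bundles.append(current)
    return bundles


def utilization(bundles: List[List[ScalarOp]], slots: int = SLOTS_PER_BLOCK) -> float:
    """Fraction of SP slots doing useful work across all bundles."""
    if not bundles:
        return 0.0
    return sum(len(b) for b in bundles) / (len(bundles) * slots)


# Four independent MADDs, one per vector component: they share one bundle.
vector_ops = [
    ScalarOp("r.x", ("a.x", "b.x", "c.x")),
    ScalarOp("r.y", ("a.y", "b.y", "c.y")),
    ScalarOp("r.z", ("a.z", "b.z", "c.z")),
    ScalarOp("r.w", ("a.w", "b.w", "c.w")),
]

# A dependent chain: each op needs the previous result, so one op per bundle.
chain_ops = [
    ScalarOp("t0", ("a", "b")),
    ScalarOp("t1", ("t0", "c")),
    ScalarOp("t2", ("t1", "d")),
]

print(len(pack(vector_ops)), utilization(pack(vector_ops)))
print(len(pack(chain_ops)), utilization(pack(chain_ops)))
```

In this model the four-component vector operation occupies four of the five slots in a single bundle, while the dependent chain uses only one slot per bundle. A real compiler would also reorder instructions to find more parallelism, which this sketch deliberately does not attempt.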

If we take a step back, we can determine how many threads AMD is able to work on per clock. With 320 total SPs grouped into blocks of five per thread, we get 64 threads per clock. And here's where it starts to get complicated. Before we go back and compare this to NVIDIA's architecture, let's go a little deeper into the implementation.
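As a quick check of the thread count, the following Python sketch reproduces the division and shows how the number of scalar operations per clock depends on how many of the five SPs in each block the compiler manages to fill. The function names and the `slots_filled` parameter are our own illustrative assumptions; these are per-clock counts only and say nothing about clock speed.

```python
R600_SPS = 320      # total SPs, as stated by AMD
SPS_PER_BLOCK = 5   # SPs that together execute one thread


def threads_per_clock(total_sps: int, sps_per_thread: int) -> int:
    """Threads in flight per clock when each thread occupies a whole block."""
    return total_sps // sps_per_thread


def scalar_ops_per_clock(threads: int, slots_filled: int) -> int:
    """Scalar operations per clock if every block fills `slots_filled` SPs."""
    if not 0 <= slots_filled <= SPS_PER_BLOCK:
        raise ValueError("slots_filled must be between 0 and 5")
    return threads * slots_filled


r600_threads = threads_per_clock(R600_SPS, SPS_PER_BLOCK)

for filled in range(1, SPS_PER_BLOCK + 1):
    print(filled, scalar_ops_per_clock(r600_threads, filled))
```

Only when all five slots are filled in every block does the hardware reach the full 320 scalar operations per clock; with a single slot filled, it manages 64.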


86 Comments


  • TA152H - Monday, May 14, 2007 - link

    Fanboy? What a dork.

    I've had success with ATI, not with NVIDIA, and I know ATI stuff a lot better so it's just easier for me to work with. It's not an irrational like or dislike. I bought one NVIDIA and it was a nightmare. Plus, I'm not as sure they'll be around for very long as I am ATI/AMD, although they had a good quarter, and AMD surely had a dreadful one.

    Selling discrete video cards alone might get a lot more difficult with the integration of CPUs, and GPUs.
  • yyrkoon - Monday, May 14, 2007 - link

    You are a fanboy, face it. 'I tried a nVidia card once . . .' How long ago was that ? Who made the card ? Did you have it configured properly? Once?! Details like this are important, and seemingly/conveniently left out. Anyhow, anyone claiming that nVIdia cards are 'junk' has definite issues with assembling/configuring hardware. I say this because my current system uses a nVidia based card, and is 100% rock solid. 'Person between the chair and keyboard' rings a bell.

    Ask any Linux user why they refuse to use ATI cards in their system . . . You are also one of these people out there that claims ATI driver support is superior to nVIdias driver support I suppose ? If you have truly been using ATI products for 20 years, then you know ATI has one of the worst reputations on the planet for driver support(and while it may have improved, it is not as good as nVidias still).

    Yeah, anyhow, ATI, and nVidia both can have problems with their hardware, it is not based 100% on their architecture, but the OEM releasing the products have a lot of effect here also. There are bad OEMs to buy from here on both sides of the fence, knowing who to stay away from, is half the work when building a PC, and probably had a lot more to do with your alleged 'bad nVIdia card', assuming you actually configured the card properly.

    I also had a problem with an nVIdia card once, I bought a brand new GF3 card about 7 years ago, and a few of the older games I had, would not display properly with it. What did I do ? I waited about a month, for a new driver, and the problem was solved. I have also had issues with ATI cards, one of which drew too much power from the AGP slot, and would cause the given system to crash 1-2 times a day. This was a design issue/oversight on ATI's behalf(the card was made by Sapphire, who also makes ATIs cards). What did I do ? I replaced the card with an nVIdia card, and the system has been stable since.

    So you see, I too can skew things to make anyone look bad also, and in the end, it would only serve to make me look like the dork. But if you want to pay more, for less, that is perfectly fine by me.
  • Pirks - Monday, May 14, 2007 - link

    I've got all problems and crappy drivers (especially Linux ones) only from ATI while nVidia software was always much better in my experience. power hungry noisy monsters made by whom? by ATI! as always :) same shit as with their x1800/x1900 miserable power guzzling series

    discrete video cards are not going away any time soon. ever heard of integrated video used in games, besides ones from 2000, like old Quake 2? no? then please continue your lovefest with ATI, but for me - it looks like I'll pass on them this time again - since Radeon 9800Pro they went downhill and continue in that direction. they MAY make a decent integrated CPU/GPU budget-oriented vendor in a future, for all those office folks playing simple 2D office games, but real stuff? nope, ATI is still out of the game for me. let's see if they manage to come back with reincarnation of R300 in future.

    ironically, AMD CPUs on the other hand have best price/performance ratio, so intel won't see me as their customer. I wish ATI 3D chips were as good as AMD CPUs in that regard (and overclockers please shut up, I'm not bothering to OC my rig because I don't enjoy benchmark numbers, I enjoy REAL stuff like games, and Intel is out of the game for me as well, at least until their budget single core Conroes are out)
  • utube545 - Tuesday, May 22, 2007 - link

    Get a clue, you fucking cretin.
  • dragonsqrrl - Thursday, August 25, 2011 - link

    haha... lol, wow. facepalm.
  • dragonsqrrl - Thursday, August 25, 2011 - link

    Damn you're a fail noob of an ATI fanboy. Time has not been kind to the HD2900XT, and now you sound more ridiculous than ever... lol.
  • yzkbug - Monday, May 14, 2007 - link

    Not a word about new AVIVO HD and digital sound features?
  • DerekWilson - Wednesday, May 16, 2007 - link

    we mentioned this ...

    on the r600 overview page ...
  • photoguy99 - Monday, May 14, 2007 - link

    First to be clear and I do not condone the title of this article, there's no need to bring racism into this.

    But my point is NVidia can and will react by making the performance per dollar competitive for the R600 vs 8800GTS.

    Once the prices are comparable, why buy a more power hungry part (the ATI)?

    This is one disadvantage they can't correct until the next respin.

  • DrMrLordX - Monday, May 14, 2007 - link

    Based on the benchmarks results, the only reason I can see for getting 2900XTs is if a). you don't care about power consumption and b). want to run a Crossfire rig at a lower cost of entry than dual-8800 GTXs or 8800 Ultras.

    As others have said, some more benchmarks in mature DX10 titles might show who the real winner here is performance-wise, and that holds true for multi-GPU scenarios as well.
