Building a RV770

We did this with NVIDIA's GT200 and it seemed to work out well, so let's start at the most basic level with AMD's RV770. Meet the Stream Processing Unit:

AMD's Stream Processing Unit is very similar to NVIDIA's SP in G80/G92/GT200, so similar in fact that I drew them the same way. Keep in mind that the actual inner workings of one of these units are far more complex than three ALUs, but to keep things simple and consistent that's how I drew it (the actual hardware is a fused FP MUL + ADD unit, for those who care). AMD has four of these stream processing units in a processor block, and they are called the x, y, z and w units.

There's a fifth unit called a t-unit (the t stands for transcendental, meaning the type of operations it is capable of processing):

The t-unit can do everything an x, y, z or w-unit can do, but it can also do transcendental operations (represented by the SFU block in the diagram above). NVIDIA has the same functionality; it simply chooses to expose it in a different way (which we'll get to shortly). AMD considers each one of these units (x, y, z, w and t) a processing unit, and the RV770 has 800 of them (the RV670 had 320).

AMD pairs four of these stream processing units (x,y,z and w) with a t-unit and puts them together as a block, which I have decided to call a Streaming Processor (SP):

The area in red is actually the SP, but unlike one of NVIDIA's SPs, one of AMD's can handle up to five instructions at the same time. The only restriction here is that all five units have to be working on the same thread.

AMD then groups 16 of these SPs into something they like to call a SIMD core (AMD has less confusing, but far worse names for its architectural elements than NVIDIA):

AMD's SIMD Core
NVIDIA's SM

A SIMD core is very similar to NVIDIA's SM with a couple of exceptions:

1) There are more SPs in AMD's SIMD Core (16 vs 8)

2) The SPs are wider and can process, at peak, 5x the number of instructions as NVIDIA's SPs

3) The Instruction and Constant caches are not included in the SIMD core, AMD places them further up the ladder.

4) AMD pairs its texture units and texture cache with its SPs at the SIMD core level, while NVIDIA does it further up the ladder.

5) See the two SFUs in NVIDIA's SM? While NVIDIA has two very fast Special Function Units in its SM, AMD equips each SP with its own SFU. It's unclear which approach is actually faster given that we don't know the instruction latency or throughput of either SFU.

Note that at this point, the RV770 is really no different than the RV670 (the GPU used in the Radeon HD 3870). The next step is where AMD and NVIDIA really diverge: while NVIDIA's GT200 takes three SMs, groups them into a Texture/Processing Cluster (TPC) and then arranges 10 TPCs on its chip, AMD simply combines 10 SIMD cores:


AMD's RV670


10 SIMD cores at your disposal in AMD's RV770, this is how AMD goes from competitive, to downright threatening


NVIDIA's GT200 Streaming Processor Array (SPA), it has fewer execution resources but more encapsulation around them, the focus here is on thread management

With 10 SIMD cores, the RV770 has 2.5x the number of execution units of the RV670. It even has more theoretical processing power than NVIDIA's GT200. If you just look at the number of concurrent instructions that can be processed on RV770 vs. GT200, the RV770's 800 execution units put it in a completely different league from GT200's 240 (+ 60 SFUs).

                                            | NVIDIA GT200 | AMD RV770 | AMD RV670
SP Issue Width                              | 1-way        | 5-way     | 5-way
# of SPs                                    | 240          | 160       | 64
Worst Case Dependent Instruction Throughput | 240          | 160       | 64
Maximum Scalar Instruction Throughput       | 480*         | 800       | 320

* NVIDIA's 60 SFUs can sometimes "help" with scalar instruction throughput, in special situations of course.
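
As a sanity check, the unit counts in the table can be reproduced from the per-block numbers given earlier on this page. The short Python sketch below is nothing more than that arithmetic. The function and variable names are made up for illustration, and the RV670's four SIMD cores are inferred from its 64 SPs at 16 SPs per SIMD core rather than stated outright.

```python
# Per-block figures as described on this page.
SPS_PER_SIMD_CORE = 16   # AMD: 16 SPs per SIMD core
UNITS_PER_AMD_SP = 5     # x, y, z, w and t units in each SP

SPS_PER_SM = 8           # NVIDIA: 8 SPs per SM
SFUS_PER_SM = 2          # NVIDIA: 2 SFUs per SM
SMS_PER_TPC = 3          # GT200: 3 SMs per TPC
TPCS_IN_GT200 = 10       # GT200: 10 TPCs


def amd_counts(simd_cores):
    """Return (SPs, processing units) for an AMD part with this many SIMD cores."""
    sps = simd_cores * SPS_PER_SIMD_CORE
    return sps, sps * UNITS_PER_AMD_SP


rv770_sps, rv770_units = amd_counts(10)
# Assumption: 4 SIMD cores for RV670, inferred from 64 SPs / 16 SPs per core.
rv670_sps, rv670_units = amd_counts(4)

gt200_sms = TPCS_IN_GT200 * SMS_PER_TPC
gt200_sps = gt200_sms * SPS_PER_SM
gt200_sfus = gt200_sms * SFUS_PER_SM

print("RV770:", rv770_sps, "SPs,", rv770_units, "units")
print("RV670:", rv670_sps, "SPs,", rv670_units, "units")
print("GT200:", gt200_sps, "SPs,", gt200_sfus, "SFUs")
print("RV770 / RV670 unit ratio:", rv770_units / rv670_units)
```

The GT200's 480* figure in the last row is taken from the table as-is; this sketch does not try to derive it.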

We'll be talking about efficiency and resource utilization in the coming pages, but immediately you'll notice that the RV770 (like the RV670 and R600 that came before it) has the potential to be slower than NVIDIA's architectures or significantly faster, depending entirely on how instruction or thread heavy the workload is. NVIDIA's architecture prefers tons of simple threads (one thread per SP) while AMD's architecture wants instruction heavy threads (since it can work on five instructions from a single thread at once).
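
To make the "instruction heavy threads" point concrete, here is a toy model of filling a 5-wide SP from a single thread. It is a simple greedy packer written for illustration, not AMD's actual compiler or hardware scheduler, and it ignores the fact that only the t-unit can run transcendental operations. The names `pack_bundles` and `avg_slots_filled` are hypothetical.

```python
def pack_bundles(deps, width=5):
    """Greedily pack one thread's instructions into bundles of up to `width`.

    deps[i] is the set of earlier instruction indices that instruction i
    depends on. An instruction can only issue once all of its dependencies
    were issued in a previous bundle.
    """
    done = set()
    remaining = list(range(len(deps)))
    bundles = []
    while remaining:
        bundle = []
        for i in remaining:
            if len(bundle) == width:
                break
            if deps[i] <= done:      # every dependency already completed
                bundle.append(i)
        if not bundle:
            raise ValueError("unsatisfiable dependencies")
        done.update(bundle)          # results visible only to later bundles
        remaining = [i for i in remaining if i not in done]
        bundles.append(bundle)
    return bundles


def avg_slots_filled(bundles):
    """Average number of the five units kept busy per bundle."""
    return sum(len(b) for b in bundles) / len(bundles)


# Ten instructions where each one needs the previous result (worst case).
chain = [set()] + [{i - 1} for i in range(1, 10)]
# Ten instructions with no dependencies at all (best case).
independent = [set() for _ in range(10)]

RV770_SPS = 160  # from the table above
for name, deps in (("dependent chain", chain), ("independent", independent)):
    bundles = pack_bundles(deps)
    per_clock = RV770_SPS * avg_slots_filled(bundles)
    print(name, "->", len(bundles), "bundles,", per_clock, "instructions/clock chip-wide")
```

In this model the fully dependent chain keeps one unit per SP busy, which lines up with the 160 worst-case row in the table, while the fully independent stream fills all five, which lines up with the 800 maximum. Real shader code lands somewhere in between.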


NVIDIA's GeForce GTX 280


AMD's Radeon HD 4870

The full GPU is pretty impressive:

1) See the Instruction and Constant Caches up top? NVIDIA includes them in each SM while AMD seems to include them outside of the SIMD core clusters.

2) The RV770 only has four 64-bit memory controllers compared to the eight in GT200

3) The Programmable Tessellator is left over from the Xbox 360's GPU (and R600/RV670), unfortunately it is unused by most developers as there is no DirectX support for it yet.

4) AMD has dedicated hardware attribute interpolators, something NVIDIA's hardware shares with its special function units (SFUs).

Other than the differences we mentioned above, AMD's architecture is similar in vein to NVIDIA's; there are just a handful of design choices that set the two apart. Just like NVIDIA took its G80/G92 architecture and made it larger, AMD did the same with RV770 - it took RV670 and more than doubled its execution resources.

AMD took a bigger leap with RV770 from RV670 than NVIDIA did from G80/G92 to GT200, but it makes sense given that AMD had to be more competitive than it was in the last generation.

215 Comments

  • FITCamaro - Wednesday, June 25, 2008 - link

    Yes I noticed it used quite a bit at idle as well. But its load numbers were lower. And as the other guy said, they probably just are still finalizing the drivers for the new cards. I'd expect both performance and idle power consumption to improve in the next month or two.
  • derek85 - Wednesday, June 25, 2008 - link

    I think ATI is still fixing/finalizing the Power Play, it should be much lower when new Catalyst comes out.
  • shadowteam - Wednesday, June 25, 2008 - link

    If a $200 card can play all your games @ 30+fps, does a $600 card even make sense knowing it'll do no better to your eyes? I see quite a few NV biased elements in your review this time around, and what's all that about the biggest die size TSMC's every produced? GTX's die may be huge, but compared to AMD's, it's only half as efficient. Your review title, I think, was a bit harsh toward AMD. By limiting AMD's victory only up to a price point of $299, you're essentially telling consumers that NV's GTX 2xx series is actually worth the money, which is a terribly biased consumer advice in my opinion. From a $600 GX2 to a $650 GTX 280, Nvidia's actually gone backwards. You know when we talk about AMD's financial struggle, and that the company might go bust in the next few years... part of the reason why that may happen is because media fanatics try to keep things on an even keel, and in doing so they completely forget about what the consumers actually want. No offence to AT, but I've been into media myself, and I can tell when even professionals sound biased.
  • paydirt - Wednesday, June 25, 2008 - link

    You're putting words into the reviewer(s) mouth(s) and you know it. I am pretty sure most readers know that bigger isn't better in the computing world; anandtech never said big was good, they are simply pointing out the difference, duh. YOU need to keep in mind that nVidia hasn't done a die shrink yet with the GTX 2XX...

    I also did not read anything in the review that said it was worth it (or "good") to pay $600 on a GPU, did you? Nope. Thought so. Quit trying to fight the world and life might be different for you.

    I'm grateful that both companies make solid cards that are GPGPU-capable and affordable and we have sites like anandtech to break down the numbers for us.

  • shadowteam - Wednesday, June 25, 2008 - link

    Are you speaking on behalf of the reviewers? You've obviously misunderstood the whole point I was trying to make. When you say in your other post that AT is a reviews site and not a product promoter, I feel terribly sorry you because reviews sites are THE best product promoters around, including AT, and Derek pointed this out earlier that AT's too influential to ignore by companies. Well if that is truly the case, why not type in block letters how NV's trying to rip us off, for consumers' sake, may be just for once do it, it'll definitely teach Nvidia a lesson.
  • DaveninCali - Wednesday, June 25, 2008 - link

    I completely agree. Anand, the GTX 260/280 are a complete waste of money. You are not providing adequate conclusions. Your data speaks for itself. I know you have to be "friendly" in your conclusions so that you don't arouse the ire of nVidia but the launch of the 260/280 is on the order of the FX series.

    I mean you can barely test the cards in SLI mode due to the huge power constraints and the price is ABSOLUTELY ridiculous. $1300 for SLI GTX 280. $1300!!!! You can get FOUR 4870 cards for less than this. FOUR OF THEM!!!! You should be screaming how poorly the GTX 280/260 cards are at these performance numbers and price point.

    The 4870 beats the GTX 260 in all but one benchmark at $100 less. Not to mention the 4870 consumes less power than the GTX 280. Hell, the 4870 even beats the GTX 280 in some benchmarks. For $350 more, there shouldn't even be ONE game that the 4870 is better at than the GTX 280. Not even more for more than 100% of the price.

    I'm not quite sure what you are trying to convey in this article but at least the readers at Anandtech are smart enough to read the graphs for themselves. Given what has been written in the conclusion page (3/4 of it about GPGPU jargon that is totally unnecessary) could you please leave the page blank instead.

    I mean come on. Seriously! $1300 compared to $600 with much more performance coming from the 4870 SLI. COME ON!! Now I'm too angry to go to bed. :(
  • DaveninCali - Wednesday, June 25, 2008 - link

    Oh and one other thing. I thought Anandtech was a review site for the consumer. How can you not warn consumers from spending $650 much less $1300 on a piece of hardware that isn't much faster and in some cases not faster at all than another piece of hardware priced at $300/$600 in SLI. It's borderline scam.

    When you can't show SLI numbers because you can't even find a power supply that can provide the power, at least an ounce of criticism should be noted to try and stop someone from wasting all that money.

    Don't you think that consumers should be getting some better advise than this. $1300 for less performance. I feel so sad now. Time to go to sleep.
  • shadowteam - Wednesday, June 25, 2008 - link

    It reminds of that NV scam from yesteryears... I'm forgetting a good part of it, but apparently NV and "some company" racked up some forum/blog gurus to promote their BS, including a guy on AT forums who eventually got rid off due to his extremely biased posts. If AT can do biased reviews, I can pretty much assure you the rest of the reviewers out there are nothing more than just misinformed, over-arrogant media puppets. To those who disagree w/ me or the poster above, let me ask you this... if you were sent out $600 hardware every other week, or in AT's case, every other day (GTX280's from NV board partners), would you rather delightfully, and rightfully, piss NV off, or shut your big mouth to keep the hardware, and cash flowing in?
  • DerekWilson - Wednesday, June 25, 2008 - link

    Wow ...

    I'm completely surprised that you reacted the way you did.

    In our GT200 review we were very hard on NVIDIA for providing less performance than a cheaper high end part, and this time around we pointed out the fact that the 4870 actually leads the GTX 260 at 3/4 of the price.

    We have no qualms about saying anything warranted about any part no matter who makes it. There's no need to pull punches, as what we really care about are the readers and the technology. NVIDIA really can't bring anything compelling to the table in terms of price / performance or value right now. I think we did a good job of pointing that out.

    We have mixed feelings about CrossFire, as it doesn't always scale well and isn't as flexible as SLI -- hopefully this will change with R700 when it hits, but for now there are still limitations. When CrossFire does work, it does really well, and I hope AMD work this out.

    NVIDIA absolutely need to readjust the pricing of most of their line up in order to compete. If they don't then AMD's hardware will continue to get our recommendation.

    We are here because we love understanding hardware and we love talking about the hardware. Our interest is in reality and the truth of things. Sometimes we can get overly excited about some technology (just like any enthusiast can), but our recommendations always come down to value and what our readers can get from their hardware today.

    I know I can speak for Anand when I say this (cause he actually did it before his site grew into what it is today) -- we would be doing this even if we weren't being paid for it. Understanding and teaching about hardware is our passion and we put our heart and soul into it.

    there is no amount of money that could buy a review from us. no hardware vendor is off limits.

    in the past companies have tried to stop sending us hardware because they didn't like what we said. we just go out and buy it ourselves. but that's not likely to be an issue at this point.

    the size and reach of AnandTech today is such that no matter how much we piss off anyone, Intel, AMD, NVIDIA, or any of the OEMs, they can't afford to ignore us and they can't afford to not send us hardware -- they are the ones who want and need us to review their products whether we say great or horrible things about it.

    beyond that, i'm 100% sure nvidia is pissed off with this review. it is glowingly in favor of the 4870 and ... like i said ... it really shocks me that anyone would think otherwise.

    we don't favor level playing fields or being nice to companies for no reason. we'll recommend the parts that best fit a need at a price if it makes sense. Right now that's 4870 if you want to spend between $300 and $600 (for 2).

    While it's really really not worth the money, GTX 280 SLI is the fastest thing out there and some people do want to light their money on fire. Whatever.

    i'm sorry you guys feel the way you do. maybe after a good night sleep you'll come back refreshed and see the article in a new light ...
  • formulav8 - Wednesday, June 25, 2008 - link

    Even in the review you claim 4870 is a $400 performer. So why don't you reflect that in the articles title by adding it after the $300 price?? Would be better to do so I think anyways. :)

    Maybe say 4870 wins up to the $400 price point and likewise with the 4850 version up to the $250 price that you claimed in the article...

    This tweak could be helpful to some buyers out there with a specific budget and could help save them some money in the process. :)


    Jason
