This was the email that set it off:

Hi Anand,

You have an appointment with Carrell Killebrew at 3pm tomorrow at ATI Santa Clara - he's going to give you the background on what REALLY went on behind 770. He'll meet you in the lobby on the 5th floor.

Chris

The email was from Chris Hook, PR at AMD; I’d worked with him for years, and at ATI before the acquisition. I’ve always given him a hard time for trying to spin me, for doing a great job of hosting parties but a terrible job of putting me face to face with the brightest engineers.


Chris Hook seems terribly uninterested in whatever is coming out of my mouth at this dinner years ago :)

Lately Chris has been on a quest to prove me wrong. He gets that I don’t care about the parties or the exotic destinations where AMD usually hosts its press events; I just want the product and the engineers. Earlier this year Chris let one engineer out of the bag and we had a great conversation about AMD’s manufacturing and packaging technologies (yeah, I’m a boring date). He gained a bit of trust with that interaction, so when he sent me the email above my ears perked up.

I made my way back to ATI Santa Clara for the 3PM meeting, and as I exited the elevator I heard “Anand?” from behind me. I didn’t recognize any of the men there, but that’s not too unusual; in my old age remembering all of the faces is getting difficult. After all, I’ve been doing this for nearly 12 years now. Thankfully this wasn’t one of those cases of forgotten identities: the man who I’d soon find out was Carrell Killebrew simply recognized me from a picture. What picture? I have no idea; perhaps AMD keeps pictures of Derek, Gary and myself on walls to know who to be angry at.

We walked around 30 feet into a small room with a table and some chairs; there was a speakerphone in the middle of the table. In the room were myself, Carrell Killebrew, Eric Demers, Mike Schmit and Mark Leather.

Most of these people I’d never met before, although I had heard their names. AMD, and ATI before the acquisition, had historically done a terrible job of giving us access to their smartest people. At best we’d get people in technical marketing, but very rarely the lead architects or any Fellows (read: certified genius title). That day however, on my day off, I found myself in a room with AMD Fellow after Fellow, smart guy after smart guy...and not a single member of AMD PR to muzzle the engineers.

To appreciate Carrell you have to understand that most of the people we talk to about GPUs are there to market to us, and do so in a very markety tone. These briefings normally start out with some slides on the lay of the land, talking about how gaming is important; then there’s some architecture talk, a bit about the cards, some performance data that we don’t pay attention to, and then a couple of conclusion slides. For a company that builds products that let you blow off people’s heads and watch the whole thing in greater fidelity, the way they talk to us about product is pretty lame.

Carrell was different. Carrell Killebrew was the engineering lead on RV770, the GPU behind the Radeon HD 4800 series, and he was exactly the type of person you’d expect to be lead engineer on a product used to play ridiculously fun video games.

Carrell started the conversation off by saying that everything he was about to tell me would be on the record, and that he assumed no one had any objections to that. This was going to be good.

He asked me what I’d like to talk about and offered some choices: we could talk about future GPU trends and architectures, we could talk about GPU-accelerated video transcoding, or he, along with the rest of the group, could give me the back story on RV770.

Carrell’s final option piqued my interest; I hadn’t really thought about it. When RV770 launched in the summer we took for granted that it was a great part: it upset NVIDIA’s pricing structure and gave us value at $200 and $300. We went through the architecture of the Radeon HD 4800 series and looked at performance, but I spent only a page or so talking about AMD’s small-die strategy that ultimately resulted in the RV770 GPU. AMD had spent much of the past 8 years building bigger and bigger GPUs, yet with the RV770 AMD reversed the trend, and I didn’t even catch it. I casually mentioned it, talked about how it was a different approach from the one NVIDIA took, but I didn’t dig deeper.

Normally when a manufacturer like AMD tells me they did something, I ask why. When Intel introduced me to Nehalem’s cache architecture, I asked why and later published my findings. And for the most part, with every aspect of the Radeon HD 4800’s architecture, we did the same. Derek Wilson and I spent several hours on the phone and in emails back and forth with AMD trying to wrap our heads around the RV770’s architecture so that we could do it justice in our reviews. But both of us all but ignored the biggest part of RV770: the decision that led to making the GPU itself.

This is a tough article for me to write; there are no graphs, no charts, no architecture to analyze. I simply got to sit in that room and listen as these individuals, these engineers, shared with me over the course of two hours the past three years of their lives. I want to do it justice, and I hope that I can, because that conversation was the best meeting I’d ever had with AMD or ATI.

The Beginning: The Shot Heard Around the World
116 Comments

  • Sahrin - Monday, January 25, 2010 - link

    Anand,

    I love this piece. Not sure if you'll get notified, but while doing some research on the performance of Hybrid Crossfire, I came back - it was interesting to see the tone of the piece, and hear about the guys at ATI talking vaguely about what would become the 5870. Fascinating stuff, I've got to put a bookmark in my calendar to remind me to come back to this next year when RV970 is released (pending no further difficulties).

    Any chance of a follow-up piece with the guys in SC?
  • caldran - Wednesday, December 24, 2008 - link

    the GPU industry keeps squeezing in more and more transistors (SPs or whatever). It would be more energy efficient if the GPU could disable some of its cores when there is less load, rather than just reducing clock frequency and dropping to a 2D mode, just like the latest AMD processors do. An HD 4350 would consume less power at idle than an HD 4850, right?
  • bupkus - Wednesday, December 10, 2008 - link

    I couldn't put it down until I had finished.

    Extremely enjoyable write-up!
  • yacoub - Tuesday, December 9, 2008 - link

    "that R580 would be similar in vain"

    You want vein. Not vain, not vane. Vein. =)
  • CEO Ballmer - Sunday, December 7, 2008 - link

    You people don't mention their alliance with MS!

    http://fakesteveballmer.blogspot.com
  • BoFox - Sunday, December 7, 2008 - link

    LOL!!!!!
  • BoFox - Sunday, December 7, 2008 - link

    Great article--a nice read!

    However...

    From how I remember history:

    In 2006, when the legendary X1900XTX took the world by surprise, actually beating the scarce and coveted 7800GTX-512, I bought it. It was king of the hill from January 2006 until the 7950GX2 stole the crown back for the fastest "single-slot" solution about 6 months later around June 2006, only a few months after the smaller 90nm 7900GTX was *finally* released in April 2006. Everybody started hailing Nvidia again although it was really an SLI dual-gpu solution sandwiched into one PCI-E slot. Perhaps it was the quad-gpu thingy that sounded so cool. It was obviously over-hyped but really took the attention away from ATI.

    GDDR4 on the X1950XTX hardly did any good, since it was a bit late (Sept 2006) with only something like a 3-4% performance increase over the X1900. Well, then the 8800GTX came in Nov 2006 and had a similar impact to the one the 9700 Pro had.

    As everybody waited to see how the R600 would do, it was delayed, and then it disappointed hugely in June 2007. The 8800GTX/Ultra kept on selling for around $600 for nearly 12 months straight, making history. 80nm just did not cut it for the R600, so ATI wanted to have its dual-GPU single-card REVENGE against Nvidia. And it would be even better this time since it was done on a single PCB, not a sandwiched solution like Nvidia's 7900GX2. Hence the tiny RV670 chips made on an unexpected 55nm process! The 3870X2 did beat the 8800GTX in most reviews, but had to rely on Crossfire, just like SLI. Also, the 3870X2 only used GDDR3, unlike the single 3870 with fast GDDR4.

    But Nvidia still took the attention away from the 3870 series by tossing an 8800GT up for grabs. When the 3870X2 came out in Jan 2008, Nvidia touted its upcoming 9800GX2 (to be released one month afterwards). So, Nvidia stopped ATI with an ace up its sleeve.

    Round 2 for ATI's revenge: the 4870X2. And it worked this time! There was no way that Nvidia could expect the 4870 to be *that much* better than the 3870. Everybody was saying the 4870 would be 50% faster, and Nvidia yawned at that, thinking that the 4870 still couldn't touch the 9800GTX or 9800GX2 when crossfired. Plus, Nvidia expected the 4870 to still have the "AA bug", since the 3870 did not fix it from the 2900XT and the 4870 had a similar architecture. Boy, was Nvidia wrong there! The 4870 actually ended up being *50%* faster than the 9800GTX in some games.

    So, now ATI has earned its vengeance with the kind of single-card dual-GPU solution that Nvidia had with its 7900GX2 and 9800GX2 a while ago. With the 4870X2 destroying the GTX 280, ATI does indeed have its crown or "halo".

    Unfortunately, Quad-crossfire hardly does well against the GTX 280 in SLI. We now know that quad-GPU solutions give a far lower "bang-per-GPU" due to poor driver optimizations, etc.. So most enthusiast gamers with the money and a 2560x1600 monitor are running two GTX 280's right now instead of two 4870X2's.. oh well!

    One thing not mentioned about GDDR5 is that it eats power like mad! The memory alone consumes 40W, even at idle, and that is one of the reasons why the 4870 does not idle so well. If ATI reduces the speed low enough, it messes up the Aero graphics in Vista. It would have been nice if ATI released an intermediate 4860 version with GDDR4 memory at 2800+MHz effective.

    Now, I cannot even start to expect what the RV870 will be like. I think Nvidia is going to really want its own revenge this time around, being so financially hurt with the whole 9800 - GTX 200 range plus being unable to release a 55nm version of G200 to this day. Nvidia just cannot beat the 4870X2 with a dual G200 on 55nm, and this is the reason for the re-spins (delays) with an attempt to reduce the power consumption while maintaining the necessary clock speed. Pardon me for pointing out the obvious...

    Hope my mini-article was a nice supplement to the main article! :)
  • CarrellK - Sunday, December 7, 2008 - link

    Not bad at all.

    BTW, 55nm has less to do with how good the RV770 is than the re-architecture & re-design our engineers did post-RV670.

    To illustrate, scale the RV770 from 55nm to 65nm (only core scales, not pads & analog) and see how big it is. Now compare that to anything else in 65nm.

    Pretty darned good engineers I'd say.
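
Carrell's die-scaling thought experiment above is easy to sketch in a few lines of Python. The numbers below are illustrative assumptions rather than figures from this thread: the RV770 die is commonly cited at roughly 256 mm², the share of the die that scales with process is a guess, and the roughly 576 mm² 65nm GT200 is used only as a reference point.

    # Sketch of the 55nm -> 65nm scaling exercise Carrell describes.
    # Assumed, illustrative numbers (not from AMD or this thread):
    rv770_die_mm2 = 256.0   # commonly cited RV770 die area at 55nm
    core_fraction = 0.85    # guess: share of die (logic/SRAM) that scales
    old_node_nm, new_node_nm = 55.0, 65.0

    core_area = rv770_die_mm2 * core_fraction
    fixed_area = rv770_die_mm2 - core_area          # pads & analog don't scale

    # Area scales with the square of the linear feature-size ratio.
    area_scale = (new_node_nm / old_node_nm) ** 2
    scaled_die = core_area * area_scale + fixed_area

    print(f"Hypothetical RV770 at 65nm: {scaled_die:.0f} mm^2")
    # ~342 mm^2 under these assumptions -- still far smaller than the
    # ~576 mm^2 commonly quoted for NVIDIA's 65nm GT200, which is the
    # comparison Carrell seems to be inviting.
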

  • BoFox - Sunday, December 7, 2008 - link

    True, and nowhere in the article was it pointed out that since the AA algorithm relied on the shaders, simply upping the shader units from 320 to a whopping 800 completely solved the weak AA performance that plagued the 2900s and 3870s. It did not cost too much chip die size or power consumption either. ATI certainly did design the R600 with the future in mind (by moving AA to the shader units, with future expansion). Now the 4870 does amazingly well with 8x FSAA, even beating the GTX 280 in some games.

    I wanted to edit my above post by saying that the dual G200 needed to have low enough power consumption so that it could still be cooled effectively in a single-slot sandwich cooling solution. The 4870X2 has a dual-slot cooler, but Nvidia just cannot engineer the G200 onto a single PCB with the architecture it is currently using (a monster chip die size, plus 16 memory chips for its 448-bit to 512-bit bus instead of 8 memory chips providing the same bandwidth). That is why Nvidia must make the move to GDDR5 memory, or else re-design the memory architecture to a greater degree. Just my thoughts... I still have no idea what we'll be seeing in 2009!

  • papapapapapapapababy - Saturday, December 6, 2008 - link

    more like uber mega retards, right? if they are so smart... why do they keep making such terrible, horrible, shitty drivers?

    why?

    i really really, really want to buy a 4850, i really do. but im not going to do it. im going to go and buy the 9800gt. And i know it's just a rebranded 8800gt. And i know nvidia is making shitty, explosive hardware (my 8600gt just died). And i know that gpu is slower, older, overclocked 65nm tech. And that nvidia is pushing gimmicky tricks ("physics") and buying devs. but guess what? NVIDIA = good, clean drivers. New game? New drivers, right there. Fast. Un-bloated drivers that work. is that so hard, ati? Really. or maybe you guys just suck?

    Im going to pick all tech@ because of that. That's how much i fkn hate your bloated and retarded drivers, ATI. Install ms .NET Framework for a broken control panel? stupid. And what's up with all those unnecessary services eating my memory and cpu cycles? ATI Hotkey Poller? ATI Smart? ATI2EVXX.exe, ATI2EVXX.exe, .NET 2.0? always there, and the damn thing takes forever to load? Nvidia doesn't use any bloated crap, so why do you feel entitled to pollute my pc with your bloated drivers?

    AGAIN, HORRIBLE DRIVERS ATI! I DONT WANT A SINGLE EXTRA SERVICE! i just built a pc for a friend. I chose the hd4670, beautiful card, really cool, fast, efficient. I love it. I want one for myself. But the drivers? Arg, i ended up installing just the display driver and still the memory consumption was utterly retarded compared to my nvidia card.

    so geniuses? move your asses and fix your drivers.

    thanks, and good job with your hardware.
