Adjusting Trajectory & Slipping Schedule

Carrell didn’t believe in building big chips anymore. It wasn’t that it was too difficult; it was that it took too long for a $600 GPU to turn into a $200 GPU. AMD believed that the most important market was the larger (both in quantity and revenue) performance mainstream segment.

Rather than making the $200 - $300 market wait for new technology, Carrell wanted to deliver it there first and then scale up/down to later address more expensive/cheaper markets.

The risk in RV770 was architecture and memory technology. The risk in RV870 was architecture and manufacturing process, the latter of which was completely out of AMD’s control.

Early on Carrell believed that TSMC’s 40nm wasn’t mature enough and that when it was ready, its cost was going to be much higher than expected. While he didn’t elaborate on this at the time, Carrell told me that there was a lot of information tuning that made TSMC’s 40nm look cheaper than it ended up being. I'll touch on this more later on in the article.

Carrell reluctantly went along with the desire to build a 400+ mm² RV870 because he believed that once engineering woke up and realized that this wasn’t going to be cheap, they’d be having another discussion.

In early 2008, going into February, TSMC started dropping hints that ATI might not want to be so aggressive on what they thought 40nm was going to cost. ATI’s cost estimates might have been, at the time, a little optimistic.

Engineering came back and said that RV870 was going to be pretty expensive and suggested looking at the configuration a second time.

Which is exactly what they did.

The team met and stuck with Rick Bergman’s compromise: the GPU had to be at least 2x RV770, but the die size had to come down. ATI changed the configuration for Cypress (high end, single GPU RV870) in March of 2008.

And here’s where the new ATI really showed itself. We had a company that had decided to both 1) not let schedule slip, and 2) stop designing the biggest GPU possible. Yet in order to preserve the second belief, it had to sacrifice the first.

You have to understand, changing a chip configuration that late in the game, 1.5 years before launch, screws everything up. By the time RV770 came out, 870 was set in stone. Any change even a year prior to that resets a lot of clocks. You have to go back and redo the floorplan and configuration, and there’s a lot of adjusting that happens. It takes at least a couple of weeks, sometimes a couple of months. It impacted schedule, and ATI had to work extremely hard to minimize that where possible. The Radeon HD 5870 was around 30 - 45 days late because of this change.

Remember ATI’s nothing-messes-with-schedule policy? It took a lot of guts on the part of the engineering team and Rick Bergman to accept a month+ hit on redesigning RV870. If you don’t show up to the fight, you lose by default, and that’s exactly what ATI was risking by agreeing to a redesign of Cypress.

This is also super important to understand, because it implies that at some point, NVIDIA made a conscious decision to be late with Fermi. ATI wasn’t the only one to know when DX11/Windows 7 were coming. NVIDIA was well aware and prioritized features that delayed Fermi rather than align with this market bulge. GPUs don’t get delayed without forewarning. AMD risked being late in order to make a smaller chip, NVIDIA risked being late to make a bigger one. These two companies are diverging.


The actual RV870

Engineering was scrambling. RV870 had to be a lot smaller yet still deliver 2x the computational power of RV770. Features had to go.


132 Comments


  • ImmortalZ - Monday, February 15, 2010 - link

    Long time reader and lurker here.

    This article is one of the best I've read here - hell, it's one of the best I've ever read on any tech site. Reading about and getting perspective on what makes companies like ATI tick is great. Thank you and please, more!
  • tygrus - Sunday, February 14, 2010 - link

    Sequences of numbers in a logical way are easier to remember than names. The RV500, RV600 .. makes order obvious. Using multiple names within a generation of chips are confusing and not memorable. They do not convey sequence or relative complexity.

    Can you ask if AMD are analysing current games/GPGPU and future games/GPGPU to identify possible areas for improvement or skip less useful proposed design changes. Like the Intel >2% gain for <1% cost.
  • Yakk - Sunday, February 14, 2010 - link

    Excellent article! As I've read in a few other comments, this article (and one similar I'd read prior) made me register for the first time, even if I've been reading this site for many years.

    I could see why "Behind the scenes" articles can make certain companies nervous and others not, mostly based on their own "corporate culture" I'd think.

    It was a very good read, and I'm sure every engineer who worked on any given generation of GPUs could have many stories to tell about tech challenges and baffling (at the time) corporate decisions. And also a manager's side of the work in navigating corporate red tape, working with people, while delivering something worthwhile as an end product is also huge. Having a good manager (people) with a good subject knowledge (tech) is rare, then for Corp. Execs. to know they have one is MUCH rarer still...

    If anyone at AMD/ATI read these comments, PLEASE look at the hardware division and try to implement changes to the software division to match their successes...

    (btw been using nv cards almost exclusively since the TNT days, and just got a 5870 for the first time this month. ATI Hardware I'd give an "A+", Software... hmm, I'd give it a "C". Funny thing is nv is almost the exact opposite right now)
  • Perisphetic - Sunday, February 14, 2010 - link

    Someone nominate this man for the Pulitzer Prize!

    As many have stated before, this is a fantastic article. It goes beyond extraordinary, exceptional and excellent. This has become my new benchmark for high quality computer industry related writing.
    Thank you sir.
  • ritsu - Monday, February 15, 2010 - link

    It's not exactly The Soul of a New Machine. But, fine article. It's nice to have a site willing to do this sort of work.
  • shaggart5446 - Sunday, February 14, 2010 - link

    very appreciative for this article im from ja but reading this make me file like ill go back to school thanks anand ur the best big up yeah man
  • 529th - Sunday, February 14, 2010 - link

    The little knowledge I have about the business of making a graphics card, that it was Eyefinity that stunted the stability-growth of the 5xxx drivers by the allocation of resources of the software engineers to make Eyefinity work.
  • chizow - Sunday, February 14, 2010 - link

    I usually don't care much for these fluff/PR pieces but this one was pretty entertaining, probably because there was less coverage of what the PR/Marketing guys had to say and more emphasis on the designers and engineers. Carrell sounds like a very interesting guy and a real asset to AMD, they need more innovators like him leading their company and less media exposure from PR talking heads like Chris Hook. Almost tuned out when I saw that intro pic, thankfully the article shifted focus quickly.

    As for the article itself, among the many interesting points made in there, a few that caught my eye:

    1) It sounds like some of the sacrifices made with RV870's die size help explain why it fell short of doubling RV770/790 in terms of performance scaling. Seems as if memory controllers might've also been cut as edge real estate was lost, and happen to be the most glaring case where RV870 specs weren't doubled with regard to RV770.

    2) The whole cloak and dagger bit with EyeFinity was very amusing and certainly helps give these soulless tech giants some humanity and color.

    3) Also with EyeFinity, I'd probably say Nvidia's solution will ultimately be better, as long as AMD continues to struggle with CrossFire EyeFinity support. It actually seems as if Nvidia is applying the same framebuffer splitting technology via PCIe/SLI link with their recently announced Optimus technology to Nvidia Surround, both of course lending technology from their Quadro line of cards.

    4) The discussion about fabs/yields was also very interesting and helps shed some light on some of the differences between the strategies used by both companies in the past to present. AMD has always leveraged new process technologies in the past as soon as possible, Nvidia in the past has more closely followed Intel's Tick/Tock cadence of building high-end on mature processes and teething smaller chips on new processes. That clearly changed this time around on 40nm so it'll be interesting to see what AMD does going forward. I was surprised there wasn't any discussion about why AMD hasn't looked into GlobalFoundries as their GPU foundry.

  • SuperGee - Sunday, February 14, 2010 - link

    nV's EyeFinity counter-solution is a fast software reaction, which is barely the same thing. You need SLI because one GPU can only drive 2 active ports. That's the main difference. So you depend on a more high-end platform: an SLI mobo and a PSU capable of feeding two graphics cards. Meanwhile ATI gives you 3 or 6 out of one GPU.
    nV can deliver something native in their next design. Equal, with the possibility of being better at it. But we are still waiting for their DX11 parts. I wonder if they could slap a solution into the refresh, or can only do it when they introduce the new architecture "GF200".

  • chizow - Monday, February 15, 2010 - link

    Actually EyeFinity's current CF problems are most likely a software problem which is why Nvidia's solution is already superior from a flexibility and scalability standpoint. They've clearly worked out the kinks of running multiple GPUs to a single frame buffer and then redistributing portions of that framebuffer to different GPU outputs.

    AMD's solution seems to have problems because output on each individual GPU is only downstream atm, so while one GPU can send frame data to a primary GPU for CF, it seems secondary GPUs have problems receiving frame data to output portions of the frame.

    Why I say Nvidia's solution is better overall is simply because the necessity of SLI will automatically decrease the chance of a poor gaming experience when gaming at triple resolutions, which is clearly a problem with some newer games and single-GPU EyeFinity. Also, if AMD was able to use multiple card display outputs, it would solve the problem of requiring a $100 active DP dongle for the 3rd output if a user doesn't have a DP capable monitor.
