NVIDIA's Bumpy Ride: A Q4 2009 Update

Author: Anand Lal Shimpi

by Anand Lal Shimpi on October 14, 2009 12:00 AM EST

Posted in
GPUs

There’s a lot to talk about with regard to NVIDIA and no time for a long intro, so let’s get right to it.

At the end of our Radeon HD 5850 Review we included this update:

“Update: We went window shopping again this afternoon to see if there were any GTX 285 price changes. There weren't. In fact GTX 285 supply seems pretty low; MWave, ZipZoomFly, and Newegg only have a few models in stock. We asked NVIDIA about this, but all they had to say was "demand remains strong". Given the timing, we're still suspicious that something may be afoot.”

Less than a week later, there were stories everywhere about NVIDIA’s GT200b shortages. Fudo said that NVIDIA was unwilling to drop prices low enough to make the cards competitive. Charlie said that NVIDIA was going to abandon the high end and upper mid range graphics card markets completely.

Let’s look at what we do know. GT200b has around 1.4 billion transistors and is made at TSMC on a 55nm process. Wikipedia lists the die at 470mm^2, roughly 80% the size of the original 65nm GT200 die. In either case it’s a lot bigger and still more expensive than Cypress’ 334mm^2 40nm die.

Cypress vs. GT200b die sizes to scale

NVIDIA could get into a price war with AMD, but given that both companies make their chips at the same place and NVIDIA’s costs are higher, it’s not a war that makes sense to fight.

NVIDIA told me two things. One, that they have told some OEMs that they will no longer be making GT200b based products. That’s the GTX 260 all the way up to the GTX 285. The EOL (end of life) notices went out recently, and they ask that OEMs submit their allocation requests ASAP; otherwise they risk not getting any cards.

The second was that despite the EOL notices, end users should be able to purchase GeForce GTX 260, 275 and 285 cards all the way up through February of next year.

If you look carefully, neither of these statements directly supports or refutes the two articles above. NVIDIA is very clever.

NVIDIA’s explanation to me was that current GPU supplies were decided on months ago, and in light of the economy, the number of chips NVIDIA ordered from TSMC was low. Demand ended up being stronger than expected and thus you can expect supplies to be tight in the remaining months of the year and into 2010.

Board vendors have been telling us that they can’t get allocations from NVIDIA. Some are even wondering whether it makes sense to build more GTX cards for the end of this year.

If you want my opinion, it goes something like this. While RV770 caught NVIDIA off guard, Cypress did not. AMD used the extra area (and then some) allowed by the move to 40nm to double RV770, not an unpredictable move. NVIDIA knew they were going to be late with Fermi, knew how competitive Cypress would be, and made a conscious decision to cut back supply months ago rather than enter a price war with AMD.

While NVIDIA won’t publicly admit defeat, AMD clearly won this round. Obviously it makes sense to ramp down the old product in expectation of Fermi, but I don’t see Fermi with any real availability this year. We may see a launch with performance data in 2009, but I’d expect availability in 2010.

While NVIDIA just launched its first 40nm DX10.1 parts, AMD just launched $120 DX11 cards

Regardless of how you want to phrase it, there will be lower than normal supplies of GT200 cards in the market this quarter. With higher costs than AMD per card and better performance from AMD’s DX11 parts, would you expect things to be any different?

Things Get Better Next Year

NVIDIA launched GT200 on too old a process (65nm) and was thus late moving to 55nm. Bumpgate happened. Then we had the issues with 40nm at TSMC and Fermi’s delays. In short, it hasn’t been the best 12 months for NVIDIA. Next year, though, there’s reason to be optimistic.

When Fermi does launch, everything from that point should theoretically be smooth sailing. There aren’t any process transitions in 2010; it’s all about execution at that point and how quickly NVIDIA can get Fermi derivatives out the door. AMD will have virtually its entire product stack out by the time NVIDIA ships Fermi in quantity, but NVIDIA should have competitive product out in 2010. AMD wins the first half of the DX11 race; the second half will be a bit more challenging.

If anything, NVIDIA has proved to be a resilient company. Other than Intel, I don’t know of any company that could’ve recovered from NV30. The real question is how strong Fermi 2 will be. Stumble twice and you’re shaken; do it a third time and you’re likely to fall.

106 Comments

AnandThenMan - Wednesday, October 14, 2009 - link
Leave it to Scali to regurgitate the same old same old.
TGressus - Wednesday, October 14, 2009 - link
It's always the same, man. When ATI/AMD is down people get interested in their comeback story too.

I've always wondered why people bother to "take a side". How'd that work out with Blu-Ray? Purchased many BD-R DL recently?

Personally, I'd like to see more CPU and GPU companies. Not fewer.
Scali - Thursday, October 15, 2009 - link
What comeback story?
My point was that it wouldn't be the first time that the bigger, more expensive GPU was the best bang for the buck.
It isn't about taking sides or comebacks at all.
I'm interested in Fermi because I'm a technology enthusiast and developer. It sounds like an incredible architecture. It has nothing to do with the fact that it happens to have the 'nVidia' brand attached to it. If it was AMD that came up with this architecture, I'd be equally interested.
But let's just view it from a neutral, technical point of view. AMD didn't do all that much to its architecture this time, apart from extending it to support the full DX11 featureset. It will not do C++, it doesn't have a new cache hierarchy approach, it won't be able to run multiple kernels concurrently, etc etc. There just isn't as much to be excited about.
Intel however... now their Larrabee is also really cool. I'm excited to see what that is going to lead to. I just like companies that go off the beaten path and try new approaches, take risks. That's why I'm an enthusiast. I like new technology.
At the end of the day, if both Fermi and Larrabee fail, I'll just buy a Radeon. Boring, but safe.
Scali - Wednesday, October 14, 2009 - link
"Fermi devotes a significant portion of its die to features that are designed for a market that currently isn’t generating much revenue."

The word 'devotes' is in sharp contrast with what Fermi aims to achieve: a more generic programmable processor.
In a generic processor, you don't really 'devote' anything to anything, your execution resources are just flexible and can be used for many tasks.
Even today's designs from nVidia do the same. The execution units can be used for standard D3D/OpenGL rendering, but they can also be used for PhysX (gaming market), video encoding (different market), Folding@Home (different market again), PhotoShop (another different market), HPC (yet another market), to name but a few things.
So 'devoted', and 'designed for a market'? Hardly.
Sure, the gaming market may generate the most revenue, but nVidia is starting to tap into all these other markets now. It's just added revenue, as long as the gaming performance doesn't suffer. And I don't see any reason for Fermi's gaming performance to suffer. I think nVidia's next generation is going to outperform AMD's offerings by a margin.
wumpus - Thursday, October 15, 2009 - link
Go back and read the white paper. Nvidia plans to produce a chip that computes roughly half as many double precision floating point multiplies as single precision ones. This means that they have doubled the number of transistors in the multipliers so that they can keep up with the rest of the chip in double mode (1 double or two singles both produce 8 bytes that need to be routed around the chip).

There is no way to deny that this takes more transistors. Simply put, if each letter represents 16 bits, two singles represent:

(a0)(a1)*(b0)(b1) = 2^32*(a0b0) + 2^16*(a0b1) + 2^16*(a1b0) + (a1b1)
(c0)(c1)*(d0)(d1) = 2^32*(c0d0) + 2^16*(c0d1) + 2^16*(c1d0) + (c1d1)

But if you have to multiply one double you get

(a0)(a1)(a2)(a3)*(b0)(b1)(b2)(b3) =
2^48*[2^48*(a0b0) + 2^32*(a0b1) + 2^16*(a0b2) + (a0b3)]
+2^32*[2^48*(a1b0) + 2^32*(a1b1) + 2^16*(a1b2) + (a1b3)]
+2^16*[2^48*(a2b0) + 2^32*(a2b1) + 2^16*(a2b2) + (a2b3)]
+[2^48*(a3b0) + 2^32*(a3b1) + 2^16*(a3b2) + (a3b3)]

Which works out to twice the work (16 partial products instead of 8). Of course, the entire chip isn't multipliers, but they make up a huge chunk. Somehow I don't think either ATI or nvidia is going to say exactly what percentage of the chip is made up of multipliers. I do expect that it is steadily going down, and if such arrays keep being made, they will all eventually use double precision (and possibly full ieee754 with all the rounding that entails).
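
The counting argument above can be checked with a rough sketch. It follows the comment's simplification and treats a "single" as a 32-bit integer and a "double" as a 64-bit integer, each split into 16-bit pieces; real floating point mantissas are narrower (24 and 53 bits) and real hardware multipliers are organized differently, so this only illustrates the 8 versus 16 count. The names `to_limbs` and `schoolbook_multiply` are made up for this example.

```python
LIMB_BITS = 16
MASK = (1 << LIMB_BITS) - 1


def to_limbs(value, count):
    """Split value into `count` 16-bit pieces, most significant first."""
    return [(value >> (LIMB_BITS * (count - 1 - i))) & MASK for i in range(count)]


def schoolbook_multiply(x, y, limbs):
    """Multiply x and y using only 16x16-bit partial products.

    Returns (product, number_of_partial_products).
    """
    xs = to_limbs(x, limbs)
    ys = to_limbs(y, limbs)
    total = 0
    partials = 0
    for i, xi in enumerate(xs):
        for j, yj in enumerate(ys):
            # Weight of this partial product: 2^(16 * (positions from the right))
            shift = LIMB_BITS * ((limbs - 1 - i) + (limbs - 1 - j))
            total += (xi * yj) << shift
            partials += 1
    return total, partials


# Two "single" (32-bit) multiplies: 2 x 2 pieces each.
p1, n1 = schoolbook_multiply(0x12345678, 0x9ABCDEF0, 2)
p2, n2 = schoolbook_multiply(0xDEADBEEF, 0x0BADF00D, 2)
single_pair_partials = n1 + n2

# One "double" (64-bit) multiply: 4 x 4 pieces.
p3, double_partials = schoolbook_multiply(0x123456789ABCDEF0, 0x0FEDCBA987654321, 4)

print(single_pair_partials, double_partials)
```

Under these assumptions the two 32-bit multiplies need 4 partial products each, while the one 64-bit multiply needs 16, which is where the "twice the work" figure comes from.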
Scali - Saturday, October 17, 2009 - link
My point is that the transistors aren't 'dedicated' to DP.
They just make each single unit capable of both SP and DP. So the same logic that is used for DP is also re-used for SP, and as such the unit isn't dedicated. It's multi-functional.

Besides, they probably didn't just double up the transistor count to get from SP to DP.
I think it's more likely that they'll use a scheme like Intel's SSE units. In Intel's case you can either process 4 packed SP floats in parallel, or 2 packed DP floats, with the same unit. This would also make it more logical why the difference in speed is a factor 2.
Namely, if you take the x87 unit, it can always process only one number at a time, but SP isn't twice as fast as DP. Since you always use a full DP unit, SP only benefits from early-out, which doesn't gain that much on most operations (eg add/sub/mul).
So I don't think that Fermi is just a bunch of full DP ALUs which will run with 'half the transistors' when doing SP math. Rather, I think they will just 'split' the DP units in some clever way that they can process two SP numbers at a time (or fuse two SP units to process one DP number, however you look at it). This only requires you to double up a relatively small part of the logic, you split up your internal registers.
Zool - Wednesday, October 14, 2009 - link
Maybe, but you forget one thing. ATI could without any problem pull out a 5890 (with faster clocks and maybe 384-bit memory) in Q1 2010, or a whole new chip somewhere in Q2 2010.
So it doesn't change the fact that they are late. In this position it will be hard for nvidia if ATI can always make the first move.
Scali - Wednesday, October 14, 2009 - link
A 5890 doesn't necessarily have to be faster than Fermi. AMD's current architecture isn't THAT strong. It's the fastest GPU on the market, then again, it's the only high-end GPU that leverages 40 nm and GDDR5. So it's not all that surprising.
Fermi will not only leverage 40 nm and GDDR5, but also aim at a scale above AMD's architecture.

AMD may make the first move, but it doesn't have to be the better move.
Assuming Fermi performance is in order, I very much believe that nVidia made the right move. Where AMD just patched up their DX10.1 architecture to support DX11 features, nVidia goes way beyond DX11 with an entirely new architecture.
The only thing that could go wrong with Fermi is that it doesn't perform well enough, but it's too early to say anything about that now. Other than that, Fermi will mark a considerable technological lead of nVidia over AMD.
tamalero - Sunday, October 18, 2009 - link
And you know this... based on what facts?
The "can of whoopass" from nvidia's marketing?
AnandThenMan - Wednesday, October 14, 2009 - link
"The only thing that could go wrong with Fermi is that it doesn't perform well enough"

Really? You really believe that? So if it has a monstrous power draw, is extremely expensive, is 6 months late (even longer for scaled down parts), has low yields etc., that's a-okay? Not to mention a new architecture always has software challenges to make the most of it.
