NVIDIA's 1.4 Billion Transistor GPU: GT200 Arrives as the GeForce GTX 280 & 260

Name: NVIDIA's 1.4 Billion Transistor GPU: GT200 Arrives as the GeForce GTX 280 & 260
Item: NVIDIA's 1.4 Billion Transistor GPU: GT200 Arrives as the GeForce GTX 280 & 260
Author: Anand Lal Shimpi & Derek Wilson

by Anand Lal Shimpi & Derek Wilson on June 16, 2008 9:00 AM EST

Posted in
GPUs

108 Comments | Add A Comment

108 Comments

NVIDIA's Dirty Dealing with DX10.1 and How GT200 Doesn't Support it

I know many people were hoping to see DX10.1 implemented in GT200 hardware, but that is not the case. NVIDIA has opted to skip including some of the features of DX10.1 in this generation of their architecture. We are in a situation as with DX9 where SM2.0 hardware was able to do the same things as SM3.0 hardware albeit at reduced performance or efficiency. DX10.1 does not enable a new class of graphics quality or performance, but does enable more options to developers to simplify their code and it does enhance performance when coding certain effects and features.

It's useful to point out that, in spite of the fact that NVIDIA doesn't support DX10.1 and DX10 offers no caps bits, NVIDIA does enable developers to query their driver on support for a feature. This is how they can support multisample readback and any other DX10.1 feature that they chose to expose in this manner. Sure, part of the point of DX10 was to eliminate the need for developers to worry about varying capabilities, but that doesn't mean hardware vendors can't expose those features in other ways. Supporting DX10.1 is all or nothing, but enabling features beyond DX10 that happen to be part of DX10.1 is possible, and NVIDIA has done this for multisample readback and can do it for other things.

While we would love to see NVIDIA and AMD both adopt the same featureset, just as we wish AMD had picked up SM3.0 in R4xx hardware, we can understand the decision to exclude support for the features DX10.1 requires. NVIDIA is well within reason to decide that the ROI on implementing hardware for DX10.1 is not high enough to warrant it. That's all fine and good.

But then PR, marketing and developer relations get involved and what was a simple engineering decision gets turned into something ridiculous.

We know that both G80 and R600 both supported some of the DX10.1 featureset. Our goal at the least has been to determine which, if any, features were added to GT200. We would ideally like to know what DX10.1 specific features GT200 does and does not support, but we'll take what we can get. After asking our question, this is the response we got from NVIDIA Technical Marketing:

"We support Multisample readback, which is about the only dx10.1 feature (some) developers are interested in. If we say what we can't do, ATI will try to have developers do it, which can only harm pc gaming and frustrate gamers."

The policy decision that has lead us to run into this type of response at every turn is reprehensible. Aside from being blatantly untrue at any level, it leaves us to wonder why we find ourselves even having to respond to this sort of a statement. Let's start with why NVIDIA's official position holds no water and then we'll get on to the bit about what it could mean.

The statement multisample readback is the only thing some developers are interested in is untrue: cube map arrays come in quite handy for simplifying and accelerating multiple applications. Necessary? no, but useful? yes. Separate per-MRT blend modes could become useful as deferred shading continues to evolve, and part of what would be great about supporting these features is that they allow developers and researchers to experiment. I get that not many devs will get up in arms about int16 blends, but some DX10.1 features are interesting, and, more to the point, would be even more compelling if both AMD and NVIDIA supported them.

Next, the idea that developers in collusion with ATI would actively try to harm pc gaming and frustrate gamers is false (and wreaks of paranoia). Developers are interested in doing the fastest most efficient thing to get their desired result with as little trouble to themselves as possible. If a techique makes sense, they will either take it or leave it. The goal of a developer is to make the game as enjoyable as possible for as many gamers as possible, and enabling the same experience on both AMD and NVIDIA hardware is vital. Games won't come out with either one of the two major GPU vendors unable to run the game properly because it is bad for the game and bad for the developer.

Just like NVIDIA made an engineering decision about support for DX10.1 features, every game developer must weight the ROI of implementing a specific feature or using a certain technique. With NVIDIA not supporting DX10.1, doing anything DX10.1 becomes less attractive to a developer because they need to write a DX10 code path anyway. Unless a DX10.1 code path is trivial to implement, produces the same result as DX10, and provides some benefit on hardware supporting DX10.1 there is no way it will ever make it into games. Unless there is some sort of marketing deal in place with a publisher to unbalance things which is a fundamental problem with going beyond developer relations and tech support and designing marketing campaigns based on how many games dispaly a particular hardware vendors logo.

The idea that NVIDIA is going to somehow hide the capabilities of their hardware from AMD is also naive. The competition through the use of xrays, electron microscopes and other tools of reverse engineering are going to be the first to discover all the ins and outs of how a piece of silicon works once it hits the market. NIVIDA knows AMD will study GT200 because NVIDIA knows it would be foolish for them not to have an RV670 core on their own chopping block. AMD will know how best to program GT200 before developers do and independantly of any blanket list of features we happen to publish on launch day.

So who really suffers from NVIDIA's flawed policy of silence and deception? The first to feel it are the hardware enthusiasts who love learning about hardware. Next in line are the developers because they don't even know what features NVIDIA is capable of offering. Of course, there is AMD who won't be able to sell developers on support for features that could make their hardware perform better because NVIDIA hardware doesn't support it (even if it does). Finally there are the gamers who can and will never know what could have been if a developer had easy access to just one more tool.

So why would NVIDIA take this less than honorable path? The possibilities are endless, but we're happy to help with a few suggestions. It could just be as simple as preventing AMD from getting code into games that runs well on their hardware (as may have happened with Assassin's Creed). It could be that the features NVIDIA does support are incredibly subpar in performance: just because you can do something doesn't mean you can do it well and admitting support might make them look worse than denying it. It could be that the fundamental architecture is incapable of performing certain basic functions and that reengineering from the ground up would be required for DX10.1 support.

NVIDIA insists that if it reveals it's true feature set, AMD will buy off a bunch of developers with its vast hoards of cash to enable support for DX10.1 code NVIDIA can't run. Oh wait, I'm sorry, NVIDIA is worth twice as much as AMD who is billions in debt and struggling to keep up with its competitors on the CPU and GPU side. So we ask: who do you think is more likely to start buying off developers to the detriment of the industry?

Derek's Conjecture Regarding SP Pipelining and TMT GT200 vs. G80: A Clock for Clock Comparison

PRINT THIS ARTICLE

Post Your Comment
Please log in or sign up to comment.

Comments Locked

108 Comments

View All Comments

woofermazing - Tuesday, June 17, 2008 - link
Isn't the R700 high-end model going to have a direct link between the two cores. Could be a false rumor, but i would think that would solve a lot of problems with having two GPU's on a single board, since games would see it as 1 chip instead of a Crossfire/SLI setup. And besides, why the heck does it matter what the card looks like under the cooler. If it delivers better performance than Nvidia's offering without driver headaches, I don't think most gamers are going to care.
VooDooAddict - Tuesday, June 17, 2008 - link
Why am I the only one happy about this product?

Since the release of the 8800GTX top end single GPU performance has been a little stagnant... then came the refresh (8800GT/8800GTS-512) better prices came into effect.

Now we've got the new generation, and like in years prior, the new gen single GPU card has near performance of the previous gen in SLI. Price is also similar with when NVIDIA launched the first 8800GTX.

Sure, I wish they came in at a lower price point and at less power draw. (Same complaints that we had with the original 8800GTX). Lower power and lower price will come with a refresh.

Will I be getting one? ... nahh these cheap 9600GTs, overclocked 8800GT's and 8800GTSs will be the cards I recomend till i see the refresh. But I'm still happy there's progress.

I'm hoping the refresh hits around the same time as Intel's updated quad core.
DerekWilson - Tuesday, June 17, 2008 - link
i think its neat and has very interesting technology under the hood.

but i'm not gonna spend that much money for something that doesn't deliver enough value (or even performance) compared to other solutions that are available. you pretty much reflect my own sentiment there: it's another step forward but not one that you're gonna buy.

i think people "don't like it" because of that though. it just isn't worth it right now and that's certainly valid.
greenx - Tuesday, June 17, 2008 - link
There are two ways I can look at this article.

1)First an foremost at the heart of a real gamer ticks the need for good story lines fed by characters you will never forget, held by a gameplay you will fall in love with and finally covered by graphics that will transport you to another world (kinda like when I first played FF VII on my PC).

Within the context of the world we live in today I wonder what is really going through the minds of these people selling $600+ video cards. Kinda like those $10 000+ PCs. Madness. Sure they have their market up there but I shudder to think of how much money has been poured into appeasing a select few. Furthermore for what reason? Glory? I don't know but seeing as how the average gamer is what has made the PC/Gaming scene what it is, where does a $600+ video card fit into the grand scheme of things?

2) The possibilities that these new cards open up certainly seem exciting. The comparison with intel has been justified, but considering the other alternatives out there are much further ahead in development, who is going to bypass intel/amd/etc for a GPU technology based supercomputer?
DerekWilson - Tuesday, June 17, 2008 - link
two address point 2):

developers will bypass Intel, AMD, SUN, whoever owns Cray these days, and all other HPC developers when a technology comes along that can speed up their applications by two orders of magnitude immediately on hardware that costs thousands (and in large cases millions) less to build, run and develop for.
evolucion8 - Tuesday, June 17, 2008 - link
LOL that was quite funny but incorrect as well, there's more than 4 Billion of people in China, in the future probably nVidia will launch a 4 Billion Transistors GPU hehe. It will require a Nuclear Reactor to turn it on, a and two of them to play games :D
7Enigma - Wednesday, June 18, 2008 - link
4 Billion? Did you just make that out of thin air. Latest tabs show approximately 1.4 billion (give or take a couple hundred million). The world population is only estimated at 6.6 billion, so unless 60% of the people in the world are living in China, you're clueless.

http://geography.about.com/od/populationgeography/...">http://geography.about.com/od/populationgeography/...
Bahadir - Tuesday, June 17, 2008 - link
Firstly I must say I enjoyed reading the whole article written by Anand Lal Shimpi & Derek Wilson. However, what does not make sense to me is the fact that "At most, 105 NVIDIA GT200 die can be produced on a single 300mm 65nm wafer from TSMC", but by looking at the wafer, only 95 full dies can be seen. Is this the wrong die?

Also, it is not fair to compare the die of the Penryn against the GTX 280die because Penryn's die was made in 45nm process and GTX280 was made in 65nm die. Maybe it would be fair to compare it with the Conroe (65nm) die. But well done folks for putting an excellent article together!
Anand Lal Shimpi - Tuesday, June 17, 2008 - link
Thanks for your kind words btw :) Both of us really appreciate it - same to everyone else in this thread, thanks for making a ridiculously long couple of weeks (and a VERY long night) worth it :)

-A
Anand Lal Shimpi - Tuesday, June 17, 2008 - link
You're right, there's actually a maximum of 94 usable die per wafer :)

Take care,
Anand

NVIDIA's 1.4 Billion Transistor GPU: GT200 Arrives as the GeForce GTX 280 & 260

NVIDIA's Dirty Dealing with DX10.1 and How GT200 Doesn't Support it

Post Your Comment

108 Comments

View All Comments

woofermazing - Tuesday, June 17, 2008 - link

VooDooAddict - Tuesday, June 17, 2008 - link

DerekWilson - Tuesday, June 17, 2008 - link

greenx - Tuesday, June 17, 2008 - link

DerekWilson - Tuesday, June 17, 2008 - link

evolucion8 - Tuesday, June 17, 2008 - link

7Enigma - Wednesday, June 18, 2008 - link

Bahadir - Tuesday, June 17, 2008 - link

Anand Lal Shimpi - Tuesday, June 17, 2008 - link

Anand Lal Shimpi - Tuesday, June 17, 2008 - link

Log in

Don't have an account? Sign up now