Lots More Compute, a Leetle More Texturing

NVIDIA's GT200 GPU delivers a significant increase in computational power thanks to its 240 streaming processors, up from 128 in the previous G80 design. All of that extra hardware drives a tremendous increase in transistor count over the previous generation: 1.4 billion, up from 686 million in G80.

The increase in GT200's compute power is not mirrored, however, by a matching increase in texture processing power. On the previous page we outlined how the Texture/Processing Clusters went from two Streaming Multiprocessors to three, and how there are now a total of ten TPCs in the chip, up from eight in the GeForce 8800 GTX.

In the original G80 core, used in the GeForce 8800 GTX, NVIDIA's texture block looked like this:

In each block you had 4 texture address units and 8 texture filtering units.

With the move to G92, used in the GeForce 8800 GT, 8800 GTS 512 and 9800 GTX, NVIDIA doubled the number of texture address units and achieved a 1:1 ratio of address/filtering units:

With GT200 in the GeForce GTX 280/260, NVIDIA kept the address-to-filtering ratio at 1:1 but increased the ratio of SPs to texture processors:

In G92 you had 8 address and 8 filtering units per TPC (for 16 streaming processors); in GT200 you have the same 8 address and 8 filtering units, but serving a larger TPC with 24 SPs.

Here's how the specs stand up across the generations:

NVIDIA Architecture Comparison     G80    G92    GT200
Streaming Processors per TPC        16     16       24
Texture Address Units per TPC        4      8        8
Texture Filtering Units per TPC      8      8        8
Total SPs                          128    128      240
Total Texture Address Units         32     64       80
Total Texture Filtering Units       64     64       80

For an 87.5% increase in compute, there's a mere 25% increase in texture processing power. This ratio echoes what NVIDIA has been preaching for years: games are running more complex shaders and are not as bound by texture processing as they once were. If that weren't true, we'd expect GT200 to outperform G80 at the same clock by something closer to 25%, rather than by much more.
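As a quick back-of-the-envelope check, here's a minimal sketch (in Python) that derives those totals and percentages from nothing but the per-TPC unit counts in the table above:

```python
# Per-TPC unit counts and TPC totals, taken from the comparison table above
chips = {
    "G80":   {"tpcs": 8,  "sps_per_tpc": 16, "filtering_per_tpc": 8},
    "GT200": {"tpcs": 10, "sps_per_tpc": 24, "filtering_per_tpc": 8},
}

def totals(name):
    c = chips[name]
    return c["tpcs"] * c["sps_per_tpc"], c["tpcs"] * c["filtering_per_tpc"]

g80_sps, g80_tex = totals("G80")        # 128 SPs, 64 filtering units
gt200_sps, gt200_tex = totals("GT200")  # 240 SPs, 80 filtering units

print(f"Compute increase: {gt200_sps / g80_sps - 1:.1%}")  # 87.5%
print(f"Texture increase: {gt200_tex / g80_tex - 1:.1%}")  # 25.0%
```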

It also means that GT200's performance advantage over G80 or G92 based architectures (e.g. the GeForce 9800 GTX) will be determined largely by how computationally bound the games we're testing are.

This shifting ratio of compute to texture power has been evident in NVIDIA architectures for years now, dating back to the ill-fated GeForce FX. NVIDIA sacrificed memory bandwidth on the GeForce FX, equipping it with a narrow 128-bit memory bus (compared to ATI's 256-bit interface on the Radeon 9700 Pro) and instead focused on building a much more powerful compute engine. Unfortunately, that was the wrong bet to make at the time, and the GeForce FX was hardly competitive (for more reasons than just a lack of memory bandwidth), but today we're dealing with a very different world. Complex shader programs run on every pixel on the screen, and there's a definite need for more compute power in today's GPUs.

An Increase in Rasterization Throughput

In addition to GT200's 25% increase in texture processing capability, NVIDIA added two more ROP partitions to the GPU. The GeForce 8800 GTX had six ROP partitions, each capable of outputting a maximum of 4 pixels per clock; GT200 brings the total to eight.

With eight ROP partitions the GT200 can now output a maximum of 32 pixels per clock, up from 24 pixels per clock in the GeForce 8800 GTX and 9800 GTX.

The pixel blend rate on G80/G92 was half-speed, meaning that while you could output 24 pixels per clock, you could only blend 12 pixels per clock. Thanks to the 65nm shrink and redesign, GT200 can now output and blend pixels at full speed - that's 32 pixels per clock for each.
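To put the ROP math in concrete terms, here's a small sketch of the peak per-clock pixel output and blend rates described above (assuming, per the text, 4 pixels per clock per ROP partition, half-speed blending on G80/G92, and full-speed blending on GT200):

```python
PIXELS_PER_ROP_PARTITION = 4  # each ROP partition outputs up to 4 pixels/clock

def pixel_rates(rop_partitions, full_speed_blend):
    """Return (output, blend) rates in pixels per clock."""
    output = rop_partitions * PIXELS_PER_ROP_PARTITION
    blend = output if full_speed_blend else output // 2
    return output, blend

print("G80/G92:", pixel_rates(6, full_speed_blend=False))  # (24, 12)
print("GT200:  ", pixel_rates(8, full_speed_blend=True))   # (32, 32)
```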

The end result is a non-linear performance improvement in everything from anti-aliasing and fire effects to shadows on GT200. It's an evolutionary change, but that really does sum up many of the enhancements of GT200 over G80/G92.

Comments

  • gigahertz20 - Monday, June 16, 2008 - link

    I think these ridiculous prices and lackluster performance are just a way for them to sell more SLI motherboards. Who would buy a $650 GTX 280 when you can buy two 8800 GTs with an SLI mobo and get better performance? Especially now that the 8800 GTs are approaching $150.
  • crimson117 - Monday, June 16, 2008 - link

    It's only worth riding the bleeding edge when you can afford to stay there with every release. Otherwise, 12 months down the line, you have no budget left for an upgrade, while everyone else is buying new $200 cards that beat your old $600 card.

    So yeah you can buy an 8800GT or two right now, and you and me should probably do just that! But Richie Rich will be buying 2x GTX 280's, and by the time we could afford even one of those, he'll already have ordered a pair of whatever $600 cards are coming out next.
  • 7Enigma - Tuesday, June 17, 2008 - link

    Nope, the majority of these cards go to Alienware/Falcon/etc. top of the line, overpriced pre-built systems. These are for the people that blow $5k on a system every couple years, don't upgrade, might not even seriously game, they just want the best TODAY.

    They are the ones that blindly check the bottom box in every configuration for the "fastest" computer money can buy.
  • gigahertz20 - Monday, June 16, 2008 - link

    Very few people are richie rich and stay at the bleeding edge. People who are very wealthy tend not to be computer geeks, and they purchase their computers from Dell and whatnot. I'd say at least 96% of gamers out there are value oriented; these $650 cards will not sell much at all. If anything, you'll see people claim to have bought one or two of these in forums and other places, but they're just lying.
  • perzy - Monday, June 16, 2008 - link

    Well, I for one am waiting for Larrabee. Maybe (probably) it isn't all that it's cracked up to be, but I want to see.

    And what about some real power saving, Nvidia?
  • can - Monday, June 16, 2008 - link

    I wonder if you can just flash the BIOS of the 260 to get it to operate as if it were a 280...
  • 7Enigma - Tuesday, June 17, 2008 - link

    You haven't been able to do this for a long time....they learned their lessons the hard way. :)
  • Nighteye2 - Monday, June 16, 2008 - link

    Is it just me, or does this focus on compute power mean Nvidia is starting to get serious about using the GPU for physics as well as graphics? It's also in line with the Ageia acquisition.
  • will889 - Monday, June 16, 2008 - link

    At the point where NV has actually managed to position SLI mobos and GPUs such that you actually need that much power to get decent FPS (above 30 average) from games, gaming on the PC will be entirely dead to all but the most esoteric. It would be different if there were any games worth playing, or as many games as the console brethren have. I thought GPUs/cases/power supplies were supposed to become more efficient? E.g. smaller but faster, sort of how the TV industry made TVs bigger yet smaller in footprint with way more features - not towering cases with 1200W PSUs and 2x GTX 280 GPUs? All this in the face of drastically raised gas prices?

    Wanna impress me? How about a single GPU with the PCB size of a 7600GT/GS that's 15-25% faster than a 9800GTX, fits into a SFF case, needs a small power supply AND can run passively @ moderate temps. THAT would be impressive. No, Sergeant Tom and his TONKA_TRUCK crew just have to show how beefy their toys can be and yank your wallet chains for it. Hell, everyone needs a Boeing 747 in their case, right? 'Cause that's progress, for those 1-2 gaming titles per year that give you 3-4 hours of enjoyable PC gaming...

    /off box
  • ChronoReverse - Tuesday, June 17, 2008 - link

    The 4850 might actually hit that target...
