Introduction

Imagine if support for current generation graphics technology didn't require spending more than $79. Sure, performance wouldn't be great, and you'd be limited to lower resolutions. But the latest games would all run with the latest features. All the excellent water effects in Half-Life 2 would be there. Far Cry would run in all its SM 3.0 glory. Any game released up to a good year into the DirectX 10 timeframe would run (albeit slowly) feature-complete on your impressively cheap card.

A solution like this isn't targeted at the hardcore gamer, but at the general purpose user. It keeps people from buying hardware that's obsolete before they get it home. The idea is that being cheap doesn't need to translate to being "behind the times" in technology. This gives casual consumers the ability to see what having a "real" graphics card is like. Games will look much better running on a full DX9 SM 3.0 part that "supports" 128MB of RAM (we'll talk about that later) than on an Intel integrated solution. Shipping cheaper cards in higher volume and getting more people into gaming raises the bar on the minimum requirements that game developers can target. The sooner NVIDIA and ATI can get current generation parts into the game-buying world's hands, the sooner all game developers can write games for DX9 hardware at a base level rather than as an extra.

In the past, we've seen parts like the GeForce 4 MX, which was just a repackaged GeForce 2. Even today, we have the X300 and X600, which are based on the R3xx architecture, but share the naming convention of the R4xx. It really is refreshing to see NVIDIA take a stand and create a product lineup that can run games the same way from the top of the line to the cheapest card out there (the only differences being speed and the performance hit of applying filtering). We hope (if this part ends up doing well and finding a good price point for its level of performance) that NVIDIA will continue to maintain this level of continuity through future chip generations. We also hope that ATI will follow suit with their lineup next time around. Relying on previous generation higher end parts to fulfill current lower end needs is not something that we want to see in the long term.

We've actually already taken a look at the part that NVIDIA will be bringing out in two new flavors. The 3 vertex/4 pixel/2 ROP GeForce 6200 that came out only a couple of months ago is being augmented by two lower performance versions, both bearing the moniker GeForce 6200 with TurboCache.



It's passively cooled, as we can see. The single memory module of this board is peeking out from beneath the heatsink on the upper right. NVIDIA has indicated that a higher performance version of the 6200 with TurboCache will follow to replace the current shipping 6200 models. Though better than non-existent parts such as the X700 XT, we would rather not see short-lived products hit the market. In the end, such anomalies only serve to waste the time of NVIDIA's partners and confuse customers.

For now, the two parts that we can expect to see will be differentiated by their memory bandwidth. The part priced at "under $129" will be a "13.6 GB/s" setup, while the "under $99" card will sport "10.8 GB/s" of bandwidth. Both will run core and memory clocks at 350/350. The interesting part is the bandwidth figure. In both cases, 8 GB/s of that bandwidth comes from the PCI Express bus. For the 10.8 GB/s part, the extra 2.8 GB/s comes from 16MB of local memory connected on a single 32-bit channel running at a 700MHz data rate (4 bytes per transfer at 700 million transfers per second). The 13.6 GB/s version of the 6200 with TurboCache just gets an extra 32-bit channel with another 16MB of RAM, for 5.6 GB/s of local bandwidth. We've seen pictures of boards with 64MB of onboard RAM, pushing bandwidth way up. We don't know when we'll see a 64MB product ship, or what the pricing would look like.
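The arithmetic behind those marketing figures can be checked with a quick sketch. The function and constant names here are our own, chosen for illustration; the only inputs are the numbers quoted above (8 GB/s for PCI Express, 32-bit channels, a 700MHz data rate), with 1 GB taken as 10^9 bytes.

```python
# Back-of-the-envelope check of the quoted bandwidth figures.
# All names are illustrative; the inputs are the numbers quoted in the article.

PCIE_X16_GBPS = 8.0  # 4 GB/s up + 4 GB/s down, as quoted


def local_bandwidth_gbps(bus_width_bits: int, data_rate_mhz: float) -> float:
    """Peak local memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mhz * 1e6 / 1e9


def quoted_bandwidth_gbps(channels: int, data_rate_mhz: float = 700.0) -> float:
    """PCI Express bandwidth plus local bandwidth for N 32-bit channels."""
    return PCIE_X16_GBPS + local_bandwidth_gbps(32 * channels, data_rate_mhz)


if __name__ == "__main__":
    for channels in (1, 2):
        total = quoted_bandwidth_gbps(channels)
        print(f"{channels} x 32-bit channel(s): {total:.1f} GB/s")
```

Note that this simply adds the two peak numbers together the way the marketing figure does; it says nothing about what is achievable in practice.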

So, to put it all together, either 112MB or 96MB of the framebuffer is stored in system RAM and accessed via the PCI Express bus, depending on whether the card carries 16MB or 32MB locally. Local graphics RAM holds the front buffer (what's currently on screen) and other high priority (low latency) data. If more memory is needed than is available locally, it is allocated dynamically from system RAM. The local graphics memory that is not set aside for high priority tasks is then used as a sort of software managed cache. And thus, the name of the product is born.
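The split between local and system memory is just a subtraction, sketched below. The function name is hypothetical, and the 256MB/64MB row is taken from the update near the end of this page rather than from the two cards discussed here.

```python
# How much of the "supported" framebuffer can end up in system RAM.
# Illustrative only: the function name is ours, and this is a maximum,
# since system RAM is allocated dynamically rather than reserved up front.


def system_ram_share_mb(supported_mb: int, local_mb: int) -> int:
    """Maximum framebuffer that may be allocated from system RAM, in MB."""
    if local_mb > supported_mb:
        raise ValueError("local memory cannot exceed the supported size")
    return supported_mb - local_mb


if __name__ == "__main__":
    for supported, local in ((128, 16), (128, 32), (256, 64)):
        share = system_ram_share_mb(supported, local)
        print(f"{supported}MB supported, {local}MB local: up to {share}MB system RAM")
```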

The new technology here is allowing writes directly from the GPU to system RAM. We've been able to perform reads from system RAM for quite some time, though technologies like AGP texturing were slow and never delivered on their promises. With a few exceptions, the GPU is able to see system RAM as a normal framebuffer, which is very impressive for PCI Express and current memory technology.

But it's never that simple. There are some very interesting problems to deal with when using system RAM as a framebuffer; this is not simply a driver-based software solution. The foremost and ever pressing issue is latency. Going from the GPU, across the PCI Express bus, through the memory controller, into system RAM, and all the way back is a very long round trip. Considering the fact that graphics cards are used to having near-instant access to data, something is going to have to give. And sure, the PCI Express bus may be 8 GB/s (4 up and 4 down, though it's less if you talk about actual utilization), but we are only going to be getting 6.4 GB/s out of the RAM. And that's assuming zero CPU utilization of memory and nothing else going on in the system, only what we're doing with the graphics card.
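To make the bottleneck concrete, here is a rough model of the best-case throughput on the system memory path. This is our own simplification, not NVIDIA's description of the hardware: it assumes each PCI Express direction tops out at 4 GB/s, that system RAM tops out at 6.4 GB/s for reads and writes combined, and it ignores latency, protocol overhead, and CPU traffic entirely. The names and the `read_fraction` parameter are hypothetical.

```python
# Best-case throughput to system RAM under a simple three-limit model.
# Assumptions (ours): 4 GB/s per PCI Express direction, 6.4 GB/s total
# from system RAM, no latency, no protocol overhead, no CPU traffic.

PCIE_PER_DIRECTION_GBPS = 4.0
SYSTEM_RAM_GBPS = 6.4


def peak_system_path_gbps(read_fraction: float) -> float:
    """Upper bound on combined read+write traffic for a given read share."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    limits = [SYSTEM_RAM_GBPS]
    if read_fraction > 0.0:
        # Reads all travel in one direction across the bus.
        limits.append(PCIE_PER_DIRECTION_GBPS / read_fraction)
    if read_fraction < 1.0:
        # Writes all travel in the other direction.
        limits.append(PCIE_PER_DIRECTION_GBPS / (1.0 - read_fraction))
    return min(limits)


if __name__ == "__main__":
    for share in (0.0, 0.5, 1.0):
        print(f"read share {share:.2f}: {peak_system_path_gbps(share):.1f} GB/s")
```

Under these assumptions, an even mix of reads and writes is capped by the RAM at 6.4 GB/s, while traffic flowing in only one direction is capped by the bus at 4 GB/s. Either way, the 8 GB/s headline figure is out of reach.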

Let's take a closer look at why anyone would want to use system RAM as a framebuffer, and how NVIDIA has tried to solve the problems that lie within.

UPDATE: We received an email from NVIDIA updating us on a change they have made to the naming of their TurboCache products. It seems they have listened to us and are including physical memory sizes on marketing and packaging. Here's what the product names will look like:

GeForce 6200 w/ TurboCache supporting 128MB, including 16MB of local TurboCache: $79
GeForce 6200 w/ TurboCache supporting 128MB, including 32MB of local TurboCache: $99
GeForce 6200 w/ TurboCache supporting 256MB, including 64MB of local TurboCache: $129

We were off on pricing a little bit, as the $129 figure we heard was actually for the 64MB/256MB part, and the 64-bit version we tested (which supports only 128MB) actually hits the price point we are looking for.


43 Comments


  • PrinceGaz - Thursday, December 16, 2004

    #28- see page 2 of the article, the text just above the diagram near the bottom of the page "Even on the 915 chipset from Intel, bandwidth is limited across the PCI Express bus. Rather than a full 4GB/s up and 4GB/s down, Intel offers only 3GB/s up and 1GB/s down..."

    #25- I'd also always assumed that all PCIe x16 sockets could support 4GB/s both ways, this is the first time I've heard otherwise. And it isn't even 4/1, it's 3/1 according to the info given.

    Derek- is this limited PCIe x16 bandwidth common to all chipsets?
  • DerekWilson - Thursday, December 16, 2004

    We tested the 32MB 64-bit $99 version of the card that "supports" a 128MB framebuffer.

    #31 is correct -- the maximum of 112 or 96 (or 192 for the 256 MB version) of system RAM is not statically mapped. It's always available to the system under 2D operation. Under 3D, it's not likely that the entire framebuffer would be absolutely full at any given time anyway.
  • Alphafox78 - Thursday, December 16, 2004

    doesn't it dynamically allocate the extra memory it needs? so this would just affect games then if it needed more, not regular apps that don't need lots of video memory.
  • rqle - Thursday, December 16, 2004

    so total cost of these cards is the card price + (price of 128MB worth of DDR at the time)?
  • Maverick2002 - Thursday, December 16, 2004

    I'm likewise confused. At the end of the review they say:

    "There will also be a 64MB 64-bit TC part (supporting 256MB) available for $129 coming down the pipeline at some point, though we don't have that part in our labs just yet."

    Didn't they just test this card???
  • KalTorak - Thursday, December 16, 2004

    #25 - huh? (I have no idea what that term means in the context of PCIe, and I know PCIe pretty well...)
  • KayKay - Thursday, December 16, 2004

    I think this is a good product, I think it could be a very good part for companies like Dell, if they include it in their systems. cheaper than the X300 SEs they currently include, but better performance, and will appeal to that type of customer
  • mczak - Wednesday, December 15, 2004

    #24, from the description it sounds like for the Radeon IGP there is no problem with using both sideport and system memory simultaneously for directly rendering into (the interleaved mode sounds exactly like part of all buffers would be allocated in system memory, though maybe that's not what is meant).
  • IntelUser2000 - Wednesday, December 15, 2004

    WTF!! I never knew Intel's 915 chipsets used a 4/1GB implementation of PCI Express!! Even Anandtech's own article didn't say that, they said 4/4.
  • DerekWilson - Wednesday, December 15, 2004

    As far as I understand Hypermemory, it is not capable of rendering directly to system memory.

    Also, when Hypermemory needs to go to allocate system RAM for anything, there is a very noticeable performance hit.

    We tested the 16MB/32-bit and the 32MB/64-bit

    The 64MB version available is only 64-bit ... NVIDIA uses four 8M x 16 memory chips.
