It's Really Not Scanline Interleaving

So, how does this thing actually work? Well, when NVIDIA was designing NV4x, they decided it would be a good idea to include a section on the chip designed specifically to communicate with another GPU in order to share rendering duties. Through a combination of this block of transistors, the connection on the video card, and a bit of software, NVIDIA is able to leverage the power of two GPUs at a time.




NV40 core with SLI section highlighted.


As the title of this section should indicate, NVIDIA SLI is not Scanline Interleaving. NVIDIA's choice of this moniker comes down to ownership and marketing: when they acquired 3dfx, the rights to the SLI name went along with it. In its day, SLI was very well known for combining the power of two 3D accelerators. The original technology rendered even scanlines on one GPU and odd scanlines on the other, and the analog output of both GPUs was then combined (generally via a network of pass-through cables) to produce a final signal to send to the monitor. Love it or hate it, it's a very interesting marketing choice on NVIDIA's part, and the new technology has nothing to do with its namesake. Here's what's really going on.
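
For contrast, here is a minimal sketch of how the 3dfx-style even/odd division might look if it were done in software; the FakeGPU class and render_scanline() call are purely illustrative stand-ins for what the old accelerators did in hardware.

    # Illustrative sketch of classic scanline interleaving (3dfx-era SLI).
    # FakeGPU is a hypothetical stand-in for one of the two accelerators.
    class FakeGPU:
        def __init__(self, name):
            self.name = name

        def render_scanline(self, scene, y):
            # A real accelerator would rasterize row y of the scene.
            return f"{self.name}: row {y} of {scene}"

    def render_frame_interleaved(gpu_even, gpu_odd, scene, height):
        # Even rows go to one GPU, odd rows to the other; the real merge
        # happened in the analog domain via pass-through cables.
        return [(gpu_even if y % 2 == 0 else gpu_odd).render_scanline(scene, y)
                for y in range(height)]

    frame = render_frame_interleaved(FakeGPU("Voodoo2 #1"), FakeGPU("Voodoo2 #2"),
                                     "nature", height=8)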

First, software (presumably in the driver) analyzes what's going on in the scene currently being rendered and divides it between the GPUs. The goal of this (patent-pending) load balancing software is to split the work 50/50 based on the amount of rendering power it will take. It might not be that each card renders 50% of the final image, but it should be that it takes each card the same amount of time to finish rendering its part of the scene (be it larger or smaller than the part the other GPU tackled). In the presentation that NVIDIA sent us, they diagrammed how this might work for one frame of 3DMark's nature scene.




This shows one GPU rendering the larger, but less complex, portion of the scene.


Since the work is split on the way from the software to the hardware, everything from geometry and vertex processing to pixel shading and anisotropic filtering is divided between the GPUs. This is a step up from the original SLI, which only split the pixel-pushing power of the chips.
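
To illustrate the concept (this is our own rough sketch, not NVIDIA's patented algorithm), a split-frame load balancer could move the horizontal dividing line each frame based on how long each GPU took on the previous one, so that both chips converge on roughly equal render times:

    # Hypothetical sketch of dynamic split-frame load balancing.
    # The split line drifts toward the GPU that finished faster, so that
    # both GPUs end up spending about the same time per frame.
    def rebalance_split(split_y, time_top, time_bottom, height, step=8):
        """Return a new split line given last frame's per-GPU render times (ms)."""
        if time_top > time_bottom:
            split_y = max(step, split_y - step)           # top GPU overloaded: shrink its share
        elif time_bottom > time_top:
            split_y = min(height - step, split_y + step)  # bottom GPU overloaded: grow the top share
        return split_y

    # Example: with a simple sky up top and complex foliage below, the split
    # drifts downward and the top GPU ends up covering more than half the rows.
    split = 600
    split = rebalance_split(split, time_top=9.5, time_bottom=14.2, height=1200)
    print(split)  # 608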

If you'll remember, Alienware was working on a multiple graphics card solution that, to this point, resembles what NVIDIA is doing. But rather than scan out and use pass-through connections or some sort of signal combiner (as is the impression that we currently have of the Alienware solution), NVIDIA is able to send the rendered data digitally over the SLI (Scalable Link Interface) from the slave GPU to the master for compositing and final scan-out.




Here, the master GPU has the data from the slave for rendering.
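
Conceptually, the compositing step on the master is simple in split-frame mode; the following is an assumed sketch (the row lists are stand-ins for framebuffer contents moved over the bridge), not a description of the actual hardware path.

    # Hypothetical sketch of split-frame compositing on the master GPU.
    # Each half-frame is just a list of rows; real hardware moves pixel data
    # over the SLI bridge and merges it in the master's framebuffer.
    def composite_frame(master_rows, slave_rows):
        """Master renders the top of the frame, slave the bottom; stitch them for scan-out."""
        return master_rows + slave_rows

    top_half = [f"master row {y}" for y in range(0, 600)]       # rendered locally
    bottom_half = [f"slave row {y}" for y in range(600, 1200)]  # received over the link
    final_frame = composite_frame(top_half, bottom_half)
    assert len(final_frame) == 1200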


For now, as we don't have anything to test, this is mostly academic. But unless the SLI link has extremely high bandwidth, moving half of a 2048x1536 scene rendered into a floating-point framebuffer will be tough to handle. More commonly used resolutions and pixel formats will most likely not be a problem, especially as scenes increase in complexity and rendering time (rather than the time it takes to move pixels) dominates the time it takes to get from software to the monitor. We are really anxious to get our hands on hardware and see just how it responds to these types of situations. We would also like to learn (though testing may be difficult) whether the load balancing software takes into account the time it would take to transfer data from the slave to the master.
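
For a rough sense of scale (our own back-of-the-envelope numbers, assuming an FP16-per-channel RGBA framebuffer at 8 bytes per pixel and a 60 fps target), the slave's half of a 2048x1536 frame works out to about 12MB per frame, or roughly 720MB/s sustained across the link:

    # Back-of-the-envelope estimate of the slave-to-master transfer.
    # Assumptions: 2048x1536, the slave renders half the frame,
    # RGBA at FP16 per channel (8 bytes per pixel), 60 frames per second.
    width, height = 2048, 1536
    bytes_per_pixel = 4 * 2        # four channels, two bytes each
    fps = 60

    half_frame_bytes = (width * height // 2) * bytes_per_pixel
    per_frame_mb = half_frame_bytes / (1024 ** 2)
    per_second_mb = per_frame_mb * fps

    print(f"{per_frame_mb:.1f} MB per frame")     # 12.0 MB
    print(f"{per_second_mb:.0f} MB/s at 60 fps")  # 720 MB/s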

Comments

  • Falloutboy525 - Tuesday, June 29, 2004 - link

    Actually, I wouldn't be surprised if one of the board manufacturers puts 2 cores on one card. But man, just thinking about the physical size of the card gives me nightmares.
  • Pumpkinierre - Tuesday, June 29, 2004 - link

    The last of the SLI Voodoo2s had dual GPUs on a single board for one PCI slot. I can't see why the same couldn't be done for a dual 6800 GPU board on a single x16 PCIe slot, which is nowhere near saturation with current GPUs. Load balancing would be accomplished on board. In fact, they could do it on AGP 8x as well. They could extend this to multiple GPUs (also possible on a mobo with three PCIe x16 slots plus a 3-slot bridge) if it ever came out. Just think of the cooling with a Prescott CPU thrown in! Put a Vapochill to room temperature!

    Backward daisy chaining of components is a great idea, but I doubt whether the greed of manufacturers will let it happen. The concept should not be limited to GPUs but should extend to mobos/CPUs as well. A high speed link bus (HyperTransport perhaps, but not I2B) should allow systems to act as a multiple processor system, albeit with a little added latency. With parallel processing and multithreading around the corner, it would be useful to those who detest the enormous waste in the IT industry.
  • quanta - Tuesday, June 29, 2004 - link

    Actually, NFactor, the GeForce 6800's dedicated video codec is a step behind ATI's videoshader. It adds transistor count for things that can already be done by the 3D core. As far as power consumption goes, we only have NVIDIA's word for the lower power requirement, but considering ATI also uses videoshader for mobile parts, I suspect NVIDIA's claim only applies to NVIDIA's own products rather than ATI's.

    As far as multiprocessing goes, ATI had better catch up. After all, not every gamer can afford Evans & Sutherland simFUSION cards.
  • Phiro - Tuesday, June 29, 2004 - link

    Yes, but there's the economy of scale. Nvidia has a "single" production line churning out the nv4x chipset, and they package them according to their price point - no major modifications required.

    The 6800U & x800XT don't really qualify as "halo" products - they are a high-end version of the *same* product the majority of users buy.
  • klah - Tuesday, June 29, 2004 - link

    "And the whole "alienware sells 30k systems a year so there is a market for this" - 30k video cards a year is less than a drop in the bucket for the R&D spent on putting this together."

    The same could be said for the 6800U and x800XT. 99.9% of cards purchased will be sub-$200, so why bother with $500+ units? It's called a halo product. They are not built to make money. They are built for bragging rights and to generate a positive brand image. The 'buzz' this product creates for Nvidia is more substantial than spending the money on magazine ads and lan party sponsorships.

    ---------------

    "Excuse me, but I noticed that one 6800 Ultra takes two slots worth of airspace (due to the gigantic fan). So that means the Ultras would actually occupy the first and third PCIe slots"

    No. All PCIe slots are not the same. This setup requires two x16 slots. Dual x16 motherboards do not have any other slots between these. These two slots have about double the space between them compared to the rest of the x8, x4 and x2 slots.

    Nvidia is launching their nForce4 chipset later this year, which will support dual PCIe x16. This is probably when this product will become available at retail.




  • Phiro - Tuesday, June 29, 2004 - link

    Ugh, what a dumb, dumb waste of technology. Give me dual video cards (for dual DirectX/OpenGL displays) but not SLI BS. This is far better served with multiple GPUs on the card, not multiple cards.

    If Nvidia is really so concerned with people being able to pay for the ultimate in performance or allowing people to "upgrade" without throwing everything away, Nvidia should go with a user manageable socket on their cards and support multi-core GPUs.

    And the whole "alienware sells 30k systems a year so there is a market for this" - 30k video cards a year is less than a drop in the bucket for the R&D spent on putting this together.

    If this idiotic SLI re-invention cost the release of the nv4x (and prolonged our nv3x agony) a single day, or increased the cost of the nv4x cards by a single dollar, Nvidia is once again crowned king of the dumbshits in my book.

    Good choice buying 3dfx, Nvidia. It took a few years but Nvidia proved the old adage "You are what you eat". Nvidia's cards are hotter, larger, more complicated and more proprietary every day.
  • ScuZZee - Tuesday, June 29, 2004 - link

    Excuse me, but I noticed that one 6800 Ultra takes two slots' worth of airspace (due to the gigantic fan). So that means the Ultras would actually occupy the first and third PCIe slots (the second and fourth slots would be made useless since they would be blocked by the coolers).

    So does that mean the mobo has to space out the two PCIe slots to accommodate the two Ultras?
  • SpeekinSfear - Tuesday, June 29, 2004 - link

    barbary

    Just FYI, if you're gonna buy two, the GT model, which costs $399 instead of the Ultra's $499, can do it too. They're smaller, run less hot, draw less power, and did I mention they cost $100 less? I think the only real difference is that the GT is clocked 50MHz lower.
  • barbary - Tuesday, June 29, 2004 - link

    So now I am stuck what to buy.

    I have a 670 Dell workstation and I was going to buy an ATI X800. But now should I buy a 6800 Ultra??

    Question is do I buy two so I know I have a pair??

    If I do and this technology doesn't come along for months I have wasted my money.

    If I don't buy two, I may never get a pair to match and will have wasted my money.
  • Swaid - Tuesday, June 29, 2004 - link

    It's not like you *have* to purchase 2 video cards for anything to work; that's only for the big-spending enthusiast nuts and the CG/CAD guys. It's already part of the GPU, so it's like an added bonus. The hard part in the beginning is getting a motherboard that supports two PCIe x16 slots.
