For months we’ve been waiting to take advantage of NVIDIA’s SLI and it’s looking like the tier one motherboard manufacturers will be doing their best to bring the first nForce4 SLI motherboards to market before the end of this year.  So is SLI all it’s cracked up to be? 

With a final board and final drivers, it’s time to look at SLI from a final perspective to see if NVIDIA squandered the opportunity to regain technology and performance leadership or if SLI is really everything it used to be…

How SLI Works

NVIDIA’s Scalable Link Interface (SLI) is based on the simple principle of symmetric load distribution: the architecture depends on both GPUs getting exactly the same load as one another, and will only really work if they do.  The nature of NVIDIA’s SLI means that odd combinations such as cards with different GPU feature sets (e.g. 16 pipes + 8 pipes) will not work.  If two otherwise identical cards differ in clock speed, NVIDIA’s driver will run both at the lowest common clock speed, but there’s nothing you can do to get different GPUs to work in SLI mode; the driver simply won’t let you enable the option. 
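
The pairing rule described above can be sketched in a few lines.  This is only an illustration of the behaviour as described, not NVIDIA’s driver logic; the Card fields, the sli_clock function and the clock figures are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Card:
    gpu: str        # GPU model name (hypothetical)
    pipes: int      # number of pixel pipelines
    core_mhz: int   # core clock in MHz


def sli_clock(a: Card, b: Card) -> Optional[int]:
    """Return the shared core clock if the pair may run in SLI, else None."""
    # Different GPUs or feature sets: SLI cannot be enabled at all.
    if (a.gpu, a.pipes) != (b.gpu, b.pipes):
        return None
    # Matching GPUs with different clocks: both run at the lower clock.
    return min(a.core_mhz, b.core_mhz)


# Hypothetical cards; the clock figures are invented for illustration.
stock = Card("example-gpu", 16, 350)
overclocked = Card("example-gpu", 16, 370)
cut_down = Card("example-gpu", 8, 350)

print(sli_clock(stock, overclocked))  # 350
print(sli_clock(stock, cut_down))     # None
```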

NVIDIA’s first task in assuring that the load distributed to both GPUs would be balanced and symmetrical was to equip their nForce4 SLI chipset with identical width PCI Express graphics slots.  By default, PCI Express graphics cards use a x16 slot, which features 16 PCI Express lanes offering 8GB/s of total bandwidth.  Instead of outfitting their chipsets with 16 more PCI Express lanes, NVIDIA simply allows the number of lanes to be reconfigurable to either a single x16 slot or two x8 slots, with the use of a little card on the motherboard itself.  The physical slots themselves are both x16 slots, but electrically they can be configured to be two x8 slots.  This won’t cause any compatibility issues with x16 cards, as they will just use fewer lanes for data transfers, and the real world performance impact is negligible in games, which is what NVIDIA is counting on.
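
The slot arithmetic is easy to check.  A minimal sketch, assuming first-generation PCI Express signalling (2.5GT/s per lane with 8b/10b encoding, which works out to 250MB/s per lane in each direction); the function name is just for illustration.

```python
# Assumption: first-generation PCI Express, 2.5 GT/s per lane with 8b/10b
# encoding, which leaves 250 MB/s of payload per lane in each direction.
MB_PER_LANE_PER_DIRECTION = 250


def pcie_bandwidth_gb(lanes, both_directions=True):
    """Peak bandwidth in GB/s (1 GB = 1000 MB) for a slot with `lanes` lanes."""
    directions = 2 if both_directions else 1
    return lanes * MB_PER_LANE_PER_DIRECTION * directions / 1000


print(pcie_bandwidth_gb(16))  # single x16 slot: 8.0
print(pcie_bandwidth_gb(8))   # each slot in x8 + x8 mode: 4.0
```

In other words, each card in SLI mode gets half the slot bandwidth of a lone x16 card, which is the trade-off NVIDIA is betting games won’t notice.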

The next trick is to make sure that the GPUs receive the exact same vertex data from the CPU.  The CPU sends all vertex data to the primary GPU, which then forwards it on to the secondary GPU.  Once data arrives at the primary GPU via the PCI Express bus, all GPU-to-GPU communication is handled via NVIDIA’s video bridge.  The video bridge is a bus that connects directly to the GPU and is used for transferring data from the frame buffer of one GPU directly to the next.  NVIDIA isn’t offering too much information on the interface, other than saying that it is capable of transferring data at up to 10GB/s.  While it is possible to have this GPU-to-GPU communication go over the PCI Express bus, NVIDIA insists that it would be silly to do so because of latency issues and bandwidth constraints, and has no plans to move in that direction. 

NVIDIA’s driver plays an important role in maintaining symmetry in the rendering by looking at the workload and making two key decisions: 1) determining rendering method, and depending on the rendering method, 2) determining the workload split between the two GPUs. 

NVIDIA supports two main rendering methods: Alternate Frame Rendering (AFR) and Split Frame Rendering (SFR).  As the names imply, AFR has each GPU render a separate frame (e.g. GPU 1 renders all odd frames and GPU 2 renders all even frames) while SFR splits up the rendering of a single frame amongst the two GPUs.  NVIDIA’s driver does not determine whether to use AFR or SFR on the fly; instead, NVIDIA’s software engineers have profiled the majority of the top 100 games and created a profile for each one, determining whether it should default to AFR or SFR mode.  NVIDIA’s driver defaults to AFR as long as there are no dependencies between frames.  For example, in some games that use slow motion special effects, the game itself doesn’t clear the frame buffer and will render the next frame on top of the previous frame, alpha blending the two frames together to get the slow motion effect; in this case there is a frame-to-frame dependency and AFR cannot be used. 
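
To make the difference between the two modes concrete, here is a rough sketch of how work could be handed out under each.  The function names are made up, and the assumption that GPU 1 takes the top portion of the frame in SFR is ours, not something NVIDIA has stated.

```python
def afr_assign(frame_number):
    """Alternate Frame Rendering: odd frames go to GPU 1, even frames to GPU 2."""
    return 1 if frame_number % 2 == 1 else 2


def sfr_assign(frame_height, split=0.5):
    """Split Frame Rendering: return the scanlines each GPU renders.

    `split` is the fraction of the frame given to GPU 1.  Giving GPU 1 the
    top of the frame is an assumption made for this sketch.
    """
    boundary = round(frame_height * split)
    return {1: range(0, boundary), 2: range(boundary, frame_height)}


def choose_mode(has_frame_dependency):
    """AFR is the default; a frame-to-frame dependency forces SFR."""
    return "SFR" if has_frame_dependency else "AFR"


print([afr_assign(n) for n in range(1, 5)])  # [1, 2, 1, 2]
parts = sfr_assign(1200)
print(len(parts[1]), len(parts[2]))          # 600 600
print(choose_mode(has_frame_dependency=True))  # SFR
```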

If AFR can’t be used, then SFR is used, but now the driver must determine how much of each frame to send to GPU 1 vs. GPU 2.  Since the driver can count on both GPUs being the exact same speed (see why it’s important?), it makes an educated guess on what the load split should be.  The educated guess comes through the use of a history table that stores the load each GPU was placed under for the past several frames.  Based on the outcomes stored in this history table, NVIDIA’s driver will make a prediction of what the rendering split should be between the two GPUs for future frames and will adjust the load factor accordingly.  This should all sound very familiar to anyone who has ever heard of a branch predictor in a CPU, and just like a branch predictor there is a penalty for incorrectly predicting.  If NVIDIA’s driver predicts incorrectly, one GPU will finish its rendering task much sooner than the other, giving it nothing to do but wait until the other GPU is done, thus reducing the overall performance potential of the SLI setup. 
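
NVIDIA hasn’t disclosed how the history table is actually used, so the following is only a sketch of the general idea: keep the render times of the last few frames, and nudge the split toward whichever GPU has been finishing early.  The class name, the history length, the gain and the proportional adjustment are all assumptions standing in for the real algorithm.

```python
from collections import deque


class SplitPredictor:
    """Toy history-based predictor for the SFR split (not NVIDIA's algorithm)."""

    def __init__(self, history_len=8, gain=0.5):
        # Each entry is (GPU 1 render time, GPU 2 render time) for one frame.
        self.history = deque(maxlen=history_len)
        self.split = 0.5   # fraction of the frame given to GPU 1
        self.gain = gain   # how aggressively to correct an imbalance

    def record(self, time_gpu1, time_gpu2):
        self.history.append((time_gpu1, time_gpu2))

    def predict(self):
        """Return the split to use for the next frame."""
        t1 = sum(a for a, _ in self.history)
        t2 = sum(b for _, b in self.history)
        if t1 + t2 == 0:
            return self.split
        # Positive imbalance means GPU 1 has been doing more than half the work.
        imbalance = t1 / (t1 + t2) - 0.5
        self.split = min(0.9, max(0.1, self.split - self.gain * imbalance))
        return self.split


def idle_time(time_gpu1, time_gpu2):
    """The misprediction penalty: how long the faster GPU sits waiting."""
    return abs(time_gpu1 - time_gpu2)


predictor = SplitPredictor()
predictor.record(12.0, 8.0)    # GPU 1 took longer on the last frame
print(predictor.predict())     # below 0.5, so GPU 1 gets less of the next frame
print(idle_time(12.0, 8.0))    # 4.0
```

The frame is only done when the slower GPU finishes, so any idle time here is performance left on the table.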

By now you can begin to see where the performance benefits of SLI come into play.  With twice the GPU rendering power you effectively have a 32-pipe 6800GT with twice as much memory bandwidth if you pair two of the cards together, a configuration that you won’t see in a single card for quite some time.  At the same time you should see that SLI does have a little bit of overhead associated with it, and at lower CPU-bound resolutions you can expect SLI to be slightly slower than a single card.  Then again, you don’t buy an SLI setup to run at lower resolutions. 

Once both GPUs have completed their rendering, whether in AFR or SFR mode, the secondary GPU sends its frame buffer to the primary GPU via NVIDIA’s video bridge.  The important thing here is that the data is sent digitally, so there’s no loss in image quality as a result of SLI.  The primary GPU recombines the data and outputs the final completed frame (or frames) through its outputs.  Sounds simple enough, right?

Surprisingly enough, throughout all of our testing, we didn’t encounter any rendering issues in SLI mode.  NVIDIA insists that they have tested quite a few of the top 100 games to ensure that there aren’t any issues with SLI mode and it does seem that they’ve done a good job with their driver.  If the driver hasn’t been profiled with a game, it will default to single-GPU mode to avoid any rendering issues, but the user can always force SLI mode if they wish. 
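
The profile behaviour described above amounts to a lookup with a safe fallback.  A minimal sketch, assuming a simple table keyed by game name; the table contents, the placeholder game names and the select_mode function are hypothetical and are not taken from NVIDIA’s driver.

```python
# Hypothetical profile table; the game names are placeholders.
PROFILES = {
    "game_a": "AFR",
    "game_b": "SFR",
}


def select_mode(game, forced_mode=None):
    """Pick a rendering mode for `game`.

    Profiled games use their profile.  Unprofiled games fall back to
    single-GPU mode unless the user forces an SLI mode.
    """
    if game in PROFILES:
        return PROFILES[game]
    if forced_mode is not None:
        return forced_mode
    return "SINGLE"


print(select_mode("game_a"))                      # AFR
print(select_mode("unknown_game"))                # SINGLE
print(select_mode("unknown_game", forced_mode="SFR"))  # SFR
```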


74 Comments

  • glennpratt - Wednesday, November 24, 2004

    51 - Yeah, and the Voodoo 2 used analog to pass the signal from one card to the other externally. What would you suggest, nVidia make a card that is PCI and combines the signal using analog cables, degrading your video quality? Idiocy.

    How many people owned a V2 SLI setup and ran it on a crappy computer anyway?
  • bob661 - Wednesday, November 24, 2004

    You guys could buy a cheaper CPU and do a mild overclock to get the performance needed. I have a 3500 and I still plan on getting SLI. There's ways to get around the price issue. If I was buying a new system right now I would've gotten a 3200 "winnie" and OC'd it to 2.6GHz. That would put you at FX-55 speeds. If you're lucky you could hit 2.8 to 2.9GHz.
  • bob661 - Wednesday, November 24, 2004

    #49
    You don't need an extravagant budget to afford a monitor that can handle 1600x1200. The Samsung SyncMaster 997DF-T/T 19" CRT can do that for $209 on Newegg.com. I have the older 955DF version which does it too.
  • nserra - Wednesday, November 24, 2004

    VOODOO2 didn't NEED any of this, it worked on any MOBO, any monitor, any CARD, any ......
  • nserra - Wednesday, November 24, 2004

    Buy Monitor.
    Buy PSU.
    Buy MOBO.
    Buy 2 graphics cards.
    Buy good CASE.
    Buy the top of the line processor.

    Too many buys, and all of these items ALL HAVE TO BE TOP ($$$) OF THE LINE!!!

    I don't have the money, sorry not for me.
  • Gundamit - Wednesday, November 24, 2004

    It sounds like if you don't already have a monitor that supports 1600x1200 you'll have a hard time justifying the SLI expense since you won't see nearly as much performance gain over the single card set-ups at lower resolutions. Just one more expense to consider. Thank goodness LCD panel prices seem to be dropping. I'm onboard for SLI with 6800GTs late Q1 '05. Should be plenty of info and mobo selections out by then.
  • AtaStrumf - Wednesday, November 24, 2004

    I said it before and it looks like I need to say it again:

    SLI-like performance improvement (40 - 70% where it counts) in a single GPU over the previous generation single GPU isn't going to happen for AT LEAST 2 years! Example: 9700 Pro (2002)/X800XT (2004)

    The other benefit is obviously MUCH lower upgrade cost. Theoretical example: a new $200 9800 Pro or $400 GF 6800 GT and this is really the worst case scenario for SLI -- it would have a lot of performance to make up; but I think that won't happen for a long time.

    And don't forget that we are hitting walls with current technologies, so future generation cards may take much longer than 2 years to bring the 9700/X800-like performance improvement.

    Just look at what ATi is doing. They're going for SLI as well, because there is no way in hell they can compete with it with a single GPU or any kind of single card design that wouldn't require its own power supply and air conditioning unit.

    SLI and dual core is the future; just not for me :-( TOO EXPENSIVE!
  • SignalPST - Wednesday, November 24, 2004

    Thank you for the article, Anand. It was very informative and exciting.

    I would like to make a suggestion. Since SLI configurations, as everyone knows, are targeted towards the very top notch enthusiasts, I think it would make a lot of sense to include benchmarks using HDTV (1920x1080) resolutions and the 2048x1536 resolution. A lot of high end 22" CRT monitors as well as high end 23" widescreen LCDs support these resolutions. I imagine these enthusiasts looking for SLI solutions would also be using those types of displays and wondering what kind of performance they would get with their dual video card setup.

    -SignalPST
  • SuperStrokey - Wednesday, November 24, 2004

    I wish this would work with 2 AGP cards too, would be nice if I could upgrade my BFG 6800 GT with a second one on the cheap when the new cards come out rather than having to buy a new card.
  • kongming - Wednesday, November 24, 2004

    Nevermind, the V9999 is still just AGP for the time being. Hopefully, they will offer this card in PCI-e in the future.
