SLI: The Abridged Version

Not to be outdone by their efforts to reduce input lag, for Pascal NVIDIA is also rolling out some fairly important changes to SLI. These operate at both the hardware level and the software level, and gamers fortunate enough to own multiple Pascal cards will want to pay close attention to them.

On the hardware side of matters, NVIDIA is boosting the speed of the SLI connection. Previously with Maxwell 2 it operated at up to 400MHz, but with Pascal it can now operate at up to 650MHz. This is a substantial 63% increase in link speed.

However, to actually get the faster link speed, in many cases new(er) SLI bridges are needed. The older bridges, particularly the flexible bridges, are neither rated for nor capable of supporting 650MHz. Only the more recent (and relatively rare) LED bridge and NVIDIA’s brand new High Bandwidth (HB) bridge are capable of 650MHz.

And while the older LED bridge is 650MHz capable, NVIDIA is still going to be phasing it out in favor of the new HB bridge. The reason is that the HB bridge adds support for Pascal’s second SLI hardware feature: SLI link teaming.

With previous GPU generations, a GPU could only use a single SLI link to communicate with another GPU. The purpose of including multiple SLI links on a high-end card then was to allow it to communicate with multiple (3+) cards. But if you had a more basic 2-way SLI setup, then the second link on each card would go unused.

Pascal changes this up by allowing the SLI links to be teamed. Now two cards can connect to each other over two links, almost doubling the amount of bandwidth between the cards. Combined with the higher frequency of the SLI link itself, the effective increase in bandwidth between cards in a 2-way SLI setup is 170%, or just short of 3x the previous bandwidth.
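To make the arithmetic behind these figures explicit, here is a minimal sketch in Python. It assumes that per-link bandwidth scales linearly with link clock, which the text does not state outright, and the "implied teaming factor" is simply derived from the quoted numbers rather than being a figure NVIDIA has published.

```python
# Link clocks quoted in the text, in MHz.
MAXWELL2_LINK_MHZ = 400
PASCAL_LINK_MHZ = 650

# Assumption: per-link bandwidth scales linearly with link clock.
clock_ratio = PASCAL_LINK_MHZ / MAXWELL2_LINK_MHZ
clock_gain_pct = (clock_ratio - 1) * 100  # the text rounds this to 63%

# A 170% effective increase means 2.7x the original bandwidth.
quoted_total_ratio = 1 + 170 / 100

# Teaming factor implied by the two quoted figures (derived, not an NVIDIA number).
implied_teaming_factor = quoted_total_ratio / clock_ratio

# Upper bound if two teamed links scaled perfectly.
ideal_total_ratio = clock_ratio * 2

print(f"Clock gain:             {clock_gain_pct:.1f}%")
print(f"Quoted total bandwidth: {quoted_total_ratio:.2f}x")
print(f"Implied teaming factor: {implied_teaming_factor:.2f}x")
print(f"Ideal two-link total:   {ideal_total_ratio:.2f}x")
```

Under that linear-scaling assumption, the quoted 170% figure sits below the 3.25x that perfect two-link scaling would give, which is consistent with teaming "almost" doubling the bandwidth rather than doubling it outright.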

The purpose of teaming SLI links is that even though the bandwidth boost from the higher link frequency is significant, for the highest resolutions and refresh rates it’s still not enough. By NVIDIA’s own admission, SLI performance at better than 1440p60 was subpar, as the SLI interface would get saturated. The faster link gets NVIDIA enough bandwidth to comfortably handle 2-way SLI at 1440p120 and 4Kp60, but that’s it. Once you go past that, to configurations that essentially require DisplayPort 1.3+ (4Kp120, 5Kp60, and multi-monitor surround), then even a single 650MHz link isn’t enough. Ergo NVIDIA has started link teaming to get yet more bandwidth.
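For a rough sense of how quickly these display modes scale, the sketch below compares raw pixel rates against 1440p60. It assumes the usual 16:9 resolutions for each label (2560x1440, 3840x2160, 5120x2880) and treats pixel rate as a proxy for inter-GPU traffic; actual SLI traffic depends on the rendering mode and is not specified in the text.

```python
# Display modes named in the text: (width, height, refresh rate in Hz).
# Resolutions are the common 16:9 values for each label (an assumption).
MODES = {
    "1440p60": (2560, 1440, 60),
    "1440p120": (2560, 1440, 120),
    "4Kp60": (3840, 2160, 60),
    "4Kp120": (3840, 2160, 120),
    "5Kp60": (5120, 2880, 60),
}


def pixel_rate(width, height, hz):
    """Pixels that must be delivered to the display per second."""
    return width * height * hz


baseline = pixel_rate(*MODES["1440p60"])

# Pixel rate of each mode relative to 1440p60.
relative = {name: pixel_rate(*mode) / baseline for name, mode in MODES.items()}

for name, ratio in relative.items():
    print(f"{name:>9}: {ratio:.2f}x the pixel rate of 1440p60")
```

By this measure 1440p120 and 4Kp60 need roughly twice the pixel throughput of 1440p60, while 4Kp120 and 5Kp60 need four times or more, which is where a single faster link stops being sufficient.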

Getting back to the new HB bridge then, the new bridge is being introduced to provide a bridge suitable for link teaming. Previous bridges simply weren’t wired to have multiple links connect the same video cards – the cards didn’t support such a thing – whereas HB bridges are. Meanwhile as these are fixed (PCB) bridges, NVIDIA is offering their reference bridges in 3 sizes: 2-slot (40mm), 3-slot (60mm), and 4-slot (80mm) spacing, to mesh with cards that are either directly next to each other, have 1 empty slot between them, or have 2 empty slots between them. NVIDIA is selling the new HB bridge for $40 over on their store, and NVIDIA’s partners are also preparing their own custom bridges. EVGA has announced an LED-lit HB bridge, as the LED bridges proved rather popular with both system builders and customers looking for a bit more flair for their windowed cases.
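The mapping between card placement and bridge size can be sketched as a small lookup. The function and constant names here are hypothetical, and the sketch assumes the cards sit in slots that count adjacent cards as 2-slot spacing, as described above.

```python
# Reference HB bridge sizes from the text: slot spacing -> bridge length in mm.
HB_BRIDGE_MM_BY_SLOT_SPACING = {2: 40, 3: 60, 4: 80}


def hb_bridge_for(empty_slots_between):
    """Return (slot_spacing, length_mm) for two cards separated by the given
    number of empty slots. Adjacent cards (0 empty slots) use 2-slot spacing."""
    spacing = empty_slots_between + 2
    if spacing not in HB_BRIDGE_MM_BY_SLOT_SPACING:
        raise ValueError(
            f"No reference HB bridge for {empty_slots_between} empty slot(s)"
        )
    return spacing, HB_BRIDGE_MM_BY_SLOT_SPACING[spacing]


for gap in range(3):
    spacing, length_mm = hb_bridge_for(gap)
    print(f"{gap} empty slot(s): {spacing}-slot bridge ({length_mm}mm)")
```

Because the bridges are rigid PCBs, any spacing outside these three sizes has no reference bridge, which the sketch reports as an error.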

Meanwhile, on a brief aside, I asked NVIDIA why they were still using SLI bridges instead of just routing everything over PCI Express. While I doubt they mind selling $40 bridges, the technical answer is that all things considered, this gave them more bandwidth. Rather than having to share potentially valuable PCIe bandwidth with CPU-GPU communication, the SLI links are dedicated links, eliminating any contention and potentially making them more reliable. The SLI links are also directly routed to the display controller, so there’s a somewhat more direct (lower latency) path as well.

Deprecated: 3-Way & 4-Way SLI

These aforementioned hardware updates to SLI are also having a major impact on the kinds of SLI configurations NVIDIA is going to be able (and willing) to support in the future. With both available SLI links on a Pascal card now teamed together to reach a single other card, it’s not possible to do 3-way/4-way SLI and link teaming at the same time, as there aren’t enough links for both. As a result, NVIDIA is going to be deprecating 3-way and 4-way SLI.

Until shortly after the GTX 1080 launch, NVIDIA’s plans here were actually a bit more complex – involving a feature the company called an Enthusiast Key – but thankfully things have been simplified some. As it stands, NVIDIA is not going to be eliminating support for 3-way and 4-way SLI entirely; if you have a 3/4-way bridge, you can still set up a 3+ card configuration, bandwidth limitations and all. But for the Pascal generation they are going to be focusing their development resources on 2-way SLI, hence making 3-way and 4-way SLI deprecated.

In practice the way this will work is that NVIDIA will only be supporting 3 and 4-way SLI for a small number of programs – things like Unigine and 3DMark that are used by competitive benchmarkers/overclockers, so that they may continue their practices. For actual gamer use they are strongly discouraging anything over 2-way SLI, and in fact NVIDIA will not be enabling 3+ card configurations in their drivers for the vast majority of games (unless a developer specifically comes to them and asks). This all but puts an end to 3-way and 4-way SLI on consumer gaming setups.

As for why NVIDIA would want to do this, the answer boils down to two factors. The first of course is the introduction of SLI link teaming, while the second has to do with games themselves. As we’ve discussed in the past, game engines are increasingly becoming AFR-unfriendly, which is making it harder and harder to get performance benefits out of SLI. 2-way SLI is hard enough, never mind 3/4-way SLI where upwards of 4 frames need to be rendered concurrently. Consequently, with greater bandwidth requirements necessitating link teaming, Pascal is as good a point as any to deprecate larger SLI card configurations.

Now with all of that said, however, DirectX 12 makes the picture a little more complex still. Because DirectX 12 adds new multi-GPU modes – some of which radically change how mGPU works – NVIDIA’s own changes only impact specific scenarios. All DX9/10/11 games are impacted by the new 2-way SLI limit. However, whether a DX12 game is impacted depends on the mGPU mode used.

In implicit mode, which essentially recreates DX11 style mGPU under DX12, the 2-way SLI limit is in play. This mode is, by design, under the control of the GPU vendor and relies on all of the same mGPU technologies as are already in use today. This means traffic passes over the SLI bridge, and NVIDIA will only be working to optimize mGPU for 2-way SLI.

However with explicit mode, the 2-way limit is lifted. In explicit mode it’s the game developer that has control over how mGPU works – NVIDIA has no responsibility here – and it’s up to them to decide if they want to support more than 2 GPUs. In unlinked explicit mode this is all relatively straightforward, with the game addressing each GPU separately and working over the PCIe bus.

Meanwhile in linked explicit mode, where the relevant GPUs are presented as a single linked adapter, the GPU limit is still up to the developer. In this mode developers can even use the SLI bridge if they want – though again keeping in mind the bandwidth limitations – and it’s the most powerful mode for matching GPUs.
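The rules described in the last few paragraphs can be summarized as a small lookup table. This is only an illustrative sketch: the enum, dataclass, and field names are hypothetical and do not correspond to anything in the Direct3D 12 API or NVIDIA's drivers.

```python
from dataclasses import dataclass
from enum import Enum


class MgpuMode(Enum):
    DX9_10_11 = "DX9/10/11 (driver-managed SLI)"
    DX12_IMPLICIT = "DX12 implicit"
    DX12_EXPLICIT_UNLINKED = "DX12 explicit, unlinked"
    DX12_EXPLICIT_LINKED = "DX12 explicit, linked"


@dataclass(frozen=True)
class MgpuPolicy:
    two_way_limit: bool  # does NVIDIA's 2-way SLI limit apply?
    controlled_by: str   # who decides how mGPU works
    sli_bridge: str      # bridge usage as described in the text


# Summary of the text; not an official support matrix.
POLICIES = {
    MgpuMode.DX9_10_11: MgpuPolicy(True, "GPU vendor", "used"),
    MgpuMode.DX12_IMPLICIT: MgpuPolicy(True, "GPU vendor", "used"),
    MgpuMode.DX12_EXPLICIT_UNLINKED: MgpuPolicy(False, "game developer", "not used (PCIe)"),
    MgpuMode.DX12_EXPLICIT_LINKED: MgpuPolicy(False, "game developer", "optional"),
}


def two_way_limit_applies(mode):
    """True if the mode falls under NVIDIA's 2-way SLI limit."""
    return POLICIES[mode].two_way_limit


for mode, policy in POLICIES.items():
    limit = "2-way limit" if policy.two_way_limit else "no vendor limit"
    print(f"{mode.value}: {limit}, controlled by {policy.controlled_by}, "
          f"SLI bridge {policy.sli_bridge}")
```

The key split is who owns the mGPU implementation: wherever the GPU vendor is in control the 2-way limit applies, and wherever the game developer is in control it does not.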

As for whether developers will actually want to support 3+ GPUs using DX12 explicit multiadapter, this remains to be seen. So far of the small number of games to even use it, none support 3+ GPUs, and as with NVIDIA-managed mGPU, the larger the number of GPUs the harder the task of keeping them all productive. We will have to see what developers decide to do, but outside of dedicated benchmarks (e.g. 3DMark) I would be a bit surprised to see developers support anything more than 2 GPUs.


  • DonMiguel85 - Wednesday, July 20, 2016 - link

    Agreed. They'll likely be much more power-hungry, but I believe it's definitely doable. At the very least it'll probably be similar to Fury X Vs. GTX 980
  • sonicmerlin - Thursday, July 21, 2016 - link

    The 1070 is as fast as the 980 ti. The 1060 is as fast as a 980. The 1080 is much faster than a 980 ti. Every card jumped up two tiers in performance from the previous gen. That's "standard" to you?
  • Kvaern1 - Sunday, July 24, 2016 - link

    I don't think there's much evidence pointing in the direction of GCN 4 blowing Pascal out of the water.

    Sadly, AMD needs a win but I don't see it coming. Budgets matter.
  • watzupken - Wednesday, July 20, 2016 - link

    Brilliant review. Thanks for the in depth review. This is late, but the analysis is its strength and value add worth waiting for.
  • ptown16 - Wednesday, July 20, 2016 - link

    This review was a L O N G time coming, but gotta admit, excellent as always. This was the ONLY Pascal review to acknowledge and significantly include Kepler cards in the benchmarks and some comments. It makes sense to bench GK104 and analyze generational improvements since Kepler debuted 28nm and Pascal has finally ushered in the first node shrink since then. I guessed Anandtech would be the only site to do so, and looks like that's exactly what happened. Looking forward to the upcoming Polaris review!
  • DonMiguel85 - Wednesday, July 20, 2016 - link

    I do still wonder if Kepler's poor performance nowadays is largely due to neglected driver optimizations or just plain old/inefficient architecture. If it's the latter, it's really pretty bad with modern game workloads.
  • ptown16 - Wednesday, July 20, 2016 - link

    It may be a little of the latter, but Kepler was pretty amazing at launch. I suspect driver neglect though, seeing as how Kepler performance got notably WORSE soon after Maxwell. It's also interesting to see how the comparable GCN cards of that time, which were often slower than the Kepler competition, are now significantly faster.
  • DonMiguel85 - Thursday, July 21, 2016 - link

    Yeah, and a GTX 960 often beats a GTX 680 or 770 in many newer games. Sometimes it's even pretty close to a 780.
  • hansmuff - Thursday, July 21, 2016 - link

    This is the one issue that has me wavering for the next card. My AMD cards, the last one being a 5850, have always lasted longer than my NV cards; of course at the expense of slower game fixes/ready drivers.

    So far so good with a 1.5yrs old 970, but I'm keeping a close eye on it. I'm looking forward to what VEGA brings.
  • ptown16 - Thursday, July 21, 2016 - link

    Yeah I'd keep an eye on it. My 770 can still play new games, albeit at lowered quality settings. The one hope for the 970 and other Maxwell cards is that Pascal is so similar. The only times I see performance taking a big hit would be newer games using asynchronous workloads, since Maxwell is poorly prepared to handle that. Otherwise maybe Maxwell cards will last much longer than Kepler. That said, I'm having second thoughts on the 1070 and curious to see what AMD can offer in the $300-$400 price range.
