Multi-GPU SLI/CF Scaling: Lynnfield's Blemish

When running in single-GPU mode, the on-die PCIe controller maintains a full x16 connection to your graphics card:


Hooray.

In multi-GPU mode, the 16 lanes have to be split in two:

To support this the motherboard maker needs to put down ~$3 worth of PCIe switches:

Now SLI and Crossfire can work, although the motherboard maker also needs to pay NVIDIA a few dollars to legally make SLI work.

The question is whether you give up any performance going with Lynnfield's 2 x8 implementation vs. Bloomfield/X58's 2 x16 PCIe configuration. In short: at the high end, yes.
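
For a rough sense of what the split costs in raw bandwidth, here is a back-of-the-envelope sketch. It assumes PCIe 2.0 signaling (5 GT/s per lane with 8b/10b encoding), which isn't spelled out in the text above, and the function name is made up for illustration. These are theoretical link rates, not measured throughput.

```python
# Back-of-the-envelope PCIe link bandwidth, per direction.
# Assumption (not stated in the article text): PCIe 2.0 signaling,
# i.e. 5 GT/s per lane with 8b/10b encoding (8 data bits per 10 bits sent).

GT_PER_SEC_PER_LANE = 5.0      # PCIe 2.0 raw transfer rate per lane, in GT/s
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b line-code overhead


def link_bandwidth_gb_per_s(lanes: int) -> float:
    """Theoretical one-way bandwidth of a PCIe 2.0 link, in GB/s."""
    usable_gbits = lanes * GT_PER_SEC_PER_LANE * ENCODING_EFFICIENCY
    return usable_gbits / 8  # bits -> bytes


if __name__ == "__main__":
    for lanes in (16, 8):
        print(f"x{lanes}: {link_bandwidth_gb_per_s(lanes):.1f} GB/s per direction")
```

Under that assumption, each GPU on a 2 x8 board gets half the theoretical link bandwidth of a GPU in an x16 slot. Whether that matters depends on how much data a given game actually pushes across the bus.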

I looked at the two games that scale the best with multiple GPUs: Crysis Warhead and Far Cry 2. I ran all settings at their max at 2560 x 1600, but with no AA.

I included two multi-GPU configurations. A pair of GeForce GTX 275s from EVGA for NVIDIA:


A coupla GPUs and a few cores can go a long way

And to really stress things, I looked at two Radeon HD 4870 X2s from Sapphire. Note that each card has two GPUs, so this is actually a 4-GPU configuration, enough to tax a PCIe x8 interface.

First, the dual-GPU results from NVIDIA.

| NVIDIA GeForce GTX 275 | Crysis Warhead (ambush) | Crysis Warhead (avalanche) | Crysis Warhead (frost) | Far Cry 2 Playback Demo Action |
|---|---|---|---|---|
| Intel Core i7 975 (X58) - 1 GPU | 20.8 fps | 23.0 fps | 21.4 fps | 41.0 fps |
| Intel Core i7 870 (P55) - 1 GPU | 20.8 fps | 22.9 fps | 21.5 fps | 40.5 fps |
| Intel Core i7 975 (X58) - 2 GPUs | 38.4 fps | 42.3 fps | 38.0 fps | 73.2 fps |
| Intel Core i7 870 (P55) - 2 GPUs | 38.0 fps | 41.9 fps | 37.4 fps | 65.9 fps |

The important data is in the next table. What you're looking at here is the % speedup from one to two GPUs on X58 vs. P55. In theory, X58 should have higher percentages because each GPU gets 16 PCIe lanes while Lynnfield only provides 8 per GPU.

| GTX 275 -> GTX 275 SLI Scaling | Crysis Warhead (ambush) | Crysis Warhead (avalanche) | Crysis Warhead (frost) | Far Cry 2 Playback Demo Action |
|---|---|---|---|---|
| Intel Core i7 975 (X58) | 84.6% | 83.9% | 77.6% | 78.5% |
| Intel Core i7 870 (P55) | 82.7% | 83.0% | 74.0% | 62.7% |

For the most part, the X58 platform was only a few percentage points better in scaling. That changes with the Far Cry 2 results, where X58 manages 78.5% scaling while P55 only delivers 62.7%. It's clearly not the most common case, but it can happen. If you're going to be building a high-end dual-GPU setup, X58 is probably worth it.
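
For anyone who wants to check the scaling percentages, they fall straight out of the fps figures in the tables above. The sketch below is only an illustration: the function name and the dictionary layout are my own, and the fps values are copied from the GTX 275 results.

```python
# Percent speedup from one GPU to two, using the GTX 275 fps figures
# from the tables above. Names here are illustrative, not from any tool.

def scaling_pct(single_fps: float, multi_fps: float) -> float:
    """Percent gain in frame rate when adding GPUs."""
    return (multi_fps / single_fps - 1.0) * 100.0


# platform -> benchmark -> (1 GPU fps, 2 GPU fps)
gtx275_fps = {
    "X58": {
        "Crysis Warhead (ambush)": (20.8, 38.4),
        "Crysis Warhead (avalanche)": (23.0, 42.3),
        "Crysis Warhead (frost)": (21.4, 38.0),
        "Far Cry 2": (41.0, 73.2),
    },
    "P55": {
        "Crysis Warhead (ambush)": (20.8, 38.0),
        "Crysis Warhead (avalanche)": (22.9, 41.9),
        "Crysis Warhead (frost)": (21.5, 37.4),
        "Far Cry 2": (40.5, 65.9),
    },
}

if __name__ == "__main__":
    for platform, runs in gtx275_fps.items():
        for benchmark, (one_gpu, two_gpus) in runs.items():
            print(f"{platform} {benchmark}: {scaling_pct(one_gpu, two_gpus):.1f}%")
```

A value of 100% would mean the second GPU doubled the frame rate. Everything short of that is overhead of one kind or another, and the PCIe link is only one possible contributor.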

Next, the quad-GPU results from AMD:

| AMD Radeon HD 4870 X2 | Crysis Warhead (ambush) | Crysis Warhead (avalanche) | Crysis Warhead (frost) | Far Cry 2 Playback Demo Action |
|---|---|---|---|---|
| Intel Core i7 975 (X58) - 2 GPUs | 25.8 fps | 31.3 fps | 27.0 fps | 70.9 fps |
| Intel Core i7 870 (P55) - 2 GPUs | 24.4 fps | 31.1 fps | 26.6 fps | 71.4 fps |
| Intel Core i7 975 (X58) - 4 GPUs | 27.0 fps | 57.4 fps | 47.9 fps | 117.9 fps |
| Intel Core i7 870 (P55) - 4 GPUs | 24.2 fps | 50.0 fps | 36.5 fps | 116 fps |

Again, what we really care about is the scaling. Note how single-card performance is nearly identical between Bloomfield and Lynnfield, but performance with both cards installed is noticeably lower on Lynnfield. This isn't going to be good:

| 4870 X2 -> 4870 X2 CF Scaling | Crysis Warhead (ambush) | Crysis Warhead (avalanche) | Crysis Warhead (frost) | Far Cry 2 Playback Demo Action |
|---|---|---|---|---|
| Intel Core i7 975 (X58) | 4.7% | 83.4% | 77.4% | 66.3% |
| Intel Core i7 870 (P55) | -1.0% | 60.8% | 37.2% | 62.5% |

Ouch. Maybe Lynnfield is human after all. Almost across the board, the quad-GPU results significantly favor X58. It makes sense given how data hungry these GPUs are. Again, the conclusion here is that for a high-end multi-GPU setup you'll want to go with X58/Bloomfield.
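
Another way to read the quad-GPU numbers is to ask how far P55 trails X58 in absolute frame rate once both cards are active. This is a quick sketch using the 4-GPU rows from the table above; the helper and variable names are hypothetical.

```python
# How far P55 trails X58 once both 4870 X2 cards are active.
# fps values are copied from the 4-GPU rows of the table above;
# the function and variable names are illustrative only.

def deficit_pct(x58_fps: float, p55_fps: float) -> float:
    """Percent by which the P55 frame rate falls short of X58."""
    return (1.0 - p55_fps / x58_fps) * 100.0


# benchmark -> (X58 4-GPU fps, P55 4-GPU fps)
quad_gpu_fps = {
    "Crysis Warhead (ambush)": (27.0, 24.2),
    "Crysis Warhead (avalanche)": (57.4, 50.0),
    "Crysis Warhead (frost)": (47.9, 36.5),
    "Far Cry 2": (117.9, 116.0),
}

if __name__ == "__main__":
    for benchmark, (x58, p55) in quad_gpu_fps.items():
        print(f"{benchmark}: P55 is {deficit_pct(x58, p55):.1f}% behind X58")
```

On these numbers the frost run shows the largest gap, with P55 roughly 24% behind, while Far Cry 2 is within a couple of percent.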

A Quick Look at GPU Limited Gaming

With all of our CPU reviews we try to strike a balance between CPU and GPU limited game tests in order to show which CPU is truly faster at running game code. In fact all of our CPU tests are designed to figure out which CPUs are best at a number of tasks.

However, the vast majority of games today will be limited by whatever graphics card you have in your system. The performance differences we talked about earlier will all but disappear in these scenarios. Allow me to present data from Crysis Warhead running at 2560 x 1600 with maximum quality settings:

| NVIDIA GeForce GTX 275 | Crysis Warhead (ambush) | Crysis Warhead (avalanche) | Crysis Warhead (frost) |
|---|---|---|---|
| Intel Core i7 975 | 20.8 fps | 23.0 fps | 21.4 fps |
| Intel Core i7 870 | 20.8 fps | 22.9 fps | 21.5 fps |
| AMD Phenom II X4 965 BE | 20.9 fps | 23.0 fps | 21.5 fps |

They're all the same. This shouldn't come as a surprise to anyone; it's always been the case. Any CPU near the high end, when faced with the same GPU bottleneck, will perform the same in game.
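
To put a number on "the same", here is a small sketch that measures the spread between the fastest and slowest CPU in each run. The fps values come from the table above; the function name and data layout are assumptions made for the example.

```python
# Spread between the fastest and slowest CPU in each GPU-limited run.
# fps values are copied from the table above (Core i7 975, Core i7 870,
# Phenom II X4 965 BE, in that order). Names are illustrative only.

def spread_pct(fps_values: list[float]) -> float:
    """Gap between best and worst result, as a percent of the worst."""
    return (max(fps_values) - min(fps_values)) / min(fps_values) * 100.0


gpu_limited_fps = {
    "Crysis Warhead (ambush)": [20.8, 20.8, 20.9],
    "Crysis Warhead (avalanche)": [23.0, 22.9, 23.0],
    "Crysis Warhead (frost)": [21.4, 21.5, 21.5],
}

if __name__ == "__main__":
    for benchmark, values in gpu_limited_fps.items():
        print(f"{benchmark}: {spread_pct(values):.2f}% spread across CPUs")
```

In every run the gap works out to well under 1%, which is just 0.1 fps in absolute terms.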

Now that doesn't mean you should ignore performance data and buy a slower CPU. You always want to purchase the best performing CPU you can at any given price point. That ensures that, regardless of the CPU/GPU balance in applications and games, you're always left with the best performance possible.

The Test

Motherboard:         Intel DP55KG (Intel P55)
                     Intel DX58SO (Intel X58)
                     Intel DX48BT2 (Intel X48)
                     Gigabyte GA-MA790FXT-UD5P (790FX)
Chipset:             Intel X48
                     Intel X58
                     Intel P55
                     AMD 790FX
Chipset Drivers:     Intel 9.1.1.1015 (Intel)
                     AMD Catalyst 9.8
Hard Disk:           Intel X25-M SSD (80GB)
Memory:              Qimonda DDR3-1066 4 x 1GB (7-7-7-20)
                     Corsair DDR3-1333 4 x 1GB (7-7-7-20)
                     Patriot Viper DDR3-1333 2 x 2GB (7-7-7-20)
Video Card:          eVGA GeForce GTX 280
Video Drivers:       NVIDIA ForceWare 190.62 (Win764)
                     NVIDIA ForceWare 180.43 (Vista64)
                     NVIDIA ForceWare 178.24 (Vista32)
Desktop Resolution:  1920 x 1200
OS:                  Windows Vista Ultimate 32-bit (for SYSMark)
                     Windows Vista Ultimate 64-bit
                     Windows 7 64-bit

Turbo mode is enabled for the P55 and X58 platforms.

341 Comments

  • Seramics - Wednesday, September 09, 2009

    So what's the big deal here? I dun tink its that impressive, just good. While $196 of 750 look to outcompete the "way" more expensive $245 of AMD's 965, the truth is that the mobo that you need to pair the 750/860/870 is far from being competitive. P55 is severely stripped down and it is only slightly cheaper than their X58 counterpart. So wht if 750 is cheaper than 965 by about $50? Did you just buy the cpu only? Ppl shud at least look at the CPU+mobo price because they both come together. Truth is, when you take into account mobo price, 750 is far from outcompete 965. Added up, I think its only about balanced. The 750 is a better CPU, but it also cost more. In comparison to their socket 1366 partner, socket 1156 system cost a little less, but they are also inferior a little bit. So what's special them? Sure, there are better turbo and better thermal performance. For me, that is all that is good about the 1156 CPU. For enthusiast, socket 1366 is the way to go.
  • jnr0077 - Friday, July 27, 2012

    i have a i5 750 chip cost £100 a gigabyte GA-P55A-UD6 cost £100 as it has six ram slots 16gb max radeon hd 4850 i love this mobo i cant fault it for the price i find it is a brilliant upgrade for cost i spent £250 considering the price of shops build you own pc you get what you put in :) very happy with the i5 750 1156 socket windows score on basic 500gb 7200 is 5.9 sweet 7.9 with a ssd :) can anyone tell me what the amd 965 hit on base score as i will never DV8 to amd intel 4 me allways :)
  • hob196 - Wednesday, September 09, 2009

    Hi,
    Thanks for another great article.
    I figure that having PCI-e on chip would be great to reduce the latency. Any thoughts about plugging non graphics PCI-e cards into the second PCI-e slot?
    I've heard some motherboards cripple the 2nd slots performance down to x1 if you plug an x1 card in the other slot (in a shared x8 environment)any evidence of this?

    In case you're curious I work with digital audio in a studio environment and I'm always striving to reduce the latency of audio going through the CPU.
    These days, the latency (in streaming audio) is down to how fast the CPU can push floating point plus any overhead for the buffers in the various busses you go through. e.g. A firewire sound interface adds a few ms because of the inherent buffers between CPU -> Northbridge -> Southbridge -> Firewire -> Interface.
  • tempestor - Wednesday, September 09, 2009

    Another great article Anand!

    You should consider a 2nd job as a novel writer! :D

    lp, M.
  • AndyKH - Wednesday, September 09, 2009

    I don't really get it:
    It is stated that most PCIe cards don't work well with higher frequencies and that the BCLK frequency should be kept at multiples of 133 MHz, and then they overclock it using a BCLK of ~200 MHz in one instance???
    Doesn't the 133 MHz requirement make it pretty much impossible to overclock?

    Someone please enlighten me.
  • Anand Lal Shimpi - Wednesday, September 09, 2009

    It doesn't make it impossible to overclock, just impossible to overclock (very high) without additional voltage.

    Take care,
    Anand
  • AndyKH - Thursday, September 10, 2009

    Thank you for the response!

    I see how using a higher voltage will increase switching speed of the buffers driving the PCIe bus. However, I fail to see why it would make it any less dificult for PCIe cards to cope with the increased clock frequency, unless the increased voltage is also fed to the PCIe cards (is this the case?). Otherwise I assume they would surely experience the same problems driving communication to the CPU?

    Also, you write multiples of 133 MHz but overclock to 200 MHz BCLK. Shouldn't it read multiples of 33 MHz?
  • TotalLamer - Wednesday, September 09, 2009

    I really, really don't understand why Anand is so obsessed with Turbo Modes. Any enthusiast who dares call himself such is going to clock this chip to the moon, at which point Turbo doesn't do anything. So with a 4.2GHz i7 870, all you're really left with is an i7 920 with worse multi-GPU gaming performance and a less-certain upgrade path.
  • coconutboy - Wednesday, September 09, 2009

    You're assuming all enthusiasts think like you do, but the heavy majority of people (enthusiast or not) want nothing to do with a $500+ i7 870 cpu. The i7 920, 860, and i5 920 are much more attractive options.

    There are plenty of "enthusiasts" who instead prefer silent computers that use no fans, or people living in hot climates who focus on very low temps, or all manner of different things. On top of that, the overwhelming majority of people simply do not care about any of the aforementioned, and those people buy the heavy majority of computers.

    I started OCing in 1996, and used to OC pretty heavily, but got tired of constant tweaking or seeing my well-worn parts die prematurely. Now I tend to focus on very quiet computers that have a small/moderate overclock. So taking an i5 750 or i7 860 and raising it up 200-400 MHz and leaving turbo on is very appealing to me. Also of note is the extra heat generated and the extra money I'll spend on my electric bill by having a 24/7 overclock versus turbo modes. Dig the link and scroll to the bottom-

    http://www.guru3d.com/article/core-i5-750-core-i7-...
    review-test/10

    The 13 watt increase at idle is no big deal, but 133 extra watts under load, well... it's worth the performance boost and heat to some folks, but other people (like me) look at those things as tradeoffs that need to be weighed versus reliability, cost for extra cooling, noise, my electric bill etc.
  • Skiprudder - Thursday, September 10, 2009

    I think that some of us are quite honestly getting more green conscious these days too. It's nice to have a CPU this fast that's also this energy efficient. We can get similar to OCed performance at a much smaller power envelope. I know it doesn't add up to a lot over the course of a year (less than $100 I assume), but these things add up and it saves me some dinero on the power bills!
