To start, we want to thank the many manufacturers who have donated kit for our test beds in order to make this review, along with many others, possible.

Thank you to OCZ for providing us with 1250W Gold Power Supplies.
Thank you to G.Skill for providing us with the memory kits.
Thank you to ASUS for providing us with the AMD GPUs and some IO Testing kit.
Thank you to ECS for providing us with the NVIDIA GPUs.
Thank you to Corsair for providing us with the Corsair H80i CLC.
Thank you to Rosewill for providing us with the 500W Platinum Power Supply for mITX testing, the BlackHawk Ultra case and 1600W Hercules PSU for extreme dual CPU + quad GPU testing, and RK-9100 keyboards.
Thank you to Gigabyte for providing us with the X5690 CPUs.

Also many thanks go to the manufacturers who over the years have provided review samples which contribute to this review.

Testing Methodology

In order to keep the testing fair, we set strict rules in place for each of these setups. For every new chipset, the SSD was formatted and a fresh installation of the OS was applied. The chipset drivers for the motherboard were installed, followed by the NVIDIA drivers and then the AMD drivers. The games were preinstalled on a second partition and relinked to ensure they worked properly. The games were then tested as follows:

Metro 2033: Benchmark Mode, two runs of four scenes at 1440p, max settings. The first run of four is discarded and the average of the second run is taken, minus outliers (see the sketch after this list).

DiRT 3: Benchmark Mode, four runs of the first scene with 8 cars at 1440p, max settings. Average is taken.

Civilization V: One five minute run of the benchmark mode accessible at the command line, at 1440p and max settings. Results produced are total frames in sets of 60 seconds, average taken.

Sleeping Dogs: Using the Adrenaline benchmark software, four scenes at 1440p in Ultra settings. Average is taken.
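
To make the averaging convention above concrete, here is a minimal sketch of how the per-game numbers could be reduced. The 10%-from-median outlier rule and all names are illustrative assumptions, not the exact scripts used for this review.

    # Hypothetical sketch of the run-averaging convention described above.
    import statistics

    def average_without_outliers(fps_values, tolerance=0.10):
        """Average per-scene FPS after dropping values more than 10% from the median."""
        med = statistics.median(fps_values)
        kept = [v for v in fps_values if abs(v - med) <= tolerance * med]
        return statistics.mean(kept)

    # Metro 2033: two runs of four scenes each; the first run is discarded.
    metro_runs = [[31.2, 30.8, 29.9, 30.5], [32.0, 31.7, 45.0, 31.9]]
    print(average_without_outliers(metro_runs[1]))   # 45.0 is dropped as an outlier

    # DiRT 3 / Sleeping Dogs: straight average of the four runs/scenes.
    print(statistics.mean([82.1, 81.7, 82.4, 81.9]))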

If the platform was being used for the next CPU (e.g. the Crosshair V Formula, moving from the FX-8150 to the FX-8350), there was no need to reinstall. If the platform changed for the next test, a full reinstall and setup took place.

How to Read This Review

Due to the large number of different variables in our review, it is hard to accurately label each data point with all the information about that setup. Just listing the CPU model is not enough either, as the same CPU could sit in two different motherboards with different GPU lane allocations. There is also the memory to consider, as well as whether a motherboard enables MCT (MultiCore Turbo) at stock. Here is the set of labels correlating to the configurations you will see in this review:

CPU[+][(CP)] (PCIe version – lane allocation to GPUs [PLX])

First is the name of the CPU, followed by an optional + identifier for MCT-enabled motherboards. (CP) indicates a Bulldozer-derived CPU using the Core Parking updates. Inside the parentheses is the PCIe version of the lanes in use, along with the lane allocation to each GPU. The final flag indicates whether a PLX chip is involved in the lane allocation.

Thus, for example:

A10-5800K (2 – x16/x16): A10-5800K with two GPUs in PCIe 2.0 mode
A10-5800K (CP) (2 – x16/x16): A10-5800K using Core Parking updates with two GPUs in PCIe 2.0 mode
FX-8350 (2 – x16/x16/x8): FX-8350 with three GPUs in PCIe 2.0 mode
i7-3770K (3/2 – x8/x8 + x4): i7-3770K powering three GPUs in PCIe 3.0 but the third GPU is using the PCIe 2.0 x4 from the chipset
i7-3770K+ (3 – x16): i7-3770K (with MCT) powering one GPU in PCIe 3.0 mode
i7-3770K+ (3 – x8/x8/x8/x8 PLX): i7-3770K (with MCT) powering four GPUs in PCIe 3.0 via a PLX chip
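
For anyone who wants to generate or check these labels in a script, a minimal, hypothetical helper could mirror the convention like this (the function and its arguments are purely illustrative, and a plain hyphen stands in for the dash used in the labels above):

    # Hypothetical helper that mirrors the labelling convention above.
    def config_label(cpu, mct=False, core_parking=False, pcie="3", lanes="x16", plx=False):
        """Build a label such as 'i7-3770K+ (3 - x8/x8/x8/x8 PLX)'."""
        name = cpu + ("+" if mct else "") + (" (CP)" if core_parking else "")
        detail = f"{pcie} - {lanes}" + (" PLX" if plx else "")
        return f"{name} ({detail})"

    print(config_label("i7-3770K", mct=True, lanes="x8/x8/x8/x8", plx=True))
    # -> i7-3770K+ (3 - x8/x8/x8/x8 PLX)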

Common Configuration Points

All the system setups below have the following consistent configuration points:

- A fresh install of Windows 7 Ultimate 64-bit
- Either an Intel Stock CPU Cooler, a Corsair H80i CLC or Thermalright TRUE Copper
- OCZ 1250W Gold ZX Series PSUs (Rosewill 1600W Hercules for The Beast)
- Up to 4x ASUS AMD HD 7970 GPUs, using Catalyst 13.1
- Up to 2x ECS NVIDIA GTX 580 GPUs, using GeForce WHQL 310.90
- SSD Boot Drives, either OCZ Vertex 3 128GB or Kingston HyperX 120GB
- LG GH22NS50 Optical Drives
- Open Test Beds, either a DimasTech V2.5 EasyHard or a CoolerMaster Test Lab

AMD Configurations

A6-3650 + Gigabyte A75-UD4H + 16GB DDR3-1866 8-10-10
A8-3850 + ASRock A75 Extreme6 + 16GB DDR3-1866 8-10-10
A8-5600K + Gigabyte F2A85-UP4 + 16GB DDR3-2133 9-10-10
A10-5800K + Gigabyte F2A85-UP4 + 16GB DDR3-2133 9-10-10
X2-555 BE + ASUS Crosshair V Formula + 16GB DDR3-1600 8-8-8
X4-960T + ASUS Crosshair V Formula + 16GB DDR3-1600 8-8-8
X6-1100T + ASUS Crosshair V Formula + 16GB DDR3-1600 8-8-8
FX-8150 + ASUS Crosshair V Formula + 16GB DDR3-2133 10-12-11
FX-8350 + ASUS Crosshair V Formula + 16GB DDR3-2133 9-11-10
FX-8150 + ASUS Crosshair V Formula + 16GB DDR3-2133 10-12-11 + CP
FX-8350 + ASUS Crosshair V Formula + 16GB DDR3-2133 9-11-10 + CP

Intel Configurations

E6400 + MSI i975X Platinum + 4GB DDR2-666 5-6-6
E6700 + ASUS P965 Commando + 4GB DDR2-666 4-5-5
Celeron G465 + ASUS Maximus V Formula + 16GB DDR3-2133 9-11-11
i5-2500K + ASUS Maximus V Formula + 16GB DDR3-2133 9-11-11
i7-2600K + ASUS Maximus V Formula + 16GB DDR3-2133 9-11-11
i3-3225 + ASUS Maximus V Formula + 16GB DDR3-2400 10-12-12
i7-3770K + Gigabyte Z77X-UP7 + 16GB DDR3-2133 9-11-11
i7-3770K + ASUS Maximus V Formula + 16GB DDR3-2400 9-11-11
i7-3770K + Gigabyte G1.Sniper M3 + 16GB DDR3-2400 9-11-11
i7-3930K + ASUS Rampage IV Extreme + 16GB DDR3-2133 10-12-12
i7-3960X + ASRock X79 Professional + 16GB DDR3-2133 10-12-12
Xeon X5690 + EVGA SR-2 + 6GB DDR3-1333 6-7-7
2x Xeon X5690 + EVGA SR-2 + 9GB DDR3-1333 6-7-7

The Beast

The Beast is a special machine put together to help with this review, a result of various hardware coming into my possession at the same time. The core of the system is an EVGA SR-2 motherboard, the best and last dual processor motherboard to deal with overclockable Xeon processors. This is paired with a couple of X5690 Xeon processors, the highest clocked Westmere Xeons that Intel offers, and many thanks go to Gigabyte for loaning these to us for a pair of reviews. I purchased a pair of Intel Xeon socket 1567 coolers for the system, which have a 2U z-height restriction but are copper piped and cooled by powerful (and loud) Delta fans. These provided enough cooling power to push the Xeons from 3.46GHz to 4.6GHz during some overclocking attempts, so they are more than adequate for the job at hand (if you can put up with the noise).

Gallery: The Beast

Our system is paired with some high quality DDR3 Hyper memory, once famed for its overclocking prowess but, due to frequent deaths from high voltage, now treated far more cautiously by overclockers. At stock, however, this memory performs great, often in the region of DDR3-2000 C7, so our memory kits are well primed for this setup.

Of course a full system is nothing without a case and power supply to justify the build. With the motherboard being absolutely huge, no standard case would take it – only large cases designed for desktop-based 2P server motherboards are adequate. Luckily there is one such case which is selling well and which Dustin reviewed recently – the Rosewill Blackhawk Ultra. Aside from the weight, this case had no issues with the motherboard installation; it could easily fit another 10 HDDs, four optical bays, and any major GPU setup you could possibly think of – with plenty of fans just for good measure. Read Dustin’s review for a more thorough analysis, but I have some good shots of the system and motherboard installed for you:


Rosewill also has the perfect power supply for dealing with a dual processor, quad CrossFireX setup. First, consider how many connections this 2P setup needs – we have a normal 24-pin ATX connector for the motherboard, one 8-pin CPU power connector for each CPU, an additional 6-pin PCIe power connector for each CPU to provide extra power, another 6-pin PCIe power connector to provide power to the PCIe slots, and then two 6+2 PCIe power connectors for each of the four GPUs. That makes 11 PCIe connectors needed in total, alongside all the fans in the case and whatever SSD/ODD setup a user wants. The power supply used for this monster is the 1600W Hercules, rated 80 PLUS Silver. With access to 16 PCIe connectors, the only way you might need any more is with a compute rig having seven single slot cards each needing two connectors. With the CPUs and GPUs both overclocked, our system was drawing almost 1500W at the wall (from a 240V source) under a high CPU+GPU load.
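
As a quick sanity check on that count, here is the arithmetic from the paragraph above in sketch form:

    # Counting the PCIe power connectors needed for the 2P, quad-GPU build.
    cpus, gpus = 2, 4
    extra_6pin_per_cpu = 1        # additional 6-pin PCIe connector per CPU
    extra_6pin_for_slots = 1      # additional 6-pin PCIe connector for the PCIe slots
    six_plus_two_per_gpu = 2      # 6+2-pin PCIe connectors per GPU
    total = cpus * extra_6pin_per_cpu + extra_6pin_for_slots + gpus * six_plus_two_per_gpu
    print(total)                  # 11, against the 16 the 1600W Hercules provides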

Using a 2P system as a desktop comes with its own set of issues, namely some CPU benchmarks not being optimized for 2P or, in this case, trouble getting some games to work at all. It seems the more money you throw at a gaming system, the more problems arise, but The Beast provides a nice comparison point when we look at high-end Ivy Bridge, Sandy Bridge-E and Piledriver processors in multi-GPU setups.

Our first port of call with all our testing is CPU throughput analysis, using our regular motherboard review benchmarks.

Comments

  • TheQweaker - Friday, May 10, 2013 - link

    Just in case, here is a pointer to the nVidia GPU AI Path finding in the developer zone:

    https://developer.nvidia.com/gpu-ai-path-finding

    And here is the title of a 2011 GPU AI Planning paper (research; not yet in a game): "Exploiting the Computational Power of the Graphics Card: Optimal State Space Planning on the GPU". You should be able to find the PDF on the web.

    My 2 cents is that it's a good topic for a final paper.

    -- The Qweaker.
  • yougotkicked - Friday, May 10, 2013 - link

    Thanks again, I think I will be doing GPU AI as my final paper, probably try to implement the A* family as massively parallel, or maybe a local beam search using hundreds of hill-climbing threads.
  • TheQweaker - Saturday, May 11, 2013 - link

    Nice project.

    2 more cents.

    Keep it simple is the best advice. It's better to have a running algorithm than none, even if it's slow.

    Also, ask your advisor whether he'd want you to compare with a CPU implementation of yours in order to evaluate the pros and cons between your sequential implementation and your // implementation. I did NOT write "evaluate gains from seq to //" as GPU programming is currently not fully understood, probably not even by nVidia engineers.

    Finally, here is book title: "CUDA Programming: A Developer's Guide to Parallel Computing with GPUs". But there are many others these days.

    OK. That w
  • TheQweaker - Saturday, May 11, 2013 - link

    as my last post.

    -- The Qweaker.
    (sorry for the cut, I wrongly clicked on submit)
  • yougotkicked - Monday, May 13, 2013 - link

    Thanks a lot for all your input. I intend to evaluate not only the advantages of GPU computing but its weak points as well, so I'll be sure to demonstrate the differences between a sequential algorithm, a parallel CPU algorithm, and a massively parallel GPU algorithm.
  • Azusis - Wednesday, May 8, 2013 - link

    Could you test the Q6600 and i7-920 in your next roundup? I have many PC gaming friends, and we all seem to have a Q6600, i7-920, or 2500k in our rigs. Thanks! Great job on the article.
  • IanCutress - Wednesday, May 8, 2013 - link

    I have a Q9400 coming in soon from family - Getting one of the Nehalem/Westmere range is definitely on my to-do list for the next update :)
  • sonofgodfrey - Thursday, May 9, 2013 - link

    I too have a Q6600, but it would be interesting to see the high end (non-extreme edition) Core 2s as well: E8600 & Q9650. Just for yucks, perhaps a socket 775 Pentium 4 could also make an appearance? :)
  • gonks - Wednesday, May 8, 2013 - link

    I knew it some time ago, but this proves once again that it's time to upgrade my good old C2D (Conroe) E6600 @ 3.2GHz
  • Quizzical - Wednesday, May 8, 2013 - link

    You've got a lot of data there. And it's good data if your main purpose is to compare a Radeon HD 7970 to a GeForce GTX 580. Unfortunately, most of it is worthless if you're trying to isolate CPU performance, which is the ostensible purpose of the article. You've gone far out of your way to try to make games GPU-limited so that you wouldn't be able to tell what the various CPUs can do when they're the main limiting factors.

    Loosely, the CPU has to do any work to run a game that isn't done by the GPU. The contents of this can vary wildly from game to game. Unless you're using DirectX 11 multithreaded rendering, only one thread can communicate with the video card at a time. But that one rendering thread mostly consists of passing data to the video card, so you don't do much in the way of real computations there. You do sort some things so that you don't have to switch programs, textures, and so forth more often than necessary, though you can have a separate sorting thread if you're (probably unreasonably) worried that this is going to mean too much work for the rendering thread.

    Actually determining what data needs to be passed to the video card can comprise the bulk of the CPU work that a game needs to do. But this portion is mostly trivial to scale to as many threads as you care to--at least within reason. It's a completely straightforward producer-consumer queue with however many "producer" threads you want and the rendering thread as the single "consumer" thread that takes the data set up by other threads and passes it along to the video card.
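
    A rough, illustrative sketch of that queue in Python (every name here is invented for illustration, not taken from any real engine):

        # Several producer threads prepare per-object draw data; a single
        # "rendering" thread drains the queue, since only one thread talks
        # to the GPU driver. All names here are invented for illustration.
        import queue, threading

        draw_queue = queue.Queue()
        DONE = object()

        def producer(objects):
            for obj in objects:
                draw_queue.put(("draw_data_for", obj))   # stand-in for real setup work

        def render_thread():
            while True:
                item = draw_queue.get()
                if item is DONE:
                    break
                # In a real engine, only this thread would submit to the GPU here.

        producers = [threading.Thread(target=producer, args=(range(i, 100, 4),)) for i in range(4)]
        consumer = threading.Thread(target=render_thread)
        consumer.start()
        for p in producers: p.start()
        for p in producers: p.join()
        draw_queue.put(DONE)
        consumer.join()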

    Not quite all of the work of setting up data for the GPU is trivial to break into as many threads as necessary, though. At the start of a new frame, you have to figure out exactly where the camera is going to go in that frame. This is likely going to be very fast (e.g., tens or hundreds of microseconds), but it does need to be done before you go compute where everything else is relative to the camera.

    While I haven't programmed AI, I'd expect that you could likewise break it up into as many threads as you cared to, as you could "save" the state of the game at some instant in time and have separate threads compute what all AI has to do based on the state of the game at that moment, without needing to know anything about what other game characters were choosing at the same time. Some games are heavy on AI computations, while online games may do essentially no AI computations client-side, so this varies wildly from game to game.
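
    And the snapshot idea in the same rough form: freeze the game state once, then let a pool of workers decide each character's action independently (the toy decision rule and all names are purely illustrative):

        # Decide every AI character's next move from one frozen snapshot of
        # the game state, so the decisions can run in parallel.
        from concurrent.futures import ThreadPoolExecutor

        snapshot = {"player_pos": (10, 4), "characters": [(0, 0), (5, 5), (20, 1)]}

        def decide(char_pos):
            px, py = snapshot["player_pos"]
            x, y = char_pos
            # Toy rule: step one tile toward the player.
            return (x + (px > x) - (px < x), y + (py > y) - (py < y))

        with ThreadPoolExecutor() as pool:
            next_positions = list(pool.map(decide, snapshot["characters"]))
        print(next_positions)   # [(1, 1), (6, 4), (19, 2)]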

    A game engine may do a lot of other things besides these, such as processing inputs, loading data off of the hard drive, sending data over the Internet, or whatever. Some such things can't be readily scaled to many CPU cores, but if you count by CPU work necessary, few games will have all that much stuff to do other than setting up data for the GPU and computing AI.

    But most of the work that a CPU has to do doesn't care what graphical settings you're using. Anything that isn't part of the graphics engine certainly doesn't care. The only parts of the CPU side of a game engine that care what monitor resolution you're using are likely to be a handful of lines to set the resolution when you change it and a few lines to check whether an object is off the camera and therefore doesn't need to be processed in that particular frame--and culling such objects is likely done mostly to save on the GPU load. Any settings that can be adjusted in video drivers (e.g., anti-aliasing or anisotropic filtering) are done almost entirely on the video card and carry a negligible CPU load.

    Thus, if you're trying to isolate CPU performance, you turn down or off settings that don't affect the CPU load. In particular, you want a very low monitor resolution, no anti-aliasing, no anisotropic filtering, and no post-processing effects of any sort. Otherwise, you're just trying to make the game mostly GPU bound, and end up with data that looks like most of what you've collected.

    Furthermore, even if you do the measurements properly, there's also the question of whether the games you've chosen are representative of what most people will play. If you grab the games that you usually benchmark for video cards reviews, then you're going out of your way to pick games that are unrepresentative. Tech sites like this that review hardware tend to gravitate toward badly-coded games that aren't representative of most of the games that people will play. If this video card gets 200 frames per second at max settings in one game and that video card gets 300, what's the difference in real-world game experience? If you want to differentiate between different video cards, you need games that are more demanding, and simply being really inefficient is one way to do that.

    Of course, if you were trying to see how different CPUs affect performance in a mostly GPU-limited game, that can be interesting in an esoteric sense. It would probably tend to favor high single-threaded performance because the only difference you'd be able to pick out are due to things that happen between frames, which is the time that the video card is most likely to be forced to wait on the CPU briefly.

    But if you were trying to do that, why not just use a Radeon HD 5450? The question answers itself.

    If you would like to get some data that will be more representative of how games handle CPUs, then you'll need to do some things very differently. For starters, use just a single powerful GPU, to avoid any CrossFire or SLI weirdness. A GeForce GTX Titan is ideal, but a Radeon HD 7970 or GeForce GTX 680 would be fine. For that matter, if you're not stupid about picking graphical settings, something weaker like a Radeon HD 7870 or GeForce GTX 660 would probably work just fine. But you need to choose the graphical settings intelligently, by turning down or off any graphical settings that don't affect CPU load. In particular, anti-aliasing, anisotropic filtering, and all post-processing effects should be completely off. Use a fairly low monitor resolution; certainly no higher than 1920x1080, and you could make a good case for 1366x768.

    And then don't pick your usual set of games that you use to do video card reviews. You chose those games precisely because they're outliers that won't give a good gauge of CPU performance, so they'll sabotage your measurements if you're trying to isolate CPU performance. Rather, pick games that you rejected from doing video card reviews because they were unable to distinguish between video cards very well. If the results are that in a typical game, this processor can deliver 200 frames per second and that one can do 300, then so be it. If a Core i5-3570K and an FX-6300 can deliver hundreds of frames per second in most games (as is likely if the game runs well on, say, a 2 GHz Core 2 Duo), then you shouldn't shy away from that conclusion.
