
After testing for this review, one thing is clear in my mind – the performance of CPUs paired with a single GPU is hitting a limit. As games get more complex, those designing the graphics and physics engines know that shifting calculations onto the GPU gives a greater boost in performance. If an engine is written to take advantage of the GPU, then the CPU does not really matter for the most part. If you can transfer textures over to the GPU and keep them in memory, the work of the CPU is essentially done apart from light maintenance or interfacing with the network.

Perhaps a better test would have been with more mid-range GPUs, such as 660 Tis or 7790s; with limited memory on the GPU itself, having that faster CPU and faster DDR3 memory might make a big difference. However the ecosystem may be that a gamer can buy a good GPU and not have to worry that the CPU might be a bit underpowered. Unless you need the performance of a big CPU, the big GPU should be a main priority if it means the CPU is less of a concern at the higher GPU/resolutions.

There is also scope for those using less powerful GPUs, such that the CPU could matter a lot more in this scenario. With limited memory, the CPU would have to organize more texture copies between the memory and the GPU, causing other aspects of the system to become the limiting factor. This is very important when interpreting our results. However, our results for our testing scenarios show several points worth noting.

Firstly, it is important to test accurately, fairly, and with good will. Running a comparative test without understanding how it works underneath misleads the audience, and that is a poor game to play. Leave the bias at home and let the results do the talking.

In three of our games, having a single GPU makes almost no difference to which CPU performs best. Civilization V was the sole exception, and it also has issues scaling when you add more GPUs if you do not have the most expensive CPUs on the market. For Civilization V, I would suggest having only a single GPU and trying to get the best out of it.

In DiRT 3, Sleeping Dogs and Metro 2033, almost every CPU performed the same in a single GPU setup. As the GPU count went up, DiRT 3 leaned towards PCIe 3.0 above two GPUs, Metro 2033 started to lean towards Intel CPUs, and Sleeping Dogs needed CPU power when scaling up.

Above three GPUs, the extra horsepower from the single thread performance of an Intel CPU starts to make sense, with as much as 70 FPS difference in DiRT 3. Sleeping Dogs also starts to become sensitive to CPU choice.

We Know What Is Missing

On my list of future updates to this article, we need an i5-3570K processor, as well as dual and tri-module Piledriver and an i7-920 for a roundup. I will have a short window soon to rummage in a large storeroom of processors, which will be a prime opportunity for some of the harder to acquire CPUs. Haswell is just around the corner and should provide an interesting update to data points across the spectrum, in most of its desktop forms. From now on I will aim to cover all the different PCIe lane allocations in a chipset, as well as some of those odd ones caused by PLX chips.

If you have a specific processor you would like me to test for a future article, please leave a note below in the comments, and we will try to cover it. :) Top of that list is an i5-3570K, followed by Haswell, then some more AMD cores. I have 29 more processors on my 'ideal' list (if I can get them), but if anyone has any suggestions that I may not have thought of, please let me know. If I am able to get a hold of Titans, I may be in a position to retest across the board for NVIDIA results, meaning another benchmark or two as well (Bioshock Infinite perhaps).

Recommendations for the Games Tested at 1440p/Max Settings

A CPU for Single GPU Gaming: A8-5600K + Core Parking updates

If I were gaming today on a single GPU, the A8-5600K (or non-K equivalent) would strike me as a price competitive choice for frame rates, as long as you are not a big Civilization V player and don’t mind the single threaded performance. The A8-5600K scores within a percentage point or two across the board in single GPU frame rates with both a HD7970 and a GTX580, and feels the same in the OS as an equivalent Intel CPU. The A8-5600K will also overclock a little, giving a boost, and comes in at a stout $110, meaning that some of those $$$ can go towards a beefier GPU or an SSD. The only downside is if you are planning some heavy OS work – if the software is Piledriver-aware all might be well, although most software is not, and perhaps an i3-3225 or FX-8350 might be worth a look.

A CPU for Dual GPU Gaming: i5-2500K or FX-8350

Looking back through the results, moving to a dual GPU setup obviously has some issues. Various AMD platforms are not certified for dual NVIDIA cards for example, meaning while they may excel for AMD, you cannot recommend them for Team Green. There is also the dilemma that while in certain games you can be fairly GPU limited (Metro 2033, Sleeping Dogs), there are others where having the CPU horsepower can double the frame rate (Civilization V).

After the overview, my recommendation for dual GPU gaming comes in at the feet of the i5-2500K. This recommendation may seem odd – these chips are not the latest from Intel, but chances are that pre-owned they will be hitting a nice price point, especially if/when people move over to Haswell. If you were buying new, the obvious answer would be looking at an i5-3570K on Ivy Bridge rather than the 2500K, so consider this suggestion a minimum CPU recommendation.

On the AMD side, the FX-8350 puts up a good show across most of the benchmarks, but falls spectacularly in Civilization V. If this is not the game you are aiming for and you want to invest in AMD, then the FX-8350 is a good choice for dual GPU gaming.

A CPU for Tri-GPU Gaming: i7-3770K with an x8/x4/x4 (AMD) or PLX (NVIDIA) motherboard

By moving up in GPU power we also have to boost the CPU power in order to see the best scaling at 1440p. It might be a sad thing to hear but the only CPU in our testing that provides the top frame rates at this level is the top line Ivy Bridge model. For a comparison point, the Sandy Bridge-E 6-core results were often very similar, but the price jump to such a setup is prohibitive to all but the most sturdy of wallets.

As noted in the introduction, using 3-way on NVIDIA with Ivy Bridge will require a PLX motherboard in order to get enough lanes to satisfy the SLI requirement of x8 minimum per GPU. This also raises the bar in terms of price, as PLX motherboards start around the $280 mark. For a 3-way AMD setup, an x8/x4/x4 enabled motherboard performs similarly to a PLX enabled one, and ahead of the slightly crippled x8/x8 + x4 variations. However, investing in a PLX board would help when moving to a 4-way setup, should that be your intended goal. In either scenario, at stock clocks, the i7-3770K is the processor of choice from our testing suite.
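To make the lane argument concrete, here is a minimal sketch that checks a slot layout against the x8 per-GPU minimum that SLI needs. The helper name and the dictionary are purely illustrative, and the x16/x8/x8 entry is an assumed example of what a PLX board might offer rather than a layout taken from our test boards.

```python
# Minimal sketch: does every GPU slot get at least the SLI minimum of x8?
# The helper name and layout table are illustrative, not from any real tool.

SLI_MIN_LANES = 8  # per-GPU minimum for SLI


def meets_lane_minimum(allocation, minimum=SLI_MIN_LANES):
    """Return True if every slot in the allocation has at least `minimum` lanes."""
    return all(lanes >= minimum for lanes in allocation)


# Lane layouts written as lanes per GPU slot.
layouts = {
    "x8/x4/x4": [8, 4, 4],
    "x8/x8 + x4": [8, 8, 4],
    "x16/x8/x8 (assumed PLX layout)": [16, 8, 8],
}

for name, allocation in layouts.items():
    verdict = "meets" if meets_lane_minimum(allocation) else "falls short of"
    print(f"{name} {verdict} the x{SLI_MIN_LANES} per-GPU minimum")
```

The check only models the SLI lane rule. It says nothing about frame rates, which is why an x8/x4/x4 board can still be a sensible choice for a 3-way AMD setup.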

A CPU for Quad-GPU Gaming: i7-3770K with a PLX motherboard

A four-way GPU configuration is for those insane few users that have both the money and the physical requirement for pixel power. We are all aware of the law of diminishing returns, and more often than not adding that fourth GPU is taking the biscuit for most resolutions. Despite this, even at 1440p, we see awesome scaling in games like Sleeping Dogs (+73% of a single card moving from three to four cards) and more recently I have seen that four-way GTX680s help give BF3 in Ultra settings a healthy 35 FPS minimum on a 4K monitor. So while four-way setups are insane, there is clearly a usage scenario where it matters to have card number four.
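For anyone wanting to reproduce the scaling figures, the sketch below shows how I read a number like "+73% of a single card": the frame rate gained by adding a card, divided by the single-card frame rate. The function name is my own, and the frame rates are hypothetical values picked only to show the arithmetic, not measurements from this review.

```python
def scaling_gain_percent(fps_single, fps_before, fps_after):
    """Frame-rate gain from adding a card, as a percentage of single-card FPS."""
    return 100.0 * (fps_after - fps_before) / fps_single


# Hypothetical frame rates chosen only to illustrate the arithmetic;
# they are NOT measurements from this review.
fps_one_card = 50.0
fps_three_cards = 120.0
fps_four_cards = 156.5

gain = scaling_gain_percent(fps_one_card, fps_three_cards, fps_four_cards)
print(f"Fourth card adds {gain:.0f}% of a single card's frame rate")
```

Expressing the gain relative to one card, rather than relative to the three-card result, keeps each extra GPU comparable: perfect scaling would add 100% per card.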

Our testing was pretty clear as to what CPUs are needed at 1440p with fairly powerful GPUs. While the i7-2600K was nearly there in all our benchmarks, only two sets of CPUs made sure of the highest frame rates – the i7-3770K and any six-core Sandy Bridge-E. As mentioned in the three-way conclusion, the price barrier to SB-E is a big step for most users (even if they are splashing out $1500+ on four big cards), giving the nod to an Ivy Bridge configuration. Of course that i7-3770K CPU will have to be paired with a PLX enabled motherboard as well.

One could argue that with overclocking the i7-2600K could come into play, and I don’t doubt that is the case. People building three and four way GPU monsters are more than likely to run extra cooling and overclock. Unfortunately that adds plenty of variables and extra testing which will have to be made at a later date. For now our recommendation at stock, for 4-way at 1440p, is an i7-3770K CPU.

What to Take Away From Our Testing

Ultimately the spectrum for testing this sort of thing is huge - the minute you deal with multiple GPUs in a system, testing different GPUs, testing different resolutions, testing different quality settings, and then extrapolating those across the normal array of benchmarks we apply to a GPU test, we might as well spend a month just looking at a single CPU platform!

We know the testing done here today looks at a niche scenario - 1440p at Max Settings using very powerful GPUs. The trend in gaming, as I see it, will be towards the higher resolution panels, and with Korean 27" monitors coming into the market, if you're ok with that sort of monitor it is a direction to take to improve your gaming experience. 4K is on the horizon, which means either more pixel pushing power or lower resolutions/settings if you want the quality. Testing at 1440p/max settings is something I like to do as it pushes the GPU and hopefully the rest of the system - if you're a gamer, you want the best experience, and finding the hardware to do that is one of the most important things in that process (after getting good at the game you want).

So these results are offered in order to aid a purchasing decision based on our small sample size. No sample size is ever going to be big enough (unless you are able to test in Narnia), but we hope to expand on this in the future. Consider the data, read our conclusions - you may have a different interpretation of the data. Let us know what you think!

  • TheQweaker - Friday, May 10, 2013

    Just in case, here is a pointer to the nVidia GPU AI Path finding in the developer zone:

    And here is the title of a 2011 GPU AI Planning paper (research; not yet in a game): "Exploiting the Computational Power of the Graphics Card: Optimal State Space Planning on the GPU". You should be able to find the PDF on the web.

    My 2 cents is that it's a good topic for a final paper.

    -- The Qweaker.
  • yougotkicked - Friday, May 10, 2013

    Thanks again, I think I will be doing GPU AI as my final paper, probably try to implement the A* family as massively parallel, or maybe a local beam search using hundreds of hill-climbing threads.
  • TheQweaker - Saturday, May 11, 2013

    Nice project.

    2 more cents.

    Keep it simple is the best advice. It's better to have a running algorithm than none, even if it's slow.

    Also, ask your advisor whether he'd want you to compare with a CPU implementation of yours in order to evaluate the pros and cons between your sequential implementation and your // implementation. I did NOT write "evaluate gains from seq to //" as GPU programming is currently not fully understood, probably even not by nVidia engineers.

    Finally, here is book title: "CUDA Programming: A Developer's Guide to Parallel Computing with GPUs". But there are many others these days.

    OK. That w
  • TheQweaker - Saturday, May 11, 2013

    as my last post.

    -- The Qweaker.
    (sorry for the cut, I wrongly clicked on submit)
  • yougotkicked - Monday, May 13, 2013

    thanks a lot for all your input, I intend to evaluate not only the advantages of GPU computing, but its weak points as well, so I'll be sure to demonstrate the differences between a sequential algorithm, a parallel CPU algorithm, and a massively parallel GPU algorithm.
  • Azusis - Wednesday, May 8, 2013

    Could you test the Q6600 and i7-920 in your next roundup? I have many PC gaming friends, and we all seem to have a Q6600, i7-920, or 2500k in our rigs. Thanks! Great job on the article.
  • IanCutress - Wednesday, May 8, 2013

    I have a Q9400 coming in soon from family - Getting one of the Nehalem/Westmere range is definitely on my to-do list for the next update :)
  • sonofgodfrey - Thursday, May 9, 2013

    I too have a Q6600, but it would be interesting to see the high end (non-extreme edition) Core 2s as well: E8600 & Q9650. Just for yucks, perhaps a socket 775 Pentium 4 could also make an appearance? :)
  • gonks - Wednesday, May 8, 2013

    i knew it from some time ago, but this proves once again that it's time to upgrade my good old c2d (conroe) E6600 @ 3.2Ghz
  • Quizzical - Wednesday, May 8, 2013

    You've got a lot of data there. And it's good data if your main purpose is to compare a Radeon HD 7970 to a GeForce GTX 580. Unfortunately, most of it is worthless if you're trying to isolate CPU performance, which is the ostensible purpose of the article. You've gone far out of your way to try to make games GPU-limited so that you wouldn't be able to tell what the various CPUs can do when they're the main limiting factors.

    Loosely, the CPU has to do any work to run a game that isn't done by the GPU. The contents of this can vary wildly from game to game. Unless you're using DirectX 11 multithreaded rendering, only one thread can communicate with the video card at a time. But that one rendering thread mostly consists of passing data to the video card, so you don't do much in the way of real computations there. You do sort some things so that you don't have to switch programs, textures, and so forth more often than necessary, though you can have a separate sorting thread if you're (probably unreasonably) worried that this is going to mean too much work for the rendering thread.

    Actually determining what data needs to be passed to the video card can comprise the bulk of the CPU work that a game needs to do. But this portion is mostly trivial to scale to as many threads as you care to--at least within reason. It's a completely straightforward producer-consumer queue with however many "producer" threads you want and the rendering thread as the single "consumer" thread that takes the data set up by other threads and passes it along to the video card.

    Not quite all of the work of setting up data for the GPU is trivial to break into as many threads as necessary, though. At the start of a new frame, you have to figure out exactly where the camera is going to go in that frame. This is likely going to be very fast (e.g., tens or hundreds of microseconds), but it does need to be done before you go compute where everything else is relative to the camera.

    While I haven't programmed AI, I'd expect that you could likewise break it up into as many threads as you cared to, as you could "save" the state of the game at some instant in time and have separate threads compute what all AI has to do based on the state of the game at that moment, without needing to know anything about what other game characters were choosing at the same time. Some games are heavy on AI computations, while online games may do essentially no AI computations client-side, so this varies wildly from game to game.

    A game engine may do a lot of other things besides these, such as processing inputs, loading data off of the hard drive, sending data over the Internet, or whatever. Some such things can't be readily scaled to many CPU cores, but if you count by CPU work necessary, few games will have all that much stuff to do other than setting up data for the GPU and computing AI.

    But most of the work that a CPU has to do doesn't care what graphical settings you're using. Anything that isn't part of the graphics engine certainly doesn't care. The only parts of the CPU side of a game engine that care what monitor resolution you're using are likely to be a handful of lines to set the resolution when you change it and a few lines to check whether an object is off the camera and therefore doesn't need to be processed in that particular frame--and culling such objects is likely done mostly to save on the GPU load. Any settings that can be adjusted in video drivers (e.g., anti-aliasing or anisotropic filtering) are done almost entirely on the video card and carry a negligible CPU load.

    Thus, if you're trying to isolate CPU performance, you turn down or off settings that don't affect the CPU load. In particular, you want a very low monitor resolution, no anti-aliasing, no anisotropic filtering, and no post-processing effects of any sort. Otherwise, you're just trying to make the game mostly GPU bound, and end up with data that looks like most of what you've collected.

    Furthermore, even if you do the measurements properly, there's also the question of whether the games you've chosen are representative of what most people will play. If you grab the games that you usually benchmark for video cards reviews, then you're going out of your way to pick games that are unrepresentative. Tech sites like this that review hardware tend to gravitate toward badly-coded games that aren't representative of most of the games that people will play. If this video card gets 200 frames per second at max settings in one game and that video card gets 300, what's the difference in real-world game experience? If you want to differentiate between different video cards, you need games that are more demanding, and simply being really inefficient is one way to do that.

    Of course, if you were trying to see how different CPUs affect performance in a mostly GPU-limited game, that can be interesting in an esoteric sense. It would probably tend to favor high single-threaded performance because the only differences you'd be able to pick out are due to things that happen between frames, which is the time that the video card is most likely to be forced to wait on the CPU briefly.

    But if you were trying to do that, why not just use a Radeon HD 5450? The question answers itself.

    If you would like to get some data that will be more representative of how games handle CPUs, then you'll need to do some things very differently. For starters, use just a single powerful GPU, to avoid any CrossFire or SLI weirdness. A GeForce GTX Titan is ideal, but a Radeon HD 7970 or GeForce GTX 680 would be fine. For that matter, if you're not stupid about picking graphical settings, something weaker like a Radeon HD 7870 or GeForce GTX 660 would probably work just fine. But you need to choose the graphical settings intelligently, by turning down or off any graphical settings that don't affect CPU load. In particular, anti-aliasing, anisotropic filtering, and all post-processing effects should be completely off. Use a fairly low monitor resolution; certainly no higher than 1920x1080, and you could make a good case for 1366x768.

    And then don't pick your usual set of games that you use to do video card reviews. You chose those games precisely because they're outliers that won't give a good gauge of CPU performance, so they'll sabotage your measurements if you're trying to isolate CPU performance. Rather, pick games that you rejected from doing video card reviews because they were unable to distinguish between video cards very well. If the results are that in a typical game, this processor can deliver 200 frames per second and that one can do 300, then so be it. If a Core i5-3570K and an FX-6300 can deliver hundreds of frames per second in most games (as is likely if the game runs well on, say, a 2 GHz Core 2 Duo), then you shouldn't shy away from that conclusion.
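The producer-consumer arrangement described in the comment above, with several threads preparing data and a single rendering thread passing it to the video card, can be sketched roughly as follows. This is only an illustration of the queue structure under simple assumptions: every name here is hypothetical, the "draw data" is a stand-in dictionary, and no real graphics API is involved.

```python
import queue
import threading

NUM_PRODUCERS = 4  # illustrative; any number of producer threads would do
SENTINEL = None    # each producer sends this once when it has finished


def producer(objects, out_queue):
    """Prepare per-object draw data and hand it to the rendering thread."""
    for obj in objects:
        draw_data = {"id": obj, "payload": obj * 2}  # stand-in for real setup work
        out_queue.put(draw_data)
    out_queue.put(SENTINEL)


def render_thread(in_queue, num_producers, submitted):
    """Single consumer: the only thread that would talk to the video card."""
    finished = 0
    while finished < num_producers:
        item = in_queue.get()
        if item is SENTINEL:
            finished += 1
            continue
        submitted.append(item["id"])  # stand-in for issuing a draw call


def run_frame(scene_objects, num_producers=NUM_PRODUCERS):
    """Run one frame: split the scene among producers, drain it with one consumer."""
    work_queue = queue.Queue()
    submitted = []
    # Give each producer an interleaved slice of the scene.
    slices = [scene_objects[i::num_producers] for i in range(num_producers)]
    consumer = threading.Thread(
        target=render_thread, args=(work_queue, num_producers, submitted)
    )
    producers = [
        threading.Thread(target=producer, args=(s, work_queue)) for s in slices
    ]
    consumer.start()
    for t in producers:
        t.start()
    for t in producers:
        t.join()
    consumer.join()
    return submitted


if __name__ == "__main__":
    result = run_frame(list(range(20)))
    print(f"Submitted {len(result)} objects from {NUM_PRODUCERS} producer threads")
```

Note that in standard CPython the global interpreter lock prevents these threads from running CPU-bound work in parallel, so the sketch shows the shape of the design rather than a real speedup. A game engine would normally do this in a language with native threads.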

