CPU Scaling

Diving into our look at DirectX 12, let’s start with what is going to be the most critical component for a benchmark like Star Swarm, the CPU scaling.

Because Star Swarm is designed to exploit the threading inefficiencies of DirectX 11, the biggest gains from switching to DirectX 12 on Star Swarm come from removing the CPU bottleneck. Under DirectX 11 the bulk of Star Swarm’s batch submission work happens under a single thread, and as a result the benchmark is effectively bottlenecked by single-threaded performance, unable to scale out with multiple CPU cores. This is one of the issues DirectX 12 sets out to resolve, with the low-level API allowing Oxide to more directly control how work is submitted, and as such better balance it over multiple CPU cores.

Star Swarm CPU Scaling - Extreme Quality - GeForce GTX 980

Star Swarm CPU Scaling - Extreme Quality - Radeon R9 290X

Starting with a look at CPU scaling on our fastest cards, what we find is that besides the absurd performance difference between DirectX 11 and DirectX 12, performance scales roughly as we'd expect among our CPU configurations. Star Swarm's DirectX 11 path, being bound by single-threaded performance, scales only slightly with clockspeed and core count increases. The DirectX 12 path, on the other hand, scales moderately well from 2 to 4 cores, but doesn't scale beyond that. This is because at these settings, even pushing over 100K draw calls, both GPUs are solidly GPU limited; anything more than 4 cores goes to waste as we're no longer CPU-bound. This means we don't even need a highly threaded processor to take advantage of DirectX 12's strengths in this scenario, as even a 4 core processor provides plenty of kick.

Meanwhile this setup also highlights the fact that under DirectX 11, there is a massive difference in performance between AMD and NVIDIA. In both cases we are completely CPU bound, with AMD’s drivers only able to deliver 1/3rd the performance of NVIDIA’s. Given that this is the original Mantle benchmark I’m not sure we should read into the DirectX 11 situation too much since AMD has little incentive to optimize for this game, but there is clearly a massive difference in CPU efficiency under DirectX 11 in this case.

Star Swarm D3D12 CPU Scaling - Extreme Quality

Having effectively ruled out the need for 6 core CPUs for Star Swarm, let’s take a look at a breakdown across all of our cards for performance with 2 and 4 cores. What we find is that Star Swarm and DirectX 12 are so efficient that only our most powerful card, the GTX 980, finds itself CPU-bound with just 2 cores. For the AMD cards and other NVIDIA cards we can get GPU bound with the equivalent of an Intel Core i3 processor, showcasing just how effective DirectX 12’s improved batch submission process can be. In fact it’s so efficient that Oxide is running both batch submission and a complete AI simulation over just 2 cores.

Star Swarm CPU Batch Submission Time (4 Cores)

Speaking of batch submission, if we look at Star Swarm's statistics we can find out just what's going on under the hood. The results are nothing short of incredible, particularly in the case of AMD. Batch submission time is down from dozens of milliseconds or more to just 3-5ms for our fastest cards, an improvement of just over a whole order of magnitude. For all practical purposes the need to spend CPU time to submit batches has been eliminated entirely, with upwards of 120K draw calls being submitted in a handful of milliseconds. It is this optimization that is at the core of Star Swarm's DirectX 12 performance improvements, and going forward it could potentially benefit many other games as well.


Another metric we can look at is actual CPU usage as reported by the OS, as shown above. In this case CPU usage more or less perfectly matches our earlier expectations: with DirectX 11 both the GTX 980 and R9 290X show very uneven usage with 1-2 cores doing the bulk of the work, whereas with DirectX 12 CPU usage is spread out evenly over all 4 CPU cores.

At the risk of belaboring the point, what we're seeing here is exactly why Mantle, DirectX 12, OpenGL Next, and other low-level APIs have been created. With single-threaded performance struggling to increase while GPUs continue to improve by leaps and bounds with each generation, something must be done to allow games to better spread out their rendering & submission workloads over multiple cores. The solution to that problem is to eliminate the abstraction and let developers do it themselves through APIs like DirectX 12.


245 Comments


  • james.jwb - Saturday, February 7, 2015 - link

    Every single paragraph is completely misinformed, how embarrassing.
  • Alexey291 - Saturday, February 7, 2015 - link

    Yeah ofc it is ofc it is.
  • inighthawki - Saturday, February 7, 2015 - link

    >> There's no benefit for me who only uses a Windows desktop as a gaming machine.

    Wrong. Game engines that utilize DX12 or Mantle, and are actually optimized for it, can increase the number of draw calls that usually cripples current generation graphics APIs. This allows a far larger number of objects in game. This is the exact purpose of the Star Swarm demo in the article. If you cannot see the advantage of this, then you are too ignorant about the technology you're talking about, and I suggest you go learn a thing or two before pretending to be an expert in the field. Because yes, even high end gaming rigs can benefit from DX12.

    You also very clearly missed the pages of the article discussing the increased consistency of performance. DX12 will have better frame to frame deltas making the framerate more consistent and predictable. Let's not even start with discussions on microstutter and the like.

    >> Dx12 is not interesting either because my current build is actually limited by vsync. Nothing else but 60fps vsync (fake fps are for kids). And it's only a mid range build.

    If you have a mid range build and are limited by vsync, you are clearly not playing quite a few games out there that would bring your rig to its knees. 'Fake fps' is not a term, but I assume you are referring to unbounded framerate from not synchronizing with vsync. Vsync has its own disadvantages: increased input lag, and framerate halving from missing the vsync period. Now if only DirectX supported proper triple buffering to allow reduced input latency with vsync on. It's also funny how you insult others as 'kids' as if you are playing in a superior mode, yet you are still on a 60Hz display...

    >> So why should I bother if all I do in Windows at home is launch steam (or a game from an icon on the desktop) aaaand that's it?

    Because any rational person would want better, more consistent performance in games capable of rendering more content, especially when they don't even have a high end rig. The fact that you don't realize how absurdly stupid your comment is makes the whole situation hilarious. Have fun with your high overhead games on your mid range rig.
  • ymcpa - Wednesday, February 11, 2015 - link

    The question is what you lose from installing it. There might not be much gain initially, as developers learn to take full advantage and will still be making software optimized for Windows 7. However, as more people switch, you will start seeing games optimized for DX12. If you wait for that point, you will be paying for the upgrade; if you do it now, you get it for free, and I don't see how you lose anything other than a day to install the OS and possibly reinstall apps.
  • Morawka - Friday, February 6, 2015 - link

    Historically, new DX releases have seen a 2-3 year adoption lag by game developers. While some AAA game companies always throw in a couple of the new features at launch, the core of these engines will use DX11 for the next few years.

    However, with DX12 the benefits are probably going to be too huge to ignore. Previous DX releases brought new effects and rendering technologies; DX12 effectively cuts the minimum system requirements by 20-30% on the CPU side and probably 5-10% on the GPU side.

    So DX12 adoption should be much faster IMHO. But it's no biggie if it's not.
  • Friendly0Fire - Saturday, February 7, 2015 - link

    DX12 also has another positive: backwards compatibility. Most new DX API versions introduce new features which will only work on either the very latest generation, or on the future generation of GPUs. DX12 will work on cards released in 2010!

    That alone means devs will be an awful lot less reluctant to use it, since a significant proportion of their userbase can already use it.
  • dragonsqrrl - Saturday, February 7, 2015 - link

    "DX12 will work on cards released in 2010"

    Well, at least if you have an Nvidia card.
  • Pottuvoi - Saturday, February 7, 2015 - link

    DX11 supported 'DX9 SM3' cards as well.
    DX12 will be similar: it will have a layer which works on old cards, but the truly new features will not be available, as the hardware just is not there.
  • dragonsqrrl - Sunday, February 8, 2015 - link

    Yes, but you still need the drivers to enable that API level support. Every DX12 article I've read, including this one, has specifically stated that AMD will not provide DX12 driver support for GPUs prior to GCN 1.0 (the HD 7000 series).
  • Murloc - Saturday, February 7, 2015 - link

    I don't think so, given that they're giving out free upgrades. The money-spending gamers who benefit from this the most will mostly upgrade, and those who aren't interested in computers beyond gaming will be upgraded when they change their computer.
