After several requests and a week’s break from our initial DirectX 12 article, we’re back again with an investigation into Star Swarm DirectX 12 performance scaling on AMD APUs. As our initial article was run on various Intel CPU configurations, this time we’re going to take a look at how performance scales on AMD’s Kaveri APUs, including whether DX12 is much help for the iGPU, and if it can help equalize the single-threaded performance gap between Kaveri and Intel’s Core i3 family.

To keep things simple, this time we’re running everything on either the iGPU or a GeForce GTX 770. Last week we saw how quickly the GPU becomes the bottleneck under Star Swarm when using the DirectX 12 rendering path, and how difficult it is to shift that back to the CPU. And as a reminder, this is an early driver on an early OS running an early DirectX 12 application, so everything here is subject to change.

CPU: AMD A10-7800, AMD A8-7600, Intel i3-4330
Motherboard: GIGABYTE F2A88X-UP4 (AMD), ASUS Maximus VII Impact (Intel)
Power Supply: Rosewill Silent Night 500W Platinum
Hard Disk: OCZ Vertex 3 256GB OS SSD
Memory: G.Skill 2x4GB DDR3-2133 9-11-10 (AMD), G.Skill 2x4GB DDR3-1866 9-10-9 at 1600 (Intel)
Video Cards: MSI GTX 770 Lightning
Video Drivers: NVIDIA Release 349.56 Beta, AMD Catalyst 15.200 Beta
OS: Windows 10 Technical Preview 2 (Build 9926)


Star Swarm CPU Scaling - Extreme Quality - GeForce GTX 770


Star Swarm CPU Scaling - Mid Quality - GeForce GTX 770

Star Swarm CPU Scaling - Low Quality - GeForce GTX 770

To get right down to business then, are AMD’s APUs able to shift the performance bottleneck on to the GPU under DirectX 12? The short answer is yes. Highlighting just how bad the single-threaded performance disparity between Intel and AMD can be under DirectX 11, what is a clear 50%+ lead for the Core i3 with Extreme and Mid qualities becomes a dead heat as all 3 CPUs are able to keep the GPU fully fed. DirectX 12 provides just the kick that the AMD APU setups need to overcome DirectX 11’s CPU submission bottleneck and push it on to the GPU. Consequently at Extreme quality we see a 64% performance increase for the Core i3, but a 170%+ performance increase for the AMD APUs.
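The two Extreme quality percentages can be cross-checked against the DirectX 11 lead quoted above. The following is a minimal sketch, assuming that "dead heat" means every CPU reaches the same GPU-bound frame rate under DirectX 12. The function name and the normalized frame rate of 1.0 are illustrative, and 170% is treated as a lower bound because the figure is reported as "170%+".

```python
def dx11_fps(dx12_fps: float, gain_pct: float) -> float:
    """Back out the DirectX 11 frame rate from the DirectX 12 rate and the % gain."""
    return dx12_fps / (1.0 + gain_pct / 100.0)

# Assumption: all CPUs land on the same GPU-bound frame rate under DX12,
# normalized here to 1.0 (the article describes the result as a dead heat).
DX12_FPS = 1.0

i3_dx11 = dx11_fps(DX12_FPS, 64.0)    # Core i3: +64% at Extreme quality
apu_dx11 = dx11_fps(DX12_FPS, 170.0)  # AMD APUs: +170% (a lower bound)

# Implied DirectX 11 lead of the Core i3 over the APUs, in percent.
implied_lead_pct = (i3_dx11 / apu_dx11 - 1.0) * 100.0
print(f"Implied DX11 lead for the Core i3: {implied_lead_pct:.0f}%")
```

Under that assumption the implied DirectX 11 lead works out to roughly 65%, which is consistent with the "50%+" lead described for the Core i3.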

The one exception to this is Low quality mode, where the Core i3 retains its lead. Though initially unexpected, examining the batch count differences between Low and Mid qualities gives us a solid explanation as to what’s going on: Low pushes relatively few batches. With Extreme quality pushing average batch counts of 90K and Mid pushing 55K, average batch counts under Low are only 20K. At this relatively low batch count the benefits of DirectX 12 are still present but diminished: the CPU is no longer choking on batch submission, so the bottleneck shifts elsewhere (likely the simulation itself).

Star Swarm CPU Batch Submission Time - Extreme - GeForce GTX 770

Meanwhile batch submission times are consistent between all 3 CPUs, with everyone dropping down from 30ms+ to around 6ms. The fact that AMD no longer lags Intel in batch submission times at this point is very important for AMD, as it means they’re not struggling with individual thread performance nearly as much under DirectX 12 as they were under DirectX 11.

Star Swarm GPU Scaling - Mid Quality

Star Swarm GPU Scaling - Low Quality

Finally, taking a look at how performance scales with our GPUs, the results are unsurprising but nonetheless positive for AMD. Aside from the GTX 770 – which has the most GPU headroom to spare in the first place – both AMD APUs still see significant performance gains from DirectX 12 despite running into a very quick GPU bottleneck. This simple API switch is still enough to get another 44% out of the A10-7800 and 25% out of the A8-7600. So although DirectX 12 is not going to bring the same kind of massive performance improvements to iGPUs that we’ve seen with dGPUs, in extreme cases such as this it can still be highly beneficial. And this still comes without some of the potential fringe benefits of the API, such as shifting the TDP balance from CPU to GPU in TDP-constrained mobile devices.

Looking at the overall picture, just as with our initial article it’s important not to read too much into these results right now. Star Swarm is first and foremost a best case scenario and demonstration for the batch submission benefits of DirectX 12. And though games will still benefit from DirectX 12, they are unlikely to benefit quite as greatly as they do here, thanks in part to the much greater share of non-rendering tasks a CPU would be burdened with in a real game (simulation, AI, audio, etc.).
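The reasoning here is essentially Amdahl's law: only the batch submission portion of the CPU's frame time gets faster. A small sketch follows, assuming a submission speedup of about 5x (from the 30ms+ to roughly 6ms figures) and using made-up shares of CPU time spent on submission. The shares and the function name are hypothetical, not measurements.

```python
def overall_speedup(submission_share: float, submission_speedup: float) -> float:
    """Amdahl's law: total CPU-side speedup when only one share of the work speeds up."""
    return 1.0 / ((1.0 - submission_share) + submission_share / submission_speedup)

# Approximate speedup of batch submission itself: ~30 ms down to ~6 ms.
SUBMISSION_SPEEDUP = 30.0 / 6.0

# Hypothetical shares of CPU frame time spent on batch submission.
# A Star Swarm-like workload sits near the top; a real game would be lower.
for share in (0.9, 0.5, 0.2):
    gain = overall_speedup(share, SUBMISSION_SPEEDUP)
    print(f"submission share {share:.0%}: overall CPU-side speedup {gain:.2f}x")
```

The smaller the share of CPU time that goes to submission, the smaller the overall gain, which is the point being made about real games carrying simulation, AI, and audio work.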

But with that in mind, our results from bottlenecking AMD’s APUs point to a clear conclusion. Thanks to DirectX 12’s greatly improved threading capabilities, the new API can greatly close the gap between Intel and AMD CPUs. At least so long as you’re bottlenecking at batch submission.


  • D. Lister - Saturday, February 14, 2015

    Right, the problem is, this wasn't really a head-to-head comparison between AMD and Intel CPUs. The title is: "Star Swarm, DirectX 12 AMD APU Performance Preview". The purpose of putting an Intel CPU with an Nvidia dGPU was not to make AMD look bad, but to test the new API with a completely different setup to see how its benefits scale across various hardware platforms.
  • FlushedBubblyJock - Sunday, February 15, 2015

    Yes, where is the midrange AMD card ? It's not working is my answer. That we can expect, so the reviewers are excused.
  • akamateau - Monday, February 23, 2015

    The mid range AMD card isn't necessary. That is the whole point of Mantle and Dx12. GCN and HSA are working as designed. Now when AMD releases high bandwidth memory in APU cache the APU's will really fly. nVidia is at least a year away from HBM and in Intel silicon it's not even a glimmer on the horizon.

    i3-4330 can ONLY be competitive with GeForce 770. Take away 770 and i3-4330 becomes a joke.
  • akamateau - Monday, February 23, 2015

    Actually AMD A10-7800 APU is LIGHT YEARS faster than i3-4330. Intel's i3 is only competitive when the GeForce 770 is used. Take away the nVidia GeForce 770 and the Intel i3-4330 alone would be a joke.

    Dx12 is basically Mantle and Intel HD IGP can't compete.
  • akamateau - Monday, February 23, 2015

    It's not about reducing CPU overhead, it's about allowing the IGP to run at full efficiency, especially running gaming software with tens of thousands of draw calls.

    DirectX 11 was crippled intentionally to keep Intel HD IGP competitive. Mantle changed that game.

    Now gaming studios can develop games of truly epic proportions and inexpensive hardware can run it!
  • akamateau - Monday, March 6, 2017

    If Anand ran the same DX12 benchmarks using RX480 the results would be quite different.

    What Anand did not tell you was they disabled Async Compute. This of course results in favorable Intel bench scores as no GTX board runs well with it enabled.

    Why would Anand omit such critical data and run a bench test without using the best AMD dGPU AIB available?

    In fact DX12 Explicit Multi-Adapter would show that 2 RX 480 boards crush GTX 1080 for far less money.

    Another fact the online media will twist themselves into knots trying to discredit.

    EMA is NOT Crossfire or SLI. It is BETTER.
  • akamateau - Monday, March 6, 2017

    It's too bad that Anand LIED.

    Anand disabled Asynchronous Compute which is NOT supported by NVidia.

    AMD GPUs can accept non-serial data streams from ALL CPU cores as AMD GPUs manage that data. This is a massive performance multiplier for DX12 3D game engines and does make a huge difference while running Star Swarm.

    DX12 also supports Explicit Multi-Adapter which would scale 2x RX480 and allow performance exceeding GTX 1080.

    "Async Compute Praised by Several Devs; Was Key to Hitting Performance Target in DOOM on Consoles"

    The consoles both run 8 core Jaguar APU's. And they fire on "all 8 cylinders".

    Both XBOX DX11.X and PS4 GNM and GNMX use extensions that support Asynchronous Compute.

    While this is a GPU feature, this feature allows the CPU to process and send data to the shader pipelines in a far faster rate. This reduces CPU bottleneck.

    NVidia can do this and in fact unless disabled, ALL GTX boards run DX12 3d gaming engines SLOWER.

    Did Anand tell the readers that they were disabling this MAJOR PERFORMANCE ENHANCING FEATURE?
  • Ian Cutress - Friday, February 13, 2015

    It's an i3-4330, forgot to put it in. :)
  • FlushedBubblyJock - Sunday, February 15, 2015

    I'm guessing one of AMD's main new high end cards doesn't work yet with DX12 properly (testing hassles), so that's why we have an nVidia 770 only in comparison.
  • akamateau - Tuesday, February 24, 2015

    You missed the entire point of the benchtest.

    Intel i3-4330 with Intel HD IGP is ONLY competitive with an AMD A10-7800 or A8-6800 APU if a good mid-range nVidia AIB GPU is added.

    AMD A10-7800 = Intel i3-4330 + nVidia Geforce 770.

    That is staggering.

    DirectX 12 is the best friend that AMD ever had.

    I would really like to see ALL Intel HD IGP benched against A10-7800. I bet even i5 and i7 get trashed by AMD.
