Star Swarm, DirectX 12 AMD APU Performance Preview
by Ryan Smith & Ian Cutress on February 13, 2015 10:00 AM EST
Posted in: GPUs, AMD, Microsoft, APUs, DirectX 12
After several requests and a week’s break from our initial DirectX 12 article, we’re back again with an investigation into Star Swarm DirectX 12 performance scaling on AMD APUs. As our initial article was run on various Intel CPU configurations, this time we’re going to take a look at how performance scales on AMD’s Kaveri APUs, including whether DX12 is much help for the iGPU, and whether it can help close the single-threaded performance gap between Kaveri and Intel’s Core i3 family.
To keep things simple, this time we’re running everything on either the iGPU or a GeForce GTX 770. Last week we saw how quickly the GPU becomes the bottleneck under Star Swarm when using the DirectX 12 rendering path, and how difficult it is to shift that back to the CPU. And as a reminder, this is an early driver on an early OS running an early DirectX 12 application, so everything here is subject to change.
CPU: | AMD A10-7800; AMD A8-7600; Intel i3-4330 |
Motherboard: | GIGABYTE F2A88X-UP4 for AMD; ASUS Maximus VII Impact for Intel |
Power Supply: | Rosewill Silent Night 500W Platinum |
Hard Disk: | OCZ Vertex 3 256GB OS SSD |
Memory: | G.Skill 2x4GB DDR3-2133 9-11-10 for AMD; G.Skill 2x4GB DDR3-1866 9-10-9 at 1600 for Intel |
Video Cards: | MSI GTX 770 Lightning; AMD APU iGPU |
Video Drivers: | NVIDIA Release 349.56 Beta; AMD Catalyst 15.200 Beta |
OS: | Windows 10 Technical Preview 2 (Build 9926) |
To get right down to business then, are AMD’s APUs able to shift the performance bottleneck on to the GPU under DirectX 12? The short answer is yes. Highlighting just how bad the single-threaded performance disparity between Intel and AMD can be under DirectX 11, what is a clear 50%+ lead for the Core i3 with Extreme and Mid qualities becomes a dead heat as all 3 CPUs are able to keep the GPU fully fed. DirectX 12 provides just the kick that the AMD APU setups need to overcome DirectX 11’s CPU submission bottleneck and push it on to the GPU. Consequently at Extreme quality we see a 64% performance increase for the Core i3, but a 170%+ performance increase for the AMD APUs.
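As a rough consistency check on those percentages, the arithmetic can be sketched as below. This is only an illustration: the quoted gains are treated as exact even though the text gives them as rounded figures ("64%", "170%+"), the "dead heat" is taken to mean identical frame rates, and the function name is ours rather than anything from the benchmark.

```python
def implied_dx11_lead(gain_a: float, gain_b: float) -> float:
    """Return the DX11 lead of A over B implied by both tying under DX12.

    gain_a and gain_b are fractional DX11 -> DX12 gains (0.64 means +64%).
    If A leads B by `lead` under DX11 and the two tie under DX12, then
    (1 + lead) * (1 + gain_a) == (1 + gain_b).
    """
    return (1.0 + gain_b) / (1.0 + gain_a) - 1.0


# Figures quoted in the text, treated as exact for illustration only.
core_i3_gain = 0.64  # "+64%" for the Core i3 at Extreme quality
apu_gain = 1.70      # "170%+" for the AMD APUs at Extreme quality

lead = implied_dx11_lead(core_i3_gain, apu_gain)
print(f"Implied DX11 lead for the Core i3: {lead:.0%}")
```

Under those assumptions the implied DirectX 11 lead comes out at roughly 65%, which is consistent with the "50%+" lead described above.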
The one exception to this is Low quality mode, where the Core i3 retains its lead. Though initially unexpected, examining the batch count differences between Low and Mid qualities gives us a solid explanation as to what’s going on: Low pushes relatively few batches. With Extreme quality pushing average batch counts of 90K and Mid pushing 55K, average batch counts under Low are only 20K. With this relatively low batch count the benefits of DirectX 12 are still present but diminished, as the CPU is no longer choking on batch submission and the bottleneck shifts elsewhere (likely the simulation itself).
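The way a lower batch count moves the bottleneck can be illustrated with a toy model. The sketch below assumes the per-frame stages (batch submission, simulation, GPU rendering) overlap perfectly, so the slowest one sets the frame time. Only the batch counts (90K, 55K, 20K) come from the text; every timing figure, and the `FrameModel` class itself, is hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class FrameModel:
    """Toy per-frame cost model; every cost below is a hypothetical input."""
    batches: int                # average batches per frame
    submit_us_per_batch: float  # CPU microseconds to submit one batch
    simulation_ms: float        # CPU milliseconds spent on simulation
    gpu_ms: float               # GPU milliseconds to render the frame

    def stage_times_ms(self) -> dict:
        """Return the time each stage needs for one frame, in milliseconds."""
        return {
            "batch submission": self.batches * self.submit_us_per_batch / 1000.0,
            "simulation": self.simulation_ms,
            "GPU": self.gpu_ms,
        }

    def bottleneck(self) -> str:
        """Name the slowest stage, assuming the stages overlap perfectly."""
        times = self.stage_times_ms()
        return max(times, key=times.get)


# Batch counts come from the article; all timing figures are made up.
BATCHES = {"Extreme": 90_000, "Mid": 55_000, "Low": 20_000}
GPU_MS = {"Extreme": 16.0, "Mid": 12.0, "Low": 4.0}  # hypothetical
SUBMIT_US = {"DX11": 0.35, "DX12": 0.07}             # hypothetical
SIMULATION_MS = 5.0                                  # hypothetical

for quality, batches in BATCHES.items():
    for api, cost in SUBMIT_US.items():
        model = FrameModel(batches, cost, SIMULATION_MS, GPU_MS[quality])
        print(f"{quality:8s}{api}: limited by {model.bottleneck()}")
```

With these made-up inputs, the cheaper DirectX 12 submission cost hands the bottleneck to the GPU at Extreme and Mid, while at Low the small batch count leaves simulation as the slowest stage. Real frames do not pipeline this cleanly, so the model shows the direction of the effect rather than its size.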
Meanwhile batch submission times are consistent between all 3 CPUs, with everyone dropping down from 30ms+ to around 6ms. The fact that AMD no longer lags Intel in batch submission times at this point is very important for AMD, as it means they’re not struggling with individual thread performance nearly as much under DirectX 12 as they were under DirectX 11.
Finally, taking a look at how performance scales with our GPUs, the results are unsurprising but nonetheless positive for AMD. Setting aside the GTX 770, which has the most GPU headroom to spare in the first place, both AMD APUs still see significant performance gains from DirectX 12 on their iGPUs despite running into a very quick GPU bottleneck. This simple API switch is still enough to get another 44% out of the A10-7800 and 25% out of the A8-7600. So although DirectX 12 is not going to bring the same kind of massive performance improvements to iGPUs that we’ve seen with dGPUs, in extreme cases such as this it can still be highly beneficial. And this still comes without some of the potential fringe benefits of the API, such as shifting the TDP balance from CPU to GPU in TDP-constrained mobile devices.
Looking at the overall picture, just as with our initial article it’s important not to read too much into these results right now. Star Swarm is first and foremost a best-case scenario and demonstration for the batch submission benefits of DirectX 12. And though games will still benefit from DirectX 12, they are unlikely to benefit quite as much as they do here, thanks in part to the much greater share of non-rendering tasks a CPU would be burdened with in a real game (simulation, AI, audio, etc.).
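Amdahl's law gives a rough feel for why a larger share of non-rendering work shrinks the benefit. The sketch below assumes a CPU-bound frame in which only batch submission gets faster, and takes the 5x factor from the "30ms+ to around 6ms" drop in submission time reported earlier. The submission shares (90%, 50%, 20%) are hypothetical, not measurements from Star Swarm or any game.

```python
def overall_speedup(submission_share: float, submission_speedup: float) -> float:
    """Amdahl's law: overall speedup when only one part of the work gets faster.

    submission_share   -- fraction of CPU frame time spent submitting batches
    submission_speedup -- how many times faster that part becomes
    """
    if not 0.0 <= submission_share <= 1.0:
        raise ValueError("submission_share must be between 0 and 1")
    if submission_speedup <= 0.0:
        raise ValueError("submission_speedup must be positive")
    untouched = 1.0 - submission_share
    return 1.0 / (untouched + submission_share / submission_speedup)


# 30 ms -> 6 ms is the submission-time drop quoted in the article.
SUBMISSION_SPEEDUP = 30.0 / 6.0

# Hypothetical shares of CPU frame time spent on batch submission.
for share in (0.9, 0.5, 0.2):
    gain = overall_speedup(share, SUBMISSION_SPEEDUP)
    print(f"submission share {share:.0%}: overall speedup {gain:.2f}x")
```

The more of the frame that goes to simulation, AI and audio, the smaller the submission share and the closer the overall speedup falls toward 1x, even though submission itself is still five times faster.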
But with that in mind, our results from bottlenecking AMD’s APUs point to a clear conclusion. Thanks to DirectX 12’s greatly improved threading capabilities, the new API can go a long way towards closing the gap between Intel and AMD CPUs, at least so long as you’re bottlenecking at batch submission.
152 Comments
FlushedBubblyJock - Sunday, February 15, 2015
Well industry commenters including articles here at Anand said what Andrew said, and noted the console wins for AMD are very low margin items.
What is AMD going to "make it's money on" according to you for the 3 or 4 years the consoles lifespans run ? They get one year of oomph, then the rest is tiny residual sales - is that "a good profit plan" ?
The pro AMD'er above linked Forbes and it's talking about 2013 sales...ROFL - proving my above point, which is of course easily surmised common sense.
One year, minor profit - 3-4 years next to nothing = "a good plan ?"
At least AMD tied in the fan boy set groping with "Mantle and consoles!!!!" for the win... I admit that had many foaming at the mouth with glee for quite some time.
Alexvrb - Sunday, February 15, 2015
I don't really see what your point is. The designs used in the consoles are primarily using technology they already built. Existing CPU cores, existing GCN cores. The R&D investment was minimal... they made money on consoles. Perhaps not much money, but it's still a positive flow of income and the consoles continue to sell in reasonable numbers. Some money is better than no money, and they need all the help they can get.
akamateau - Monday, February 23, 2015
Sony and Microsoft didn't invite nVidia OR Intel to the table.
akamateau - Monday, February 23, 2015
Dx12 = Mantle
game studios LOVE Mantle as they can write EPIC games.
Direct x12 is several orders of magnitude BETTER than Direct x11.
beginner99 - Saturday, February 14, 2015
Problem is once software (games) catch up and actually make use of the better efficiency and get more complex the lower IPC of APUs will again become apparent.
Samus - Sunday, February 15, 2015
Further proof that AMD CPU's just aren't properly optimized in mainstream applications and API's.
nissangtr786 - Wednesday, February 18, 2015
Back in real world intel 32nm sandy bridge from early 2011 non trigate transistors which only happened on 22nm destroy 28nm steamroller in ipc and performance per watt and have far stronger memory controllers and floating point performance. Intel so far ahead if intel were on 32nm and amd were on 20nm intel would still lead performance per watt in cpu design.
ppi - Friday, February 13, 2015
AMD CPUs have slower single-thread performance, but typically more cores. Therefore, they certainly significantly benefit from software optimized for multi-threading. And removing any bottleneck is always a good thing. After all, even i3 saw significant performance improvement.
amilayajr - Saturday, February 14, 2015
Are you blind? i3 was paired with a dedicated graphics you numb nuts. Look at the test set up. AMD APU beats the shit out of i3 intel integrated graphics anytime. lol.
That's why this test is biased. Why compare AMD APU with an i3 + dedicated GPU. I commented about it and it got deleted lol. Thanks. Please do this test again, AMD APU vs Intel integrated graphics. Now we're talking.
Ryan Smith - Saturday, February 14, 2015
"I commented about it and it got deleted lol."
Just to be clear, no comments have been deleted from this article.