The Performance Impact of Asynchronous Shading

Finally, let's take a look at Ashes' latest addition to its stable of DX12 headlining features: asynchronous shading/compute. While earlier betas of the game implemented a very limited form of async shading, this latest beta contains a newer, more complex implementation of the technology, inspired in part by Oxide's experiences with multi-GPU. As a result, async shading will potentially have a greater impact on performance than in earlier betas.
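For readers unfamiliar with what asynchronous compute looks like from the application side, the sketch below shows the general idea at the Direct3D 12 API level. This is purely illustrative and is not Oxide's code: the application creates a compute command queue alongside the usual direct (graphics) queue and synchronizes the two with a fence, and it is then up to the GPU and driver whether the two streams of work actually execute concurrently.

```cpp
// Minimal illustrative sketch of D3D12 async compute (not Oxide's implementation).
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
#pragma comment(lib, "d3d12.lib")

using Microsoft::WRL::ComPtr;

int main()
{
    // Create a device on the default adapter (error handling omitted for brevity).
    ComPtr<ID3D12Device> device;
    D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device));

    // The usual direct (graphics) queue that rendering work is submitted to.
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    ComPtr<ID3D12CommandQueue> gfxQueue;
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));

    // A second, compute-only queue. Work submitted here is what "async
    // shading/compute" refers to; it may overlap with graphics work if the
    // hardware and driver support concurrent execution.
    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    ComPtr<ID3D12CommandQueue> computeQueue;
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));

    // A fence orders work across the two queues at the points where they
    // actually depend on each other.
    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));

    // ... record command lists and call ExecuteCommandLists() on each queue ...

    // Example dependency: the graphics queue waits until the compute queue
    // has signaled the fence, i.e. until that batch of compute work is done.
    computeQueue->Signal(fence.Get(), 1);
    gfxQueue->Wait(fence.Get(), 1);

    return 0;
}
```

The important point for the results on this page is that the API calls are identical on every vendor; what differs is whether the GPU can actually overlap the two queues' work, or whether the driver ends up scheduling it back-to-back.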

Update 02/24: NVIDIA sent a note over this afternoon letting us know that asynchronous shading is not enabled in their current drivers, hence the performance we are seeing here. Unfortunately they are not providing an ETA for when this feature will be enabled.

Ashes of the Singularity (Beta) - High Quality - Async Shader Performance

Since async shading is turned on by default in Ashes, what we’re essentially doing here is measuring the penalty for turning it off. Not unlike the DirectX 12 vs. DirectX 11 situation – and possibly even contributing to it – what we find depends heavily on the GPU vendor.
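For context on the charts that follow, the performance gain figures are essentially the percentage change in average framerate between running with async shading on and off. A trivial, purely illustrative helper for that arithmetic (our framing of the calculation, not part of any benchmark tooling) might look like this:

```cpp
#include <cstdio>

// Illustrative only: percent performance gain from enabling async shading,
// assuming a comparison of average framerates with the feature on vs. off.
static double AsyncShadingGainPercent(double fpsAsyncOn, double fpsAsyncOff)
{
    return (fpsAsyncOn / fpsAsyncOff - 1.0) * 100.0;
}

int main()
{
    // Hypothetical numbers, not our benchmark results.
    std::printf("%.1f%%\n", AsyncShadingGainPercent(66.0, 60.0)); // prints 10.0%
    return 0;
}
```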

Ashes of the Singularity (Beta) - High Quality - Async Shading Perf. Gain

All NVIDIA cards suffer a minor regression in performance with async shading turned on. At a maximum of -4% it's really not enough to justify disabling async shading, but at the same time it means that async shading is not providing NVIDIA with any benefit. With RTG cards, on the other hand, it's almost always beneficial, with the benefit increasing alongside the overall performance of the card. In the case of the Fury X this means a 10% gain at 1440p, and though not plotted here, a similar gain at 4K.

These findings do go hand-in-hand with some of the basic performance goals of async shading, primarily that async shading can improve GPU utilization. At 4096 stream processors the Fury X has the most ALUs of any card on these charts, and given its performance in other games, the numbers we see here lend credence to the theory that RTG isn't always able to reach full utilization of those ALUs, particularly on Ashes. In which case async shading could be a big benefit going forward.

As for the NVIDIA cards, that’s a harder read. Is it that NVIDIA already has good ALU utilization? Or is it that their architectures can’t do enough with asynchronous execution to offset the scheduling penalty for using it? Either way, when it comes to Ashes NVIDIA isn’t gaining anything from async shading at this time.

Ashes of the Singularity (Beta) - Extreme Quality - Async Shading Perf. Gain

Meanwhile pushing our fastest GPUs to their limit at Extreme quality only widens the gap. At 4K the Fury X picks up nearly 20% from async shading – though a much smaller 6% at 1440p – while the GTX 980 Ti continues to lose a couple of percent from enabling it. This outcome is somewhat surprising since at 4K we’d already expect the Fury X to be rather taxed, but clearly there’s quite a bit of shader headroom left unused.

Comments

  • Beany2013 - Wednesday, February 24, 2016 - link

    You are aware that Mantle and DX12 are actually different APIs, yeah?
  • zheega - Wednesday, February 24, 2016 - link

    AMD just released new drivers that they say are made for this benchmark. Can we get a quick follow-up to see if their performance improves even more??

    http://support.amd.com/en-us/kb-articles/Pages/AMD...

    AMD has partnered with Stardock in association with Oxide to bring gamers Ashes of the Singularity – Benchmark 2.0, the first benchmark to release with DirectX® 12 benchmarking capabilities such as Asynchronous Compute, multi-GPU and multi-threaded command buffer re-ordering. Radeon Software Crimson Edition 16.2 is optimized to support this exciting new release.
  • revanchrist - Wednesday, February 24, 2016 - link

    See? Every time a pro-AMD game gets tested, there'll be plenty of butt-hurt fanboy comments. And I guess everyone knows why: when you've bought something, you'll always want to justify your purchase, and you know who's got the lion's share of the dGPU market now. Guess nowadays people are just too sensitive or have hearts of glass, which makes them judge things ever so subjectively and personally.
  • Socius - Wednesday, February 24, 2016 - link

    For anyone who missed it:

    "Update 02/24: NVIDIA sent a note over this afternoon letting us know that asynchornous shading is not enabled in their current drivers, hence the performance we are seeing here. Unfortunately they are not providing an ETA for when this feature will be enabled."
  • ToTTenTranz - Wednesday, February 24, 2016 - link

    "Unfortunately they are not providing an ETA for when this feature will be enabled."

    If ever...
  • andrewaggb - Wednesday, February 24, 2016 - link

    Makes sense that it would be slightly slower. Also makes the benchmarks less meaningful.
  • Ext3h - Wednesday, February 24, 2016 - link

    "not enabled" is a strange and misleading wording, since it obviously is both available and working correctly according to the specification.

    Should be read as "not being made full use of", as it simply lacks any clever way of profiting from asynchronous compute in hardware.
  • barn25 - Thursday, February 25, 2016 - link

    If you google around you will find out NVIDIA does not have asynchronous shading on its DX"12" cards. This was actually first found out in WDDM 1.3 back in Windows 8.1, when they would not support the optional features which AMD does.
  • Ext3h - Thursday, February 25, 2016 - link

    I know that the wrong terminology kept being used for years now, especially driven by major tech review websites like this one. But that's still not making it any less wrong.

    The API is fully functional. So the driver does support it. Whether it does so efficiently is an entirely different matter; you don't NEED hardware "support" to provide that feature. Hardware support is only required to provide parallel execution, as opposed to the default sequential fallback. The latter is perfectly within the bounds of the specification, and counts as fully functional. It's just not providing any additional benefits, but it's neither broken nor deactivated.
  • barn25 - Thursday, February 25, 2016 - link

    Don't try to change it. I am referring to HW async compute, which AMD supports and NVidia does not. Using a shim will impact performance even more.
