Throughout this year we’ve looked at several previews and technical demos of DirectX 12 technologies, both before and after the launch of Windows 10 in July. As the most significant update to the DirectX API since DirectX 10 in 2007, the release of DirectX 12 marks the beginning of a major overhaul of how developers will program for modern GPUs. So to say there’s quite a bit of interest in it – both from consumers and developers – would be an understatement.

In putting together the DirectX 12 specification, Microsoft and their partners planned for the long haul, present and future. DirectX 12 has a number of immediately useful features that have developers grinning from ear to ear, but at the same time, given that another transition like this will not happen for many years (if at all), DirectX 12 and the update to the underlying display driver foundation were meant to be very forward looking and to pack in as many advanced features as would be reasonable. Consequently the first retail games, such as this quarter’s Fable Legends, will just scratch the surface of what the API can do, as developers are still in the process of understanding the API and writing new engines around it, and GPU driver developers are similarly still hammering out their code and improving their DirectX 12 functionality.

Of everything that has been written about DirectX 12 so far, the bulk of the focus has been on the immediate benefits of the low-level nature of the API, and this is for a good reason. The greatly reduced driver overhead and the better ability to spread out work submission over multiple CPU cores stand to be extremely useful for game developers, especially as the CPU submission bottleneck is among the greatest bottlenecks facing GPUs today. Even then, taking full advantage of this functionality will take some time, as developers have become accustomed to minimizing the use of draw calls to work around the bottleneck, so it is safe to say that we are at the start of what is going to be a long transition for gamers and game developers.

A little farther out on the horizon than the driver overhead improvements are DirectX 12’s improvements to multi-GPU functionality. Multi-GPU rendering has traditionally been the domain of drivers (developers have little control under DirectX 11), but DirectX 12’s explicit controls extend to multi-GPU rendering as well. It is now up to developers to decide if they want to use multiple GPUs and how they want to use them. And with explicit control over the GPUs, along with the deep understanding that only a game’s developer can have of the layout of their rendering pipeline, DirectX 12 gives developers the freedom to do things that could never be done before.

That brings us to today’s article, an initial look into the multi-GPU capabilities of DirectX 12. Developer Oxide Games, who is responsible for the popular Star Swarm demo we looked at earlier this year, has taken the underlying Nitrous engine and is ramping up for the 2016 release of the first retail game using the engine, Ashes of the Singularity. As part of their ongoing efforts to use Nitrous as a testbed for DirectX 12 technologies, and in conjunction with last week’s Steam Early Access release of the game, Oxide has sent over a very special build of Ashes.

What makes this build so special is that it’s the first game demo for DirectX 12’s multi-GPU Explicit Multi-Adapter (AKA Multi Display Adapter) functionality. We’ll go into more detail on Explicit Multi-Adapter shortly, but in short it is one of DirectX 12’s two multi-GPU modes, and thanks to the explicit controls offered by the API, it allows for disparate GPUs to be paired up. More than SLI and more than Crossfire, EMA allows for dissimilar GPUs to be used in conjunction with each other, and productively at that.

So in an article only fitting for the week of Halloween, today we will be combining NVIDIA GeForce and AMD Radeon cards into a single system – a single rendering setup – to see how well Oxide’s early implementation of the technology works. It may be unnatural and perhaps even a bit unholy, but there’s something undeniably awesome about watching a single game rendered by two dissimilar cards in this fashion.


  • jimjamjamie - Tuesday, October 27, 2015 - link

    [pizza-making intensifies]
  • geniekid - Monday, October 26, 2015 - link

    On one hand the idea of unlinked EMA is awesome. On the other hand, I have to believe 95% of developers will shy away from implementing anything other than AFR in their game due to the sheer amount of effort the complexity would add to their QA/debugging process. If Epic manages to pull off their post-processing offloading I would be very impressed.
  • DanNeely - Monday, October 26, 2015 - link

    I'd guess it'd be the other way around. SLI/XFire AFR is complicated enough that it's normally only done for big budget AAA games. Other than replacing two vendor APIs with a single OS API DX12 doesn't seem to offer a whole lot of help there; so I don't expect to see a lot change.

    Handing off the tail end of every frame seems simpler; especially since the frame pacing difficulties that make AFR so hard and require a large amount of per game work won't be a factor. This sounds like something that could be baked into the engines themselves, and that shouldn't require a lot of extra work on the game devs part. Even if it ends up only being a modest gain for those of us with mid/high end GPUs; it seems like it could end up being an almost free gift.
  • nightbringer57 - Monday, October 26, 2015 - link

    That's only half relevant.
    I wonder how much can be implemented at the engine level. This kind of thing may be at least partially transparent to devs if, say, Unreal Engine and Unity get compatibility for it... I don't know how much it can do, though.
  • andrewaggb - Monday, October 26, 2015 - link

    Agreed, I would hope that if the Unreal Engine, Unity, Frostbite etc support it that maybe 50% or more of new games will support it.

    We'll have to see though. The idea of having both an AMD and Nvidia card in the same machine is both appealing and terrifying. Occasionally games work better on one than the other, so you might avoid some pain sometimes, but I'm sure you'd get a whole new set of problems sometimes as well.

    I think making use of the iGPU and discrete cards is probably the better scenario to optimize for. (Like Epic is apparently doing)
  • Gigaplex - Monday, October 26, 2015 - link

    Problems such as NVIDIA intentionally disabling PhysX when an AMD GPU is detected in the system, even if it's not actively being used.
  • Friendly0Fire - Monday, October 26, 2015 - link

    It really depends on a lot of factors I think, namely how complex the API ends up being.

    For instance, I could really see shadow rendering being offloaded to one GPU. There's minimal crosstalk between the two GPUs, the shadow renderer only needs geometry and camera information (quick to transfer/update) and only outputs a single frame buffer (also very quick to transfer), yet the process of shadow rendering is slow and complex and requires extremely high bandwidth internally, so it'd be a great candidate for splitting off.

    Then you can also split off the post-processing to the iGPU and you've suddenly shaved maybe 6-8ms off your frame time.
  • Oogle - Monday, October 26, 2015 - link

    Yikes. Just one more exponential factor to add when doing benchmarks. More choice is great for us consumers. But reviews and comparisons are going to start looking more complicated. I'll be interested to see how guys will make recommendations when it comes to multi-gpu setups.
  • tipoo - Monday, October 26, 2015 - link

    Wow, seems like a bigger boost than I had anticipated. Will be nice to see all that unused silicon (in dGPU environments) getting used.
  • gamerk2 - Monday, October 26, 2015 - link

    As this test is a smaller number of combinations it’s not clear where the bottlenecks are, but it’s none the less very interesting how we get such widely different results depending on which card is in the lead. In the GTX 680 + HD 7970 setup, either the GTX 680 is a bad leader or the HD 7970 is a bad follower, and this leads to this setup spinning its proverbial wheels. Otherwise letting the HD 7970 lead and GTX 680 follow sees a bigger performance gain than we would have expected for a moderately unbalanced setup with a pair of cards that were never known for their efficient PCIe data transfers. So long as you let the HD 7970 lead, at least in this case you could absolutely get away with a mixed GPU pairing of older GPUs.

    Drivers. Pretty much that simple. Odds are, the NVIDIA drivers are treating the HD 7970 the same way it's treating the 680 GTX, which will result in performance problems. AMD and NVIDIA use very different GPU architectures, and you're seeing it here. NVIDIA is probably attempting to utilize the 7970 in a way it just can't handle.

    I'd be very interested to see something like 680/Titan, or some form of lower/newer setup, which is what most people would actually use this for (GPU upgrade).
