Early Direct3D 12 Demos

While DirectX 12 is not scheduled for public release until the Holiday 2015 time period, Microsoft tells us that they have already been working on the API for a number of years now. So although the API is 18-20 months off from its public release, Microsoft already has a very early version up and running on partner NVIDIA’s hardware.

In their demos Microsoft showed off a couple of different programs. The first of these was Futuremark’s 3DMark 2011, which, along with being a solid synthetic benchmark for heavy workloads, can also be easily dissected to find bottlenecks and otherwise monitor the rendering process.


3DMark 2011 CPU Time: Direct3D 11 vs. Direct3D 12

As part of their presentation Microsoft showed off some CPU utilization data comparing the Direct3D 11 and Direct3D 12 versions of 3DMark, which succinctly summarize the CPU performance gains. By moving the benchmark to Direct3D 12, Microsoft and Futuremark were able to significantly reduce the single-threaded bottlenecking, distributing more of the User Mode Driver workload across multiple threads. Meanwhile the use of the Kernel Mode Driver and the CPU time it consumed were eliminated entirely, as was some time within the Windows kernel itself. Finally, the amount of time spent within Direct3D was again reduced.

This benchmark likely leans towards a best case outcome for the use of Direct3D 12, but importantly it shows all of the benefits of a low level API at once: some of the CPU workload has been distributed to other threads, while other aspects of that workload have been eliminated entirely. Yet despite all of this there’s still a clear “master” thread, showcasing the fact that not even a low level graphics API can distribute the workload perfectly among CPU threads. So there will still be a potential single-threaded bottleneck even with Direct3D 12; however, it will be greatly diminished compared to the kinds of bottlenecking that could occur before.
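
To make that threading picture concrete, below is a minimal sketch of what per-thread command list recording could look like. The interface names and overall structure are our own guess at the pattern rather than anything Microsoft has shown or finalized, so treat it purely as an illustration: each worker thread records commands into its own list, and what remains on the “master” thread is a single, comparatively cheap submission.

#include <d3d12.h>
#include <wrl/client.h>
#include <thread>
#include <vector>

using Microsoft::WRL::ComPtr;

// Hypothetical sketch: each worker thread records its own command list with its
// own allocator, so the user-mode driver work runs in parallel with no locking.
ComPtr<ID3D12GraphicsCommandList> RecordChunk(ID3D12Device* device,
                                              ID3D12CommandAllocator* allocator)
{
    ComPtr<ID3D12GraphicsCommandList> list;
    device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                              allocator, nullptr, IID_PPV_ARGS(&list));
    // ... record this thread's share of the frame's draw calls here ...
    list->Close();
    return list;
}

void SubmitFrame(ID3D12Device* device, ID3D12CommandQueue* queue,
                 const std::vector<ComPtr<ID3D12CommandAllocator>>& allocators)
{
    std::vector<ComPtr<ID3D12GraphicsCommandList>> lists(allocators.size());
    std::vector<std::thread> workers;
    for (size_t i = 0; i < allocators.size(); ++i)
        workers.emplace_back([&, i] { lists[i] = RecordChunk(device, allocators[i].Get()); });
    for (auto& w : workers)
        w.join();

    // What is left for the "master" thread: one relatively inexpensive submission.
    std::vector<ID3D12CommandList*> raw;
    for (auto& l : lists)
        raw.push_back(l.Get());
    queue->ExecuteCommandLists(static_cast<UINT>(raw.size()), raw.data());
}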

Moving on, Microsoft’s other demo was a game demo, showcasing Forza Motorsport 5 running on a PC. Developer Turn 10 had ported the game from Direct3D 11.X to Direct3D 12, allowing it to easily be run on PC hardware. Running on a GeForce GTX Titan Black, the demo is capable of sustaining 60fps, Microsoft tells us.

First Thoughts

Wrapping things up, it’s probably best to start with a reminder that this is a beginning rather than an end. While Microsoft has finally publicly announced DirectX 12, what we’ve seen thus far are the parts that they are ready to show off to the public at large, and not what they’re telling developers in private. So although we’ve seen some technical details about the graphics API, it’s very clear that we haven’t seen everything DirectX 12 will bring. Even as far as Direct3D is concerned, it’s a reasonable bet right now that Microsoft has some additional functionality in the works – quite possibly functionality relating to next-generation GPUs – that will be revealed as the API gets closer to completion.

But even without a complete picture, Microsoft has certainly released enough high level and low level information for us to get a good look at what they have planned; and based on what we’re seeing we have every reason to be excited. A lot of this is admittedly a rehash of what we said several months ago when Mantle was unveiled, but then again if Direct3D 12 and Mantle are as similar as some developers are hinting, then there may not be very many differences to discuss.

The potential for improved performance in PC graphics is clear, as are the potential benefits to multi-platform developers. A strong case has been laid out by AMD, and now Microsoft, NVIDIA, and Intel that we need a low level graphics API to better map to the capabilities of today’s GPUs and CPUs. Direct3D 12 in turn will be the common API needed to bring those benefits to everyone at once, as only a common API can do.

It’s important to be exceedingly clear that at least for the first phase the greatest benefits are on the CPU side and not the GPU side – something we’ve already seen in practice with Mantle – so the benefits in GPU-bound scenarios will not be as great at first. In the long run, however, this means changing how the GPU itself is fed work and how that work is processed, so through features such as descriptor heaps the door to improved GPU efficiency is at least left open. And since we are facing an increasing gap between GPU performance and single-threaded CPU performance, the CPU bottlenecking reductions alone can be worth it as developers look to push larger and larger batches.
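
For reference, a descriptor heap in this model is essentially one large pool of resource views that command lists point shaders at, rather than a set of individual bind points managed by the runtime. The short sketch below shows what allocating such a heap could look like; as with the earlier sketch, the interface names, heap size, and flags are our own assumptions for illustration, not finalized DirectX 12 code.

#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

// Hypothetical sketch: allocate one shader-visible heap for constant buffer,
// shader resource, and unordered access views, which command lists then reference.
ComPtr<ID3D12DescriptorHeap> CreateViewHeap(ID3D12Device* device)
{
    D3D12_DESCRIPTOR_HEAP_DESC desc = {};
    desc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
    desc.NumDescriptors = 1024;                             // arbitrary pool size for illustration
    desc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;

    ComPtr<ID3D12DescriptorHeap> heap;
    device->CreateDescriptorHeap(&desc, IID_PPV_ARGS(&heap));
    return heap;
}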

Finally, while I feel it’s a bit too early to say anything definitive, I do want to close with the question of what this means for AMD’s Mantle. For low level PC graphics APIs Mantle will be the only game in town for the next 18-20 months; but after that, then what? If nothing else Mantle is an incredibly important public proving ground for the benefits of low level graphics APIs, so even if Direct3D 12 were to supplant Mantle, Mantle will have done its job. But I’m nowhere close to declaring Mantle’s fate yet, as we only have a handful of details on Direct3D 12 and Mantle itself is still in beta. Does Mantle continue alongside Direct3D 12, an easy target for porting since the two APIs are (apparently) so similar? Does Mantle disappear entirely? Or does AMD take Mantle and make it an open API, setting it up against Direct3D 12 much as OpenGL sits against Direct3D 11 today? I imagine AMD already has a plan in mind, but that will be a discussion for another day…

Comments

  • ninjaquick - Tuesday, March 25, 2014 - link

    And the second look will wind up the same way. Independents who can starve a little longer will probably make sure to release on the Steam Machines, but larger developers, with larger codebases and way more stuff on their minds, can't just jump ship without spending way too much time re-engineering much of their code.
  • martixy - Monday, March 24, 2014 - link

    I see a bright future for the gaming industry...
    On that note, does anyone happen to have a time machine? Or a ship that goes really really fast?
  • Rezurecta - Monday, March 24, 2014 - link

    What piqued my interest is the fact that even MS uses Chrome. ;)

    Seriously though, I posted the same on Overclock.net. Given the expected time to launch, it seems that this was only thought about because of AMD and Mantle. It is a shame that AMD paved the way and Mantle may not end up being a widely supported API.

    Hopefully, Nvidia and Intel accept AMD's open offer to join Mantle and we can put control in the hands of the IHVs instead of the OS maker.
  • errorr - Monday, March 24, 2014 - link

    MS has a lot of work to do if they want to be relevant for mobile. OpenGL ES has been largely optimized for tile-based solutions and takes into account the numerous benefits and flaws of mobile GPUs compared to desktop GPUs. Just about everything in the mobile space is designed to limit memory access, which is slow, narrow, and power intensive. The entire paradigm is completely different. Adreno is also VLIW, which means any low-level API stuff is bound to be very hard to implement. At least it will work on Nvidia chips I guess, but that is still only 10% of the market at best.
  • errorr - Monday, March 24, 2014 - link

    On another note, there was some desire in the PowerVR article to get a better understanding of mobile GPU chips, and the ARM Mali blog at least did the math on publicly available statements and outlined the capabilities of each "shader core".

    Each Mali has 1-16 shader cores (usually 4-8). Each shader core has 1-4 arithmetic pipes (SIMD). Each pipe has 128-bit quad-word registers. The registers can be flexibly accessed as either 2 x FP64, 4 x FP32, 8 x FP16, 2 x int64, 4 x int32, 8 x int16, or 16 x int8. There is a speed penalty for FP64 and a speed bump for FP16, etc., relative to the baseline of 17 FP32 FLOPS per pipeline per clock. So at the maximum of 16 shader cores with 4 pipes per core @ 600MHz, that gives a theoretical compute of roughly 652 FP32 GFLOPS, although the more likely 16-core, 2-pipe design (T760) would do about 326 FP32 GFLOPS.
    There is also a load/store pipeline and a texture pipeline (1 texel per clock, or 1/2 texel with trilinear filtering).

    Wasn't sure where to put this but they have been sharing/implying a bunch of info on their cores publicly for a while.
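
(For anyone who wants to sanity check the GFLOPS figures above, here is a quick back-of-the-envelope calculation using the commenter's numbers, which we have not independently verified.)

#include <cstdio>

int main()
{
    const double flops_per_pipe = 17.0;  // FP32 FLOPS per arithmetic pipe per clock (commenter's figure)
    const double clock_hz = 600e6;       // assumed 600MHz clock

    // 16 shader cores x 4 pipes per core, and the more likely 16 x 2 configuration
    const double max_config    = 16 * 4 * flops_per_pipe * clock_hz;  // ~652.8 GFLOPS
    const double likely_config = 16 * 2 * flops_per_pipe * clock_hz;  // ~326.4 GFLOPS

    std::printf("16 cores x 4 pipes: %.1f GFLOPS\n", max_config / 1e9);
    std::printf("16 cores x 2 pipes: %.1f GFLOPS\n", likely_config / 1e9);
    return 0;
}
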
  • lightyears - Monday, March 24, 2014 - link

    Please give your opinion on the following question:
    What about notebooks with nVidia Optimus? I have a notebook with GTX 680M dedicated graphics combined with Ivy Bridge integrated graphics. So the 680M will support DirectX 12, but the Ivy Bridge integrated graphics probably won't.
    Unfortunately those two are connected by nVidia Optimus technology, a technology that it seems is impossible to turn off. I already looked in my BIOS but I can't get rid of it. Whether I like it or not, I am forced to have Optimus.

    So will Optimus automatically select the 680M for DX12 applications?

    Or won't it work at all? Won't the game even install because my stupid integrated graphics card doesn't support it?

    The last option would be a true shame and I would really be frustrated, given that I spent a lot of money on a high end notebook, and paid a lot to have a heavy (DX12 capable) 680M in it. And I still won't be able to do DX12 although I have a DX12 capable card...
  • Ryan Smith - Tuesday, March 25, 2014 - link

    "What about notebooks with nVidia Optimus?"

    There is no reason that I'm aware of that this shouldn't work on Optimus. The Optimus shim should redirect any flagged game to the dGPU, where it will detect a D3D12 capable device and be able to use that API.
  • ninjaquick - Tuesday, March 25, 2014 - link

    Awesome use of the word shim.
  • lightyears - Tuesday, March 25, 2014 - link

    I looked around the internet and it looks like it won't be a real problem indeed. Back in 2011 the same situation existed with DX11. Some Optimus notebooks had a Sandy Bridge CPU (DX10.1 capable) and a GTX 555 (DX11 capable). For some people Optimus didn't automatically detect the DX11 capable device and they had some problems, but after some changes in the settings they managed to get DX11 going with the GTX 555 on their Optimus notebooks, although the Sandy Bridge was not DX11 capable.
    So I suppose Optimus also won't be a problem this time with DX12. Good news.
    Although I truly hate Optimus. It already prevented me from using stereoscopic 3D on a supported 3DTV.
  • ericore - Monday, March 24, 2014 - link

    "But why are we seeing so much interest in low level graphics programming on the PC? The short answer is performance, and more specifically what can be gained from returning to it."

    That's absolute BS.
    The reason is threefold: 1. For the Xbox One; 2. To prevent a surge of Linux gaming; 3. To fulfill an alliance/pact with Intel and Nvidia.
