Closing Thoughts

Wrapping things up, Futuremark’s latest benchmark certainly gives us a new view on DirectX 12, and of course another data point in looking at the performance of the forthcoming API.

Since DirectX 12 was announced last year (and really, since Mantle was announced in 2013), the initial focus on low-level APIs has been on draw call throughput, and for good reason. The current high-level API paradigm has significant CPU overhead and at the same time fails to scale well with multiple CPU cores, leading to a sort of worst-case scenario for trying to push draw calls. At the same time console developers have long enjoyed lower-level access and the accompanying improvement in draw calls, an advantage that becomes a problem for the PC in an age of so many multiplatform titles.

DirectX 12 then will be a radical overhaul to how GPU programming works, but at its most basic level it’s a fix for the draw call problem. And as we’ve seen in Star Swarm and now the 3DMark API Overhead Feature Test, the results are nothing short of dramatic. With the low-level API offering a 10x-20x increase in draw call throughput, any sort of draw call problem the PC was facing with high-level APIs is thoroughly put to rest. With the ability to push upwards of 20 million draw calls per second, PC developers should finally be able to break away from doing tricks to minimize draw calls in the name of performance and focus on other aspects of game design.
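To put the 20 million draw calls per second figure in per-frame terms, here is a minimal back-of-the-envelope sketch. The function name and the frame rates chosen are illustrative assumptions; the high-level API rates are simply the 20 million figure divided by the 10x and 20x gains quoted above, not separately measured results.

```python
def draw_calls_per_frame(calls_per_second: float, fps: float) -> float:
    """Return the draw call budget available for each frame.

    Hypothetical helper for illustration only: it divides a sustained
    draw call rate by the target frame rate.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return calls_per_second / fps


if __name__ == "__main__":
    low_level_rate = 20_000_000  # "upwards of 20 million" from the article

    # Assumed high-level API rates, derived from the quoted 10x-20x gain.
    rates = {
        "high-level API (1/20 of low-level)": low_level_rate / 20,
        "high-level API (1/10 of low-level)": low_level_rate / 10,
        "low-level API": low_level_rate,
    }

    for label, rate in rates.items():
        for fps in (30, 60):
            budget = draw_calls_per_frame(rate, fps)
            print(f"{label:36s} @ {fps} fps: {budget:>10,.0f} calls/frame")
```

Under these assumptions the per-frame budget at 60 fps rises from tens of thousands of draw calls to several hundred thousand. As a peak figure from a synthetic test, it is an upper bound rather than what a real game would sustain.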


GDC 2014 - DirectX 12 Unveiled: 3DMark 2011 CPU Time: Direct3D 11 vs. Direct3D 12

Of course at the same time we need to be clear that 3DMark’s API Overhead Feature Test is a synthetic test – and is so by design – so the performance we’re looking at today is just one small slice of the overall performance picture. Real world game performance gains will undoubtedly be much smaller, especially if games aren’t using a large number of draw calls in the first place. But the important part is that it sets the stage for future games to use a much larger number of draw calls and/or spend less time trying to minimize the number of calls. And of course we can’t ignore the multi-threading benefits from DirectX 12, as while multi-threaded games are relatively old now, the inability to scale up throughput with additional cores has always been an issue that DirectX 12 will help to solve.

Ultimately we’re looking at just one test, and a synthetic test at that, but if we as gamers want to better understand why game developers such as Johan Andersson have been pushing so hard for low-level APIs, the results of this benchmark are exactly why. From discrete to integrated, top to bottom, every performance level of PC stands to gain from DirectX 12, and for virtually all of them the draw call gains are going to be immense. DirectX 12 won’t change the world, but it will change the face of game programming for the better, and it will be very interesting to see just what developers can do with the API starting later this year.

113 Comments

  • tipoo - Friday, March 27, 2015 - link

    4X gains seen here

    http://www.pcworld.com/article/2900814/tested-dire...
  • Ryan Smith - Friday, March 27, 2015 - link

    Sorry, that was an error in that table. We didn't have the 4770R for this article.
  • geekfool - Saturday, March 28, 2015 - link

    hhm pcw says " All of our tests were performed at 1280x720 resolution at Microsoft's recommendation."
    if that's the case with your tests too then its seems that the real test today should be 1080p and a provisional 4k/UHD1 to get a set of future core numbers regardless of MS's wishes...
  • Ryan Smith - Sunday, March 29, 2015 - link

    720p is the internal rendering resolution, and is used to avoid potential ROP bottlenecks (especially at the early stages). This is supposed to be a directed, synthetic benchmark, and the ability to push pixels is not what is intended to be tested.

    That said, the actual performance impact from switching resolutions on most of these GPUs is virtually nil since there's more than enough ROP throughput for all of this.
  • Winterblade - Friday, March 27, 2015 - link

    Very interesting results, and very informative article, the only small caveat I find is that for proper comparison of 2, 4 and 6 cores (seems to be one of the focal points of the article) the clock should be the same for all 3 configurations, it is a bit misleading otherwise. The difference seems to be around 10 - 15% in going from 4 to 6 cores but there is also a 10% difference in clock rate between them.
  • chizow - Friday, March 27, 2015 - link

    Fair point, it almost looks like they are trying to artificially force some contrast in the results there. Biggest issue I have with that is you are more likely to find higher clocked 4-cores in the wild since they tend to overclock better than the TDP and size limited 6-core chips.

    That's the tradeoff any power-user faces there, higher overclock on that 4790K (and soon Broadwell-K) chip or the higher L3 cache and more cores of a 6-core chip with lower OC potential.
  • dragonsqrrl - Friday, March 27, 2015 - link

    I got 1.7M draw calls per second with an i7-970 and GTX480 in DX11, and 2.3M in DX11MT. Pretty much identical to every other Nvidia card benchmarked. Interested to see what kind of draw call gains I get with a 480 once Windows 10 and DX12 come out with finalized drivers.
  • godrilla - Friday, March 27, 2015 - link

    Vulkan seems more attractive for devs though.

    The battle of the APIs incoming.
  • junky77 - Friday, March 27, 2015 - link

    Well, currently, the limiting factor is almost always the GPU, with a powerful GPU, unless we are talking AMD CPUs which are TDP limited in many cases or an i3, and even then the differences are not great

    So, I think that it's mainly a look for the future, allowing higher draws scenes, potentially
  • Mat3 - Friday, March 27, 2015 - link

    Would be interesting to see how the FX-8350 compares to the i7-4960X for this test.
