GPU Scaling

Switching gears, let’s take a look at performance from a GPU standpoint, including how well Star Swarm performance scales with more powerful GPUs now that we have eliminated the CPU bottleneck. Until now Star Swarm has never been GPU bottlenecked on high-end NVIDIA cards, so this is our first chance to see just how much faster Star Swarm can get before it runs into the limits of the GPU itself.

Star Swarm GPU Scaling - Extreme Quality (4 Cores)

As it stands, with the CPU bottleneck swapped out for a GPU bottleneck, Star Swarm currently favors NVIDIA GPUs. Even accounting for the usual performance differences between these cards, NVIDIA comes out well ahead here, with the GTX 980 beating the R9 290X by over 50% and the GTX 680 some 25% ahead of the R9 285, both margins well beyond their average lead in real-world games. With virtually every aspect of this test still under development (OS, drivers, and Star Swarm itself), we would advise against reading too much into this right now, but it will be interesting to see whether this trend holds with the final release of DirectX 12.

Meanwhile it’s interesting to note that largely due to their poor DirectX 11 performance in this benchmark, AMD sees the greatest gains from DirectX 12 on a relative basis and comes close to seeing the greatest gains on an absolute basis as well. The GTX 980’s performance improves by 150% and 40.1fps when switching APIs; the R9 290X improves by 416% and 34.6fps. As for AMD’s Mantle, we’ll get back to that in a bit.
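As a rough cross-check, the relative and absolute gains quoted above are enough to back out the approximate frame rates under each API: if the gain is d fps and that represents an r% improvement, the DirectX 11 baseline is d / (r / 100). The short Python sketch below does this arithmetic using only the four figures from the text; the function name is ours, and because the percentages are rounded, the results are approximations rather than the measured values.

```python
def implied_frame_rates(relative_gain_pct, absolute_gain_fps):
    """Back out approximate DX11 and DX12 frame rates from the two quoted gains."""
    # If DX12 = DX11 + gain and gain = DX11 * pct / 100, then DX11 = gain / (pct / 100).
    dx11 = absolute_gain_fps / (relative_gain_pct / 100.0)
    dx12 = dx11 + absolute_gain_fps
    return dx11, dx12


# (relative gain in %, absolute gain in fps), as quoted in the text above
quoted_gains = {
    "GTX 980": (150.0, 40.1),
    "R9 290X": (416.0, 34.6),
}

for card, (pct, fps) in quoted_gains.items():
    dx11, dx12 = implied_frame_rates(pct, fps)
    print(f"{card}: DX11 ~{dx11:.1f} fps, DX12 ~{dx12:.1f} fps")
```

This works out to roughly 27fps rising to 67fps for the GTX 980, and roughly 8fps rising to 43fps for the R9 290X, which shows why AMD's relative gain is so much larger even though its absolute gain is slightly smaller.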

Star Swarm GPU Scaling - Extreme Quality (2 Cores)

Having already established that even 2 CPU cores are enough to keep Star Swarm fed on anything less than a GTX 980, the results are much the same here for our 2 core configuration. Other than the GTX 980 being CPU limited, the gains from enabling DirectX 12 are consistent with what we saw for the 4 core configuration. In other words, even a relatively weak CPU can benefit from DirectX 12, at least when paired with a strong GPU.

However the GTX 750 Ti result in particular also highlights the fact that until a powerful GPU comes into play, the benefits today from DirectX 12 aren’t nearly as great. Though the GTX 750 Ti does improve in performance by 26%, this is a far cry from the 150% of the GTX 980, or even the gains for the GTX 680. While AMD is terminally CPU limited here, NVIDIA can get just enough out of DirectX 11 that a 2 core configuration can almost feed the GTX 750 Ti. Consequently in the NVIDIA case, a weak CPU paired with a weak GPU does not currently see the same benefits that we get elsewhere. However, DirectX 12 is meant to be forward looking, arriving before it’s too late: as GPU performance gains continue to outstrip CPU performance gains, the benefits even for low-end configurations will continue to increase.
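The reasoning here can be illustrated with a simple toy model in which the delivered frame rate is whichever is lower: the rate at which the CPU can submit frames or the rate at which the GPU can render them. The Python sketch below is only that, a sketch; the function names are ours, and the fps limits are invented purely for illustration and are not measurements from this benchmark.

```python
def frame_rate(cpu_limit_fps, gpu_limit_fps):
    """Toy model: the slower of the two stages sets the delivered frame rate."""
    return min(cpu_limit_fps, gpu_limit_fps)


def api_gain_pct(cpu_limit_old, cpu_limit_new, gpu_limit):
    """Percent improvement from raising only the CPU-side limit (e.g. a leaner API)."""
    before = frame_rate(cpu_limit_old, gpu_limit)
    after = frame_rate(cpu_limit_new, gpu_limit)
    return 100.0 * (after - before) / before


# Hypothetical limits, NOT benchmark data: the old API lets the CPU submit
# 20 fps worth of work, the new API raises that to 200 fps.
CPU_OLD, CPU_NEW = 20.0, 200.0

print(api_gain_pct(CPU_OLD, CPU_NEW, gpu_limit=25.0))  # weak GPU: capped almost immediately
print(api_gain_pct(CPU_OLD, CPU_NEW, gpu_limit=60.0))  # strong GPU: much more headroom
```

In this model the weak GPU gains only 25% because it becomes the bottleneck almost as soon as the CPU limit lifts, while the strong GPU gains 200%. That is the same qualitative pattern as the GTX 750 Ti versus GTX 980 results, though real workloads are not this cleanly separable.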

245 Comments

  • dakishimesan - Friday, February 06, 2015 - link

    Because DirectX 10 and WDDM 2.0 are tied at the hip, and by extension tied to Windows 10, DirectX 12 will only be available under Windows 10.
  • dakishimesan - Friday, February 06, 2015 - link

    PS: great article.
  • FlushedBubblyJock - Sunday, February 15, 2015 - link

    First thoughts: R9 290X dx11=8 frames mantle=46 frames TEST= TOTAL FRAUD

    Although the difference there is what AMD told us mantle would do, only in this gigantic liefest is such hilarity achieved.

    Another big industry lie-test blubbered out to the sheep at large.
  • 0ldman79 - Monday, February 16, 2015 - link

    It looks more like the people that coded that game are not very experienced and have spent far more time optimizing for a future API than for DX11.
  • Christopher1 - Monday, February 16, 2015 - link

    Not necessarily. DX11 no matter how 'optimized' still does not get you as close 'to the metal' as Mantle does. So yes, there can be these kinds of extreme differences in FPS.
  • The_Countess666 - Thursday, February 19, 2015 - link

    they are in fact very experienced. but they chose to do the things that DX11 bottlenecks previously prevented them from doing.
  • 0ldman79 - Saturday, February 21, 2015 - link

    That makes sense, still not quite an apples to apples comparison in that situation, though using previously unavailable features on the new API tends to show the differences.

    The question still remains, will we see similar improvements on the current crop of DX11 games?

    I don't think that will be the case, though I could be wrong.

    Seems the gains are from multithreading, which is part of the DX11 or 11.1 spec.
  • RobATiOyP - Sunday, February 21, 2016 - link

    Of course you won't see such a performance increase, because games have to be designed and tuned to what the platform is capable of. The console APIs have allowed games lower-level access; Mantle, DX12 & Vulkan are about removing a bottleneck caused by the assumptions in the DX11 & OpenGL APIs, which were designed when GPUs were novel items, and much evolution has occurred since. Those doubting the benchmark, please say why a graphics application would not want to do more draw calls per second!
  • Fishymachine - Monday, February 16, 2015 - link

    DX11 can manage up to 10k draw calls, Star Swarm makes 100k. Also Assassin's Creed Unity makes up to 50k in case you wanted a retail game that would skyrocket with a low-level API (there's a spot where even 2 GTX 980s get 17fps)
  • The_Countess666 - Thursday, February 19, 2015 - link

    this engine was specifically written to do all the things that DX11 doesn't allow game developers to do. it was designed to run headlong into every bottleneck that DX11 has.

    it is in fact a great demonstration of the weaknesses of DX11.

    the fact that nvidia gets higher framerates in dx11 than ATI is because they optimized the hell out of this game. that isn't viable (costs too much, far too time consuming) for every game and was purely done by nvidia for marketing, but all it really does is further illustrate the need for a low level API where the burden of optimizations is shifted to the game engine developers where it belongs, not the driver developers.