We’re back once again with the third and likely final part of our evolving series previewing the performance of DirectX 12. After taking an initial look at discrete GPUs from NVIDIA and AMD in part 1, and then looking at AMD’s integrated GPUs in part 2, today we’ll be taking a much requested look at the performance of Intel’s integrated GPUs. Does Intel benefit from DirectX 12 in the same way the dGPUs and AMD’s iGPUs have? And where does Intel’s most powerful Haswell GPU configuration, Iris Pro (GT3e), stack up? Let’s find out.

As our regular readers may recall, when we were initially given early access to WDDM 2.0 drivers and a DirectX 12 version of Star Swarm, it only included drivers for AMD and NVIDIA GPUs. Those drivers in turn only supported Kepler and newer on the NVIDIA side and GCN 1.1 and newer on the AMD side, which is why we haven’t yet been able to look at older AMD or NVIDIA cards, or for that matter any Intel iGPUs. However as of late last week that changed when Microsoft began releasing WDDM 2.0 drivers for all 3 vendors through Windows Update on Windows 10, enabling early DirectX 12 functionality on many supported products.

With Intel WDDM 2.0 drivers now in hand, we’re able to take a look at how Intel’s iGPUs are affected in this early benchmark. Identified as driver version 10.18.15.4098, these drivers enable DirectX 12 functionality on Gen 7.5 (Haswell) and newer GPUs, with Gen 7.5 being the oldest Intel GPU generation that will support DirectX 12.

Today we’ll be looking at all 3 Haswell GPU tiers: GT1, GT2, and GT3e. We also have our AMD A10 and A8 results from earlier this month to use as a point of comparison (though please note that this combination of Mantle + Star Swarm is still non-functional on AMD APUs). With that said, before starting we’d like to once again remind everyone that this is an early driver on an early OS running an early DirectX 12 application, so everything here is subject to change. Furthermore, Star Swarm itself is a very directed benchmark designed primarily to showcase batch counts, so what we see here should not be considered a well-rounded look at the benefits of DirectX 12. At the end of the day this is a test that more closely measures potential than real-world performance.

CPU: AMD A10-7800, AMD A8-7600, Intel Core i3-4330, Intel Core i5-4690, Intel Core i7-4770R, Intel Core i7-4790K
Motherboard: GIGABYTE F2A88X-UP4 (for AMD), ASUS Maximus VII Impact (for Intel LGA-1150), Zotac ZBOX EI750 Plus (for Intel BGA)
Power Supply: Rosewill Silent Night 500W Platinum
Hard Disk: OCZ Vertex 3 256GB OS SSD
Memory: G.Skill 2x4GB DDR3-2133 9-11-10 (for AMD), G.Skill 2x4GB DDR3-1866 9-10-9 at 1600 (for Intel)
Video Cards: AMD APU Integrated, Intel CPU Integrated
Video Drivers: AMD Catalyst 15.200 Beta, Intel 10.18.15.4098
OS: Windows 10 Technical Preview 2 (Build 9926)

Since we’re looking at fully integrated products this time around, we’ll invert our usual order and start with our GPU-centric view first before taking a CPU-centric look.

Star Swarm GPU Scaling - Mid Quality

Star Swarm GPU Scaling - Low Quality

As Star Swarm was originally created to demonstrate performance on discrete GPUs, these integrated GPUs do not perform well. Even at low settings nothing cracks 30fps on DirectX 12. None the less there are a few patterns here that can help us understand what’s going on.

Right off the bat then there are two very apparent patterns, one of which is expected and one of which caught us by surprise. At a high level, both AMD APUs outperform our collection of Intel processors here, and this is to be expected: AMD has invested heavily in iGPU performance across their entire lineup, whereas most Intel desktop SKUs come with the mid-tier GT2 GPU.

However what’s very much not expected is the ranking of the various Intel processors. Despite having all 3 Intel GPU tiers represented here, performance among the Intel GPUs is relatively close, and this includes the Core i7-4770R and its GT3e GPU. GT3e’s performance here immediately raises some red flags (under normal circumstances it substantially outperforms GT2), and we need to tackle this issue first before we can discuss any other aspects of Intel’s performance.

As long-time readers may recall from our look at Intel’s Gen 7.5 GPU architecture, Intel scales up from GT1 through GT3 by duplicating both the EU/texture unit blocks (the subslice) and the ROP/L3 blocks (the slice common). GT3/GT3e has twice as many slices as GT2 and consequently, by most metrics, is twice the GPU that GT2 is, with GT3e’s Crystal Well eDRAM providing an extra bandwidth kick. Immediately then there is an issue, since in none of our benchmarks does the GT3e-equipped 4770R surpass any of the GT2-equipped SKUs.

The explanation, we believe, lies in the one part of an Intel GPU that doesn’t get duplicated in GT3e, which is the front-end, or as Intel calls it the Global Assets. Regardless of which GPU configuration we’re looking at – GT1, GT2, or GT3e – all Gen 7.5 configurations share what’s essentially the same front-end, which means front-end performance doesn’t scale up with the larger GPUs beyond any minor differences in GPU clockspeed.
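
To put the argument in concrete terms, below is a minimal sketch of the scaling model, written in Python purely for illustration. The throughput numbers are made-up placeholders rather than Intel specifications; the only point being modeled is that per-slice shader resources double from GT2 to GT3e while the shared front-end's batch rate stays put.

```python
# A toy model of Gen 7.5 scaling: shader throughput grows with the number of
# slices, but the shared front-end's batch (draw call) rate does not.
# All figures below are illustrative assumptions, not Intel specifications.

FRONTEND_BATCHES_PER_SEC = 500_000   # hypothetical fixed front-end limit
BATCHES_PER_FRAME = 20_000           # roughly Star Swarm's low quality batch count

def fps_ceiling(slices: int, shader_fps_per_slice: float) -> float:
    """Frame rate is capped by whichever limit is reached first."""
    shader_limit = slices * shader_fps_per_slice                    # scales with slices
    frontend_limit = FRONTEND_BATCHES_PER_SEC / BATCHES_PER_FRAME   # does not scale
    return min(shader_limit, frontend_limit)

for name, slices in (("GT2", 1), ("GT3e", 2)):
    print(f"{name}: {fps_ceiling(slices, shader_fps_per_slice=30.0):.0f} fps ceiling")

# With these made-up numbers both GT2 and GT3e run into the same 25 fps
# front-end ceiling, which is the shape of the result Star Swarm produces.
```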

Star Swarm for its part is no average workload, as it emphasizes batch counts (draw calls) above all else. Even though the low quality setting has much smaller batch counts than the extreme setting we use on the dGPUs, it’s still over 20K batches per frame, a far higher number than any game would use if it was trying to be playable on an iGPU. Consequently based on our GT2 results and especially our GT3e result, we believe that Star Swarm is actually exposing the batch processing limits of Gen 7.5’s front-end, with the front-end bottlenecking performance once the CPU bottleneck is scaled back by the introduction of DirectX 12.
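
For a sense of scale, a quick back-of-the-envelope calculation shows the command load this implies. The 20K batches per frame figure comes from the low quality setting described above; the target frame rates are arbitrary round numbers chosen for illustration.

```python
# Draw call load implied by Star Swarm's batch counts. 20,000 batches/frame is
# the approximate low quality figure; the target frame rates are arbitrary.
batches_per_frame = 20_000

for target_fps in (30, 60):
    draws_per_second = batches_per_frame * target_fps
    print(f"{target_fps} fps would require {draws_per_second:,} draw calls per second")

# 30 fps would require 600,000 draw calls per second
# 60 fps would require 1,200,000 draw calls per second
```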

The result of this is that while the Intel iGPUs are technically GPU limited under DirectX 12, they are not GPU limited in the traditional sense; they are not limited by shading performance, memory bandwidth, or ROP throughput. This means that although Intel’s iGPUs benefit from DirectX 12, they do not benefit by nearly as much as AMD’s iGPUs did, never mind the dGPUs.

Update: Between when this story was written and when it was published, we heard back from Intel on our results. We are publishing our results as-is, but Intel believes that the lack of scaling with GT3e stems in part from a lack of optimizations for lower performance GPUs in our build of Star Swarm, which is from an October branch of Oxide's code base. Intel tells us that newer builds do show much better overall performance and more consistent gains for the GT3e, all the while the Oxide engine itself is in flux with its continued development. In any case this reiterates the fact that we're still looking at early code here from all parties and performance is subject to change, especially on a test as directed/non-standard as Star Swarm.

So how much does Intel actually benefit from DirectX 12 under Star Swarm? As one would reasonably expect, with their desktop processors configured for very high CPU performance and much more limited GPU performance, Intel is the least CPU bottlenecked in the first place. That said, if we take a look at the mid quality results in particular, what we find is that Intel still benefits from DX12. The 4770R is especially important here, as it pairs a relatively weaker CPU (3.2GHz base frequency) with a more powerful GPU. It starts out trailing the other Core processors in DX11, only to reach parity with them under DX12 once the bottleneck shifts from the CPU to the GPU front-end. The performance gain is only 25%, and at framerates in the single digits, but conceptually it shows that even Intel can benefit from DX12. Meanwhile the other Intel processors see much smaller, but none the less consistent gains, indicating that there’s at least a trivial benefit from DX12.

Star Swarm CPU Batch Submission Time - Mid - iGPU

Taking a look under the hood at our batch submission times, we can much more clearly see the CPU usage benefits of DX12. The Intel CPUs actually start at a notable deficit here under DX11, with batch submission times worse than the AMD APUs and their relatively weaker CPUs, and the 4770R in particular spending nearly 200ms per frame on batch submission. Enabling DX12 in turn causes the same dramatic reduction in batch submission times we’ve seen elsewhere, with Intel’s batch submission times dropping to below 20ms. Somewhat surprisingly Intel’s times are still worse than AMD’s, though at this point we’re so badly GPU limited on all platforms that it’s largely academic. None the less it shows that Intel may have room for future improvements.
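
As a rough sanity check, submission time alone sets a hard ceiling on frame rate: if the CPU spends t milliseconds per frame just submitting batches, the frame rate cannot exceed 1000/t regardless of how fast the GPU is. The short sketch below plugs in the approximate 4770R figures discussed above.

```python
# Frame rate ceiling implied by CPU batch submission time alone. The 200 ms and
# 20 ms inputs approximate the 4770R's DX11 and DX12 figures discussed above.
def submission_fps_ceiling(submission_ms_per_frame: float) -> float:
    return 1000.0 / submission_ms_per_frame

print(f"DX11: ~{submission_fps_ceiling(200.0):.0f} fps ceiling")  # ~5 fps
print(f"DX12: ~{submission_fps_ceiling(20.0):.0f} fps ceiling")   # ~50 fps

# Under DX11, submission alone caps the 4770R at around 5 fps; under DX12 the
# cap moves to roughly 50 fps, at which point the GPU front-end takes over as
# the limiting factor.
```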

Star Swarm CPU Scaling - Mid Quality - iGPU

Star Swarm CPU Scaling - Low Quality - iGPU

With this data in hand, we can finally make better sense of the results we’re seeing today. Just as with AMD and NVIDIA, using DirectX 12 results in a noticeable and dramatic reduction in batch submission times for Intel’s iGPUs. However in the case of Star Swarm the batch counts are so high that GT2 and GT3e appear to be bottlenecked by their GPU front-ends, and as a result the gains from enabling DX12 are very limited. In fact at this point we’re probably at the limits of Star Swarm’s usefulness, since it’s meant more for discrete GPUs.

The end result though is that one way or another Intel ends up shifting from being CPU limited to GPU limited under DX12. And with a weaker GPU than similar AMD parts, performance tops out much sooner. That said, it’s worth pointing out that we are looking at desktop parts here, where Intel goes heavy on the CPU and light on the GPU; in mobile parts where Intel’s CPU and GPU configurations are less lopsided, it’s likely that Intel would benefit more than they do on the desktop, though again probably not as much as AMD has.

As for real-world games, just as with our other GPUs we’re in a wait-and-see situation. An actual game designed to be playable on Intel’s iGPUs is very unlikely to push as many draw calls as Star Swarm, so the front-end bottleneck and GT3e’s poor performance are similarly unlikely to recur. But at the same time, with Intel generally being the least CPU bottlenecked in the first place, their overall gains from the API’s vastly improved draw call performance may also be the smallest.

In the meantime GDC 2015 will be taking place next week, where we will be hearing more from Microsoft and its GPU partners about DirectX 12. With last year’s unveiling serving as an early teaser of the API, this year’s sessions will focus on helping programmers ramp up for its formal launch later this year, and with any luck we’ll get the final details on feature level 12_0 and whether any current GPUs are 12_0 compliant. Along with more on OpenGL Next (aka glNext), it should make for an exciting show on the GPU front.

Comments

  • eanazag - Thursday, February 26, 2015

    He's an angry elf.

    Anyhow, I'm happy to see the results. It confirms some assumptions and adds unexpected information. It confirms that Intel iGPUs weren't CPU limited in the first place and therefore the gains shown would not be as pronounced as AMD's. Secondly, it demonstrates a weakness in Intel's iGPU that I wasn't aware was there; this is particularly important on the 4770R, which is a pricey chip. I don't think it means much today, but possibly in several years a game that has more batch calls will severely underperform on iGPU.

    I don't think Intel's mobile lineup will do any better. The difference we will see is thermal and power improvements while gaming due to less work being necessary. Laptops like the Razer Blade may see the biggest changes. That's exciting.

    My takeaway here is that for an iGPU gaming desktop purchased right now, AMD will be the better option for the next few years of ownership. Unfortunately, AMD seems to be abandoning the desktop in its future chips at the moment. It is a shame because DX12 makes them more relevant. This does bolster the value laptop gaming market in favor of AMD. At $500-600 and lower I will be recommending AMD to those light gamers or dabblers. HSA needs to come out swinging in an application or two for AMD to move up the market tiers for recommendations.
  • patrickjp93 - Friday, February 27, 2015

    You have to remember Intel only started doing real 3D graphics designs 5-6 years ago. Everything before that was an implementation of a PowerVR/3DFX design or an in-house design meant just to drive graphics well enough for business applications (knocking Nvidia and AMD out of that client space).

    Right now Intel is focused on GPGPU compute anyway, hence the 50% core jump on Skylake and putting the iGPU on Skylake Xeons to aid compute density and offer a synergistic layer to work alongside the KNL Xeon Phi and socketed chips under OpenMP and OpenCL. Any wins in gaming are just icing on the cake for Intel. Right now they're after spilling Nvidia's blood in the server/supercomputer space after Nvidia pulled a bunch of graphics licensing during the Larrabee project. The accelerator world used to be almost universally Teslas. Now not so much, especially since CUDA is a rare skill compared to C++ and most HPC courses take some time to teach OpenMP usage.

    And of course Intel has to fight off HSA as well, though it looks like adoption rates are so slow as to be negligible unless Zen is a perfect competitor. 2016 will be the biggest enterprise chip war since the great slugfest between Intel and IBM back during the days of the 8086.
  • mr_tawan - Friday, February 27, 2015

    What would be a better option for benchmarking/previewing DX12 performance then?
  • MrSpadge - Friday, February 27, 2015

    Feel free to ignore anything labeled "preliminary" and let the rest of us enjoy peeking at the potential of DX12. The articles are very clear about the fact that this performance won't translate directly into real games.
  • MikeMurphy - Thursday, February 26, 2015

    I wonder if the significant increases for AMD APUs will translate into substantially better Xbox One performance.
  • lioncat55 - Thursday, February 26, 2015

    I think it will depend on the game. With the Xbox One and PS4 they have Mantle. We have seen that there is not a huge gain from Mantle to DX12. I think it will take time for the developers to get used to the lower level coding and get the full power out of the current gen consoles.
  • dragonsqrrl - Thursday, February 26, 2015

    Mantle was designed specifically for PC. The Xbox One and PS4 don't use Mantle, AMD came out and addressed this topic a long time ago.
  • jabber - Friday, February 27, 2015

    Mantle was designed specifically to force Microsoft to properly write optimised code/procedures for DirectX 12.

    Now that Mantle's job is done AMD can drop it.
  • Gigaplex - Thursday, February 26, 2015

    What makes you think DirectX 12 will have any noticeable performance benefit on consoles? They already use a low level API. DirectX 12 brings a console style API to the desktop, this isn't a brand new innovation.
  • mkozakewich - Thursday, February 26, 2015

    Are games generally tuned better for AMD and NVidia cards than for Intel? If Intel is reporting different results with the newer engine, it makes it sound like either games have to target specific architectures or it's just really easy for a game to miss optimizations on platforms they don't bother testing.
