Haswell GPU Architecture & Iris Pro

In 2010, Intel’s Clarkdale and Arrandale CPUs dropped the GMA (Graphics Media Accelerator) label from their integrated graphics. From that point on, all Intel graphics would be known as Intel HD Graphics. With certain versions of Haswell, Intel once again parts ways with its old brand and introduces a new one; this time the change is much more significant.

Intel attempted to simplify the naming confusion with this slide:

While Sandy and Ivy Bridge featured two different GPU implementations (GT1 and GT2), Haswell adds a third (GT3).

Basically, it boils down to this: Haswell GT1 is just called Intel HD Graphics, and Haswell GT2 is HD 4200/4400/4600. Haswell GT3 at or below 1.1GHz is called HD 5000. Haswell GT3 capable of hitting 1.3GHz is called Iris 5100, and finally Haswell GT3e (GT3 + embedded DRAM) is called Iris Pro 5200.

The fundamental GPU architecture hasn’t changed much between Ivy Bridge and Haswell. There are some enhancements, but for the most part what we’re looking at here is a dramatic increase in the amount of die area allocated for graphics.

All GPU vendors have some fundamental building block they scale up/down to hit various performance/power/price targets. AMD calls theirs a Compute Unit, NVIDIA’s is known as an SMX, and Intel’s is called a sub-slice.

In Haswell, each graphics sub-slice features 10 EUs. Each EU is a dual-issue SIMD machine with two 4-wide vector ALUs:

Low Level Architecture Comparison

|  | AMD GCN | Intel Gen7 Graphics | NVIDIA Kepler |
|---|---|---|---|
| Building Block | GCN Compute Unit | Sub-Slice | Kepler SMX |
| Shader Building Block | 16-wide Vector SIMD | 2 x 4-wide Vector SIMD | 32-wide Vector SIMD |
| Smallest Implementation | 4 SIMDs | 10 SIMDs | 6 SIMDs |
| Smallest Implementation (ALUs) | 64 | 80 | 192 |
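
The ALU row in the comparison above is simply the number of SIMD units in the smallest building block multiplied by the width of each unit. A minimal Python sketch of that arithmetic follows; the names `BLOCKS` and `alu_count` are made up for illustration, and the inputs are copied from the table rather than from any vendor documentation.

```python
# (building block, SIMD units in the smallest implementation, lanes per unit)
# Values are copied from the comparison table; Intel's EU is 2 x 4-wide.
BLOCKS = [
    ("AMD GCN Compute Unit", 4, 16),
    ("Intel Gen7 sub-slice", 10, 2 * 4),
    ("NVIDIA Kepler SMX", 6, 32),
]


def alu_count(simd_units, lanes_per_unit):
    """Total ALUs = number of SIMD units x lanes per SIMD unit."""
    return simd_units * lanes_per_unit


for name, units, lanes in BLOCKS:
    print(f"{name}: {alu_count(units, lanes)} ALUs")
```

This only counts lanes. It says nothing about how often both of an EU's pipes can actually be kept busy, which is where the co-issue limitations come in.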

There are limitations as to what can be co-issued down each EU’s pair of pipes. Intel addressed many of the co-issue limitations last generation with Ivy Bridge, but there are still some that remain.

Architecturally, this makes Intel’s Gen7 graphics core a bit odd compared to AMD’s GCN and NVIDIA’s Kepler, both of which feature much wider SIMD arrays without any co-issue requirements. The smallest sub-slice in Haswell, however, delivers an ALU count that is competitive with the smallest AMD and NVIDIA implementations.

Intel had a decent building block with Ivy Bridge, but it chose not to scale it up as far as it would go. With Haswell that changes. In its highest performing configuration, Haswell implements four sub-slices or 40 EUs. Doing the math reveals a very competent looking part on paper:

Peak Theoretical GPU Performance

|  | Cores/EUs | Peak FP ops per Core/EU | Max GPU Frequency | Peak GFLOPs |
|---|---|---|---|---|
| Intel Iris Pro 5100/5200 | 40 | 16 | 1300MHz | 832 GFLOPS |
| Intel HD Graphics 5000 | 40 | 16 | 1100MHz | 704 GFLOPS |
| NVIDIA GeForce GT 650M | 384 | 2 | 900MHz | 691.2 GFLOPS |
| Intel HD Graphics 4600 | 20 | 16 | 1350MHz | 432 GFLOPS |
| Intel HD Graphics 4000 | 16 | 16 | 1150MHz | 294.4 GFLOPS |
| Intel HD Graphics 3000 | 12 | 12 | 1350MHz | 194.4 GFLOPS |
| Intel HD Graphics 2000 | 6 | 12 | 1350MHz | 97.2 GFLOPS |
| Apple A6X | 32 | 8 | 300MHz | 76.8 GFLOPS |
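
For anyone who wants to check the math, each peak figure in the table is the product of the unit count, the peak FP ops per unit per clock, and the clock speed. The sketch below reproduces that calculation in Python. The function name `peak_gflops` and the `GPUS` list are illustrative only, and every input is taken straight from the table.

```python
# (name, cores/EUs, peak FP ops per core/EU per clock, max frequency in MHz)
# All values are copied from the table above.
GPUS = [
    ("Intel Iris Pro 5100/5200", 40, 16, 1300),
    ("Intel HD Graphics 5000", 40, 16, 1100),
    ("NVIDIA GeForce GT 650M", 384, 2, 900),
    ("Intel HD Graphics 4600", 20, 16, 1350),
    ("Intel HD Graphics 4000", 16, 16, 1150),
    ("Intel HD Graphics 3000", 12, 12, 1350),
    ("Intel HD Graphics 2000", 6, 12, 1350),
    ("Apple A6X", 32, 8, 300),
]


def peak_gflops(units, ops_per_clock, mhz):
    """Peak GFLOPS = units x FP ops per unit per clock x clock in GHz."""
    return units * ops_per_clock * mhz / 1000.0


for name, units, ops, mhz in GPUS:
    print(f"{name}: {peak_gflops(units, ops, mhz):.1f} GFLOPS")
```

These are theoretical ceilings that assume every unit issues its maximum number of operations on every clock, so they are best read as a way to compare scale rather than as a prediction of game performance.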

In its highest-end configuration, Iris has more raw compute power than a GeForce GT 650M, and even more than a GeForce GT 750M. We’re comparing across architectures here, so this won’t necessarily translate into a performance advantage in games, but the takeaway is that with HD 5000, Iris 5100 and Iris Pro 5200, Intel is finally walking the walk of a GPU company.

Peak theoretical performance falls off steeply as soon as you start looking at the GT2 and GT1 implementations. With one quarter to one half of the execution resources of the GT3 graphics implementation, and no corresponding increase in frequency to offset the loss, the slower parts are substantially less capable. The good news is that Haswell GT2 (HD 4600) is at least more capable than Ivy Bridge GT2 (HD 4000).

Taking a step back and looking at the rest of the theoretical numbers gives us a more well-rounded look at Intel’s graphics architectures:

Peak Theoretical GPU Performance

|  | Peak Pixel Fill Rate | Peak Texel Rate | Peak Polygon Rate | Peak GFLOPs |
|---|---|---|---|---|
| Intel Iris Pro 5100/5200 | 10.4 GPixels/s | 20.8 GTexels/s | 650 MPolys/s | 832 GFLOPS |
| Intel HD Graphics 5000 | 8.8 GPixels/s | 17.6 GTexels/s | 550 MPolys/s | 704 GFLOPS |
| NVIDIA GeForce GT 650M | 14.4 GPixels/s | 28.8 GTexels/s | 900 MPolys/s | 691.2 GFLOPS |
| Intel HD Graphics 4600 | 5.4 GPixels/s | 10.8 GTexels/s | 675 MPolys/s | 432 GFLOPS |
| AMD Radeon HD 7660D (Desktop Trinity, A10-5800K) | 6.4 GPixels/s | 19.2 GTexels/s | 800 MPolys/s | 614 GFLOPS |
| AMD Radeon HD 7660G (Mobile Trinity, A10-4600M) | 3.97 GPixels/s | 11.9 GTexels/s | 496 MPolys/s | 380 GFLOPS |
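
One way to see where the fixed-function gap comes from is to divide these peak rates by the maximum GPU frequency listed in the earlier GFLOPS table, which gives an implied per-clock throughput. The Python sketch below does that as a back-of-the-envelope derivation, not as a set of vendor-published specifications. The names `RATES` and `per_clock` are hypothetical, and the Trinity parts are left out because no clock speed is given for them here.

```python
# (name, max MHz, GPixels/s, GTexels/s, MPolys/s)
# Frequencies come from the GFLOPS table, rates from the table above.
RATES = [
    ("Intel Iris Pro 5100/5200", 1300, 10.4, 20.8, 650),
    ("Intel HD Graphics 5000", 1100, 8.8, 17.6, 550),
    ("NVIDIA GeForce GT 650M", 900, 14.4, 28.8, 900),
    ("Intel HD Graphics 4600", 1350, 5.4, 10.8, 675),
]


def per_clock(mhz, gpixels, gtexels, mpolys):
    """Return implied (pixels, texels, polygons) per clock, rounded to 2 places."""
    ghz = mhz / 1000.0
    return (
        round(gpixels / ghz, 2),  # G/s divided by GHz -> per clock
        round(gtexels / ghz, 2),
        round(mpolys / mhz, 2),   # M/s divided by MHz -> per clock
    )


for name, mhz, gpix, gtex, mpoly in RATES:
    pixels, texels, polys = per_clock(mhz, gpix, gtex, mpoly)
    print(f"{name}: {pixels} pixels/clk, {texels} texels/clk, {polys} polys/clk")
```

By this arithmetic the GT3 parts work out to roughly half the per-clock pixel, texel and polygon throughput of the GT 650M, which is consistent with the 650M leading in those columns despite its lower clock.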

Intel may have more raw compute, but NVIDIA invested more everywhere else in the pipeline. Triangle, texturing and pixel throughput capabilities are all higher on the 650M than on Iris Pro 5200. Compared to AMD's Trinity, however, Intel has a big advantage.

177 Comments

  • Death666Angel - Tuesday, June 4, 2013

    "What Intel hopes however is that the power savings by going to a single 47W part will win over OEMs in the long run, after all, we are talking about notebooks here."
    This plus simpler board designs and fewer voltage regulators and less space used.
    And I agree, I want this in a K-SKU.
  • Death666Angel - Tuesday, June 4, 2013

    And doesn't MacOS support Optimus?
    RE: "In our 15-inch MacBook Pro with Retina Display review we found that simply having the discrete GPU enabled could reduce web browsing battery life by ~25%."
  • GullLars - Tuesday, June 4, 2013

    Those are strong words in the end, but i agree Intel should make a K-series CPU with Crystalwell. What comes to mind is they may be doing that for Broadwell.

    The Iris Pro solution with eDRAM looks like a nice fit for what i want in my notebook upgrade coming this fall. I've been getting by on a Core2Duo laptop, and didn't go for Ivy Bridge because there were no good models with a 1920x1200 or 1920x1080 display without dedicated graphics. For a system that will not be used for gaming at all, but needs resolution for productivity, it wasn't worth it. I hope this will change with Haswell, and that i will be able to get a 15" laptop with >= 1200p without dedicated graphics. 4950HQ or 4850HQ seems like an ideal fit. I don't mind spending $1500-2000 for a high quality laptop :)
  • IntelUser2000 - Tuesday, June 4, 2013

    ANAND!!

    You got the FLOPs rating wrong on the Sandy Bridge parts. They are at 1/2 of Ivy Bridge.

    1350MHz with 12 EUs and 8 FLOPs/EU will result in 129.6 GFLOPS. While it's true that in very limited scenarios Sandy Bridge's iGPU can co-issue, it's small enough to be non-existent. That is why a 6EU HD 2500 comes close to 12EU HD 3000.
  • Hrel - Tuesday, June 4, 2013

    If they use only the HD4600 and Iris Pro that'd probably be better. As long as it's clearly labeled on laptops. HD 4600 Pro (don't expect to do any video work on this) Iris Pro (it's passable in a pinch).

    But I don't think that's what's going to happen. Iris Pro could be great for Ultrabooks; I don't really see any use outside of that though. A low end GT740M is still a better option in any laptop that has the thermal room for it. Considering you can put those in 14" or larger ultrabooks I still think Intel's graphics aren't serious. Then you consider the lack of Compute, PhysX, Driver optimization, game specific tuning...

    Good to see a hefty performance improvement. Still not good enough though. Also pretty upsetting to see how many graphics SKUs they've released. OEMs are gonna screw people who don't know just to get the price down.
  • Hrel - Tuesday, June 4, 2013

    The SKU price is 500 DOLLARS!!!! They're charging you 200 bucks for a pretty shitty GPU. Intel's greed is so disgusting it over rides the engineering prowess of their employees. Truly disgusting Intel; to charge that much for that level of performance. AMD we need you!!!!
  • xdesire - Tuesday, June 4, 2013

    May i ask a noob question? Question: Do we have no i5s, i7s WITHOUT on board graphics any more? As a gamer i'd prefer to have a CPU + discrete GPU in my gaming machine and i don't like to have extra stuff stuck on the CPU, lying there consuming power and having no use (for my part) whatsoever. No ivy bridge or haswell i5s, i7s without iGPU or whatever you call it?
  • flyingpants1 - Friday, June 7, 2013

    They don't consume power while they're not in use.
  • Hrel - Tuesday, June 4, 2013

    WHY THE HELL ARE THOSE SO EXPENSIVE!!!!! Holy SHIT! 500 dollars for a 4850HQ? They're charging you 200 dollars for a shitty GPU with no dedicated RAM at all! Just a cache! WTFF!!!

    Intel's greed is truly disgusting... even in the face of their engineering prowess.
  • MartenKL - Wednesday, June 5, 2013

    What I don't understand is why Intel didn't do a "next-gen console like processor". Like taking the 4770R and doubling the GPU or even quadrupling, wasn't there space? The thermal headroom must have been there as we are used to CPUs with as high as 130W TDP. Anyhow, combining that with awesome drivers for Linux would have been a real competition to AMD/PS4/XONE for Valve/Steam. A complete system under 150W capable of awesome 1080p60 gaming.

    So now I am looking for the best performing GPU under 75W, ie no external power. Which is it, still the Radeon HD7750?
