Haswell's GPU

Although Intel provided a good amount of detail on the CPU enhancements to Haswell, the graphics discussion at IDF was fairly limited. That being said, there's still a bit to talk about here.

Haswell builds on the same fundamental GPU architecture we saw in Ivy Bridge. We won't see a dramatic redesign/re-plumbing of the graphics hardware until Broadwell in 2014 (that one is going to be a big one).

Haswell's GPU will be available in three physical configurations: GT1, GT2 and GT3. Although Intel mentioned that the Haswell GT3 config would have twice the shader count of Haswell GT2, it was careful not to disclose the total number of EUs in any of the versions. Based on the information we have at this point, GT3 should be a 40 EU configuration while GT2 should feature 20 EUs. Intel will also be including up to one redundant EU to deal with the case where there's a defect in an EU in the array. This isn't an uncommon practice, but it does indicate just how much of the die will be dedicated to graphics in Haswell. The larger the area the GPU covers, the greater the likelihood that you'll see unrecoverable defects in the GPU. Redundancy at the EU level is one way of mitigating that problem.
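
To put rough numbers on the redundancy argument, here is a minimal sketch using a simple Poisson defect model. This is not Intel's yield model, and every figure in it is a hypothetical placeholder: the defect density, the per-EU area, and the function names `poisson_yield` and `array_yield` are all assumptions made purely for illustration. The 40 EU count is the GT3 estimate from above.

```python
import math


def poisson_yield(defect_density_per_mm2, area_mm2):
    """Probability that a block of the given area has zero defects.

    Assumes defects are randomly scattered (Poisson model), so the
    chance of a clean block falls off exponentially with area.
    """
    return math.exp(-defect_density_per_mm2 * area_mm2)


def array_yield(n_needed, n_spare, eu_area_mm2, defect_density_per_mm2):
    """Probability that at least n_needed of (n_needed + n_spare) EUs are clean."""
    p_good = poisson_yield(defect_density_per_mm2, eu_area_mm2)
    total = n_needed + n_spare
    # Sum the binomial probabilities of having k good EUs, for k >= n_needed.
    return sum(
        math.comb(total, k) * p_good**k * (1 - p_good) ** (total - k)
        for k in range(n_needed, total + 1)
    )


# Hypothetical inputs, chosen only to show the shape of the effect.
DEFECT_DENSITY = 0.002  # defects per mm^2 (assumed)
EU_AREA = 1.0           # mm^2 per EU (assumed)

no_spare = array_yield(40, 0, EU_AREA, DEFECT_DENSITY)
one_spare = array_yield(40, 1, EU_AREA, DEFECT_DENSITY)
print(f"40 EUs, no spare:  {no_spare:.4f}")
print(f"40 EUs, one spare: {one_spare:.4f}")
```

Under these made-up inputs, a single spare EU recovers most of the dies that would otherwise be lost to one bad EU. The sketch only covers defects inside the EU array; a defect elsewhere in the GPU isn't helped by EU-level redundancy.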

Haswell's processor graphics extends API support to DirectX 11.1, OpenCL 1.2 and OpenGL 4.0.

At the front of the graphics pipeline is a new resource streamer. The RS offloads some driver work that the CPU would normally handle and moves it to GPU hardware instead. Both AMD and NVIDIA have significant command processors, so this doesn't appear to be an Intel advantage, although the devil is in the (unshared) details. The point from Intel's perspective is that any amount of processing it can shift away from general purpose CPU hardware and onto the GPU can save power (CPU cores go to sleep while the RS/CS do their job).

Beyond the resource streamer, most of the fixed function graphics hardware sees a doubling of performance in Haswell.

At the shader core level, Intel separates the GPU design into two sections: slice common and sub-slice. Slice common includes the rasterizer, pixel back end and GPU L3 cache. The sub-slice includes all of the EUs and instruction caches.

In Haswell GT1 and GT2 there's a single slice common, while GT3 sees a doubling of slice common. GT3 similarly has two sub-slices, although once again Intel isn't talking specifics about EU counts or clock speeds between GT1/2/3.
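
As a quick way to keep the three configurations straight, here is a small sketch that records only what is stated or estimated above. The `GpuConfig` type and `eu_ratio` helper are hypothetical names invented for this illustration. The EU counts are the estimates from earlier in this section, not Intel disclosures, and the single sub-slice for GT1 and GT2 is an inference from the statement that GT3 has two.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GpuConfig:
    name: str
    slice_commons: int            # rasterizer, pixel back end, GPU L3 cache
    sub_slices: int               # EUs and instruction caches
    estimated_eus: Optional[int]  # None where no figure is available


CONFIGS = {
    # GT1: no EU estimate is given, so it is left as None.
    "GT1": GpuConfig("GT1", slice_commons=1, sub_slices=1, estimated_eus=None),
    # GT1/GT2 sub-slice counts are inferred, not stated by Intel.
    "GT2": GpuConfig("GT2", slice_commons=1, sub_slices=1, estimated_eus=20),
    "GT3": GpuConfig("GT3", slice_commons=2, sub_slices=2, estimated_eus=40),
}


def eu_ratio(a, b):
    """Ratio of estimated EU counts, or None if either count is unknown."""
    x = CONFIGS[a].estimated_eus
    y = CONFIGS[b].estimated_eus
    if x is None or y is None:
        return None
    return x / y


print(eu_ratio("GT3", "GT2"))  # estimated GT3 : GT2 shader ratio
print(eu_ratio("GT1", "GT2"))  # unknown, since GT1 has no estimate
```

Leaving the GT1 count as `None` rather than guessing keeps the table honest about what hasn't been disclosed.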

The final bit of detail Intel gave out about Haswell's GPU is that the texture sampler sees up to a 4x improvement in throughput over Ivy Bridge in some modes.

Now to the things that Intel didn't let loose at IDF. A GT3 part with some form of embedded DRAM was originally an option for Ivy Bridge, but higher-ups at Intel killed plans for it. Rumor has it that Apple was the only customer who really demanded it at the time, and Intel wasn't willing to build a SKU just for Apple.

Haswell will do what Ivy Bridge didn't. You'll see a version of Haswell with up to 128MB of embedded DRAM, with a lot of bandwidth available between it and the core. Both the CPU and GPU will be able to access this embedded DRAM, although there are obvious implications for graphics.

Overall performance gains should be about 2x for GT3 (presumably with eDRAM) over HD 4000 in a high TDP part. In Ultrabooks those gains will be limited to around 30% max given the strict power limits.

As for why Intel isn't talking about embedded DRAM on Haswell, your guess is as good as mine. The likely release timeframe for Haswell is close to June 2013, so there's still tons of time between now and then. It looks like Intel still has a desire to remain quiet on some fronts.

248 Comments

  • tipoo - Friday, October 05, 2012

    "Overall performance gains should be about 2x for GT3 (presumably with eDRAM) over HD 4000 in a high TDP part."

    Does this mean the regular GT3 without eDRAM cache will be twice the performance of the HD4000 and the one with the cache will be 4x? Or that the one with the cache will be 2x? In which case, how would the one with no cache perform? With so many more EUs the first is probably correct, right?
  • tipoo - Friday, October 05, 2012

    "presumably with eDRAM"...So the GT3 in Haswell has over double the EUs of Ivy Bridge, but without the cache it doesn't even get to 2x the performance? Seems off to me. Doesn't it seem like the GT3 on its own would be 2x the performance, while the eDRAM cache would make for another 2x?
  • DanNeely - Saturday, October 06, 2012

    It probably means that, like AMD, Intel is hitting the wall on memory bandwidth for IGPs. When it finally arrives, DDR4 will shake things up a bit; but DDR3 just isn't fast enough.
  • tipoo - Sunday, October 07, 2012

    I don't think so. Doesn't the HD4000 have more bandwidth to work with than AMD's APUs, yet offer worse performance? They still had headroom there. I think it's just for TDP; they limit how much power the GPUs can use since the architecture is oriented at mobile.
  • magnimus1 - Friday, October 05, 2012

    Would love to hear your take on how Intel's latest and greatest fares against Qualcomm's latest and greatest!
  • cosmotic - Friday, October 05, 2012

    Ah, an MPEG2 encoder. Just in time!
  • jamyryals - Friday, October 05, 2012

    This made me :)
  • name99 - Friday, October 05, 2012

    We laugh, but one possibility is that Intel hopes to sell Haswells inside US broadcast equipment.
    There isn't much broadcast equipment sold, but the costs are massive, and there's no obvious reason not to replace much of that custom hardware with Intel chips.
    And much of the existing broadcast hardware (at least the MPEG2-encoding part) is obviously garbage: the artifacts I see on broadcast TV are bad even for the prime-time networks, and are truly awful for the budget independent operators.

    This is much like the cell-tower stack they have written to run on i7s, which replaces the similarly grossly over-priced custom hardware that lives in cell towers and is currently being deployed in China. Anand wrote about this about two weeks ago.
  • vt1hun - Friday, October 05, 2012

    Do you have an idea when Intel will move to DDR4? Not with Haswell according to this article.

    Thank you
  • tipoo - Friday, October 05, 2012

    Haswell EX for servers will support DDR4, but even Broadwell on desktops is only DDR3. We won't see DDR4 in desktops until 2015.
