The Next Generation Gen11 Graphics: Playable Games and Adaptive Sync!

Some of the first words out of Raja Koduri’s mouth about graphics were that Intel has a duty to its one billion customers with integrated graphics to give them something useful, and that it is time for Intel to provide graphics that people can actually play games on. Given his expertise on the matter, this shouldn’t sound too far-fetched: more people play games than ever before, and these users want to play regardless of their hardware. To that end, Raja stated that Gen11 graphics is the first step in a new graphics policy: providing the performance and features to let gamers play the most popular games, no matter the implementation.

Gen11: Intel’s First TFLOPS-Class GT2 Graphics

In 2015, Intel launched the Skylake processor with Gen9 integrated graphics. Rather than moving straight to Gen10 the next time around, we were given Gen9.5 in both Kaby Lake and Coffee Lake, which supposedly drew features from what would have been Gen10. In fact, the graphics for Intel’s failed 10nm Cannon Lake chip were meant to be called Gen10; however, Intel never released a Cannon Lake processor with working integrated graphics, and because Gen11 goes above and beyond what Gen10 would have been, we’ve gone straight to Gen11. Make sense? Well, Intel didn’t even bother to acknowledge Gen10 in its history graph:

According to the roadmaps, we will see Gen11 graphics paired with Sunny Cove cores on 10nm sometime in 2019. However, rather than giving a detailed architecture layout for the new product, Intel instead offered a rather high-level diagram.

From here we can deduce a few things. We were told that this configuration is the GT2 config, which will have 64 execution units, up from 24 in Gen9.5. These 64 EUs are split into four slices, with each slice made up of two sub-slices of 8 EUs apiece. Each sub-slice will have an instruction cache and a 3D sampler, while the bigger slice gets two media samplers, a PixelFE, and additional load/store hardware. Intel lists Gen11 as targeting efficiency, performance, advanced 3D and media capabilities, and a better gaming experience.
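
For reference, here is a minimal sketch of the disclosed GT2 layout; only the counts come from Intel’s diagram, and expressing them as Python constants is purely for illustration.

```python
# Gen11 GT2 layout as described in Intel's high-level diagram.
# Only the counts are from the slide; the structure below is illustrative.
SLICES = 4
SUBSLICES_PER_SLICE = 2   # each with its own instruction cache and 3D sampler
EUS_PER_SUBSLICE = 8

total_eus = SLICES * SUBSLICES_PER_SLICE * EUS_PER_SUBSLICE
print(total_eus)          # 64 EUs, up from 24 in the Gen9.5 GT2 design
```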

Intel didn’t go into too much detail regarding how the EUs achieve higher performance, however the company did say that the FPU interfaces inside the EU have been redesigned, and that fast (2x) FP16 support carries over from Gen9.5. Each EU will still support seven threads, giving the GT2 design 448 hardware threads in total, and with eight 32-bit FP lanes per EU that works out to 512 concurrent FP32 pipelines. In order to help feed these pipes, Intel states that it has redesigned the memory interface, as well as increasing the GPU’s L3 cache to 3 MB, a 4x increase over Gen9.5; it is now a separate block in the unslice section of the GPU.
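
As a rough sanity check on the TFLOPS-class billing, the sketch below works out peak throughput from the figures above. The two 4-wide FP32 FPUs per EU and counting an FMA as two FLOPs follow Intel’s public Gen EU documentation, while the 1.0 GHz clock is an assumed placeholder rather than a disclosed Gen11 frequency.

```python
# Back-of-the-envelope peak throughput for the Gen11 GT2 part described above.
EUS = 64
THREADS_PER_EU = 7
FP32_LANES_PER_EU = 2 * 4          # two SIMD-4 FPUs per EU
FLOPS_PER_LANE_PER_CLK = 2         # fused multiply-add counts as two FLOPs
ASSUMED_CLOCK_GHZ = 1.0            # placeholder clock, not a disclosed figure

hw_threads = EUS * THREADS_PER_EU               # 448 resident hardware threads
fp32_lanes = EUS * FP32_LANES_PER_EU            # 512 concurrent FP32 pipelines
peak_fp32_tflops = fp32_lanes * FLOPS_PER_LANE_PER_CLK * ASSUMED_CLOCK_GHZ / 1000
peak_fp16_tflops = 2 * peak_fp32_tflops         # fast FP16 runs at 2x rate

print(hw_threads, fp32_lanes, peak_fp32_tflops, peak_fp16_tflops)
# 448 512 1.024 2.048
```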

Other features include tile-based rendering, which Intel stated the graphics hardware will be able to enable or disable on a per-render-pass basis. This makes Intel the final member of the PC GPU vendor community to implement the technique, following NVIDIA in 2014 and AMD in 2017. While not a panacea for all performance woes, a good tile rendering setup plays well to the bandwidth limitations of an integrated GPU. Meanwhile, Intel’s lossless memory compression has also improved, with Intel listing a best-case performance boost of 10% and a geometric mean boost of 4%. The GTI interface now supports 64 bytes per clock for both reads and writes to increase throughput, which complements the improved memory interface.
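
To put the GTI figure in context, here is a back-of-the-envelope bandwidth estimate. The 64 bytes per clock in each direction comes from Intel’s slide; the clock speed and the idea of folding the ~4% geomean compression gain into an "effective" figure are assumptions for illustration only.

```python
# Rough effective-bandwidth sketch for the GTI interface figures quoted above.
GTI_READ_BYTES_PER_CLK = 64
GTI_WRITE_BYTES_PER_CLK = 64
ASSUMED_CLOCK_GHZ = 1.0            # placeholder clock, not a disclosed figure
COMPRESSION_GEOMEAN_GAIN = 0.04    # lossless compression, geometric mean

read_gbps = GTI_READ_BYTES_PER_CLK * ASSUMED_CLOCK_GHZ     # GB/s, raw
write_gbps = GTI_WRITE_BYTES_PER_CLK * ASSUMED_CLOCK_GHZ
effective_read_gbps = read_gbps * (1 + COMPRESSION_GEOMEAN_GAIN)

print(read_gbps, write_gbps, effective_read_gbps)          # 64.0 64.0 66.56
```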

Coarse Pixel Shading (CPS), Intel’s implementation of multi-rate shading and similar in scope to NVIDIA’s Variable Rate Shading, is also supported. This allows the GPU to reduce the total amount of shading work required by shading some pixels on a less-than-1:1 basis. Intel showed two demos for CPS, where pixel shading was reduced either as a function of object distance from the camera (so less work is done when things are further away), or as a function of how close the object is to the center of the screen, the latter designed to help features like foveated rendering for VR. With a 2x2 pixel stencil applied – meaning only one pixel shading operation was done per block of 4 pixels – Intel stated a ~30% increase in frame rates in supported games. Unfortunately this needs to be applied on a game-by-game basis in order to prevent significant image quality losses, so the performance gains won’t be immediate or universal.
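
A toy model helps illustrate where the savings come from. The 2x2 stencil (one shader invocation per 2x2 block) is from Intel’s description, but the resolution and the fraction of the frame that can tolerate coarse shading are assumed values.

```python
# Toy model of coarse pixel shading cost. Assumed inputs, illustrative only.
def shader_invocations(width, height, coarse_fraction, stencil=2):
    """Estimate pixel-shader invocations per frame.

    coarse_fraction: share of pixels shaded at the coarse (e.g. 2x2) rate;
    the remainder are shaded at the normal 1:1 rate.
    """
    pixels = width * height
    coarse = pixels * coarse_fraction / (stencil * stencil)
    fine = pixels * (1 - coarse_fraction)
    return coarse + fine

full = shader_invocations(1920, 1080, coarse_fraction=0.0)
mixed = shader_invocations(1920, 1080, coarse_fraction=0.5)  # half the frame coarse
print(1 - mixed / full)   # 0.375, i.e. ~37% fewer shader invocations
```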

For the media block, Intel says that the Gen11 design includes a ground-up HEVC encoder design, with support for high-quality encode and decode. Intel cited the fact that its media fixed-function units are already used in the datacenter for video processing, and home users can take advantage of the same hardware. Intel also stated that by using parallel decoders it can either support concurrent video streams or combine them to handle a single large stream, and this scalable design will allow future hardware to push peak resolutions up to 8K and beyond.
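
A quick pixel-rate calculation shows why combining decoders scales to larger streams. The per-decoder 4Kp60 budget below is an assumed figure, not an Intel spec; the point is only that aggregate pixel rate grows with decoder count.

```python
# Illustrative pixel-rate arithmetic for the "combine parallel decoders" idea.
def pixel_rate(width, height, fps):
    return width * height * fps       # pixels decoded per second

per_decoder = pixel_rate(3840, 2160, 60)       # assumed single-decoder budget (4Kp60)
combined = 2 * per_decoder                      # two decoders working on one stream

print(combined >= pixel_rate(7680, 4320, 30))   # True: roughly enough for 8Kp30
```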

The highlight of the display engine is support for Adaptive Sync technologies. We were told that this was announced back at the launch of Skylake, but now it is finally ready to go into Intel’s integrated graphics. It goes hand in hand with HDR support, thanks to the display pipe’s high-precision data path.

One thing in this presentation that Intel didn’t mention directly is that Gen11 graphics would appear to have Type-C video output support, potentially indicating that Intel has integrated the necessary mux into the chipset itself, removing another IC from the motherboard design.

Comments

  • peevee - Tuesday, December 18, 2018 - link

    "Normally cache misses decrease by a factor of a square root of the proportional size when the cache is increased"

    This is neither true in most performance-critical real cases, nor can it provide any estimate of actual performance increase.
  • mikato - Friday, December 21, 2018 - link

    I'm here for the "raja inside" comments. Disappointed.
  • peevee - Sunday, December 23, 2018 - link

    "although it was pointed out that these improvements won’t help everyone, and might require new algorithms in order to use specific parts of the core."

    Which means it will help almost no one, as very few will optimize specifically for that core.

    "We’re waiting to see what changes Intel has made on the front-end, which is where a lot of low-hanging fruit often lies for performance."

    Low-hanging fruit in x86 was picked back in the Pentium days. Since then it is just more and more kludges, which cost more energy than they improve performance (normalizing for node).
  • peevee - Sunday, December 23, 2018 - link

    "64 EUs... Each EU will support seven threads as before, which means that the entire GT2 design will essentially have 512 concurrent pipelines."

    Math?
    And are these threads? Or ALUs?
  • peevee - Sunday, December 23, 2018 - link

    "The 7-Zip demo was relatively straight forward, showing how the new instructions such as Vector-AES and SHA-NI in Sunny Cove can give the processor a 75% boost in performance over an equivalent Skylake based platform at iso-frequency."

    Huh? Have they recompiled (and if so, what compiler supports the new instructions), or manually written a codepath in asm? And did they enable encryption to get any increase, so the increase is not actually for compression? Have they disabled compression too? ;)
  • dampf - Wednesday, January 2, 2019 - link

    Really Intel? Adding AI improvements to the Core architecture in 2021? Smartphone vendors were doing it last year... way too late. And 5G will take off at the end of 2019.
  • TheJian - Wednesday, January 2, 2019 - link

    I guess I'm not getting why I should be impressed by this.
    https://www.electronicsweekly.com/news/design/comm...
    Leti already did it? They say its IP can be used by others, so is this Intel's solution (what they're using, I mean)?

    AMD already does chiplets, everyone does SoCs (Intel failed at them)... etc. 144mm^2 is not that small (about the size of a large Apple SoC). The current 7nm A12 is 83mm^2 with 6.9B transistors, two big cores and four small. AMD already did interposers/chiplets. Memory has been stacking for a while now. Not sure what is supposed to impress me here.

    "Very much like a mobile chip" ...Pretty much...Again, why so impressed?

    And as OP noted, you have no idea how big the market is, nor how much they can make on them. I think they have to try to sell some before we can say that (many Intel things have been killed over the years), as their last mobile strategy cost them 16B+ in giveaways, and lost the fab race for a while (maybe forever, because that 16B lost should have gone DIRECTLY into fabs and 10nm wouldn't be crap now), as once Intel's 7nm hits, it looks like TSMC beats them anyway with 5nm (ok, tie? whatever). My point here is Intel's 7nm won't be much ahead of TSMC 5nm, if at all, as that is what it will compete with, since tapeouts happen Q2 2019 and chips 12-15 months later.
    https://www.extremetech.com/computing/278742-tsmc-...
    Many other articles out there like this one, but it has a good chart of when and how many wafers, etc. But if risk production is really as they say, 5nm chips by Xmas 2020. That puts Intel where with this @7nm? Unless that answer is Xmas 2020, I'm thinking behind TSMC. It looks like TSMC is aiming before Xmas and they've been moving at a good clip without many glitches recently, so Intel better get busy IMHO. TSMC is Q2 2019 risk, or 2H 2019, depending on who you believe I guess. But still, Intel 7nm better hit by Xmas 2020 then, right?

    Comments on last page: Uh, should have bought NV under $10 but couldn't take the best from the gpu side because nobody could handle Jen as president :) WOW, look at that value you passed up Intel, oh, and you'd RULE mobile by now with all those tegras being on Intel's process 5+yrs ago (never mind what gpus would have done on Intel during this time) and you already had the modem solution too (NV bought one, and had to kill it; Intel would have taken over everything cpu/gpu/modem/mobile).

    With chromebooks, 2b mobile units not using NV GPUs etc, nobody would have stopped them at the FTC since more gpus, and arguably more computing devices, ship without WINTEL, Intel's gpus (even with NV in there) etc. Intel gpus wouldn't have been needed, mobile wouldn't have been lost (14nm Intel NV socs would have competed well against 20nm everyone else, same story before 14/20, Intel 22nm NV socs vs. 28nm everyone else), fab money wouldn't have been blown on mobile etc etc. All the problems Intel has now are because they blew 16B on failing instead of BUYING NV for that or a bit more. They had a value back then of ~6B or less, 659mil shares at $10, I bought at 12...ROFL. They should have owned NV anywhere in there and all this crap wouldn't have happened...LOL. We'll see how this "ideas from outside" crap works out now. To be fair AMD had the same problems to some extent, firing Dirk for not liking mobile/tablet/apu, and wanting a KING first then that cheap crap later. Now they chase king cpu (not gpu yet so far) again...LOL. Yeah, I own AMD stock but still think management is dumb. Can't price anything right, always trying to be a friend or get share, which means NOTHING if it doesn't come with MARGIN as a poor man. Sure the rich guy can flood a market, kill enemy sales, but only because he has wads of cash and can wait until he breaks you. A poor company needs NET INCOME for the next gen R&D and to retain people like KELLER etc.

    I'm only in AMD stock for the 7nm server stuff, then out likely. Rumor/hype work well in advance of real product at AMD (talking stock price here), so you don't likely have to wait for anything other than "shipping soon" or some leaked benchmarks etc. and the price will head to 40+ probably. Just run before that reality hits or brave the waves...LOL. I think AMD will make money, certainly has the server chips to do it, but management just seems to fail at pricing anything to take advantage while they can. Too worried about market share, instead of MARGIN for R&D. I'd rather own the 10% that makes most of the money than the 80% that makes crap + a little midrange crap. Apple thinks the same, see their Q reports for ages etc. Own the rich so you can afford to supply the poor. It doesn't work the other way around generally speaking, especially as the little guy. You can't bleed as the poor little guy ;)
  • TheJian - Wednesday, January 2, 2019 - link

    One more point, in case anyone brings it up: the A12X is 122mm^2 with 10B transistors. It just adds two more big cores IIRC (maybe a few other small changes). Same point though.
