The Kaby Lake-U/Y GPU - Media Capabilities

While from a feature standpoint Kaby Lake is not a massive shift from Skylake, when it comes to GPU matters it nonetheless brings some improvements that are directly visible to the end user. As with the CPU cores, Intel’s 14nm+ process will allow for higher GPU frequencies and overall better GPU performance, but arguably the more impressive change with Kaby Lake is the updated media capabilities. To be clear, Kaby Lake is still an Intel Gen9 GPU – the core GPU architecture has not changed – but Intel has revised the video processing blocks to add further functionality and improve their performance for Kaby Lake.

The media capabilities of the Skylake GPU were analyzed in great detail in our 2015 IDF coverage. The updates to Kaby Lake-U/Y should be analyzed while keeping those features in mind. The major feature change in the Kaby Lake-U/Y media engine is the availability of full hardware acceleration for encode and decode of 4K HEVC Main10 profile videos. This is in contrast to Skylake, which can support HEVC Main10 decode up to 4Kp30, but does so using a “hybrid” process that spreads out the workload over the CPU, the GPU’s media processors, and the GPU’s shader cores. As a result, not only can Kaby Lake process more HEVC profiles in fixed function hardware than before, but it can do so at a fraction of the power and with much better throughput.

Also along these lines, Kaby Lake has implemented full fixed function 8-bit encode and 8/10-bit decode support for Google’s VP9 codec. Skylake offered hybrid decode support for the codec, which is useful from a feature standpoint, but is a bit more problematic in real-world use since hybrid decoding is not as power-efficient as decoding a codec implemented in fixed function hardware. Google has proven eager to serve up VP9 to its YouTube users, so Kaby Lake systems can now decode that video much more efficiently. Meanwhile, on the encode side, brand-new to Kaby Lake is VP9 encoding support, to go with the aforementioned HEVC encode support.

Intel Video Codec Support

| | Kaby Lake | Skylake | Broadwell |
|---|---|---|---|
| H.264 Decode | Hardware | Hardware | Hardware |
| HEVC Main Decode | Hardware | Hardware | Hybrid |
| HEVC Main10 Decode | Hardware | Hybrid | No |
| VP9 8-Bit Decode | Hardware | Hybrid | Hybrid |
| VP9 10-Bit Decode | Hardware | No | No |
| H.264 Encode | FF & PG-Mode | FF & PG-Mode | PG-Mode |
| HEVC Main Encode | FF & PG-Mode | PG-Mode | No |
| HEVC Main10 Encode | FF & PG-Mode | No | No |
| VP9 8-Bit Encode | FF & PG-Mode | No | No |
| VP9 10-Bit Encode | No | No | No |
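
To make the generational differences in the table above easier to query, the decode rows can be transcribed into a small lookup structure. This is only an illustrative sketch: the dictionary layout and the function names `decode_support` and `newly_fixed_function` are our own hypothetical helpers, not part of any Intel API or driver interface.

```python
# Decode support levels transcribed from the codec support table above.
# The data layout and helper names are illustrative assumptions, not an Intel API.
GENERATIONS = ("Kaby Lake", "Skylake", "Broadwell")

DECODE_SUPPORT = {
    "H.264":       ("Hardware", "Hardware", "Hardware"),
    "HEVC Main":   ("Hardware", "Hardware", "Hybrid"),
    "HEVC Main10": ("Hardware", "Hybrid", "No"),
    "VP9 8-Bit":   ("Hardware", "Hybrid", "Hybrid"),
    "VP9 10-Bit":  ("Hardware", "No", "No"),
}


def decode_support(codec, generation):
    """Return the decode support level for a codec on a given generation."""
    return DECODE_SUPPORT[codec][GENERATIONS.index(generation)]


def newly_fixed_function(generation, previous):
    """List codecs that are 'Hardware' on `generation` but not on `previous`."""
    return [
        codec
        for codec in DECODE_SUPPORT
        if decode_support(codec, generation) == "Hardware"
        and decode_support(codec, previous) != "Hardware"
    ]


print(newly_fixed_function("Kaby Lake", "Skylake"))
# → ['HEVC Main10', 'VP9 8-Bit', 'VP9 10-Bit']
```

The result lines up with the text: the decode formats that move into fixed function hardware with Kaby Lake are HEVC Main10 and both VP9 bit depths.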

An overview of the GPU engine in Kaby Lake-U/Y is presented in the slide below.

The new circuitry for hardware accelerating HEVC Main10 and VP9 is part of the MFX block. The MFX block can now handle 8b/10b HEVC and VP9 decode and 10b HEVC / 8b VP9 encode. The QuickSync block also gets a few updates to improve quality further, and AVC encode performance also receives a boost.

The Video Quality Engine also receives some tweaks for HDR and Wide Color Gamut (Rec.2020) support. Skylake's VQE brought in RAW image processing support with a 16-bit image pipeline for selected filters. While Intel has not discussed the exact updates that enable Rec.2020 support, we suspect that more components in the VQE can now handle higher bit-widths. Intel pointed out that the HDR capabilities involve usage of both the VQE and the EUs in the GPU. So, there is still scope for further hardware acceleration and lower power consumption in this particular use-case.

Intel claims that Kaby Lake-U/Y can handle up to eight 4Kp30 AVC and HEVC decodes simultaneously. HEVC decode support is rated at 4Kp60 up to 120 Mbps (especially helpful for premium content playback and Ultra HD Blu-ray). With Kaby Lake-U/Y's process improvements, even the 4.5W TDP Y-series processors can handle real-time HEVC 4Kp30 encode.
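
To put those figures in perspective, here is a rough back-of-the-envelope calculation. It assumes that "4K" means UHD (3840x2160) and that 1 Mbps is 10^6 bits per second; neither is spelled out in Intel's claims, and the helper names are purely illustrative.

```python
# Rough arithmetic for the quoted decode figures.
# Assumptions: "4K" is UHD (3840x2160) and 1 Mbps = 1,000,000 bits/s.
UHD_WIDTH, UHD_HEIGHT = 3840, 2160


def pixels_per_second(width, height, fps, streams=1):
    """Aggregate pixel rate for a number of identical video streams."""
    return width * height * fps * streams


def stream_size_gb(bitrate_mbps, seconds):
    """Size in decimal gigabytes of a constant-bitrate stream."""
    return bitrate_mbps * 1_000_000 * seconds / 8 / 1_000_000_000


eight_4kp30 = pixels_per_second(UHD_WIDTH, UHD_HEIGHT, 30, streams=8)
one_4kp60 = pixels_per_second(UHD_WIDTH, UHD_HEIGHT, 60)

print(eight_4kp30)                # 1990656000 pixels/s
print(one_4kp60)                  # 497664000 pixels/s
print(stream_size_gb(120, 3600))  # 54.0 GB for one hour at 120 Mbps
```

Under these assumptions, eight simultaneous 4Kp30 streams add up to four times the pixel rate of a single 4Kp60 stream, and a 120 Mbps stream works out to roughly 54 GB per hour of content.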

On the subject of premium content, in their presentation Intel rather explicitly mentioned that the improved decode capabilities were, in part, for “premium content playback.” When we pushed Intel a bit on the matter – and specifically on 4K Netflix support – they didn’t have much to say beyond the fact that to play 4K Netflix, you need certification. Based on what was said and what was not said (and what we know about the certification process) our educated guess is that the updates in Kaby Lake-U/Y include some new DRM requirements for 4K content, and 4K Netflix should hopefully be good to go with the new platform. On that note, however, because of those DRM requirements and because this is being pitched as a new feature for Kaby Lake, we suspect that when 4K Netflix streaming does come to the PC platform, Skylake owners are going to be out of luck.

Update: On a related note, one of the Intel press releases that has gone out today announces that Sony's 4K movie and television streaming service, ULTRA, will be coming to Kaby Lake PCs in 2017. To date the service has only been available on Sony's televisions - in part for security reasons - so this is an example of one such premium content service that's coming to Kaby Lake thanks to its stronger DRM abilities.

It must be kept in mind that all the encode / decode aspects discussed above are for 4:2:0 streams. This is definitely acceptable for consumer applications, as even Blu-ray video streams (that have plenty of bandwidth at their disposal) are encoded in 4:2:0. However, if Intel wants to use the new media engine in professional broadcast and datacenter applications, 4:2:2, and, to a much lesser extent, even 4:4:4 support might become necessary. For the purpose of the Kaby Lake-U/Y consumer platforms being introduced today, this is not an issue at all.
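
For readers unfamiliar with the notation, the cost of moving beyond 4:2:0 can be illustrated with a short calculation of uncompressed bits per pixel. This sketch uses the conventional J:a:b reading of chroma subsampling (a region J pixels wide and two rows tall, with `a` chroma samples in the first row and `b` in the second); the function names are hypothetical, and the numbers describe raw sample counts only, not anything specific to Intel's hardware.

```python
# Uncompressed bits per pixel for common chroma subsampling schemes.
# Uses the conventional J:a:b notation over a region J pixels wide and 2 rows tall.
# Function names are illustrative only.


def samples_per_pixel(j, a, b):
    """Average number of samples (luma + Cb + Cr) stored per pixel."""
    luma = 2 * j            # one luma sample for every pixel in the J x 2 region
    chroma = 2 * (a + b)    # Cb and Cr each carry a + b samples in the region
    return (luma + chroma) / (2 * j)


def raw_bits_per_pixel(j, a, b, bit_depth):
    """Uncompressed bits per pixel at the given sample bit depth."""
    return samples_per_pixel(j, a, b) * bit_depth


for name, fmt in (("4:2:0", (4, 2, 0)), ("4:2:2", (4, 2, 2)), ("4:4:4", (4, 4, 4))):
    print(name, raw_bits_per_pixel(*fmt, 8), raw_bits_per_pixel(*fmt, 10))
# 4:2:0 12.0 15.0
# 4:2:2 16.0 20.0
# 4:4:4 24.0 30.0
```

In raw terms, 4:2:2 carries a third more data per pixel than 4:2:0 and 4:4:4 carries twice as much, which gives a sense of the extra sample throughput that such formats would demand from a media engine.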

Moving on, like the GPU core itself, Kaby Lake-U/Y's display pipeline is the same as that of Skylake. This means the iGPU can support up to three simultaneous displays.

One of the disappointing aspects from Skylake that has still not been addressed in Kaby Lake-U/Y is the absence of a native HDMI 2.0 port with HDCP 2.2 support. Intel has been advocating the addition of an LSPCon (Level Shifter - Protocol Converter) in the DP 1.2 path. This approach has been used in multiple motherboards and even SFF PCs like the Intel Skull Canyon NUC (NUC6i7KYK) and the ASRock Beebox-S series. Hopefully, future iterations of Kaby Lake (such as the desktop and high-performance mobile parts coming in January) address this issue to reduce BOM cost for system vendors.

In summary, Kaby Lake-U/Y resolves one of the major complaints we had about Skylake's media engine: the absence of hardware-accelerated 4Kp60 HEVC Main10 decode. There are a few other improvements under the hood that enable a more satisfying multimedia experience for consumers. The software and content-delivery ecosystems have plenty of catching up to do when it comes to taking full advantage of Kaby Lake-U/Y's media capabilities.


131 Comments


  • hansmuff - Tuesday, August 30, 2016

    Does any of the new fixed-function logic that is part of the GPU get to work when I use a discrete GPU instead of the integrated?

    I remember that on my old SB chip, the GPU just was turned off because I use discrete. How have things changed, if at all?
  • Ryan Smith - Tuesday, August 30, 2016

    Typically you'll be using the dGPU for video decoding since it's closer to the display pipeline. However you can totally use QuickSync for video encoding, even with a dGPU.
  • hansmuff - Tuesday, August 30, 2016

    Ah yes, QuickSync in particular was a question for me. While NVENC certainly does do a fine job, if I have a hardware encoder lying dormant in the CPU, it might as well do stream encoding for me :)
  • npz - Tuesday, August 30, 2016

    It's trickier if you wanted to do real time streaming/encoding with it. You need software or a set of software actually to capture the raw video and pass it to the QS encoder.
  • fabarati - Tuesday, August 30, 2016

    I just messed about with NVENC, QSVEncC and x265 when ripping some DVDs. X265 still gives the best quality and size. With an i5-6500, the encoding speed wasn't all that, at around 65 fps. Of course, QSVEncC was closer to 200 fps and NVENC (GTX 1070) clocked in at 1300-2000 FPS.

    Quality and size of the file are of course the opposite, with x265 looking the best and being the smallest, then QSVEncC and finally NVENC.
  • Guspaz - Tuesday, August 30, 2016

    Can you? Last I looked, that required enabling both the dGPU and iGPU simultaneously (and simply not plugging a monitor into the iGPU). Attempts to enable the iGPU while having a dGPU plugged in on my Ivy Bridge resulted in Windows not booting.
  • nathanddrews - Tuesday, August 30, 2016

    I can't speak for your system, but my Z77 motherboard features Virtu multi-GPU support that allows me to use Quick Sync while having my monitor plugged into my dGPU. You have to activate both IGP and dGPU in BIOS, then load both drivers. It worked for me under W7 and W10.
  • Guspaz - Tuesday, August 30, 2016

    Errm, you've got dedicated hardware specifically for the purpose of supporting multiple GPUs (the Lucid Virtu), so that's not really a typical example.
  • extide - Tuesday, August 30, 2016

    Lucid Virtu is all software
  • Gigaplex - Tuesday, August 30, 2016

    Last I checked, it requires motherboard support. You can't just install some software and expect it to work. That's what they meant by dedicated hardware.
