Thunderbolt 3: Now on the CPU*

One of the big failures of Thunderbolt since its inception has been its lack of adoption beyond the Apple ecosystem. In order to use it, both the host and the device needed TB controllers supplied by Intel. It wasn’t until Thunderbolt 3 moved to the USB Type-C connector, and offered enough bandwidth to support external graphics solutions, that the number of available devices started to pick up. The issue remains that the host and device both need an expensive Intel-only controller, but the ecosystem was starting to become more receptive to its uses.

With Ice Lake, that gets another step easier.

Rather than bundle TB3 support into the chipset, Intel has integrated it on the die of Ice Lake, where it takes up a sizable amount of space. Each Ice Lake CPU can support up to four TB3 ports, with each TB3 port getting a full PCIe 3.0 x4 root complex link internally for full bandwidth. (For those keeping count, this means Ice Lake technically has 32 PCIe 3.0 lanes in total.)
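As a rough sanity check on those figures, the lane count and per-port PCIe bandwidth can be worked out in a few lines. This is a minimal sketch rather than Intel data: the 8 GT/s per-lane rate and 128b/130b encoding are standard PCIe 3.0 figures, while the idea that the remaining 16 lanes come from the chipset is an assumption inferred from the 32-lane total, and the function names are purely illustrative.

```python
# Back-of-the-envelope numbers for the integrated TB3 links.
# Assumption: the 16 lanes not used by TB3 hang off the PCH; the text
# above only states the 32-lane total.

TB3_PORTS = 4
LANES_PER_TB3_PORT = 4        # each port gets a PCIe 3.0 x4 link
PCH_LANES = 16                # assumed, see note above

PCIE3_GT_PER_S = 8.0          # PCIe 3.0 raw signalling rate per lane
PCIE3_ENCODING = 128 / 130    # 128b/130b line-encoding efficiency


def tb3_cpu_lanes(ports=TB3_PORTS, lanes_per_port=LANES_PER_TB3_PORT):
    """Number of PCIe lanes the CPU dedicates to TB3."""
    return ports * lanes_per_port


def pcie3_bandwidth_gbps(lanes):
    """Usable PCIe 3.0 bandwidth in Gbps, after encoding overhead."""
    return lanes * PCIE3_GT_PER_S * PCIE3_ENCODING


cpu_lanes = tb3_cpu_lanes()
total_lanes = cpu_lanes + PCH_LANES
per_port_gbps = pcie3_bandwidth_gbps(LANES_PER_TB3_PORT)

print(f"TB3 lanes on the CPU: {cpu_lanes}")
print(f"Total PCIe 3.0 lanes: {total_lanes}")
print(f"PCIe bandwidth per TB3 port: {per_port_gbps:.1f} Gbps")
```

The roughly 31.5 Gbps of PCIe bandwidth behind each port sits under Thunderbolt 3's 40 Gbps link rate, which also has to carry DisplayPort traffic.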

Intel has made it so each side of the CPU can support two TB3 links direct from the processor. There is still some communication back and forth with the chipset (PCH), as the Type-C ports need to have USB modes implemented. It’s worth noting that TB3 can’t be directly used out of the box, however.

How many of the four ports actually make it into designs will be highly OEM dependent. It’s not as simple as just having the CPU in the system: other chips (redrivers) are needed to support the USB Type-C connector, and power delivery requires extra circuitry too, all of which costs money. So while Intel advertises TB3 support on Ice Lake, it still needs something extra from the OEMs. Intel states that a retimer for the integrated solution is only half the size of the ones needed with external TB3 chips, and that each retimer supports two TB3 ports, therefore halving the number of retimers needed.
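To put the retimer claim in concrete terms, here is a small sketch of the count for each approach. It assumes an external-controller design needs one retimer per port, which is only implied by Intel's "halving" statement, and the helper name is hypothetical.

```python
import math

# Assumption: one retimer per port with external TB3 controllers (implied
# by the "halving" claim), versus two ports per retimer when integrated.
EXTERNAL_PORTS_PER_RETIMER = 1
INTEGRATED_PORTS_PER_RETIMER = 2


def retimers_needed(tb3_ports, ports_per_retimer):
    """Retimers required to serve a given number of TB3 ports."""
    if tb3_ports < 0 or ports_per_retimer < 1:
        raise ValueError("ports must be >= 0 and ports_per_retimer >= 1")
    return math.ceil(tb3_ports / ports_per_retimer)


for ports in range(1, 5):
    external = retimers_needed(ports, EXTERNAL_PORTS_PER_RETIMER)
    integrated = retimers_needed(ports, INTEGRATED_PORTS_PER_RETIMER)
    print(f"{ports} port(s): external={external}, integrated={integrated}")
```

Note that the saving only shows up with an even number of ports: a design with one or three ports still rounds up to a whole retimer.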

Here’s a more detailed schematic, showing the complexities of adding TB3 into a chip, with the four PCIe x4 complexes shown moving out to each of the individual PHYs at the bottom, and connected back into the main SoC interconnect fabric. The display engine also has to control what mode the TB3 ports are in, and what signals are being sent. Wake-up times for TB3 in this fashion, according to Intel, are actually slightly longer compared to a discrete controller implementation, because the SoC is so tightly integrated. This sounds somewhat counterintuitive, given that the requisite hardware blocks are now closer together, but it all comes down to power domains: in a separate chip design, each segment has a separate domain with individual power up/down states. In an integrated SoC, Intel has unified the power domains to reduce complexity and die area, which means that more careful management is required and latency ultimately increases a little.

The other upside to the tightly coupled integration is that Intel stated that this method of TB3 is a lot more power efficient than current external chip implementations. However, they wouldn’t comment on the exact power draw of the TB3 block on the chip as it corresponds to the full TDP of the design, especially in relation to localized thermal density (Intel was initially very confused by my question on this, ultimately saying that the power per bit was lower compared to the external chip, so overall system power was lower; they seemed more interested in discussing system power than chip power). Intel did state that the difference between an idle and a fully used link was 300 mW, which suggests that if all four links are in play, we’re looking at 1.2 W. When asked, Intel stated that there are three different power delivery domains within the TB3 block depending on the logic, that the system uses integrated voltage regulation, and that the TB3 region has an internal power rail that is shared with some of the internal logic of the CPU. This has implications when it comes to time-to-wake and idle power, but Intel believes it has found a good balance.
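The 1.2 W figure is simple multiplication, sketched below. It assumes the 300 mW idle-to-active delta scales linearly with the number of busy links, which Intel did not confirm, and it says nothing about the baseline idle power of the block; the function name is illustrative only.

```python
IDLE_TO_ACTIVE_DELTA_MW = 300   # per link, as quoted by Intel
MAX_TB3_LINKS = 4               # up to four TB3 ports per Ice Lake CPU


def tb3_active_power_delta_w(active_links, delta_mw=IDLE_TO_ACTIVE_DELTA_MW):
    """Extra power (W) over idle for a number of fully used TB3 links.

    Assumes linear scaling per link; shared power rails could make the
    real figure differ.
    """
    if not 0 <= active_links <= MAX_TB3_LINKS:
        raise ValueError(f"active_links must be between 0 and {MAX_TB3_LINKS}")
    return active_links * delta_mw / 1000


for links in range(MAX_TB3_LINKS + 1):
    delta_w = tb3_active_power_delta_w(links)
    print(f"{links} active link(s): +{delta_w:.1f} W over idle")
```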

Regarding USB4 support, Intel stated that it is in the design, and they are USB4 compliant at this point, but there might be changes and/or bugs which stop it from being completely certified further down the line. Intel said that it ultimately comes down to the device side of the specification, although they have put as much in as they were able given the time constraints of the design. They hope to be certified, but it’s not a guarantee yet.

Depending on who you speak to, this isn’t Intel’s first crack at putting TB3 into CPU silicon: the chip that Intel never wants to talk about, Cannon Lake, supposedly also had an early TB3 design built inside that never worked. But Intel is confident in its Ice Lake implementation, especially with supporting four ports. I wouldn’t be surprised if this comes to desktop when Intel releases its first generation 10nm desktop processors.

*The asterisk in the title of this page is because you still need external hardware in order to enable TB3.


107 Comments


  • repoman27 - Tuesday, July 30, 2019

    "It stands to reason then that the smaller package is for lower performance and low power options, despite being exactly the same silicon."

    I know the die floorplans are the same, but have Intel ever actually confirmed that U and Y (or H and S series for that matter) are the exact same silicon? Is it strictly binning and packaging that separates the platforms, or is there a slight tweak to the manufacturing process to target lower power / higher frequencies? Intel production roadmaps would seem to indicate this isn't just a binning situation, but I've never been entirely certain on that point.

    And isn't Comet Lake-U 6+2 more likely to be 25 W, with Whiskey Lake-U 4+2 continuing to pull 15 W duty alongside Ice Lake-U 4+2?
  • CaedenV - Tuesday, July 30, 2019

    Those goals for Aethena are OK, but my old Dell XPS 12 with a carousel frame hit all of those except biometric, and wake from sleep in <1 sec... well, and the bezel... but that was due to the carousel design which I would LOVE to come back in a more modern form.
    Not saying these goals are bad... but if a 6 year old midrange laptop can hit almost all of them, then this isn't exactly aiming for something amazing.
  • AshlayW - Tuesday, July 30, 2019

    Quad core for 179 USD? What is this, 2015? No thanks.
  • HStewart - Tuesday, July 30, 2019

    You do realize these are ultra-portable low power cpu's and not desktop chips
  • Samus - Tuesday, July 30, 2019

    Intel is a mess right now, the execution of this along with the naming scheme is ridiculous.
  • shabby - Tuesday, July 30, 2019

    18% ipc gain and 20% clock loss
    Place your bets how intel will spin this.
  • CHADBOGA - Tuesday, July 30, 2019

    I'm quite disappointed the issue of security mitigation in hardware was not addressed. o_O
  • CityBlue - Saturday, August 3, 2019

    Disappointed, but not surprised.

    Security (and by inference the performance overhead required to implement proper security) is not important according to Anandtech/Ian Cutress. Which is obvious nonsense, so the only logical conclusion is that Anandtech are now a thoroughly biased outfit incapable of any critical reporting, which is quite sad particularly as it means all their articles (particularly when they relate to Intel) have to be read with a very heavy dose of cynicism.
  • eek2121 - Wednesday, July 31, 2019

    That picture of you biting a wafer is priceless.
  • Santoval - Wednesday, July 31, 2019

    If Ice Lake-U has a ~3.5% higher single core performance (and, assuming the "multi-core overhead" is the same, multi-core performance as well) than Whiskey Lake-U despite having a 20% lower single core boost clock, then Sunny Cove must be an extremely impressive μarch. Or, er, that might not actually be the case : Ice Lake-U has a 18% higher IPC than the *original* Skylake of 2015, not Whiskey Lake. While Whiskey Lake is basically the same design it must have a somewhat higher IPC due to its much more mature process node and other optimizations.

    Let's be conservative and assume that Ice Lake-U (more specifically Sunny Cove) has a nice round 15% higher IPC than Whiskey Lake-U, with both at 15W. In that case, at a 20% lower boost clock Ice Lake-U should have a 5% lower performance than Whiskey Lake-U. Where is that +3.5% performance derived from then?

    Even if we assumed that Ice Lake-U 18% IPC edge is over Whiskey Lake-U (highly unlikely, otherwise Intel would not have dug out the original Skylake from its computing grave) that would still translate to Ice Lake-U having a 1.5% lower single core performance than Whiskey Lake-U, rather than being 3.5% faster than it.

    Maybe, just maybe, this is why Intel used just a single synthetic benchmark (surely compiled with aggressive flags and optimized for Intel CPUs) for that graph and avoided to disclose other synthetic benchmarks and real world use benchmarks? Is this also why they avoided to talk about CPU performance of Ice Lake in their Computex presentation, and instead focused on iGPU, Wifi and AI performance?

    Based on the disclosed clocks and the "disclosed" (in obfuscated form) IPC of Ice Lake-U I just cannot see it being in any way faster than Whiskey Lake-U. It will probably also have worse power efficiency, since it has the same or higher TDP range at a much lower clock.
