SoC Tile, Part 1: Low-Power Island E-Cores, Designed for Ultimate Efficiency

Diving a little deeper into the SoC tile within Intel's Meteor Lake architecture, Intel hasn't just opted for a minor change but has made a significant leap forward, especially regarding I/O fabric scalability. The SoC tile itself isn't built on Intel 4 like the compute tile but is made by TSMC on its N6 node. Intel has ditched the old limitations of mesh routing by implementing a Network-on-Chip (NOC) on the silicon. This isn't just about making data lanes faster; it's about providing smarter and more power-efficient access to memory. The NOC is likely a product of Intel's 2018 acquisition of NetSpeed, which specialized in NoC and fabric IP for SoCs, and opting for a physical NOC allows Intel to raise the ceiling on bandwidth. With Foveros packaging keeping the tiles close together, the data paths are a lot shorter, which translates into less power loss, and the shorter wires also help reduce overall latency penalties.

Switching gears to low-power workload efficiency, Meteor Lake incorporates E-cores directly into the SoC tile, which Intel calls Low Power Island (LP) E-cores. Think of it as Intel's way of saying, "Why use a sledgehammer when a scalpel will do?" The LP E-cores are designed purely from a power-saving perspective. With the aid of Intel's Thread Director, lighter threads and background tasks that don't require the grunt of the P-cores and E-cores on the compute tile can be directed onto the lower-powered LP E-cores.
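To make the idea concrete, here is a minimal sketch of that kind of placement decision. It is a toy model, not Intel's Thread Director or any operating system's scheduler: the `demand` score, the thresholds, and the pool names are all hypothetical, chosen only to illustrate sending the lightest work to the lowest-power cores.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    demand: float  # hypothetical 0.0-1.0 estimate of how much compute the thread needs


def place(thread, lp_limit=0.2, e_limit=0.6):
    """Return the lowest-power core pool assumed able to handle the thread.

    The thresholds are illustrative assumptions, not published values.
    """
    if thread.demand <= lp_limit:
        return "lp_e_core"  # SoC tile: background and light work
    if thread.demand <= e_limit:
        return "e_core"     # compute tile: moderate work
    return "p_core"         # compute tile: heavy work


if __name__ == "__main__":
    for t in (Thread("mail-sync", 0.05), Thread("browser-tab", 0.4), Thread("video-export", 0.95)):
        print(t.name, "->", place(t))
```

In this sketch the compute tile is only involved once a thread's demand crosses the first threshold, which is the behavior the LP E-cores are meant to enable.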

While both the E-cores and LP E-cores are based on the same Crestmont microarchitecture, the E-cores on the compute tile are built on Intel 4, along with the P-cores. The LP E-cores are made on TSMC's N6 node, like the rest of the SoC tile. These low-power island E-cores are tuned for finer-grained voltage control through an integrated Digital Linear Voltage Regulator (DLVR). They also have a lower voltage-to-frequency (V/F) curve than the big E-cores on the compute tile, meaning they can run at a lower power cost. That saves power whenever low-intensity workloads are moved off the compute tile and onto the LP E-cores.
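Why a lower V/F curve matters can be shown with the standard first-order model for dynamic CMOS power, P = a * C * V^2 * f. The sketch below applies it to two operating points at the same clock speed. Every number in it (capacitance, voltages, frequency) is a hypothetical placeholder, not a figure Intel has disclosed, and the model ignores leakage.

```python
def dynamic_power(c_eff, voltage, freq_hz, activity=1.0):
    """First-order dynamic power model: P = activity * C * V^2 * f (watts)."""
    return activity * c_eff * voltage ** 2 * freq_hz


# Hypothetical operating points at the same frequency; only the voltage differs.
FREQ_HZ = 2.0e9   # assumed clock
C_EFF = 1.0e-9    # assumed effective switched capacitance, in farads

compute_tile_e_core = dynamic_power(C_EFF, 0.9, FREQ_HZ)  # assumed 0.9 V
lp_e_core = dynamic_power(C_EFF, 0.7, FREQ_HZ)            # assumed 0.7 V

ratio = lp_e_core / compute_tile_e_core

if __name__ == "__main__":
    print(f"LP E-core dynamic power relative to E-core: {ratio:.2f}")
```

Because voltage enters the model squared, the assumed drop from 0.9 V to 0.7 V at the same clock cuts dynamic power to roughly 0.6 of the original.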

Thanks to the disaggregated nature of Foveros, combined with an individual power management controller (PMC) within each tile, IP blocks can be independently powered on or off as required.

SoC Tile: Bandwidth Scalability, Can't Stop The NOC

Adding a variety of tiles requires well-planned pathing to ensure that bandwidth is adequately provisioned. As I/O bandwidth bottlenecking was a major concern in previous iterations, Intel aims to solve bandwidth flow restrictions with a couple of solutions.

The first is the scalability of the I/O bandwidth, which Intel achieves by adding what it calls 'Scalable Fabric,' configured for up to 128 GB/s of throughput. All of the I/O ordering and address translation is fed through the IOC, while Intel has implemented a Network-on-Chip (NOC) to interconnect many of the different areas of the SoC.

The Network-on-Chip, or NOC, is designed to be coherent, and for Meteor Lake it is unordered, meaning it moves data without enforcing a strict ordering. Connecting all the tiles together through the NOC and independently through the IOC gives plenty of bandwidth headroom for devices or agents requiring it. The NOC is directly connected to the compute and graphics tiles, while other elements, including traffic from the LP E-cores on the SoC tile, media, display, the NPU, and the image processing unit (IPU), all go through the NOC. The I/O tile, meanwhile, connects to the IO fabric, which feeds through the IOC and then on to the NOC.
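The connectivity described above can be written down as a small graph, which makes it easy to see how many hops a given agent sits from the NOC. This is only a sketch of the topology as described in this article: the node names are our own labels, the links are treated as bidirectional, and the real fabric is certainly more detailed.

```python
from collections import deque

# Links as described in the text; node names are illustrative labels.
LINKS = {
    "noc": ["compute_tile", "graphics_tile", "lp_e_cores", "media",
            "display", "npu", "ipu", "ioc"],
    "ioc": ["io_fabric"],
    "io_fabric": ["io_tile"],
}


def build_graph(links):
    """Turn the link table into an undirected adjacency map."""
    graph = {}
    for node, neighbours in links.items():
        for other in neighbours:
            graph.setdefault(node, set()).add(other)
            graph.setdefault(other, set()).add(node)
    return graph


def route(graph, src, dst):
    """Breadth-first search for the shortest hop path from src to dst."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no path between the two nodes


if __name__ == "__main__":
    graph = build_graph(LINKS)
    print(" -> ".join(route(graph, "io_tile", "npu")))
```

In this model, traffic from the I/O tile crosses the IO fabric and the IOC before it reaches the NOC, while the compute and graphics tiles are a single hop away.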

The SoC tile also integrates Wi-Fi 6E and can be made to support the latest Wi-Fi 7 standard. Having a future-proof method of including Wi-Fi 7 and Bluetooth 5.4 can add the next level of wireless connectivity to Intel's mobile platform. Wi-Fi 7 supports channels up to 320 MHz wide, double the maximum channel width of its predecessor, Wi-Fi 6. It also uses 4096-QAM (4K QAM) to enable transmission speeds capable of hitting 5.8 Gbps.

We're still waiting for clarification on what this actually means: whether the silicon itself supports Wi-Fi 7, or whether there's some underlying compatibility within the Wi-Fi MAC integrated into the silicon. One option could be that Intel is adding a full external controller to get to Wi-Fi 7 instead of using CNVio to split up parts of the radio stack. We have asked Intel for more details and will update you when we have a response.

That being said, Intel discloses 'support' for Wi-Fi 7 and BT 5.4, but there's a chance Intel could differentiate which wireless MAC is implemented in different chip segments. An example would be a Core Ultra 9 SKU bolstered by Wi-Fi 7 support, whereas a lower-end SKU like a Core 3 might use Wi-Fi 6E to save on cost.

Additional features include Multi-Resource Unit (RU) Puncturing and what Intel bills as military-grade security, with WPA3 supporting GCMP-256 encryption to ensure both speed and security when connected to a wireless network. Features unique to Wi-Fi 7, like Multi-Link Operation (MLO), are designed to reduce latency and jitter by up to 60%, making it a competent solution for various users' connectivity needs. Adding Bluetooth 5.4 further complements this by improving audio quality, and it is claimed to offer up to 50% lower power consumption for longer battery life.
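The 5.8 Gbps figure quoted earlier for Wi-Fi 7 can be sanity-checked with a little arithmetic. The sketch below is our own back-of-the-envelope calculation, not something Intel has published. It assumes a client with two spatial streams, a 320 MHz channel carrying 3,920 data subcarriers, 4096-QAM (12 bits per subcarrier) at a 5/6 coding rate, and a 12.8 µs symbol with a 0.8 µs guard interval, which are the 802.11be numbers as we understand them.

```python
def phy_rate_mbps(data_subcarriers, bits_per_subcarrier, coding_rate,
                  spatial_streams, symbol_us=12.8, guard_us=0.8):
    """Peak OFDM PHY rate in Mbps (data bits per symbol / symbol time in microseconds)."""
    bits_per_symbol = (data_subcarriers * bits_per_subcarrier
                       * coding_rate * spatial_streams)
    return bits_per_symbol / (symbol_us + guard_us)


# Assumed Wi-Fi 7 configuration: 320 MHz channel, 4096-QAM, rate 5/6, two streams.
wifi7_2x2 = phy_rate_mbps(
    data_subcarriers=3920,   # assumed count for a 320 MHz channel
    bits_per_subcarrier=12,  # 4096-QAM = 2^12 constellation points
    coding_rate=5 / 6,
    spatial_streams=2,       # assumption: a 2x2 client
)

if __name__ == "__main__":
    print(f"Peak PHY rate: {wifi7_2x2 / 1000:.2f} Gbps")
```

Under those assumptions the result lands at roughly 5.76 Gbps, which rounds to the quoted 5.8 Gbps. That is a peak PHY rate, so real-world throughput will be lower.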

Also present on the SoC tile are the display controller and the media engine from the GPU. These are always-on (or at least, mostly-on) blocks that do not need to be built on a cutting-edge process node, making them good candidates to place on the SoC tile. Meteor Lake offers support for 8K HDR and AV1 video playback and contains native HDMI 2.1 and DisplayPort 2.1 connectivity.

Finally, the SoC tile also includes other key platform components. These include PCIe lanes, which are integral for connecting external devices such as discrete graphics cards, and the platform's I/O capabilities, such as USB 4 and 3. The tile also offers a direct interconnect to a separate I/O tile with Thunderbolt 4 and additional PCIe lanes. While we've touched on wireless connectivity, the SoC tile also includes Ethernet support, although Intel hasn't yet disclosed which PHY will be included; it is likely to be capable of 2.5 GbE at the minimum.

A Note on Meteor Lake's Security Features: New Silicon Security Engine (ISSE)

Security has also been given closer attention in Meteor Lake. The architecture introduces the Intel Silicon Security Engine (ISSE), a dedicated component focused solely on securing things at a silicon level. Various vulnerabilities have been at the forefront of media over the last few years, including Meltdown, Spectre, and Foreshadow.

With real threats around the world, securing data is an ever-present concern. CPU architects and designers need to consider not only performance and efficiency but also security, and doing so at the architectural level is just as important as having a competent software stack. The Converged Security and Manageability Engine (CSME) has also been partitioned to further enhance platform security. These features collectively provide a wide range of on-chip and off-chip protections designed to mitigate attacks on multiple fronts.


107 Comments

  • PeachNCream - Thursday, September 21, 2023 - link

    Nice trolling lemur! You landed like an entire page of nerd rage this time. You're a credit to your profession and if I could give you an award for whipping dead website readers into a frenzy (including regulars who have seen you do this for years now) I would. Congrats! 10/10 would enjoy again.
  • IUU - Thursday, September 28, 2023 - link

    Intel does not need to do anything about its architecture to match or surpass m3. It just needs to build its cpus on a similar node. Which is not happening anytime soon, thus perpetuating the illusion of efficiency of apple cpus.

    Two things more. First it is hilarious to compare the prowess of Intel on designing cpus to that of Apple. Apple has long time "building" machines like a glorified Dell borrowing cpus from IBM or Intel and only recently understood the scale and effort needed to design your silicon by improving on ARM designs.

    Secondly, it is misguided to say that if a cpu needs 10 times more wattage on the same node to achieve 2 or 3 times the performance is less efficient. This is not how physics works . If Intel built their cpus on N3 of tsmc they would be 2 or 3 times faster best case scenario. Wattage does not scale linearly with performance. This is the same as saying that a car that has 10 times the power would be 10 times faster. Lololol.

    Apple designs good cpus recently, but all the hype about its efficiency is just hype. Even if we assume the design is totally coming from Apple, which it does not, being a very good modification at best, it does not even build its nodes. By and large its efficiency is TSMC efficiency. If it were not for TSMC Apple would be non existent on the performance charts.
  • Silma - Tuesday, September 19, 2023 - link

    TLDR:
    - Intel 4 < TSMC N6
    - To not be late, Intel 3 must arrive within 3 months, which is highly doubtful, since Intel 4 isn't even shipping yet
    - I assume Intel 3 < TSMC N6, otherwise, why bother enriching the competition?
    - Parts of the new tech stack look promising, but Intel refrains from any real performance claims, or any comparison with offerings from AMD or Apple.
    - Did Intel announce another architecture for desktop computers, probably more similar to that of AMD, e.g. perhaps many performance tiles plus one cache tile?
  • Drumsticks - Tuesday, September 19, 2023 - link

    Maybe. Or maybe TSMC6 is cheaper, and Intel doesn't need the power savings or area savings of I4 over TSMC6 for what the non-compute tiles need to accomplish. It's not exactly uncommon to see the SoC / IO tile on a lower node, doesn't AMD do the same thing?
  • Roy2002 - Tuesday, September 19, 2023 - link

    Intel 4 and 3 are basically the same with the same device density as 3 is enhanced 4. I assume it has slightly higher density value than TSMC 5nm and performance is slightly better. Let's see.
  • kwohlt - Tuesday, September 19, 2023 - link

    Intel 4 is not library complete. It can't be used for the SoC tile.
  • sutamatamasu - Tuesday, September 19, 2023 - link

    I wonder, if the current processor has a dedicated NPU, then what the heck happened to GNA?

    Is it still in there, or did they remove it?
  • Exotica - Tuesday, September 19, 2023 - link

    Intel should've either implemented TB5 in Meteor Lake or waited until after Meteor Lake shipped to announce TB5. Because as cool and impressive as meteor lake seems, for some of us, it's already obsolete in that it makes no sense to buy a TB4 laptop/PC and instead wait on TB5 silicon to hit the market.
  • FWhitTrampoline - Tuesday, September 19, 2023 - link

    Why use TB4 or USB4/40Gbs and have to deal with the extra latency and bandwidth robbing overhead compared to PCI-SIG's OCuLink that's just pure PCIe signalling delivered over an external OCuLink Cable? OCuLink and PCIe require no extra protocol encapsulation and encoding/decoding steps at the PCIe link stage so that's lower latency there compared to USB4/TB4 and later generations that have to have extra encoding/decoding of any PCIe protocol packets to send that out over TB4/USB4. And for external GPUs 4 lanes of PCIe 4.0 connectivity can provide up to 64Gbs of bandwidth over an OCuLink port/cable and OCuLink ports can be 8 PCIe lanes and wider there.

    One can obtain an M.2/NVMe slot to OCuLink adapter and get an external OCuLink connection of up to 64Gbs as long as the M.2 is 4 PCIe 4.0 lanes wide, and no specialized controller chip is required on the MB to drive that. And GPD on their Handhelds offers a dedicated OCuLink port and an external portable eGPU that supports OCuLink or USB4/40Gbs-TB interfacing. TB5 and USB4-V2 will take years to be adopted whereas OCuLink is just PCIe 3.0/4.0 there delivered over an external cable.
  • Exotica - Tuesday, September 19, 2023 - link

    Unlike Thunderbolt, OCuLink doesn't have hotplugging, meaning your device must be connected at cold boot. Not so good for external storage needs.
