The Future of Lakefield

Lakefield as a product is a lateral move for Intel. The company is taking some of its new and popular IP, and placing it into a novel form factor that has required a significant amount of R&D from a manufacturing and construction perspective. The goal of Lakefield was to meet particular customer requirements, which we understand to be around battery life, performance, and multi-screen support, and according to Intel, those goals have been met, and they will be producing future generations of Lakefield products.

In particular, Intel has produced this slide at a couple of conferences.

This slide essentially states that Lakefield product in the yellow box has two silicon die – one optimized for compute on Intel’s P1274 process (10+ nm) and the Foveros layer (the active interposer layer) on Intel’s 22FFL process.

The next product with heterogeneous manufacturing integration will be Intel’s big Xe-HPC product, Ponte Vecchio, which will use Intel’s P1276 process (7nm) as a compute die and Intel’s P1274 (10+) process as a base interposer layer.

Beyond this, Intel looks to continue with its multi-layered products by having the compute layer on the most advanced process node, with the interposer layer one generation behind, on a ‘Foveros’ optimized variant.

So the first generation Lakefield is essentially a product that combines P1274 and 22FFL, and a future product is likely to be built on P1276 on the compute layer and P1274 for the interposer layer. Keeping this sort of cadence makes a lot of sense. However, Intel is going to have to learn from Lakefield in a number of ways, especially as we look at ways in which the heterogeneous layering concept can expand. I’ve split this into several areas that I feel is critical to where layered processors can really make a difference.

Growing a Stacked Die to Higher TDP and Core Count

I’ve combined these two points because they essentially go together. Implementing two simple silicon die together in a small form factor product, while is interesting on the power side of the equation, doesn’t probe the question of scaling the product up. It’s easy enough to scale the product out by adding in some form of connectivity to the stack and then connecting them together (which is what’s happening in Ponte Vecchio), but at some point the stack has to move to a higher level of power consumption if it wants to move upwards in power.

This means that thermals become a bigger issue if it wasn’t already. If we take the current Lakefield design, with one compute die over an active interposer, with the right routing then moving to a physically larger floorplan and a higher power shouldn’t be too much of an issue – if anything, making the base die larger should help spread a lot of that IO about, making the interposer a functionally less active interposer. Or Intel will implement the next generation of its die-to-die stacking technology, where the top dies can be larger than the base dies, in a cantilevered fashion.

The bigger deal with the thermals is going to be on the top, with the stacked PoP memory. We go more into the memory communications aspect in a bit, but ideally that memory needs to be on the side so the compute die can have access to a proper heatspreader. The only reason it is stacked in Lakefield is because of the size constraints and attempting to get everything into that small form factor. For anything larger, there needs to be a memory controller that looks outside the chip, which is kind of what we’re expecting from Ponte Vecchio with HBM. A desktop-class product would likely be in the middle.

Growing a Stacked Die to More Stacks

The other angle for a stacked silicon product is to put more stacks in place. This again brings about the question on cooling between the stacks, depending on what is actually there. Lakefield is only two stacks right now, with one high-powered stack and one low-powered stack. Intel would have to prove that it could manage multiple high-powered stacks in order to expand compute in the vertical dimension, but that brings about its own problems.

To start, with Lakefield, the main power to the top compute die is provided with TSVs going through the active interposer layer. For each compute die in a multi-die stack, there would have to be TSVs for each one in order to provide individual power. Unless the active interposer also acted as a PMIC, this could become difficult depending on what other TSVs or data paths need to be put in place between the layers.

Note, when we spoke with Intel’s Ramune Nagisetty at IEDM last year, when asked if Intel would ever discuss if a stacked product would use ‘dummy’ layers to help in cooling, we were told that this would unlikely be mentioned, focusing only on the layers that actually do any work. But ultimately there could be cause for dummy layers to aid in cooling, such that they can provide mass and distance between thermal hotspots between compute dies involved. As the number of layers increases, however, something like Lakefield would have to move the PoP memory off the top, as already mentioned.

Memory Communications

One element to the Lakefield design we haven’t really covered here is how the memory communicates. In the current Lakefield design, the compute cores and the memory controllers are located on the compute die. In order for a portion of main memory to be read into the compute die, the communication has to travel down through the active interposer, go into the package, and then loop back up to the stacked memory.

In the following diagram, on the left, we have (1) going from Compute Die to DRAM, and (2) DRAM back to Compute Die.

This path is a lot longer than simply going from the compute die straight up into the memory, which would be theoretical on the right hand side if the two were bonded and had appropriate pathways.

If a future Lakefield product wants to continue down the memory-on-top route, one optimization could be to bond that top memory die in a Foveros-like fashion. One could argue that it means Intel would have to bond the memory on at the manufacturing stage, but this already happens with the current generation of Lakefield designs. The only downside would be getting the bonding pads on the top of the compute die and the bottom of the memory die to line up, and then manage the communications from there. The power for the memory would have to also come through on TSVs.

But if we’re bonding the memory into the stack, then technically it could go at any layer – there are likely benefits to keeping the compute die/dies on top. This could lead to multiple layers of memory as needed.

Power Management

With the current Lakefield design, both the compute die and the active interposer die have their own power management IC (PMICs) in order to help deliver power. Based on Intel’s own diagrams, these PMIC designs take up more physical PCB space than Lakefield itself.

At some level, Intel is going to have to figure out to create a unified PMIC solution to cover every layer on the product. It likely reduces board space and would make things a lot simpler, as it does with laptops that can manage power to the CPU and GPU on the same die with an onboard power controller. A PMIC that can scale with layer counts is obviously going to be a plus.

Cooling

Through all of this, as I’ve mentioned several times, cooling is going to be a major concern. There’s no easy way around the physics of dissipating 5-10 W in such a small space, or over 100 W if the product scales up into something in a form factor that has a wider appeal. Previously in the article I mentioned that we had discussed this with Intel, and how areas such as microfluidic channels have obviously had some research put into, but nothing to the point where it could be done commercially and at scale. It’s a paradigm worth solving, because the benefits would be tremendous.

Beyond Windows and Enabling 5G

One thing to note is that Intel's Lakefield is only planned with Windows 10 support right now. Linux is currently not in the plan for this product, but it would have to be if Intel wants wider adoption of the technology.

Not only this, but as most people are comparing these devices to Qualcomm's hardware, appropriate 5G support will need to be applied - the current generation Lakefield is not part of Intel and Mediatek's collaboration on 5G, which only applies to Tiger Lake and beyond. Lakefield customers will have to rely on 4G as an optional extra, or 5G through an external modem.

The Future Of Lakefield

Even if this first generation version of Lakefield gets slammed pretty hard in performance-focused benchmark reviews for being slower than a dual-core Whiskey Lake, Lakefield marks some very big steps for Intel. Hybrid CPU designs, and stacked die-to-die connectivity, are going to feature in Intel’s future roadmaps – at what points will depend on how much Intel is willing to experiment but also how well Intel can execute. There have been discussions on Intel perhaps looking at an 8+8 hybrid CPU design in the future, although nothing we can substantiate, but we do know that Ponte Vecchio with stacked die is coming in late 2021.

One of the key ingredients in all of this is going to be at what points Intel’s technology portfolio is going to intersect its product portfolio. Some of these technologies might find their way better suited to aspects such as 5G networking, or automotive, rather than something we can consume on the desktop. As far as Lakefield goes, this first generation is going to be a rough challenge for Intel – they are pitching a low performance product in a high-cost segment based on technology (and to a certain extent, battery life). Die-to-die stacking will get easier to do as scale ramps, and hopefully new process node technologies will drive the power efficiency of those big cores lower to enable 2+4 or bigger designs when in a stacked form factor.

We eagerly await a chance to test 1st Gen Lakefield, but we’re also keeping an eye on what might be in the second and third generations.

Performance Numbers: How To Interpret Them
Comments Locked

221 Comments

View All Comments

  • PaulHoule - Saturday, July 4, 2020 - link

    @DrK,

    the engineering on this part is like what you'd get if you contracted out to Rockwell or Litton Industries for a brain for a Stinger missile. Compact, brilliantly packaged, with adequate performance, but no concern at all about thermal dissipation because the missile is going to hit or miss its target before the CPU fries.

    Foveros is an expensive technology for a mass market device (cheap tablet) because the fabrication cost depends on the total area and there is an expensive step of stitching the chips together at the end. If you could avoid fabricating "glue" components and just snap together chips from a library this might be an amazing technology to build 500 of something at low development cost and time (e.g. weeks) If you have to make a new mask for the chip, however, it is a lot less fun.

    So far as AVX the problem is as you say: "who cares about AVX?" Intel has shipped a backlog of features that people don't use because of overhead and complexity. As a software dev I get paid to work on certain aspect of my products, and maximizing performance with the latest instructions may or may not be on my agenda. If it is easy to do I will push for it but it means debugging compatibility problems it is a tough ask. "Optimal" performance for a range of users can mean shipping many versions of a function; the performance of loading, installing, updating, those libraries will be not in the least optimal.

    Intel is like that Fatboy Slim album, 'We're #1, Why Try Harder?' The world has changed and Intel is not the #1 CPU firm any more. Intel has to get more Paranoid or it might not Survive.
  • Spunjji - Monday, July 6, 2020 - link

    Why start with "I'm not one to criticise" and then do it? Clearly you are, and as a rhetorical flourish it's tedious in the extreme.

    1 - It's a first-gen product and it shows, but they're putting it in premium products.
    2 - No deep-dive, for sure, but Intel's own figures are not very encouraging.
    3 - Citation needed here. There's no sign of it being used outside of low-power premium devices.
    4 - Who cares about AVX indeed! Tell that to the Intel fanboys pissing all over the AMD threads?

    I'm entirely in favour of your final conclusion, but it's not really supported by the previous statements. 🤷‍♂️
  • Oxford Guy - Friday, July 3, 2020 - link

    Bricklake or bust.
  • Meteor2 - Friday, July 3, 2020 - link

    Ultimately this is another attempt by Intel to stay relevant in a space where it's always struggled: mobile. With the progress being made by Apple, Microsoft, and Qualcomm using ARM, Intel is looking at losing an ever-growing chunk of what was the laptop market.

    But whatever Intel tries, bottom line is that ARM is more efficient than x86.
  • Beaver M. - Friday, July 3, 2020 - link

    Thats not the issue. The issue is that theres not much software in that sector for x86.
  • Valantar - Sunday, July 5, 2020 - link

    A few errors in the article: 2 16-bit channels of LPDDR4X should be 2 32-bit channels of LPDDR4X, given that Renoir (with 4 32-bit LP4X channels at the same clock speed) delivers exactly 2x the bandwidth. Right?

    You should also proofread the pasted-in laptop descriptions; a lot of stuff in them clashes with the previous text.

    Beyond that though: great article! Part of the reason why I love AT is for these technical yet understandable deep-dives. Looking forward to the next one.
  • Pixelpusher6 - Sunday, July 5, 2020 - link

    Interesting choice to place the DRAM right over the core, seems like it would make more sense to move it next to the chip but on package. I guess my question is was it worth the complexity to implement this Foveros design to save a little space? It seems like they could have gotten the same benefit by using a traditional packaging i.e. with a little large package. Can you imagine paying $2500 like the price of that Lenovo and having Atom-esque performance?
  • Pixelpusher6 - Sunday, July 5, 2020 - link

    *larger
  • Farfolomew - Monday, July 6, 2020 - link

    Agreed on the DRAM placement. It seems really out of place. Another "dime size" piece of silicon right next to the Lakefield CPU doesn't seem like it would take up much more board space, and would alleviate a ton of the heat dissipation problems by allowing the compute-layer die to be directly connected to a heatsink
  • serendip - Monday, July 6, 2020 - link

    It seems to be an interesting technical answer to a question nobody asked. Board space is a lot cheaper than what Lakefield would cost. It could also cost more for Intel to produce and they'd be stuck carrying multiple RAM SKUs.

    Heat dissipation could be a major issue. The slow chip could become even slower if it has to constantly throttle down because of thermal loads. Intel is sadly mistaken if this is supposed to be an ARM competitor.

Log in

Don't have an account? Sign up now