Intel has already disclosed that it will have a next generation Atom core, code named Tremont, which is to appear in products such as the Foveros-based hybrid Lakefield, as well as Snow Ridge designed for 5G deployments. In advance of the launch of the core and the product, it is customary for some documentation and tools to be updated to prepare for it; in this case, one of those updates has disclosed that the Tremont core would contain an L3 cache – a first for one of Intel’s Atom designs.

01.org is an Intel website which hosts all of its open source projects. One of those projects is perfmon, a simple performance monitoring tool that can be used by developers to direct where code may be bottlenecked by either throughput, memory latency, memory bandwidth, TLBs, port allocation, or cache hits/misses. In this case, the profiles for Snow Ridge have been uploaded to the platform, and one of the counters provided includes provisions for L3 cache monitoring. This provision is directly listed under the Tremont heading.

Enabling an L3 cache on Atom does two potential things to Intel’s design: it adds power, but also adds performance. By having an L3, it means that data in the L3 is quicker to access than it would be in memory, however there is an idle power hit by having L3 present. Intel can mitigate this by enabling parts of the L3 to be powered on as needed, but there is always a tradeoff. There can also be a hit to die area, so it will be interesting to see how Intel has changed the microarchitecture of it’s Atom design. There is also no indication if the Tremont L3 cache is an inclusive cache, or a non-inclusive cache, or if it can be pre-fetched into, or if it is shared between cores or done on a per-core basis.

Intel’s Atom roadmap, as disclosed last year at Architecture day, shows that the company is planning several more generations of Atom core, although beyond Tremont we get Gracemont in 2021, and beyond that is ‘increased ST Perf, Frequency, Features’ listed around 2023. In that time, Intel expects to launch Sunny Cove, Willow Cove, and Golden Cove on the Core side.


Lakefield

The first public device with Tremont inside is expected to be the Core/Atom hybrid Lakefield processor, which uses Intel’s new Foveros stacking technology. We know that this design will have one Sunny Cove core and pair it with four Tremont cores. Intel expects chip production of Lakefield for consumer use by the end of the year.

Related Reading

Source: InstLatX64, 01.org

POST A COMMENT

63 Comments

View All Comments

  • konbala - Monday, July 15, 2019 - link

    ARM's x86 emulation is so good now it is ridiculous, can the new nano lineup beat it? Its efficiency gonna be life or death determining for x86 I think. Reply
  • HStewart - Monday, July 15, 2019 - link

    I thought for second this was April 1st, this sounds like a April Fools joke believing x86 emulation. Reply
  • mode_13h - Monday, July 15, 2019 - link

    Emulation done right is like just-in-time recompilation. There's always a performance penalty vs. fully recompiled code, but it can get pretty close. Reply
  • HStewart - Monday, July 15, 2019 - link

    Big difference between ARM and x86 based cpu is simple RISC vs CISC which by definition means more RISC instructions are required for single CISC instruction. But fortunately for most applications is they are likely a simple set of instructions. They might be ok for most application but more complex application uses higher level instructions. I am curious if application designed to work with AVX or higher other instruction. I would think anything that uses Machine language would have difficultly. But like something with .NET could be made portable in both architexture quite easy. In fact a .NET interpreted library could made in native ARM, Reply
  • Wilco1 - Tuesday, July 16, 2019 - link

    .NET already compiles to native Arm of course! It's only old x86 applications that need translation. And the performance of the translated code is not all that important for Windows applications which spend much of their time in native Windows libraries.

    Microsoft demonstrated good Windows performance on a 2 year old Arm SoC (https://www.youtube.com/watch?v=PaSmZzo3Y_c), and performance has doubled since then. Cortex-A77 will be out in a few months and is about 3 times faster. Emulation is more than fast enough.
    Reply
  • Zoolook13 - Friday, July 19, 2019 - link

    Emulation done right is keeping the recompiled native binaries, speeding up the application the more you run it until it's completely native, like DEC's FX!32.

    A similar approach with todays compiler technology and processing power and largely similar dev librarys on many platforms could yield you close to native speed at the cost of a larger storage footprint, but the compiled code part of an application is much less today, GUI resources etc is a much larger part of the footprint today than back then.
    Reply
  • npz - Monday, July 15, 2019 - link

    No it's not. Every test of non-native win32 binaries running on ARM has proven slow. And it can't even run 64-bit binaries. Reply
  • HStewart - Monday, July 15, 2019 - link

    One OS that would be better suited for Snapdragon machines is Chrome OS, would not have to worried about emulation. Reply
  • Wilco1 - Wednesday, July 17, 2019 - link

    Emulation is not a worry, a Cortex-A77 would emulate x86 code faster than this future Atom. Reply
  • lmcd - Monday, July 15, 2019 - link

    It's not "ARM's x86 emulation," it's a Windows-specific technology. Windows uses a HAL for x86 on x86_64, and they added a similar layer for x86 on ARM. The times when the HAL isn't molasses are when it can drop in an ARM binary instead of an x86, and the binary is doing most of the heavy work. In essence, the "emulation" is fast when it's not emulation. Reply

Log in

Don't have an account? Sign up now