E1 Implementation & Performance Targets

As a small CPU core aimed at cost-effective and dense implementations, the Neoverse E1 naturally needs to have a small footprint as well as be power efficient.

Implemented on a 7nm process, Arm's physical design team is able to get an E1 CPU core with 32KB L1 and 128KB L2 caches down to 0.46mm² – all while reaching a clock of 2.5GHz at a power consumption of 183mW. The clock speed was a surprise, as it is notably higher than what we’ve seen vendors achieve with the A55 – although we are talking about different implementation targets.
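For a rough sense of scale, these per-core figures can be extrapolated with some simple arithmetic. The sketch below is our own back-of-envelope estimate based only on the numbers above, not an Arm figure, and it ignores everything outside the cores themselves.

```python
# Back-of-envelope scaling of Arm's quoted per-core E1 figures (7nm, 2.5GHz).
# This deliberately ignores shared cluster logic, the mesh interconnect and
# any system-level cache, so real cluster area/power would be higher.
CORE_AREA_MM2 = 0.46   # per core, including 32KB L1 and 128KB L2
CORE_POWER_W = 0.183   # per core at 2.5GHz

for cores in (8, 16):
    print(f"{cores} cores: ~{cores * CORE_AREA_MM2:.1f} mm², "
          f"~{cores * CORE_POWER_W:.2f} W (cores only)")
```

At 16 cores this works out to roughly 7.4mm² and just under 3W for the cores alone, which lines up with the sub-4W CPU power figure Arm quotes for its reference design further down.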

Arm envisions the most popular implementations of the E1 to be found in lower-power edge applications. At the lower end, designs ranging from 8 to 16 cores would be a good fit for wireless access points and gateways, delivering data throughputs in the 10-25Gbps range. A tier up, we would see 16-32 core designs in use-cases such as edge data aggregation deployments, achieving data rates in the hundreds of Gbps.

The Neoverse E1 reference design that Arm offers, and which it sees as the most popular “sweet-spot”, is based on a 16-core configuration. Here we have two clusters of 8 cores on a small CMN-600 2x4 mesh network, allowing for system cache options as well as integration of additional third-party IP. The envisioned memory system would be a 2-channel DDR4 configuration.
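Arm doesn't specify a DDR4 speed grade for the reference design, but a quick sketch gives an idea of the memory bandwidth such a 2-channel configuration could offer. The DDR4-3200 figure below is purely an illustrative assumption on our part.

```python
# Theoretical peak bandwidth of a 2-channel DDR4 memory system. Arm only
# specifies "2-ch DDR4" for the reference design, so the DDR4-3200 speed
# grade used here is purely an illustrative assumption.
CHANNELS = 2
TRANSFERS_PER_SEC = 3200e6   # assumed DDR4-3200
BYTES_PER_TRANSFER = 8       # 64-bit channel

peak_gb_s = CHANNELS * TRANSFERS_PER_SEC * BYTES_PER_TRANSFER / 1e9
print(f"Theoretical peak DRAM bandwidth: {peak_gb_s:.1f} GB/s")  # ~51.2 GB/s
```

Even at a more conservative speed grade, that leaves a comfortable amount of memory headroom relative to the 25Gb/s (roughly 3.1GB/s) network throughput target discussed below.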

Such an SoC would have a power consumption of less than 15W, of which less than 4W would actually be used by the CPU cores. The estimated SPECint2006 rate score would come in at 153 – which, given the size and power consumption of the platform, is quite impressive. The system would also be capable of 25Gb/s network throughput handled solely by a software transport layer (meaning no hardware acceleration).
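To put those estimates into perspective, the quoted figures can be combined into some rough efficiency metrics. The arithmetic below is our own, and it assumes the 153 estimate applies to the full 16-core (32-thread) configuration and treats the quoted power limits as ceilings.

```python
# Derived efficiency figures from Arm's estimates for the 16-core E1
# reference design. Assumes the SPECint2006 rate estimate of 153 applies to
# the full 16-core (32-thread) configuration and uses the quoted power caps.
SPECRATE = 153
CORES = 16
SOC_W = 15.0   # "less than 15W" for the whole SoC
CPU_W = 4.0    # "less than 4W" for the CPU cores

print(f"~{SPECRATE / CORES:.1f} SPECrate2006 per core")        # ~9.6
print(f">{SPECRATE / SOC_W:.1f} SPECrate2006 per SoC watt")    # >10.2
print(f">{SPECRATE / CPU_W:.1f} SPECrate2006 per CPU watt")    # >38.2
```

Around 9.6 SPECrate per core and north of 10 SPECrate per SoC watt is what underpins the "quite impressive" characterisation above.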

In a per-core comparison against the Cortex-A53 and A55, the new E1 CPU would again offer significant throughput performance benefits, but just as importantly it would represent an efficiency boost over its predecessors (in an iso-process comparison).

Comments

  • wumpus - Thursday, February 21, 2019 - link

    This was supposed to be a reply to "X86 is less efficient than RISC".
  • Neutral - Wednesday, February 20, 2019 - link

    "All of these new microarchitectures are important to Arm because they represent an infliction point in the market"...
    Dictionary result for infliction
    /inˈflikSHən/
    noun
    the action of inflicting something unpleasant or painful on someone or something.
  • Ryan Smith - Wednesday, February 20, 2019 - link

    Whoops, there's one heck of a typo. Thanks!
  • phoenix_rizzen - Thursday, February 21, 2019 - link

    There's several other typos, incorrect word usage, and "correctly spelled but incorrect word" errors in this article.

    Excellent article with a *lot* of great information. But a lot of niggles that should have been picked up by an editor before it was published.
  • GreenReaper - Wednesday, February 20, 2019 - link

    They're bringing *pain* to the competition! :-D
  • Antony Newman - Wednesday, February 20, 2019 - link

    Ryan : I think I counted several (6?) typos - a spellcheck will pick up most of them (there was also a DDR related one with a missing letter).

    AJ
  • eastcoast_pete - Wednesday, February 20, 2019 - link

    @Andrei: Thanks! Sounds very promising, I look forward to ARM-based solutions giving Intel (and AMD) some competitive pressure in the server space, now that SPARC and Power are either down or almost out of this market. Question: Any mention of Microsoft server-type applications being ported to run native on ARM64/Neoverse? That would open a large market for Neoverse and following designs.

    Other comments: So, did Qualcomm get out of the server market exactly at the wrong time? Sure looks that way. Huawei played it smart, using its development and know-how from the mobile space to make itself a serious contender for ARM-based servers (China's push to have non-US solutions for their home markets doesn't hurt them, either).
    The big technical questions for me are: What will the performance and energy use of the CMN-600 mesh network and CCIX be? AMD is trying its chiplet approach, but their first generation had high energy consumption and some performance degradation by the interconnect, which still has them at a disadvantage to Intel's all cores on one die approach. However, scaling up becomes prohibitive as the die gets bigger and transistor counts get higher. I see the interconnect tech as the next big thing for servers and HPC. If they (ARM) and partners can pull this off and be the first with a high-performance, energy-efficient interconnect, they can clobber Intel or AMD just by combining more and more cheap, high-performing cores using chiplets, and do so at a much lower per-core and per-performance cost.
  • SarahKerrigan - Wednesday, February 20, 2019 - link

    Power is not even remotely out of the server market. IBM sells billions of dollars of servers per year, and the top two systems on Top500 are Power.
  • eastcoast_pete - Wednesday, February 20, 2019 - link

    Didn't say that Power is completely out of the server market; SPARC is, thanks to Larry. However, while Power is big in HPC and Supercomputing, it's become rare these days to find a Power-based unit in a server closet or rack for more general use, especially outside of large enterprises. IBM also sells quite a number of Intel-based servers, and I believe that accounts for a sizeable chunk of their remaining server business.
  • SarahKerrigan - Wednesday, February 20, 2019 - link

    That's not accurate. IBM offloaded their x86 server business (to Lenovo) in 2014. At this point, it is just z mainframes and Power.

    My company owns a pair of Power systems, one single-socket and one two-socket, and we're pretty small. It's not just large enterprises. OpenPower has brought down entry costs significantly.
