The Neoverse V1 Microarchitecture: Platform Enhancements

Aside from the core-side microarchitectural aspects of the V1, the new design also features some new system-facing novelties that promise to help vendors integrate the CPU IP better in larger scale implementations.

MPAM, or Max Power Mitigation Mechanism is a new fine-grained (to around 100 clock cycles) power management mechanism that promises to help smooth out the power behaviour of the core, and allow vendors’ implementations of the chip’s power delivery mechanisms to be so to say, be built to lesser requirements.

As we’ve seen in our review of the Ampere Altra, instead of fluctuating frequency at maximum TDP like how most x86 CPUs behave right now, the chip rather prefers to stay most of the time at maximum frequency, with the actual power consumption many times landing in at quite below the TDP (maximum allowed power consumption).  A mechanism such as MPAM would allow, if possible, for the system’s average frequency to be higher by throttling the power limited cores to a finer degree. The mechanism to which this can be achieved can also include microarchitectural features such as dispatch throttling where the core slows down the dispatched instructions, smoothing out high power requirements in workloads having high execution periods, particularly important now with the new wider 2x256b SVE pipelines for example.

MPAM is a different mechanism helping interactions in larger system implementations. The Memory partitioning and monitoring feature is supposed to help with quality of service and reducing side-effects of noisy neighbours in deployments where multiple workloads, such as multiple VMs or processes, operate on the same system. This naturally requires software-hardware cooperation and implementation, but should be something that is particularly helpful in cloud environments.

CBusy or Completer Busy is also a new system-side mechanism where the CPU cores interact with the mesh interconnect on a feedback-based basis, where the CPUs can vary their memory prefetcher aggressiveness depending on the overall mesh and system memory load. This ties in with the previously mentioned dynamic prefetcher behaviour where one can have the best of both worlds – better prefetching for more performance per core when the bandwidth is available, and very conservative prefetching when the system is under high load and there’s no room for wasted speculative bandwidth and data transfers.

The Neoverse V1 Microarchitecture: X1 with SVE? The Neoverse N2 Microarchitecture: First Armv9 For Enterprise
Comments Locked

95 Comments

View All Comments

  • Dug - Tuesday, April 27, 2021 - link

    Now is when I wish ARM was publicly traded.
  • mode_13h - Tuesday, April 27, 2021 - link

    Well, you could buy NVDA, under the assumption the acquisition will go through.
  • dotjaz - Thursday, April 29, 2021 - link

    SoftBank is already publicly traded on the Tokyo Stock Exchange. Why rely on NVIDIA buyout which for all likelihood won't happen any time soon if at all.
  • mode_13h - Thursday, April 29, 2021 - link

    > SoftBank is already publicly traded on the Tokyo Stock Exchange.

    They also invested heavily in WeWork, when it was highly over-valued. I have no idea what other nutty positions they might've taken, but I think it's not a great proxy for ARM just due to its sheer size.
  • cjcoats - Tuesday, April 27, 2021 - link

    As an environmental modeling (HPCC) developer: what is the chance of putting a V1 machine on my desk in the foreseeable future?
  • Silver5urfer - Tuesday, April 27, 2021 - link

    Never. Since there has to be an OEM for these chips to put in DIY and Consumer machines, so far except the HPE's A64FX ARM there's no way any consumer can buy these ARM processors and that is also highly expensive over 5 digit figure. And then the drivers / sw ecosystem comes into play, there's passion projects like Pi as we all know but they are nowhere near the Desktop class performance.

    ARM Graviton 2 was made because AWS wants to save money on their Infrastructure, that's why their Annapurna design team is working there. Simply because of that reason Amazon put more effort onto it AND the fact that ARM is custom helps them to tailor it to their workloads and spread their cost.

    Altra is niche, Marvell is nowhere near as their plans was to make custom chips on order. And from the coverage above we see India, Korea, EU use custom design licensing for their HPC Supercomputer designs.

    Then there's a rumor that MS is also making their own chips, again custom tailored for their Azure, Google also rumored esp their Whitechapel mobile processor (it won't beat any processor on the market that's my guess) and maybe their GCP oriented own design.

    These numbers projection do look good vs x86 SMT machines finally to me after all these years, BUT have to see how they will compete once they are out vs 2021 HW is the big question, since if these CPUs outperform the EPYC Milan technically AWS should replace all of them right ? since you have Perf / Power improvements by a massive scale. Idk, gotta see. Then the upcoming AMD Genoa and Sapphire Rapids competition will also show how the landscape will be.
  • SarahKerrigan - Tuesday, April 27, 2021 - link

    If they don't replace all the x86 systems in AWS with ARM, that *must* mean Neoverse is somehow secretly inferior, right??

    Or, you know, it could mean that x86 compatibility matters for a fair chunk of the EC2 installed base, especially on the Windows Server side (which is not small) but on Linux too (Oracle DB, for instance, which does not yet run on ARM.)
  • Silver5urfer - Tuesday, April 27, 2021 - link

    That was a joke.
  • Spunjji - Friday, April 30, 2021 - link

    Was it, though? Schrodinger's Joke strikes again.
  • Raqia - Tuesday, April 27, 2021 - link

    Maybe not an V1 but you could probably get a more open high performance ARM core than the Apple MX series pretty soon:

    https://investor.qualcomm.com/news-events/press-re...

    "The first Qualcomm® Snapdragon™ platforms to feature Qualcomm Technologies' new internally designed CPUs are expected to sample in the second half of 2022 and will be designed for high performance ultraportable laptops."

Log in

Don't have an account? Sign up now