It’s been nearly 10 years since Arm first announced the Armv8 architecture in October 2011, and it’s been quite an eventful decade of computing as the instruction set architecture saw increased adoption from the mobile space to the server space, and is now starting to become common in consumer devices such as laptops and upcoming desktop machines. Throughout the years, Arm has evolved the ISA with various updates and extensions to the architecture, some important, some perhaps easily overlooked.

Today, as part of Arm’s Vision Day event, the company is announcing the first details of its new Armv9 architecture, setting the foundation for what Arm hopes will be the computing platform for the next 300 billion chips over the next decade.

The big question that readers will likely be asking themselves is what exactly differentiates Armv9 from Armv8 to warrant such a large jump in the ISA nomenclature. Truthfully, from a purely ISA standpoint, v9 probably isn’t as fundamental a jump as v8 was over v7, which introduced a completely different execution mode and instruction set with AArch64, and which had larger microarchitectural ramifications over AArch32 such as extended registers, 64-bit virtual address spaces and many more improvements.

Armv9 continues to use AArch64 as the baseline instruction set, but adds a few very important extensions that warrant an increment in the architecture numbering, and probably also allows Arm to achieve a sort of software re-baselining of not only the new v9 features, but also the various v8 extensions we’ve seen released over the years.

The three main pillars of Armv9 that Arm sees as the goals of the new architecture are security, AI, and improved vector and DSP capabilities. Security is a very big topic for v9 and we’ll go into the new extensions and features in more depth in a bit, but getting DSP and AI features out of the way first should be straightforward.

Probably the biggest new feature that is promised with new Armv9 compatible CPUs that will be immediately visible to developers and users is the baselining of SVE2 as a successor to NEON.

Scalable Vector Extensions, or SVE, was announced in its first iteration back in 2016 and implemented for the first time in Fujitsu’s A64FX CPU cores, now powering the world’s #1 supercomputer, Fugaku, in Japan. The problem with SVE was that this first iteration of the new variable vector length SIMD instruction set was rather limited in scope, and aimed more at HPC workloads, missing many of the more versatile instructions which were still covered by NEON.

SVE2 was announced back in April 2019, and looked to solve this issue by complementing the new scalable SIMD instruction set with the needed instructions to serve more varied DSP-like workloads that currently still use NEON.

The benefit of SVE and SVE2, beyond adding various modern SIMD capabilities, is in their variable vector size, ranging from 128b to 2048b in 128b increments, irrespective of what hardware the code is actually running on. Purely from a vector processing and programming point of view, it means that a software developer would only ever have to compile their code once, and if in the future a CPU came out with, say, native 512b SIMD execution pipelines, the code would already be able to take advantage of the full width of the units. Similarly, the same code would be able to run on more conservative designs with a narrower hardware execution width, which is important to Arm as they design CPUs for everything from IoT, to mobile, to datacentres. It also does all this whilst remaining within the 32b encoding space of the Arm architecture, whereas alternative implementations such as on x86 have to add new extensions and instructions depending on vector size.
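
To make the "compile once, run at any width" idea a little more concrete, here is a minimal scalar sketch in C of how a vector-length-agnostic loop is structured. It is only a model, not real SVE code: the function names and the `vl_bits` parameter are assumptions made for illustration, and actual SVE code would query the vector length from the hardware at run time and use predicate registers rather than an inner scalar loop.

```c
#include <stddef.h>

/* Hypothetical helper: checks the width rule described in the text,
 * i.e. 128b to 2048b in 128b steps. */
int is_valid_vl(size_t vl_bits)
{
    return vl_bits >= 128 && vl_bits <= 2048 && vl_bits % 128 == 0;
}

/* Scalar model of a vector-length-agnostic loop: out[i] = a[i] + b[i].
 * vl_bits stands in for the hardware vector width; the loop body never
 * hard-codes it. Returns the number of "vector" iterations executed. */
size_t vla_add_f32(const float *a, const float *b, float *out,
                   size_t n, size_t vl_bits)
{
    size_t lanes = vl_bits / 32;   /* number of 32-bit lanes per vector */
    size_t iterations = 0;

    for (size_t i = 0; i < n; i += lanes) {
        /* Per-lane predicate: a lane is active only while i + lane < n,
         * so the final partial vector needs no separate tail loop. */
        for (size_t lane = 0; lane < lanes; lane++) {
            if (i + lane < n)
                out[i + lane] = a[i + lane] + b[i + lane];
        }
        iterations++;
    }
    return iterations;
}
```

Calling `vla_add_f32` on the same 10-element arrays with `vl_bits` set to 128 and then 512 produces identical output, but takes three iterations in the first case and one in the second, which is the property the paragraph above describes: the same code simply does more work per step on wider hardware.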

Machine learning is also seen as an important part of Armv9, as Arm expects more and more ML workloads to become commonplace in the coming years. Running ML workloads on dedicated accelerators will naturally still be a requirement for anything that is performance or power efficiency critical, however there will still be vast new adoption of smaller-scope ML workloads that will run on CPUs.

Matrix multiplication instructions are key here, and making them a baseline feature of v9 CPUs will represent an important step towards larger adoption across the ecosystem.
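
Arm hasn't detailed the instructions here, so the following is only a rough scalar sketch in C of the kind of work a single integer matrix multiply-accumulate instruction performs. The 2x8 by 8x2 int8 tile shape, the layout of the second operand, and the function name `mmla_i8_2x8` are all assumptions chosen for illustration.

```c
#include <stdint.h>

/* Scalar model of one int8 matrix multiply-accumulate step (assumed shape):
 *   acc (2x2, int32) += a (2x8, int8) * transpose(b)
 * where b is stored as 2 rows of 8, i.e. already transposed. */
void mmla_i8_2x8(int32_t acc[2][2],
                 const int8_t a[2][8],
                 const int8_t b[2][8])
{
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 2; j++) {
            int32_t sum = 0;
            /* Dot product of row i of a with row j of b, widened to
             * 32 bits before multiplying so nothing overflows. */
            for (int k = 0; k < 8; k++)
                sum += (int32_t)a[i][k] * (int32_t)b[j][k];
            acc[i][j] += sum;
        }
    }
}
```

The point of doing this in hardware is density: the 32 multiply-accumulates in the loops above would be issued as one instruction, which is where the benefit for small on-CPU ML workloads would come from.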

Generally, I see SVE2 as probably the most important factor that would warrant the jump to a v9 nomenclature, as it’s a more definitive ISA feature that differentiates it from v8 CPUs in everyday usage, and one that would justify the software ecosystem actually diverging from the existing v8 stack. That’s actually become quite a problem for Arm in the server space, as the software ecosystem is still baselining software packages on v8.0, which unfortunately is missing the all-important v8.1 Large System Extensions.
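
As a small illustration of why the v8.0 baseline matters, consider a plain C11 atomic counter. The function name `counter_add` is hypothetical, and the code generation described in the comments is what GCC and Clang typically produce rather than a guarantee.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically add delta to *counter and return the previous value.
 *
 * Built for a v8.0 baseline, compilers typically lower this to a
 * load-exclusive/store-exclusive (LDXR/STXR) retry loop. Built with the
 * v8.1 Large System Extensions enabled (for example -march=armv8.1-a),
 * it can become a single LDADD instruction, which behaves far better
 * under contention on systems with many cores. */
uint64_t counter_add(_Atomic uint64_t *counter, uint64_t delta)
{
    return atomic_fetch_add_explicit(counter, delta, memory_order_relaxed);
}
```

The source code is identical in both cases; the difference lies entirely in which architecture level a distribution's packages are compiled for, which is why a higher common baseline would help.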

Having the whole software ecosystem move forward and be able to assume that new v9 hardware has the capability of the new architectural extensions would help push things ahead, and probably resolve some of the current situation.

However, v9 isn’t only about SVE2 and new instructions; it also has a very large focus on security, where we’ll be seeing some more radical changes.

Introducing the Confidential Compute Architecture

76 Comments

  • skavi - Tuesday, March 30, 2021

    So is Matterhorn v8? I thought it was pretty much expected to launch with v9.
  • dotjaz - Friday, April 2, 2021

    "Armv9 designs to be unveiled soon, devices in early 2022"
    What exactly do you think? ARM will release Matterhorn v8 and Something v9 back to back expecting nobody to use v8 and Qualcomm and Samsung to tape out Something v9 which should be happening NOW for a Q4 production and early 2022 release?

    How stupid does that sound?
  • brucethemoose - Tuesday, March 30, 2021

    SVE2 is a huge existential threat for x86.

    Even if Intel, AMD, and VIA's subsidiaries agreed to standardize variable-width SIMD instructions overnight, ARM is still going to beat them to the punch. Heck, Intel couldn't even standardize AVX512 within their own product stack.
  • lmcd - Tuesday, March 30, 2021

    A) VIA doesn't matter.
    B) Intel and AMD could standardize this overnight.
    C) If they standardize this overnight, the only ARM implementation that will beat Intel and AMD to the punch will be internal-only Amazon chips and Apple. Might as well be a win.
  • brucethemoose - Tuesday, March 30, 2021

    Cores take a long time to design and produce. ARM and their licensees presumably have some SVE2 designs in the pipeline by now.

    In addition, Fujitsu, Qualcomm (via Nuvia), Ampere, and Nvidia/ARM all have pretty compelling shots at competitive designs. There are probably more.

    AMD and Intel could be cooperating in secret, but that would be surprising. It would also catch developers by surprise, unless they do something simple like solidify AVX512 across the board, and break up instructions on smaller cores kinda like Zen 1 does.
  • lmcd - Tuesday, March 30, 2021

    The SVE2 core designs might be in the pipeline but my point is that the transition from core design -> SoC release appears to be pretty slow still.

    I suppose the data center SoCs might match or slightly beat an Intel/AMD implementation. I still can't see that mattering as much as making it available to developers on local hardware. Until there's a dev loop on a single affordable local device running mainline Linux or Windows with modern WDDM that supports SVE2, it's not a threat. It only affects data centers that are either priced into keeping their current architecture, or are too big to care and already switched.

    If Qualcomm delivers one of those in a laptop SoC, that could change the game. But imo that won't happen before Intel/AMD deliver.
  • TheinsanegamerN - Tuesday, March 30, 2021

    We've heard repeatedly that (X) will be the downfall of x86 for years now. ARM was prophesied in 2013 as the next big thing, and it went nowhere. SVE2 will only become a "threat" to x86 if implementations are available across the industry.
  • michael2k - Tuesday, March 30, 2021

    TSMC, not ARM, is currently the biggest threat to x86.
    After TSMC will be Samsung.
    Behind those two it is Apple, not ARM, that is the biggest threat to x86

    And they are all different threats. ARM is slowly displacing x86 as more and more people use Android, iOS, and Chromebooks, and including Macs Intel's market share has dropped a measurable amount in the last decade, assuming Apple doesn't lose customers over their ARM switch.
  • dotjaz - Tuesday, March 30, 2021

    You stupid? Zen2 onwards are built by TSMC, how are they a "threat" to x86? Intel ≠ x86.
  • HardwareDufus - Friday, April 2, 2021

    you are a rather offensive and unpleasant person.... why do you repeatedly say things like are you stupid, that sounds stupid, are you on drugs?

    can you find a nicer way to express your disagreement with what others have posted?
