Fujitsu Preps Monaka Datacenter CPU to Succeed A64FX: Greater Efficiency and More Features
by Anton Shilov on March 10, 2023 2:00 PM EST
Fujitsu has revealed that it is prepping a successor to its A64FX processor for high-performance computing. The company's second-generation Arm-based server CPU is slated to offer considerably higher performance and energy efficiency than its predecessor, and will add features to address AI and data analytics applications. The CPU is codenamed Monaka; it will arrive sometime in 2027 and will power a next-generation supercomputer due in 2028.
Like the original A64FX, Fujitsu's Monaka will once again be an Arm ISA processor. But it will also integrate hardware to accelerate artificial intelligence (AI) and data analytics applications, according to details released by the company at its ActivateNow: Technology Summit at the Computer History Museum in Mountain View, California, reports The Register.
The promise to boost performance in traditional HPC and emerging AI workloads is logical. Although Fujitsu's existing A64FX already has support for 512-bit Scalable Vector Extensions (SVE) and can operate in FP64, FP32, FP16 and INT8 modes for a variety of AI and traditional supercomputer applications, the rapidly developing field of AI workloads has been adopting new data formats beyond FP16 and INT8. Meanwhile, retaining the Arm architecture will ensure that the Monaka processor will be able to run code developed for the original A64FX CPU as well as for other Arm-based systems-on-chip for datacenters.
"The next-generation DC CPU (Monaka) that we are developing will have a wider range of features and will prove more energy efficient," a Fujitsu spokesperson told The Register. "The range of potential applications is wider than that of the A64FX, which has special characteristics (e.g., interconnects) specific to Fugaku."
One of Fujitsu's main goals with Monaka is to provide 'overwhelming energy efficiency' when compared with competing processors available at the time, according to The Register, which cites company officials. The firm is aiming to deliver 70% higher overall performance and 100% higher performance-per-watt than competing chips. With delivery not expected until 2027, though, any competitive performance expectations are aspirational at best.
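As a rough illustration of how those two targets relate: if both the 70% performance gain and the 100% performance-per-watt gain were measured against the same competing chip (an assumption; the source does not say so), the implied power draw follows from dividing one ratio by the other. The function name below is hypothetical and the numbers are simply the stated targets, not measured results.

```python
def implied_power_ratio(perf_gain: float, perf_per_watt_gain: float) -> float:
    """Return power draw relative to a baseline chip.

    Both gains are fractional, e.g. 0.70 for "70% higher".
    Assumes both gains are quoted against the same baseline (not confirmed by the source).
    """
    perf_ratio = 1.0 + perf_gain                # 1.7x performance
    efficiency_ratio = 1.0 + perf_per_watt_gain  # 2.0x performance-per-watt
    # power = performance / (performance per watt)
    return perf_ratio / efficiency_ratio


ratio = implied_power_ratio(0.70, 1.00)
print(f"Implied power vs. baseline: {ratio:.2f}x")
```

Under that assumption, the targets would imply a chip drawing roughly 15% less power than the competitor it is compared against while delivering 1.7 times the performance.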
Fujitsu's current 48+4-core A64FX processor for HPC has proven that the Arm architecture is perfectly capable of powering supercomputers, in this case Fugaku, which was the world's fastest supercomputer from 2020 to 2022. But the CPU is chiefly tailored for traditional supercomputer workloads, and as a result it's only been used in a handful of systems, including Fugaku, Fujitsu's PrimeHPC FX700 and FX1000 systems (which are available for purchase), and HPE's Apollo 80 HPC platform.
Monaka, in turn, will allow Fujitsu to take a stab at supplying the broader HPC market with a high-performance Arm processor. While the company isn't offering specific technical details at this time, it is making clear that it is designing the chip for a wider audience, as opposed to the supercomputer-focused A64FX and its niche features like on-package HBM2 and the Tofu Interconnect D fabric for connecting multiple nodes in a cluster. Shifting to a broader audience opens up more sales opportunities for Fujitsu, but it will put the company in more direct competition with other Arm server CPU vendors such as NVIDIA, Ampere, and the many internal projects at hyperscalers.
In any case, it'll be interesting to see how things unfold once Monaka arrives in 2027. The Arm server CPU market has quickly blossomed over the last few years, so by the time Monaka hits the scene, it's going to be coming into a market with lots of opportunity for Arm servers and Arm software, but also a market with no shortage of companies trying to claim their piece of the pie.
Source: The Register
dwillmore - Friday, March 10, 2023
I just googled for the price of those HPE Apollo 80 HPC boxes. Yeah, family car price range.
Will the new chip get us down to the cheap car/nice motorcycle range?
Threska - Friday, March 10, 2023
I think the point of these is to show that ARM is competitive with the x86 architecture. Now all we need is some PowerPC and RISC-V.
Cooe - Saturday, March 11, 2023
PowerPC literally doesn't exist anymore and hasn't for like over a decade... You're thinking of IBM POWER, which is already decently competitive for Big Iron.
mode_13h - Saturday, March 11, 2023
I don't see how they'll be able to repeat the performance or efficiency leaps they achieved with A64FX. It was the first out of the gate with SVE, but now others are doing it. And their hard-wired AI accelerator will be challenging industry players with several generations under their belts.
They have one possible advantage on efficiency, which is that if you have a sufficiently large budget, you can add more nodes and clock them lower. If GPU-based supercomputers wanted better efficiency numbers, that's all they'd have to do. However, budget constraints mean they have to run a smaller number of GPUs well outside their optimal efficiency range.
Since Fugaku was designed in the post-Fukushima era, energy was probably very expensive. That could've pushed them to budget more on equipment, for the benefit of lower operating costs.
brucethemoose - Sunday, March 12, 2023
HPC customers like really wide SIMD, if they go for that again. Other SVE2 implementations (other than SiPearl's mysterious and delayed(?) design) are 128 bit or 256 bit, and sometimes a bunch of wide cores are a better fit than a GPU.
brucethemoose - Monday, March 13, 2023
Actually I think this is incorrect, it may be a stock ARM core.
mode_13h - Monday, March 13, 2023
> sometimes a bunch of wide cores are a better fit than a GPU.
GPUs have a very weak memory model, and that really helps with scaling.
If you're running intrinsically branchy code, then GPUs' SIMD-oriented programming model might indeed be a poor fit. But, you're going to have more overhead from running it on a cache-coherent CPU with lots of cores.
Where I think A64FX did so well on the efficiency front is that their cores were wide, relatively simple, and clocked rather conservatively. Scaling the performance of such a CPU will necessarily come at the expense of efficiency. Especially since, for the more general kinds of server workloads they want to address, you're going to need more complex cores.
Silver5urfer - Saturday, March 11, 2023
Specialized use cases with custom IP blocks for acceleration of specific workloads. That's what ARM is best at. But for the people who want power and performance, the only solution is x86. ARM cannot replace that even if Fujitsu's A64FX successor is 100x faster.
Look at TR Pro, most of the home users who want a simple Server cannot buy, extremely high cost and lack of even users across various forums to troubleshoot. Now look at used Xeon, Opteron and other x86 CPUs. Abundance of resources, you can buy any Mobo, get a chip and start your own Proxmox, VMWare instances or any piece of code or HW PCIe expansion cards, SAS cards etc you name it.
That's the beauty of x86, I look forward to own a Xeon / EPYC system hopefully soon.
Dolda2000 - Sunday, March 12, 2023
This isn't so much a collection of custom accelerators, as much as it is a GPU with a CPU-like memory model and the ability to run an operating system and handle page faults. Kind of like Xeon Phi, but hopefully working better.
That's a wonderful thing, and something that I hope we'll see a lot more of in the future. I've been hoping since A64FX that it would work out well for them, and the fact that they're making a second generation of it is perhaps a positive indication.
brucethemoose - Monday, March 13, 2023
We will see.
The slides make it sound more like a server CPU with some accelerators, not something with a weird core/memory config like the A64FX, but I too hope it stays weird.