When speaking about Arm in the enterprise space, the main angle for discussion is the CPU side. Having a high-performance SoC at the heart of the server has been a key goal for many years, and players such as Amazon, Ampere, Marvell, Qualcomm, Huawei, and others have made a play for the server market. The other angle of attack is co-processors and accelerators. Here we have one main participant: Fujitsu. We covered the A64FX when the design was disclosed at Hot Chips last year, with its super high cache bandwidth, and it will be available on a simple PCIe card. The main end-point for a lot of these cards will be the Fugaku / Post-K supercomputer in Japan, which we expect to land one of the top spots on the TOP500 supercomputer list next year.

After the design disclosure last year at Hot Chips, at Supercomputing 2018 we saw an individual chip on display. This year at Supercomputing 2019, we found a wafer.

I just wanted to post some photos. Enjoy.

 

The A64FX is the main recipient of the Arm Scalable Vector Extension (SVE), new to Armv8.2, which in this instance gives 48 computing cores with 512-bit wide SIMD, fed by 32 GiB of HBM2. Inside the chip is a custom network, and externally the chip is connected via a Tofu interconnect (6D/Torus). The chip provides 2.7 TFLOPS of DGEMM performance. It is built on TSMC 7nm and has 8.786 billion transistors, but only 594 pins. Peak memory bandwidth is 1 TB/s.
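To put those headline numbers in proportion, here is a rough back-of-the-envelope sketch. It uses only the figures quoted above, except for `peak_fp64_tflops`, whose FMA pipe count and clock speed are not given in this article and are purely illustrative assumptions.

```python
# Figures quoted in the article for the A64FX.
CORES = 48            # computing cores
SIMD_BITS = 512       # SIMD width per core
DGEMM_TFLOPS = 2.7    # quoted DGEMM performance
MEM_BW_TBPS = 1.0     # quoted peak memory bandwidth, TB/s

FP64_BITS = 64


def fp64_lanes(simd_bits: int = SIMD_BITS) -> int:
    """Number of FP64 values that fit in one SIMD vector."""
    return simd_bits // FP64_BITS


def bytes_per_flop(bw_tbps: float, tflops: float) -> float:
    """Memory bytes available per floating-point operation."""
    return bw_tbps / tflops


def peak_fp64_tflops(cores: int, lanes: int,
                     fma_pipes: int, clock_ghz: float) -> float:
    """Theoretical FP64 peak, counting each fused multiply-add as 2 flops.

    fma_pipes and clock_ghz are NOT stated in the article; callers must
    supply their own assumed values.
    """
    return cores * lanes * fma_pipes * 2 * clock_ghz / 1000.0


if __name__ == "__main__":
    print("FP64 lanes per vector:", fp64_lanes())
    print("Bytes per DGEMM flop: %.2f"
          % bytes_per_flop(MEM_BW_TBPS, DGEMM_TFLOPS))
    # Hypothetical inputs: 2 FMA pipes per core at 2.0 GHz.
    print("Assumed peak FP64 TFLOPS: %.3f"
          % peak_fp64_tflops(CORES, fp64_lanes(), 2, 2.0))
```

With the quoted figures, each 512-bit vector holds eight FP64 values and the chip has a little over a third of a byte of memory bandwidth for every DGEMM flop. The peak figure only means something for whatever pipe count and clock you plug in.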

The chip is built for high performance, high throughput, and high performance per watt, supporting data types from FP64 through to INT8. The L1 data cache is designed for sustained throughput, and power management is tightly controlled on chip. Either way you slice it, this chip is mightily impressive. We even saw HPE deploy two of these chips in a single half-width node.


23 Comments


  • prisonerX - Thursday, December 5, 2019

    Intel can't compete with this kind of high core count and high bandwidth memory strategy because their arch is just too complicated and runs too hot. The future is increasingly parallel and ARM is positioned well, but RISC-V probably has the drop on it long term.
  • jeffsci - Thursday, December 5, 2019

    https://ark.intel.com/content/www/us/en/ark/produc... shipped three years ago with up to 72 cores, ~490 GB/s memory bandwidth and ~2 TF/s DGEMM.
  • Meteor2 - Thursday, December 5, 2019

    And was discontinued shortly afterwards.
  • tuxRoller - Thursday, December 5, 2019

    Err, that link says bandwidth is 115.2GB.
  • Spunjji - Friday, December 6, 2019

    1/8 the memory bandwidth (it's 1/4 what you stated here), inferior interconnect capabilities, and you can't buy it anymore.

    Certainly seems a lot like they couldn't make it compete.
  • Vatharian - Friday, December 6, 2019

    First generation used Pentium Pro with 4T/1C extensions. Second generation used modified Atom cores. These underperform. Third will do the same (newer arch step), but demand could be too low for that to even surface. Only saving grace for at least socketed Phi was Omnipath. Trouble is low-power x86 doesn't scale well sideways without massive interconnects that eat silicon and power budget, and given the sole reason for x86 at this point is backward compatibility, it is a really inflexible mess. Redesigning it from the ground up is troublesome, with minimal improvement for a great amount of work. Intel never felt great with HUGE fabrics, too. Otherwise we would see more chips like the doomed Avoton, but e.g. in a 256C configuration already.
  • FreckledTrout - Thursday, December 5, 2019

    Interesting. That memory bandwidth is spectacular. I wonder how this will compete with AMD's EPYC in similar core counts? Seems ARM has a leg up in some areas.
  • Betty66 - Sunday, December 8, 2019

    #1 in green top500 https://www.top500.org/green500/lists/2019/11/
  • willis936 - Thursday, December 5, 2019

    Fujitsu ARM? Then what does Socionext do? Sorry, these companies are not easy to get straight.
  • arnd - Thursday, December 5, 2019

    Socionext is a fabless embedded SoC design company and a joint venture of Fujitsu and Panasonic.

    Fujitsu builds its own server/supercomputer/mainframe chips and apparently acts as a foundry for fabless semiconductor companies.
