Conclusion & End Remarks

We’ve been hearing about Arm in the server space for many years now, with many people claiming “it’s coming” and “it’ll be great”, only for the hype to fizzle out into relative disappointment once the performance of the chips was put under the microscope. Thankfully, this is not the case for the Graviton2: not only were Amazon and Arm able to deliver on all of their promises, but they've also hit it out of the park in terms of value against the incumbent x86 players.

The Graviton2 is the quintessential reference Neoverse N1 platform as envisioned by Arm, aiming for nothing less than disruption of the datacentre market and making Arm servers a competitive reality. The chip is not only able to compete in terms of raw throughput thanks to its 64 physical cores in a single socket, but it also manages to showcase competitive single-threaded performance, keeping pace with the AMD and Intel systems on the market.

The Amazon chip isn’t perfect: we definitely would have wanted to see more L3 cache integrated into the mesh interconnect, as 32MB (a mere 512KB per core when shared across 64 cores) does seem quite mediocre, and the chip does suffer in this regard when it comes to performance scaling in memory-heavy workloads. Only Amazon knows whether this is a real-world bottleneck for the chip in the kinds of workloads that are typical in the cloud.

Performance-wise, there’s a big empty outline of an elephant in the room that's been missing from our data today, and that’s AMD’s new EPYC2 Rome processors. AMD has shown it was able to vastly scale up performance and do away with a lot of the limitations of the first-generation EPYC processors that we saw today. Even if we can somewhat estimate the performance Rome would deliver against the Graviton2, we have no idea what kind of pricing Amazon will launch the new c5a-type instances at.

In terms of value, the Graviton2 seemingly ends up with top grades and puts the competition to shame. This is due not only to the Graviton2’s performance and efficiency, but also to the fact that Amazon is now vertically integrated for its EC2 hardware platforms. If you’re an EC2 customer today, and unless you’re tied to x86 for whatever reason, you’d be stupid not to switch over to Graviton2 instances once they become available, as the cost savings will be significant.
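For readers who want to sanity-check that value claim themselves, the comparison ultimately reduces to price per vCPU-hour and performance per dollar. The short sketch below only illustrates the arithmetic; the instance names, prices, and throughput figures in it are hypothetical placeholders rather than actual EC2 rates or benchmark scores, which you'd substitute from the cost analysis page.

```python
# Minimal sketch of how the value comparison boils down to simple arithmetic:
# price per vCPU-hour and performance per dollar. All figures below are
# hypothetical placeholders, not actual EC2 pricing or benchmark results.

def cost_per_vcpu_hour(hourly_price: float, vcpus: int) -> float:
    """Normalize an instance's hourly price to a per-vCPU figure."""
    return hourly_price / vcpus

def perf_per_dollar(throughput: float, hourly_price: float) -> float:
    """Aggregate benchmark throughput divided by the hourly instance price."""
    return throughput / hourly_price

# (hourly price in $, vCPU count, relative aggregate throughput) - placeholders.
instances = {
    "arm_instance": (2.00, 64, 100.0),
    "x86_instance": (3.00, 64, 105.0),
}

for name, (price, vcpus, perf) in instances.items():
    print(f"{name}: ${cost_per_vcpu_hour(price, vcpus):.4f} per vCPU-hour, "
          f"{perf_per_dollar(perf, price):.1f} perf per $/hour")
```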

What does this mean for non-Amazon users? Well, Arm servers have become a reality, and companies such as Ampere, with their new Altra server chips, are trying to quickly follow up with the same recipe as the Graviton2 and offer similar ready-made meals to the non-Amazons of the world. These chips will however have to compete with AMD’s Rome, and later in the year the new Milan, which won’t be easy. Meanwhile, Intel doesn’t seem to be a likely competitor in the short term while the company attempts to resolve its issues.

Long-term, things are looking bright for the Arm ecosystem. Arm themselves are aiming to maintain a 20-25% compound annual growth rate in performance, and Ampere has already stated they’re looking at yearly hardware refreshes. We don’t know Amazon’s plans, but I imagine they’ll be similar, if not skipping some generations. Around the 2022 timeframe we should see Matterhorn-based products, Arm’s new Very Large™ CPU microarchitecture, which should again accelerate things dramatically. In a similar sense, the newly founded Nuvia has lofty goals for its entrance into the datacentre market, and it does have the design talent and track record to possibly deliver in a few years’ time.
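To put that roadmap figure in perspective, a sustained 20-25% compound annual improvement adds up quickly. The snippet below is purely a back-of-the-envelope projection under the assumption that the stated growth rate actually holds; it is not based on any announced product data.

```python
# Back-of-the-envelope compounding of a 20-25% yearly performance uplift.
# Purely illustrative; assumes the stated growth rate holds each year.
for rate in (0.20, 0.25):
    for years in (1, 2, 3):
        print(f"+{rate:.0%} per year over {years} year(s): "
              f"{(1 + rate) ** years:.2f}x cumulative")
```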

The Graviton2 is a great product, and we’re looking forward to seeing more such successful designs from the Arm ecosystem.

Comments

  • eek2121 - Tuesday, March 10, 2020 - link

    It is worth noting AnandTech’s own numbers: https://www.anandtech.com/show/14694/amd-rome-epyc...
  • RallJ - Tuesday, March 10, 2020 - link

    I understand that, but considering everything boils down to just $/vCPU/hr, I think a discussion around the new Xeon Gold R is warranted. For example, the existing dual-socket Xeon that Amazon is using could be substituted with the new 6248R at a 60% lower price, while providing a modest turbo and base frequency improvement along with a slight TDP reduction versus the existing Platinum parts they have. Unless Amazon decides to pocket the savings, that would have a massive impact on the $/vCPU comparison.

    https://www.anandtech.com/show/15542/intel-updates...
  • Andrei Frumusanu - Tuesday, March 10, 2020 - link

    Hyperscalers never pay full list price for their special SKUs, so comparisons to public new SKUs like the 6248R are not relevant.

    We're happy to update the landscape once EC2 introduces newer generation instances, but for now, these are the current prices and costs for what's available today and in the next few months.
  • Spunjji - Wednesday, March 11, 2020 - link

    I'm confused. Either you can think that everything boils down to $/vCPU/hr, in which case the only thing that's relevant is what Amazon actually offer, or you can think that "a discussion around the 'new' Xeon Gold R is warranted". They're mutually exclusive.
  • close - Tuesday, March 10, 2020 - link

    Great write-up Andrei. One question (I hope I didn't miss the answer in the article). Does Amazon's chip come out in front in the cost analysis because Amazon decided to take a loss or overcharge the other options, or is it an organic difference where it's intrinsically better?
  • Andrei Frumusanu - Tuesday, March 10, 2020 - link

    We have no idea of Amazon's internal cost structure, so take the cost analysis from an end-user TCO perspective.
  • eek2121 - Tuesday, March 10, 2020 - link

    I suspect the TDP of this chip is likely in the 150 watt range. We also know nothing about the operating environment of any of the chips. For example, the chip is rated for DDR4 3200, but is it running at 3200 speeds? The EPYC chip likely is NOT. So many questions here...
  • Andrei Frumusanu - Tuesday, March 10, 2020 - link

    It is running 3200, Amazon confirmed that.

    They didn't comment on TDP, but given Arm and Ampere's figures, I think my estimate is correct.
  • Flunk - Thursday, April 9, 2020 - link

    They're comparing VMs with the same cost/hour. The number of cores/threads isn't really relevant.
  • autarchprinceps - Sunday, October 25, 2020 - link

    That’s exactly why they reserved the entire hardware. If you run only a single workload on SMT, that single thread can use the entire core. That’s kind of the point of SMT.
