Cost Analysis - An x86 Massacre

The Graviton2 showcased that it can keep up extremely well in terms of performance and throughput, even beating the competition in a lot of the tests. However sometimes you don’t care too much about performance, and you just want to get some workload completed in the cheapest way possible, at which point value comes into play.

Amazon does allude to that, stating that the new chip is able to achieve 40% better performance per dollar than its competition. As covered in the introduction, for the 64-vCPU count 16xlarge instances the m6g (Graviton2), m5a (EPYC1), and m5n (Xeon Cascade Lake) are priced at an hourly cost of $2.464, $2.752 and $3.808 respectively.

Translating the time to completion of our various SPEC tests to hours and multiplying by the hourly cost, we end up with a cost per fixed workload metric:

An aggregate of all workloads summed up together, which should hopefully end up in a representative figure for a wide variety of real-world use-cases, we do end up seeing the Graviton2 coming in 40% cheaper than the competing platforms, an outstanding figure.

If we were to compare the same fixed workload at smaller instance counts, because of Graviton2’s better per-thread performance, we’re seeing even better results on 4xlarge (16 vCPUs) instances. Here the Amazon chip showcases 43% better value than the Xeon chip, and beats the AMD instances by being 53% cheaper.

If we were to transform the results into a fixed throughput per dollar metric, we again see the Graviton2 far ahead. The unit here is SPEC runs per dollar.

The lower the vCPU instance size, the better value the Graviton2 seemingly becomes, as its performance with increased vCPUs scales sublinearly, but the cost of bigger vCPU instances scales linearly, an effect that’s almost not present at all in the AMD system, and only marginally present in the Xeon instances.

Again, the Graviton2’s scaling here might differ in production instances, but given that you can’t just chop off half the chip (or have access to only one of two sockets, in Intel’s case here) and that Amazon seemingly isn’t doing any static partitioning of the chip’s shared resources, I do think it’s more likely than not that such performance and value figures will be encountered in the real-world.

Even ignoring the lower vCPU instances, Amazon was able to deliver on its promise of 40% better performance per dollar, and it’s a massive shakeup for the AWS and EC2 ecosystem.

SPEC - MT Performance (4xlarge 16 vCPU) Conclusion & End Remarks
POST A COMMENT

95 Comments

View All Comments

  • eek2121 - Tuesday, March 10, 2020 - link

    It is worth noting AnandTech’s own numbers: https://www.anandtech.com/show/14694/amd-rome-epyc... Reply
  • RallJ - Tuesday, March 10, 2020 - link

    I understand that, but consider everything boils down to just $/vCPU/hr, I think a discussion around the new Xeon Gold R is warranted. For example, the existing dual-socket Xeon Amazon is using can be substituted by the new 6248R for 60% lower price while providing a modest turbo and base frequency improvement at lower a slight TDP reduction versus the existing Platinum they have. Unless Amazon decides to pocket the saving, that would have a massive impact on the vCPU $ comparison.

    https://www.anandtech.com/show/15542/intel-updates...
    Reply
  • Andrei Frumusanu - Tuesday, March 10, 2020 - link

    Hyperscalers never pay full list price for their special SKUs, so comparisons to public new SKUs like the 6248R are not relevant.

    We're happy to update the landscape once EC2 introduces newer generation instances, but for now, these are the current prices and costs for what's available today and in the next few months.
    Reply
  • Spunjji - Wednesday, March 11, 2020 - link

    I'm confused. Either you can think that everything boils down to $/vCPU/hr, in which case the only thing that's relevant is what Amazon actually offer, or you can think that "a discussion around the 'new' Xeon Gold R is warranted". They're mutually exclusive. Reply
  • close - Tuesday, March 10, 2020 - link

    Great write-up Andrei. One question (I hope I didn't miss the answer in the article). Does Amazon's chip come out in front in the cost analysis because Amazon decided to take a loss or overcharge the other options, or is it an organic difference where it's intrinsically better? Reply
  • Andrei Frumusanu - Tuesday, March 10, 2020 - link

    We have no idea of Amazon's internal cost structure, so take the cost analysis from and end-user TCO perspective. Reply
  • eek2121 - Tuesday, March 10, 2020 - link

    I suspect the TDP of this chip is likely in the 150 watt range. We also know nothing about the operating environment of any of the chips. For example, the chip is rated for DDR4 3200, but is it running at 3200 speeds? The EPYC chip likely is NOT. So many questions here... Reply
  • Andrei Frumusanu - Tuesday, March 10, 2020 - link

    It is running 3200, Amazon confirmed that.

    They didn't comment on TDP, but given Arm and Ampere's figures, I think my estimate is correct.
    Reply
  • Flunk - Thursday, April 9, 2020 - link

    They're comparing VMs with the same cost/hour. What number of cores/threads is isn't really relevant. Reply
  • autarchprinceps - Sunday, October 25, 2020 - link

    That’s exactly why they reserved the entire hardware. If you run only a single workload on SMT, that single thread can use the entire core. That’s kind of the point of SMT. Reply

Log in

Don't have an account? Sign up now