AMD 3rd Gen EPYC Milan Review: A Peak vs Per Core Performance Balance

by Dr. Ian Cutress & Andrei Frumusanu on March 15, 2021 11:00 AM EST

Disclaimer June 25th: The benchmark figures in this review have been superseded by our second follow-up Milan review article, where we observe improved performance figures on a production platform compared to AMD’s reference system in this piece.

SPEC - Per-Core Win for "F"-Series 75F3

A metric that is more interesting than isolated single-thread performance is per-thread performance in a fully loaded system. This figure is of great interest to enterprises and customers running software that is licensed on a per-core basis, or workloads that must meet a certain per-thread performance level under a service level agreement.

It’s precisely this market that AMD is targeting with its new “F”-series of processors, and this is where the new 75F3 comes into play. With 32 cores, 4 cores per chiplet with the full 256MB of L3 cache, and a base frequency of 2.95GHz boosting up to 4.0GHz at a default 280W TDP, the chip squeezes out the maximum per-core performance while still offering a massive amount of multi-threaded performance.

SPEC2017 Rate-N Estimated Per-Thread Performance (1S)

At full load, the 75F3 takes a massive per-thread performance lead, landing 45% ahead of the 7763 and 51% ahead of the Intel Xeon 8280.

It should be noted that limiting the thread count of the higher core-count SKUs also improves their per-thread performance. For example, running a 7713 with only 32 threads results in an estimated SPECint2017 score of 4.30. The 75F3 still holds a 16% advantage there even though its peak boost clock is only 8.8% higher, meaning the 75F3 is achieving higher effective frequencies. Unfortunately, we didn’t have enough time to run the same experiment on the 7763, which shares the 75F3's 280W TDP.
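To make the arithmetic behind that comparison explicit, here is a minimal Python sketch using only the figures quoted above. The implied 75F3 score is derived from the 16% figure and is not a measured result. The "residual" gain assumes, as a simplification, that per-thread performance scales linearly with clock frequency.

```python
# Figures quoted in the text above.
score_7713_32t = 4.30    # estimated SPECint2017 per-thread score, 7713 limited to 32 threads
perf_advantage = 0.16    # 75F3 per-thread lead over that configuration
boost_advantage = 0.088  # 75F3 peak boost clock lead over the 7713

# Implied 75F3 per-thread score (derived from the 16% lead, not measured).
implied_score_75f3 = score_7713_32t * (1 + perf_advantage)

# Part of the lead that the peak boost ratio cannot explain.
# Assumption: performance scales linearly with frequency.
residual_gain = (1 + perf_advantage) / (1 + boost_advantage) - 1

print(f"Implied 75F3 per-thread score: {implied_score_75f3:.2f}")
print(f"Gain beyond the peak-boost ratio: {residual_gain:.1%}")
```

Under that linear-scaling assumption, the leftover gain of several percent is what points to the 75F3 sustaining higher effective clocks under load than the thread-limited 7713.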

AMD discloses that the biggest generational gains for the Milan stack are found in the lower core-count models, where for example the 7313 and the 7343 outperform the 7282 and 7302 by 25%. The reason is that, for example, the new 7313 features double the L3 cache, and all the new CPUs boost higher with correspondingly higher TDPs, increasing to 150/190W from 120/155W, while also landing at 50% higher price points generation over generation.
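As a rough sanity check on those numbers, the sketch below relates the quoted 25% performance gain to the quoted TDP and price increases. It assumes the TDP pairs line up in the order given (120W to 150W, 155W to 190W) and treats TDP as a stand-in for power draw, which it is not. The results are illustrative ratios, not measurements.

```python
# Figures quoted in the text above.
perf_gain = 0.25       # generational performance gain disclosed by AMD
price_increase = 0.50  # generational price increase
tdp_pairs_w = [(120, 150), (155, 190)]  # (old TDP, new TDP); pairing order is an assumption

results = []
for old_w, new_w in tdp_pairs_w:
    tdp_ratio = new_w / old_w
    # Change in performance per TDP watt (TDP used as a proxy for power).
    perf_per_watt_change = (1 + perf_gain) / tdp_ratio - 1
    results.append((old_w, new_w, tdp_ratio - 1, perf_per_watt_change))
    print(f"{old_w}W -> {new_w}W: TDP +{tdp_ratio - 1:.1%}, "
          f"perf per TDP watt {perf_per_watt_change:+.1%}")

# Change in performance per dollar at the quoted price increase.
perf_per_dollar_change = (1 + perf_gain) / (1 + price_increase) - 1
print(f"Perf per dollar: {perf_per_dollar_change:+.1%}")
```

Read this way, the 25% uplift in these lower core-count models roughly tracks the TDP increase, while performance per dollar goes down at the quoted price points.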

120 Comments

mode_13h - Monday, March 15, 2021 - link
Please don't paint Nvidia as a victim. They are not. All of these guys will have to support each other, for the foreseeable future, and for purely pragmatic reasons.
Oxford Guy - Monday, March 15, 2021 - link
They are not 'guys'. They're corporations. Corporations were invented to, to quote Ambrose Bierce, grant 'individual profit without individual responsibility'.
mode_13h - Wednesday, March 17, 2021 - link
No disagreement, but I'm slightly disheartened you decided to take issue with my use of the term "guys". I'll try harder, next time--just for you.
Oxford Guy - Tuesday, April 6, 2021 - link
People humanize corporations all the time. It doesn't lead to good outcomes for societies.

Of course, it's questionable whether corporations lead to good outcomes, considering that they're founded on scamming people (profit being 'sell less for more', needing tricks to get people to agree to that).
chavv - Monday, March 15, 2021 - link
Is it possible to add another "benchmark" - ESX server workload?
Like, running 8-16-32-64 VMs all with some workload...
Andrei Frumusanu - Monday, March 15, 2021 - link
As we're rebuilding our server test suite, I'll be looking into more diverse benchmarks to include. It's a long process that needs a lot of thought and possibly resources so it's not always evident to achieve.
eva02langley - Monday, March 15, 2021 - link
Just buy EPYC and start your hybridation and your reliance on a SINGLE supplier...
eva02langley - Monday, March 15, 2021 - link
edit: Just buy EPYC and start your hybridation and STOP your reliance on a SINGLE supplier...
mode_13h - Monday, March 15, 2021 - link
You guys should really include some workloads involving multiple <= 16-core/32-thread VMs, that could highlight the performance advantages of NPS4 mode. Even if all you did was partition up the system into smaller VMs running multithreaded SPEC 2017 tests, at least that would be *something*.

That said, please don't get rid of all system-wide multithreaded tests, because we definitely still want to see how well these systems scale (both single- and multi- CPU).
ishould - Monday, March 15, 2021 - link
Yes this seems more useful for my needs as well. We use a grid system for job submission and not all cores will be hammered at the same time