Disclaimer June 25th: The benchmark figures in this review have been superseded by our second follow-up Milan review article, where we observe improved performance figures on a production platform compared to AMD’s reference system in this piece.

Compiling LLVM, NAMD Performance

As we’re trying to rebuild our server test suite piece by piece (and there’s still a lot of work ahead to get a good, representative “real world” set of workloads), one of the more highly requested benchmarks amongst readers was a more realistic compilation suite. With the Chrome and LLVM codebases being the most requested, I landed on LLVM as it’s fairly straightforward to set up.

git clone https://github.com/llvm/llvm-project.git
cd llvm-project
git checkout release/11.x
mkdir ./build
cd ..
mkdir llvm-project-tmpfs
sudo mount -t tmpfs -o size=10G,mode=1777 tmpfs ./llvm-project-tmpfs
cp -r llvm-project/* llvm-project-tmpfs
cd ./llvm-project-tmpfs/build
cmake -G Ninja \
  -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi;lldb;compiler-rt;lld" \
  -DCMAKE_BUILD_TYPE=Release ../llvm
time cmake --build .

We’re using the LLVM 11.0.0 release as the build target version, and we’re compiling Clang, libc++, libc++abi, LLDB, Compiler-RT and LLD using GCC 10.2 (self-compiled). To avoid any concerns about I/O, we’re building everything on a ramdisk. We’re measuring only the actual build time and don’t include the configuration phase, as in the real world that usually doesn’t happen repeatedly.
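One thing the configure line above doesn’t show is the compiler selection; left alone, CMake will pick up whatever default toolchain it finds first. Since the builds here were done with a self-compiled GCC 10.2, one way to pin the toolchain explicitly is something along these lines (the install prefix is a placeholder, not our actual path):

# Hypothetical install prefix for the self-built GCC 10.2 toolchain
cmake -G Ninja \
  -DCMAKE_C_COMPILER=/opt/gcc-10.2/bin/gcc \
  -DCMAKE_CXX_COMPILER=/opt/gcc-10.2/bin/g++ \
  -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi;lldb;compiler-rt;lld" \
  -DCMAKE_BUILD_TYPE=Release ../llvm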

LLVM Suite Compile Time

For the new Milan chips, the results are a bit mixed. The higher-power 7763 takes the lead with a +10.5% improvement over the 7742; the 7713, however, doesn’t manage to keep up with that predecessor.

The 1S vs 2S scores are interesting, as the 2S figures showcase the new Milan chips in a better light thanks to the higher single-threaded performance of the Zen3 cores. The compilation also includes linking phases which are bottlenecked by single-threaded performance. This results in scenarios such as the 7713 losing to the 7662 in the 1S comparison yet winning against the same chip in the 2S comparison, as it’s able to make that advantage count for more.
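As a rough back-of-the-envelope illustration of why the serial link phase weighs more heavily in the 2S results: if the parallelisable compile work is fixed and the serial link time depends mostly on single-threaded speed, then doubling the core count shrinks the parallel portion and makes the single-thread advantage a bigger slice of the total. All figures below are invented purely for the sake of the example, not measurements:

# Toy Amdahl-style model: build_time ~ parallel_work/cores + serial_link_time
# All numbers are hypothetical, only meant to illustrate the 1S vs 2S effect.
estimate() {
  awk -v work="$1" -v cores="$2" -v link="$3" \
    'BEGIN { printf "%3d cores, %2d min link: %.1f min total\n", cores, link, work/cores + link }'
}
estimate 6000  64 10   # older core: more all-core throughput, slower serial link - 1S
estimate 6000 128 10   # older core - 2S
estimate 6320  64  6   # newer core: slightly less throughput, faster serial link - 1S
estimate 6320 128  6   # newer core - 2S

In this toy example the chip with the faster link phase ends up about a minute slower in the 1S case but about a minute and a half faster in the 2S case, the same qualitative pattern as the 7713’s 1S loss and 2S win against the 7662.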

It’s also great to see the 32-core 75F3 hold its own against the 64-core parts, coming in at around 72% of the top-SKU performance.

NAMD (Git-2020-12-09) - Apolipoprotein A1

Finally, NAMD is more of a core-local compute workload. We see the 7763 outperform the 7742 by +11.8%; however, the Milan chip is still outperformed by the higher aggregate compute capacity of the 80-core Altra chip.

Generally, I have my reservations about NAMD as a benchmark due to its multicore vs MPI variants and their scaling anomalies, on top of the whole topic of the benchmark using a completely different algorithm on AVX-512 processors.
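For context on the multicore vs MPI point: the two NAMD builds are launched quite differently, which is part of what makes cross-setup comparisons murky. A rough sketch of the two invocations (binary and input paths are assumptions, not taken from our run scripts):

# Multicore (single-node SMP) build: one process with +p worker threads
./namd2 +p128 +setcpuaffinity apoa1/apoa1.namd

# MPI build: one rank per core, launched through the MPI runtime instead
mpirun -np 128 ./namd2 apoa1/apoa1.namd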

120 Comments

  • Oxford Guy - Tuesday, April 6, 2021

    PSP, as far as I know.
  • Linustechtips12#6900xt - Monday, March 15, 2021

    I understand that "zen" architecture is for x86 but with modifications could it be transplanted to the ARM instruction set, as i see it, it definitely could so the real question is when will the transition really start i think around the theoretical zen 5th gen or 6th gen, theres gonna be a lot of arm around here especially with apple. and yes it will defenitly start wiht servers it always does.
  • Gomez Addams - Monday, March 15, 2021

    There are really two things at work: the instruction set of the processor and its topology. AMD has been improving both quite a bit. The instruction set enhancements won't transfer quite so well to ARM but the topology certainly can. Since ARM processors are much smaller, they could probably work in chiplets with possibly 32 cores in each or maybe 16 cores and 4-way SMT. That could make for a very impressive server processor. Four chiplets would give 64 cores and 256 threads. Yikes!
  • rahvin - Monday, March 15, 2021

    So much wrong.
  • mode_13h - Monday, March 15, 2021

    There are pieces of it that can be reused (on the same manufacturing node, at least), but making a truly-competitive ARM chip is probably going to involve some serious tinkering with the pipeline stages & architecture. And there are significant parts of an x86 chip that you'd have to throw out and redo, most notably the instruction decoder.

    In all, it's a different core that you're talking about. Not like CPU vs. GPU level of difference, but it's a lot more than just cosmetics.
  • coder543 - Monday, March 15, 2021

    "For this launch, both the 16-core F and 24-core F have the same TDP, so the only reason I can think of for AMD to have a higher price on the 16-core processor is that it only has 2 cores per chiplet active, rather than three? Perhaps it is easier to bin a processor with an even number of cores active."

    If I were to speculate, I would strongly guess that the actual reason is licensing. AMD knows that more people are going to want the 16 core CPUs in order to fit into certain brackets of software licensing, so AMD charges more for those to maximize profit and availability of the 16 core parts. For those customers, moving to a 24 core processor would probably mean paying *significantly* more for whatever software they're licensing.
  • SarahKerrigan - Monday, March 15, 2021

    Yep.

    Intel sold quad-core Xeon E7's for impressively high prices for a similar reason.
  • Mikewind Dale - Monday, March 15, 2021

    Why couldn't you run a 16 core software license on a 24 core CPU? I run a 4 core licensed version of Stata MP on an 8 core Ryzen just fine.
  • Ithaqua - Monday, March 15, 2021

    Compliance and lawsuits.
    You have to pay for all the cores you use for some software.

    Yes, if you're only running 4 cores on your 8-core Ryzen then you're fine, but if Stata MP is using all 8, there could be a lawsuit.

    Now for you I'm sure they wouldn't care. For a larger firm with 10,000+ machines, though, that's going to be a big lawsuit.
  • arashi - Wednesday, March 17, 2021

    Some licenses charge for ALL cores, regardless of how many cores you would actually be using.
