Sizing Up Servers: Intel's Skylake-SP Xeon versus AMD's EPYC 7000 - The Server CPU Battle of the Decade?
by Johan De Gelas & Ian Cutress on July 11, 2017 12:15 PM EST
SMT Integer Performance With SPEC CPU2006
Next, to measure the performance impact of simultaneous multithreading (SMT), we run two threads on a single core. This shows how well each core handles SMT.
Subtest | Application type | Xeon E5-2690 @ 3.8 GHz | Xeon E5-2690 v3 @ 3.5 GHz | Xeon E5-2699 v4 @ 3.6 GHz | EPYC 7601 @ 3.2 GHz | Xeon 8176 @ 3.8 GHz |
400.perlbench | Spam filter | 39.8 | 43.9 | 47.2 | 40.6 | 55.2 |
401.bzip2 | Compression | 32.6 | 32.3 | 32.8 | 33.9 | 34.8 |
403.gcc | Compiling | 40.7 | 43.8 | 32.5 | 41.6 | 32.1 |
429.mcf | Vehicle scheduling | 44.7 | 51.3 | 55.8 | 44.2 | 56.6 |
445.gobmk | Game AI | 36.6 | 35.9 | 38.1 | 36.4 | 39.4 |
456.hmmer | Protein seq. analyses | 32.5 | 34.1 | 40.9 | 34.9 | 44.3 |
458.sjeng | Chess | 36.4 | 36.9 | 39.5 | 36 | 41.9 |
462.libquantum | Quantum sim | 75 | 73.4 | 89 | 89.2 | 91.7 |
464.h264ref | Video encoding | 52.4 | 58.2 | 58.5 | 56.1 | 75.3 |
471.omnetpp | Network sim | 25.4 | 30.4 | 48.5 | 26.6 | 42.1 |
473.astar | Pathfinding | 31.4 | 33.6 | 36.6 | 29 | 37.5 |
483.xalancbmk | XML processing | 43.7 | 53.7 | 78.2 | 37.8 | 78 |
Next, the same results expressed as a percentage of the single-threaded scores, so that we can see how much performance we gained from enabling SMT:
Subtest | Application type | Xeon E5-2699 v4 @ 3.6 GHz | EPYC 7601 @ 3.2 GHz | Xeon 8176 @ 3.8 GHz |
400.perlbench | Spam filter | 109% | 131% | 110% |
401.bzip2 | Compression | 137% | 141% | 128% |
403.gcc | Compiling | 137% | 119% | 131% |
429.mcf | Vehicle scheduling | 125% | 110% | 131% |
445.gobmk | Game AI | 125% | 150% | 127% |
456.hmmer | Protein seq. analyses | 127% | 125% | 125% |
458.sjeng | Chess | 120% | 151% | 125% |
462.libquantum | Quantum sim | 91% | 129% | 90% |
464.h264ref | Video encoding | 101% | 112% | 112% |
471.omnetpp | Network sim | 109% | 116% | 103% |
473.astar | Pathfinding | 140% | 149% | 137% |
483.xalancbmk | XML processing | 120% | 107% | 116% |
On average, both Xeons pick up about 20% from SMT (Hyper-Threading). The EPYC 7601 gains even more: a 28% boost on average. There are many possible explanations for this, but two seem the most likely. First, where AMD's single-threaded IPC is very low because the core is waiting on the high latency of a more distant L3 cache (>8 MB), a second thread puts the otherwise idle CPU resources to better use (as in compression and the network sim). Second, we saw that the AMD core is capable of extracting more memory bandwidth in lightly threaded scenarios, which might help in the benchmarks that stress the DRAM (such as video encoding and the quantum sim).
Nevertheless, kudos to the AMD engineers. Their first SMT implementation is very well done and offers a tangible throughput increase.
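The averages quoted above can be checked against the percentage table with a few lines of Python. This is only a sketch: it assumes a plain arithmetic mean over the twelve subtests (the article does not say which kind of average was used), and the names `smt_scaling` and `average_smt_gain` are made up for this illustration.

```python
# SMT scaling (two threads vs. one thread on the same core), in percent,
# copied from the percentage table above. Values follow the table's row order.
smt_scaling = {
    "Xeon E5-2699 v4": [109, 137, 137, 125, 125, 127, 120, 91, 101, 109, 140, 120],
    "EPYC 7601": [131, 141, 119, 110, 150, 125, 151, 129, 112, 116, 149, 107],
    "Xeon 8176": [110, 128, 131, 131, 127, 125, 125, 90, 112, 103, 137, 116],
}


def average_smt_gain(percentages):
    """Return the mean scaling figure as a gain over the 100% baseline.

    Assumption: a simple arithmetic mean, not a geometric mean.
    """
    return sum(percentages) / len(percentages) - 100.0


for cpu, values in smt_scaling.items():
    print(f"{cpu}: {average_smt_gain(values):+.1f}% on average")
```

Under that assumption the script prints +20.1% for the Xeon E5-2699 v4, +28.3% for the EPYC 7601 and +19.6% for the Xeon 8176, which is in line with the "about 20%" and "28%" figures in the text.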
219 Comments
tamalero - Tuesday, July 11, 2017
How is that different if AMD ran stuff that is extremely optimized for them?
Friendly0Fire - Tuesday, July 11, 2017
That's kinda the point? You want to benchmark the CPUs in optimal scenarios, since that's what you'd be looking at in practice. If one CPU's weakness is eliminated by using a more recent/tweaked compiler, then it's not a weakness.
coder543 - Tuesday, July 11, 2017
Rather, you want to test under practical scenarios. Very few people are going to be running 17.04 on production grade servers, they will run an LTS release, which in this case is 16.04.
It would be good to have benchmarks from 17.04 as another point of comparison, but given how many things they didn't have time to do just using 16.04, I can understand why they didn't use 17.04.
Santoval - Wednesday, July 12, 2017
A compromise can be found by upgrading Ubuntu 16.04's outdated kernel. Ubuntu LTS releases include support for rolling HWE Stacks, which is a simple meta package for installing newer kernels compiled, modified, tested and packaged by the Ubuntu Kernel Team, and installed directly from the official Ubuntu repositories (not via a Launchpad PPA). With HWE 16.04 LTS can install up to the kernel of 18.04 LTS.
I also use 16.04 LTS + HWE (it just requires installing the linux-generic-hwe-16.04 package), which currently provides the 4.8 kernel. There is even a "beta" version of HWE (the same package plus an -edge at the end) for installing the 4.10 kernel (aka the kernel of 17.04) earlier, which will normally be released next month.
I just spotted various 4.10 kernel listings after checking in Synaptic, so they must have been added very recently. After that there are two more scheduled kernel upgrades, as is shown in the following link. Of course HWE upgrades solely the kernel, it does not upgrade any application or any of the user level parts to a more recent version of Ubuntu.
https://wiki.ubuntu.com/Kernel/RollingLTSEnablemen...
CajunArson - Tuesday, July 11, 2017
Considering the similarities between RyZen and Haswell (that aren't coincidental at all) you are already seeing a highly optimized set of RyZen results.
But I have no problem seeing RyZen be tested with the newest distros, the only difference being that even Ubuntu 16.04 already has most of the optimizations for RyZen baked in.
coder543 - Tuesday, July 11, 2017
What similarities? They're extremely different architectures. I can't think of any obvious similarities. Between the CCX model, caches being totally different layouts, the infinity fabric, Intel having better AVX-256/512 stuff (IIRC), etc.
I don't think 16.04 is naturally any more optimized for Ryzen than it is for Skylake-SP.
CajunArson - Tuesday, July 11, 2017
Oh please, at the core level RyZen is a blatant copy-n-paste of Haswell with the only exception being they just omitted half the AVX hardware to make their lives easier.
It's so obvious that if you followed any of the developer threads for people optimizing for RyZen they say to just use the Haswell compiler optimizations that actually work better than the official RyZen optimization flags.
ddriver - Tuesday, July 11, 2017
Can't tell if this post is funny or sad.
CajunArson - Tuesday, July 11, 2017
It's neither: It's accurate.
Don't believe me? Look at the differences in performance of the holy 1800X over multiple Linux distros ranging from pretty new (OpenSuse Tumbleweed) to pretty old (Fedora 23 from 2015): http://www.phoronix.com/scan.php?page=article&...
Nowhere near the variation that we see with Skylake X since Haswell was already a solved problem long before RyZen launched.
coder543 - Tuesday, July 11, 2017
Right, of course. Ryzen is a copy-and-paste of Haswell.
Don't make me laugh.