Murphy's Law

Anything That Can Go Wrong, Will Go Wrong

For those of you who may not know, I am an Academic Director of MCT at Howest University here in Belgium. I perform research in our labs on big data analytics, virtualization, cloud computing, and server technology in general. We do all of that testing here in the lab, and I also do launch-article testing for AnandTech.

Like most academic institutions, we have a summer vacation during which our labs are locked and we are told to get some sunlight. AMD's Rome launch happened just as our lab closure started, so I had the Rome server delivered to my home lab instead. The only issue was that our corresponding Intel server was still in the academic lab. Normally this isn't really a problem: even when the lab is open, I issue tests through remote access and process the data that way, rebooting the systems and running benchmarks remotely. If a hardware change is needed I have to be physically present, but usually that isn't an issue.

However, as Murphy's Law would have it, during testing for this review our domain controller also crashed while the labs were closed, and we could no longer reach our older servers. This limited our testing somewhat: while I could test the Rome system during normal hours in the home lab (I can't really run it overnight, as it is a server and therefore loud), I couldn't issue any benchmarks to our Naples and Cascade Lake systems in the lab.

As a result, our only option was to limit ourselves to the benchmarks already done on the EPYC 7601, Skylake, and Cascade Lake machines. Rest assured that we will be back with our usual Big Data/AI and other real-world tests once we can get our complete testing infrastructure up and running.

Benchmark Configuration and Methodology

All of our testing was conducted on Ubuntu Server 18.04 LTS, except for the EPYC 7742 server, which was running Ubuntu 19.04. The reason was simple: we were told that 19.04 had validated support for Rome, and with only two weeks of testing time, we wanted to complete what we could. Support for Rome (including the X2APIC/IOMMU patches needed to utilize all 256 threads) is available with Linux kernel 4.19 and later.
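
For readers replicating the setup, a quick sanity check along these lines (our illustration, not part of the original test scripts) confirms both requirements, the kernel floor and the visible thread count:

    import os

    MIN_KERNEL = (4, 19)     # first kernel line with the Rome X2APIC/IOMMU support noted above
    EXPECTED_THREADS = 256   # 2 sockets x 64 cores x 2 SMT threads

    release = os.uname().release  # e.g. '5.0.0-25-generic' on Ubuntu 19.04
    major, minor = (int(p) for p in release.split('-')[0].split('.')[:2])

    print("kernel", release, "->", "OK" if (major, minor) >= MIN_KERNEL else "older than 4.19")
    print("hardware threads visible:", os.cpu_count(), "(expected", EXPECTED_THREADS, ")")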

You will notice that the DRAM capacity varies among our server configurations. This is simply because the Xeons have access to six memory channels per socket while the EPYC CPUs have eight: populating one DIMM per channel gives twelve 32 GB DIMMs (384 GB) in the dual-socket Intel machine versus sixteen (512 GB) in the AMD machines. As far as we know, all of our tests fit in 128 GB, so DRAM capacity should not have much influence on performance.


AMD Daytona - Dual EPYC 7742

AMD sent us the "Daytona XT" server, a reference platform built by the ODM Quanta (D52BQ-2U).

CPU             Two AMD EPYC 7742 (2.25 GHz, 64c, 256 MB L3, 225W)
RAM             512 GB (16x32 GB) Micron DDR4-3200
Internal Disks  Samsung MZ7LM240 (boot disk)
                Micron 9300 3.84 TB (data)
Motherboard     Daytona reference board: S5BQ
PSU             PWS-1200

Although the 225W TDP CPUs need extra heatpipes and heatsinks, they are still running on air cooling...

AMD EPYC 7601 (2U Chassis)

CPU             Two AMD EPYC 7601 (2.2 GHz, 32c, 8x8 MB L3, 180W)
RAM             512 GB (16x32 GB) Samsung DDR4-2666 (running at 2400)
Internal Disks  Samsung MZ7LM240 (boot disk)
                Intel SSD DC S3710 800 GB (data)
Motherboard     AMD Speedway
PSU             1100W (80+ Platinum)

Intel's Xeon "Purley" Server – S2P2SY3Q (2U Chassis)

CPU             Two Intel Xeon Platinum 8280 (2.7 GHz, 28c, 38.5 MB L3, 205W)
                Two Intel Xeon Platinum 8176 (2.1 GHz, 28c, 38.5 MB L3, 165W)
RAM             384 GB (12x32 GB) Hynix DDR4-2666
Internal Disks  Samsung MZ7LM240 (boot disk)
                Micron 9300 3.84 TB (data)
Motherboard     Intel S2600WF (Wolf Pass baseboard)
Chipset         Intel Wellsburg B0
PSU             1100W (80+ Platinum)

We enabled hyper-threading and Intel virtualization acceleration.

Comments

  • JoeBraga - Wednesday, August 14, 2019

    It can happen if Intel uses the new Sunny Cove architecture and an MCM/chiplet design instead of a monolithic design.
  • SanX - Thursday, August 15, 2019

    7zip is not a legacy test; it is important for anyone who sends big data over an always-slow network. You know all those ZIPs, GZs, and other compressors that people mostly use? They compress at turtle speeds, as low as 20 MB/s even on supercomputers. 7-Zip, though, parallelizes that nicely. So do not diminish this good test by calling it "legacy".
  • imaskar - Friday, August 16, 2019

    7zip is a particular program doing LZMA in parallel; that's why it is faster than, say, gzip. But on a server you often do not want to parallelize things, because the other cores are doing other jobs and context switching is costly. There are a lot of compression algorithms which are better in certain situations. LZMA rarely fits. More often it is LZ4 or zstd for "generate once, consume many", or basic gzip (DEFLATE) for "generate once, consume once". Yes, you would be surprised, but the very basic 30-year-old DEFLATE is still the king if you care about the sum of compress, send, and decompress AND your nodes are inside one datacenter (which is the case most of the time).
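
    To make that trade-off concrete, here is a minimal sketch (an editorial illustration, not the commenter's code) timing DEFLATE against LZMA using only the Python standard library; LZ4 and zstd need third-party bindings, so they are omitted:

        import time, zlib, lzma

        # Hypothetical payload: repetitive text, the friendliest case for any compressor.
        payload = b"GET /api/v1/nodes HTTP/1.1\r\nHost: cluster.local\r\n" * 200_000

        for name, compress in [("DEFLATE (zlib)", lambda d: zlib.compress(d, 6)),
                               ("LZMA (7-Zip's algorithm)", lambda d: lzma.compress(d))]:
            start = time.perf_counter()
            blob = compress(payload)
            secs = time.perf_counter() - start
            print(f"{name}: {secs:.2f} s, ratio {len(payload) / len(blob):.0f}x")

    On a payload like this, LZMA compresses tighter but far slower, which is exactly the compress + send + decompress sum the comment is about.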
  • SanX - Thursday, August 15, 2019

    What can you say about Ian's own test, which he developed to demonstrate the AVX-512 speed boost and which shows crazy speedups of up to 3-4x or more? Does your Molecular Dynamics test suggest that Ian's test is mostly irrelevant to such huge speed improvements in real-life complex programs?
  • imaskar - Friday, August 16, 2019

    Probably because you can't use ONLY AVX-512. You still need regular things like jumps and conditions. And that is only the best case. Usually you also need to process part of the vector differently. For example, your vector has size 20, but your SIMD width is 16. You either do another vector pass or four scalar computations. Often the second option is faster, or simply the only option.
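
    For illustration, here is a sketch of that main-loop-plus-tail split (ours, not the commenter's), with a Python list chunk standing in for a 16-wide AVX-512 register:

        WIDTH = 16  # lanes in a hypothetical 512-bit vector of 32-bit floats

        def scale(data, factor):
            out = []
            full = len(data) // WIDTH * WIDTH
            # Vectorized main loop: one 16-wide operation per pass.
            for i in range(0, full, WIDTH):
                chunk = data[i:i + WIDTH]
                out.extend(x * factor for x in chunk)
            # Scalar tail: the leftovers (4 elements for a size-20 input).
            for x in data[full:]:
                out.append(x * factor)
            return out

        print(scale(list(range(20)), 2.0))  # 16 elements vectorized + 4 scalar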
  • realbabilu - Sunday, August 18, 2019

    Most finite element software uses Intel MKL to squeeze every last bit of performance out of the processor, and it works for Intel CPUs, not for AMD. AMD's math kernel library is not as heavily optimized, and in any case it is Linux-only. Other third-party libraries like GotoBLAS and OpenBLAS are still trying hard to detect the cache and CPU type for Zen 2. My point is that workstation floating point is still hard for AMD.
  • peevee - Monday, August 19, 2019

    Prices per core-GHz:
    EPYC 7742 $48.26
    EPYC 7702 $50.39
    EPYC 7642 $43.25
    EPYC 7552 $38.12
    EPYC 7542 $36.64
    EPYC 7502 $32.50
    EPYC 7452 $26.93
    EPYC 7402 $26.53
    EPYC 7352 $24.46
    EPYC 7302 $20.38
    EPYC 7282 $14.51
    EPYC 7272 $17.96
    EPYC 7262 $22.46
    EPYC 7252 $19.15

    Value in this 7282 is INSANE.
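
    For reference, the metric above is list price divided by (cores x base clock). A quick sketch reproducing two of the entries, using launch list prices and base clocks that are our assumption rather than part of the comment:

        # price per core-GHz = list price / (cores * base clock in GHz)
        chips = {
            "EPYC 7742": (6950, 64, 2.25),  # assumed launch list price, cores, base GHz
            "EPYC 7282": (650, 16, 2.80),
        }
        for name, (price, cores, ghz) in chips.items():
            print(f"{name}: ${price / (cores * ghz):.2f} per core-GHz")
        # -> EPYC 7742: $48.26 and EPYC 7282: $14.51, matching the list above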
  • peevee - Tuesday, August 20, 2019

    "Even though our testing is not the ideal case for AMD (you would probably choose 8 or even 16 back-ends), the EPYC edges out the Xeon 8176. Using 8 JVMs increases the gap from 1% to 4-5%."

    1%? 36917 / 27716 = 1.3319...

    That's 33%, without the 8 JVMs.
  • YB1064 - Wednesday, August 28, 2019

    Looks like Intel has been outclassed, out-priced and completely out-maneuvered by AMD. What a disaster!
