HPC Benchmarks

Discussing HPC benchmarks always feels like opening a can of worms to me. Each benchmark requires a thorough understanding of the software, and performance can be tuned massively by using the right compiler settings. And to make matters worse: in many cases, these workloads can be run much faster on a GPU or MIC, making CPU benchmarking in some situations irrelevant.

NAMD (NAnoscale Molecular Dynamics) is a molecular dynamics application designed for high-performance simulation of large biomolecular systems. It is rather memory bandwidth limited, as even with the advantage of an AVX-512 binary, the Xeon 8160 does not defeat the AVX2-equipped AMD EPYC 7601.

LAMMPS is a classical molecular dynamics code; the name is an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. GROMACS (for GROningen MAchine for Chemical Simulations) primarily does simulations for biochemical molecules (bonded interactions). Intel compiled the AMD version with the Intel compiler and AVX2. The Intel machines were running AVX-512 binaries.

For these three tests, the CPU benchmark results do not really matter. NAMD runs about 8 times faster on an NVIDIA P100. LAMMPS and GROMACS run about 3 times faster on a GPU, and also scale out with multiple GPUs.

Monte Carlo is a numerical method that uses statistical sampling techniques to approximate solutions to quantitative problems. In finance, Monte Carlo algorithms are used to evaluate complex instruments, portfolios, and investments. This is a compute bound, double precision workload that does not run faster on a GPU than on Intel's AVX-512 capable Xeons. In fact, as far as we know the best dual socket Xeons are quite a bit faster than the P100 based Tesla. Some of these tests are also FP latency sensitive.
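
To make the workload concrete, here is a minimal Python sketch of the kind of calculation a financial Monte Carlo test performs: pricing a European call option by sampling terminal prices under geometric Brownian motion. It is an illustration only, not the code Intel benchmarked; the function name `mc_european_call` and every parameter value are assumptions chosen for the example.

```python
import math
import random


def mc_european_call(s0, k, r, sigma, t, n_paths, seed=0):
    """Estimate a European call price by Monte Carlo sampling.

    All names and inputs are hypothetical: s0 is the spot price, k the
    strike, r the risk-free rate, sigma the volatility, t the time to
    expiry in years, and n_paths the number of random samples.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    payoff_sum = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)               # standard normal draw
        s_t = s0 * math.exp(drift + vol * z)  # simulated price at expiry
        payoff_sum += max(s_t - k, 0.0)       # call payoff for this path
    # Discount the average payoff back to today.
    return math.exp(-r * t) * payoff_sum / n_paths


if __name__ == "__main__":
    print(mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0, 100_000))
```

Every path is independent and consists of a handful of double precision operations, which is why this class of workload is compute bound and maps well onto wide vector units.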

Black-Scholes is another popular mathematical model used in finance. As this benchmark is also double precision, the dual socket Xeons should be quite competitive compared to GPUs.
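
For reference, the closed-form Black-Scholes price of a European call can be sketched in a few lines of Python. This illustrates the formula only and is not the benchmark Intel ran; the function names `norm_cdf` and `black_scholes_call` and the sample inputs are assumptions.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s0, k, r, sigma, t):
    """Black-Scholes price of a European call option.

    Hypothetical parameter names: s0 is the spot price, k the strike,
    r the risk-free rate, sigma the volatility, t the years to expiry.
    """
    sqrt_t = math.sqrt(t)
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


if __name__ == "__main__":
    print(black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0))
```

A benchmark built on this formula would typically evaluate it over a large array of options, so throughput depends mostly on how fast the hardware can execute double precision `log`, `exp` and `erf` style operations.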

So only the Monte Carlo and Black-Scholes tests are really relevant, showing that AVX-512 binaries give the Intel Xeons the edge in a limited number of HPC applications. In most HPC cases, it is probably better to buy a much more affordable CPU and to add a GPU or even a MIC.

The Caveats

Intel drops three big caveats when reporting these numbers, as shown in the bullet points at the bottom of the slide.

The first is that these are single node measurements: one 32-core EPYC vs 20/24-core Intel processors. Both of these CPUs, the Gold 6148 and the Platinum 8160, are in the ball-park pricing of the EPYC. This is different from the 8160/8180 numbers that Intel has provided throughout the rest of the benchmarking numbers.

The second is the compiler situation: in each benchmark, Intel used the Intel compiler for Intel CPUs, but compiled the AMD code on GCC, LLVM and the Intel compiler, choosing the best result. Because Intel is going for peak hardware performance, there is no obvious need for Intel to ensure compiler parity here. Compiler choice, as always, can have a substantial effect on real-world HPC performance.

The third caveat is that Intel even admits that in some of these tests, they have different products oriented to these workloads because they offer faster memory. But as we point out on most tests, GPUs also work well here.




  • Johan Steyn - Monday, December 18, 2017 - link

    I have stated before that Anandtech is on Intel's payroll. You could see it especially with the first Threadripper review, it was horrendous to say the least. This article goes the same route. You see, two people can say the same thing, but project a completely different picture. I do not disagree that Intel has its strengths over EPYC, but this article basically just agrees with Intel's presentation. Ha ha, that would have been funny, but it is not.

    Intel is a corrupt company and Anandtech is missing the point on how they present their "facts." I now very rarely read anything Anandtech publishes. In the 90's they were excellent - those were the days...
  • Jumangi - Tuesday, November 28, 2017 - link

    Maybe you have heard of Google... or Facebook. Not only do they build but they design their own rack systems to suit their massive needs.
  • Samus - Wednesday, November 29, 2017 - link

    Even mom and pop shops shouldn't have servers built from scratch. Who's going to support and validate that hardware for the long haul?

    HP and Dell have the best servers in my opinion. Top to bottom. Lenovo servers are at best just rehashes of their crappy workstations. If you want to get exotic (I don't) one could consider Supermicro...friends in the industry have always mentioned good luck with them, and good support. But my experience is with the big three.
  • Ratman6161 - Wednesday, November 29, 2017 - link

    You are both wrong in my experience. These days the software that runs on servers usually costs more (often by a wide margin) than the hardware it runs on. I was once running a software package the company paid $320K for on a VM environment of five two socket Dell servers and a SAN where the total hardware cost was $165K. But that was for the whole VM environment that ran many other servers besides the two that ran this package. Even the $165K for the VM environment included VMWare licensing so that was part software too. Considering the resources the two VMs running this package used, the total cost for the project was probably somewhere around 10% hardware and 90% software licensing.
    For my particular usage, the virtualization numbers are the most important so if we accept these numbers, Intel seems to be the way to go. The $10K CPUs seem pretty outlandish though. For virtualization purposes it seems like there might be more bang for the buck by going with the 8160 and just adding more hosts. Would have to get down to actually doing the math to decide on that one.
  • meepstone - Thursday, December 07, 2017 - link

    So I'm not sure who has the bigger e-peen between eek2121 and CajunArson. The drama in the comments was more entertaining than the article!
  • ddriver - Tuesday, November 28, 2017 - link

    Take a chill pill you intel shill :)

    Go over to servethehome and check results from someone who is not paid to pimp intel. Epyc enjoys ample lead against similarly priced xeons.

    The only niche it is at a disadvantage is the low core count high clock speed skus, simply because for some inexplicable reason amd decided to not address that important market.

    Lastly, nobody buys those 10+k $$$ xeons with his own money. Those are bought exclusively with "others' money" by people who don't care about purchase value, because they have deals with intel that put a percent of that money right back into their pockets, which is their true incentive. If they could put that money in their pockets directly, they would definitely seek the best purchase value rather than going through intel to essentially launder it for them.
  • iwod - Tuesday, November 28, 2017 - link

    This. Go to servethehome and make up your own mind.
  • lazarpandar - Tuesday, November 28, 2017 - link

    It's one thing to sound like a dick, it's another thing to sound like a dick and be wrong at the same time.
  • mkaibear - Tuesday, November 28, 2017 - link

    Er, yes, if you want just 128GB of RAM it may cost you $1,500, but if you actually want to use the capacity of those servers you'll want a good deal more than that.

    The server mentioned in the Intel example can take 1.5TB of ECC RAM, at a total cost of about $20k - at which point the cost of the CPU is much less of an impact.

    As CajunArson said, a full load of RAM on one of these servers is expensive. Your response of "yes well if you only buy 128GB of RAM it's not that expensive", while true, is a tad asinine - you're not addressing the point he made.
  • eek2121 - Tuesday, November 28, 2017 - link

    Not every workload requires that the RAM be topped off. We are currently in the middle of building our own private cloud on Hyper-V to replace our AWS presence, which involves building out at multiple datacenters around the country. Our servers have half a terabyte of RAM. Even with that much RAM, CPUs like this would still be (and are) a major factor in the overall cost of the server. The importance for our use case is the ability to scale, not the ability to cram as many VMs into one machine as possible. 2 servers with half a terabyte of RAM are far more valuable to us than 1 server with 1-1.5 terabytes due to redundancy.
