SPEC - Multi-Threaded Performance - Subscores

Picking up from the power efficiency discussion, let’s dive directly into the multi-threaded SPEC results. As usual, because these are not officially submitted scores to SPEC, we’re labelling the results as “estimates” as per the SPEC rules and license.

We compile the binaries with GCC 10.2 on their respective platforms, with simple -Ofast optimisation flags and relevant architecture and machine tuning flags (-march/-mtune=neoverse-n1 ; -march/-mtune=skylake-avx512 ; -march/-mtune=znver2, the latter for Zen3 as well, since GCC 10.2 does not have znver3).

I’ll be going over two chart comparisons. The first covers the respective flagship parts: the new EPYC 7763 numbers, pitted against Intel’s 40-core Xeon Ice Lake SP and Ampere’s Altra Q80-33, along with the figures we have on AMD’s EPYC 7742. Note that the 7742 is a 225W part, compared to the 280W 7763.

SPECint2017 Rate-N Estimated Scores (1 Socket)

In SPECint2017, the EPYC 7763 extends its lead over Intel’s current best CPU, improving the numbers beyond what we had originally published in our April review. While AMD also further narrows the gap to Ampere’s 80-core Altra SKU, there are still many core-bound workloads that notably favour the Neoverse N1 part given its 25% advantage in core count.
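
As a quick sanity check on that 25% figure, here is a minimal Python sketch of the arithmetic. It assumes the EPYC 7763's 64 cores, a number not stated in this section, and the helper name `core_advantage` is purely illustrative.

```python
# Core counts of the two flagship parts compared above.
# The 64-core figure for the EPYC 7763 is an assumption not stated in this section.
ALTRA_Q80_33_CORES = 80
EPYC_7763_CORES = 64


def core_advantage(more: int, fewer: int) -> float:
    """Fractional core-count advantage of the larger part over the smaller one."""
    return (more - fewer) / fewer


advantage = core_advantage(ALTRA_Q80_33_CORES, EPYC_7763_CORES)
print(f"{advantage:.0%}")  # prints 25%
```

Under ideal scaling in a purely core-bound workload, that ratio means the 64-core part would need roughly 25% more throughput per core to draw level, which is an idealisation rather than a measured result.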

SPECint2017 Rate-N Estimated Scores (1 Socket)

Also in SPECint2017, but this time focusing on the mid-tier SKUs, the interesting comparison points are the new 24-core EPYC 7443 and 16-core EPYC 7343 against the new 28-core Xeon 6330. What’s shocking here is that Intel’s new Ice Lake SP server chip not only has trouble competing against AMD’s 24-core chip, but even struggles to differentiate itself from AMD’s 16-core chip.

The 72F3 8-core part is interesting, but we generally have trouble placing such a SKU competitively, given that we don’t have a comparable part from the competition to pit against it.

SPECfp2017 Rate-N Estimated Scores (1 Socket)

For the high-end SKUs, we again see the 7763 improve its performance positioning compared to what we had reviewed a few months ago, although with fewer large performance boost outliers, due to the memory-heavy nature of the floating-point test suite.

SPECfp2017 Rate-N Estimated Scores (1 Socket)

In the low-end SKUs, we see a similar story as in the integer suite, where AMD’s 16-core 7343 battles it out against Intel’s 28-core Xeon, while the 24-core unit sits comfortably ahead of the competition by a good margin.

The 72F3 showcases some interesting scores here. Because more of these workloads are fundamentally memory-bound, the core count deficit of this SKU doesn’t really hamper its performance compared to its siblings. If anything, the lower core count has some positive side-effects: there is less cache and DRAM contention, which means less overhead and actually higher performance than the higher core count parts. Theoretically you could mimic this with the higher core count parts by simply running fewer workload instances and threads, but if a system deployment is running workloads that typically show such performance characteristics, the low-core count 72F3 could make sense.


58 Comments


  • Threska - Sunday, June 27, 2021 - link

    Seems the only thing blunted is the economics of throwing more hardware at the problem. Actual technical development has taken off because all the chip-makers have multiple customers across many domains. That's why Anandtech and others are able to have articles like they have.
  • tygrus - Sunday, June 27, 2021 - link

    Reminds me of the inn keeper from Les Miserables. Nice to your face with lots of good promises but then tries to squeeze more money out of the customer at every turn.
  • tygrus - Sunday, June 27, 2021 - link

    I was of course referring to the SW not the CPU.
  • 130rne - Tuesday, September 14, 2021 - link

    What the hell did I just read? Just came across this, I had no idea the enterprise side was this fucked. They are scalping the ungodly dog shit out of their own customers. So you obviously can't duplicate their software in house meaning you're forced to use their software to be competitive, that seems to be the gist. So I buy a stronger cpu, usually a newer model, yeah? And it's more power efficient, and I restrict the software to a certain number of threads on those cpus, they'll just switch the pricing model because I have a better processor. This would incentivize me to buy cheaper processors with less threads, yeah? Buy only what I need.
  • 130rne - Tuesday, September 14, 2021 - link

    Continued- basically gimping my own business, do I have that right? Yes? Ok cool, just making sure.
  • eachus - Thursday, July 15, 2021 - link

    There is a compelling use case that builders of military systems will be aware of. If you have an in-memory database and need real-time performance, this is your chip. Real-time doesn't mean really fast, it means that the performance of any command will finish within a specified time. So copy the database on initialization into the L3 cache, and assuming the process is handing the data to another computer for further processing, the data will stay in the cache. (Writes, of course, will go to main memory as well, but that's fine. You shouldn't be doing many writes, and again the time will be predictable--just longer.)

    I've been retired for over a decade now, so I don't have any knowledge of systems currently being developed.

    Who would use a system like this? A good example would be a radar recognition and countermeasures database. The fighter (or other aircraft) needs that data within milliseconds, microseconds is better.
  • hobbified - Thursday, August 19, 2021 - link

    At the time I was involved in that (~2010) it was per-core, with multiple cores on a package counting as "half a CPU" — that is, 1 core = 1CPU license, two 1-core packages = 2CPU license, one 2-core package = 1CPU license, 4 cores total = 2CPU license, etc.

    I'm told they do things in a completely different (but no less money-hungry) way these days.
  • lemurbutton - Friday, June 25, 2021 - link

    Can we get some metrics on $/performance as well as power/performance? I think the Altra part would be better value there.
  • schujj07 - Friday, June 25, 2021 - link

    "Database workloads are admittedly still AMD’s weakness here, but in every other scenario, it’s clear which is the better value proposition." I find this conclusion a bit odd. In MultiJVM max-jOPS the 2S 24c 7443 has ~70% the performance of the 2S 40c 8380 (SNC1 best result) despite having 60% the cores of the 8380. In the critical-jOPS the 7443's performance is between the 8380's SNC1 & SNC2 results despite the core disadvantage. To me that means that the DB performance of the Epyc isn't a weakness.

    I have personally run the SAP HANA PRD performance test on Epyc 7302's & 7401's. Both CPUs passed the SAP HANA PRD performance test requirements on ESXi 6.7 U3. However, I do not have scores from Intel based hosts for comparison of scores.
  • schujj07 - Friday, June 25, 2021 - link

    The DB conclusion also contradicts what I have read on other sites. https://www.servethehome.com/amd-epyc-7763-review-... Look at the MariaDB numbers for explanation of what is being analyzed. Their 32c Epyc 7543P vs Xeon 6314U is also a nice core count vs core count comparison. https://www.servethehome.com/intel-xeon-gold-6314u... In that the Epyc is ~20%+ faster in Maria than the Xeon.
