SPEC2017 Single-Threaded Results

SPEC2017 is a suite of standardized tests used to compare performance across different systems, architectures, microarchitectures, and setups. The code has to be compiled, and the results can then be submitted to an online database for comparison. It covers a range of integer and floating-point workloads and can be heavily optimized for each CPU, so it is important to check how the benchmarks are being compiled and run.

We run the tests in a harness built through Windows Subsystem for Linux, developed by Andrei Frumusanu. WSL has some odd quirks, with one test not running due to a WSL fixed stack size, but for like-for-like testing it is good enough. Because our scores aren’t official submissions, as per SPEC guidelines we have to declare them as internal estimates on our part.

For compilers, we use LLVM for both the C/C++ and Fortran tests; for Fortran, we’re using the Flang compiler. The rationale for using LLVM over GCC is better cross-platform comparisons with platforms that only have LLVM support, as well as future articles where we’ll investigate this aspect more. We’re not considering closed-source compilers such as MSVC or ICC.

clang version 10.0.0
clang version 7.0.1 (ssh://git@github.com/flang-compiler/flang-driver.git 24bd54da5c41af04838bbe7b68f830840d47fc03)

-Ofast -fomit-frame-pointer
-march=x86-64
-mtune=core-avx2
-mfma -mavx -mavx2

Our compiler flags are straightforward, with basic -Ofast and relevant ISA switches to allow for AVX2 instructions.
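As a rough illustration of how the switches listed above combine into a single compiler invocation, here is a short Python sketch that assembles an argument list. The helper name `build_compile_command` and the file names are hypothetical; only the flags themselves come from our configuration, and SPEC's own tooling normally drives compilation through its config files rather than a script like this.

```python
import shlex

# Flags as listed in this article; grouping them by purpose is our own framing.
OPT_FLAGS = ["-Ofast", "-fomit-frame-pointer"]
ARCH_FLAGS = ["-march=x86-64", "-mtune=core-avx2"]
ISA_FLAGS = ["-mfma", "-mavx", "-mavx2"]


def build_compile_command(source, output, compiler="clang"):
    """Return the argv list for compiling one source file (hypothetical helper)."""
    return [compiler, *OPT_FLAGS, *ARCH_FLAGS, *ISA_FLAGS, "-c", source, "-o", output]


# "example.c" and "example.o" are placeholder names, not SPEC files.
command = build_compile_command("example.c", "example.o")
print(shlex.join(command))
```

Keeping the flags in named groups makes it easy to see which switches control optimization level, which set the baseline and tuning target, and which enable specific ISA extensions.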

To note, the requirements of the SPEC licence state that any benchmark results from SPEC have to be labeled ‘estimated’ until they are verified on the SPEC website as a meaningful representation of the expected performance. This is most often done by the big companies and OEMs to showcase performance to customers; however, it is quite over the top for what we do as reviewers.

SPECint2017 Rate-1 Estimated Scores

Opening things up with SPECint2017 single-threaded performance, it's clear that Intel has improved ST performance for Raptor Lake on a generation-over-generation basis. Because the Raptor Cove P-cores used here don't deliver significant IPC gains, these performance gains are primarily being driven by the chip's higher frequency. In particular, Intel has made notable progress in improving its v/f curve, which allows it to squeeze out more raw frequency.

And this is something Intel's own data backs up, with one of Intel's performance breakdown slides showing that the bulk of the gains are due to frequency, while improved memory speeds and the larger caches make only small contributions.

The ST performance itself in SPECint2017 is marginally better going from Alder Lake to Raptor Lake, and these differences can be explained by the improvements highlighted above. What's interesting is that the performance gap between the Core i9-13900K and the Ryzen 9 7950X isn't as wide as it was between Alder Lake and the Ryzen 9 5950X. In 500.perlbench_r, the Raptor Lake chip actually outperforms its Zen 4 rival by just under 4%, while the Ryzen 9 7950X is a smidgen over 10% better in the 505.mcf_r test.
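For clarity on how percentage leads like these are derived from two scores, the arithmetic can be sketched as follows. The function name `percent_lead` is hypothetical and the scores are placeholders chosen only to show the calculation, not measured results.

```python
def percent_lead(score_a, score_b):
    """Percentage by which score_a exceeds score_b (negative if it trails)."""
    if score_b <= 0:
        raise ValueError("scores must be positive")
    return (score_a / score_b - 1.0) * 100.0


# Placeholder scores chosen only to show the arithmetic; not measured results.
chip_a, chip_b = 11.0, 10.0
print(f"Chip A leads chip B by {percent_lead(chip_a, chip_b):.1f}%")
```

Note that the figure depends on which chip is taken as the baseline: a 10% lead for one chip corresponds to roughly a 9.1% deficit for the other.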

SPECfp2017 Rate-1 Estimated Scores

Looking at the second set of SPEC2017 results (fp), the Ryzen 9 7950X is ahead of the Core i9-13900K by 16% in the 503.bwaves_r test, while the Raptor Lake chip is just under 10% ahead in the 508.namd_r test. The key point to digest here is that Intel has done well to close the gap in single-threaded performance to Ryzen 7000 in most of the tests. Overall, which chip leads depends on which test favors which mixture of architecture, frequency, and, most importantly of all, IPC.

In our AMD Ryzen 9 7950X review, we highlighted that chip as the clear leader in single-core performance at the time of publishing, but it seems as though Intel's Raptor Lake is snapping at the heels of the new Zen 4 core. In some instances it's actually ahead, and stiff competition is always good, as competition drives innovation.

With Raptor Lake being more of a transitional, enhanced version of a core design that Intel has worked with before (Alder Lake), it remains to be seen what 2023 holds for Intel's advancement in IPC and single-threaded performance. Right now, however, SPEC paints a picture where it's pretty much neck and neck between Raptor Cove and Zen 4.
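For readers wondering how per-test results roll up into a single figure, SPEC CPU 2017 composite scores are geometric means of the per-benchmark ratios. The following is a minimal sketch of that aggregation; `geometric_mean_score` is a hypothetical helper name, and the ratio values are placeholders rather than our results.

```python
import math


def geometric_mean_score(ratios):
    """Geometric mean of per-benchmark ratios, computed via logs for stability."""
    ratios = list(ratios)
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Placeholder per-benchmark ratios, for illustration only.
example_ratios = {"test_a": 8.0, "test_b": 12.5, "test_c": 10.0}
print(f"Composite estimate: {geometric_mean_score(example_ratios.values()):.2f}")
```

Because a geometric mean is used, a large win in one sub-test moves the composite less than it would with an arithmetic mean, which is part of why two chips that trade blows test by test can end up close overall.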


169 Comments


  • mode_13h - Friday, October 21, 2022 - link

    "The new instruction cache on Gracemont is actually very unique. x86 instruction encoding is all over the place and in the worst (and very rare) case can be as long as 15 bytes long. Pre-decoding an instruction is a costly linear operation and you can’t seek the next instruction before determining the length of the prior one. Gracemont, like Tremont, does not have a micro-op cache like the big cores do, so instructions do have to be decoded each time they are fetched. To assist that process, Gracemont introduced a new on-demand instruction length decoder or OD-ILD for short. The OD-ILD generates pre-decode information which is stored alongside the instruction cache. This allows instructions fetched from the L1$ for the second time to bypass the usual pre-decode stage and save on cycles and power."

    Source: https://fuse.wikichip.org/news/6102/intels-gracemo...
  • Sailor23M - Friday, October 21, 2022 - link

    Interesting to see Ryzen 5 7600X perform so well in excel/ppt benchmarks. Why is that so?
  • Makste - Friday, October 21, 2022 - link

    Thank you for the review. So Intel too, is finally throwing more cores and increasing frequencies to the problem these days, which increases heat and power usage in turn. AMD too, is a culprit of this practice but has not gone to these lengths as Intel. 16 cores versus supposedly efficiency cores. What is not happening?
  • ricebunny - Friday, October 21, 2022 - link

    It would be a good idea to highlight that the MT Spec benchmarks are just N instantiations of the single thread test. They are not indicative of parallel computing application performance. There are a few dedicated SPEC benchmarks for parallel performance but for some reason they are never included in Anandtechs benchmarks.
  • Ryan Smith - Friday, October 21, 2022 - link

    "There are a few dedicated SPEC benchmarks for parallel performance but for some reason they are never included in Anandtechs benchmarks."

    They're not part of the actual SPEC CPU suite. I'm assuming you're talking about the SPEC Workstation benchmarks, which are system-level benchmarks and a whole other kettle of fish.

    With SPEC, we're primarily after a holistic look at the CPU architecture, and in the rate-N workloads, whether there's enough memory bandwidth and other resources to keep the CPU cores fed.
  • wolfesteinabhi - Friday, October 21, 2022 - link

    its strange to me that when we are talking about value ...especially for budget constraint buyers ... who are also willing to let go of bleeding edge/performance ... we dont even mention AM4 platform.

    AM4 is still good ..if not great (not to say mature/stable) platform for many ..and you can still buy a lot of reasonably price good procs including 5800X3D ...and users have still chance to upgrade it upto 5950X if they need more cpu at a later date.
  • cowymtber - Friday, October 21, 2022 - link

    Burning hot POS.
  • BernieW - Friday, October 21, 2022 - link

    Disappointed that you didn't spend more time investigating the serious regression for the 13900K vs the 12900K in the 502.gcc_r test. The single threaded test does not have the same regression so it's a curious result that could indicate something wrong with the test setup. Alternately, perhaps the 13900K was throttling during that part of the test or maybe E cores are really not good at compiling code.
  • Avalon - Friday, October 21, 2022 - link

    I had that same thought. Why publish something so obviously anomalous and not even say anything about it? Did you try re-testing it? Did you accidentally flip the scores between the 12th and 13th gen? There's no obvious reason this should be happening given the few changes between 12th and 13th gen cores.
  • Ryan Smith - Friday, October 21, 2022 - link

    "Disappointed that you didn't spend more time investigating the serious regression for the 13900K vs the 12900K in the 502.gcc_r test."

    We still are. That was flagged earlier this week, and re-runs have produced the same results.

    So at this point we're digging into matters a bit more trying to figure out what is going on, as the cause is non-obvious. I'm thinking it may be a thread director hiccup or an issue with the ratio of P and E cores, but there's a lot of different (and weird) ways this could go.
