CPU Tests: SPEC MT Performance - P and E-Core Scaling

Update Nov 6th:

We’ve finished our MT breakdown for the platform, investigating the various combinations of cores and memory configurations for Alder Lake and the i9-12900K. We're posting the detailed scores for the DDR5 results, along with the aggregate results for DDR4.

The results here solely cover the i9-12900K and various combinations of MT performance, such as 8 E-cores, 8 P-cores with 1T as well as 2T, and the full 24T 8P2T+8E scenario. The tests were run on Linux because it offers an easier way to set affinities to the various cores. They’re not completely comparable to the WSL results on the previous page, but should be within small margins of error for most tests.
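As a rough illustration of what setting affinities for these configurations could look like on Linux, here is a minimal Python sketch. It is not the harness used for these results. The logical CPU numbering (P-core SMT siblings as adjacent IDs 0-15, E-cores as 16-23) is an assumption about how Linux typically enumerates this chip and should be checked with `lscpu`; the names `cpus_for` and `pin_current_process` are hypothetical.

```python
import os

# ASSUMPTION: logical CPU layout on an i9-12900K under Linux.
# P-cores are taken to expose two SMT siblings with adjacent IDs (0-1, 2-3, ...),
# and the 8 E-cores are taken to be CPUs 16-23. Verify with `lscpu --extended`.
P_CORE_THREADS = [(2 * i, 2 * i + 1) for i in range(8)]
E_CORES = list(range(16, 24))


def cpus_for(config):
    """Return the set of logical CPUs for one of the tested configurations."""
    p_first_threads = {first for first, _ in P_CORE_THREADS}
    p_all_threads = {cpu for pair in P_CORE_THREADS for cpu in pair}
    layouts = {
        "8E": set(E_CORES),                      # E-cores only
        "8P1T": p_first_threads,                 # one thread per P-core
        "8P2T": p_all_threads,                   # both SMT threads per P-core
        "8P2T+8E": p_all_threads | set(E_CORES)  # full 24-thread chip
    }
    return layouts[config]


def pin_current_process(config):
    """Restrict the calling process (pid 0) to the chosen CPUs. Linux-only."""
    os.sched_setaffinity(0, cpus_for(config))


if __name__ == "__main__":
    for name in ("8E", "8P1T", "8P2T", "8P2T+8E"):
        print(name, sorted(cpus_for(name)))
```

Child processes inherit the affinity mask, so calling `pin_current_process("8E")` before launching a benchmark would keep all of its threads on the E-cores under the assumed layout.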

SPECint2017 Rate-N Estimated Scores (i9-12900K Scaling)

In the integer suite, the E-cores are quite powerful, reaching scores of around 50% of the 8P2T results, or more.

Many of the more core-bound workloads clearly benefit from simply having more cores available, and these are also the workloads that gain the most performance when we add 8 E-cores on top of the 8P2T results.

Workloads that are more cache-heavy, or that rely on memory bandwidth, both shared resources on the chip, don’t scale too well at the top end when adding the 8 E-cores. Most surprising to me was the 502.gcc_r result, which barely saw any improvement with the added 8 E-cores.

It is not surprising that more memory-bound workloads such as 520.omnetpp or 505.mcf do not scale with the added E-cores; mcf even sees a performance regression, as the added cores mean more contention on the L3 and memory controllers.

SPECfp2017 Rate-N Estimated Scores (i9-12900K Scaling)

In the FP suite, the E-cores more clearly showcase a lower percentage of performance relative to the P-cores, and this makes sense given their design. Only a few of the more compute-bound tests, such as 508.namd, 511.povray, or 538.imagick, see larger contributions from the E-cores when they’re added on top of the P-cores.

The FP suite also has a lot more memory-hungry workloads. When it comes to DRAM bandwidth, having either E-cores or P-cores doesn’t matter much for the workload, as it’s the memory which is the bottleneck. Here, the E-cores are able to achieve extremely large performance figures relative to the P-cores. 503.bwaves and 519.lbm, for example, are purely DRAM bandwidth limited, and using the E-cores in MT scenarios allows for similar performance to the P-cores, but at only 35-40W package power, versus 110-125W for the P-cores result set.
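To put the package power figures above in perspective, the following short Python sketch works out the implied performance-per-watt advantage for the bandwidth-bound tests. It assumes the E-core and P-core scores are exactly equal ("similar performance" in the text), which is a simplification; the function name `efficiency_ratio` is hypothetical.

```python
def efficiency_ratio(p_power_w, e_power_w, relative_perf=1.0):
    """Perf/W of the E-core run divided by perf/W of the P-core run.

    relative_perf is E-core score / P-core score.
    ASSUMPTION: 1.0, i.e. identical scores on DRAM-bound tests.
    """
    return relative_perf * p_power_w / e_power_w


# Package power ranges quoted in the text: 35-40W (E-cores), 110-125W (P-cores).
low = efficiency_ratio(p_power_w=110, e_power_w=40)   # most conservative pairing
high = efficiency_ratio(p_power_w=125, e_power_w=35)  # most favourable pairing

print(f"E-core perf/W advantage: {low:.2f}x to {high:.2f}x")
```

Under that equal-score assumption, the E-core runs come out at roughly 2.75x to 3.6x the performance per watt of the P-core runs on these DRAM-limited workloads.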

Some of these workloads also see regressions in performance when adding more cores or threads, as it just means more memory traffic contention on the chip, as seen in the regressions of the 8P2T+8E and 8P2T results relative to the 8P1T results.

SPEC2017 Rate-N Estimated Total (i9-12900K Scaling)

What’s most interesting here is the scaling of performance and the attribution between the P-cores and the E-cores. Focusing on the DDR5 set, the 8 E-cores are able to provide around 52-55% of the performance of 8 P-cores without SMT, and 47-51% of the P-cores with SMT. At first glance, one could argue that the 8P+8E setup is somewhat similar to a 12P setup in MT performance. However, the combined performance of both clusters only raises the MT scores by 25% in the integer suite and 5% in the FP suite, as we are hitting near package power limits with just 8P2T, and there are diminishing returns on performance given the shared L3. What the E-cores do seem to allow is a reduction in everyday average power usage and an increase in the efficiency of the socket, as fewer P-cores need to be active at any one time.
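The gap between the "12P" intuition and the measured uplift can be made explicit with a little arithmetic. The Python sketch below uses only the percentages quoted above. Applying the full 47-51% range to both suites is a simplification, since the text does not say which end of the range belongs to which suite, and the function names are hypothetical.

```python
# Figures quoted in the text (DDR5 set).
E_SHARE_OF_8P1T = (0.52, 0.55)   # 8 E-cores relative to 8 P-cores without SMT
E_SHARE_OF_8P2T = (0.47, 0.51)   # 8 E-cores relative to 8 P-cores with SMT
OBSERVED_UPLIFT = {"int": 0.25, "fp": 0.05}  # 8P2T+8E gain over 8P2T


def naive_p_core_equivalent(e_share_of_8p1t, p_cores=8):
    """P-core count implied if cluster throughputs simply added up."""
    return p_cores * (1.0 + e_share_of_8p1t)


def realized_fraction(observed_uplift, e_share_of_8p2t):
    """Share of the E-cores' standalone throughput that shows up in the combined run."""
    return observed_uplift / e_share_of_8p2t


for share in E_SHARE_OF_8P1T:
    print(f"naive equivalent: {naive_p_core_equivalent(share):.1f} P-cores")

for suite, uplift in OBSERVED_UPLIFT.items():
    # ASSUMPTION: the same 47-51% range is applied to both suites.
    worst = realized_fraction(uplift, max(E_SHARE_OF_8P2T))
    best = realized_fraction(uplift, min(E_SHARE_OF_8P2T))
    print(f"{suite}: {worst:.0%} to {best:.0%} of standalone E-core throughput realized")
```

In other words, simple addition would suggest a bit over 12 P-cores' worth of throughput, yet only about half of the E-cores' standalone contribution materializes in the integer suite and about a tenth in the FP suite, which is consistent with the power-limit and shared-L3 explanation above.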

472 Comments

  • 5j3rul3 - Thursday, November 4, 2021

    Great step for intel
  • Bobbyjones - Thursday, November 4, 2021

    Indeed. Biggest improvements since sandybridge. If you look at the timeline, this wouldve been the first CPU designed since they saw Zen 1. This is their Zen 1 moment and they already took the performance crown back basically across the board and at a lower price. AMD is now on the back foot, and it will be another whole year before Zen 4, and the thing is, Zen 4 isnt even competing with Alder Lake, Raptor Lake is rumored to be out before Zen 4. AMD has really screwed up with their launch cycle and given Intel so much room that they not only caught back up but beat them. Intel is truly back.
  • Netmsm - Thursday, November 4, 2021

    For now Threadripper has the performance crown.
    With this performance per watt, Intel can just win the market for PCs.
    Enterprise will never accept this performance per watt! So, AMD wins the high profitable enterprise market.
    12900k guzzles power up to 241! whereas 5950x consumes half!

    Considering power consumption, it's like a Pyrrhic victory for Intel.
  • fazalmajid - Thursday, November 4, 2021

    The HEDT market in Enterprise is workstations, which run certified apps like AutoCAD and has a lot of inertia. The first real Zen workstation is the Lenovo P620 and it only recently came out, so AMD hasn't conquered that market yet. Most actual Enterprise desktops are compact models that typically run on laptop CPUs.
  • DominionSeraph - Friday, November 5, 2021

    And Intel has AMD beat for miles in system validation.
    My 3950X on a x570 Phantom Gaming X has major issues with disk access across one NVMe, one SATA SSD, and two HDDs. Some things will start up fine, but some things will just HANG. Deus Ex loading screens take like 10 seconds. I just tried to play a video off my NVMe and it took ~15 seconds for it to launch MPC-HC. (further launches are fine.) MeGUI takes 15 seconds to launch.
    This thing is just frustratingly slow in general desktop tasks compared to my old i7 4790.
    Does it beat the pants off the 4790 in heavily multithreaded crunching? Yes. But iAMD does not put out a quality product.
  • Gothmoth - Friday, November 5, 2021

    anecdotal evidence? ....YOU have issues with your system.
    well we have 16 core ryzen and threadripper 32 & 64 core systems at work and we can´t complain.
    it´s not as if intel is issue free (and i am not taking about security flaws).

    when you have such grave issues.. YOUR system has issues.
    probably a bad setup. i did not hear that starting MPC needs 15 seconds when i read abourt AMD systems.
  • dotjaz - Sunday, November 7, 2021

    What about USB issues that are publicly acknowledged AND multiple BIOSes claim to have fixed it, yet here we are.
  • Netmsm - Friday, November 5, 2021

    It is your problem not AMD nor Intel!
    This is why we always refer to QVL of MB before buying RAM, SSD, etc. to avoid such problems. It is not AMD prerogative rather it is for all platforms.
    For now you may better update MB bios as soon as it is released. To solve the problem completely you need to reassemble it according to the MB's QVL.
  • DominionSeraph - Friday, November 5, 2021

    It is an AMD issue. I've put together hundreds of Intel systems and none of them have any issues.
  • Netmsm - Friday, November 5, 2021

    When you face abnormality just put your cards on the table and ask a pro.
