Core-to-Core Latency

As core counts in modern CPUs grow, the time it takes one core to reach another is no longer constant. Even before the advent of heterogeneous SoC designs, processors built on large rings or meshes could show a different latency to the nearest core than to the furthest one. This holds especially true in multi-socket server environments.

But modern CPUs, even desktop and consumer parts, can have variable access latency between cores. For example, in the first-generation Threadripper CPUs, we had four chips on the package, each with 8 threads, and each with a different core-to-core latency depending on whether the access was on-die or off-die. This gets more complex with products like Lakefield, which has two different communication buses depending on which core is talking to which.

If you are a regular reader of AnandTech’s CPU reviews, you will recognize our Core-to-Core latency test. It’s a great way to show exactly how groups of cores are laid out on the silicon. This is a custom in-house test, and we know there are competing tests out there, but we feel ours most accurately reflects how quickly an access between two cores can happen.
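The in-house test itself is not published on this page. As a rough illustration of the general idea only, a common way to estimate core-to-core latency is a "ping-pong": two threads bounce a value through one shared cache line and the elapsed time is divided by the number of hand-offs. The sketch below is an assumption-laden example, not the AnandTech tool; the function name `ping_pong_latency_ns` is hypothetical, and thread pinning is deliberately left out to keep it portable.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper (not AnandTech's test): bounces a counter between two
// threads through one shared cache line and returns the average one-way
// hand-off time in nanoseconds. Threads are NOT pinned here, so the result
// describes whichever pair of cores the OS scheduler happened to pick.
double ping_pong_latency_ns(std::uint64_t round_trips) {
    assert(round_trips > 0);

    // Shared counter on its own 64-byte-aligned slot: odd values are "pings"
    // from the initiator, even values are "pongs" from the responder.
    alignas(64) std::atomic<std::uint64_t> flag{0};
    const std::uint64_t total = round_trips + 1;  // exchange 0 is an untimed warm-up

    std::thread responder([&flag, total] {
        for (std::uint64_t i = 0; i < total; ++i) {
            while (flag.load(std::memory_order_acquire) != 2 * i + 1) {
                // spin until the ping for exchange i arrives
            }
            flag.store(2 * i + 2, std::memory_order_release);
        }
    });

    std::chrono::steady_clock::time_point start;
    for (std::uint64_t i = 0; i < total; ++i) {
        if (i == 1) {
            // The warm-up exchange is done, so the responder thread is running.
            start = std::chrono::steady_clock::now();
        }
        flag.store(2 * i + 1, std::memory_order_release);
        while (flag.load(std::memory_order_acquire) != 2 * i + 2) {
            // spin until the pong for exchange i comes back
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    responder.join();

    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    // Each timed round trip is two hand-offs of the cache line.
    return total_ns / (2.0 * static_cast<double>(round_trips));
}
```

Calling `ping_pong_latency_ns(1000000)` from a program built with something like `g++ -O2 -pthread` gives one averaged figure. To build a per-pair matrix like the one in the chart, each thread would additionally need to be pinned to a chosen logical core (for example with `pthread_setaffinity_np` on Linux or `SetThreadAffinityMask` on Windows) and the measurement repeated for every pair.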



Looking at core-to-core latencies going from Alder Lake (12th Gen) to Raptor Lake (13th Gen), things look quite similar on the surface. The P-cores are listed within Windows 11 as logical cores 0 to 15, and latencies are much the same as what we saw when we reviewed the Core i9-12900K last year. The same comments apply here as with the Core i9-12900K, as we again see more of a bi-directional cache coherence.

Latencies between Raptor Cove cores have actually improved compared with the Golden Cove cores on Alder Lake, dropping from 4.3/4.4 ns to 3.8/4.1 ns per L1 access point.
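To put those figures in relative terms, the drop can be expressed as a percentage of the older value. The small sketch below does that arithmetic; the helper name `percent_reduction` is hypothetical, and pairing the lower figures (4.3 to 3.8 ns) and the higher figures (4.4 to 4.1 ns) with each other is an assumption about how the two ranges line up.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: percentage by which a latency dropped
// when going from old_ns to new_ns.
double percent_reduction(double old_ns, double new_ns) {
    assert(old_ns > 0.0);
    return (old_ns - new_ns) / old_ns * 100.0;
}
```

Under that pairing, `percent_reduction(4.3, 3.8)` works out to roughly 11.6% and `percent_reduction(4.4, 4.1)` to roughly 6.8%.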

The biggest difference is the doubling of the E-cores (Gracemont) on the Core i9-13900K, which adds more paths and crossovers as a consequence. These paths come with a harsher latency penalty than we saw with the Core i9-12900K: latencies between the E-cores range from 48 to 54 ns within four core jumps, which is actually slower than it was on Alder Lake.

One possible reason for the latency regression is the 200 MHz reduction in base frequency on the Gracemont cores on Raptor Lake compared with Alder Lake. When the E-cores (Gracemont) communicate with each other, the traffic travels through the L2 cache clusters via the L3 cache ring and back again, which does seem quite an inefficient way to go.


169 Comments


  • OreoCookie - Tuesday, October 25, 2022

    Yes, TDP has a meaning, and technically, neither company is using it correctly. Back in the good-ol’ days when TDP was really max power under load, it easily allowed you to spec a cooler. Clock boosts were meant to be temporary, transient states so that *on average*, you’d still lie within the thermal budget of the cooler. Obviously, we are well past that.

    So yes, AMD is playing it a bit loose (+31 %). But Intel is playing it ridiculous: the i9’s max power (as tested here) is 2.7x (!) their “TDP”.
  • shaolin95 - Thursday, October 20, 2022

    AMD does the same thing. Don't be a fanboy.
  • yh125d - Thursday, October 20, 2022

    If you're equating AMD going ~50 W over TDP to Intel going 210 W over TDP, you're being the fanboy.
  • Yojimbo - Friday, October 21, 2022

    AMD's turbo clocking is more than 50W.
  • Yojimbo - Friday, October 21, 2022

    I checked and it's 60 W. That doesn't make AMD "less dishonest". Neither company is being dishonest. It means AMD does not intend their desktop products to be used in lower power products. If you want to design a product around a Ryzen 7950X you need a 170 W cooling solution. Whereas you can put an i9 13900K in a product that can only dissipate 125 W. That's the difference between the two processors in terms of the TDPs. That's what TDP means.
  • Truebilly - Friday, October 21, 2022

    I'd like to see someone run that 13900K with a 120 mm rad.
  • Wrs - Friday, October 21, 2022

    I mean, it works. The processor automatically steps down the v/f curve and doesn't hiccup with a puny cooler good for 140-ish W. I tested a 12900K with a low-profile AXP-200 from my Skylake days. Performance wasn't bad, over 4 GHz on all 16 cores. I left all the OC settings on, or else stock E-cores would be 3.9 GHz.
  • nandnandnand - Thursday, October 20, 2022

    Go look at some efficiency curves for the 7950X and 13900K, for example at 19:00 in Hardware Unboxed's review: https://www.youtube.com/watch?v=P40gp_DJk5E
  • Yojimbo - Friday, October 21, 2022

    None of the companies "do" anything here. The "doing" is by the people who, though they are ignorant, write seething rants in comment sections damning the companies.
  • bji - Friday, October 21, 2022

    This issue would be a lot less contentious if technical sites like AnandTech actually used their expertise to curate the information presented. They just shouldn't even show TDP, as it's simply not relevant to the end users who are reading the articles. They should have some standard benchmark they run to determine peak and maximum sustained power draws and show ONLY those values in any charts.
