Core-to-Core Latency

As the core counts of modern CPUs grow, the time it takes one core to access another is no longer a constant. Even before the advent of heterogeneous SoC designs, processors built on large rings or meshes could have different latencies to access the nearest core compared to the furthest core. This holds true especially in multi-socket server environments.

But modern CPUs, even desktop and consumer CPUs, can have variable access latency to get to another core. For example, in the first generation Threadripper CPUs, we had four chips on the package, each with 8 threads, and each with a different core-to-core latency depending on whether the other core was on-die or off-die. This gets more complex with products like Lakefield, which has two different communication buses depending on which core is talking to which.

If you are a regular reader of AnandTech’s CPU reviews, you will recognize our Core-to-Core latency test. It’s a great way to show exactly how groups of cores are laid out on the silicon. This is a custom in-house test, and we know there are competing tests out there, but we feel ours most accurately reflects how quickly an access between two cores can happen.


(Lots of cores and threads = lots of core pairings.)
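
To put the number of pairings in perspective, here is a minimal Python sketch that enumerates them for a 16-core, 32-thread part such as the 7950X. It assumes each unordered pair of logical threads is measured once, which is an assumption made for illustration and not a description of how the in-house test is implemented; the helper name `thread_pairs` is hypothetical.

```python
from itertools import combinations


def thread_pairs(num_threads):
    """Return every unordered pair of distinct logical-thread indices."""
    return list(combinations(range(num_threads), 2))


# 16 cores x 2 threads per core = 32 logical threads
pairs = thread_pairs(32)
print(len(pairs))  # 496
```

The count grows as n(n-1)/2, which is why charts for high core count parts become so dense.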

Comparing core-to-core latencies from Zen 4 (7950X) and Zen 3 (5950X), both use a two-CCX, 8-core chiplet design, which is a marked improvement over the four-CCX, 16-core design featured on the Zen 2-based Ryzen 9 3950X. Inter-core latencies between cores that share an L3 cache range from 15 ns to 19 ns. Latencies between cores on different CCDs show a larger penalty of up to 79.5 ns, which is something AMD should work on going forward, but it's an overall improvement in cross-CCX latencies compared to Zen 3. Any gain is still a gain.
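
One way to read a latency matrix like this is to group cores whose pairwise latency falls below a cutoff, so that each group approximates a set of cores sharing an L3 cache. The Python sketch below is illustrative only and is not the test used in this review: the function name `group_cores`, the 30 ns cutoff, and the 4-core matrix are all made up for the example, with values chosen merely to sit in the ranges quoted above.

```python
def group_cores(latency_ns, cutoff_ns):
    """Group core indices so that cores linked by a sub-cutoff latency share a group.

    latency_ns is a square, symmetric matrix of pairwise latencies in nanoseconds.
    Uses a small union-find so that groups are transitive.
    """
    n = len(latency_ns)
    parent = list(range(n))

    def find(i):
        # Walk up to the group's representative, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(n):
        for b in range(a + 1, n):
            if latency_ns[a][b] < cutoff_ns:
                parent[find(b)] = find(a)

    groups = {}
    for core in range(n):
        groups.setdefault(find(core), []).append(core)
    return sorted(groups.values())


# Hypothetical 4-core matrix (NOT measured data): cores 0-1 and 2-3 are "near",
# everything else crosses to the other chiplet.
example = [
    [0.0, 17.0, 78.0, 79.5],
    [17.0, 0.0, 77.0, 78.5],
    [78.0, 77.0, 0.0, 16.0],
    [79.5, 78.5, 16.0, 0.0],
]
print(group_cores(example, cutoff_ns=30.0))  # [[0, 1], [2, 3]]
```

The cutoff is a judgment call: anything comfortably between the same-L3 figure and the cross-chiplet figure separates the two groups in this toy case.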

AMD has opted for a newer and more 'efficient' IOD, which is based on TSMC's 6 nm node. It is around the same physical size as the previous AMD IOD on Zen 3, which was manufactured on GlobalFoundries' 12 nm node, but it has a much larger transistor count. Within the IOD is the newly integrated RDNA 2 graphics, although this isn't a typical iGPU in the sense that an APU's is. A lot of the room on the IOD is taken up by the DDR5 memory controller (IMC) and the chip's PCIe 5.0 lanes, and of course, the IOD connects to the logic through AMD's primary interconnect, Infinity Fabric. All of these variables play a part in power, latency, and operation.


AMD Ryzen 9 5950X Core-to-Core Latency results

It's actually astounding how similar the latency performance of the Ryzen 9 7950X (Zen 4) is to that of the Ryzen 9 5950X (Zen 3), despite the move to TSMC's new 5 nm manufacturing process. Even with a change of IOD, the interconnect is the same, and inter-core latencies on the Ryzen 9 7950X are great between cores within the same core complex. Latency does degrade when pairing up with a core in another chiplet, but this works, and AMD's Ryzen 5000 series proved that the overall performance penalty is negligible.

205 Comments

  • emn13 - Monday, September 26, 2022

    The Geekbench 4 ST results for the 7600X seem very low - is that benchmark result borked, or is there really something weird going on?
  • emn13 - Monday, September 26, 2022

    Sorry, I meant the Geekbench 4 MT not ST results. The score trails way behind even the 3600XT.
  • Silver5urfer - Monday, September 26, 2022

    Good write up.

    First, I would humbly request that you include older Intel processors in your suite, e.g. the old 9th gen and 10th gen; it would make the relative gains easier to understand. Results are all over the place on other sites, while AT is at least consistent, so it would be better to have a ton of CPUs in one reliable spot. Thanks

    Now speaking about this launch.

    The IOD is now improved by a huge factor, so no more of that IF clock messing with the I/O controller and the high voltage like on Zen 3. It's all improved, so I think the USB fallout issues are fixed on this platform now. Plus the DP2.0 on the iGPU is a dead giveaway that RDNA3 will have DP2.0 as well.

    The IMC is also improved. Looking at it, AMD used to operate with clocks synchronized with DRAM; now they can do without that, since IF is at 2000MHz and the IMC and DRAM are higher at 3000MHz to match the DDR5 data rates. Plus EXPO is also lower latency; however, the MCM design causes the AIDA benchmark to show high latency vs Intel, even though Intel is operating at a Gear 2 ratio with a similarly decoupled Uncore. Surprisingly the inter-core latencies did not change much; maybe that's one of the keys to improving more on the AMD side. Gotta see what they will do for Zen 5.

    The CPU clocks are insane, 5GHz on all 16C32T is a huge thing, plus even the 7600X is hitting 5.4GHz. Massive boost from AMD improving their design, plus the TSMC5N High Performance node is too good. However AMD did axe their temp and power limits. It's a very good move to not castrate the CPU with power limits and clocks; now that that's out, it gets to spread its wings. But the downside is, unlike Intel i7 series, Ryzen 6 also gets hot, meaning the budget buyers need to invest money in an AIO vs older Zen 3 being fine on air. That's a negative effect for AMD when they removed the power limits like Intel and let these rip to 250W.

    The chipset downlink capping at PCIe4.0x4 is the biggest negative I can think of, because Intel DMI is now 4.0x8 on ADL and RPL; RKL had it at 3.0x8, CML at 3.0x4. AMD is stuck at 4.0x4 from X570. Many will not even care, but it is a disadvantage when you pay top money for X670E. They should have given us PCIe5.0x4; AMD will give that in 2024 with the Zen 5 X770 chipset, that's my guess.

    The ILM backplate engineering is solid; that alone, plus the LGA1718 AM5 longevity itself, is a major PLUS for AMD over LGA1700's bending ILM and EOL by 13th gen. Yes, the 12th gen is a better purchase given how the cooling requirement for i7 and i5 is not as high as for R6 and R7, and the board costs are cheaper, plus 13th gen is coming and AMD's platform is new as well, so you would be a guinea pig. Depends on what people want, how much they can spend, and what they want in longevity.

    Performance is top notch for 7600X and 7950X absolute sheer dominance but the pricing is higher when you see the % variance vs Zen 3 and Intel 12th gen parts, and added AIO mandatory because they are hot. The gaming performance is as expected not much to see here and the 5800X3D still is a contender there but to me that chip is worthless as it cannot match any processor in high core count workloads. Although 7600X is a champion 6C12T and it beats 12C24T in many things and the 10C20T 10th gen Intel too. IPC is massive in ST and MT workloads as expected. AMD Zen 4 will decimate ARM, Apple has only one thing lol muh efficiency all that BGA baggage, locked down ecosystem is free.

    RPCS3 perf in TPU's Red Dead Redemption test is weird, as I do not see any gains over Intel, given how much of a beast AVX512 is on Zen 4 with 2x256Bit and without an AVX offset too; maybe they are not using AVX512. Plus their AMD Zen 3 gauging is also bad, because those parts do not do well even vs Intel 9th gen. I wish you guys would cover Dolphin emu, PCSX2, RPCS3 and Switch emulators.

    I think the best option is to wait for next year and buy these parts then, as they will drop. Right now there is no PCIe5.0 SSD in high capacity and no PCIe5.0 GPU; even Nvidia skimped on it. No use for the new platform unless one is running a super damn old CPU and GPU setup.

    Shame that OC is totally dead. Zen 3 was hamfisted with its Curve Optimizer, and memory tuning became a headache due to how AGESA was handled, the 1.4v high voltage, and the lack of documentation. On Zen 4, even at 1.0-1.2v, there's still no OC because AMD's design is basically now pushed to the maximum with its Core TJMax temps and how it works on the basis of core temperatures over everything else. There's no room here; an AIO is saturated at 90C. Too high heat density on the AMD side, similar to Intel 11th and 12th gen, although Intel can go up to 350W and hit all cores higher vs AMD's 250W max. Well, OC was on life support; only Intel is basically keeping it alive at this point. After 10th gen it became worse, 12th is very hot, and now with 13th we gotta see if that DLVR regulator helps or not.

    All in all a good CPU but has some downsides to it. Not much worth for existing 2020 class HW folks at all. Better wait when DDR5 matures even further and more PCIe5.0 becomes prevalent.
  • Threska - Monday, September 26, 2022

    Maybe people will start delidding.

    https://youtu.be/y_jaS_FZcjI
  • Silver5urfer - Tuesday, September 27, 2022

    That delid is direct-die; it will 100% ruin the AM5 socket for longevity and the whole CPU too. That guy runs HWBot, ofc he will make a video on his bs delid kits. Nobody should run any CPU with the IHS completely blown off. You will have a ton of issues with that: water leaks, CPU silicon die cracks due to thermodynamics and the pressure differences over time, liquid metal leaks. Total bust of warranty on any parts; once that LM drops on your machine, it's game over for a $5000 rig right there.

    AMD should have done some more improvements and reduced the max TJ Max to say 90 at least, but it is what it is, unfortunately (for high temps and cooling requirements) and fortunately (to have super high performance).
  • Threska - Tuesday, September 27, 2022

    There are some in the comments wondering both whether lapping would achieve the same and whether the thicker lid is there to give some room for future additions like 3D cache, etc.
  • abufrejoval - Wednesday, September 28, 2022

    I'm not sure that PCIe 4.0 "DMI" downlink capping is a hard cap per se by the SoC, but really the result of negotiations with the ASMedia chipset, which can't do better. I'd assume once someone comes up with a PCIe 5.0 chipset/switch, there is no reason it won't do PCIe 5.0. It's just a bunch of 4 lanes that happen to be connected to ASMedia PCIe 4.0 chips on all current mainboards.

    Likewise I don't see why you couldn't add the second chipset/switch to the "NVMe" port of the SoC or any of the bifurcated slots: what you see is motherboard design choices not Ryzen 7000 limitations. That just has 24 PCIe 5.0 lanes to offer in many bundle variants. It's the mainboard that straps all that flexibility to slots and ports.

    I don't see that you have to invest into AIO coolers, *unless* you want/need top clocks on all cores. E.g. if your workloads are mixed, e.g. a few threads that profit from top clocks for interactive workloads (including games) and others that are more batch oriented like large compiles or renders, you may get maximum personal value even from an air cooler that only handles 150 Watts.

    Because the interactive stuff will rev to 5.crazy clocks on say 4-8 cores, while for the batch stuff you may not wait in front of the screen anyway (or do other stuff while it's chugging in the background). So if it spends 2 extra hours on a job that might take 8 hours on AIO, that may be acceptable if it saves you from putting fluids into your computer.

    In a way AMD is now giving you a clear choice: The performance you can obtain from the high-end variants is mostly limited by the amount of cooling you want to provide. And as a side effect it also steers the power consumption: you provide 150 Watts worth of cooling, it won't consume more except for short bursts.

    In that regard it's much like a 5800U laptop, that you configure between say 15/28/35 Watts of TDP for distinct working points in terms of power vs. cooling/noise (and battery endurance).

    Hopefully AMD will provide integration tools on both Windows and Linux to check/measure/adjust the various power settings at run-time, so you can adjust your machine to your own noise/heat/performance bias, depending on the job it's running.
  • Dug - Monday, September 26, 2022

    "While these comments make sense, ultimately very few users apply memory profiles (either XMP or other) as they require interaction with the BIOS"

    This is getting so old. Your assumption is incorrect which should be obvious by the millions of articles and youtube videos on building computers. Not to mention your entire article is not even directed to "general public" but to enthusiasts. Otherwise why write out this entire article? Just say you put a cpu in a motherboard and it works. Say it's fast. Article done.

    Why not test with Curve Optimizer?
  • Oxford Guy - Tuesday, September 27, 2022

    This text appears again and again for the same reason Galileo was placed under house arrest.
  • socket420 - Monday, September 26, 2022

    Could someone, preferably Ryan or Gavin, please elaborate on what this sentence - "the new chip is compliant with Microsoft’s Pluton initiative as well" - actually means? This is the only review I could find that mentions Pluton in conjunction with desktop Zen 4 at all, but merely saying it's "compliant" is a weird way of wording it. Is Pluton on-die and enabled by default in Ryzen 7000 desktop CPUs?