The Core Complex, Caches, and Fabric

Many core designs often start with an initial low-core-count building block that is repeated across a coherent fabric to generate a large number of cores and the large die. In this case, AMD is using a CPU Complex (CCX) as that building block which consists of four cores and the associated caches.

Each core will have direct access to its private L2 cache, and the 8 MB of L3 cache is, despite being split into blocks per core, accessible by every core on the CCX with ‘an average latency’ also L3 hits nearer to the core will have a lower latency due to the low-order address interleave method of address generation.

The L3 cache is actually a victim cache, taking data from L1 and L2 evictions rather than collecting data from prefetch/demand instructions. Victim caches tend to be less effective than inclusive caches, however Zen counters this by having a sufficiency large L2 to compensate. The use of a victim cache means that it does not have to hold L2 data inside, effectively increasing its potential capacity with less data redundancy.

It is worth noting that a single CCX has 8 MB of cache, and as a result the 8-core Zen being displayed by AMD at the current events involves two CPU Complexes. This affords a total of 16 MB of L3 cache, albeit in two distinct parts. This means that the true LLC for the entire chip is actually DRAM, although AMD states that the two CCXes can communicate with each other through the custom fabric which connects both the complexes, the memory controller, the IO, the PCIe lanes etc.

 

The cache representation shows L1 and L2 being local to each the core, followed by 8MB of L3 split over several cores. AMD states that the L1 and L2 bandwidth is nearly double that of Excavator, with L3 now up to 5x for bandwidth, and that this bandwidth will help drive the improvements made on the prefetch side. AMD also states that there are large queues in play for L1/L2 cache misses.

One interesting story is going to be how AMD’s coherent fabric works. For those that follow mobile phone SoCs, we know fabrics and interconnects such as CCI-400 or the CCN family are optimized to take advantage of core clusters along with the rest of the chip. A number of people have speculated that the fabric used in AMD’s new design is based on HyperTransport, however AMD has confirmed that they are using a superset HyperTransport here for Zen, and that the Infinity fabric design is meant to be high bandwidth, low latency, and be in both Zen and Vega as well as future products. Almost similar to the CPU/GPU roadmaps, the Fabric has its own as well.

Ultimately the new fabric involves a series of control and data passing structures, with the data passing enabling third-party IP in custom designs, a high-performance common bus for large multi-unit (CPU/GPU) structures, and socket to socket communication. The control elements are an extension of power management, enabling parts of the fabric to duty cycle when not in use, security by way of memory management and detection, and test/initialization for activities such as data prefetch.

Execution, Load/Store, INT and FP Scheduling Simultaneous MultiThreading (SMT) and New Instructions
Comments Locked

574 Comments

View All Comments

  • Cooe - Sunday, February 28, 2021 - link

    Find me these so-called people buying Intel HEDT CPU's (aka OG Ryzen 7's direct competition) for gaming & never for HPC uses.... Oh wait. They don't exist.
  • Haawser - Thursday, March 2, 2017 - link

    Yeah, but if you're a gamer who streams, Ryzen is waaaay better than anything Inter offer for $499. Especially if you're gaming at 4K, or going to be. Different people have different needs, even gamers.
  • Jimster480 - Thursday, March 2, 2017 - link

    Yes but no,
    Because Broadwell-E and Haswell-E HEDT platforms are in the same boat as Ryzen.

    But this is what this Ryzen 7 release is meant to do.
    Compete with the HEDT platforms, not against the "APU" chips.
    Those chips will come later, albeit with much higher clockspeeds to compete with intel.
    For now you have Intel with 10-20% clockspeed advantages in clockspeed dependent applications.
  • Meteor2 - Saturday, March 4, 2017 - link

    I hope you're right but there's no indication they will be clocked higher. AMD has access to processes which are generation behind Intel's, at least for a couple of years. We can't expect miracles.
  • nos024 - Thursday, March 2, 2017 - link

    Lol, butt hurt? Why even bother running gaming benchmarks? You even said it yourself that ryzen wont make it to your so called grown-up workstation because if low pcie count.

    So tell me who is this $500 Ryzen chip designed for? Not grown ups running workstation, or pathetic kiddies gamers...so theyre for Wannabes?
  • Tunnah - Thursday, March 2, 2017 - link

    He literally said it is ideal to replace his aging 3770k, he gave an example of how it will be used. Try more reading and less being a turd
  • ddriver - Thursday, March 2, 2017 - link

    Ryzen is that much more affordable that with the price difference I could have built another whole system, dedicated to running the 2 HBA adapters, thus saving on the need of 16 lanes. 40 - 16 is exactly 24, which is what ryzen has. If it was available a year ago I would have simply built two systems, offering a good 50-60% more CPU performance, double the GPU performance, with enough need to accommodate my IO needs, even if between two systems, that wouldn't have been much of an issue.

    The pci lane count is lower than intel E series chips, however it is still 50% higher than what you can get from intel outside the E series. It will actually suffice in most workstation scenarios, even if you end up running graphics at x8, which is not really a big deal.
  • ddriver - Thursday, March 2, 2017 - link

    "you even said it yourself that ryzen wont make it to your so called grown-up workstation because if low pcie count"

    I did not say that. Not all workstations require 40 pcie lanes. Most could do with 24. I was talking about my workstation in particular, which has plenty of pcie hardware. For the vast majority of HPC scenarios that would not be necessary, furthermore as already mentioned, with the saved money you can build additional systems dedicated to specific tasks, offloading both the need of more pcie lanes and the cpu time the attached hardware consumes.

    It remains to be seen how much IO will the server zen parts have. Ryzen is not particularly a workstation grade chip, it just happens to be GOOD ENOUGH to do the job. AMD give you 50% more performance and 50% more IO at the same or better price point, and I think they will do the same for the chips they actually design for workstation.

    It looks like the 16 core workstation chip will have 64 pcie lanes, and the 32 core - a whooping 128 lanes. So intel E series looks like a sad little orphan with its modest 40 lanes... And no, xeons aren't much better, they are in fact worse, the 24 core E7-8894 v4 only has a modest 32 lanes.

    So no, while I will not be replacing my main 10 core workstation with a ryzen, because that would win me nothing, I am definitely looking forward to replacing it next year with a Naples system, and I definitely wished ryzen was available last year as I could have spent my money much better than buying intel.
  • Intel999 - Thursday, March 2, 2017 - link

    "So tell me who is this $500 Ryzen chip designed for?"

    Logic would imply it is aimed at anyone that works in an environment where they need superior multithreading performance. For instance, anyone that has bought a 6900k or 6950k, but more importantly it is for those individuals that "wanted" to buy either of Intel's multi core champs but couldn't due to ridiculous prices.

    I'd dare to make a bet there are more people that wanted to buy a 6900k than there are people that actually did. Now they can buy one and still put food on the table this month.
  • FriendlyUser - Thursday, March 2, 2017 - link

    Exactly right. I was always tempted by the 6850K, but the price of the CPU+platform was simply ridiculous. For much less I got a faster CPU and a high-end MB. I won't miss the 40PCIe lanes.

Log in

Don't have an account? Sign up now