The Core Complex, Caches, and Fabric

Many-core designs often start with a low-core-count building block that is repeated across a coherent fabric to generate a large number of cores and a large die. In this case, AMD is using a CPU Complex (CCX) as that building block, which consists of four cores and the associated caches.

Each core has direct access to its private L2 cache. The 8 MB of L3 cache, despite being split into blocks per core, is accessible by every core on the CCX with ‘an average latency’. L3 hits nearer to the core will have a lower latency, due to the low-order address interleave method of address generation.
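To make the idea of low-order address interleaving concrete, here is a minimal sketch of how consecutive cache lines could be spread across L3 blocks. The 64-byte line size, the count of four slices (one per core), and the simple modulo mapping are all assumptions for illustration; AMD has not published the exact function in this material.

```python
LINE_BYTES = 64   # assumed cache-line size
NUM_SLICES = 4    # assumed: one L3 slice per core in a four-core CCX


def l3_slice(addr: int) -> int:
    """Pick an L3 slice from the low-order bits of the cache-line address.

    Hypothetical mapping for illustration only, not AMD's documented hash.
    """
    line = addr // LINE_BYTES   # drop the byte-offset bits within the line
    return line % NUM_SLICES    # low-order line-address bits select the slice


# Consecutive cache lines rotate across the slices, so a streaming access
# pattern touches every slice rather than hammering a single one.
slices = [l3_slice(a) for a in range(0, 8 * LINE_BYTES, LINE_BYTES)]
print(slices)
```

Under these assumptions, any core sees a mix of near and far slices over a run of addresses, which is one way to read the ‘average latency’ wording.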

The L3 cache is actually a victim cache, taking data from L1 and L2 evictions rather than collecting data from prefetch/demand instructions. Victim caches tend to be less effective than inclusive caches; however, Zen counters this by having a sufficiently large L2 to compensate. The use of a victim cache means that it does not have to hold L2 data inside, effectively increasing its potential capacity with less data redundancy.
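The victim-cache behavior can be sketched with a toy model. This is a simplification under stated assumptions: only L2 and L3 are modeled, both use LRU replacement, and the L3 is treated as strictly exclusive of the L2. The class and method names are hypothetical, and the real hardware policies are not described in this article.

```python
from collections import OrderedDict


class VictimHierarchy:
    """Toy L2 + victim L3 model (assumed LRU, assumed strictly exclusive)."""

    def __init__(self, l2_lines: int, l3_lines: int):
        self.l2 = OrderedDict()   # line -> True, ordered oldest to newest
        self.l3 = OrderedDict()
        self.l2_lines = l2_lines
        self.l3_lines = l3_lines

    def access(self, line: int) -> str:
        if line in self.l2:
            self.l2.move_to_end(line)   # refresh LRU position
            return "L2 hit"
        if line in self.l3:
            del self.l3[line]           # line moves back up, no duplicate kept
            result = "L3 hit"
        else:
            result = "miss"             # demand fill goes to L2, bypassing L3
        self._fill_l2(line)
        return result

    def _fill_l2(self, line: int) -> None:
        self.l2[line] = True
        if len(self.l2) > self.l2_lines:
            victim, _ = self.l2.popitem(last=False)   # evict the oldest L2 line
            self.l3[victim] = True                    # L3 is filled by evictions only
            if len(self.l3) > self.l3_lines:
                self.l3.popitem(last=False)           # oldest victim leaves the chip


h = VictimHierarchy(l2_lines=2, l3_lines=4)
results = [h.access(line) for line in (1, 2, 3, 1)]
print(results)
print(sorted(h.l2), sorted(h.l3))
```

Because no line lives in both levels at once in this model, the combined unique capacity is the sum of the two sizes, which is the reduced-redundancy point made above.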

It is worth noting that a single CCX has 8 MB of cache, and as a result the 8-core Zen being displayed by AMD at the current events involves two CPU Complexes. This affords a total of 16 MB of L3 cache, albeit in two distinct parts. This means that the true LLC for the entire chip is actually DRAM, although AMD states that the two CCXes can communicate with each other through the custom fabric which connects both the complexes, the memory controller, the IO, the PCIe lanes etc.
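The two-CCX arrangement can be written down as simple arithmetic. The per-CCX figures come from the text above, while the contiguous core numbering (cores 0 to 3 on the first CCX, 4 to 7 on the second) and the function names are assumptions made for this sketch.

```python
CORES_PER_CCX = 4   # from the article: four cores per CCX
L3_PER_CCX_MB = 8   # from the article: 8 MB of L3 per CCX


def ccx_of(core: int) -> int:
    """Map a core index to its CCX (assumes contiguous core numbering)."""
    return core // CORES_PER_CCX


def shares_l3(core_a: int, core_b: int) -> bool:
    """Two cores see the same L3 only if they sit on the same CCX."""
    return ccx_of(core_a) == ccx_of(core_b)


def total_l3_mb(num_cores: int) -> int:
    """Total L3 across all CCXes, rounding up to whole complexes."""
    num_ccx = -(-num_cores // CORES_PER_CCX)   # ceiling division
    return num_ccx * L3_PER_CCX_MB


print(total_l3_mb(8))      # two complexes' worth of L3
print(shares_l3(0, 3))     # same CCX
print(shares_l3(3, 4))     # different CCXes, so traffic crosses the fabric
```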

 

The cache representation shows L1 and L2 being local to each core, followed by 8 MB of L3 split over several cores. AMD states that the L1 and L2 bandwidth is nearly double that of Excavator, with L3 now up to 5x for bandwidth, and that this bandwidth will help drive the improvements made on the prefetch side. AMD also states that there are large queues in play for L1/L2 cache misses.

One interesting story is going to be how AMD’s coherent fabric works. For those that follow mobile phone SoCs, we know fabrics and interconnects such as CCI-400 or the CCN family are optimized to take advantage of core clusters along with the rest of the chip. A number of people have speculated that the fabric used in AMD’s new design is based on HyperTransport, and AMD has confirmed that they are using a superset of HyperTransport here for Zen, and that the Infinity Fabric design is meant to be high bandwidth, low latency, and be in both Zen and Vega as well as future products. Much like the CPU and GPU lines, the Fabric has its own roadmap as well.

Ultimately the new fabric involves a series of control and data passing structures, with the data passing enabling third-party IP in custom designs, a high-performance common bus for large multi-unit (CPU/GPU) structures, and socket to socket communication. The control elements are an extension of power management, enabling parts of the fabric to duty cycle when not in use, security by way of memory management and detection, and test/initialization for activities such as data prefetch.


574 Comments


  • lakerssuperman - Thursday, March 2, 2017 - link

    People like me. I was previously running a 2600k overclocked. Nice chip. Still runs great, but I was looking for an upgrade about a year ago as one of the things I do a lot of is Handbrake conversion for my HTPC. Going to even the newest Intel 4 core got me maybe 20% improvement on one of my major workloads for insane amounts of money and going to the high end to get 8-10 cores was just not justifiable.

    I ended up buying a used Xeon/X79 motherboard combo for around $300 off ebay. 8 cores/16 threads and it works great for Handbrake. I lost some clock speed in the move so single thread performance took a bit of a hit, but was more than made up for in multi-thread performance. I can still game on this CPU just fine and I don't play the newest stuff right away anyway just because of time constraints.

    The X79 platform is fine for what I'm doing with it. Would I like the new stuff? Sure. And if I was in the position I was last year looking for an upgrade I don't see how I wouldn't get an 1800x. It gives me the right balance of features for what I do with my computer.

    If I was just gaming, I'd look at Intel currently because their 4 core i5 is the sweet spot for this. But I'm not just gaming so this chip is infinitely more attractive to someone like me. With the price and features I can't see how it isn't a winner and when the 4 and 6 core parts come out at likely higher frequencies, I think they are going to be the real winners for gaming.
  • rarson - Thursday, March 2, 2017 - link

    Ryzen is clearly well-suited to anyone who values high performance in a multitude of usage scenarios over one single usage scenario, especially if one cares about how much money they need to spend to achieve those results.
  • injurer - Friday, March 3, 2017 - link

    1800X is definitely designed for enthusiasts and AMD fans, but when you go to 1700X this is a price killer targeting the mainstream. 1700 is in the same boat but at an even lower price. All 3 are 8-core chips and are quite close to the 6900K but at 2-4 times lower price.

    In the end I really believe AMD still has to show us the real potential of their architecture. Those chips are just the start. Remember the Ryzen design is new from its core, so they definitely have room to expand and enhance it.
  • bill.rookard - Thursday, March 2, 2017 - link

    Well, thing to remember is that for those looking for a new build, they now have a legitimate choice. I still do see in the future that things will only go more multithreaded, and even though the i7-7700k is still a great chip, having more physical cores and resources to throw at it will only help.

    To that end, again, anyone planning a NEW build from the ground up will be able to seriously consider a Ryzen system.

    Worst case, think about it. In the deep dive they had mention of 'competitive resource sharing' with SMT enabled. If you were to disable SMT on Ryzen - it would give you 8 PHYSICAL cores versus the 4 physical/4 logical cores of the 7700k. Without those resources being partially used across 16 threads - all resources would be allocated to the physical cores instead, potentially allowing more processing power per physical core.

    There's still quite a bit to be checked out and dug through.
  • lilmoe - Thursday, March 2, 2017 - link

    This. I want 2 things dug deeper in follow ups:
    1) Single/multi threaded performance with SMT disabled VS SMT enabled.
    2) Game comparisons with more sensible GPUs (which actually ship and sell in volume, IE: the ones people actually buy), like the GTX 1060 and/or RX 480.
  • BurntMyBacon - Friday, March 3, 2017 - link

    @lilmoe

    I agree with 1). Intel had HT for several generations before it was universally better to leave it enabled (it still needs to be disabled sometimes, but these are more the edge cases now).

    Not so sure I'm onboard with 2). Pairing a $200 GPU with a $500 processor for gaming purposes seems a little backwards. I'd like to see that (GTX1060 / RX480) gaming comparison on a higher clocked R5 or R3 processor when they are released.
  • Meteor2 - Friday, March 3, 2017 - link

    I'd rather see tests paired with a 1080 Ti. At RX480/1060 level, it's well known the bottleneck is GPU performance not CPU. A 1080 Ti should be fast enough to show up the CPU.
  • lilmoe - Friday, March 3, 2017 - link

    @BurntMyBacon @Meteor2

    Lots of people, like me, are more into CPU power. I'm OK with a mid-range GPU. Gaming is not my top priority, and when I do, It's never above 1080p.

    It'd be interesting to see if there are differences. I wouldn't dismiss it, saying the GPU would be the bottleneck so fast.
  • bigboxes - Sunday, March 5, 2017 - link

    I'm with you on that. Gaming is way down in my priority list. I do it occasionally just because I love to see what my hardware can do. I currently have an ultrawide 1080p monitor. When I move to 4K then hopefully a midrange GPU will cover that. My CPU is a 4790K. It's great for most tasks. I've been wanting to go to 6/8 core for some time, but the cost for the platform was too high. I think in a couple of years I will seriously think about Ryzen when building a new workstation.
  • rarson - Thursday, March 2, 2017 - link

    I am interested in seeing potential improvement due to BIOS updates. Additionally, I'm interested in seeing potential improvement due to better multi-threaded software. My hunch is that AMD is either on-par or better than Intel, or maybe damn near that prediction, so I think the 4-core parts will compare well to the current Skylake SKUs. I also expect them to overclock better than the 8-core chips. I guess we'll just have to wait for them to release.

    8 physical cores is definitely better than 4 cores with SMT/HTT/whatever you want to call it.
