Deciphering the New Cache Hierarchy

The cache hierarchy is a significant departure from recent AMD designs, most likely to its advantage. The L1 data cache is double the size of Bulldozer's with higher associativity, and it is write-back rather than write-through. It also uses an asymmetric load/store implementation, recognizing that loads occur more often than stores in the critical paths of most workloads. The instruction cache is no longer shared between two cores and doubles in associativity, which should reduce the proportion of cache misses. AMD states that both the L1-D and L1-I are low latency, with details to come.
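To illustrate why the move from write-through to write-back matters, here is a toy sketch that counts how many writes leave the L1 under each policy. The function name, the store stream, and the no-eviction simplification are all assumptions made for illustration; this is not a model of AMD's actual implementation.

```python
def writes_to_next_level(store_lines, policy):
    """Count writes forwarded below the L1 for a stream of stores.

    store_lines: iterable of cache-line identifiers being stored to.
    policy: "write-through" or "write-back".
    Toy assumptions: the cache never evicts during the stream, and every
    dirty line is flushed exactly once at the end.
    """
    store_lines = list(store_lines)
    if policy == "write-through":
        # Every store is propagated to the next level immediately.
        return len(store_lines)
    if policy == "write-back":
        # Stores only mark the line dirty; each dirty line is written once.
        return len(set(store_lines))
    raise ValueError(f"unknown policy: {policy!r}")


# Hypothetical stream: 1000 stores that keep hitting the same four lines.
stream = [i % 4 for i in range(1000)]
through = writes_to_next_level(stream, "write-through")
back = writes_to_next_level(stream, "write-back")
print(through, back)
```

In this simplified setting the write-back cache sends far fewer writes down the hierarchy when stores repeatedly hit the same lines, which is the usual argument for the policy.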

The L2 cache sits at half a megabyte per core with 8-way associativity, double the capacity and associativity of Intel’s Skylake, which has 256 KB/core and is only 4-way. On the other hand, Intel’s L3/LLC on its high-end Skylake SKUs is 2 MB/core, or 8 MB per CPU, whereas Zen will feature 1 MB/core. Both are 16-way associative.
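As a rough worked example of what size and associativity imply together, the number of sets in a set-associative cache is the capacity divided by (ways x line size). The sketch below assumes a 64-byte cache line, which is not stated on the slide, and the helper name is hypothetical.

```python
KB = 1024


def num_sets(size_bytes, ways, line_bytes=64):
    """Return the number of sets in a set-associative cache.

    line_bytes=64 is an assumption; the slide does not give a line size.
    """
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder:
        raise ValueError("size is not a whole number of sets")
    return sets


zen_l2_sets = num_sets(512 * KB, 8)      # Zen L2: 512 KB, 8-way
skylake_l2_sets = num_sets(256 * KB, 4)  # Skylake L2: 256 KB, 4-way
zen_l1d_sets = num_sets(32 * KB, 8)      # Zen L1-D: 32 KB, 8-way
print(zen_l2_sets, skylake_l2_sets, zen_l1d_sets)
```

Under that line-size assumption, the two L2 caches end up with the same number of sets, so Zen's extra L2 capacity comes entirely from the additional ways.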

Edit 7:18am: Actually, the slide above is slightly evasive in its description. It doesn't say how many cores the L3 cache is spread over, or whether there is a common LLC between all cores in the chip. However, we have received information from a source (which can't be confirmed via public AMD documents) stating that Zen will feature two sets of 8 MB L3 cache, one for each of two groups of four cores, giving 16 MB of L3 total. This would mean 2 MB/core, but it also implies that there is no unified last-level cache in silicon across all cores, which Intel has. The reason behind a design like this typically comes down to modularity, and being able to scale a core design from low core counts to high core counts. But it would still leave a Zen core with the same L3 cache per core as Intel.
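The per-core figures can be checked with simple arithmetic. The sketch below uses a hypothetical helper; the second call, a single 8 MB L3 shared by eight cores, is only one assumed way to arrive at the earlier 1 MB/core figure and is not confirmed by AMD.

```python
def l3_summary(groups, l3_mb_per_group, cores_per_group):
    """Return (total L3 in MB, L3 in MB per core) for a grouped design."""
    total_mb = groups * l3_mb_per_group
    per_core_mb = l3_mb_per_group / cores_per_group
    return total_mb, per_core_mb


# Layout described by the source: two groups of four cores, 8 MB L3 each.
reported = l3_summary(groups=2, l3_mb_per_group=8, cores_per_group=4)

# Assumed alternative reading: one 8 MB L3 shared by all eight cores.
alternative = l3_summary(groups=1, l3_mb_per_group=8, cores_per_group=8)

print(reported, alternative)
```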

Cache Levels | Bulldozer (FX-8150) | Zen | Broadwell-E (i7-6950X) | Skylake (i7-6700K)
--- | --- | --- | --- | ---
L1 Instruction | 64 KB, 2-way, per module | 64 KB, 4-way | 32 KB, 8-way | 32 KB, 8-way
L1 Data | 16 KB, 4-way, Write-Through | 32 KB, 8-way, Write-Back | 32 KB, 8-way, Write-Back | 32 KB, 8-way, Write-Back
L2 | 2 MB, 16-way, per module | 512 KB, 8-way | 256 KB, 8-way | 256 KB, 4-way
L3 | 1 MB/core, 64-way | 1 or 2 MB/core ?, 16-way | 2.5 MB/core, 16/20-way | 2 MB/core, 16-way

What this means, between the L2 and the L3, is that AMD is putting more low-level cache nearer the core than Intel, and because it is low level it is private to each core, which can potentially improve single-thread performance. The downside of bigger (but separate) low-level caches is that each core has to snoop the other cores’ large caches to ensure clean data is being passed around and that data in the L3 is not out of date. AMD’s big headline number overall is that Zen will offer up to 5x the cache bandwidth to a core compared with previous designs.


216 Comments


  • m1ngky - Saturday, August 20, 2016

    It could be the performance boost is only 5% each generation because there wasn't a need for more due to the monopoly Intel has in the CPU market.

    Once decent competition from AMD emerges I'm betting we see more of a % boost then.
  • sonicmerlin - Saturday, August 20, 2016

    I seriously doubt it, Intel needs performance boosts to sell new products every year. If they could've then they would've.
  • Byte - Thursday, August 18, 2016

    Value of top end K chips actually doesn't really go down that much. If you want to look for a Haswell Devil's Canyon, you still have to pony up around $300, maybe you can find a used one for $250ish, but same can be said for a Skylake. Even a 4770k or 3770k is hard to find for under $250 used. Even a 2770k, I sold one not too long ago for $245.
  • Nagorak - Thursday, August 18, 2016

    Prices for computer hardware aren't dropping very fast because performance has barely increased. A two year old CPU now is for all intents and purposes just as good as a brand new one. There may be some marginal situations where the 5% difference in performance matters, but for the most part they perform identically.

    Compare this to the heyday of the late 90s when a two year old CPU might be half as fast as a new one. It was no surprise that upgrade cycles were shorter and resale values much less.
  • KPOM - Friday, August 19, 2016

    Tell that to all the people on MacRumors complaining that the 13" rMBP still has a Broadwell processor.
  • Icehawk - Sunday, August 21, 2016

    While I agree with Nagorak, I have moved from a 2yr cycle to a 4+ cycle on CPU/platforms, I think the Apple folks have a right to gripe about the lack of updates - some of them are a few gens back at this point and prices haven't dropped enough to make up for that IMO.
  • smilingcrow - Thursday, August 18, 2016

    Their whole CPU business is based on an Intel license to copy; ignoring the ARM stuff.
  • blublub - Thursday, August 18, 2016

    1. Intel's X64 is based on AMD's license.....so what !? (remember the Itanium disaster?)
    2. AMD also holds X86 licenses which are used by Intel - both x86 and x64 are cross-licenses

    So in the end they both license/copy one another -- so what!?

    And I am pretty sure after the recent Intel/Nvidia rattle that the next Intel GPUs are being built via AMD's license
  • smilingcrow - Thursday, August 18, 2016

    There’s a massive difference though. AMD only has a license due to IBM insisting on Intel allowing a second manufacturer for its patented x86 CPUs.
    So AMD has been a parasite living on Intel patents with a degree of symbiosis in the relationship. That makes their various successful phases all the more noteworthy and hopefully Zen leads into another long awaited successful phase.

    I think you are jumping to conclusions much too quickly over a mere PR spat with Nvidia.
  • ddriver - Thursday, August 18, 2016

    you are such an obvious intel troll fanboy that its just sad
