Deciphering the New Cache Hierarchy

The cache hierarchy is a significant departure from recent AMD designs, and most likely to Zen’s advantage. The L1 data cache is double the size and double the associativity of Bulldozer’s, and it is write-back rather than write-through. It also uses an asymmetric load/store implementation, recognizing that loads happen more often than stores in the critical paths of most workflows. The instruction cache is no longer shared between two cores and doubles in associativity, which should decrease the proportion of cache misses. AMD states that both the L1-D and L1-I are low latency, with details to come.

The L2 cache sits at half a megabyte per core with 8-way associativity, double Intel’s Skylake on both counts (256 KB/core and only 4-way). On the other hand, Intel’s L3/LLC on its high-end Skylake SKUs is 2 MB/core, or 8 MB per CPU, whereas Zen will feature 1 MB/core; both are 16-way associative.

Edit 7:18am: Actually, the slide above is slightly evasive in its description. It doesn't say how many cores the L3 cache is spread over, or whether there is a common LLC shared by all cores in the chip. However, we have received information from a source (which can't be confirmed via public AMD documents) stating that Zen will feature two 8 MB L3 caches, one for each of two groups of four cores, giving 16 MB of L3 in total. This would mean 2 MB/core, but it also implies that there is no unified last-level cache in silicon across all cores, which Intel has. The reason behind a design like this is typically modularity: being able to scale a core design from low core counts to high core counts. It would still leave a Zen core with the same L3 cache per core as Intel.
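The per-core arithmetic behind that edit can be checked with a short sketch. The helper name `l3_per_core_mb` is made up for illustration, and the Zen figures are the unconfirmed ones from the source quoted above, not official AMD numbers.

```python
def l3_per_core_mb(l3_blocks: int, block_mb: float, cores: int) -> float:
    """Total L3 capacity divided evenly across all cores, in MB."""
    return l3_blocks * block_mb / cores


# Rumored Zen layout (unconfirmed): two 8 MB L3 blocks, one per group of four cores.
zen_total_mb = 2 * 8
zen_per_core = l3_per_core_mb(l3_blocks=2, block_mb=8, cores=8)

# Skylake i7-6700K: a single 8 MB L3 shared by four cores.
skylake_per_core = l3_per_core_mb(l3_blocks=1, block_mb=8, cores=4)

print(f"Zen:     {zen_total_mb} MB total, {zen_per_core} MB/core")
print(f"Skylake: 8 MB total, {skylake_per_core} MB/core")
```

The per-core figure comes out the same for both chips, but it hides the structural difference: in the rumored Zen layout the 16 MB is split into two separate 8 MB pools rather than one cache shared by every core.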

Cache Levels    Bulldozer (FX-8150)          Zen                        Broadwell-E (i7-6950X)    Skylake (i7-6700K)
L1 Instruction  64 KB 2-way, per module      64 KB 4-way                32 KB 8-way               32 KB 8-way
L1 Data         16 KB 4-way, Write-Through   32 KB 8-way, Write-Back    32 KB 8-way, Write-Back   32 KB 8-way, Write-Back
L2              2 MB 16-way, per module      512 KB 8-way               256 KB 8-way              256 KB 4-way
L3              1 MB/core, 64-way            1 or 2 MB/core ?, 16-way   2.5 MB/core, 16/20-way    2 MB/core, 16-way
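To make the size and associativity figures in the table more concrete, here is a minimal sketch that converts them into a number of sets using the standard relation sets = capacity / (ways × line size). The 64-byte line size is an assumption (the article does not state it), and the names `num_sets` and `caches` are illustrative only.

```python
LINE_BYTES = 64  # assumption: 64-byte cache lines; not stated in the article


def num_sets(size_kb: int, ways: int, line_bytes: int = LINE_BYTES) -> int:
    """Number of sets in a set-associative cache of the given size and ways."""
    size_bytes = size_kb * 1024
    if size_bytes % (ways * line_bytes) != 0:
        raise ValueError("capacity must be a multiple of ways * line size")
    return size_bytes // (ways * line_bytes)


# (size in KB, ways) taken from the table above.
caches = {
    "Bulldozer L1D": (16, 4),
    "Zen L1D": (32, 8),
    "Zen L2": (512, 8),
    "Skylake L2": (256, 4),
}

sets = {name: num_sets(kb, ways) for name, (kb, ways) in caches.items()}
for name, count in sets.items():
    print(f"{name}: {count} sets")
```

Under that line-size assumption, doubling both capacity and associativity leaves the set count unchanged: each set simply holds twice as many lines, which is why the larger caches can be expected to suffer fewer conflict misses.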

What this means, between the L2 and the L3, is that AMD is putting more lower-level cache nearer the core than Intel, and because that cache is private to each core it can potentially improve single-thread performance. The downside of bigger, lower-level (but separate) caches is that the cores have to snoop each other’s large caches to ensure that clean data is being passed around and that data in L3 is not out of date. AMD’s big headline number overall is that Zen will offer up to 5x the cache bandwidth to a core compared with previous designs.

216 Comments


  • wumpus - Thursday, August 18, 2016 - link

    I want this chip (or a semi-low priced i7 with the graphics removed and 4 more cores in its place) with HBM[2-3] memory (and presumably all the DRAM that fits. Hopefully in 5 years that doesn't imply a transition die) and xpoint as "main memory - SSD buffer/cache/'SSD dram'"

    So yes, five years at least.
  • ikjadoon - Thursday, August 18, 2016 - link

    No, I think it theoretically is very relevant. If those QD1 numbers are to be believed, we should see noticeable performance increases in day-to-day usage, right?

    Exactly: it's a fantasy at the price points that are palatable to *consumers*, hehe. Prosumers are also buying $1000+ GPUs, hehe...not the same market.

    Right....and that transition is still many years away.

    So, what I meant....IDF16 is not very interesting for consumers. AMD timed this presentation quite well.
  • smilingcrow - Thursday, August 18, 2016 - link

    I am not sure that the QD1 numbers will really make a noticeable difference for general consumer usage patterns. Have to wait for real world benchmarks.
  • azazel1024 - Thursday, August 18, 2016 - link

    I was very meh about Zen, but now I am actually kind of anticipating it. Even the early engineering sample leaks and rumors that it will have improved IPC, possibly even right up on Skylake, but with much lower clocks, meaning it'll still have lower single threaded performance, don't bother me too much. BD and its kin are generally extremely poor single thread compared to Intel's latest Core processors. If Zen comes a fair amount closer...but does it while having 8 cores and 16 threads...that to me says it might actually have a good shot at being in between Skylake/Broadwell and Broadwell-E. If it can do that at a lower price point and being in spitting distance of single thread performance AND manage vaguely reasonable power consumption figures, you could count me as a buyer (if AM4 socketed boards have decent bus support).

    Give me a Zen with 80-90% of the single thread of Broadwell-E and 80-90% of the multithreaded performance of an Octocore Broadwell-E at the price of an entry level Broadwell-E Hexacore, or even a little less ($250-350) and you could count me as a buyer, so long as it isn't some 150 W TDP monster.
  • jjj - Thursday, August 18, 2016 - link

    Intel rates Broadwell-E at 140W while Zen 8 cores is supposed to be 95W.
    We'll see about base clocks and Turbo clocks but power might end up being very interesting.
    Ofc die size will be interesting too and they should have 4 cores 65W with no GPU.
  • smilingcrow - Friday, August 19, 2016 - link

    Keep in mind that the TDP for the E range tends to be the same for the whole range so in practice the chips below the top of the stack may in reality be capable of using a lower TDP.
  • patel21 - Thursday, August 18, 2016 - link

    For me, a performance comparable to i3 skylake, with power requirements at max over 20% of i3, with a good gpu integrated and at around 70% of i3's price. And My boat will sail AMD....Ho yaa
  • nandnandnand - Thursday, August 18, 2016 - link

    Weren't "8-core" Bulldozer/Excavator chips sold around $200-250? Maybe it's not so crazy to say that AMD will sell Zen real 8-cores in that price range.

    80% single threaded of Broadwell-E, 80% multithreaded performance, $225. How does that sound?
  • Gigaplex - Thursday, August 18, 2016 - link

    If Zen is much faster than Bulldozer, expect it to cost quite a bit more. Bulldozer sold for peanuts because nobody wanted it.
  • StrangerGuy - Thursday, August 18, 2016 - link

    Didn't you already know AMD fanboys have the right to be self-entitled cheapskates?

    "I want AMD to be competitive but without the competitive price tag along with it because evil Intel/NV."
