Fetch/Prefetch

Starting with the front end of the processor, the prefetchers.

AMD’s primary advertised improvement here is the use of a TAGE predictor, although it is only used for non-L1 fetches. This might not sound too impressive: AMD is still using a hashed perceptron prefetch engine for L1 fetches, which is going to be as many fetches as possible, but the TAGE L2 branch predictor uses additional tagging to enable longer branch histories for better prediction pathways. This becomes more important for the L2 prefetches and beyond, with the hashed perceptron preferred for short prefetches in the L1 based on power.

In the front end we also get larger BTBs, to help keep track of instruction branches and cache requests. The L1 BTB has doubled in size from 256 entry to 512 entry, and the L2 is almost doubled to 7K from 4K. The L0 BTB stays at 16 entries, but the Indirect target array goes up to 1K entries. Overall, these changes according to AMD affords a 30% lower mispredict rate, saving power.

One other major change is the L1 instruction cache. We noted that it is smaller for Zen 2: only 32 KB rather than 64 KB, however the associativity has doubled, from 4-way to 8-way. Given the way a cache works, these two effects ultimately don’t cancel each other out, however the 32 KB L1-I cache should be more power efficient, and experience higher utilization. The L1-I cache hasn’t just decreased in isolation – one of the benefits of reducing the size of the I-cache is that it has allowed AMD to double the size of the micro-op cache. These two structures are next to each other inside the core, and so even at 7nm we have an instance of space limitations causing a trade-off between structures within a core. AMD stated that this configuration, the smaller L1 with the larger micro-op cache, ended up being better in more of the scenarios it tested.

AMD Zen 2 Microarchitecture Overview: The Quick Analysis Decode
Comments Locked

216 Comments

View All Comments

  • Korguz - Monday, June 17, 2019 - link

    im glad im not the only one that sees this...
  • Qasar - Monday, June 17, 2019 - link

    korguz, you aren't the only one that sees it.

    Xyler94, i dont hate intel.. but i am sick of what they have done so far to the cpu industry, sticking the mainstream with quad cores for how many years ? i would of loved to get a 6 or 8 core intel chip, but the cost of the platform, made it out of my reach. the little performance gains year over year, come on, thats the best intel can do with all the money they have ?? and the constant lies about 10nm.... then Zen is released and what was it, less then 2 months later, intel all of a sudden has more then 4 cores for the mainstream, and even more cores for the HEDT ? my next upgrade at this point, looks to be zen 2.. but i am waiting till the 7th, to read the reviews. hstewart does glorify intel any chance he can, and it just looks so stupid, cause some one calls him out on it.. and he seems to pretty much vanish from that convo
  • HStewart - Thursday, June 13, 2019 - link

    Notice that I mention unless they change it from dual 128 bit.
  • Targon - Thursday, June 13, 2019 - link

    Socket AM4 is limited to a dual-channel memory controller, because you need more pins to add more memory channels. The same applies to the number of PCI Express lanes as well. The only way around this would be to use one of the abilities of Gen-Z where the CPU would just talk to the Gen-Z bus, at which point, dedicated pins for memory and PCI Express could be replaced by a very wide and fast connection to the system bus/fabric. Since that would require a new motherboard and for the CPU to be designed around it, why bother with socket AM4 at that point?
  • Korguz - Thursday, June 13, 2019 - link

    why bother?? um upgrade ability ? maybe not quite needed ? the things you suggest, sound like they would be a little expensive to implement. if you need more memory bandwidth and pcie lanes.. grab a TR board and a lower end cpu....
  • austinsguitar - Monday, June 10, 2019 - link

    Thank you Ian for this write up. :)
  • megapleb - Monday, June 10, 2019 - link

    Why does the 3600X have power consumption of 95W, and the 3700X, with two more cores, four more threads, and the same frequency max, consume only 65W? I'm guessing those two got switched around?
  • anonomouse - Monday, June 10, 2019 - link

    higher sustained base clock drives up the tdp
  • megapleb - Monday, June 10, 2019 - link

    200Mhz extra base increases power consumption by 46%? I would have though max power consumption would be all cores operating at maximum frequency so the base would have nothing to do with it?
  • scineram - Tuesday, June 11, 2019 - link

    Nobody said anything about power consumption.

Log in

Don't have an account? Sign up now