In their own side event this week, AMD invited select members of the press and analysts to come and discuss the next layer of Zen details. In this piece, we’re discussing the microarchitecture announcements that were made, and looking at how they compare to previous generations of AMD core designs.

AMD Zen

Prediction, Decode, Queues and Execution

First up, let’s dive right into the block diagram as shown:

If we focus purely on the left to start, we can see most of the high-level microarchitecture details including basic caches, the new inclusion of an op-cache, some details about decoders and dispatch, scheduler arrangements, execution ports and load/store arrangements. A number of slides later in the presentation talk about cache bandwidth.

Firstly, one of the bigger deviations from previous AMD microarchitecture designs is the presence of a micro-op cache (it is worth noting that these slides sometimes say ‘op’ when they mean ‘micro-op’, creating a little confusion). AMD’s Bulldozer design did not have an operation cache, so frequently used instructions had to be fetched from the other caches and decoded again each time. Intel has been implementing a similar arrangement for several generations to great effect (some put it as a major stepping stone for Sandy Bridge, the design that introduced it), so to see one here is quite promising for AMD. We weren’t told the scale or extent of this buffer, and AMD will perhaps give that information in due course.

Aside from the as-expected ‘branch predictor enhancements’, which are as vague as they sound, AMD has not disclosed the decoder arrangements in Zen at this time, but has listed that they can decode four instructions per cycle to feed into the operations queue. This queue, with the help of the op-cache, can deliver 6 ops/cycle to the schedulers. The queue can dispatch more per cycle than the decoders take in because a single decoded instruction can split into two micro-ops (which makes the instruction vs micro-op definitions even muddier). Nevertheless, this micro-op queue helps feed the separate integer and floating point segments of the CPU. Unlike Intel, which uses a combined scheduler for INT/FP, AMD’s diagram suggests that the two will remain separate with their own schedulers at this time.
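To make the decode versus dispatch numbers concrete, here is a minimal sketch of one cycle of the front end. Only the two widths (four instructions decoded, six ops dispatched) come from AMD’s slides; the function name, the per-instruction micro-op counts and the idea of treating a single decode group in isolation are illustrative assumptions, and the op-cache path is deliberately left out.

```python
# Toy model of the front-end limits quoted in the article.
# Only the two widths below come from AMD's slides; the rest is assumed.

DECODE_WIDTH = 4    # instructions decoded per cycle (from the slides)
DISPATCH_WIDTH = 6  # micro-ops delivered to the schedulers per cycle (from the slides)


def uops_per_cycle(uops_per_instruction):
    """Return the micro-ops one decode group can send on in a single cycle.

    uops_per_instruction: hypothetical micro-op counts for the instructions
    waiting at the decoders, in program order.
    """
    # The decoders only accept DECODE_WIDTH instructions per cycle.
    group = uops_per_instruction[:DECODE_WIDTH]
    # The queue cannot hand more than DISPATCH_WIDTH micro-ops to the schedulers.
    return min(sum(group), DISPATCH_WIDTH)


print(uops_per_cycle([1, 1, 1, 1]))  # four simple instructions
print(uops_per_cycle([2, 2, 1, 1]))  # two of the four split into two micro-ops
print(uops_per_cycle([2, 2, 2, 2]))  # more micro-ops than the dispatch width
```

In this model, four single-op instructions give 4 ops, a group where two instructions split in two reaches the full 6, and a group of four double-op instructions is capped at 6, with the remainder waiting in the queue for a later cycle.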

The INT side of the core will funnel the ALU operations as well as the AGU/load and store ops. The load/store units can perform two 16-byte loads and one 16-byte store per cycle, making use of the 32 KB 8-way set associative write-back L1 data cache. AMD has explicitly made this a write-back cache rather than the write-through cache we saw in Bulldozer, which was a source of a lot of idle time in particular code paths. AMD is also stating that the load/stores will have lower latency within the caches, but has not explained to what extent they have improved.
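The difference between the two write policies can be sketched with a toy model that counts how many writes reach the next cache level. This is not a model of Zen or Bulldozer: the 64-byte line size, the four-line fully associative LRU cache and the function name are all assumptions chosen to keep the example small.

```python
from collections import OrderedDict

LINE_BYTES = 64  # assumed cache line size; not stated in the article


def next_level_writes(store_addresses, policy, num_lines=4):
    """Count writes sent to the next cache level for a stream of stores.

    Toy model: a fully associative LRU cache holding `num_lines` lines.
    Only stores are tracked, so every resident line is dirty.
    """
    if policy not in ("write-through", "write-back"):
        raise ValueError("policy must be 'write-through' or 'write-back'")

    if policy == "write-through":
        # Every store is forwarded to the next level immediately.
        return len(store_addresses)

    dirty_lines = OrderedDict()  # line number -> True, least recently used first
    writes = 0
    for address in store_addresses:
        line = address // LINE_BYTES
        if line in dirty_lines:
            dirty_lines.move_to_end(line)        # hit: refresh LRU position
        else:
            if len(dirty_lines) == num_lines:
                dirty_lines.popitem(last=False)  # evict LRU line, write it back
                writes += 1
            dirty_lines[line] = True
    # Flush the lines still resident at the end of the stream.
    return writes + len(dirty_lines)


stores = [0, 8, 16, 24] * 100  # 400 stores that all fall in one line
print(next_level_writes(stores, "write-through"))
print(next_level_writes(stores, "write-back"))
```

With 400 stores to the same line, the write-through policy forwards all 400 to the next level, while the write-back policy sends a single line write when the line is finally flushed. Real designs add buffering and coalescing that this sketch ignores.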

The FP side of the core will afford two multiply ports and two ADD ports, which should allow for two joined FMAC operations or one 256-bit AVX per cycle. The combination of the INT and FP segments means that AMD is going for a wide core and looking to exploit a significant amount of instruction level parallelism. How much it will be able to depends on the caches and the reorder buffers. No real data on the buffers has been given at this time, except that the cores will have a +75% bigger instruction scheduler window for ordering operations and a +50% wider issue width for potential throughput. The wider cores, all other things being sufficient, will also allow AMD’s implementation of simultaneous multithreading to potentially take advantage of multiple threads with a linear and naturally low IPC.
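As a rough illustration of what two FMAC operations per cycle could mean for peak throughput, the arithmetic can be sketched as below. The 128-bit pipe width is an assumption inferred from the ‘two FMACs or one 256-bit AVX’ wording, not a figure AMD has confirmed, and the function name is hypothetical.

```python
FMAC_PER_CYCLE = 2       # from the article: two joined FMAC operations per cycle
ASSUMED_PIPE_BITS = 128  # assumption inferred from "or one 256-bit AVX per cycle"


def peak_flops_per_cycle(element_bits):
    """Peak floating point operations per cycle per core under the assumptions above."""
    lanes = ASSUMED_PIPE_BITS // element_bits  # elements handled by one FMAC
    return FMAC_PER_CYCLE * lanes * 2          # a fused multiply-add counts as 2 FLOPs


print(peak_flops_per_cycle(32))  # single precision
print(peak_flops_per_cycle(64))  # double precision
```

Under these assumptions the peak works out to 16 single precision or 8 double precision FLOPs per cycle per core. Sustained figures would depend on the caches, the schedulers and the instruction mix.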

216 Comments

  • Reww - Friday, August 19, 2016 - link

    Neither AMD nor Intel invented the microprocessor, so they're both copying from someone. Now that we cleared that up, everyone can stfu about copying.
  • BillBear - Saturday, August 20, 2016 - link

    I will be thrilled to see AMD be competitive on more than price. If AMD is also competitive on performance it's a huge win for consumers.
  • SlyNine - Saturday, August 20, 2016 - link

    I almost expected Anand himself to come back and review this one.
  • FireSnake - Monday, August 22, 2016 - link

    Where did he go, anyway? Does anybody know?
  • patel21 - Tuesday, August 23, 2016 - link

    Some commenters say he is working for Apple now
  • Johan Steyn - Monday, August 22, 2016 - link

    So many people here are defending Intel. Yes, AMD has floundered. They have been poor competition to Intel. They are struggling and maybe even a dying company. It will be a miracle if this chip is successful, yet I do believe in miracles. I just hate having an Intel CPU in my notebook.

    Why is this so? Intel is the bully in town and they bullied AMD to death (almost). I was in this business at that time. Companies were basically forced not to sell AMD. Intel was found guilty of it and got a slap on the wrist for it. $1B is nothing for them. For this I would welcome the day Intel dies a slow (make it rather quicker) and painful death. But this will probably not happen.

    People say it is just business. Well, in my book it is not ethical business even though it might be legal. It was even found to be illegal, yet with it they killed their opponent. These days many contractors do the same. When they build a building, the law requires a certain amount of parking space (in our country). If they do not do this, they are fined. Parking brings in little compensation and therefore they would rather pay the fines, even if the fines are relatively high. This is what Intel did. They knew they did wrong, but also knew that the repercussions would be minimal. It was worth it for them to kill the competition by breaking the law and be fined. Intel might be your hero, not mine.

    This is sickening. Intel makes me sick. I really hope AMD has some success with Zen, even though I think Intel will find another devious way to curb AMD's success. I even hope ARM will eventually dethrone Intel.
  • Outlander_04 - Monday, August 22, 2016 - link

    AMD have surpassed Intel in the past. Some of us are old enough to remember 1800 MHz Athlon 64s smashing Intel P4s running at 3000+ MHz.
    We also remember Intel's response that saw them bribe OEMs to continue using their crappy processors by sending back bags of cash to people still buying from them.
    We also remember the fines and penalties Intel eventually paid for their price fixing. Price fixing that cost their fanboys because it kept the price of their processors artificially high even though they were junk.

    A strong AMD is to everyone's benefit. We will get more powerful processors and we will get them at a reasonable price. Let's hope Zen is even better than it seems.
  • sharath.naik - Tuesday, August 23, 2016 - link

    The problem with AMD is that, being a technical company, they should have realized that lying repeatedly in the name of marketing about the performance of their products is akin to crying wolf. For now, it does not matter if they actually have a good product or not. The general assumption is that this is going to be another falsehood, and that their chip can likely match Intel at 3 GHz only when one core is running (and that only when turbo is disabled on Intel). And it will fall far behind in both single and multithread when there is no turbo restriction on the Intel chip.
  • slyronit - Tuesday, August 23, 2016 - link

    I agree with you, but if there's something that can kill Intel at this point, it would be ARM based chips, not AMD.
  • atomsymbol - Monday, August 22, 2016 - link

    Bulldozer and Piledriver have a write-through L1D cache. Pentium4 has a write-through L1D cache. Zen has a write-back L1D cache. Skylake has a write-back L1D cache.
