AMD Zen Microarchitecture: Dual Schedulers, Micro-Op Cache and Memory Hierarchy Revealed

Author: Dr. Ian Cutress

by Ian Cutress on August 18, 2016 9:00 AM EST


Low Power, FinFET and Clock Gating

When AMD launched Carrizo and Bristol Ridge for notebooks, one of the big stories was how AMD had implemented a number of techniques to reduce power consumption and, in turn, increase efficiency. A number of those lessons have carried over to Zen, along with a few new factors that come from the move to a new lithography process.

First up is the FinFET effect. Regular readers of AnandTech and those that follow the industry will already be bored to death with FinFET, but the design allows a transistor to run at lower power at a given frequency. Of course, everyone using FinFET can have a different implementation with its own power/performance characteristics, but Zen’s 14nm FinFET process at GlobalFoundries is already a known quantity, as AMD’s Polaris GPUs are built on a similar process. FinFET, combined with AMD’s confirmation that it will use the density-optimised version of 14nm FinFET (which allows for smaller die sizes and more reasonable efficiency points), shifts the design towards either higher performance at the same power or the same performance at lower power.

AMD stated in the briefing that power consumption and efficiency were constantly drilled into the engineers. As explained in previous briefings, many elements of the core involve a tradeoff between performance and efficiency (e.g. 1% performance might cost 2% efficiency). For Zen, the micro-op cache will save power by avoiding trips further out to fetch instruction data, and improved prefetch along with a couple of other features such as move elimination will also reduce the work done. AMD also states that cores will be aggressively clock gated to improve efficiency.
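
To put a number on that example tradeoff, here is a minimal sketch. It assumes that "efficiency" means performance per watt, which is our reading and not something AMD specified, and the function name is purely illustrative. Under that assumption, gaining 1% performance at a cost of 2% efficiency implies power rising by roughly 3%.

```python
def power_ratio(perf_gain: float, efficiency_loss: float) -> float:
    """Relative power after a design change.

    Assumes efficiency = performance / power (an assumption made for
    this illustration, not a definition given by AMD), so that
    power = performance / efficiency.
    """
    new_perf = 1.0 + perf_gain        # e.g. 1% faster -> 1.01
    new_eff = 1.0 - efficiency_loss   # e.g. 2% less efficient -> 0.98
    return new_perf / new_eff


# The 1% performance / 2% efficiency example from the text.
ratio = power_ratio(0.01, 0.02)
print(f"power rises by {100 * (ratio - 1):.1f}%")
```

Whether a change like that is worth making depends on where the design sits against its power budget, which is presumably why AMD frames it as a tradeoff instead of a fixed rule.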

We saw with AMD’s 7th Gen APUs that power gating was also a target for that design, especially as staying at the best efficiency point for a given level of performance is usually the best policy. The layout of the diagram above would seem to suggest that different parts of the core could be clock gated independently depending on use (e.g. decode vs FP ports), although we were not able to confirm whether this is the case. This also relies on very quick (1-2 cycle) clock gating implementations. Note that clock gating is different from power gating, which is harder to implement.
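
The difference between the two techniques can be sketched with a toy power model. Everything here is an assumption for illustration: the block names, the wattages and the idealised behaviour are hypothetical and are not figures from AMD. The model treats a clock-gated block as drawing no switching (dynamic) power while still leaking, and a power-gated block as drawing nothing at all.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    dynamic_w: float  # switching power while clocked (hypothetical value)
    leakage_w: float  # static power while powered (hypothetical value)


def block_power(block: Block, clock_gated: bool = False,
                power_gated: bool = False) -> float:
    """Idealised power draw of one block, in watts."""
    if power_gated:
        return 0.0                  # supply removed: no switching, no leakage
    if clock_gated:
        return block.leakage_w      # clock stopped: leakage only
    return block.dynamic_w + block.leakage_w


def core_power(blocks, clock_gated=(), power_gated=()) -> float:
    """Sum block power given the names of gated blocks."""
    return sum(
        block_power(b, b.name in clock_gated, b.name in power_gated)
        for b in blocks
    )


# Hypothetical core split into three independently gated regions.
blocks = [
    Block("decode", dynamic_w=1.0, leakage_w=0.2),
    Block("fp", dynamic_w=2.0, leakage_w=0.4),
    Block("integer", dynamic_w=1.5, leakage_w=0.3),
]

print(f"all active:      {core_power(blocks):.1f} W")
print(f"fp clock gated:  {core_power(blocks, clock_gated={'fp'}):.1f} W")
print(f"fp power gated:  {core_power(blocks, power_gated={'fp'}):.1f} W")
```

The model ignores the cost of entering and leaving each state. That cost is a large part of why clock gating can be applied within a cycle or two, while power gating, which has to restore the supply and any lost state, is harder to use at fine granularity.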


216 Comments


DigitalFreak - Thursday, August 18, 2016 - link
Microsoft was already in the process of creating a 64bit version of Windows based on AMD's 64bit implementation (hence the reason you see AMD64 everywhere in 64bit Windows). Microsoft basically told Intel they were not going to support two competing implementations of "x64", so Intel caved and adopted the AMD64 implementation.
tygrus - Thursday, September 8, 2016 - link
They license the ISAs, i.e. the use of the instructions and the expected output. The whole silicon designs are not cross-licensed. There is probably some silicon IP cross-licensed, but the major point was that they could handle the same instructions and be mostly compatible. AMD could only fully copy 486 and earlier designs. You can copy and implement the same ISA without having the same silicon. Intel had started a design for x86-64 but the front-end decoding and instructions were changed to be compatible. With micro/macro ops and microcoding there can be a lot of abstraction between ISA and execution. Intel made at least 1 mistake with their early AMD64 implementation that had to have workarounds and a later fix.
frenchy_2001 - Thursday, August 18, 2016 - link
Opposite.
Intel was vehement at the time that 64 bits needed to be a clean break from x86 and were pushing for their Itanium processors, implementing IA64 (completely incompatible with x86).
The market followed AMD, especially since they had the better architecture at the time (Athlon64, with 64 bits and in-processor memory controllers, faster interconnect, better server scaling...).
Intel then licensed AMD64 and rebranded it EM64T or x86-64.
wifiwolf - Friday, August 19, 2016 - link
wow. finally someone who remembers that time correctly. Intel pushed for Itanium for too much time, even after they adopted amd's 64bit implementation. They eventually had to drop it as it never got enough market.
Samus - Sunday, August 21, 2016 - link
Microsoft did make an IA64 edition on NT and 2000 but without x86 compatibility there were no apps. The genius behind AMD's 64 bit implementation is it is simply a memory extension of x86 with 64 bit integer registers, maintaining complete 32-bit compatibility with no real impact on 32 bit performance, while costing very little die space for the extensions.

Microsoft and software developers saw this and basically told Intel their Itanium dreams were not going to come true.
anubis44 - Monday, August 22, 2016 - link
And the genius behind that 'genius' was none other than Jim Keller, the man who also just designed the upcoming Zen processor family.
Visual - Tuesday, August 23, 2016 - link
No, the IA64 architecture of Itanium does not try to keep any backwards-compatibility with x86, so any mention of it even being considered as an alternative to AMD64 is absurd. At that time the world was just not ready for a compatibility-breaking switch.
Kevin G - Tuesday, August 23, 2016 - link
The ISA didn't directly try to keep backwards compatibility but Intel did put some x86 functionality into the first few generations of Itanium. This was later removed in chips post 2006.

https://en.wikipedia.org/wiki/IA-32_Execution_Laye...
Gigaplex - Thursday, August 18, 2016 - link
Which is a legal term to describe "copying with permission".
pikunsia - Friday, August 19, 2016 - link
AMD cannot copy "TM" Intel technologies as this is a crime with criminal consequences. All is managed through licenses and royalties.
