Reaching for Turbo: Aligning Perception with AMD’s Frequency Metrics
by Dr. Ian Cutress on September 17, 2019 10:00 AM EST
AMD’s Turbo
With AMD introducing Turbo after Intel, as has often been the case in their history, they've had to live in Intel's world. And this has repercussions for the company.
By the time AMD introduced their first Turbo-enabled processors, everyone in the desktop space ‘knew’ what Turbo meant, because we had gotten used to how Intel did things. For everyone, saying ‘Turbo’ meant only one thing: Intel’s definition of Turbo, which we subconsciously took as the default, and that’s all that mattered. Every time an Intel processor family is released, we ask for the Turbo tables, and life is good and easy.
Enter AMD, and Zen. Despite AMD making it clear that Turbo doesn’t work the same way, the message wasn’t pushed home. AMD had a lot of things to talk about with the new Zen core, and Turbo, while important, wasn’t as important as the core performance messaging. Certain parts of how the performance increased were understood, but the finer points were missed, with users (and press) assuming an Intel-like arrangement, especially given that the Zen core layout kind of looks like an Intel core layout if you squint.
What needed to be pushed home was the sense of a finer grained control, and how the Ryzen chips respond and use this control.
When users look at an AMD processor, the company promotes three numbers: a base frequency, a turbo frequency, and the thermal design power (TDP). Sometimes an all-core turbo is provided. These processors do not have any form of turbo tables, and AMD states that the design is not engineered to decrease in frequency (and thus performance) when it detects instructions that could cause hot spots.
It should be made clear at this point that Zen (Ryzen 1000, Ryzen 2000) and Zen2 (Ryzen 3000) act very differently when it comes to turbo.
Turbo in Zen
At a base level, AMD’s Zen turbo was just a step function implementation, with two cores getting the higher turbo speed. However, most cores shipped with features that allowed the CPU to get higher-than-turbo frequencies depending on its power delivery and current delivery limitations.
You may remember this graph from the Ryzen 7 1800X launch:
For Zen processors, AMD enabled a 0.25x multiplier increment, which allows the CPU to jump up in 25 MHz steps, rather than 100 MHz. This bit was easy to understand: it meant more flexibility in what the frequency could be at any given time. AMD also announced XFR, or ‘eXtended Frequency Range’, which meant that with sufficient cooling and power headroom, the CPU could perform better than the rated turbo frequency on the box. Users that had access to a better cooling solution, or had lower ambient temperatures, would expect to see better frequencies, and better performance.
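As a quick illustration of why the finer increment matters, the sketch below compares the frequencies reachable with 1.0x and 0.25x multiplier steps. It assumes a 100 MHz reference clock, which is what makes a 0.25x step equal 25 MHz; the function names and the 4085 MHz target are made up for the example.

```python
# Assumption: a 100 MHz reference clock, so a 0.25x step is 25 MHz.
REFERENCE_CLOCK_MHZ = 100.0


def frequency_mhz(multiplier: float) -> float:
    """Core frequency produced by a given multiplier."""
    return REFERENCE_CLOCK_MHZ * multiplier


def snap_multiplier(target_mhz: float, increment: float) -> float:
    """Largest multiplier, in steps of `increment`, that does not exceed the target."""
    steps = int(target_mhz / (REFERENCE_CLOCK_MHZ * increment))
    return steps * increment


# Hypothetical case: conditions would allow 4085 MHz at this instant.
coarse = frequency_mhz(snap_multiplier(4085, 1.0))   # whole-multiplier steps
fine = frequency_mhz(snap_multiplier(4085, 0.25))    # quarter-multiplier steps

print(coarse, fine)
```

With whole multipliers the chip would have to settle at 4000 MHz, while quarter multipliers let it sit at 4075 MHz, which is the extra flexibility described above.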
So the Ryzen 7 1800X was a CPU with a 3.6 GHz base frequency and a 4.0 GHz turbo frequency, which it achieves when 2 or fewer cores are active. If possible, the CPU will use the (now deprecated in later models) eXtended Frequency Range feature to go beyond 4.0 GHz if the conditions are correct (thermals, power, current). When more than two cores are active, the CPU drops down to its all-core boost, 3.7 GHz, and may transition down to 3.6 GHz depending on the conditions (thermals, power, current).
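The step-function behavior described for the Ryzen 7 1800X can be written down as a small lookup. This is only a sketch of the description above, not AMD's firmware logic: the function name, the boolean inputs, and the 100 MHz of XFR headroom are illustrative assumptions.

```python
# Frequencies in MHz, taken from the 1800X description above.
BASE_MHZ = 3600
ALL_CORE_BOOST_MHZ = 3700
TURBO_MHZ = 4000
XFR_HEADROOM_MHZ = 100  # assumption: illustrative amount of XFR headroom


def zen_target_mhz(active_cores: int, xfr_conditions_ok: bool = False,
                   constrained: bool = False) -> int:
    """Target frequency for a first-generation Zen style step function.

    active_cores      -- number of cores currently loaded
    xfr_conditions_ok -- thermals, power and current all leave headroom
    constrained       -- thermals, power or current force a drop to base
    """
    if active_cores <= 2:
        # Two or fewer cores: rated turbo, plus XFR if conditions allow.
        return TURBO_MHZ + XFR_HEADROOM_MHZ if xfr_conditions_ok else TURBO_MHZ
    # More than two cores: all-core boost, or base when constrained.
    return BASE_MHZ if constrained else ALL_CORE_BOOST_MHZ


for cores in (1, 2, 3, 8):
    print(cores, zen_target_mhz(cores))
```

The sharp drop between two and three active cores is the "step" that later boost algorithms smooth out.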
Turbo in Zen+, then Zen2
AMD dropped XFR from its marketing materials, tying it all under Precision Boost. Ultimately the boost function of the processor relied on three new metrics, alongside the regular thermal and total power consumption guidelines:
PPT: Socket Power Capacity
TDC: Sustained VRM Capacity
EDC: Peak/Transient VRM Capacity
In order to get the highest turbo frequencies, users would have to score big on all three metrics, as well as cooling, to stop one being a bottleneck. The end result promised by AMD was an aggressive voltage/frequency curve that would ride the limit of the hardware, right up to the TDP listed on the box.
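To make the idea of one metric becoming the bottleneck concrete, here is a minimal sketch of a boost loop that raises the clock in 25 MHz steps until the next step would reach PPT, TDC, EDC, a thermal ceiling, or a maximum frequency. Everything here is an assumption for illustration: the class and function names, the linear load model, and the limit values are hypothetical, and the real algorithm is not public in this form.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Limits:
    ppt_watts: float   # PPT: socket power capacity
    tdc_amps: float    # TDC: sustained VRM capacity
    edc_amps: float    # EDC: peak/transient VRM capacity
    temp_max_c: float  # thermal ceiling
    fmax_mhz: int      # highest frequency the chip will request


@dataclass
class Telemetry:
    power_watts: float
    sustained_amps: float
    peak_amps: float
    temp_c: float


def limiting_factor(t: Telemetry, lim: Limits) -> Optional[str]:
    """Name of the first limit the readings reach, or None if all have headroom."""
    checks = [
        ("PPT", t.power_watts, lim.ppt_watts),
        ("TDC", t.sustained_amps, lim.tdc_amps),
        ("EDC", t.peak_amps, lim.edc_amps),
        ("thermal", t.temp_c, lim.temp_max_c),
    ]
    for name, value, cap in checks:
        if value >= cap:
            return name
    return None


def boost(start_mhz: int, lim: Limits, model: Callable[[int], Telemetry],
          step_mhz: int = 25) -> Tuple[int, str]:
    """Step the clock up until the next step would reach a limit or pass Fmax."""
    freq = start_mhz
    while True:
        candidate = freq + step_mhz
        if candidate > lim.fmax_mhz:
            return freq, "Fmax"
        reason = limiting_factor(model(candidate), lim)
        if reason is not None:
            return freq, reason
        freq = candidate


def toy_model(freq_mhz: int) -> Telemetry:
    """Hypothetical linear load: 100 W, 80 A sustained, 100 A peak at 4000 MHz."""
    return Telemetry(
        power_watts=freq_mhz / 40,
        sustained_amps=freq_mhz / 50,
        peak_amps=freq_mhz / 40,
        temp_c=freq_mhz / 60,
    )


# Illustrative limit values; the second board has a weaker transient VRM limit.
roomy = Limits(ppt_watts=142, tdc_amps=95, edc_amps=140, temp_max_c=95, fmax_mhz=4400)
weak_vrm = Limits(ppt_watts=142, tdc_amps=95, edc_amps=105, temp_max_c=95, fmax_mhz=4400)

print(boost(3800, roomy, toy_model))
print(boost(3800, weak_vrm, toy_model))
```

In this toy setup the first configuration climbs all the way to its maximum frequency, while the second stops early because EDC alone runs out, even though power, sustained current and temperature still have headroom.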
This means we saw a much tighter turbo boost algorithm compared to Zen. Both Zen+ and Zen2 then moved to this boost algorithm that was designed to offer a lot more frequency opportunities in mixed workloads. This was known as Precision Boost 2.
In this algorithm, we saw more than a simple step function beyond two threads, and depending on the specific chip performance as well as the environment the chip was in, the non-linear curve would react to the conditions and the workload to hit the total power consumption of the chip as listed. The benefit of this was more performance in mixed workloads, in exchange for a tighter power consumption and frequency algorithm.
Move forward to Zen2, and one of the biggest differences is how the CPUs are binned. Since Zen, AMD’s own Ryzen Master software had been listing ‘best cores’ for each chip – for every Ryzen CPU, it would tell the user which cores had performed best based on internal testing, and were predicted to have the best voltage/frequency curve. AMD took this a step further, and with the new 7nm process, in order to get the best frequencies out of every chip, it would perform binning per core, and only one core was required to reach the rated turbo speed.
So for example, here is a six-core Ryzen 5 3600X, with a base frequency of 3.8 GHz and a turbo frequency of 4.4 GHz. By binning tightly to the silicon maximums (for a given voltage), AMD was able to extract more performance on specific cores. If AMD had followed Intel’s binning strategy relating to turbo here, we would see a chip that would only be 4.2 GHz or 4.1 GHz maximum turbo – by going close to the chip limits for the given voltage, AMD is arguably offering more turbo functionality and ultimately more immediate performance.
There is one thing to note here though, which was the point of Paul’s article. In order to achieve maximum performance in a given workload, AMD had to adjust the Windows CPPC scheduler in order to assign a workload to the best core. By identifying the best cores on a chip, it meant that when a single threaded workload needed the best speed, it could be assigned to the best core (in our theoretical chip above that would be Core 2).
Note that with an Intel binning strategy, as the binning does not go to the per-core limits but rather relies on per-chip limits, it doesn’t matter what core the work is assigned to: this is the benefit of a homogeneous turbo binning design, and ultimately makes the scheduler algorithm in the operating system very simple. With AMD’s solution, work that needs the highest frequency has to be scheduled on that single best core, and as such the software stack in place needs to know the operation of the CPU and how to assign work to that specific core.
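The contrast between the two binning strategies, and what it asks of the scheduler, can be sketched in a few lines. The per-core frequencies below are invented for a hypothetical six-core chip, and the function names are illustrative; this is not how Windows or CPPC is actually implemented.

```python
from typing import List, Optional, Set

# Hypothetical per-core maximum frequencies (MHz) for a six-core chip.
per_core_max_mhz: List[int] = [4350, 4300, 4400, 4250, 4200, 4300]


def rated_turbo_per_chip(cores: List[int]) -> int:
    """Homogeneous binning: rate the chip at what every core can reach."""
    return min(cores)


def rated_turbo_per_core(cores: List[int]) -> int:
    """Per-core binning: rate the chip at what its best core can reach."""
    return max(cores)


def pick_core(cores: List[int], busy: Set[int]) -> Optional[int]:
    """Choose the fastest idle core for a frequency-sensitive thread."""
    idle = [i for i in range(len(cores)) if i not in busy]
    if not idle:
        return None
    return max(idle, key=lambda i: cores[i])


print(rated_turbo_per_chip(per_core_max_mhz))   # any core will do at this speed
print(rated_turbo_per_core(per_core_max_mhz))   # only the best core gets here
print(pick_core(per_core_max_mhz, busy=set()))  # index of the best core
print(pick_core(per_core_max_mhz, busy={2}))    # next best when it is occupied
```

With the homogeneous rating the scheduler can place the thread anywhere, whereas the per-core rating is only reached if the thread lands on the one core that can sustain it, which is why the operating system needs to know the core ranking.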
Does this make any difference to the casual user? No. For anyone just getting on with their daily activities, it makes absolutely zero difference. While the platform exposes the best cores, you need to be able to use tools to see it, and unless you uninstall the driver stack or micromanage where threads are allocated, you can’t really modify it. For casual users, and for gamers, it makes no difference to their workflow.
This binning strategy however does affect casual overclockers looking to get more frequency – based on AMD’s binning, there isn’t much headroom. All-core overclocks don’t really work in this scenario, because the chip is so close to the voltage/frequency curve already. This is why we’re not seeing great all-core overclocks on most Ryzen 3000 series CPUs. In order to get the best overall system overclocks this time around, users are going to have to play with each core one-by-one, which makes the whole process time consuming.
A small note about Precision Boost Overdrive (PBO) here. AMD introduced PBO in Zen and Zen+, and given the binning strategy on those chips, along with the mature 14/12nm process, users with the right thermal environment and right motherboards could extract another 100-200 MHz from the chip without doing much more than flicking a switch in the Ryzen Master software. Because of the new binning strategy – and despite what some of AMD's poorly executed marketing material has been saying – PBO hasn't been having the same effect, and users are seeing little-to-no benefit. This isn’t because PBO is failing, it’s because the CPU out of the box is already near its peak limits, and AMD’s metrics from manufacturing state that the CPU has a lifespan that AMD is happy with despite being near silicon limits. It ends up being a win-win, although people wanting more performance from overclocking aren’t going to get it – because they already have some of the best performance that piece of silicon has to offer.
The other point of assigning workloads to a specific core revolves around lifespan. Over time, silicon is prone to electromigration, where electrons slowly adjust the position of the silicon atoms inside the chip. Adjusting atom positioning typically leads to higher resistance paths, requiring more voltage to drive the same frequency, which in turn leads to more electromigration. It’s a vicious cycle.
With electromigration, there are two solutions. One is to set the frequency and voltage of the processor low enough that over the expected age of the CPU it won’t ever become an issue, as it happens at such a slow rate – alternatively set the voltage high enough that it won’t become an issue over the lifetime. The second solution is to monitor the effect of electromigration as the core is used over months and years, then adjust the voltage upwards to compensate. This requires a greater level of detection and management inside the CPU, and is arguably a more difficult problem.
What AMD does in Ryzen 3000 is the second solution. The first solution results in lower-than-ideal performance, and so the second solution allows AMD to ride the voltage/frequency limits of a given core. The upshot of this is that AMD also knows (through TSMC’s reporting) how long each chip or each core is expected to last, and the results in their eyes are very positive, even with a single core getting the majority of the traffic. For users that are worried about this, the question is, do you trust AMD?
Also, to point out, Intel could use this method of binning by core. There’s nothing stopping them. It all depends on how comfortable the company is with its manufacturing process aligning with the expected longevity. To a certain extent, Intel already kind of does this with its Turbo Boost Max 3.0 processors, given that they specify specific cores to go beyond the Turbo Boost 2.0 frequency – and these cores get all the priority programs to run at a higher frequency and would experience the same electromigration worries that users might have by running the priority core more often. The difference between the two companies is that AMD has essentially applied this idea chip-wide and through its product stack, while Intel has not, potentially leaving out-of-the-box performance on the table.
144 Comments
ajlueke - Tuesday, September 17, 2019
More specifically, I was referring to this text from the article.
"Because of the new binning strategy – and despite what some of AMD's poorly executed marketing material has been saying – PBO hasn't been having the same effect, and users are seeing little-to-no benefit. This isn’t because PBO is failing, it’s because the CPU out of the box is already near its peak limits, and AMD’s metrics from manufacturing state that the CPU has a lifespan that AMD is happy with despite being near silicon limits."
What silicon limits exactly? AMD's marketing material has always indicated that a CPU will boost until it reaches either the PPT, TDC, EDC, or thermal limits. If none of those are met, it will boost until Fmax, which it simply will not exceed. Now, in a single threaded workload, the user is almost never at a PPT, TDC, EDC or thermal limit, and seems to be just shy of Fmax anyway. Now, if the user enables the auto-oc feature and extends Fmax by 100, 150 or 200MHz...nothing happens. The identical clockspeed and performance are observed.
I see the same thing happen in multicore on my 3900X. It normally hits the EDC and PPT limits under standard boosting. If I remove them, with precision boost overdrive, it does boost higher, but not by much. It again seems to stop at a certain point. Again, EDC, TDC and PPT motherboard limits are not met, I am certainly not at Fmax, and the chip is under 70C, but it stops nonetheless. Nothing I can do makes it boost further.
"The Stilt" seems to mention the silicon fitness monitoring feature (FIT) in his "Matisse Strictly Technical" post on overclock.net. FIT appears to be a specific voltage limit for high and low current the CPU cannot exceed. This has never been included in AMD's documentation, and would help explain why the processors stop boosting when, according to AMD's own documentation, they should keep on going. So what exactly is this feature, and how does it work? I think that answer would do a great deal to alleviate user confusion.
mabellon - Tuesday, September 17, 2019
>> "To a certain extent, Intel already kind of does this with its Turbo Boost Max 3.0 processors... [the] difference between the two companies is that AMD has essentially applied this idea chip-wide and through its product stack, while Intel has not, potentially leaving out-of-the-box performance on the table."
What does this mean? What has Intel not done that AMD has done? Both have variable max frequency per core. Both expose this concept to the OS. Both rely on the same Windows scheduler. What are you alluding to is different here?
It seems to me that Intel's HEDT platform with Turbo 3.0 is very much similar to AMD's implementation in the sense of having certain cores run faster. @Ian how is performance left on the table for Intel here? (Intel non HEDT is obviously stuck on Turbo 2.0 which is at a disadvantage)
Targon - Tuesday, September 17, 2019
The majority of Intel chips are multiplier locked, so there isn't any real overclocking ability to speak of. It is only the K chips that users can overclock. AMD on the other hand, has PBO which is more advanced when it comes down to it.
edzieba - Thursday, September 19, 2019
"What does this mean? What has Intel not done that AMD has done?"
Intel picks the maximum 'turbo' bin as the lowest that any core can achieve. AMD picks their maximum boost bin as the highest that any single core could achieve. 'Turbo 3.0' pre-selected two cores that were able to clock above the all-core turbo bin and allowed them to clock higher for lightly threaded workloads.
Jaxidian - Tuesday, September 17, 2019
Is this WSL tool available for us to use? I'd love to have a better view of what speeds my cores could hit with a tool like this. In fact, I'd probably use it to map out all 12 cores (disabling 11 of them at a time). Obviously even that wouldn't quite give the whole picture, but it would be an interesting baseline map to have for my 3900x chip.
Jaxidian - Tuesday, September 17, 2019
I got my "no" answer here: https://twitter.com/IanCutress/status/117401405985...
"It's a custom kludgy thing for internal use."
MFinn3333 - Tuesday, September 17, 2019
I miss the old days when I would just push the Turbo button on my 286 and the CPU would go from 10MHz to 12MHz. Sure, occasionally a chip popped off from the glue, but it was totally worth it to play Dune 2.
sing_electric - Tuesday, September 17, 2019
"Turbo, in this instance, is aspirational. We typically talk about things like ‘a 4.4 GHz Turbo frequency’, when technically we should be stating ‘up to 4.4 GHz Turbo frequency’."
This is true, but EXACTLY the problem. The marketing teams at AMD, Intel and everyone else KNOW that when you see "3.6 GHz / 4.5 GHz Turbo" written on a box, your eye falls to the second, larger number, and that's what sticks in your head.
Why should the consumer know that some of the numbers on the box (core count, base freq) are guaranteed, but some (turbo) aren't? That makes no sense and is borderline deceptive. And this doesn't just matter to the fairly small, tech savvy group of people who buy a processor alone in a box - here's how Dell lists the processor on its base config XPS 13 laptop when you go to "Tech Specs & Customization"
"8th Generation Intel® Core™ i5-8265U Processor (6M Cache, up to 3.9 GHz, 4 cores)"
Dell doesn't even bother LISTING the base frequency, even when you click to get more detail - how's a consumer supposed to gauge how fast their processor is? (To their credit, Apple, HP and Lenovo all list base frequency and "up to" the turbo).
Turbo is a great technology for getting the most out of limited silicon, but both AMD and Intel are, while not QUITE being untruthful, certainly trying to put their products in as good of a light as possible.
DigitalFreak - Tuesday, September 17, 2019
That's marketing for you. Step as close to the "deceive the customer" line as possible without getting sued.
Jaxidian - Tuesday, September 17, 2019
I'm looking at the retail box for my 3900x right now. The only thing it says about frequencies is "4.6 GHz Max Boost, 3.8 GHz Base". There is no "up to" verbiage anywhere on the box. From an FTC advertising standpoint, the 4.6GHz should be guaranteed even if only under nuanced "limited single-core" and "with specific but reasonable motherboard, cooling, and software" scenarios.
While this is a very good article and I generally have very few issues with AMD's new approach here, I'm of the belief that legally, a 3900x should be guaranteed to hit 4.6GHz when in a specific-yet-real-world scenario. I don't mean $100 mobos with $25 coolers should be able to hit it. But a better-than-budget x570 motherboard using the stock cooler with proper updates on a supported OS should absolutely hit 4.6GHz with certain loads. Otherwise, I think there's a real legal issue here.
All this said, I am now seeing 4.6GHz from time to time on my 3900x with ABBA on my x570 Aorus Master, so we're good here. Never saw higher than 4.575 before ABBA.