A Timely Discovery: Examining Our AMD 2nd Gen Ryzen Results


by Ian Cutress & Ryan Smith on April 25, 2018 11:15 AM EST


A Timely Re-Discovery

Most users have no need to worry about the internals of a computer: point, click, run, play games, and spend money if they want something faster. However, one of the important features of a system is how it measures time. A modern system relies on a series of both hardware and software timers, both internal and external, in order to maintain a linear relation between requests, commands, execution, and interrupts.

The timers have different uses, such as following instructions, maintaining video coherency, tracking real time, or managing the flow of data. Timers can, but do not always, use external references to ensure their own consistency: damage, unexpected behavior, and even thermal environments can cause timers to lose their accuracy.

Timers are highly relevant for benchmarking. Most benchmark results are a measure of work performed per unit time, or in a given time. This means that both the numerator and the denominator need to be accurate: the system has to be able to measure how much work has been processed, and how long it took. Ideally there is no uncertainty in either of those values, giving an accurate result.
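To make the 'work per unit time' idea concrete, here is a minimal Python sketch of a benchmark loop. The names `busy_work` and `measure_throughput` are invented for illustration and do not come from any benchmark suite. The point is only that the reported score depends on the clock just as much as on the work.

```python
import time

def busy_work(n: int) -> int:
    """A stand-in workload: sum of squares below n (hypothetical example work)."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def measure_throughput(n: int, clock=time.perf_counter) -> float:
    """Return work items per second, as reported by the chosen clock."""
    start = clock()
    busy_work(n)
    elapsed = clock() - start
    # Guard against a clock too coarse to see any elapsed time.
    if elapsed <= 0:
        raise RuntimeError("timer resolution too coarse for this workload")
    return n / elapsed

print(f"{measure_throughput(200_000):.0f} items/s")
```

Passing a different `clock` callable changes the score without changing the work done, which is exactly why the choice of timer matters.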

With the advent of Windows 8, Intel and Microsoft between them changed the way that timers were used in the OS. Windows 8 had the mantra that it had to ‘support all devices’, all the way from high-cost systems down to embedded platforms. Most of these platforms use what is called an RTC, a ‘real time clock’, to maintain the real-world time; this is typically a hardware circuit found in almost all devices that need to keep track of time and the processing of data. However, compared to previous versions of Windows, Microsoft changed the way it uses timers so that the OS was compatible with systems that did not have a hardware-based RTC, such as low-cost and embedded devices. The RTC was an extra cost that could be saved if the software was built to do without it.

Ultimately, any benchmark software has to probe the OS for the current time during the benchmark in order to give an accurate result at the end. However, the concept of time, without an external verifying source, is an arbitrarily defined constant: without external synchronization, there is no guarantee that ‘one second’ on the system equals ‘one second’ in the real world. For the most part, all of us rely on the reporting from the OS and the hardware that this equality is true, and there are a lot of hardware/software engineers ensuring that this is the case.

However, back in 2013, it was discovered that it was fairly easy to 'distort time' on a Windows 8 machine. After loading into the operating system, any adjustment in the base frequency of the processor, which is usually 100 MHz, can cause the ‘system time’ to desynchronise from ‘real time’. This was a serious issue in the extreme overclocking community, where world records require the best system tuning: when comparing two systems at the same frequency but with different base clock adjustments, a difference of up to 7% in results was observed when there should have been a sub-1% difference. This was down to how Windows was managing its timers, and was observed on most modern systems.
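The arithmetic behind this kind of distortion can be sketched in a few lines of Python. This is a deliberately simplified model, not a description of how Windows actually derives its time: it assumes the OS calibrated its timer at a 100 MHz base clock and keeps that calibration after the base clock changes, so reported time scales linearly with the actual base clock. The function names and the 93 MHz figure are illustrative only.

```python
def reported_seconds(real_seconds: float, actual_bclk_mhz: float,
                     calibrated_bclk_mhz: float = 100.0) -> float:
    """Seconds the OS would report if its tick rate scales with base clock.

    Assumption: the OS calibrated its timer at `calibrated_bclk_mhz` and
    keeps using that calibration after the base clock is changed.
    """
    return real_seconds * (actual_bclk_mhz / calibrated_bclk_mhz)

def reported_score(work: float, real_seconds: float,
                   actual_bclk_mhz: float) -> float:
    """Benchmark score (work per reported second) under the same assumption."""
    return work / reported_seconds(real_seconds, actual_bclk_mhz)

# Same work, same real duration, different base clock after boot.
honest = reported_score(1000.0, 10.0, 100.0)
skewed = reported_score(1000.0, 10.0, 93.0)
print(f"score inflation: {100 * (skewed / honest - 1):.1f}%")
```

Under this model a base clock lowered by a few percent after boot makes the system believe less time has passed, so the same run reports a higher score even though nothing got faster.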

For home users, most would suspect that this is not an issue. Most users tend not to adjust the base frequencies of their systems manually. For the most part that is true. However, as shown in some of our motherboard testing over the years, frequency response due to default BIOS settings can provide an observable clock drift around a specified value, something which can be exacerbated by the thermal performance. Having a system with observable clock drift, and subsequent timing drift, is not a good thing. It relies on the accuracy and quality of the motherboard components, as well as the state of the firmware. This issue has formally been classified as ‘RTC Bias’.

The extreme overclocking community, after analysing the issue, found a solution: forcing the High Performance Event Timer, known as HPET, found in the chipset. Some of our readers will have heard of HPET before, but our analysis shows there is more to it than first appears.

Why A PC Has Multiple Timers

Aside from the RTC, a modern system makes use of many timers. All modern x86 processors have a Time Stamp Counter (TSC), for example, which counts the number of cycles on a given core and was seen back in the day as a high-resolution, low-overhead way to get CPU timing information. There is also the Query Performance Counter (QPC), a Windows implementation that relies on the processor performance metrics to get a better resolution version of the TSC, which was developed with the advent of multi-core systems where the TSC was not applicable. There is also a timer function provided by the Advanced Configuration and Power Interface (ACPI), which is typically used for power management (which means turbo related functionality). Legacy timing methodologies, such as the Programmable Interval Timer (PIT), are also in use on modern systems. Along with the High Performance Event Timer, depending on the system in play, these timers will run at different frequencies.

[Screenshots: timer detection tool output on six test systems]
Ryzen 7 2700X with HPET Off (ASUS ROG Crosshair VII Hero)
Core i7-8700K with HPET Off (ASRock Z370 Gaming i7)
Core i7-6950X with HPET On (ASUS X99-E-10G)
Core i7-6700K with HPET Off (GIGABYTE X170-Extreme ECC)
Core i5-5200U with HPET Off (GIGABYTE BRIX)
Core i7-3960X with HPET Off (EVGA X79 SLI)

The timers will be used for different parts of the system as described above. Generally, the high performance timers are the ones used for work that is more time sensitive, such as video streaming and playback. HPET, for example, was previously referred to by its old name, the Multimedia Timer. HPET is also the preferred timer for a number of monitoring and overclocking tools, which becomes important in a bit.

Because the HPET runs at 10 MHz or faster as per the specification, any code that requires it is likely to be more in sync with real-world time (‘one second in the machine’ actually equals ‘one second in reality’) than code using any other timer.
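As a quick worked example of what a 10 MHz floor means in practice, the following Python sketch converts a timer frequency into a tick period and a raw tick count into seconds. The helper names are made up for this illustration; only the 10 MHz minimum comes from the text above.

```python
def tick_period_ns(frequency_hz: float) -> float:
    """Length of one timer tick in nanoseconds."""
    return 1e9 / frequency_hz

def ticks_to_seconds(ticks: int, frequency_hz: float) -> float:
    """Convert a raw tick count into elapsed seconds."""
    return ticks / frequency_hz

HPET_MIN_HZ = 10_000_000  # minimum HPET frequency quoted in the text

print(tick_period_ns(HPET_MIN_HZ))                # 100.0 ns per tick
print(ticks_to_seconds(25_000_000, HPET_MIN_HZ))  # 2.5 seconds
```

So at the minimum frequency, any single reading is quantized to 100 ns steps, and a faster timer only shrinks that step.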

In a standard Windows installation, the operating system has access to all the timers available. The software used above is a custom tool developed to show if a system has any of those four timers (but the system can have more). For the most part, depending on the software instructions in play, the operating system will determine which timer is to be used – from a software perspective, it is fundamentally difficult to determine which timers will be available, so the software is often timer agnostic. There is not much of a way to force an algorithm to use one timer or another without invoking specific hardware or instructions that rely on a given timer, although the timers can be probed in software like the tool above.
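As an illustration of how timer-agnostic most software is, the Python sketch below asks the runtime which OS clock sits behind each of its standard timing functions, using `time.get_clock_info` from the standard library. Note what it does not show: it reports the OS-level API in use (on Windows, `perf_counter` is backed by QueryPerformanceCounter), not which hardware timer the OS has chosen underneath. The `describe_clocks` helper is a name made up for this example.

```python
import time

def describe_clocks(names=("perf_counter", "monotonic", "time")):
    """Collect what the runtime reports about each named clock."""
    report = {}
    for name in names:
        info = time.get_clock_info(name)
        report[name] = {
            "implementation": info.implementation,  # OS API behind the clock
            "resolution_s": info.resolution,        # reported resolution, seconds
            "monotonic": info.monotonic,            # True if it cannot go backwards
            "adjustable": info.adjustable,          # True if it can be reset or adjusted
        }
    return report

for name, details in describe_clocks().items():
    print(name, details)
```

The application sees only an API name and a resolution; whether the TSC, HPET, or something else is doing the counting is decided below that layer.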

HPET is slightly different, in that it can be forced to be the only timer. This is a two-stage process:

The first stage is that it needs to be enabled in the BIOS. Depending on the motherboard and the chipset, there may or may not be an option for this. The options are usually enable/disable, but this is not a simple on/off switch. When disabled, HPET is truly disabled. However, when enabled, this only means that the HPET is added to the pool of potential timers that the OS can use.

The second stage is in the operating system. In order to force HPET as the only timer to be used for the OS, it has to be explicitly mentioned in the system Boot Configuration Data (BCD). In standard operation, HPET is not in the BCD, so it remains in the pool of timers for the OS to use. However, for software to guarantee that the HPET is the only timer running, the software will typically request to make a change and require an accompanying system reboot to ensure the software works as planned. Ever wondered why some overclocking software requests a reboot *before* starting the overclock? One of the reasons is sometimes to force HPET to be enabled.
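For readers who want to check their own system: the BCD entry commonly associated with forcing the platform timer is `useplatformclock`, set from an elevated prompt with `bcdedit /set useplatformclock true` and removed with `bcdedit /deletevalue useplatformclock`. The article itself does not name the entry, so treat that as our assumption. The Python sketch below scans `bcdedit` output for that entry; the sample text is a hypothetical excerpt for illustration, not captured from a real machine, and the helper names are invented.

```python
import subprocess

def hpet_forced(bcd_text: str) -> bool:
    """True if a `useplatformclock  Yes` entry appears in bcdedit output."""
    for line in bcd_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].lower() == "useplatformclock":
            return parts[1].lower() in ("yes", "true")
    return False  # entry absent: OS default, HPET stays in the timer pool

def read_current_bcd() -> str:
    """Run bcdedit on Windows (needs an elevated prompt); not called here."""
    result = subprocess.run(["bcdedit", "/enum", "{current}"],
                            capture_output=True, text=True, check=True)
    return result.stdout

# Hypothetical excerpt of bcdedit output, for illustration only.
sample = "identifier              {current}\nuseplatformclock        Yes\n"
print(hpet_forced(sample))
```

On a real Windows system, `hpet_forced(read_current_bcd())` would perform the same check against the live boot entry, assuming the output follows the key/value layout used in the sample.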

This leads to four potential configuration implementations:

BIOS enabled, OS default: HPET is in list of potential timers
BIOS enabled, OS forced: HPET is used in all situations
BIOS disabled, OS default: HPET is not available
BIOS disabled, OS forced: HPET is not available
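The four cases above reduce to a simple rule: the BIOS setting gates availability, and the OS setting only matters once the BIOS has enabled the timer. A small Python sketch of that rule, with `hpet_status` being a name made up for this illustration:

```python
def hpet_status(bios_enabled: bool, os_forced: bool) -> str:
    """Map the two settings onto the four outcomes listed in the text."""
    if not bios_enabled:
        # The BIOS switch wins: forcing in the OS cannot bring HPET back.
        return "not available"
    return "used in all situations" if os_forced else "in list of potential timers"

for bios in (True, False):
    for forced in (False, True):
        label = (f"BIOS {'enabled' if bios else 'disabled'}, "
                 f"OS {'forced' if forced else 'default'}")
        print(f"{label}: HPET {hpet_status(bios, forced)}")
```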

Again, for extreme overclockers relying on benchmark results to be equal on Windows 8/10, HPET has to be forced to ensure benchmark consistency. Without it, the results are invalid.

The Effect of a High Performance Timer

With a high performance timer, the system is able to accurately determine clock speeds for monitoring software, or video streaming processing to ensure everything hits in the right order for audio and video. It can also come into play when gaming, especially when overclocking, ensuring data and frames are delivered in an orderly fashion, and has been shown to reduce stutter on overclocked systems. And perhaps most importantly, it avoids any timing issues caused by clock drift.

However, there are issues fundamental to the HPET design which mean that it is not always the best timer to use. HPET is a continually upward counting timer, which relies on register recall or comparison metrics rather than a ‘set at x and count-down’ type of timer. The speed of the timer can, at times, cause a comparison to fail, because the target time has already passed by the time the compared value has been written to the register. Using HPET for very granular timing requires a lot of register reads/writes, adding to the system load and power draw, and in a workload that requires explicit linearity, can actually introduce additional latency. Usually one of the biggest benefits to disabling HPET on some systems is the reduction in DPC Latency, for example.
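The failed-comparison case can be illustrated with a toy model in Python. The assumptions here are ours and are not taken from any datasheet: the counter only counts up, wraps at 2**32, fires on an exact equality match with the comparator, and the comparator write takes a fixed number of ticks to land. Real hardware and drivers have ways to detect and work around this, so treat it as a sketch of the failure mode only.

```python
COUNTER_BITS = 32
WRAP = 1 << COUNTER_BITS  # counter values run from 0 to WRAP - 1

def ticks_until_fire(now: int, delta: int, write_latency: int) -> int:
    """Ticks from `now` until an equality comparator fires.

    `now` is the counter value read by software, `delta` the requested
    delay, and `write_latency` the ticks that pass before the comparator
    write takes effect (all hypothetical parameters of this toy model).
    """
    target = (now + delta) % WRAP
    landed = (now + write_latency) % WRAP  # counter value when the write lands
    wait = (target - landed) % WRAP        # ticks from landing to the next match
    return write_latency + wait

# Delay comfortably longer than the write latency: fires after 50 ticks.
print(ticks_until_fire(now=1_000, delta=50, write_latency=10))
# Delay shorter than the write latency: the target has already passed,
# so the match only happens after the counter wraps all the way around.
print(ticks_until_fire(now=1_000, delta=5, write_latency=10))
```

In this model a missed 32-bit comparison at 10 MHz would not fire for roughly seven minutes, which is why very short intervals are awkward on an up-counting timer.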


242 Comments


BillyONeal - Wednesday, April 25, 2018 - link
The TSC is *clock cycle* accurate but not *real time* accurate. It speeds up and slows down relative to real time with changes in CPU clock speed; such as what CPUs do on their own when system power state changes.

That is, when a hypothetical 1.6GHz chip downclocks to 800MHz, the TSC's rate relative to real time is cut in half.
mczak - Wednesday, April 25, 2018 - link
No, that was true maybe 10+ years ago.
There's several flags to indicate TSC properties:
- constant (fixed clock rate, but may be halted depending on C-State)
- invariant (runs the same independent from C-State)

All intel cpus since at least Nehalem (the first Core-i chips) should support these features (not entirely sure about AMD, probably since Bulldozer or thereabouts).

The TSCs are also usually in-sync for all cpu cores (on single socket systems at least), albeit I've seen BIOSes screwing this up majorly (TSC reg is allowed to be written, but this will destroy the synchronization and it is impossible to (accurately) resync them between cores - unless your cpu supports tsc_adjust meaning you can write an offset reg instead of tsc directly), causing the linux kernel to drop tsc as a clock source even and using hpet instead (at least at that time the kernel made no attempt to resync the TSCs for different cores).

So on all "modern" x86 systems, usually tsc based timing data should be used. It is far more accurate and has far lower cost than anything else. If you need a timer (to generate an interrupt) instead of just reading the timing data, new cpus actually support a tsc_deadline mode, where the local apic will generate an interrupt when the TSC reaches a programmed value.
mczak - Wednesday, April 25, 2018 - link
FWIW I think the reason Ryzen Master (and some other software for OC) requires HPET is because, while the TSC frequency is invariant, it might not be as invariant as you'd like it to be when overclocking (though Ryzen Master has the HPET requirement fixed a while ago).
Ryzen PPR manual (https://support.amd.com/TechDocs/54945_PPR_Family_...) says that the TSC invariant clock corresponds to P0 P-State - this would be cpu base clock. So then naturally from that follows if you were to change the base clock for overclocking, the TSC clock would change too, causing all sorts of mayhem since likely the OS is going to rely on TSC being really invariant (as it's announced as such).
That said, this manual says (for MSRC001_0015) there's a "LockTscToCurrentP0: lock the TSC to the current P0 frequency" bit. It does just what the doctor asked for:
"LockTscToCurrentP0: lock the TSC to the current P0 frequency. Read-write. Reset: 0. 0=The TSC will count at the P0 frequency. 1=The TSC frequency is locked to the current P0 frequency at the time this bit is set and remains fixed regardless of future changes to the P0 frequency."
So maybe they are now setting this (or maybe they always set this and requiring HPET had other reasons...).
BillyONeal - Wednesday, April 25, 2018 - link
Because Intel has mitigated Spectre microcode available, and no such thing is available for AMD yet. Intel is paying that context switching overhead and AMD isn't (yet).
Spunjji - Thursday, April 26, 2018 - link
Factually incorrect here:
https://arstechnica.com/gadgets/2018/04/latest-win...

Given AT are running brand-new AMD CPUs with the latest version of Windows 10, I'm pretty sure they have this code active.
Nutty667 - Thursday, April 26, 2018 - link
It's nothing to do with the accuracy of HPET, but the cost in reading HPET.
Reading HPET is an IO operation and system call, which means you hit the Meltdown mitigation penalty, something that AMD does not suffer from.
chrcoluk - Wednesday, April 25, 2018 - link
No, forcing HPET is a very unusual config, no modern OS has it as the default timer on modern hardware. Not only is it slower but also things like msi-x require LAPIC to work.

In short what anandtech did here is "very bad".
patrickjp93 - Wednesday, April 25, 2018 - link
Well, for every OS but the BSD variety.
Ratman6161 - Friday, April 27, 2018 - link
"forcing HPET is a very unusual config,"
Actually I think it may be very common. For example, based on what I read in the article, my system will have it forced and I never even knew it (will check it as soon as I get home). See, I always do my overclocking from the bios/uefi settings. But a while back, just for grins I decided to try out Ryzen Master. I messed around with it for a while, didn't really like it, and uninstalled it. But the story says when I installed it and rebooted, then it was forced on and that setting did not go away when I uninstalled Ryzen Master. So, essentially anyone who has ever used Ryzen Master or a similar tool will have it forced on unless they knew enough to turn it off. I certainly had no clue how this worked and I'm betting that most other people were as clueless as me.
Ratman6161 - Friday, April 27, 2018 - link
"In short what anandtech did here is "very bad"."
Or you could interpret it as very good because all of us who had no clue have now learned something :)
