Forcing HPET On, Plus Spectre and Meltdown Patches

Based on my extreme overclocking roots back in the day, my automated benchmark scripts for the past year or so have forced HPET through the OS. Given that AMD’s guidance is now that it doesn’t matter for performance, and Intel hasn’t even mentioned the issue in relation to a CPU review, having HPET enabled was the immediate way to ensure that every benchmark result was consistent, and would not be interfered with by clock drift or special motherboard manufacturer in-OS tweaks. This was a fundamental part of my overclocking roots: if I want to test a CPU, I want to make absolutely sure that the motherboard is not causing any issues. It really gets up my nose when, after a series of CPU tests, it turns out that the motherboard had an issue. Keeping HPET on was designed to stop any timing issues should they arise.

From our results over that time, if HPET was having any effect, it went unnoticed: our results were broadly similar to others, and each of the products fell in line with where they were expected. Over the several review cycles we had, there were a couple of issues that cropped up that we couldn’t explain, such as our Skylake-X gaming numbers that were low, or the first batch of Ryzen gaming tests, where the data was thrown out for being obviously wrong; however, we never managed to narrow down the issue.

Enter our Ryzen 2000 series numbers in the review last week, and what had changed was the order of results. The way that forcing HPET was affecting results seemingly changed when we bundled in the Spectre and Meltdown patches, which also come with their own performance decrease on some systems. Pulling one set of results down further than expected set off some alarm bells and called for closer examination.

HPET, by the way it is invoked, is programmed by a memory mapped IO window through the ACPI into the circuit found on the chipset. Accessing it is very much an IO command, and one of the types of commands that fall under the realm of those affected by the Spectre and Meltdown patches. This would imply that any software that required HPET access (or all timing software if HPET is forced) would have the performance reduced even further when these patches are applied, further compounding the issue.

It Affects AMD and Intel Differently: Productivity

So far we have done some quick initial re-testing on the two key processors in this debate, the Ryzen 7 2700X and Intel Core i7-8700K. These are the two most talked about processors at this time, due to the fact that they are closely matched in performance and price, with each one having benefits in certain areas over the other. For our new tests, we have enabled the Spectre/Meltdown patches on both systems – HPET is ‘on’ in the BIOS, but left as ‘default’ in the operating system.

For our productivity tests, on the Intel system, there was an overall +3.3% gain when un-forcing HPET in the OS:

The biggest gains here were in the web tests, a couple of the renderers, WinRAR (memory bound), and PCMark 10. Everything else was pretty much identical. Our compile tests gave us three very odd consecutive numbers, so we are looking at those results separately.

On the AMD system, the difference in the productivity tests was an overall +0.3% gain when un-forcing HPET in the OS:

This is a lower gain, with the biggest rise coming from PCMark 10’s video conference test to the tune of +16%. The compile test results were identical, and a lot of tests were within 1-2%.

It Affects AMD and Intel Differently: Gaming

The bigger changes happen with the gaming results, which is the reason why we embarked on this audit to decipher our initial results. Games rely on timers to ensure data and pacing and tick rates are all sufficient for frames to be delivered in the correct manner – the balance here is between waiting on timers to make sure everything is correct, or merely processing the data and hoping it comes out in more or less the right order: having too fine a control might cause performance delays. In fact, this is what we observe.

With our GTX 1080 and AMD’s Ryzen 7 2700X, we saw minor gains across the board, although it was clear that 1080p was the main beneficiary over 4K. The 10%+ adjustments came only in Civilization 6 and Rise of the Tomb Raider.

Including the 99th percentile data, removing HPET gave an overall boost of around 4%; however, most of the gains were limited to specific titles at the smaller resolutions, which would be important for any user relying on fast frame rates at lower resolutions.

The Intel side of the equation is where it gets particularly messy. We rechecked these results several times, but the data was quite clear.

As with the AMD results, the biggest beneficiaries of disabling HPET were the 1080p tests. Civilization 6 and Rise of the Tomb Raider had substantial performance boosts (also in 4K testing), with Grand Theft Auto observing an additional +27%. By comparison, Shadow of Mordor was ‘only’ +6%.

Given that the difference between the two sets of data is related to the timer, one could postulate that the more granular the timer, the more the effect it can have: on both of our systems, the QPC timer is set for 3.61 MHz as a baseline, but the HPET frequencies are quite different. The AMD system has a HPET timer at 14.32 MHz (~4x), while the Intel system has a HPET timer at 24.00 MHz (~6.6x). It is clear that the higher granularity of the Intel timer is causing substantially more pipeline delays – moving from a tick-to-tick delay of 277 nanoseconds to 70 nanoseconds to 41.7 nanoseconds is crossing the boundary from being slower than a CPU-to-DRAM access to almost encroaching on a CPU-to-L3 cache access, which could be one of the reasons for the results we are seeing, along with the nature of how the HPET timer works.
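To make the arithmetic behind those tick-to-tick figures explicit, here is a minimal Python sketch that converts each quoted timer frequency into a tick period and a ratio against the QPC baseline. The frequencies are the ones quoted above; the function and variable names are our own illustrative choices, not part of any tool used in the testing.

```python
# Timer frequencies quoted in the text, in MHz.
TIMER_FREQUENCIES_MHZ = {
    "QPC baseline": 3.61,
    "HPET (AMD system)": 14.32,
    "HPET (Intel system)": 24.00,
}

# Hypothetical helper names, chosen for illustration only.
def tick_period_ns(frequency_mhz: float) -> float:
    """Return the time between two consecutive ticks, in nanoseconds."""
    # 1 MHz means one tick per microsecond, i.e. 1000 ns per tick.
    return 1_000.0 / frequency_mhz


def ratio_to_baseline(frequency_mhz: float, baseline_mhz: float = 3.61) -> float:
    """Return how many times finer the timer is than the baseline timer."""
    return frequency_mhz / baseline_mhz


if __name__ == "__main__":
    for name, mhz in TIMER_FREQUENCIES_MHZ.items():
        period = tick_period_ns(mhz)
        ratio = ratio_to_baseline(mhz)
        print(f"{name}: {period:.1f} ns per tick ({ratio:.2f}x baseline)")
```

Run as a script, this gives roughly 277.0 ns, 69.8 ns and 41.7 ns per tick, matching the figures in the paragraph above. It only restates the arithmetic; it says nothing about how costly each timer is to read.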

There is also another aspect to gaming that does not appear with standard CPU tests: depending on how the engine is programmed, some game developers like to keep track of a lot of the functions in flight in order to either adjust features on the fly, or for internal metrics. For anyone that has worked extensively on a debug mode and had to churn through the output, it is basically this. If a title had shipped with a number of those internal metrics still running in the background, this is exactly the sort of issue that having HPET enabled could stumble upon - if there is a timing mismatch (based on the way HPET works) and delays are introduced due to these mismatches, it could easily slow down the system and reduce the frame rate.
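As a rough illustration of why the number of timer queries matters, the Python sketch below estimates the average cost of one call to the interpreter's high-resolution timer and scales it by a hypothetical number of queries per frame. This is a sketch under stated assumptions, not how any game engine or our benchmark scripts measure time: `time.perf_counter_ns()` is the standard Python call (backed by QueryPerformanceCounter on Windows), the measurement includes Python loop overhead, and the function names and the figure of 1000 reads per frame are invented for the example.

```python
import time


def average_timer_read_ns(samples: int = 200_000) -> float:
    """Estimate the average cost of one time.perf_counter_ns() call.

    The figure includes Python loop and call overhead, so treat it as an
    upper bound on the cost of the underlying timer query.
    """
    read = time.perf_counter_ns
    start = read()
    for _ in range(samples):
        read()
    end = read()
    return (end - start) / samples


def per_frame_overhead_us(reads_per_frame: int, cost_ns: float) -> float:
    """Total timer overhead per frame, in microseconds."""
    return reads_per_frame * cost_ns / 1_000.0


if __name__ == "__main__":
    cost = average_timer_read_ns()
    # 1000 reads per frame is a made-up number, for illustration only.
    overhead = per_frame_overhead_us(1000, cost)
    print(f"~{cost:.0f} ns per timer read, ~{overhead:.1f} us per frame")
```

Running this once with the OS timer left at default and once with HPET forced should show whether the per-read cost changes on a given system. The absolute numbers will vary by machine and Python version.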

242 Comments

  • BillyONeal - Wednesday, April 25, 2018 - link

    The TSC is *clock cycle* accurate but not *real time* accurate. It speeds up and slows down relative to real time with changes in CPU clock speed; such as what CPUs do on their own when system power state changes.

    That is, when a hypothetical 1.6GHz chip downclocks to 800MHz, the TSC's rate relative to real time is cut in half.
  • mczak - Wednesday, April 25, 2018 - link

    No, that was true maybe 10+ years ago.
    There's several flags to indicate TSC properties:
    - constant (fixed clock rate, but may be halted depending on C-State)
    - invariant (runs the same independent from C-State)

    All intel cpus since at least Nehalem (the first Core-i chips) should support these features (not entirely sure about AMD, probably since Bulldozer or thereabouts).

    The TSCs are also usually in-sync for all cpu cores (on single socket systems at least), albeit I've seen BIOSes screwing this up majorly (TSC reg is allowed to be written, but this will destroy the synchronization and it is impossible to (accurately) resync them between cores - unless your cpu supports tsc_adjust meaning you can write an offset reg instead of tsc directly), causing the linux kernel to drop tsc as a clock source even and using hpet instead (at least at that time the kernel made no attempt to resync the TSCs for different cores).

    So on all "modern" x86 systems, usually tsc based timing data should be used. It is far more accurate and has far lower cost than anything else. If you need a timer (to generate an interrupt) instead of just reading the timing data, new cpus actually support a tsc_deadline mode, where the local apic will generate an interrupt when the TSC reaches a programmed value.
  • mczak - Wednesday, April 25, 2018 - link

    FWIW I think the reason Ryzen Master (and some other software for OC) requires HPET is because, while the TSC frequency is invariant, it might not be as invariant as you'd like it to be when overclocking (though Ryzen Master has the HPET requirement fixed a while ago).
    Ryzen PPR manual (https://support.amd.com/TechDocs/54945_PPR_Family_... says that the TSC invariant clock corresponds to P0 P-State - this would be cpu base clock. So then naturally from that follows if you were to change the base clock for overclocking, the TSC clock would change too, causing all sort of mayhem since likely the OS is going to rely on TSC being really invariant (as it's announced as such).
    That said, this manual says (for MSRC001_0015) there's a "LockTscToCurrentP0: lock the TSC to the current P0 frequency" bit. It does just what the doctor asked for:
    "LockTscToCurrentP0: lock the TSC to the current P0 frequency. Read-write. Reset: 0. 0=The TSC will count at the P0 frequency. 1=The TSC frequency is locked to the current P0 frequency at the time this bit is set and remains fixed regardless of future changes to the P0 frequency."
    So maybe they are now setting this (or maybe they always set this and requiring HPET had other reasons...).
  • BillyONeal - Wednesday, April 25, 2018 - link

    Because Intel has mitigated Spectre microcode available, and no such thing is available for AMD yet. Intel is paying that context switching overhead and AMD isn't (yet).
  • Spunjji - Thursday, April 26, 2018 - link

    Factually incorrect here:
    https://arstechnica.com/gadgets/2018/04/latest-win...

    Given AT are running brand-new AMD CPUs with the latest version of Windows 10, I'm pretty sure they have this code active.
  • Nutty667 - Thursday, April 26, 2018 - link

    It's nothing to do with the accuracy of HPET, but the cost in reading HPET.
    Reading HPET is an IO operation and system call, which means you hit the Meltdown mitigation penalty, something that AMD does not suffer from.
  • chrcoluk - Wednesday, April 25, 2018 - link

    no forcing HPET is a very unusual config, no modern OS has it as the default time on modern hardware. Not only is it slower but also things like msi-x require LAPIC to work.

    In short what anandtech did here is "very bad".
  • patrickjp93 - Wednesday, April 25, 2018 - link

    Well, for every OS but the BSD variety.
  • Ratman6161 - Friday, April 27, 2018 - link

    "forcing HPET is a very unusual config,"
    Actually I think it may be very common. For example, based on what I read in the article, my system will have it forced and I never even knew it (will check it as soon as I get home). See, I always do my overclocking from the BIOS/UEFI settings. But a while back, just for grins, I decided to try out Ryzen Master. I messed around with it for a while, didn't really like it, and uninstalled it. But the story says that when I installed it and rebooted, HPET was forced on, and that setting did not go away when I uninstalled Ryzen Master. So, essentially, anyone who has ever used Ryzen Master or a similar tool will have HPET forced on unless they knew enough to turn it off. I certainly had no clue how this worked and I'm betting that most other people were as clueless as me.
  • Ratman6161 - Friday, April 27, 2018 - link

    "In short what anandtech did here is "very bad"."
    Or you could interpret it as very good because all of us who had no clue have now learned something :)
