Power Consumption - Up to 65W or not?

TDPs and power consumption have been a topic we’ve been revisiting on an (unfortunately) regular basis with almost every product launch. Over the last few generations of product launches in particular, we’ve been attempting to explain the current industry situation in more depth, in order to demystify marketed power and thermal envelope figures versus what you can actually expect to encounter in real products.

Given our limited time in writing up this Tiger Lake-H system, I’ll refer back to our more extensive articles, in particular the in-depth explanation of TDPs and Intel’s new generation product behaviour in our review of the Tiger Lake reference platform last September:

Alongside that piece, we also want to point out more extensive historical talks about TDPs, Turbo and power consumption:

In the context of today’s Tiger Lake-H reference laptop, we must preface the rest of the review with the power settings the laptop came with, and the resulting behaviour and thermal characteristics of the Core i9-11980HK SKU we’re reviewing today.

Intel’s reference laptop, out of the box as delivered to us by Intel, was seemingly set up with no PL1 limit, or at least with PL1 at what we suspect is the i9-11980HK’s maximum cTDP of 65W. Generally speaking, this is no surprise, as TGL-H targets the high-power desktop-replacement laptop market, which tends to come with capable and extensive thermal dissipation designs.

At first, we started running our tests on the platform in this default reference setting, which we had hoped would represent the best-case scenario for the chip and platform, until we discovered some concerning thermal behaviour under full load:

Under a Prime95 load over a more prolonged test period of 10+ minutes, tracking package power consumption as well as CPU temperatures and frequencies, we’re seeing that the TGL-H reference laptop has great trouble sustaining this default 65W TDP mode.

During the initial idle period, we see the CPU at low power consumption in the 2.2W range, with some workload noise in the mix boosting CPU frequencies up to 4.9GHz.

The initial load ramp results in peak power consumption of up to 86W, but this is a very transient measurement, as power quickly throttles down to 70W and below within seconds. From our readouts, Tau (the PL2 turbo period) appears to be set at 5 seconds for this machine.
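To illustrate how PL1, PL2 and Tau interact, here is a minimal Python sketch that models the turbo budget as an exponentially weighted moving average of package power, which is how the mechanism is commonly described. It is a simplified model and not Intel's actual firmware algorithm; the function names and the 0.1s time step are our own assumptions, while the 65W, 86W, 5s and 2.2W inputs are the figures from the readouts above.

```python
import math

def simulate_turbo(pl1_w, pl2_w, tau_s, idle_w, duration_s, dt_s=0.1):
    """Return (time_s, package_power_w) samples under a simplified EWMA turbo model.

    Hypothetical helper: the chip draws PL2 while the running average of its
    power is below PL1, and drops to PL1 once the average reaches it.
    """
    alpha = 1.0 - math.exp(-dt_s / tau_s)  # per-step smoothing weight for window Tau
    avg_w = idle_w                         # the average starts out at idle power
    trace = []
    for i in range(int(round(duration_s / dt_s))):
        power_w = pl2_w if avg_w < pl1_w else pl1_w
        avg_w += alpha * (power_w - avg_w)  # update the running average
        trace.append((i * dt_s, power_w))
    return trace

def turbo_duration(trace, pl1_w):
    """Return the time of the first sample at or below PL1, or None if never reached."""
    for time_s, power_w in trace:
        if power_w <= pl1_w:
            return time_s
    return None

if __name__ == "__main__":
    trace = simulate_turbo(pl1_w=65.0, pl2_w=86.0, tau_s=5.0, idle_w=2.2, duration_s=20.0)
    print("modelled burst length (s):", turbo_duration(trace, 65.0))
```

In this model the burst lasts somewhat longer than Tau itself when the load starts from idle, because the running average has to climb from idle power up to PL1 before the limit engages. A real system layers thermal limits on top of this, which is what the rest of this page is about.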

The worrying behaviour starts happening after around 2 minutes of load: we do see the CPU package generally trying to limit itself to around 65W, but it’s not a constant steady state, with very obvious large fluctuations between 65W and 35W.

Looking at the temperature, we’re seeing maximum load figures in excess of 95°C, with some 96°C peaks in our coarsely sampled data. What seems to be happening here is that the CPU is thermally tripping between the 65W and 35W states, unable to sustain the 65W state for any prolonged period.
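For readers who want to check their own power logs for this kind of behaviour, a rough sketch of one approach is to count how often the package power flips between a "high" and a "low" band. The function name, the 60W and 40W thresholds and the sample values below are all hypothetical and chosen only to bracket the 65W and 35W states described above; they are not our measured data.

```python
def count_state_flips(samples_w, high_w=60.0, low_w=40.0):
    """Count transitions between the high-power and low-power bands.

    Samples between the two thresholds keep the previous state (hysteresis),
    so small wobbles around a single level are not counted as flips.
    """
    state = None
    flips = 0
    for power_w in samples_w:
        if power_w >= high_w:
            new_state = "high"
        elif power_w <= low_w:
            new_state = "low"
        else:
            new_state = state  # in between: stay in the current band
        if state is not None and new_state != state:
            flips += 1
        state = new_state
    return flips

if __name__ == "__main__":
    # Hypothetical package power samples in watts, not measured data.
    oscillating = [65, 64, 50, 36, 35, 52, 65, 66, 35, 65]
    steady = [45] * 10
    print(count_state_flips(oscillating), count_state_flips(steady))
```

A log that holds one level yields zero flips, while a log that keeps bouncing between the two bands yields a count that grows with the length of the run.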

We’ve confirmed that this throttling, along with the power and frequency fluctuations, happens on several workloads, and the only conclusion we can come to is that the reference system simply doesn’t have an adequate thermal dissipation solution to effectively enable the 65W cTDP mode of the CPU.

While the reference laptop had a bare-bones BIOS, fortunately enough we were able to rely on XTU to change the system’s PL1 settings, and we chose to re-test at 45W given that this is the i9-11980HK’s supposed default TDP setting.

With this simple change, the thermal, power and frequency response of the system appears to be much more reasonable. We’re still seeing peak power consumption figures of 75-86W, which should correspond to the PL2 figures of the chip, but again that’s only for short workloads which fit into the 5 second Tau/Turbo period.

For the remainder of the next 15 minutes, the machine was able to sustain a steady state CPU temperature of around 83°C, and power consumption capped at 45W. Frequencies ended up around 3200MHz all-core for most of the test but had a prolonged 2700MHz period a few minutes in. The 11980HK has an advertised base frequency of 2600MHz, so that seems to be in line with Intel’s specifications.

Unfortunately, we don’t have Intel’s previous generation H-Series processors in house for a direct generational comparison, but the next best thing is the 11980HK’s nearest competitor, AMD’s Ryzen 9 5980HS. This latter chip comes with a 35W TDP, which is 10W lower than the 45W default of the new TGL-H SKU we have in house right now, and we see an obvious difference between the two chips’ long-term thermals and power, which end up at different levels after a while.

The Ryzen 9 has a prolonged 300s semi-turbo state where it sustains 42W power until thermal saturation of the laptop. During this period, with similar power consumption to the 11980HK and also quite similar thermal results of around 80-83°C, both platforms seem to be quite similar, except for the fact that AMD’s Zen3 cores are able to operate at all-core boost frequencies of around 4GHz, while the Willow Cove cores of the TGL-H system operate at around 3200MHz and below. This is an important metric to note as we dive deeper into other results of our test suite.
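To put that frequency gap in perspective, the arithmetic can be sketched as all-core clock per watt of package power. The helper name is our own, and the inputs are the approximate "around" figures quoted above (4GHz at 42W for the Ryzen 9 during its semi-turbo state, 3200MHz at 45W for the 11980HK), so the result is indicative only. Clock speed is also not performance, since the two architectures do different amounts of work per cycle.

```python
def mhz_per_watt(freq_mhz, power_w):
    """All-core frequency sustained per watt of package power."""
    return freq_mhz / power_w

# Approximate figures from the text above, not precise measurements.
ryzen_5980hs = mhz_per_watt(4000, 42)
core_11980hk = mhz_per_watt(3200, 45)
ratio = ryzen_5980hs / core_11980hk

print(f"Ryzen 9 5980HS:   {ryzen_5980hs:.1f} MHz/W")
print(f"Core i9-11980HK:  {core_11980hk:.1f} MHz/W")
print(f"Ratio:            {ratio:.2f}x")
```

With these inputs the AMD part sustains roughly a third more clock per watt during the period where both systems draw similar power.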

In more real-world full load workloads such as Agisoft, we’re seeing the i9-11980HK reach more aggressive boost frequencies due to the more dynamic and differentiated nature of the workload. Boost frequencies during the bulk of the workload reach up to around 4.5GHz, which is what Intel advertises as the nT turbo of the chip, while the rough average sits at around 3.5GHz. Towards the end of the test, we’re seeing lower core count boost frequencies reaching the near 5GHz advertised 1-2T boosts of the cores.

Still, what’s relatively concerning is that temperatures remain quite high even when tested in the 45W PL1 mode: we still see temperatures well in excess of 90°C, peaking at >95°C for transient periods.

Chart: Peak Power

In terms of peak power comparisons, we see that the chip goes up to quite high transients, and has PL2 configurations of around 85-90W. With Tau, and thus the turbo period, being a mere 5 seconds here, it shouldn’t affect thermals too much, and should give the system very good responsiveness, even though it will come at the cost of power and battery life.

Unfortunately, due to the embargo and extremely limited time we’ve had with the system we haven’t yet tested more power scenarios, such as 35W cTDP-down or unplugged battery-only behaviour of the platform. We’ll be following up with updates after today’s initial review.


229 Comments


  • Otritus - Monday, May 17, 2021

    And that's why if there is an option to run AVX-512, I'd like to see it being run. Also with the massive efficiency deficit of Tiger Lake, and AVX-512 requiring even more power, it's plausible Cezanne might be competitive at the same power limits. Although with NAMD, I'd expect Tiger Lake to top the chart.
  • vyor - Monday, May 17, 2021

    That is not how AVX512 works.
  • mode_13h - Monday, May 17, 2021

    Depends on how well the compiler can vectorize your workload or if you're using something like OpenMP. However, if I really wanted max performance from AVX-512, I'd be using the intrinsics.

    BTW, it's worth noting that you can't simply disable AVX-512 with build-time compiler flags, for software that performs runtime code-generation (usually via LLVM). Many popular deep learning frameworks fall in this category.
  • mode_13h - Monday, May 17, 2021

    > Even Skylake-X and cascadelake-X there is a noticeable improvement in performance in AVX-512

    Not always. Clock throttling is so bad in Skylake SP & Cascade Lake that you need an AVX-512 -heavy workload to see a net-benefit.
  • zaza - Monday, May 17, 2021

    I actually ran that test myself in my university lab. we ran AI workloads and vectorized workloads on avx2 and avx512. Running avx512 on all threads would result in a significant clock drop, but despite this, I was testing 20% faster than AVX 256 (they were using the same power ~200 watts). When you mixing several workloads (AVX and non-AVX), you don't see the same drop unless more than 50% of the cores are running AVX. Granted this is anecdotal evidence, but I think the power budgeting across the chip was working well to maximize performance.
  • mode_13h - Monday, May 17, 2021

    > I actually ran that test myself in my university lab. we ran AI workloads and vectorized
    > workloads on avx2 and avx512. Running avx512 on all threads would result in a
    > significant clock drop, but despite this, I was testing 20% faster than AVX 256
    > (they were using the same power ~200 watts).

    Depends on your network architecture. We saw the opposite, and this was confirmed by engineers at Intel. To resolve the problem, they sent us a patch to disable AVX-512.

    > When you mixing several workloads (AVX and non-AVX), you don't see the same drop
    > unless more than 50% of the cores are running AVX.

    I'm talking specifically about AVX-512. And when I had a lot of threads using it for maybe only 10% of the time (different scenario than above), I also saw clock drops big enough to decrease overall system throughput.

    This was all on 14 nm CPUs, so I'm eager to try Intel's 10 nm chips.
  • mode_13h - Monday, May 17, 2021

    > I'm conflicted on your decision to omit AVX-512 on NAMD.

    Let's remember that this is a notebook processor. Sure, it's what most Intel-based mobile workstations will probably use, but most users of this processor aren't going to be recompiling their apps with -march=native.

    In other words, I think the case for testing AVX-512 on this CPU is a lot weaker than on server CPUs or even desktop processors.
  • ballsystemlord - Monday, May 17, 2021

    Good point!
  • Techtree101 - Monday, May 17, 2021

    Dwarf Fortress looks like a great benchmark tool for judging heavy single threaded AI/Simulation games?

    Could I use these benchmark results with it, loosely, to extrapolate what this CPU can do for Civilization VI, Cities Skylines, Minecraft, etc.?
  • Techtree101 - Monday, May 17, 2021

    And Dolphin 5.0 I suppose too.
