Power Consumption: AVX-512 Caution

I won’t rehash the full ongoing issue of how companies report power versus TDP in this review; we’ve covered it a number of times before. In short, Intel uses one published value for sustained performance and an unpublished ‘recommended’ value for turbo performance, and the latter is routinely ignored by motherboard manufacturers. Most high-end consumer motherboards also ignore the sustained value, often 125 W, and allow the CPU to consume as much as it needs, with the real limits being the full power consumption at full turbo, the thermals, or the power delivery of the motherboard.
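To make that more concrete, below is a minimal sketch of how the sustained and turbo limits are meant to interact at stock settings. It assumes the commonly quoted figures for the Core i9-11900K (PL1 = 125 W, PL2 = 251 W, tau = 56 seconds), and the rolling-average window is an illustrative simplification of Intel’s exponentially weighted algorithm rather than the real firmware behaviour.

```python
# Illustrative model of Intel's PL1/PL2/tau power limits (not Intel's code).
# PL1 = sustained limit, PL2 = turbo limit, tau = the window over which the
# average draw is meant to stay at or below PL1.

PL1, PL2, TAU = 125.0, 251.0, 56   # watts, watts, seconds

def allowed_power(history, demand):
    """Power the limiter permits this second, given recent per-second draws."""
    window = history[-TAU:]                      # last ~tau seconds of draw
    avg = sum(window) / len(window)
    if avg < PL1:
        return min(demand, PL2)                  # turbo budget still available
    return min(demand, PL1)                      # budget spent: clamp to PL1

# Start from idle (~10 W placeholder), then apply a heavy all-core load
# asking for 250 W. It runs near PL2 for a few tens of seconds, then gets
# clamped to PL1 for the remainder of the run.
history = [10.0] * TAU
for t in range(90):
    p = allowed_power(history, demand=250.0)
    history.append(p)
    if t % 15 == 0:
        print(f"t = {t:2d} s  permitted draw = {p:5.1f} W")
```

A motherboard that ‘ignores’ the limits is effectively setting PL1 equal to PL2 (or higher) with a very long tau, which is why the 125 W figure bears little resemblance to what we actually measure.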

One of the dimensions of this we don’t often talk about is that the power consumption of a processor always depends on the actual instructions running through the core. A core can be ‘100%’ active while sitting around waiting for data from memory or doing simple addition, however a core has multiple ways to run instructions in parallel, and the most complex instructions consume the most power. This became noticeable in the desktop consumer space when Intel introduced its vector extensions, AVX, to its processor designs. The subsequent introductions of AVX2 and AVX-512 mean that running these instructions draws the most power.

AVX-512 comes with its own discussion, because even entering an ‘AVX-512’ mode causes additional issues. Intel’s introduction of AVX-512 on its server processors showed that, in order to remain stable, the core had to reduce its frequency and increase its voltage, while also pausing the core to enter the special AVX-512 power mode. This made AVX-512 suitable only for sustained, high-performance server code. But now Intel has enabled AVX-512 across its product line, from notebook to enterprise, with the aim of running AI code faster and enabling new use cases. We’re also a couple of generations on from then, and AVX-512 doesn’t take quite the same hit as it did, but it still requires a lot of power.

For our power benchmarks, we’ve taken several tests that represent a real-world compute workload, a strong AVX2 workload, and a strong AVX-512 workload.

Starting with the Agisoft power consumption, we’ve truncated the graph to the first 1200 seconds, as beyond that point it gets messy. Here we see the following power draw in the first stage and second stage of the test:

  • Intel Core i9-11900K (1912 sec): 164 W dropping to 135 W
  • Intel Core i7-11700K (1989 sec): 149 W dropping to 121 W
  • Intel Core i5-11600K (2292 sec): 109 W dropping to 96 W
  • AMD Ryzen 7 5800X (1890 sec): 121 W dropping to 96 W

So in this case, in the heavy second section of the benchmark, the AMD processor draws the lowest power, and it is also the quickest to finish overall. In the more lightly threaded first section, AMD is still saving around 25% of the power compared to the big Core i9.
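Folding runtime into those numbers gives a rough energy picture. We don’t log exactly how long each stage lasts, so the sketch below simply uses the two stage powers as upper and lower bounds on the average draw; the figures are the ones listed above.

```python
# Rough energy bounds for the Agisoft run: energy = average power x time.
# We don't know the exact split between the two stages, so the stage-1 and
# stage-2 powers bracket the true average draw for each chip.

runs = {
    # name:            (runtime s, stage-1 W, stage-2 W)
    "Core i9-11900K":  (1912, 164, 135),
    "Core i7-11700K":  (1989, 149, 121),
    "Core i5-11600K":  (2292, 109,  96),
    "Ryzen 7 5800X":   (1890, 121,  96),
}

for name, (secs, p_stage1, p_stage2) in runs.items():
    low_kj  = p_stage2 * secs / 1000   # whole run at the lower stage power
    high_kj = p_stage1 * secs / 1000   # whole run at the higher stage power
    print(f"{name:15s}  {low_kj:4.0f} to {high_kj:4.0f} kJ")
```

Even taking the Ryzen 7 5800X at its upper bound (~229 kJ) against the Core i9-11900K at its lower bound (~258 kJ), the AMD chip still comes out ahead on total energy for this workload.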

One of the big takeaways from our initial Core i7-11700K review was the power consumption under AVX-512 modes, as well as the high temperatures. Even with the latest microcode updates, both of our Core i9 parts draw lots of power.

The Core i9-11900K in our test peaks at 296 W, showing temperatures of 104ºC, before coming back down to ~230 W and dropping to 4.5 GHz. The Core i7-11700K still shows 278 W in our ASUS board, with temperatures of 103ºC, and after the initial spike we see 4.4 GHz at the same ~230 W.

The Core i5-11600K, with fewer cores, gets a respite here. Our peak power numbers are around the 206 W range, with the workload showing no initial spike and staying around 4.6 GHz. Peak temperatures were at the 82ºC mark, which is very manageable. During AVX2, the i5-11600K was only at 150 W.

Moving to another real-world workload, here’s what the power consumption looks like over time for Handbrake 1.3.2 converting an H.264 1080p60 file into an HEVC 4K60 file.

This is showing the full test, and we can see that the higher-performance Intel processors do get the job done quicker. However, the AMD Ryzen 7 processor draws the lowest power of them all while remaining competitive on completion time. By our estimates, the AMD processor is around twice as efficient as the Core i9 in this test.

Thermal Hotspots

Given that Rocket Lake seems to peak at 104ºC, here’s where we get into a discussion about thermal hotspots.

There are a number of ways to report CPU temperature. We can either take the instantaneous value of a single spot on the silicon while it’s going through a high current-density event, such as heavy compute, or we can consider the CPU as a whole, with all of its thermal sensors. While the overall CPU might accept operating temperatures of 105ºC, individual elements of the core might actually reach 125ºC instantaneously. So what is the correct value, and what is safe?
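Most of those sensors are exposed to the operating system, so this is easy to see for yourself. Below is a minimal sketch, assuming a Linux machine with the standard sysfs hwmon interface (the coretemp driver exposes a package sensor plus one sensor per core; label names vary by platform, and on Windows you would reach for a tool such as HWiNFO instead).

```python
# Dump every hwmon temperature sensor the kernel exposes. With Intel's
# coretemp driver this gives a 'Package id 0' reading plus one per core;
# values are reported in millidegrees Celsius.
import glob
import os

for temp_file in sorted(glob.glob("/sys/class/hwmon/hwmon*/temp*_input")):
    hwmon_dir = os.path.dirname(temp_file)
    try:
        chip = open(os.path.join(hwmon_dir, "name")).read().strip()
        label_file = temp_file.replace("_input", "_label")
        label = (open(label_file).read().strip()
                 if os.path.exists(label_file)
                 else os.path.basename(temp_file))
        millideg = int(open(temp_file).read().strip())
    except OSError:
        continue                    # sensor not readable right now
    print(f"{chip:10s} {label:14s} {millideg / 1000:5.1f} °C")
```

Watching the per-core values during a heavy AVX burst, rather than just the package figure, is exactly the difference between the two reporting methods described above.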

The cooler we’re using on this test is arguably the best air cooling on the market – a 1.8 kilogram full copper ThermalRight Ultra Extreme, paired with a 170 CFM high static pressure fan from Silverstone. This cooler has been used for Intel’s 10-core and 18-core high-end desktop parts over the years, even the ones with AVX-512, and it hasn’t skipped a beat. So because we’re seeing 104ºC here, are we failing in some way?

Another issue we’re coming across with new processor technology is the ability to effectively cool a processor. I’m not talking about cooling the processor as a whole, but rather those hot spots of intense current density. We are going to get to a point where we can’t remove the thermal energy fast enough, and with this design, we might be there already.

I will point out an interesting fact along this line of thinking, though, which might go unnoticed by the rest of the press – Intel has reduced the total vertical height of the new Rocket Lake processors.

The z-height, or total vertical height, of the previous Comet Lake generation was 4.48-4.54 mm. This range was taken from the seven CPUs I had to hand. This Rocket Lake processor, however, is over 0.1 mm thinner, at 4.36 mm. The smaller height of the package plus heatspreader could be a small indicator of the required thermal performance, especially if the gap between the die and the heatspreader (filled with solder) is smaller. If it aids cooling and doesn’t disturb how coolers fit, then great; however, at some point in the future we might have to consider different, better, or more efficient ways to remove these thermal hotspots.

Peak Power Comparison

For completeness, here is our peak power consumption graph.

[Graph: Peak Power]

Platform Stability: Not Complete

It is worth noting that in our testing we had some issues with platform stability with our Core i9 processor. Personally, across two boards and several BIOS revisions, I would experience BSODs in high-memory-use cases. Gavin, our motherboard editor, was seeing lockups during game tests with his Core i9 on one motherboard, but it worked perfectly with a second. We’ve heard of other press seeing lockups, with one person going through three motherboards to find stability. Conversations with an OEM revealed that they had a number of instability issues running their Core i9 processors at default settings.

The exact nature of these issues is unknown. One of my systems refused to POST with 4x32 GB of memory, only with 2x32 GB. Some of the peers we’ve spoken to have had zero problems with any of their systems. For us, our Core i7 and Core i5 were absolutely fine. I have a second Core i9 processor here which is going through stability tests as this review goes live, and it seems to be working so far, which might point to this being a silicon/BIOS issue rather than a memory issue.

Edit: As I was writing this, the second Core i9 crashed and restarted to desktop.

We spoke to Intel about the problem, and they acknowledged our information, stating:

We are aware of these reports and actively trying to reproduce these issues for further debugging.

Some motherboard vendors are only today putting out updated BIOSes for Intel’s new turbo technology, indicating that (as with most launches) there’s a variety of capability out there. Seeing some of the comments from other press in their reviews today, we’re sure this isn’t an isolated incident; however, we do expect this issue to be solved.

Comments

  • ozzuneoj86 - Thursday, April 1, 2021 - link

    "Rocket Lake also gets you PCIe 4.0, however users might feel that is a small add-in when AMD has PCIe 4.0, lower power, and better general performance for the same price."

    If a time-traveling tech journalist had told us back in the Bulldozer days that Anandtech would be writing this sentence in 2021 in a nonchalant way (because AMD having better CPUs is the new normal), we wouldn't have believed him.
  • Hrel - Friday, April 2, 2021 - link

    Just in case anyone able to actually affect change reads these comments, I'm not even interested in these because the computer I built in 2014 has a 14nm processor too... albeit with DDR3 RAM but come on, DDR4 isn't even much of a real world difference outside ultra specific niche scenarios.

    Intel, this is ridiculous, you're going to have been on the SAME NODE for a DECADE HERE!!!!

    Crying out loud, 10nm has been around for longer than Intel's 14nm, this is nuts!
  • James5mith - Saturday, April 3, 2021 - link

    " More and more NAS and routers are coming with one or more 2.5 GbE ports as standard"

    No, they most definitely are not. lol
  • Linustechtips12#6900xt - Monday, April 5, 2021 - link

    gotta say, love the arguments on page 9 lol
  • peevee - Monday, April 5, 2021 - link

    "the latest microcode from Intel should help increase performance and cache latency"

    Do we really want the increase in cache latency? ;) :)
  • 8 Cores is Enough - Wednesday, August 4, 2021 - link

    I just bought the 11900k with a z590 Gigabyte Aorus Pro Ax mobo and Samsung 980 pro 500GB ssd. This replaced my 9900k in a z390 Gigabyte Aorus Master with a 970 pro 512GB ssd.

    They're both 14nm node processors with 8c/16t and both overclocked, 5GHz all cores for the 9900k and 5.2GHz all cores with up to 5.5GHz on one core via turbo modes on the 11900k.

    However, the 11900k outperforms the 9900k in every measure. In video encoding, which I do fairly often, it's twice as fast. In fact, the 11900k can convert 3 videos at the same time, each one as fast as my rtx 2070 super can do 1 video at a time.

    On UserBenchmark.com, my 11900k is the current record holder for fastest 11900k tested. It beats all the 10900k's even in the 64 thread server workload metric. It loses to the 5900x and 5950x in this one metric but clobbers them both in the 1, 2, 4 and 8 core metrics.

    I wish I had a 5900x to test on Wondershare Uniconverter. I suspect my 11900k would match it given the 2X improvement over the 9900k, which was about 1/2 as fast as the 3950x in video conversion.

    I do a lot of video editing as well. Maybe on this workload an AMD 5900x or 5950x would beat the 11900k. It seems plausible so let's presume this and accept Ryzen 9 is most likely still best for video editing.

    But the claim that being stuck on the 14nm node means Intel RKL CPUs perform the same as Haswell, or are even close, does not make sense to me based on my experiences so far going from coffee lake refresh to RKL.

    The Rocket Lake CPUs are like the muscle cars of 1970. They are inefficient beasts that haul buttocks. They exist as a matter of circumstance and we may never see the likes of them again.

    Faster more efficient CPUs will be built but the 11th gen Intel CPUs will be remembered for being the back ported abominations they are: thirsty and fast with the software of 2021 which for the time being still favors single thread processing.

    If you play Kerbal Space Program then get an 11900k because that game is all about single thread performance and right now the 11900k beats all other CPUs at that.
  • Germanium - Thursday, September 2, 2021 - link

    My experimentation with my Rocket Lake Core i7 11700k on my Asus Z590-A motherboard has shown me that, at least on some samples, AVX512 can be more efficient & cooler running than AVX2 at the same clock speed.

    I am running my sample at 4.4GHz for both AVX512 & AVX2. When running HandBrake there is nearly a 10 watt savings when running AVX512 as opposed to AVX2.

    Before anyone says HandBrake does not use AVX512: that is true out of the box, but there is a settings script I found online to activate AVX512 in HandBrake, and it does work. It must be manually entered, no copy & paste available.

    With stock voltage settings at 4.2GHz using AVX2 it was drawing over 200 watts. With my settings I am able to run AVX512 at 4.4 GHz with peak wattage in HandBrake of 185 watts. That was absolute peak wattage. It mostly ran between 170 to 180 watts. AVX2 runs about 10 watts more for slightly less performance at the same clock speed.
  • Germanium - Thursday, September 2, 2021 - link

    Forgot to mention that in order to make AVX512 this efficient, one must set the AVX Guard Band Voltage Offset at or near 0 to bring the power to acceptable levels. Both AVX512 & AVX2 must be lowered. If AVX2 is not lowered by at least the same amount, the AVX512 setting will have little or no effect.
  • chane - Thursday, January 13, 2022 - link

    I hope my post is considered on topic.

    Scenario 1: Without a discrete 1080p-grade graphics card, using on-chip graphics: Given the same core count (but below 10 cores), base and turbo frequencies, and loaded with the same Cinebench and/or Handbrake test loads, would a Rocket Lake Xeon W-series processor run hotter, cooler, or about the same as a Rocket Lake Core i-series processor with the same TDP spec?

    Scenario 2: As above, but with a 1080p-grade discrete graphics card.

    Note: The Xeon processor pc will be using 16GB of ECC memory, however much that may impact heat and fan noise.

    Please advise.
    Thanks.
