Power Consumption, Temperature

Two other arguments for enabling or disabling SMT come down to power consumption and temperature.

With SMT enabled, the core utilization is expected to be higher, with more instructions flowing through and being processed per cycle. This naturally increases the power requirements on the core, but might also reduce the frequency of the core. The trade-off is meant to be that the work going through the core should be more than enough to make up for extra power used, or any lower frequency. The lower frequency should enable a more efficient throughput, assuming the voltage is adjusted accordingly.

This is perhaps where AMD and Intel differ slightly. Intel’s turbo frequency range is hard-bound to specific frequency values based on core loading, regardless of how many threads are active or how many threads per core are active. The behavior is a little more opportunistic once we reach steady-state power, although exactly how long it takes to get there will depend on the turbo power duration the motherboard has set. AMD’s frequency is continually opportunistic from the moment load is applied: it scales down as more cores are loaded, but it will balance up and down based on core load at all times. On the side of thermals, this will depend on the heat density being generated in each core, but this also acts as a feedback loop into the turbo algorithm if the power limit has not been reached.
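The frequency and voltage trade-off described above can be illustrated with the common first-order approximation that dynamic power scales with voltage squared times frequency. The sketch below is not taken from the article's data: the function names and the example ratios (an 8% lower clock with a 5% lower voltage) are hypothetical, and the model ignores leakage and everything else a real turbo algorithm accounts for.

```python
def relative_dynamic_power(freq_ratio, volt_ratio):
    """Dynamic power relative to a baseline, using the approximation P ~ C * V^2 * f."""
    return (volt_ratio ** 2) * freq_ratio


def relative_energy_per_cycle(volt_ratio):
    """Energy per clock cycle relative to a baseline (power divided by frequency)."""
    return volt_ratio ** 2


# Hypothetical operating point: 8% lower frequency, 5% lower voltage.
power = relative_dynamic_power(freq_ratio=0.92, volt_ratio=0.95)
energy = relative_energy_per_cycle(volt_ratio=0.95)

print(f"relative power:            {power:.3f}")
print(f"relative energy per cycle: {energy:.3f}")
```

Under these assumed ratios, power falls faster than frequency does, which is the sense in which a lower clock can give more efficient throughput, provided the voltage really is lowered along with it.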

For our analysis here, we’ve picked two benchmarks: Agisoft, a variable-threaded test that performs practically the same with SMT on or off, and 3DPMavx, a pure multithreaded test that gets the biggest gain from SMT.

Agisoft

Photoscan from Agisoft is a 2D image to 3D model creator, using dozens of high-quality 2D images to generate related point maps to form a 3D model, before finally texturing the model using the images provided. It is used in archiving artefacts, as well as converting 2D sculpture into 3D scenes. Our test analyses a standardized set of 85 x 18 megapixel photos, with a result measured in time to complete.

Simply looking at CPU temperatures while running our real-world Agisoft test, our current setup (MSI X570 Godlike with Noctua NH12S) shows that both modes flutter around 74ºC sustained. Perhaps the interesting element is at the beginning of the test, where CPU temperatures are higher in SMT Off mode. Looking into the data, the processor runs at 4300 MHz with SMT off, compared to 4150 MHz with SMT enabled. This would account for the difference.

Looking at power, we can see that for the bulk of the test, both modes have similar package power consumption, around 130 W. SMT Off mode draws more power during the first couple of minutes of the test, due to the higher frequency. Having only one thread per core evidently lowers the thermal density in this part of the test, which allows for a higher turbo.

If we measure the total power of the test, it’s basically identical in any metric that matters. Nearer the end of the test, where the workload is more variably threaded, SMT Off mode seems to draw less power. The benchmark completion time is essentially the same due to the nature of the test, but SMT Off comes in at 2% lower power overall.

3DPMavx (3D Particle Movement)

Our 3DPM test is an algorithmic sequence of non-interactive random three-dimensional movement, designed to simulate molecular diffusive movement inside a gas or a fluid. The simulation is made non-interactive (i.e. no two molecules will collide) due to the original average movement of each particle taking collisions into account. Our test cycles through six movement algorithms at ten seconds apiece, followed by ten seconds of idle, with the whole loop being repeated six times, taking about 20 minutes, regardless of how fast or slow the processor is. The related performance figure is millions of particle movements per second. Each algorithm has been accelerated for AVX2.

On the temperature side of things, SMT Off mode again puts up a higher thermal profile. Temperatures this time peak at 66ºC, but the difference between the two modes is clear.

On the power side, we can see why SMT Off mode is warmer – the cores are drawing more power. Looking at the data, SMT Off mode is running ~4350 MHz, compared to SMT On which is running closer to 4000 MHz.

Because of the higher frequency with SMT Off, its estimated total power consumption is 6.8% higher. The gap appears to be very constant throughout the benchmark, which lasts about 20 minutes in total.

But let us add in the performance numbers. Because 3DPMavx can take advantage of SMT, SMT On mode scores +77.5% by having two threads per core rather than one (a score of 10245 vs 5773). Combined, this makes SMT On mode +91% better in performance per watt on this benchmark.
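As a rough check on how the performance and power figures combine, the arithmetic can be sketched as below. The two scores and the 6.8% power difference are the figures quoted above; the function name is hypothetical, and the code assumes that "6.8% higher" means SMT Off total power is 1.068 times that of SMT On.

```python
# Figures quoted in the article for 3DPMavx.
SCORE_SMT_ON = 10245
SCORE_SMT_OFF = 5773

# Assumption: "6.8% higher" means SMT Off total power = 1.068 x SMT On total power.
POWER_OFF_OVER_ON = 1.068


def perf_per_watt_ratio(score_on, score_off, power_off_over_on):
    """Return SMT On performance per watt relative to SMT Off (1.0 means parity)."""
    perf_ratio = score_on / score_off
    # (score_on / power_on) / (score_off / power_off) = perf_ratio * (power_off / power_on)
    return perf_ratio * power_off_over_on


perf_ratio = SCORE_SMT_ON / SCORE_SMT_OFF
ppw_ratio = perf_per_watt_ratio(SCORE_SMT_ON, SCORE_SMT_OFF, POWER_OFF_OVER_ON)

print(f"performance ratio:    {perf_ratio:.3f}")
print(f"perf-per-watt ratio:  {ppw_ratio:.3f}")
```

With these rounded inputs the ratio comes out at roughly 1.9x, or about +90%, in line with the quoted +91%; the small gap presumably reflects rounding in the published power figure rather than a different method.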

126 Comments

  • abufrejoval - Thursday, December 3, 2020 - link

    It's hard to imagine a transistor defect that would break *only* SMT. As you say all non-SMT chips are really SMT chips internally and the decision to disable SMT doesn't really result in huge chunks of transistors going dark (the potential target area for physical defects).

    I'd say most of the SMT vs. no-SMT decisions on individual CPUs are binning related: SMT can create significantly more heat because there is less idle which allows the chip to cool. So if you have a chip with higher resistance in critical vias and require higher voltage to function, you need to sacrifice clocks, TDP or utilization (and permutations).
  • leexgx - Saturday, December 5, 2020 - link

    With HT off I have definitely noticed less smoothness in Windows, as with HT it can keep the CPU active when a thread is slightly stuck
  • iranterres - Thursday, December 3, 2020 - link

    Why are people still testing SMT in 2020? Cache coherency and hierarchy design is mature enough to offset the possible instruction bottleneck issues. I don't even know the purpose of this article at all... Anyways, perhaps falling back to 2008? Come on...
  • quadibloc - Friday, December 4, 2020 - link

    Well, instead of testing the concept of SMT, which has been around for a while, perhaps one could think of it as testing the implementation of SMT found on the chips we can get in 2020.
  • eastcoast_pete - Friday, December 4, 2020 - link

    Thanks Ian! I always thought of SMT as a way of using whatever compute capacity a core has, but isn't being used in the moment. Hence it's efficient if many tasks need doing that each don't take a full core most of the time. However, that hits a snag if the cores get really busy. Hence (for desktop or laptop), 6 or 8 real cores are usually better than 4 cores that pretend to be 8.
  • AntonErtl - Friday, December 4, 2020 - link

    I found the "Is SMT a good thing" discussion (and later discussion of the same topics) strange, because it seemed to take the POV of someone who wants to optimize some efficiency or utilization metric and who can choose the number of resources in the core. If you are in that situation, then the take of the EV8 designers was: we build a wide machine so that single-threaded applications can run fast, even though we know the wideness leads to low utilization; we also add SMT so that multi-threaded applications can increase utilization. Now, 20 years later, such wide cores become reality, although interestingly Apple and ARM do not add SMT.

    Anyway, buyers and users of off-the-shelf CPUs are not in that situation, and for them the questions are: For buyers: How much benefit does the SMT capability provide, and is it worth the extra money? For users: Does disabling SMT on this SMT-capable CPU increase the performance or the efficiency?

    The article shows that the answers to these questions depend on the application (although for the Zen3 CPUs available now the buyer's question does not pose itself).

    It would be interesting to see whether the wider Zen3 design gives significantly better SMT performance than Zen or Zen2 (and maybe also a comparison with Intel), but that would require also testing these CPUs.

    I did not find it surprising that the 5950X runs into the power limit with and without SMT. The resulting clock rates are mentioned in the text, but might be more interesting graphically than the temperature. What might also be interesting is the power consumed at the same clock frequency (maybe with fewer active cores and/or the clock locked at some lower clock rate).

    If SMT is so efficient (+91%) for 3DPMavx, why does the graphics only show a small difference?
  • Bensam123 - Friday, December 4, 2020 - link

    Anand, while I value your in-depth articles, you guys really need to drop the 95th percentile frame times and get on board with 1% and .1% lows. What disrupts gaming the most is the hiccups, not looking at a statistically smooth chart. SMT/HT affects these THE most, especially in heavily single threaded games. If you aren't testing what it influences, why test it at all? Youtube reviews are also having problems with tests that don't reflect real world scenarios. Sometimes it's a lot more disagreeable than others.

    Completely invalid testing methodology at this point.

    My advice based on my own testing: turn off SMT/HT except in scenarios in which you become CPU bound across all cores, not just one. This improved .1% and 1% frame times, i.e. stutters. Turn it on when you reach 90%+ utilization, as it helps a lot when your CPU is maxed out. Generally speaking, CPUs with <6 (and soon 8) cores should always have it on.

    You didn't even test where this helps the most and that's low end CPUs vs high end CPUs where you find the Windows scheduler messes things up.

    Also if you're testing this on your own, always turn it off in the BIOS. If you use something like Process Lasso or manually change affinity, Windows will still put protected services and processes onto those extra virtual cores, causing contention issues that lead to the stuttering.

    Most obvious games that get a benefit from SMT/HT off are heavily single threaded games, such as MOBAS.
  • Gloryholle - Friday, December 4, 2020 - link

    Testing Zen3 with 3200CL16?
  • peevee - Friday, December 4, 2020 - link

    "Most modern processors, when in SMT-enabled mode, if they are running a single instruction stream, will operate as if in SMT-off mode and have full access to resources."

    Which would have access to the whole microinstruction cache (L0I) in SMT mode?
  • Arbie - Friday, December 4, 2020 - link

    Another excellent AT article, which happens to hit my level of knowledge and interest; thanks!
