Power Consumption, Temperature

Two other arguments for having SMT enabled or disabled come down to power consumption and temperature.

With SMT enabled, the core utilization is expected to be higher, with more instructions flowing through and being processed per cycle. This naturally increases the power requirements on the core, but might also reduce the frequency of the core. The trade-off is meant to be that the work going through the core should be more than enough to make up for extra power used, or any lower frequency. The lower frequency should enable a more efficient throughput, assuming the voltage is adjusted accordingly.
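The efficiency point in the last sentence can be illustrated with the textbook approximation for CMOS dynamic power, P ≈ C·V²·f. The sketch below is not based on any measurement from this article: the capacitance constant and the two voltage/frequency pairs are hypothetical values, chosen only to show why a lower frequency at a lower voltage can do more work per joule.

```python
def dynamic_power(c_eff, voltage, freq_hz):
    """Textbook approximation of CMOS dynamic power: P ~ C * V^2 * f."""
    return c_eff * voltage ** 2 * freq_hz

# Hypothetical operating points (assumed values, not measurements).
C_EFF = 1.0e-9  # effective switched capacitance in farads (assumed)
high = dynamic_power(C_EFF, voltage=1.30, freq_hz=4.3e9)
low = dynamic_power(C_EFF, voltage=1.20, freq_hz=4.0e9)

# Cycles per joule, as a crude stand-in for work done per unit of energy.
eff_high = 4.3e9 / high
eff_low = 4.0e9 / low

print(f"high point: {high:.2f} W, low point: {low:.2f} W")
print(f"efficiency gain at the lower point: {eff_low / eff_high - 1:.1%}")
```

In this simplified model the frequency cancels out of the efficiency ratio, so the gain comes entirely from the voltage reduction, which is why the voltage has to be adjusted for the lower frequency to pay off.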

This is perhaps where AMD and Intel differ slightly. Intel’s turbo frequency range is hard-bound to specific frequency values based on core loading, regardless of how many threads are active or how many threads per core are active. The activity is a little more opportunistic when we reach steady state power, although exactly how far down the line that is will depend on the turbo power duration the motherboard has set. AMD’s frequency is continually opportunistic from the moment load is applied: it scales down as more cores are loaded, but it will balance up and down based on core load at all times. On the side of thermals, this will depend on the heat density being generated in each core, but this also acts as a feedback loop into the turbo algorithm if the power limit has not been reached.

For our analysis here, we’ve picked two benchmarks: Agisoft, a variable-threaded test that performs practically the same with SMT On or Off, and 3DPMavx, a pure MT test that gets the biggest gain from SMT.

Agisoft

Photoscan from Agisoft is a 2D image to 3D model creator, using dozens of high-quality 2D images to generate related point maps to form a 3D model, before finally texturing the model using the images provided. It is used in archiving artefacts, as well as converting 2D sculpture into 3D scenes. Our test analyses a standardized set of 85 x 18 megapixel photos, with a result measured in time to complete.

Simply looking at CPU temperatures while running our real-world Agisoft test, our current setup (MSI X570 Godlike with Noctua NH12S) shows that both modes will flutter around 74ºC sustained. Perhaps the interesting element is at the beginning of the test, where the CPU temperatures are higher in SMT Off mode. Looking into the data, the processor runs at 4300 MHz with SMT Off, compared to 4150 MHz when SMT is enabled. This would account for the difference.

Looking at power, we can see that for the bulk of the test, both modes have similar package power consumption, around 130 W. SMT Off mode draws more power during the first couple of minutes of the test, due to the higher frequency. Clearly, having only one thread per core keeps the thermal density in this part of the test low enough to allow a higher turbo.

Measured over the whole test, total power is basically identical in any metric that matters. Nearer the end of the test, where the workload is more variably threaded, SMT Off mode seems to draw slightly less power. Completion time is essentially the same due to the nature of the test, but SMT Off comes in at 2% lower power overall.
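A "total power" figure for a run is most naturally an energy figure, obtained by integrating the logged package power over time. How the numbers here were actually computed is not stated, so the following is only a minimal sketch of one common approach, the trapezoidal rule. The sample log is hypothetical and is not the article's data.

```python
def energy_joules(samples):
    """Integrate (time_s, power_w) samples with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total

def average_power(samples):
    """Mean power over the logged interval, in watts."""
    duration = samples[-1][0] - samples[0][0]
    return energy_joules(samples) / duration

# Hypothetical package power log: (seconds since start, watts).
log = [(0, 125.0), (60, 131.0), (120, 130.0), (180, 129.0)]

print(f"energy: {energy_joules(log):.0f} J")
print(f"average power: {average_power(log):.1f} W")
```

Running the same function over an SMT On log and an SMT Off log, and dividing one result by the other, would give a percentage difference of the kind quoted for each benchmark.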

3DPMavx (3D Particle Movement)

Our 3DPM test is an algorithmic sequence of non-interactive random three-dimensional movement, designed to simulate molecular diffusive movement inside a gas or a fluid. The simulation is made non-interactive (i.e. no two molecules will collide) due to the original average movement of each particle taking collisions into account. Our test cycles through six movement algorithms at ten seconds apiece, followed by ten seconds of idle, with the whole loop being repeated six times, taking about 20 minutes, regardless of how fast or slow the processor is. The related performance figure is millions of particle movements per second. Each algorithm has been accelerated for AVX2.
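To make "non-interactive random three-dimensional movement" concrete, here is a minimal sketch of one such movement step. It is not the benchmark's code: the six algorithms used by 3DPM are not described here, and the function names, unit step length, and particle counts below are all assumptions for illustration.

```python
import math
import random

def random_unit_vector(rng):
    """Pick a direction uniformly on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)

def move_particles(n_particles, n_steps, seed=0):
    """Move each particle independently; particles never interact."""
    rng = random.Random(seed)
    positions = [(0.0, 0.0, 0.0)] * n_particles
    for _ in range(n_steps):
        moved = []
        for x, y, z in positions:
            dx, dy, dz = random_unit_vector(rng)
            moved.append((x + dx, y + dy, z + dz))
        positions = moved
    return positions

final = move_particles(n_particles=1000, n_steps=100)
print(f"simulated {1000 * 100} particle movements")
```

Because no particle depends on any other, the work splits cleanly across as many threads as the processor offers, which is consistent with the test being described as purely multithreaded.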

On the temperature side of things, it is clear that the SMT Off mode again puts up a higher thermal profile. Temperatures this time peak at 66ºC, but the difference between the two modes is clear.

On the power side, we can see why SMT Off mode is warmer – the cores are drawing more power. Looking at the data, SMT Off mode is running ~4350 MHz, compared to SMT On which is running closer to 4000 MHz.

Because of the higher frequency with SMT Off, the estimated total power consumption is 6.8% higher. This appears to be very consistent throughout the benchmark, which lasts about 20 minutes in total.

But let us add in the performance numbers. Because 3DPMavx can take advantage of SMT On, that mode scores +77.5% by having two threads per core rather than one (a score of 10245 vs 5773). Combined, this makes SMT On mode +91% better in performance per watt on this benchmark.
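The performance-per-watt comparison can be rebuilt from the rounded figures quoted in the text. This is a sketch rather than the article's own calculation: it assumes the 6.8% power difference is expressed relative to SMT On, and it uses only the two scores given above.

```python
# Scores quoted in the text for 3DPMavx.
score_on, score_off = 10245, 5773

# SMT Off is quoted as using 6.8% more power; taking SMT On as the
# baseline for that percentage is an assumption.
power_off_over_on = 1.068

perf_gain = score_on / score_off          # throughput ratio, On vs Off
ppw_gain = perf_gain * power_off_over_on  # performance-per-watt ratio

print(f"performance: {perf_gain - 1:+.1%}")
print(f"performance per watt: {ppw_gain - 1:+.1%}")
```

With these rounded inputs the result lands at roughly +90%, close to the quoted +91%; the small gap is presumably down to rounding or to how the power difference was defined.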


126 Comments


  • Holliday75 - Thursday, December 3, 2020 - link

    As usage for modern users changes I wonder how this could be better tested/visualized.

    I am not looking at a 5900x to run any advanced tools. I am looking to game, run multiple browsers with a few dozen tabs open, stream, download, run Plex (transcoding), security tools, VPN, and the million other applications a normal user would have running at any given point in time. While no two users will have the same workload at any given time, how could we quantify SMT versus no SMT for the average user?

    In the not too distant future we could be seeing the average PC running 32 cores. I am talking your run of the mill office machine from Dell that costs $800. Or will we? Is there a point where it does not matter anymore?
  • realbabilu - Thursday, December 3, 2020 - link

    Simple. At average user 4 core 8gen u series have more core than the generation before. It has more strength, but it's rarely got 100 percent cpu utilized for those normal you doing.
    To get 8 threads or 4 cores work 100 percent need killer applications that programmed by man know how to extract every juice of it processor, know how to program multithread, or using optimized math kernel.library / optimized compiler switch like FEM, Render, math applied science.
    Other than those app, maybe you could expense it to gpu for gaming.
  • schujj07 - Thursday, December 3, 2020 - link

    Or you just have multiple tabs open. I regularly hit 100% usage on my work i5-6400 with 4c/4t having 10-12 tabs open. It gets quite annoying as on a normal day I might need up to double that open at any given time. That means that 20 tabs would peg a 4c/8t CPU pretty easily.
  • evilpaul666 - Friday, December 4, 2020 - link

    You need an ad blocker unless those tabs are all very busy doing something. I mean, it sounds like they're mining Monero for somebody else, I mean what they're *supposed* to be doing for you.
  • schujj07 - Friday, December 4, 2020 - link

    I use an ad blocker and nothing is being mined. However, ads are an example of things that will destroy your performance in web browsing quite quickly and suck up a lot of CPU cycles. While right now 4c/8t is enough for an office machine, it will not be long before 6c/12t is the standard.
  • marrakech - Tuesday, December 15, 2020 - link

    15 cores are the futureeeeee
  • Hulk - Thursday, December 3, 2020 - link

    Wouldn't high SMT performance be an indication of bad software design rather than bad core design?
    While SMT performance is changing in these tests the core is not. Only the software is changing. It seems as though an Intel CPU in this comparison would have provided additional insights to these questions.
  • BillyONeal - Thursday, December 3, 2020 - link

    The situations that create high SMT performance are generally outside the software in question's control. For example, a program might have 1 thread that's doing all divides and another that's doing all multiplies. The thread that only has multiplies or divisions aren't poorly designed, they just aren't using units on the chip that don't help their respective workloads.

    There are also cache effects. If you have 2 threads working on data bigger than the CPU's caches while one is waiting for that data to come back from memory the other can make unrelated progress and vice versa, but the data being big isn't necessarily an indicator of poor software design. Some problem domains just have big data sets there's no way around.
  • WaltC - Thursday, December 3, 2020 - link

    Exactly. Some software is written to utilize a lot of threads simultaneously, some is not. Running software that does not make use of a lot of simultaneous threads tells us really nothing much about SMT CPU hardware, imo, other than "this software doesn't support it very well."
  • Elstar - Thursday, December 3, 2020 - link

    SMT24? Ha. Try SMT128: https://en.wikipedia.org/wiki/Cray_XMT#Threadstorm...
