CPU MT Performance: A Real Monster

More interesting than ST performance is MT performance. With 8 performance cores and 2 efficiency cores, this is now the largest iteration of Apple Silicon we’ve seen.

As a prelude to the scores, I wanted to remark on a few things about the previous, smaller M1 chip. The 4+4 setup on the M1 meant that a significant chunk of the MT performance was enabled by the E-cores, with the SPECint score in particular seeing a +33% performance boost versus just the 4 P-cores of the system. Because the new M1 Pro and Max have 2 fewer E-cores, assuming linear scaling, the theoretical peak of the M1 Pro/Max should be +62% over the M1. Of course, the new chips should scale better than linearly, due to the better memory subsystem.
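The +62% figure can be reproduced with a quick back-of-the-envelope calculation. The sketch below is only an illustration under stated assumptions: perfectly linear scaling, one P-core normalised to 1.0, and a per-E-core value derived from the +33% SPECint uplift quoted above. The function and variable names are hypothetical and not part of any benchmark tooling.

```python
def linear_mt_estimate(p_cores: int, e_cores: int, e_core_value: float) -> float:
    """Estimated MT throughput in P-core units, assuming perfectly linear scaling."""
    return p_cores * 1.0 + e_cores * e_core_value


# Assumption taken from the text: on the M1, the 4 E-cores add +33% on top of
# the 4 P-cores, so together they are worth 0.33 * 4 P-core units.
E_CORE_VALUE = (0.33 * 4) / 4  # P-core units contributed by a single E-core

m1 = linear_mt_estimate(p_cores=4, e_cores=4, e_core_value=E_CORE_VALUE)
m1_pro_max = linear_mt_estimate(p_cores=8, e_cores=2, e_core_value=E_CORE_VALUE)

# Relative uplift of the 8+2 configuration over the 4+4 configuration.
uplift = m1_pro_max / m1 - 1.0
print(f"Theoretical uplift over M1 (linear scaling): +{uplift:.1%}")
```

Under these assumptions the estimate comes out at a little under +63%, in line with the roughly +62% theoretical peak mentioned above. Real scores can land above this, since the model ignores the improved memory subsystem.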

In the detailed scores I’m showcasing the full 8+2 scores of the new chips, and later we’ll talk about the 8 P-core scores in context. I hadn’t run the MT scores of the new Fortran compiler set on the M1, so some numbers are missing from the charts.

SPECint2017 Rate-N Estimated Scores

Looking at the data, there are very evident changes to Apple’s performance positioning with the new 10-core CPU. Although, yes, Apple does have 2 additional cores versus the 8-core 11980HK or the 5980HS, Apple’s silicon is far ahead of either competitor in most workloads. Again, to reiterate, we’re comparing the M1 Max against Intel’s best of the best, and also nearly AMD’s best (the 5980HX has a 45W TDP).

The workload that stood out to me the most was 502.gcc_r, where the M1 Max nearly doubles the M1 score and lands +69% ahead of the 11980HK. We’re seeing similarly mind-boggling performance deltas in other workloads; memory-bound tests such as mcf and omnetpp are evidently Apple’s forte. A few of the workloads, mostly the more core-bound or L2-resident ones, show smaller advantages, or sometimes even fall behind AMD’s CPUs.

SPECfp2017 Rate-N Estimated Scores

The fp2017 suite has more memory-bound workloads, and it’s here where the M1 Max is absolutely absurd. The workloads that put the most pressure on memory and stress the DRAM the most, such as 503.bwaves, 519.lbm, 549.fotonik3d and 554.roms, all show performance advantages of multiple factors over the best Intel and AMD have to offer.

The performance differences here are just insane, and really showcase just how far ahead Apple’s memory subsystem is in its ability to let the CPUs scale to such a degree in memory-bound workloads.

Even workloads which are more execution-bound, such as 511.povray or 538.imagick, are, albeit not as dramatically, still very clearly in favour of the M1 Max, which achieves significantly better performance at drastically lower power.

As we noted, the M1 Max CPUs are not able to take full advantage of the chip’s DRAM bandwidth. As of writing we haven’t measured the M1 Pro, but we imagine that design would not score much lower than the M1 Max here. We can’t help but ask ourselves how much better the CPUs would score if the cluster and fabric allowed them to fully utilise the memory.

SPEC2017 Rate-N Estimated Total

There are two sides to the aggregate scores. In the SPECint suite, the M1 Max lies +37% ahead of the best competition; it’s a very clear win, and given the power levels and TDPs, the performance-per-watt advantage is clear. The M1 Max is also able to outperform desktop chips such as the 11900K, or AMD’s 5800X.

In the SPECfp suite, the M1 Max is in its own category of silicon with no comparison in the market. It completely demolishes any laptop contender, showing 2.2x the performance of the second-best laptop chip. The M1 Max even manages to outperform the 16-core 5950X, a chip whose package power is 142W, with total system power well above that. It’s an absolutely absurd comparison and a situation we haven’t seen the likes of.

We also ran the chip with just the 8 performance cores active. As expected, the scores are a little lower, by 7-9%; the 2 E-cores here represent a much smaller percentage of the total MT performance than on the M1.
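To put the E-core contribution in perspective, the two figures from the text can be converted into a share of total MT throughput. The snippet below is a rough sketch, not a measurement: it assumes the M1 share can be derived from the +33% SPECint uplift over the 4 P-cores, and it treats the 7-9% drop with the E-cores disabled as the E-cores' share of the full 8+2 score. The helper name is hypothetical.

```python
def share_from_uplift(uplift_over_p_only: float) -> float:
    """E-core share of the total score, given the uplift they add over P-cores alone."""
    return uplift_over_p_only / (1.0 + uplift_over_p_only)


# M1 (4+4): the E-cores add +33% on top of the 4 P-cores (SPECint figure from the text).
m1_e_share = share_from_uplift(0.33)

# M1 Max (8+2): the text gives the drop relative to the full 8+2 score directly.
m1_max_e_share_low, m1_max_e_share_high = 0.07, 0.09

print(f"M1 E-core share of MT score:     ~{m1_e_share:.0%}")
print(
    "M1 Max E-core share of MT score: "
    f"~{m1_max_e_share_low:.0%} to {m1_max_e_share_high:.0%}"
)
```

On these assumptions, the E-cores account for roughly a quarter of the M1's MT score but less than a tenth of the M1 Max's.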

Apple’s stark advantage in specific workloads here does make us ask how this translates into applications and use-cases. We’ve never seen such a design before, so it’s not exactly clear where things would land, but I think Apple has been rather clear that their focus with these designs is catering to the content creation crowd, the power users who use the large productivity applications, be it in video editing, audio mastering, or code compiling. These are all areas where the microarchitectural characteristics of the M1 Pro/Max would shine and are likely to vastly outperform any other system out there.


493 Comments


  • ruthan - Friday, October 29, 2021 - link

    So great on paper and for some number crunching, compiling and maybe some video editing.. but where you really need performance for gaming it sucks... and all Apples lofty paper specs are gone. I know that there is some translation layer, but its Apple choice to use it.
  • richardnpaul - Sunday, October 31, 2021 - link

    I think that that is a bit of an unfair characterisation at this stage.
  • jojo62 - Saturday, October 30, 2021 - link

    I am programmer. Not a gaming programmer but I use my Mac Book Pro 2019 to connect to my work computer. I run Databases like Oracle 21c, microsoft sql server, and others in Windows 11 on my Mac. The performance is great and these laptops last forever. I still have my mac book pro 2012 laptop and it works. I've had many many computers over the years and they all seem to die after 3-4 years but not my apple computers. I think PC makers have implemented planned Obsolescence on their products. I am upgrading to the new mac book pro m1 max soon.
  • razer555 - Saturday, October 30, 2021 - link

    https://www.youtube.com/watch?v=xRPPLrlUeSA

    Anadtech, It seems you really need to test with Baldur's gate 3 which can perform 4K 100~120 FPS.
  • ailooped - Monday, November 1, 2021 - link

    What 7? years back there were proof of concept ARM computers that proved you can run many many processors in parallel. I am not that technically apt, of course. However this seems like apple taking advantage of that fact.

    They are just doubling everything. I am guessing we will see a 64 core graphics and perhaps a max of 128 core for Mac Pro. With M2 cpu cores also doubling to 24 cores or something like that.

    Yes, Apple chose to say goodbye to windows compatibility. However, they have a HUGE developer base in iOS. And they (Mac and iOS) are now on-par and running on the same silicone.

    This is a disruption to the pc world no matter how you slice it. Of course, intel can see it hence the smear ads against apple. Windows is quietly tinkering with their ARM version of windows, just to see if apple can actually take off with it.

    The pc ecosystem is already suffering from the influx of powerful smartphones/tablets. And now apple is in 100% with ARM computers, with a HUGE iOS user base what will be seduced by a seamless transition to Macs from iPhones? Perhaps.. Understandable that Apple is trying...

    Do you really mind though? The Intel/AMD/Nvidia trifecta seems to be quite stagnant on CISC. Perhaps it`s better for the PC ecosystem to be on the same silicone as phones and tablets... To benefit from ALL that R&D money going into it...
  • ailooped - Monday, November 1, 2021 - link

    silicon...
  • ailooped - Monday, November 1, 2021 - link

    To be quite honest, I am not sure I want to see Apple with their approach to hardware gain tons of marketshare on the desktop/laptops.. No upgradeability... RAM integrated into CPU... I DO however think Intel/AMD/Nvidia can do with a fourth player in the GPU/CPU game..
  • jmmx - Tuesday, November 2, 2021 - link

    It would be nice to see some discussion of the NPU. I imagine it would be hard to find any tests across platforms but some type of evaluation would be helpful.
  • bgnn - Tuesday, November 2, 2021 - link

    Clarification on node advantage.. I've designed in both 7nm and 5nm. The power and performance increases are marginal compared to good old days. Back then when we switched from 32nm to 28nm we had more than 70% perf/power increase. 7nm to 5nm it's more like 25% at best. Density is the main benefit. Interconnect is killing it for smaller nodes. Gate contacts are tiny and they are incredibly resistive..
  • Hrunga_Zmuda - Sunday, November 7, 2021 - link

    Anyone who actually designs in this corner of the computer industry must be familiar with the law of diminishing returns. Right?
