CPU MT Performance: A Real Monster

What’s more interesting than ST performance is MT performance. With 8 performance cores and 2 efficiency cores, this is now the largest iteration of Apple Silicon we’ve seen.

As a prelude to the scores, I wanted to remark on a few things about the previous, smaller M1 chip. The 4+4 setup on the M1 meant that a significant chunk of the MT performance was enabled by the E-cores, with the SPECint score in particular seeing a +33% boost versus just the 4 P-cores of the system. Because the new M1 Pro and Max have two fewer E-cores, assuming simple linear scaling, the theoretical peak of the M1 Pro/Max should be around +62% over the M1. Of course, the new chips should scale better than linearly, thanks to the much improved memory subsystem.
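
As a quick sanity check on that estimate, here is the back-of-the-envelope arithmetic as a small Python sketch. The 0.33 figure is the measured SPECint uplift from the M1’s four E-cores quoted above; treating every P-core as an identical unit and assuming perfectly linear MT scaling are simplifying assumptions, not measurements.

```python
# Crude linear model: each P-core counts as 1.0 "unit" of MT throughput.
# On the M1, 4 E-cores added +33% on top of 4 P-cores, so each E-core is
# worth roughly 0.33 units under this model.
e_core_weight = 0.33 * 4 / 4           # ~0.33 P-core equivalents per E-core

m1_total     = 4 + 4 * e_core_weight   # M1:     4P + 4E -> ~5.32 units
m1_max_total = 8 + 2 * e_core_weight   # M1 Max: 8P + 2E -> ~8.66 units

print(f"Theoretical M1 Pro/Max MT uplift over M1: {m1_max_total / m1_total - 1:+.0%}")
# -> roughly +63%, in line with the ~+62% figure above, before any gains
#    from the improved memory subsystem are taken into account.
```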

In the detailed scores I’m showcasing the full 8+2 results of the new chips; later we’ll put the 8-P-core scores into context. I hadn’t run the MT scores of the new Fortran compiler set on the M1, so some numbers will be missing from those charts.

SPECint2017 Rate-N Estimated Scores

Looking at the data, there are very evident changes to Apple’s performance positioning with the new 10-core CPU. Yes, Apple does have 2 additional cores versus the 8-core 11980HK or 5980HS, but the performance advantage of Apple’s silicon is far ahead of either competitor in most workloads. To reiterate, we’re comparing the M1 Max against Intel’s best of the best, and also nearly AMD’s best (the 5980HX has a 45W TDP).

The one workload standing out to me the most was 502.gcc_r, where the M1 Max nearly doubles the M1 score and lands +69% ahead of the 11980HK. We’re seeing similarly mind-boggling performance deltas in other workloads: memory-bound tests such as mcf and omnetpp are evidently Apple’s forte. A few of the workloads, mostly those that are more core-bound or L2-resident, show smaller advantages, and sometimes even fall behind AMD’s CPUs.

SPECfp2017 Rate-N Estimated Scores

The fp2017 suite has a larger share of memory-bound workloads, and it’s here where the M1 Max is absolutely absurd. The workloads that put the most pressure on DRAM, such as 503.bwaves, 519.lbm, 549.fotonik3d and 554.roms, all show performance advantages of multiple factors over the best Intel and AMD have to offer.

The performance differences here are just insane, and really showcase just how far ahead Apple’s memory subsystem is in allowing the CPUs to scale to such a degree in memory-bound workloads.

Even workloads that are more execution-bound, such as 511.povray or 538.imagick, still clearly favour the M1 Max, albeit not as dramatically, achieving significantly better performance at drastically lower power.

We noted how the M1 Max CPUs are not able to fully take advantage of the chip’s DRAM bandwidth, and as of writing we hadn’t measured the M1 Pro, but we imagine that design wouldn’t score much lower than the M1 Max here. We can’t help but wonder how much better the CPUs would score if the cluster and fabric allowed them to fully utilise the memory.

SPEC2017 Rate-N Estimated Total

In the aggregate scores, there are two sides. On the SPECint suite, the M1 Max lies +37% ahead of the best competition; it’s a very clear win, and given the power levels and TDPs, the performance-per-watt advantage is even clearer. The M1 Max is also able to outperform desktop chips such as the 11900K or AMD’s 5800X.

In the SPECfp suite, the M1 Max is in a category of its own, with no comparison in the market. It completely demolishes any laptop contender, showcasing 2.2x the performance of the second-best laptop chip. The M1 Max even manages to outperform the 16-core 5950X, a chip whose package power alone is 142W, with the rest of the system drawing quite a bit more on top of that. It’s an absolutely absurd comparison and a situation the likes of which we haven’t seen before.

We also ran the chip with just the 8 performance cores active. As expected, the scores are a little lower, by 7-9%; the 2 E-cores here represent a much smaller percentage of the total MT performance than on the M1.
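
The same crude linear model from earlier predicts roughly this split; a minimal sketch, again assuming each E-core is worth about a third of a P-core:

```python
# E-core share of total MT throughput under the same linear model as before.
e_core_weight = 0.33                    # P-core equivalents per E-core (assumed)

m1_share     = (4 * e_core_weight) / (4 + 4 * e_core_weight)   # M1:     4P + 4E
m1_max_share = (2 * e_core_weight) / (8 + 2 * e_core_weight)   # M1 Max: 8P + 2E

print(f"E-core share of MT throughput: M1 ~{m1_share:.0%}, M1 Max ~{m1_max_share:.0%}")
# -> roughly 25% on the M1 versus ~8% on the M1 Max, which lines up with the
#    measured 7-9% drop when only the 8 P-cores are active.
```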

Apple’s stark advantage in specific workloads here does make us ask how this translates into applications and use-cases. We’ve never seen such a design before, so it’s not exactly clear where things will land, but I think Apple has been rather clear that their focus with these designs is catering to the content-creation crowd, the power users who run large productivity applications, be it video editing, audio mastering, or code compiling. These are all areas where the microarchitectural characteristics of the M1 Pro/Max would shine and likely vastly outperform any other system out there.

Comments

  • michael2k - Thursday, October 28, 2021 - link

    Power consumption scales linearly with clock speed.

    Clock speed, however, is constrained by voltage. That said, we already know that the M1M itself has a 3.2GHz clock while the GPU is only running at 1.296GHz. It is unknown if there is any reason other than power for the GPU to run so slowly. If they could double the GPU clock (and therefore double its performance) without increasing its voltage, it would only draw about 112W. If they let it run at 3.2GHz it would draw 138W.

    Paired with the CPU drawing 40W, the M1M would still be several times under the Mac Pro's current 902W. So that leaves open the possibility of a multi-chip solution (4 M1P still only draws 712W if the GPU is clocked to 3.2GHz) as well as clocking up slightly to 3.5GHz, assuming no need to increase voltage. Bumping up to 3.5GHz would still only consume 778W while giving us almost 11x the GPU power of the current M1P, which would be 11x the performance of the 3080 found in the GE76 Raider.

    Also, you bring up AMD/Intel/NVIDIA at 5nm, without also considering that when Apple stops locking up 5nm it's because they will be at 4nm and 3nm.
  • uningenieromas - Thursday, October 28, 2021 - link

    You would think that if Apple's silicon engineers are so freakin' good, they could basically work wherever they want...and, yep, they chose Apple. There might be a reason for that?
  • varase - Wednesday, November 3, 2021 - link

    We're glad you shared your religious epiphany with the rest of us 😳.
  • Romulo Pulcinelli Benedetti - Sunday, May 22, 2022 - link

    Sure, Intel and AMD would take all the hard work to advance humanity toward Apple level chips if Apple was not there, believe in this...
  • Alej - Tuesday, October 26, 2021 - link

    The scarcity of native ARM Mac games I don’t fully get; a lot of games get ported to the Switch, which is already ARM. And if they are using Vulkan as the graphics API, then there’s already MoltenVK to translate it to Metal, which, even if not perfect and not using 100% of the available tricks and optimizations, would run well enough.
  • Wrs - Tuesday, October 26, 2021 - link

    @Alej It's a numbers and IDE game. 90 million Switches sold, all purely for gaming, supported by a company that exclusively does games. 20 million Macs sold yearly, most not for gaming in the least, built by a company not focused on gaming for that platform. iPhones are partially used for gaming, however, and sell many times the volume of the Switch, so as expected there's a strong gaming ecosystem.
  • Kangal - Friday, October 29, 2021 - link

    Apple is happy where they are.
    However, if Apple were a little faster/wiser, they would've made the switch from Intel Macs to M1 Macs back in 2018 using the TSMC 7nm node, their Tempest/Vortex CPUs and their A12-GPU. They wouldn't be too far removed from the performance of the M1, M1P, M1X if scaled similarly.

    And even more interesting, what if Apple released a great Home Console?
    Something that is more compact than the Xbox Series S, yet more powerful than the Xbox Series X. That would leave both Microsoft and Sony scrambling. They could've designed a very ergonomic controller with much less latency, and they could've enticed all these AAA developers to their platform (Metal v2 / Swift v4). It would be gaming-centric, with out-of-the-box support for iOS games/apps, and even limited-time support (Rosetta v2) for legacy OS X applications. They wouldn't be able to subsidise the pricing like Sony, but they could basically front the costs from their own pocket to bring it to a palatable RRP. After 2 years, they would be able to turn a profit from its hardware and software sales.

    I'm sure it could have been a hit. And it would then let them pivot to making MacBook Pros more friendly for media consumption and better supported by developers, strengthening their entire ecosystem and leveraging their unique position in software and hardware to remain competitive.
  • kwohlt - Tuesday, October 26, 2021 - link

    I think it is just you. Imagine a hypothetical ultra-thin, fanless laptop that offered 20 hours of battery under load and could play games at desktop 3080 levels... Would you wish this laptop were louder, hotter, and had worse battery life?

    No of course not. Consuming less power and generating less heat, while offering similar or better performance has always been the goal of computing. It's this trend that allows us to daily carry computing power that was once the size of a refrigerator in our pockets and on our wrists.
  • Wrs - Wednesday, October 27, 2021 - link

    No, but I might wish it could scale upward to a desktop/console for way more performance than a 3080. :) That would also be an indictment of how poorly the 3080 is designed or fabricated, or how old it is.

    Now, if in the future silicon gets usurped by a technology that does not scale up in power density, then I could be forced to say yes.
  • turbine101 - Monday, October 25, 2021 - link

    Why would developers waste their time on a device which will have barely any sales?

    The M1 Max Mac costs $6k NZD. That's just crazy; even the most devout Apple enthusiasts cannot justify this. And the Mac is far less usable than iOS.
