Last week, Apple unveiled its new-generation MacBook Pro laptops, a flagship range that brings significant updates for the company’s professional and power-user audience. The new machines differentiate themselves most notably in being powered by two new entries in Apple’s own silicon line-up, the M1 Pro and the M1 Max. We covered the initial reveal of the two chips in last week’s overview article, and today we’re getting our first glimpses of the performance we can expect from the new silicon.

The M1 Pro: 10-core CPU, 16-core GPU, 33.7bn Transistors

Starting off with the M1 Pro, the smaller sibling of the two, the design appears to be a new implementation of the first-generation M1 chip, but this time architected from the ground up to scale to larger sizes and higher performance. The M1 Pro is, in our view, the more interesting of the two designs, as it offers most of what power users will deem generationally important in terms of upgrades.

At the heart of the SoC we find a new 10-core CPU setup in an 8+2 configuration: eight Firestorm performance cores and two Icestorm efficiency cores. We had indicated in our initial coverage that Apple’s new M1 Pro and Max chips appear to use a similar, if not the same, generation of CPU IP as the M1, rather than the newer cores found in the A15. We can seemingly confirm this, as we see no apparent changes in the cores compared to what we found on the M1.

The CPU cores clock up to a peak of 3228MHz, but vary in frequency depending on how many cores are active within a cluster, dropping to 3132MHz with two cores active, and 3036MHz with three or four cores active. I say “per cluster” because the eight performance cores in the M1 Pro and M1 Max actually consist of two 4-core clusters, each with its own 12MB L2 cache, and each able to clock its CPUs independently of the other. It’s thus possible to have four active cores in one cluster running at 3036MHz while a single active core in the other cluster runs at 3.23GHz.
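To make that behaviour concrete, here’s a minimal sketch of the per-cluster clocking we measured – our own illustration, not Apple’s actual firmware logic – with each cluster picking its frequency purely from its own active core count:

```python
# Measured P-core frequencies (MHz) by number of active cores
# within a single 4-core cluster, per this article's measurements.
P_CLUSTER_FREQ_MHZ = {1: 3228, 2: 3132, 3: 3036, 4: 3036}

def cluster_clock_mhz(active_cores: int) -> int:
    """Hypothetical helper: each of the two P-clusters chooses its
    clock independently, based only on its own active core count."""
    if not 0 <= active_cores <= 4:
        raise ValueError("a cluster has at most 4 P-cores")
    if active_cores == 0:
        return 0  # cluster idle
    return P_CLUSTER_FREQ_MHZ[active_cores]

# One cluster fully loaded while the other runs a single thread:
print(cluster_clock_mhz(4), cluster_clock_mhz(1))  # 3036 3228
```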

The two E-cores in the system clock up to 2064MHz. As opposed to the M1, there are only two of them this time around; however, Apple still gives them the full 4MB of L2 cache, same as on the M1 and the A-series-derived chips.

One major feature of both chips is their much-increased memory bandwidth and interfaces – the M1 Pro features a 256-bit LPDDR5 memory interface at 6400MT/s, corresponding to 204GB/s of bandwidth. This is significantly higher than the M1’s 68GB/s, and also generally higher than competing laptop platforms, which still rely on 128-bit interfaces.
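As a quick sanity check on these numbers, peak DRAM bandwidth is simply bus width times transfer rate. The helper below is our own, for illustration; the M1’s 128-bit LPDDR4X-4266 configuration is taken from our earlier coverage of that chip:

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Peak DRAM bandwidth: (bytes per transfer) x (transfers per second)."""
    return (bus_width_bits / 8) * transfer_rate_mts / 1000  # GB/s

print(peak_bandwidth_gbs(128, 4266))  # M1, LPDDR4X-4266:   ~68.3 GB/s
print(peak_bandwidth_gbs(256, 6400))  # M1 Pro, LPDDR5-6400: ~204.8 GB/s
```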

We’ve been able to identify the “SLC”, or system level cache as we call it, as coming in at 24MB on the M1 Pro and 48MB on the M1 Max – a bit smaller than we initially speculated, but it makes sense given the SRAM die area. That works out to 12MB per 128-bit memory block, a 50% increase over the per-block SLC on the M1.


The M1 Max: A 32-Core GPU Monstrosity at 57bn Transistors

Above the M1 Pro we have Apple’s second new M1 chip, the M1 Max. The M1 Max is essentially identical to the M1 Pro in terms of architecture and in many of its functional blocks – but what sets the Max apart is that Apple has equipped it with much larger GPU and media encode/decode complexes. Overall, Apple has doubled the number of GPU cores and media blocks, giving the M1 Max virtually twice the GPU and media performance.

The GPU and memory interfaces are by far the most differentiated aspects of the chip: instead of a 16-core GPU, Apple doubles things up to a 32-core unit. On the M1 Max we tested for today, the GPU runs at up to 1296MHz – quite fast for what we consider mobile IP, but still significantly slower than what we’ve seen in the conventional PC and console space, where GPUs can now run at up to around 2.5GHz.
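As a rough yardstick, peak FP32 throughput follows from core count and clock. The sketch below assumes the commonly cited figure of 128 FP32 ALUs per Apple GPU core and one FMA (two FLOPs) per ALU per cycle – an assumption on our part, but one that lands close to Apple’s advertised ~10.4 TFLOPS for the M1 Max:

```python
def peak_fp32_tflops(gpu_cores: int, alus_per_core: int, clock_mhz: int) -> float:
    """Theoretical FP32 peak: ALUs x 2 FLOPs (FMA) per cycle x clock."""
    return gpu_cores * alus_per_core * 2 * clock_mhz * 1e6 / 1e12

print(peak_fp32_tflops(32, 128, 1296))  # M1 Max at measured peak clock: ~10.6 TFLOPS
```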

Apple also doubles up on the memory interfaces, using a whopping 512-bit wide LPDDR5 memory subsystem – unheard of in an SoC, and rare even among historical discrete GPU designs. This gives the chip a massive 408GB/s of bandwidth; how this bandwidth is accessible to the various IP blocks on the chip is one of the things we’ll be investigating today.
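Running the 512-bit width through the same bandwidth arithmetic sketched earlier – peak_bandwidth_gbs(512, 6400) – gives roughly 409.6GB/s, in line with the quoted figure once rounding is accounted for.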

The memory controller caches come in at 48MB on this chip, theoretically amplifying effective memory bandwidth for various SoC blocks, as well as reducing off-chip DRAM traffic and thus the chip’s power and energy usage.

Apple’s die shot of the M1 Max initially struck us as a bit odd, in that we weren’t sure it actually represents physical reality – in particular, on the bottom part of the chip we noted what appears to be a doubled-up NPU, something Apple doesn’t officially disclose. A doubled-up media engine makes sense, as it’s part of the chip’s advertised features; however, until we can get a third-party die shot to confirm that this is indeed how the chip looks, we’ll refrain from speculating further in this regard.

Comments

  • Speedfriend - Tuesday, October 26, 2021 - link

    This isn't their first attempt. They have been building laptop versions of the A-series chips for years now for testing, and there have been leaks about this for years. Assuming that the world's best SoC design team will make another significant advance from here, after 10 years of progress on the A-series, is hoping for a bit much.
  • robotManThingy - Tuesday, October 26, 2021 - link

    All of the games are x86 code translated by Apple's Rosetta, which means they are meaningless when it comes to determining the speed of the M1 Max or any other M1 chip.
  • TheinsanegamerN - Tuesday, October 26, 2021 - link

    Real-world software isn't worthless.
  • AshlayW - Tuesday, October 26, 2021 - link

    "The M1X is slightly slower than the RTX-3080, at least on-paper and in synthetic benchmarks."
    Not quite: it matches the 3080 in mobile-focused synthetics, where Apple is focusing on appearing to have best-in-class performance, and then its true colours show in actual video gaming. This GPU is for content creators (where it's excellent), but you don't just out-muscle the decades of GPU IP optimisation for gaming, in hardware and software, that AMD/NVIDIA have. Furthermore, the M1 Max has significantly fewer GPU resources than the GA104 chip in the mobile 3080, which here is actually limited to quite low clock speeds; it's no surprise the 3080 is faster in actual games, by a lot.
  • TheinsanegamerN - Tuesday, October 26, 2021 - link

    Rarely do synthetics ever line up with real-world performance, especially in games. Matching 3060 mobile performance is already pretty good.
  • NPPraxis - Tuesday, October 26, 2021 - link

    Where are you seeing "actual gaming performance" benchmarks that you can compare? There are very few AAA games available for Mac to begin with; most of the ones that do exist run under Rosetta 2 or don't use Metal; and Windows games running in VMs or under WINE + Rosetta 2 have massive overhead.

    The number of actual games running is tiny and basically the only benchmark I've seen is Shadow of the Tomb Raider. I need a higher sample size to state anything definitively.

    That said, I wouldn't be shocked if you're right; Apple has always targeted workstation GPU buyers more than gaming GPU buyers.
  • GigaFlopped - Tuesday, October 26, 2021 - link

    The games tested were already ported over to the Metal API; it was only the CPU side that was emulated. We've seen emulated benchmarks before, and the M1 and Rosetta do a pretty decent job of it, and running the games at 4K would have pretty much removed any CPU bottleneck. So what you see is pretty much what you'll get in terms of real-world rasterization performance: they might squeeze an extra 5% or so out of it, but don't expect any miracles. It's an RTX 3060 Mobile competitor in terms of rasterization, which is certainly not to be sniffed at and a very good achievement. The fact that it can match the 3060 whilst consuming less power is a feat of its own, considering this is Apple's first real attempt at a desktop-performance-class GPU.
  • lilkwarrior - Friday, November 5, 2021 - link

    These M1 chips aren't appropriate for serious AAA gaming. They don't even have hardware-accelerated ray tracing and the other core DX12U/Vulkan tech for upcoming current-gen games. Want to preview that? Play Metro Exodus: Enhanced Edition.
  • OrphanSource - Thursday, May 26, 2022 - link

    you 'premium gaming' encephalitics are the scum of the GD earth. Oh, you can only play your AAA money-pit cash grabs at 108 fps instead of 145 fps at FOURTEEN FORTY PEE on HIGH QUALITY SETTINGS? OMG, IT'S AS BAD AS THE RTX 3060? THE OBJECTIVELY MOST COST/FRAME EFFECTIVE GRAPHICS CARD OF 2021??? WOW, THAT SOUNDS FUCKING AMAZING!

    Wait, no, I misunderstood, you are saying that's a bad thing? Oh you poor, old, blind, incontinent man... well, at least I THINK you are blind if you need 2K resolution at well over 100 fps across the most graphics-intensive games of 2020/2021 to see what's going on clearly enough to EVEN REMOTELY enjoy the $75 drug you pay for (the incontinence I assume because you 1. clearly wouldn't give a sh*t about these top-end, graphics-obsessed metrics and 2. have literally nothing else to do except shell out enough money to feed a small family for a week with the cost of each of your cutting-edge games UNLESS you were homebound in some way?)

    Maybe stop being the reason why the gaming industry only cares about improving their graphics at the cost of everything else. Maybe stop being the reason why graphics cards are so wildly expensive that scientific researchers can't get the tools they need to do the more complex processing needed to fold proteins and cure cancer, or use machine learning to push ahead in scientific problems that resist our conventional means of analysis.

    KYS fool
  • BillBear - Monday, October 25, 2021 - link

    The performance numbers would look even nicer if we had numbers for that GE76 Raider when it's unplugged from the wall and has to throttle the CPU and GPU way the hell down.

    How about testing both on battery only?
