Apple Shooting for the Stars: x86 Incumbents Beware

The previous pages were written ahead of Apple officially announcing the new M1 chip. We already saw the A14 performing outstandingly and outperforming the best that Intel has to offer. The new M1 should perform notably above that.

We come back to a few of Apple’s slides during the presentations as to what to expect in terms of performance and efficiency. Particularly the performance/power curves are the most detail that Apple is sharing at this moment in time:

In this graphic, Apple showcases the new M1 chip featuring a CPU power consumption peak of around 18W. The competing PC laptop chip here is peaking at the 35-40W range so certainly these are not single-threaded performance figures, but rather whole-chip multi-threaded performance. We don’t know if this is comparing M1 to an AMD Renoir chip or an Intel ICL or TGL chip, but in both cases the same general verdict applies:

Apple’s usage of a significantly more advanced microarchitecture that offers significant IPC, enabling high performance at low core clocks, allows for significant power efficiency gains versus the incumbent x86 players. The graphic shows that at peak-to-peak, M1 offers around a 40% performance uplift compared to the existing competitive offering, all whilst doing it at 40% of the power consumption.

Apple’s comparison of random performance points is to be criticised, however the 10W measurement point where Apple claims 2.5x the performance does make some sense, as this is the nominal TDP of the chips used in the Intel-based MacBook Air. Again, it’s thanks to the power efficiency characteristics that Apple has been able to achieve in the mobile space that the M1 is promised to showcase such large gains – it certainly matches our A14 data.

Don't forget about the GPU

Today we mostly covered the CPU side of things as that’s where the unprecedented industry shift is happening. However, we shouldn’t forget about the GPU, as the new M1 represents Apple’s first-time introduction of their custom designs into the Mac space.

Apple’s performance and power efficiency claims here are really lacking context as we have no idea what their comparison point is. I won’t try to theorise here as there’s just too many variables at play, and we don’t know enough details.

What we do know is that in the mobile space, Apple is absolutely leading the pack in terms of performance and power efficiency. The last time we tested the A12Z the design was more than able to compete and beat integrated graphics designs. But since then we’ve seen more significant jumps from both AMD and Intel.

Performance Leadership?

Apple claims the M1 to be the fastest CPU in the world. Given our data on the A14, beating all of Intel’s designs, and just falling short of AMD’s newest Zen3 chips – a higher clocked Firestorm above 3GHz, the 50% larger L2 cache, and an unleashed TDP, we can certainly believe Apple and the M1 to be able to achieve that claim.

This moment has been brewing for years now, and the new Apple Silicon is both shocking, but also very much expected. In the coming weeks we’ll be trying to get our hands on the new hardware and verify Apple’s claims.

Intel has stagnated itself out of the market, and has lost a major customer today. AMD has shown lots of progress lately, however it’ll be incredibly hard to catch up to Apple’s power efficiency. If Apple’s performance trajectory continues at this pace, the x86 performance crown might never be regained.

From Mobile to Mac: What to Expect?
Comments Locked

644 Comments

View All Comments

  • hecksagon - Tuesday, November 10, 2020 - link

    There is also the issue of the benchmarks not being long enough to cause any significant throttling. This is the reason Apple mobile devices are so strong in this benchmark. Their CPUs provide very strong peak performance that slows down as the device gets heat soaked. That's why it looks like an iPhone can compete with a i7 laptop according to this benchmark.
  • misan - Wednesday, November 11, 2020 - link

    Apple delivers performance comparable to that of best 5.0 ghz x86 chips while running at 3 ghz and drawing under 5 watts. You argument does not make any sense logically. Ys, there will be throttling — but at the same cooling performance and consuming the same power Apple chips will always be faster. In fact, their lead on x86 chips will increase when the CPUs are throttled, since Intel will need to drop the clocks significantly — Apple doesn't.
  • hecksagon - Tuesday, November 10, 2020 - link

    No he is saying that Geekbench weight on cache bound workloads to no represent reality.
  • techconc - Wednesday, November 11, 2020 - link

    GB5 scores are inline with Spec results, so there is no merit to the claim that they don't match reality.
  • chlamchowder - Wednesday, November 11, 2020 - link

    The large lsq/other ooo resource queues and high MLP numbers are there to cover for the very slow L3 cache. With 39ns latency on A13 and similar looking figures here, you're looking at over 100 cycles to get to L3. That's worse than Bulldozer's L3, which was considered pretty bad.
  • name99 - Wednesday, November 11, 2020 - link

    Why not try to *understand* Apple's architecture rather than concentrating on criticism?
    (a) Apple's design is *their* design, it is not a copy of AMD or Intel's design
    (b) Apple's design is optimized for the SoC as a whole, not just the CPU.

    The L3 on Apple SoC's does not fulfill the role of a traditional L3, that is why Apple calls it an SLC (System Level Cache). For traditional CPU caching, Apple has a large L2 (8MiB A14, 12MiB M1).

    The role of the L3 is PRIMARILY
    - to save power (everything, especially on the GPU side, that can be kept there rather than in DRAM is a power advantage)
    - to communicate between different elements of the SoC.
    The fact that the SLC can act as a large (slow, but still faster than DRAM) L3 is just a bonus, it is not the design target.

    Why did Apple keep pushing the UMA theme at their event? The stupid think it's Apple claiming that they are first with UMA; but Apple never said that. The point is that UMA is part of what enables Apple's massive cross-SoC accelerator interaction; while the SLC is what makes that interaction fast and low power. How many accelerators do you think are on the A14/M1? We don't know -- what we do know is that there are 42 on the A12.
    42 accelerators! Did you have a clue that it was anything close to that?
    Sure, you know the big picture, things like ISP, GPU and NPU working together for computational photography, but there is so much more. And they can all interact together and efficiently via SLC.

    https://arxiv.org/pdf/1907.02064v1.pdf
    discusses all this, along with pointing out just how important it is to have fast low energy communication between the accelerators.
  • techconc - Wednesday, November 11, 2020 - link

    Why are we still arguing about the validity of Geekbench? The article even states the following from their own testing...
    "There’s been a lot of criticism about more common benchmark suites such as GeekBench, but frankly I've found these concerns or arguments to be quite unfounded."
  • BlackHat - Wednesday, November 11, 2020 - link

    Because the creator of the benchmark themselves admitted that their old version were somehow inaccurate.
  • Spunjji - Thursday, November 12, 2020 - link

    Just as well we're not really discussing those here, then 😬
  • hecksagon - Tuesday, November 10, 2020 - link

    Too bad the links are all for Geekbench. This is about as far from a real world benchmark you can get.

Log in

Don't have an account? Sign up now