Rosetta2: x86-64 Translation Performance

Because the new Apple Silicon Macs are based on a new ISA, the hardware isn't capable of natively running the existing x86-based software that has been developed over the past 15 years. At least, not without help.

Apple’s Rosetta2 is an ahead-of-time binary translation system which translates existing x86-64 software to AArch64, and then runs that translated code on the new Apple Silicon CPUs.

So, what do you have to do to run Rosetta2 and x86 apps? The answer is pretty much nothing. As long as a given application has an x86-64 code path using at most SSE4.2 instructions, Rosetta2 and the new macOS Big Sur will take care of everything in the background, without you noticing any difference from a native application beyond its performance.
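The handling is transparent enough that even software itself has to ask the kernel to find out whether it's being translated: Apple exposes a sysctl.proc_translated flag for exactly this purpose. Below is a minimal C sketch of that check; it should return 1 when running under Rosetta2, 0 when native, and it treats a missing entry (older macOS versions or Intel-only systems) as native.

#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/sysctl.h>

/* Returns 1 if the current process is running under Rosetta2 translation,
 * 0 if it is running natively, and -1 on an unexpected error. */
static int process_is_translated(void) {
    int ret = 0;
    size_t size = sizeof(ret);
    if (sysctlbyname("sysctl.proc_translated", &ret, &size, NULL, 0) == -1) {
        /* The entry doesn't exist on older macOS versions or Intel-only systems. */
        return (errno == ENOENT) ? 0 : -1;
    }
    return ret;
}

int main(void) {
    printf("Running under Rosetta2: %d\n", process_is_translated());
    return 0;
}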

Actually, Apple’s transparent handling of things is maybe a little too transparent, as currently there’s no way to even tell whether an application on the App Store actually supports the new Apple Silicon natively or not. Hopefully this is something that we’ll see improved in future updates, serving also as an incentive for developers to port their applications to native code. Of course, it’s now possible for developers to target both x86-64 and AArch64 via “universal binaries”, which are essentially just the two architecture-specific binaries glued together into a single file.
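That "glued together" description is fairly literal: a universal binary is a "fat" Mach-O file whose header simply lists the per-architecture slices and their offsets. As a rough illustration only (a minimal sketch with error handling kept short, handling just the common 32-bit fat header), the slices can be enumerated like this:

#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>      /* ntohl: fat headers are stored big-endian on disk */
#include <mach-o/fat.h>     /* FAT_MAGIC, struct fat_header, struct fat_arch */
#include <mach/machine.h>   /* cpu_type_t, CPU_TYPE_X86_64, CPU_TYPE_ARM64 */

/* List the architecture slices contained in a universal (fat) Mach-O file. */
int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <binary>\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "rb");
    if (!f) {
        perror("fopen");
        return 1;
    }
    struct fat_header fh;
    if (fread(&fh, sizeof fh, 1, f) != 1) {
        fclose(f);
        return 1;
    }
    if (ntohl(fh.magic) != FAT_MAGIC) {
        printf("Not a universal binary; likely a single-architecture Mach-O.\n");
        fclose(f);
        return 0;
    }
    uint32_t nslices = ntohl(fh.nfat_arch);
    for (uint32_t i = 0; i < nslices; i++) {
        struct fat_arch fa;
        if (fread(&fa, sizeof fa, 1, f) != 1)
            break;
        cpu_type_t cpu = (cpu_type_t)ntohl((uint32_t)fa.cputype);
        if (cpu == CPU_TYPE_X86_64)
            printf("slice %u: x86_64\n", i);
        else if (cpu == CPU_TYPE_ARM64)
            printf("slice %u: arm64\n", i);
        else
            printf("slice %u: cputype 0x%x\n", i, (unsigned)cpu);
    }
    fclose(f);
    return 0;
}

In practice, the bundled lipo and file command-line tools report the same information, which is currently the simplest way to check whether a given application ships an arm64 slice.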

We didn’t have time to investigate which software runs well and which doesn’t (I’m sure other publications out there will do a much better job across a wider variety of workloads), but I did want to post some more concrete numbers on how performance scales across different types of workloads, by running SPEC both natively and in x86-64 binary form through Rosetta2:

SPECint2006 - Rosetta2 vs Native Score %

In SPECint2006, there’s a wide range of performance scaling depending on the workload, with some tests doing quite well, while others not so much.

The workloads that do best under Rosetta2 primarily look to be those with larger memory footprints which interact more with memory, scaling even above 90% of the performance of the native AArch64 binaries.

The workloads that do the worst are execution- and compute-heavy workloads, with the absolute worst scaling in the L1-resident 456.hmmer test, followed by 464.h264ref.

SPECfp2006(C/C++) - Rosetta2 vs Native Score %

In the fp2006 workloads, things scale relatively well, except for 470.lbm, which has a tight instruction loop.

SPECint2017(C/C++) - Rosetta2 vs Native Score %

In the int2017 tests, what stands out is the horrible performance of 502.gcc_r, which reaches only 49.87% of the native score – probably due to high code complexity and overall uncommon code patterns.

SPECfp2017(C/C++) - Rosetta2 vs Native Score %

Finally, in fp2017, it looks like we’re again averaging in the 70-80% performance range, depending on the workload’s code.

Generally, all of these results should be considered outstanding given the feat Apple is achieving here in terms of code-translation technology. This is not a lacklustre emulator, but a full-fledged compatibility layer which, combined with the excellent performance of the Apple M1, allows for very real and usable performance of the existing software repertoire in Apple’s macOS ecosystem.


682 Comments


  • tkSteveFOX - Friday, November 20, 2020 - link

    Just imagine if they up the TDP to 40W on the next 3nm process next year.
    Perhaps the CPU part won't get big gains (let's face it, the CPU is better than anything on the market up to 60W), but the GPU should double in performance. By that time 95% of apps will be native as well, so M1 performance will gain another 10-20% in all scenarios.
  • mdriftmeyer - Saturday, November 21, 2020 - link

    GPU double performance. That's delusional right there. Nothing in Apple's licensed IP will ever touch AMD GPU performance moving forward. Your claim the CPU is better than anything on the market up to 60W is also delusional.

    Enjoy 2021 when very little changes inside the M Series but AMD keeps moving forward. Being a NeXT/Apple alum, I was hoping my former colleagues had been wiser, moved to AMD three years ago, and pushed this ARM jump back two more years.

    ARM has reached its zenith in designs in the embedded space, and that is the reason all the wow factor is about camera lenses. Fab processes are reaching their zenith as well.

    M1 is 12 years of ARM development by Apple's teams after buying PA Semi in 2008. It took 12 years with unlimited budgets to produce the M1. It's far less impressive than people realize.

    Most of Apple's frameworks are already fully optimized after the past 8 years of in-house development. People keep thinking this code base is young. It's not. It's mature. We never released anything young back at NeXT or Apple Engineering. That hasn't changed since I left. It's been the mantra from day 1.

    The architecture teams at AMD have decades more experience in CPU designs and nothing released here is something they haven't already worked on in-house.

    Intel's arrogance is one of the greatest falls from the top in computing history. And it's only going to get worse for the next five years.
  • dontlistentome - Saturday, November 21, 2020 - link

    Not sure when Intel will learn - they let the Gigahertz marketeers ruin them the last time AMD had a lead, and this time it was the accountants. Wondering who will screw them up again 15 years from now?
  • corinthos - Monday, November 23, 2020 - link

    Intel made so many mistakes and brought in outsiders who just didn't have the goods to set it on a good path forward. The board members who chose these folks are partially to blame. Larrabee was just one of the earlier warning signs.
  • corinthos - Monday, November 23, 2020 - link

    So are you saying that there's more limited growth opportunity for Apple going down the ARM path than people realize, and that the prospect of AMD producing competitive/superior low-powered processors is going to be much better?

    For now, it seems that for the power consumed, the M1 products have a leg up in power-performance over Intel or AMD-based competing products. Can Apple take that and scale it upwards to be competitive or even a leader in the desktop space?

    I think about how some test results are already coming in showing a 2019 Mac Pro with a 10-core CPU, an expensive discrete AMD GPU, and loads of RAM being outshone by these M1s in some video editing workloads, and I wonder whether a powerhouse desktop is such a good investment these days. That thing came out like a year ago and cost probably around $10K.
  • Focher - Tuesday, November 24, 2020 - link

    From your post, I suspect you are not going to enjoy the next 2 years. Saying things like the code is already fully optimized is so ridiculous on its face, it’s hard to believe someone wrote it. If time led to full optimization, then what’s the magic time horizon where that happens? If you think Apple just played its full hand with the M1, you’ve never paid attention to Apple.
  • blackcrayon - Tuesday, November 24, 2020 - link

    I would think doubling GPU performance would be one of the easier updates they could make. More cores, more transistors with their existing design - the same thing they've done year after year to make "X" versions of their iPhone chips for the iPad. The M1 isn't at the point where doubling the GPU cores would make it gigantic and unsuitable for a higher end laptop or desktop. Unless you thought he meant "double Nvidia's best performance" or something which isn't going to be possible currently :)
  • zodiacfml - Friday, November 20, 2020 - link

    Coming back here just to leave a comment. Though the M1 truly leaves any previous Apple x86 product in the dust, it is far from the performance of a Ryzen 4800U, which has a TDP of 15W. The M1 is at 5nm while consuming 20-24W.
    The M1 iGPU is mighty though, and can only be equaled or beaten by a next-year/next-generation APU or Intel iGPU.
  • thunng8 - Saturday, November 21, 2020 - link

    The 4800U uses up to 50W running benchmarks and under load. You will never see a current-gen Ryzen without active cooling. In laptops running benchmarks, the fans ramp up to 6000rpm, while the M1 can run in the MacBook Air with no cooling and hardly any performance degradation. And there's also the issue of battery life, where the M1 laptop with a smaller battery can far outlast any Ryzen laptop.

    In short, you cannot compare Intel's and AMD's dubious TDP numbers with the numbers measured for the M1.
  • BushLin - Saturday, November 21, 2020 - link

    The M1 and the 4800U (in 15W mode) consume a similar 22-24W of power, and at that level the 4800U shows better Cinebench performance; no denying the single-thread advantage of the M1 though.
    If you've seen a 4800U anywhere near 50W, it won't have been in its 15W mode, and it will have been running an unrealistic test like Prime95.
    Just the facts, Jack.
