Rosetta2: x86-64 Translation Performance

The new Apple Silicon Macs being based on a new ISA means that the hardware isn’t capable of running existing x86-based software that has been developed over the past 15 years. At least, not without help.

Apple’s new Rosetta2 is a new ahead-of-time binary translation system which is able to translate old x86-64 software to AArch64, and then run that code on the new Apple Silicon CPUs.

So, what do you have to do to run Rosetta2 and x86 apps? The answer is pretty much nothing. As long as a given application has a x86-64 code-path with at most SSE4.2 instructions, Rosetta2 and the new macOS Big Sur will take care of everything in the background, without you noticing any difference to a native application beyond its performance.

Actually, Apple’s transparent handling of things are maybe a little too transparent, as currently there’s no way to even tell if an application on the App Store actually supports the new Apple Silicon or not. Hopefully this is something that we’ll see improved in future updates, serving also as an incentive for developers to port their applications to native code. Of course, it’s now possible for developers to target both x86-64 and AArch64 applications via “universal binaries”, essentially just glued together variants of the respective architecture binaries.

We didn’t have time to investigate what software runs well and what doesn’t, I’m sure other publications out there will do a much better job and variety of workloads out there, but I did want to post some more concrete numbers as to how the performance scales across different time of workloads by running SPEC both in native, and in x86-64 binary form through Rosetta2:

SPECint2006 - Rosetta2 vs Native Score %

In SPECint2006, there’s a wide range of performance scaling depending on the workloads, some doing quite well, while other not so much.

The workloads that do best with Rosetta2 primarily look to be those which have a more important memory footprint and interact more with memory, scaling perf even above 90% compared to the native AArch64 binaries.

The workloads that do the worst are execution and compute heavy workloads, with the absolute worst scaling in the L1 resident 456.hmmer test, followed by 464.h264ref.

SPECfp2006(C/C++) - Rosetta2 vs Native Score %

In the fp2006 workloads, things are doing relatively well except for 470.lbm which has a tight instruction loop.

SPECint2017(C/C++) - Rosetta2 vs Native Score %

In the int2017 tests, what stands out is the horrible performance of 502.gcc_r which only showcases 49.87% performance of the native workload – probably due to high code complexity and just overall uncommon code patterns.

SPECfp2017(C/C++) - Rosetta2 vs Native Score %

Finally, in fp2017, it looks like we’re again averaging in the 70-80% performance scale, depending on the workload’s code.

Generally, all of these results should be considered outstanding just given the feat that Apple is achieving here in terms of code translation technology. This is not a lacklustre emulator, but a full-fledged compatibility layer that when combined with the outstanding performance of the Apple M1, allows for very real and usable performance of the existing software application repertoire in Apple’s existing macOS ecosystem.

SPEC2017 - Multi-Core Performance Conclusion & First Impressions
Comments Locked

682 Comments

View All Comments

  • Ppietra - Friday, November 27, 2020 - link

    ARM was really founded as a joint venture between Apple, Acorn and another company.
    The technology behind it was based on technology from Acorn, but the company was established as a joint venture to develop the processor for Newton. That was the objective for the company creation.
  • darwinosx - Wednesday, November 18, 2020 - link

    Money talks and bullshit walks.
  • lilmoe - Tuesday, November 17, 2020 - link

    Not sure how you came to this conclusion with such a poor and unprofessional review. It seemed to me that AMD is killing it on all fronts, but the folks here made it really hard to tell with all these purposefully misleading charts.

    AMD wins. Come 5nm with Zen4 on laptops, poof goes all the drivel currently in the tech media.

    It's disappointing to see this from Andre non-the-less. Very poor quality, and very misleading benchmarks.
  • ws3 - Tuesday, November 17, 2020 - link

    Stage One: denial
  • Hifihedgehog - Tuesday, November 17, 2020 - link

    Stage One: Comparing Apples to Apples, or 5nm to 5nm. AMD 7nm Zen 2, not even their latest and greatest, is doing admirably against a 5nm product. Pit Zen 3, still held back by 7nm, against it which is a good deal faster than Zen 2 and you have a totally different outcome. Pit Zen 4 where the Zen microarchitecture is given the legs to run with 5nm and it's no contest.
  • defferoo - Tuesday, November 17, 2020 - link

    show me a 5nm Zen 4 CPU to test against then, oh, it doesn't exist. I guess we can't do that comparison yet. what matters here is availability, and Zen 4 won't come for another year, M1 is here now.

    the closest thing to apples to apples now is to use the same TDP for comparison. stack up the Ryzen 7 4800U against the M1 in a Macbook Pro (~15W). M1 is faster in both ST and MT despite the 4800U having 8 cores with SMT.

    when AMD was kicking Intel's butt on the 7nm process and Intel was on 14nm, nobody said, "but you need to compare like to like!" except for Intel fans. now it's Intel/AMD vs. Apple, and only those in denial are demanding a fair comparison on the same process node.
  • YesYesNo - Tuesday, November 17, 2020 - link

    I don't see the M1 having faster multicore than the 4800U, which benchmark am i missing?
  • Kuhar - Wednesday, November 18, 2020 - link

    You are absolutely right! All that hype around M1 was just overexaggerated.
  • defferoo - Wednesday, November 18, 2020 - link

    Spec2017 MT in this very article, Geekbench. We should not over index on one very specific benchmark (Cinebench r23) when we have more comprehensive ways to measure performance.
  • halo37253 - Tuesday, November 17, 2020 - link

    Actually the Ryzen 4800U is not only beating the M1 in MultiThread in cinebench. But doing so with similar power usage. This is Zen2. Zen 3 will easily compete with apple silicon in terms of performance/watt, and in most cases beat it. At 7nm no less. Only makes we wonder why Apple is so willing to fracture their already pretty small Mac OS fanbase.

    The 4800U in 15w mode uses around 20-25watts of power running cinebench, vs the m1 using around 22watts. Sure it doesn't have the lead in single thread performance, but pretty close when the m1 is running x86 apps. Zen3 as we know is a massive improvement in this area. And the 4800U single thread wise is largely clock limited to keep it in check power usage wise.

    I just dont see apple beating out AMD any time soon.

    The 4800u still uses GCN graphic cores, so expect a huge gain when they move up to RDNA or RDNA2 (Hopefully they jump to 2).

    Apple does have years of experience building tightly integrated SOCs, and this is where this chip shines. It clearly shows how well ARM can perform. But this is about as cutting edge as apple has been able to get their chip. AMD's focus is still mostly on the data center, so the fact their mobile devices do so well is a testament to how well suited Zen is to scale down.

    Geekbench is a joke of a benchmark and was only ever good comparing devices in the same family. There is wide score changes with the same hardware when taking into the OS its running on. Never use it to compare different CPU Archs or even two different operation systems.

Log in

Don't have an account? Sign up now