Rosetta2: x86-64 Translation Performance

Because the new Apple Silicon Macs are based on a new ISA, the hardware cannot natively run the existing x86-based software that has been developed over the past 15 years. At least, not without help.

Apple’s Rosetta2 is a new ahead-of-time binary translation system that translates existing x86-64 software to AArch64 and then runs that code on the new Apple Silicon CPUs.

So, what do you have to do to run x86 apps through Rosetta2? The answer is pretty much nothing. As long as a given application has an x86-64 code path using at most SSE4.2 instructions, Rosetta2 and the new macOS Big Sur will take care of everything in the background, and you won’t notice any difference from a native application beyond its performance.
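Because translation is this transparent, it can be useful to check whether a process is actually being translated. A minimal sketch in Python, assuming the `sysctl.proc_translated` key that macOS exposes for this purpose (1 for translated, 0 for native, an error when the key is absent). The helper names are hypothetical, and the result describes the spawned `sysctl` process, which is assumed here to run with the same architecture as the calling process:

```python
import subprocess


def interpret_proc_translated(returncode, stdout):
    """Map the result of `sysctl -n sysctl.proc_translated` to a tri-state.

    True  -> running under Rosetta2 translation
    False -> running natively
    None  -> key unavailable (not macOS, or no Rosetta2 support)
    """
    if returncode != 0:
        return None
    value = stdout.strip()
    if value == "1":
        return True
    if value == "0":
        return False
    return None


def running_under_rosetta():
    """Query the sysctl key through the command-line tool (hypothetical helper)."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # No usable sysctl binary on this system.
        return None
    return interpret_proc_translated(result.returncode, result.stdout)


if __name__ == "__main__":
    print("Translated by Rosetta2:", running_under_rosetta())
```

On systems without the key, the function returns `None` rather than guessing.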

Actually, Apple’s transparent handling of things is maybe a little too transparent, as currently there’s no way to even tell whether an application on the App Store actually supports the new Apple Silicon. Hopefully this is something that we’ll see improved in future updates, which would also serve as an incentive for developers to port their applications to native code. Of course, it’s now possible for developers to target both x86-64 and AArch64 via “universal binaries”, which are essentially just glued-together variants of the respective architecture binaries.
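To illustrate the "glued together" layout, here is a rough sketch that lists the architectures recorded in a universal binary's header. It assumes the 32-bit fat Mach-O layout (a big-endian magic of 0xCAFEBABE, an entry count, then 20-byte entries of five 32-bit fields each); the function name is hypothetical, and the 64-bit fat variant is not handled:

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # 32-bit fat header magic, stored big-endian
FAT_HEADER_SIZE = 8     # magic + number of architecture entries
FAT_ARCH_SIZE = 20      # cputype, cpusubtype, offset, size, align

CPU_NAMES = {
    0x01000007: "x86_64",
    0x0100000C: "arm64",
}


def list_fat_architectures(data):
    """Return the architecture names listed in a fat Mach-O header.

    Returns an empty list when the data does not start with FAT_MAGIC,
    which is the case for a single-architecture ("thin") binary.
    """
    if len(data) < FAT_HEADER_SIZE:
        return []
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        return []
    names = []
    for index in range(count):
        start = FAT_HEADER_SIZE + index * FAT_ARCH_SIZE
        if start + FAT_ARCH_SIZE > len(data):
            break  # truncated header; stop rather than read past the end
        cputype, _subtype, _offset, _size, _align = struct.unpack_from(
            ">5I", data, start
        )
        names.append(CPU_NAMES.get(cputype, hex(cputype)))
    return names
```

Passing the first few hundred bytes of an executable is enough, since only the header is read. Note that Java class files share the same magic number, so this sketch should only be pointed at files already known to be Mach-O binaries.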

We didn’t have time to investigate what software runs well and what doesn’t; I’m sure other publications will do a much better job of covering a wider variety of workloads. I did, however, want to post some more concrete numbers on how performance scales across different types of workloads by running SPEC both natively and in x86-64 binary form through Rosetta2:

SPECint2006 - Rosetta2 vs Native Score %

In SPECint2006, there’s a wide range of performance scaling depending on the workload, with some doing quite well and others not so much.
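For reference, the "Rosetta2 vs Native Score %" figures are simply the translated score expressed as a percentage of the native score. A small sketch of that arithmetic follows; the function name and the scores in the usage example are made up for illustration and are not measured results:

```python
def rosetta_scaling_percent(native_score, translated_score):
    """Translated score as a percentage of the native score.

    Both arguments are scores where higher is better.
    """
    if native_score <= 0:
        raise ValueError("native_score must be positive")
    return 100.0 * translated_score / native_score


if __name__ == "__main__":
    # Hypothetical (native, translated) score pairs, not measurements.
    example_scores = {
        "hypothetical_memory_bound": (20.0, 18.5),
        "hypothetical_compute_bound": (30.0, 18.0),
    }
    for name, (native, translated) in example_scores.items():
        percent = rosetta_scaling_percent(native, translated)
        print(f"{name}: {percent:.2f}% of native")
```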

The workloads that do best with Rosetta2 look to be primarily those which have a larger memory footprint and interact more with memory, scaling to above 90% of the performance of the native AArch64 binaries.

The workloads that do the worst are execution- and compute-heavy workloads, with the absolute worst scaling in the L1-resident 456.hmmer test, followed by 464.h264ref.

SPECfp2006(C/C++) - Rosetta2 vs Native Score %

In the fp2006 workloads, things are doing relatively well except for 470.lbm which has a tight instruction loop.

SPECint2017(C/C++) - Rosetta2 vs Native Score %

In the int2017 tests, what stands out is the horrible performance of 502.gcc_r, which only reaches 49.87% of the native workload's performance, probably due to high code complexity and just overall uncommon code patterns.

SPECfp2017(C/C++) - Rosetta2 vs Native Score %

Finally, in fp2017, it looks like we’re again averaging in the 70-80% range of native performance, depending on the workload’s code.

Generally, all of these results should be considered outstanding given the feat that Apple is achieving here in terms of code translation technology. This is not a lacklustre emulator, but a full-fledged compatibility layer that, when combined with the outstanding performance of the Apple M1, allows for very real and usable performance of the existing software application repertoire in Apple’s macOS ecosystem.


  • Kuhar - Wednesday, November 18, 2020

    You are wrong. This is literally Apple's ONLY chip. So I can say it is the highest end chip.
  • TEAMSWITCHER - Wednesday, November 18, 2020

    Not for long...
  • Hrunga_Zmuda - Wednesday, November 18, 2020

    It's not a chip. It's an SOC. But be that as it may, Apple is literally using multiple chips right now, and they are going to be replacing their whole line right up to the Mac Pro. People think the Mac is small potatoes, but it's the equivalent of a Fortune 500 company. It just looks small because of how massive the iOS ecosystem is. They will easily make money just fine with the whole line updated to the M system. Why? Because Apple doesn't have to sell anything to other companies, so every single thing they make doesn't have to make money by itself. So the Mac Pro's processor might not make a profit itself, but the Mac Pro will.
  • Spunjji - Thursday, November 19, 2020

    "Is" is not the same as "will be"

    Reading comprehension in the comments is not strong.
  • Spunjji - Tuesday, November 17, 2020

    Oh dear. Please don't blame the graphs - or, indeed, the author - when they show you something you didn't want to see.

    What you see here is extremely competitive performance, that AMD may well exceed when they get to 5nm - but they're not there just yet. For the end-user, what counts is what you can get.

    AMD need to get their chips into more designs and with any luck they will; Intel can't bribe away a performance advantage like Zen 3 has forever.
  • markiz - Thursday, November 19, 2020

    For the end-user is not really relevant for this particular discussion, I think?
    I think the discussion is "philosophical" in nature, as in are there intrinsic differences and advantages of one over the other?

    E.g., can AMD (or Intel, or Qualcomm) in let's say 2 years offer a SOC as efficient and as performant as Apple can?

    So as to say, is it a matter of time, is that time reasonable, or is it insurmountable?

    If I knew Qualcomm will offer a comparable Snapdragon in 2022 (and MS sorts the emulation issues), or if AMD will offer a comparable chip in 2022, I am good, and would pick from a vastly wider pool of hardware designs in the Windows ecosystem. I like convertibles.
    If on the other hand this time frame is larger, or if they will never offer either the efficiency or performance, I would switch to Apple, all be damned.
  • BushLin - Thursday, November 19, 2020

    ..."can AMD (or Intel, or Qualcomm) in let's say 2 years offer a SOC as efficient and as performant as Apple can?"

    AMD have a comparable chip available now in performance and power, been out for ages and it's in the benchmarks. If you need your system to do some actual work, the 4800U is a better chip. If your workload doesn't scale to many threads and the software is available for the new ARM platform then Apple's silicon looks pretty sweet.
  • haghands - Tuesday, November 17, 2020

    Cope
  • adt6247 - Tuesday, November 17, 2020

    The parts that beat the M1 have way more cores, a higher thermal budget, and higher clock.

    There's a lot of things to optimize for, and in its current form, Apple silicon doesn't offer solutions to all desktop workflows -- number of PCIe lanes comes to mind as a limitation.

    AMD isn't wholly beaten, but they're also not playing the same game. The best thing to come out of this would be lighting a fire under AMD's butt.

    But AMD will be chasing higher IPC and performance per watt, while Apple will be chasing higher core counts, higher thermal and power budget for desktop parts, and higher clocks. I'm hoping Intel is going to rebound with competitive parts in a couple years. Competition makes everyone better!
  • BushLin - Tuesday, November 17, 2020

    Er... Similar power drawn by old zen 2 design at 7nm which is giving better multithreaded performance.
