Yesterday AMD revealed that in 2014 it would begin production of its first ARMv8-based 64-bit Opteron CPUs. At the time we didn't know which core AMD would use, but today ARM helped fill in that blank with two new 64-bit core announcements: the ARM Cortex-A57 and Cortex-A53.

You may have heard of ARM's Cortex-A57 under the codename Atlas, while A53 was referred to internally as Apollo. The two are 64-bit successors to the Cortex A15 and A7, respectively. Similar to their 32-bit counterparts, the A57 and A53 can be used independently or in a big.LITTLE configuration. As a recap, big.LITTLE uses a combination of big (read: power hungry, high performance) and little (read: low power, lower performance) ARM cores on a single SoC. 

By ensuring that both the big and little cores support the same ISA, the OS can dynamically swap the cores in and out of the scheduling pool depending on the workload. For example, when playing a game or browsing the web on a smartphone, a pair of A57s could be active, delivering great performance at a high power penalty. On the other hand, while just navigating through your phone's UI or checking email a pair of A53s could deliver adequate performance while saving a lot of power. A hypothetical SoC with two Cortex A57s and two Cortex A53s would still only appear to the OS as a dual-core system, but it would alternate between performance levels depending on workload.
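The switching behavior described above can be sketched as a toy model. This is only an illustration of the idea, not ARM's actual switcher or any real OS governor: the class name, the load thresholds and the hysteresis rule are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical thresholds chosen for illustration; ARM did not publish any.
UP_THRESHOLD = 0.80    # move to the big (A57) cluster above this load
DOWN_THRESHOLD = 0.30  # move back to the little (A53) cluster below this load


@dataclass
class ClusterSwitcher:
    """Toy model of a 2x Cortex A57 + 2x Cortex A53 SoC."""
    active: str = "A53"     # cluster currently backing the logical cores
    logical_cores: int = 2  # what the OS sees, whichever cluster is active

    def update(self, load: float) -> str:
        """Choose the cluster for the next interval, given load in [0, 1]."""
        if self.active == "A53" and load > UP_THRESHOLD:
            self.active = "A57"
        elif self.active == "A57" and load < DOWN_THRESHOLD:
            self.active = "A53"
        return self.active


def simulate(loads):
    """Run a fresh switcher over a sequence of load samples."""
    switcher = ClusterSwitcher()
    return [switcher.update(load) for load in loads]


if __name__ == "__main__":
    # Idle, light use, gaming burst, cooling down, idle again.
    print(simulate([0.1, 0.5, 0.9, 0.6, 0.2]))
```

The gap between the two thresholds keeps the model from bouncing between clusters on every small change in load. The OS-visible core count stays at two throughout.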

ARM's Cortex A57

Architecturally, the Cortex A57 is much like a tweaked Cortex A15 with 64-bit support. The CPU is still a 3-wide/3-issue machine with a 15+ stage pipeline. ARM has increased the width of NEON execution units in the Cortex A57 (128-bits wide now?) as well as enabled support for IEEE-754 DP FP. There have been some other minor pipeline enhancements as well. The end result is a 20 - 30% increase in performance over the Cortex A15 while running 32-bit code. Running 64-bit code you'll see an additional performance advantage, as the 64-bit register file is far simpler than the 32-bit RF.

The Cortex A57 will support configurations of up to (and beyond) 16 cores for use in server environments. Based on ARM's presentation it looks like groups of four A57 cores will share a single L2 cache.

ARM's Cortex A53

Similarly, the Cortex A53 is a tweaked version of the Cortex A7 with 64-bit support. ARM didn't provide as many details here other than to confirm that we're still looking at a simple, in-order architecture with an 8 stage pipeline. The A53 can be used in server environments as well since it's ISA compatible with the A57.

ARM claims that on the same process node (32nm) the Cortex A53 is able to deliver the same performance as a Cortex A9 but at roughly 60% of the die area. The performance claims apply to both integer and floating point workloads. ARM tells me that it simply reduced a lot of the buffering and data structure size, while finding more efficient ways to improve performance. From looking at Apple's Swift it's very obvious that a lot can be done simply by improving the memory interface of ARM's Cortex A9. It's possible that ARM addressed that shortcoming while balancing out the gains by removing other performance enhancing elements of the core.
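To put the area claim in perspective, here is a quick back-of-the-envelope calculation. It only restates ARM's figures (equal performance, roughly 60% of the area). The function name is made up for the example, and the result is an implied ratio, not a number ARM quoted.

```python
def perf_per_area_gain(relative_performance: float, relative_area: float) -> float:
    """Performance per unit of die area, relative to the baseline core."""
    return relative_performance / relative_area


# Cortex A53 vs. Cortex A9 on the same 32nm node, per ARM's claim:
# same performance (1.0x) at roughly 60% of the die area (0.60x).
gain = perf_per_area_gain(1.0, 0.60)
print(f"{gain:.2f}x")  # prints 1.67x
```

In other words, if ARM's numbers hold, the A53 delivers about two thirds more performance per unit of die area than the A9.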

Both CPU cores are able to run 32-bit and 64-bit ARM code, as well as a mix of both so long as the OS is 64-bit.

Completed Cortex A57 and A53 core designs will be delivered to partners (including AMD and Samsung) by the middle of next year. Silicon based on these cores should be ready by late 2013/early 2014, with production following 6 - 12 months after that. AMD claimed it would have an ARMv8 based Opteron in production in 2014, which seems possible (although aggressive) based on what ARM told me.

ARM expects the first designs to appear at 28nm and 20nm. There's an obvious path to 14nm as well.

It's interesting to note ARM's commitment to big.LITTLE as a strategy for pushing mobile SoC performance forward. I'm curious to see how the first A15/A7 designs work out. It's also good to see ARM not letting up on pushing its architectures forward.




  • wsw1982 - Wednesday, October 31, 2012

    Yes, JavaScript is an extreme case, just like the benchmark you showed. There are lots of other comparisons; you can check the review yourself. I said the Atom outperforms ARM in a lot of benchmarks, not all. Of course it is better to use an ARM cell phone to calculate climate change; I guess it will take you only one year on an ARM cellphone, and another year on top if you use an Intel phone. But the Atom does render the internet faster than ARM (see the Xolo review). I mean, I am a simple person who mostly uses a cell phone to browse the internet and never runs MapReduce on it. In this case, the Atom runs faster than any ARM mobile and consumes less power (as far as I know), so it's a clear winner to me. Needless to say, it achieves this with a much older design and a bigger silicon node.
  • Wilco1 - Thursday, November 01, 2012

    Atom used to be faster on Javascript, but with the latest software updates that is no longer true (Galaxy Note 2 for example has Sunspider scores way faster than either the Xolo or Razr i). Note that Sunspider and other Javascript benchmarks do not indicate browsing performance.

    Also where did you get the idea that Atom based mobiles use less power? Battery life of the Xolo and Razr i is worse than other modern phones like the Note 2, iPhone 5, HTC One X. See for the battery life comparisons and the previous page for the Sunspider and other Javascript results.
  • wsw1982 - Thursday, November 01, 2012

    I think you also need to take the battery size and the other components of the phone into consideration, don't you? My Core 2 Duo laptop with 8 GB of memory and a second touch-on battery both runs faster and lasts longer than the Sandy Bridge notebook with 2 GB of memory.

    The Galaxy Note 2 has comparable battery life with a battery 1.5 times the size.

    The operating system also affects battery life a lot: my MacBook Pro can last at least 4 hours with OS X, but less than 2.5 hours with Windows 7.

    The fairest comparison is between the Razr i and Razr M; at least they are almost identical designs, aren't they? That's why I mostly compare Krait with Medfield, and the Razr i is a clear winner in battery life.

    The Note II didn't win all the web-oriented tests in your quote. I would say they are similar in web performance, but the Note II is about 1.5 times as power hungry. By the way, I didn't see any major website really quote Geekbench... I think there must be a reason for that...
  • Wilco1 - Thursday, November 01, 2012

    Of course, battery capacity, OS, screen size, resolution and all the other components can make a big difference. And that makes fair comparisons difficult indeed.

    Looking at your links, the Note 2 does win the web browsing and video playback tests by a good margin, despite having twice the resolution and a larger screen. The talk time one is odd, as this hardly uses the CPU/screen, so you'd not expect it to use twice as much power. Obviously measuring talk time fairly is not easy either - you may connect to a different cell tower, a different band or need more power due to interference or "holding it wrongly"...

    Yes, I agree Razr i and m are very similar in specs (although I think they run different Android versions). Krait is a bit disappointing in battery life but beats Medfield by a large margin on most benchmarks (except for Javascript), so you do get better performance for the extra power consumption.

    Geekbench is not that popular but you do see AnandTech, GsmArena and a few others mentioning it nowadays. It is one of the better benchmarks as it uses native code, so more accurately models CPU performance, unlike Javascript.
  • ET - Wednesday, October 31, 2012

    First I saw 3x today's high end, then learned that today's high end means A9, and A57 is expected to be 20-30% faster than A15. And it's only expected in 2014.

    Still cool, but enthusiasm is dampened.
  • A4i - Wednesday, October 31, 2012

    "Today's high end" means APQ8064, Apple A6/A6x and Exynos 5250. The APQ8064 SoC is in the LG Optimus G and Nexus 4. The Apple A6/A6x is in the iPhone 5 and iPad 4. The Exynos 5250 is in the Nexus 10 and Chromebook 3. The LG Optimus G's score in the Linpack benchmark is 608 MFLOPS, and that is still without NEON optimisation. NEON is a 128-bit wide SIMD, roughly twice the size of a single Krait CPU core.
  • Wilco1 - Wednesday, October 31, 2012

    The 3x performance gain is over current high-end mobiles (Galaxy Note 2), not tablets or laptops - I think it will take a few months before we'll see A15 based phones.

    The penultimate slide shows the A15 is going to give about a 2x gain, and the A57 gives another 50% again (this includes frequency increases). A 3x gain in less than 24 months is amazing. It means phones are approaching Sandy Bridge levels of performance!
  • A4i - Wednesday, October 31, 2012

    Yep, 20-30% faster than A15 in 32-bit code, presumably at the same frequency.
  • Charbax - Wednesday, October 31, 2012
  • blanarahul - Thursday, November 01, 2012

    I can't wait to see a server with PFLOP/s-level computing power using the ARM Cortex A57. Let's see if it can break the long-standing perf/watt record of the 16-core PowerPC in Blue Gene/Q.
