Final Thoughts

ARM has certainly been busy, refreshing several key technologies for the next generation of SoCs. DynamIQ might not be as flashy as a new CPU, but as a replacement for big.LITTLE it’s every bit as important. It will be interesting to see how ARM’s partners utilize its flexibility. Will we continue to see the same 4+4 combination of big and little cores at the high end and 8 little cores in the low end to midrange? Or will we see new 7+1 or 3+1 combinations with a single A75 surrounded by A55s? Currently only the A75/A55 are compatible with DynamIQ, and the new CPUs cannot be mixed with older cores using big.LITTLE. This means we will not see the A35 used in mobile outside of MediaTek’s Helio X30.

DynamIQ improves on big.LITTLE in other ways too. Placing both the big and little cores inside the same cluster brings several benefits: local L2 caches for each CPU and an optional L3 cache improve overall memory performance, thread migration latency is reduced, and CPUs can be powered up or down more quickly, which could lead to better battery life.

The A55’s extra performance is a welcome change. This should yield tangible improvements to the user experience in mobile applications, certainly for devices that use A55 cores exclusively. Even devices with A75 cores should still see some benefit, considering that threads spend most of their time running on the little cores.

ARM already pushed throughput through the A53’s 2-wide in-order core about as far as it could. Given the power and area targets for the A53/A55, going wider or out of order is not possible at this stage. Instead, ARM focused on improving the memory system, reducing latency and improving utilization of the in-order core by keeping it fed with data. The increased performance comes with a small bump in power, but overall efficiency is better.

For the A75, the move to 3-wide decode, improvements throughout the cache hierarchy, and tweaks to improve its out-of-order capability should yield clear performance gains over the A73 in both integer and floating-point workloads. At the same frequency, the A72 actually performs better than the A73 in some situations. I expect this will not be the case with the A75.

According to ARM’s numbers, the A75’s performance gains help it maintain the same efficiency as the A73, but power consumption is higher, which concerns me a little. ARM has an implementation team optimizing its reference design, so its power numbers are effectively a target for SoC vendors. Because of pressure to reduce time to market, vendors do not always have the same amount of time to optimize their designs, resulting in higher power consumption and lower efficiency. Hopefully, vendors will put in the effort to match or get close to ARM’s numbers.
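
To make the efficiency point concrete, here is a minimal sketch, assuming "efficiency" means performance per watt. Under that assumption, equal efficiency implies that power rises in step with performance. The function name and the 20% figure are hypothetical placeholders for illustration, not ARM's numbers.

```python
def power_ratio(perf_ratio: float, efficiency_ratio: float = 1.0) -> float:
    """Relative power needed for a given relative performance.

    Assumes efficiency is defined as performance per watt, so
    power = performance / efficiency. Both arguments are ratios
    of the new core to the old core.
    """
    if perf_ratio <= 0 or efficiency_ratio <= 0:
        raise ValueError("ratios must be positive")
    return perf_ratio / efficiency_ratio


# Hypothetical example: a core that is 20% faster at equal efficiency.
# The 1.20 figure is illustrative only, not a measured value.
example = power_ratio(1.20)
print(f"relative power: {example:.2f}x")
```

In other words, holding efficiency flat while raising performance means the extra performance is paid for entirely in power, which is why any shortfall in a vendor's implementation shows up directly in the power budget.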

ARM’s primary goal for the A72 was reducing power, for the A73 it was improving power efficiency, and for the A75 it’s improving performance. What will be the goal for the next core, which will be coming from ARM’s Austin team that produced the A72? Will it look similar to the A75, or will there be a significant shift in philosophy like we saw from the A72 to the A73? There is communication and cross-pollination of ideas between teams, so there are sure to be some similarities, especially with the execution pipes. The biggest changes should be in the front end, and I would not be surprised to see an extra ALU pipe with the move to 7nm.

If all goes according to plan, we should see the first SoCs using DynamIQ and the A75/A55 in Q1 2018 (maybe Q4 2017) on 10nm.

  • Matt Humrick - Wednesday, May 31, 2017 - link

    The L1/L2 cache sizes for A53/A55 are stated in the article.
  • Great_Scott - Tuesday, May 30, 2017 - link

    Fantastic article, Matt. Best CPU tech article I've read in years, and I read most of them.
  • Alexvrb - Tuesday, May 30, 2017 - link

    "ARM wants to push the A75 into larger form-factor devices with power budgets beyond mobile’s 750mW/core too by pushing frequency higher. Something like a Chromebook or a 2-in-1 ultraportable come to mind. At 1W/core the A75 delivers 25% higher performance than the A73 and at 2W/core the A75’s advantage bumps up to 30% when running SPECint 2006. If anything, these numbers highlight why it’s not a good idea to push performance with frequency alone, as dynamic power scales exponentially."

    Perhaps, but it gives it a lot more headroom for use in things like tablets... and laptops. I'm thinking Windows on ARM could use an even faster SoC than the SD 835, and 2W is perfect. Right in Atom ULP territory, and there's no modern Atoms left to compete in the lower-price territory. Perhaps Intel will be forced to release cheaper gimped Core-based "Atoms" in the future? Or Celerons/Pentiums. ;)
  • LiverpoolFC5903 - Wednesday, May 31, 2017 - link

    Meh.

    Incremental update with no radical changes. Would LOVE to see a huge fat ARM core with a 5+ wide front end for premium devices, with single threaded throughput approaching that of the Core M series. Now that would be progress.

    No reason why a dual core with two fat cores cannot work great on android, especially given the idea of race to sleep. Off load background tasks to DSPs, Microcontrollers etc or even use a third big core clocked at about half the frequency of the main two cores.

    Sure, will be expensive and big, but you can be sure there will be customers for it, especially in the 700 USD plus market segment. As of now, manufacturers barely have any choice apart from qualcomm chipsets.
  • lizanosi - Wednesday, May 31, 2017 - link

    I ask you, why have Samsung and Apple continued to have great success deviating from ARM's reference designs, while Qualcomm has been married to them and paying the performance price (specifically looking at you, 808)
  • melgross - Wednesday, May 31, 2017 - link

    For the most part, Samsung's designs were straight from ARM. They didn't have an architectural license. It's only very recently that they've gotten one.

    But Snapdragon has been Qualcomm's own designs, because they do have an architectural license, as does Apple. But, like the rest of the industry, they were discombobulated when Apple came out with the 64 bit A7.

    They've never gotten totally back into the race. Their first one was an ARM design, and it had heat problems. The second was their design, but performance was fairly poor. The 835 is not much better than the preceding model. Samsung has fared no better. The problem they all have is that Apple is two years ahead there, and likely took their time with the A7, because there was no competition. These guys are rushing to catch up, and they are likely restrained by the expectation by Android buyers that more cores are better, rather than having better cores.
  • StrangerGuy - Friday, June 2, 2017 - link

    Now that Apple's GPU is a mostly fully custom part, expect the A11 to start another A7-esque domination over Android SoCs on graphics. I also expect an Apple custom LTE baseband to debut this year too, since Apple is definitely too paranoid to depend solely on Qualcomm and Intel's baseband proved to be donkey balls.

    Besides, iPhones probably outsell everyone else's flagships combined yearly in a single launch quarter. The economies of scale for an Android flagship SoC make far less sense.
  • Suraj tiwari - Thursday, June 1, 2017 - link

    DynamIQ is a welcome move; it should be adopted by SoC manufacturers immediately. No other CPU manufacturer (Intel, AMD) has a technology like this!
  • Anato - Saturday, June 3, 2017 - link

    I would prefer 2+2 over 8 A55 cores any day and pay for it, but marketing disagrees :-(
  • slee915 - Wednesday, June 28, 2017 - link

    This article shows A73 has a 3-stage AGU LD/ST memory pipeline but last year's A73 article http://www.anandtech.com/show/10347/arm-cortex-a73... shows it has a 4-stage AGU LD/ST. So which one is correct ?
