Final Thoughts

ARM has certainly been busy, refreshing several key technologies for the next generation of SoCs. DynamIQ might not be as flashy as a new CPU, but as a replacement for big.LITTLE it’s every bit as important. It will be interesting to see how ARM’s partners utilize its flexibility. Will we continue to see the same 4+4 combination of big and little cores at the high end and 8 little cores in the low end to midrange? Or will we see new 7+1 or 3+1 combinations with a single A75 surrounded by A55s? Currently only the A75/A55 are compatible with DynamIQ, and the new CPUs cannot be mixed with older cores using big.LITTLE. This means we will not see the A35 used in mobile outside of MediaTek’s Helio X30.

DynamIQ is an upgrade to big.LITTLE in other ways too. Placing both the big and little cores inside the same cluster brings several benefits: making the L2 caches local to each CPU and adding an optional L3 cache improves overall memory performance, thread migration latency is reduced, and CPUs can be powered up/down more quickly, which could lead to better battery life.

The A55’s extra performance is a welcome change. This should yield tangible improvements to the user experience in mobile applications, certainly for devices that use A55 cores exclusively. Even devices with A75 cores should still see some benefit considering how threads spend most of their time running on the little cores.

ARM already pushed throughput through the A53’s 2-wide in-order core about as far as it could. Given the power and area targets for A53/A55, going wider or out of order is not possible at this stage. Instead, ARM focused on improving the memory system, reducing latency and improving utilization of the in-order core by keeping it fed with data. The increased performance comes with a small bump in power, but overall efficiency is better.

For the A75, the move to 3-wide decode, improvements throughout the cache hierarchy, and tweaks to improve its out-of-order capability should yield clear performance gains over the A73 in both integer and floating-point workloads. At the same frequency, the A72 actually performs better than A73 in some situations. I expect this will not be the case with A75.

According to ARM’s numbers, the A75’s performance gains help it maintain the same efficiency as the A73, but power consumption is higher, which concerns me a little. ARM has an implementation team optimizing its reference design, so its power numbers are effectively a target for SoC vendors. Because of pressure to reduce time to market, vendors do not always have the same amount of time to optimize their designs, resulting in higher power consumption and lower efficiency. Hopefully, vendors put in the effort to match or get close to ARM’s numbers.

ARM’s primary goal for A72 was reducing power, for A73 it was improving power efficiency, and for A75 it is improving performance. What will be the goal for the next core, which will be coming from ARM’s Austin team that produced the A72? Will it look similar to A75, or will there be a significant shift in philosophy like we saw with A72 to A73? There is communication and cross-pollination of ideas between teams, so there are sure to be some similarities, especially with the execution pipes. The biggest changes should be in the front end, and I would not be surprised to see an extra ALU pipe with the move to 7nm.

If all goes according to plan, we should see the first SoCs using DynamIQ and the A75/A55 in Q1 2018 (maybe Q4 2017) on 10nm.


104 Comments


  • Wilco1 - Tuesday, May 30, 2017 - link

    All of the Helio deca cores use Cortex-A72, just like Snapdragon 65x, and the higher clocks of Helio means it beats the 65x like you'd expect. So I'm not sure what your point is?

    Cortex-A53 in Kirin 950 at its highest frequency is ~50% more efficient than Cortex-A72, with the crossover point at around 2.1GHz. On 10nm with Cortex-A55 it may be closer to 2.5GHz.
  • serendip - Tuesday, May 30, 2017 - link

    Yeah, my mistake, I was thinking of Helios with 8x A53s only. Anyway, perf/watt matters at high clock speeds, so a 50% efficient A53 still can't do as much work as an A72 at the same clock speed. I'd rather keep the efficient cores humming along at low speed and have the big cores come online in short bursts, like for app loading or web page rendering. Note that this might not work for constant gaming though, the big cores constantly being on will overheat the phone and kill battery life.

    No easy solutions then.
  • Wardrive86 - Tuesday, May 30, 2017 - link

    2 GHz a53 has the single thread performance of 2.3 GHz Krait 400/1.85 GHz cortex a15. Octa designs often have well over double the multithread performance of say a Snapdragon 800/801. Not low end..very much midrange
  • serendip - Tuesday, May 30, 2017 - link

    But the A53 or A55 at 2+ GHz is a huge power hog that's still slower than a similarly clocked A72 or A75. The octacore branding is a gimmick when all cores are the same design. Performance doesn't scale equally with increasing frequency and power consumption - at one point, it's better to switch the task to a high performance core rather than keep increasing speed on a low performance core.

    A smart design (like the 650/652 which is a flagship killer) would use 4x or 6x A55 at low clock rates for multi threaded stuff and 2x A75 for pure single threaded performance, power consumption be damned.
  • Wardrive86 - Tuesday, May 30, 2017 - link

    Snapdragon 625 @ 2.02 GHz and 626 @ 2.3 GHz are certainly not battery hogs. They are both homogenous and ramp clock speed up on all cores very, very often. Much snappier performance than paper specs would suggest and incredible battery life
  • StrangerGuy - Tuesday, May 30, 2017 - link

    Midrange ARM big cores at actual midrange prices are already quite a rarity in the China phone market, let alone outside of it. Since they perform so close to actual flagships SoCs, most OEMs will either price the devices similarly to their actual flagships (cough Samsung A9 Pro), or not doing them altogether. If you ask me who to blame, it will be the non-Apple custom core designers sucking hard at their jobs.

    Besides an A55 SoC with presumably >1K ST GB4 scores are no slouches either, for a $120 device I'm certainly not complaining.
  • serendip - Friday, June 2, 2017 - link

    Maybe the 650/652 was a flash in the pan and the 660 could be a unicorn chip, one that's announced but never deployed. Interestingly Xiaomi moved to the 625 in the Redmi Note 4 and Mi Max 2, whereas predecessor models used the 650. Maybe OEMs really are afraid of good-enough chips in their midrange devices cannibalizing flagship sales.
  • legume - Tuesday, May 30, 2017 - link

    All of these numbers are crap if the cache configs are not stated. DynamIQ is very different and most of the SPEC gains could be from L2/LLC increases. This is all marketing FUD
  • Wilco1 - Tuesday, May 30, 2017 - link

    Forgot to read the article?

    "These numbers, as well as the others shown in the chart, comparing the A55 and A53 are at the same frequency, same L1/L2 cache sizes, same compiler, etc. and are meant to be a fair comparison. The actual gains should actually be a little higher, because partner SoCs will benefit from adding the L3 cache, which these numbers do not include."
  • legume - Wednesday, May 31, 2017 - link

    iso is not the same as knowing the values
