Arm Unveils Client CPU Performance Roadmap Through 2020 - Taking Intel Head On
by Andrei Frumusanu on August 16, 2018 9:05 AM EST
Today’s announcement is an oddball one for Arm: it is the company’s first-ever public forward-looking CPU IP roadmap, detailing performance and power projections for the next two generations through to 2020.
Back in May we extensively covered Arm’s next generation Cortex A76 CPU IP and how it’s meant to be a game-changer in terms of providing one of the biggest generational performance jumps in the company’s recent history. The narrative in particular focused on how the A76 now brought real competition and viable alternatives to the x86 market and in particular how it would be able to offer performance equivalent to Intel’s best mobile offerings, at much lower power.
Arm sees always-connected devices with 5G connectivity as a prime opportunity for a shift in the laptop market. Qualcomm’s recent Snapdragon 835 and Snapdragon 850 platforms were the first attempts to establish this new segment for Arm-based PCs.
Today’s roadmap now publicly discloses the codenames of the next two generations of CPU cores following the A76 – Deimos and Hercules. Both future cores are based on the new A76 micro-architecture and will introduce respective evolutionary refinements and incremental updates for the Austin cores.
The A76 is a 2018 product, and we should be hearing more about the first commercial 7nm devices towards the end of the year and in the coming months. Deimos is its 2019 successor, aimed at more widespread 7nm adoption. Hercules is said to be the next iteration of the microarchitecture for 2020 products and the first 5nm implementations. This is as far as Arm is willing to project into the future for today’s disclosure, as the Sophia team is working on the next big microarchitecture push, which I suspect will be the successor to Hercules in 2021.
Part of today’s announcement is Arm’s reiteration of the performance and power goals of the A76 against competing platforms from Intel. The measurement metric today was the performance of a SPECint2006 Speed run under Linux, compiled with GCC7. The power metrics represent the whole SoC “TDP”, meaning CPU, interconnect and memory controllers; this is essentially the active platform power, much in the same way we’ve been representing smartphone power in recent mobile deep-dive articles.
Here a Cortex A76 based system running at up to 3GHz is said to match the single-thread performance of an Intel Core i5-7300U running at its maximum 3.5GHz turbo operating speed, all while doing it within a TDP of less than 5W, versus “15W” for the Intel system. I’m not too happy with the power presentation done here by Arm as we kind of have an apples-and-oranges comparison; the Arm estimates here are meant to represent actual power consumption under the single-threaded SPEC workload while the Intel figures are the official TDP figures of the SKU – which obviously don’t directly apply to this scenario.
We didn’t have internal data to verify Arm’s claims as of the publishing of this article, but the 15W Intel figure is naturally on the high side, given that this is just the official TDP representing multi-threaded workloads. A very quick test of CB15 ST power as reported by MSR registers on a 7200U at 3.1GHz measured 9.3W package+DRAM power, while an 8250U at 3.35GHz came in at 11W. I haven’t correlated SPEC power on x86 to date, but I’m expecting it on average to be less than CB15. Even if the 15W figure for the 7300U is correct, and I’m expecting something more in the range of 9-11W, Arm might be using one of Intel’s notably less efficient performance points when doing the comparison for these SKUs. Of course this doesn’t invalidate the data, as efficiency for the A76 at those frequencies would also not be optimal; it’s just something to keep in mind.
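As a rough illustration of how a package power figure like the ones above can be derived from energy counters, here is a minimal Python sketch. It is not the tooling used for the measurements in this article. It assumes a Linux system that exposes Intel's RAPL counters through the powercap sysfs interface; the `intel-rapl:0` path, the function names and the one-second sampling window are all illustrative assumptions, and reading the counter may require elevated privileges.

```python
from pathlib import Path
import time

# Assumption: Linux powercap interface, package domain 0. The path may be
# different, absent, or unreadable without root on a given machine.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def average_power_watts(start_uj, end_uj, seconds, max_range_uj):
    """Average power from two cumulative energy readings in microjoules."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta_uj = end_uj - start_uj
    if delta_uj < 0:
        # Assumes the counter wrapped around at most once between samples.
        delta_uj += max_range_uj
    return delta_uj / 1e6 / seconds


def sample_package_power(seconds=1.0, rapl_dir=RAPL_DIR):
    """Read the energy counter twice and return the average watts between."""
    max_range = int((rapl_dir / "max_energy_range_uj").read_text())
    start = int((rapl_dir / "energy_uj").read_text())
    t0 = time.monotonic()
    time.sleep(seconds)
    end = int((rapl_dir / "energy_uj").read_text())
    elapsed = time.monotonic() - t0
    return average_power_watts(start, end, elapsed, max_range)
```

Calling `sample_package_power()` while a single-threaded benchmark is running gives an average over the sampling window. A DRAM domain, where the platform exposes one, would have to be sampled and added separately to get a package+DRAM figure.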
It’s also interesting to see Arm scale back on the performance comparison as they’re using a 3GHz A76 as the comparison data-point, in contrast to the 3.3GHz maximum 5W performance point presented during TechDay. I had tried to estimate the A76’s power in mobile form-factors based on the different metrics Arm disclosed and arrived at an estimated 2.3W at 3GHz. Naturally Arm says “less than 5W” and they could be erring on the safe side of not over-promising, but if it had been *that* much lower, as in my estimate, we would have maybe seen even more aggressive marketing figures. In the end, until we get the first A76 devices in our hands, we won’t know for sure what the exact figures will be and at which point on the efficiency curve Arm’s projected 3GHz performance figures will end up.
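To make the apples-and-oranges concern concrete, the following Python sketch works through the arithmetic. It takes Arm's claim of equal single-thread performance at face value, so the performance-per-watt advantage reduces to a simple power ratio. The power inputs are the figures quoted in this article (Arm's "less than 5W" treated as an upper bound, the 2.3W estimate, Intel's 15W TDP, and the expected 9-11W range); the function and variable names are illustrative.

```python
def efficiency_ratio(arm_watts, intel_watts):
    """Perf/W advantage at equal performance: just the ratio of power."""
    if arm_watts <= 0 or intel_watts <= 0:
        raise ValueError("power figures must be positive")
    return intel_watts / arm_watts


# Power assumptions taken from the article text, in watts.
ARM_POWER = {"Arm upper bound": 5.0, "author estimate": 2.3}
INTEL_POWER = {"official TDP": 15.0, "expected high": 11.0, "expected low": 9.0}


def build_table():
    """Return (arm_label, intel_label, ratio) for every pairing."""
    return [
        (a_name, i_name, efficiency_ratio(a_w, i_w))
        for a_name, a_w in ARM_POWER.items()
        for i_name, i_w in INTEL_POWER.items()
    ]


if __name__ == "__main__":
    for a_name, i_name, ratio in build_table():
        print(f"{a_name:>16} vs {i_name:<14} {ratio:.1f}x")
```

With Arm's own 5W ceiling, the claimed advantage is 3x against the 15W TDP but falls to roughly 2x against the 9-11W range, which is why the choice of Intel power figure matters so much to the headline.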
The last notable slide covers the performance projections for Deimos and Hercules. Here Arm’s taking a direct stab at Intel’s lack of significant progress over the last few years and reiterating its confidence in the company’s ability to sustain high CAGR (compound annual growth rate) performance figures for the next generations.
Again, at TechDay we quoted figures of 20-25%, while today’s announcement contained a more conservative figure of “>=15%”, likely better representing a seemingly larger 20% projected boost for Deimos as well as what seems to be a 10% gain for the 5nm follow-up Hercules. Taking into account the relative positioning of the data-points in this chart, I did some quick correlation and it matches my initial estimated performance figures for a 3GHz A76 at around ~26 SPECint2006. Deimos and Hercules would come in at figures of ~31 and ~34 points.
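The projected scores follow from compounding the per-generation gains on the A76 baseline. A small Python sketch of that arithmetic is given here; the 20% and 10% gains and the ~26 point baseline are the estimates read off the chart as described above, not figures published by Arm, and the function names are illustrative.

```python
def project(base_score, gains):
    """Compound a list of fractional per-generation gains on a base score."""
    scores = [base_score]
    for gain in gains:
        scores.append(scores[-1] * (1 + gain))
    return scores


def cagr(first, last, years):
    """Compound annual growth rate between two scores `years` apart."""
    if years <= 0 or first <= 0:
        raise ValueError("years and first must be positive")
    return (last / first) ** (1 / years) - 1


if __name__ == "__main__":
    # Estimated inputs: A76 at ~26, then +20% (Deimos) and +10% (Hercules).
    a76, deimos, hercules = project(26.0, [0.20, 0.10])
    print(f"Deimos ~{deimos:.1f}, Hercules ~{hercules:.1f}")
    print(f"Two-year CAGR: {cagr(a76, hercules, 2):.1%}")
```

Under these inputs, a 20% step followed by a 10% step works out to roughly 15% per year compounded over the two generations.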
Finally, today’s announcement is a marketing exercise attempting to emphasise Arm’s performance and power commitments over the next few generations, trying to showcase that it has the strategy and technology in place to make the Arm laptop market a real growth opportunity. If and how this pans out is something that we won’t find out until at least later on in the year, with the first actual A76 based large form-factor designs not being a thing until at least sometime in 2019. We’re eagerly awaiting the first A76 based mobile designs in the months to come and a first hands-on evaluation of the new microarchitecture family.
Wilco1 - Thursday, August 16, 2018
What future Intel chips will show a large gain? It would seem the 10nm chips like i3-8121U are taking a big step backwards.
HStewart - Thursday, August 16, 2018
I am not talking about i3-8112U but CPU like i5-8350 since list only compare i5's and not i3;s
Wilco1 - Thursday, August 16, 2018
The i5-8350 is pretty much the same as i5-7300U. The extra cores don't matter since we're talking single threaded performance. So what wonders are you expecting from it?
Fritzkier - Friday, August 17, 2018
Because Intel is more famous and a dominant leader in x86 market share.
kgardas - Thursday, August 16, 2018
"Here a Cortex A76 based system running at up to 3GHz is said to match the single-thread performance of an Intel Core i5-7300U running at its maximum 3.5GHz turbo operating speed, all while doing it within a TDP of less than 5W, versus “15W” for the Intel system." -- Andrei, I've not seen any proof that ARM is going after Intel's single-threaded performance. Slide(s) are done in a clever way that you may consider that single-threaded, but it may also mean multi-threaded. I mean ARM is probably going to be better on parallel run of SPEC2006 while it assumes SoC with 8+ cores comparing that to 2+ cores Intels. That's IMHO very realistic scenario. On the other hand, catching Intel in single-threaded performance is IMHO very unrealistic. Certainly at least for next 2-3 years.
Andrei Frumusanu - Thursday, August 16, 2018
All the performance metrics Arm talked about A76 have always been single-threaded, this article included. It's also noted in the footnotes of the projection slide. They are very openly going after Intel ST performance and talked extensively about it during the A76 launch.
Arm specifically omitted MT projections because how many cores in a future SoC goes is up to the SoC designer, so of course this will vary from 2 to 4 or any other configuration.
kgardas - Thursday, August 16, 2018
Hmm, well, if you are that sure, then OK! Looks like interesting fight in front of us. Thanks for clarification since I've not seen any footnote which would specifically mention single-threaded benchmarking.
HStewart - Thursday, August 16, 2018
Realistically - your previous article on A76 has projections under iPhone X - so did ARM change their projects so that A76 is great that Intel Xeon or AMD Epic CPU.
ZolaIII - Friday, August 17, 2018
It's greater than Falcor (QC) which is also four instructions per clock but without branch hybrid indirect predictor while probably not being larger. Falcor proven hit 76~80% of the X86 contrapart performance while staying 2x+ more power efficient. So it's expected how A76 will be 10~15% faster while being at least equally power efficient. This puts it at least ahead of current Epyc regarding performance while being there + times more power efficient. This lv of power efficiency is a key for pretty much everything; mobile SoC, thin and light to servers along with price of course.
drothgery - Thursday, August 16, 2018
It's great that ARM's interested in competing in the notebook (or similar power envelope) CPU space, but it seems like they're being deliberately misleading by stopping their U-series i5 comparison data points with a 7xxx dual core rather than a 8xxx quad core. Which for the highly multi-threaded workloads they're arguing performance is close to Intel on, kind of matters a bit, I'd think ...
Though I suppose that given pricing, ARM is probably really targeting more i3s and Pentiums and Celerons, and Intel is sticking with dual cores in the 15W and lower space for them at least for now (or sometimes quad-Atom family cores).