More Detail on ARM11 vs. Cortex A8

We’ve gone through the basic architectural details of the ARM11 and Cortex A8 cores, and across the board the A8 is far ahead. It gets even better for the new design once we drill a little deeper.

The L1 cache in the A8 gets a significant improvement. The ARM11 core had a 2 cycle L1 cache, while the A8 has a single cycle L1. In-order cores depend heavily on fast memory access, so an even faster L1 will have a dramatic impact on performance.
ARM11 actually supported an L2 cache, but it was rarely used; the Cortex A8 is designed with a tightly coupled L2 cache that varies in size. Vendors can choose from cache sizes as small as 128KB all the way up to 1MB, with a minimum access latency of 8 cycles. The L2 access time is programmable, with slower access more desirable to save power.
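To put rough numbers on why the faster L1 and the added L2 matter, here is a minimal sketch of the textbook average memory access time formula. The L1 and L2 latencies come from the figures above; the miss rates and the DRAM latency are made-up values I picked purely for illustration, not measurements of any real SoC.

```python
def average_access_cycles(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_cycles + miss_rate * miss_penalty_cycles


# Latencies quoted in the article.
ARM11_L1_HIT = 2   # cycles
A8_L1_HIT = 1      # cycles
A8_L2_HIT = 8      # cycles (minimum configurable latency)

# Hypothetical values, chosen only to make the arithmetic concrete.
L1_MISS_RATE = 0.05
L2_MISS_RATE = 0.20
DRAM_CYCLES = 60

# ARM11 phone with no L2: every L1 miss goes straight to DRAM.
arm11_cycles = average_access_cycles(ARM11_L1_HIT, L1_MISS_RATE, DRAM_CYCLES)

# Cortex A8: an L1 miss goes to the L2, which itself occasionally misses.
a8_l1_miss_penalty = average_access_cycles(A8_L2_HIT, L2_MISS_RATE, DRAM_CYCLES)
a8_cycles = average_access_cycles(A8_L1_HIT, L1_MISS_RATE, a8_l1_miss_penalty)

print(f"ARM11 (no L2): {arm11_cycles:.1f} cycles per access")
print(f"Cortex A8:     {a8_cycles:.1f} cycles per access")
```

With these assumed miss rates the ARM11 averages about 5 cycles per access and the A8 about 2. Change the assumptions and the gap moves, but the shape of the result stays the same: a faster L1 helps every access, and the L2 softens the cost of every L1 miss.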

The caches also include way prediction to minimize the number of cache ways active during a cache access. This sort of cache-level power management was also used by Intel back on the first Pentium M processors and is still used today in modern x86 processors.

The ARM11 core supported a 64-bit bus that connected it to the rest of the SoC; Cortex A8 allows for either a 64-bit or 128-bit bus. It’s unclear what vendors like Samsung and T.I. have implemented on their A8 based SoCs.


The S is for speed. Powered by the ARM Cortex A8.

With a deeper pipeline, the Cortex A8 also has a much more sophisticated branch prediction unit. While the ARM11 core had an 88% accurate branch predictor, the Cortex A8 can correctly predict branches over 95% of the time. If you care about stats, the A8 has a 512 entry branch target buffer and a 4K entry global history buffer. The accuracy of the branch predictor in the Cortex A8 is actually as high as what AMD claimed with its first Athlon processor, and this is an in-order core in a smartphone. With a 13-stage pipeline, however, a very accurate predictor was necessary.
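A quick back-of-the-envelope sketch shows why the deeper pipeline needs the better predictor. The accuracy figures and the 13-stage depth are from the text above, and the ARM11's 8-stage pipeline is its published depth. The 20% branch frequency is a hypothetical value, and treating the flush penalty as equal to the pipeline depth is a simplification.

```python
def mispredict_cycles_per_1000_instr(accuracy, branch_fraction, flush_penalty):
    """Cycles lost to branch mispredictions per 1000 instructions.

    Simplified model: every mispredicted branch costs a full pipeline flush.
    """
    branches = 1000 * branch_fraction
    mispredicted = branches * (1.0 - accuracy)
    return mispredicted * flush_penalty


BRANCH_FRACTION = 0.20  # hypothetical: one instruction in five is a branch

# ARM11: 8-stage pipeline, 88% accurate predictor.
arm11_loss = mispredict_cycles_per_1000_instr(0.88, BRANCH_FRACTION, 8)

# Cortex A8: 13-stage pipeline, 95% accurate predictor.
a8_loss = mispredict_cycles_per_1000_instr(0.95, BRANCH_FRACTION, 13)

# Thought experiment: the A8's deeper pipeline with the ARM11's accuracy.
a8_old_predictor_loss = mispredict_cycles_per_1000_instr(0.88, BRANCH_FRACTION, 13)

print(f"ARM11:                    {arm11_loss:.0f} cycles lost")
print(f"Cortex A8:                {a8_loss:.0f} cycles lost")
print(f"A8 with an 88% predictor: {a8_old_predictor_loss:.0f} cycles lost")
```

Under these assumptions, the deeper pipeline with the old predictor would lose far more cycles than the ARM11 did, while the 95% predictor brings the A8 back below the ARM11's losses despite the longer flush.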

While ARM11 supported some rudimentary SIMDfp instructions, Cortex A8 adds a full SIMDfp instruction set with NEON. ARM expects a greater than 2x improvement on media processing applications thanks to the A8’s NEON instructions - of course you’ll need to compile directly for NEON in order to see those gains. If you’re looking for a modern day relation, NEON is like the A8’s SSE whereas ARM11 basically had a sophisticated MMX equivalent. Both are very important.

The Cortex A8 is a more power hungry core than the ARM11, but the design also has much more extensive clock gating (turning off the clock to idle parts of the chip) than the ARM11. Since the A8 is newer it’s also going to be manufactured on a smaller manufacturing process. The bulk of ARM11 based SoCs used 90nm transistors, while A8 based SoCs are shipping at 65nm. ARM11 has started to transition down to 65nm, while A8 will move down to 45nm.

At the same clock speed and with the same L2 cache sizes, ARM shows the Cortex A8 as being able to execute 40% more instructions per second than the ARM11. That’s a generational performance improvement, something that can’t be delivered by clock speed alone, but the comparison is conservative. Cortex A8 designs won’t ship at the same clock speed and cache configurations as ARM11 chips; as far as I can tell, none of the major ARM11 based smartphones even had an L2 cache, while Cortex A8 designs are expected to have one.

Furthermore, the ARM11 based smartphones sat much lower on the frequency curve than the early A8 platforms. While a 40% improvement in instruction throughput is reasonable at the same specs, I would expect far larger real world performance improvements from a Cortex A8 based SoC compared to an ARM11 SoC.
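As a rough illustration of how the per-clock gain and the clock speed gain compound, here is a small sketch. The 40% figure is ARM's number from above; the 412MHz and 600MHz clocks are the commonly reported speeds for the iPhone 3G and 3GS and are used here only as example inputs. The model ignores the L2 cache and memory subsystem entirely, so treat the result as an estimate under those assumptions, not a benchmark.

```python
def estimated_speedup(per_clock_gain, old_mhz, new_mhz):
    """Combine a per-clock throughput gain with a clock speed increase.

    Assumes performance scales linearly with frequency, which is optimistic
    for memory-bound code.
    """
    return (1.0 + per_clock_gain) * (new_mhz / old_mhz)


PER_CLOCK_GAIN = 0.40  # ARM's claim at equal clocks and cache sizes
ARM11_MHZ = 412        # assumed: commonly reported iPhone 3G clock
A8_MHZ = 600           # assumed: commonly reported iPhone 3GS clock

speedup = estimated_speedup(PER_CLOCK_GAIN, ARM11_MHZ, A8_MHZ)
print(f"Estimated speedup: {speedup:.2f}x")
```

With those inputs the estimate lands at roughly 2x, before counting any benefit from the L2 cache that the ARM11 phones lacked.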

Overall the Cortex A8 is much more like a modern day microprocessor. It’s still an in-order core, but it adds superscalar execution, a deeper pipeline, larger caches and a broader instruction set among other things. For any current high end smartphone there doesn’t seem to be a reason to choose the ARM11 over it. Companies that insist on using ARM11 based designs even in 2009 are either not agile enough to implement a better chip quickly or have no concern for performance and are more focused on cost savings. Neither option is a particularly good one, and it is telling that the two manufacturers who seem to understand how to properly design a smartphone, Apple and Palm, have both opted to go with a Cortex A8 before most of the more established players.

A Call to Action

This leads me to a further point: we need more transparency in specs from smartphone manufacturers. The mobile phone market is all too shielded from the performance metrics and accountability that we’ve had in the PC space. When Intel was shipping Pentium 4s that performed slower than the Pentium IIIs they were replacing, we called them out on it. To this day, Apple refuses to talk about the processor in the iPhone 3GS. We get to hear all about what’s in the Nehalem Mac Pro, but the hardware behind the 3GS is off limits - despite the fact that it’s very good. This policy of not delivering specifics and a general unwillingness to talk about specs is absurd at best. It doesn’t take much more than a teardown and some homebrew code to figure out what CPU at what frequency is in any modern day smartphone; manufacturers should show pride in their hardware, or refrain from putting something inside a phone that they can’t be proud of.

What we need are cache sizes, clock speeds, full architecture disclosures. They don’t have to be on the phone’s marketing materials, but make them accessible and at least some of the focus. These SoCs are so incredibly cool, they pack more power than the desktops of 10 years ago into a single chip smaller than my thumbnail - boast about them! Palm had a tremendous leg up on the competition with its OMAP 3430 processor, yet there was hardly any attention paid to it by Palm. I get that the vast majority of consumers don’t care, but those who do would benefit tremendously from access to this information. It’s something to get excited about.

And if the manufacturers won’t devote time and energy to this stuff, then I will.


59 Comments


  • lightzout - Saturday, July 11, 2009 - link

    My wife actually offered to give me her 3g if she got the 3gs but I didnt think it was worth it. She asked me this morning how it was better and I didnt know (didnt admit it of course)

    Now I want her 3G "free" and she really does need the 3gs since she is always multitasking/social/mail..me, including aim.

    I thought the 3gs would have some radical new gps stuff but the compass is not impressive. Nothing to get me geeked on to the tune of $200. For my purposes having the older iphone would make travel and remodeling job estimating easier over my tattered razr.

    My media mogul mamacita however needs that sleek new 3gs like yesterday as every gripe she has about the 3g phone seems to have been addressed somehow.

    Great write-up!

    Only regret is when I saw the new screen and sleek size of the 3gs at the apple store a couple days ago it does screem "arent I beautiful?" but that is what apple does so well right?
  • MrBowmore - Saturday, July 11, 2009 - link

    Give the magic, or hero another chance!
    Your numbers for those phones are whacked, its faster than the 3G at alot of things. Try to kill all the backgroundapps. (yes, it multitasks)
  • RadnorHarkonnen - Friday, July 10, 2009 - link

    Very good analysis.

    I was just surprised ARM CPUs still made on 90nm and 65nm. With the performance and power saving 55nm and 45 nm processes i would imagine they would jump the bandwagon fast.
  • nubie - Thursday, July 09, 2009 - link

    Some people can't drop $600 in a lump or $2600 over 3 years on something as stupid as a cellphone. No matter what it can do.

    Besides the fact that Apple is killing all support for proper hardware acceleration and access to OpenGL 2.0, whatever.

    Can we get more Android and G1 coverage? Please?
  • psonice - Friday, July 10, 2009 - link

    Like the guy above said, you buy a phone, you either pay a lot upfront, or you get it with a contract. Either way you'll still need to pay a ton of money each month to for your voice and data. You could get a cheap phone that only makes calls and costs almost nothing, but that's not the same is it?

    And what's this about apple not supporting hardware acceleration / opengl es 2.0??? Almost everything in the gui is hardware accelerated. And there's very good opengl es 1.1/2.0 support in the sdk, hence the ton of hardware accelerated games. There may not be much supporting es2.0 yet, but that's because the first 2.0 capable device has only just been released.
  • DLeRium - Friday, July 10, 2009 - link

    You know what? The cost is:

    $199 up front
    $70 / month * 24 months
    = $1680 + $199

    But let's face it, most of you already have cell phones. A quick look at a WinMo phone like the HTC Touch Pro is $70 / month too at minimum ($39.99 voice + $30 data). Same with a Blackberry.

    SO WHY THE HELL ARE YOU COMPLAINING?

    So if $1880 is too much for you, don't get a cell phone period.

    Stop complaining. The iPhone is actually pretty damn cheap. You're locked in a contract, but even if you had another phone WHY WOULD YOU GO DATALESS?
  • araczynski - Thursday, July 09, 2009 - link

    i'll care about the iphone/ipod when they start sporting VGA screens. if my digital camera can have a 3" 640x480 display, so should these overpriced toys.
  • psonice - Friday, July 10, 2009 - link

    Higher res screens look pretty, but 640x480 needs 2x more power to fill than 480x320. The screen is more than acceptable already, so I'd take faster running apps/games and longer battery life over more pixels any day.
  • Kougar - Thursday, July 09, 2009 - link

    Thanks for the informative crash course in CPU instructions, that filled in some gaps I didn't understand. It's nice to now understand how some aspects of the design fit into or affect the rest of the design.

    Unfortunately, you've only drummed up the excitement factor for Intel's Sandy Bridge... from some general info that's been around and based on what you've given it sound like the potential is very much there for some very significant performance jumps. So much for Gulftown's allure!
  • christinme7890 - Thursday, July 09, 2009 - link

    I love the attention to detail when describing the CPUs and the graphics processor and stuff. Very cool. I hate that other people are dissing the iphone hardware. If you don't like Macs rules get a pre. Plain and simple. I for one support these people that want to sell their apps for a good price and are trying to make it big in the dev world. Kudos and I will buy your apps.

    I will be honest, I am sick of the multitasking argument. You do hit on a point that needs to be addressed imho by Apple and that is that there is no good app for chatting. I really think that Apple needs to include their own IM App that stays on in the background (if you want it to) and collects all your SMS, MMS, IM, facebook, Twitter, etc messages. This would be great. While it would be great I recognize that this would totally sap the power on the iphone. If you had all this info push to your phone, the servers would be constantly sending you messages every second. As for multitasking, I don't really care to have it. There are areas where I wish I had it but it is not necessary. Not to mention that the palm pre has a horrible battery life...plain horrible. I hear people talk like they need 3 backup batteries just to get through the day.

    I have noticed myself that the compass is a little sketchy. There was a time on 07/04 that a friend and I were lost in the city walking around and we used my maps app to find where we are and I tried to get the compass to work to make reading the map easy and it wouldn't work. The map wouldn't rotate and it was frustrating. Oh well.

    Your review of the camera was spot on. It will never replace my uber camera but when I am out and about doing whatever it does great for quick and easy pics. And the movie functions are awesome as well. Now if only you could cut out middle pieces of a movie. Hopefully soon.

    I love the speed of the 3gs. I notice, not tested but notice, a large speed increase and I absolutely love it.

    The one major place the 3GS has over the pre is the App store. No company has been able to implement an app store like Apple. I get all my multimedia from one source (itunes) which is great....Movies, podcasts, video, audio, apps, etc...all in one place is the best thing that apple has done in forever. I will not argue prices or app submission ethics because I truly believe that apple keeps the People as their top priority.

    Great article.
