The Right SoC at the Right Time: Apple's A5

Here's how I know Apple is masterful at marketing. After first showing off the new iPad, Apple had tons of press convinced that the company was no longer competing based on specs but rather only interested in delivering an experience. In reality Apple is competing with hardware even more than before; it's just trying to give the public the impression that it's not. After all, Apple doesn't make the vast majority of the technology inside the iPad, but it does control the experience. A competitor may be able to ship a dual-core Cortex A9 but it can't ship the iOS experience. Is it really a surprise that Apple would downplay what it doesn't have exclusive rights to and instead try to get everyone to focus on what it does? Make no mistake, Apple is very much playing the specs game - in fact it's playing the game harder than anyone else in the industry today.

At the heart of the iPad 2 is a brand new SoC: the Apple A5. Built on what I assume is Samsung's 45nm process, the A5 is a much more powerful SoC than its predecessor, the A4.

Architecture Comparison
|  | ARM11 | ARM Cortex A8 | ARM Cortex A9 | Qualcomm Scorpion |
|---|---|---|---|---|
| Issue Width | single-issue | dual-issue | dual-issue | dual-issue |
| Pipeline Depth | 8 stages | 13 stages | 9 stages | 13 stages |
| Out of Order Execution | N | N | Y | Partial |
| FPU | Optional VFPv2 (not-pipelined) | VFPv3 (not-pipelined) | Optional VFPv3-D16 (pipelined) | VFPv3 (pipelined) |
| NEON | N/A | Y (64-bit wide) | Optional MPE (64-bit wide) | Y (128-bit wide) |
| Process Technology | 90nm | 65nm/45nm | 40nm | 40nm |
| Typical Clock Speeds | 412MHz | 600MHz/1GHz | 1GHz | 1GHz |

While the A4 featured a single-core ARM Cortex A8, the A5 integrates two ARM Cortex A9s with a total of 1MB of L2 cache. That puts the A5 at a similar level of CPU performance to NVIDIA's Tegra 2 and TI's OMAP 4430. The only insider information I've managed to come across points to the A5 featuring ARM's MPE (SIMD/NEON engine) in its A9 cores.

Based on Chipworks' analysis of the Apple A5 die it looks like Apple implemented a dual-channel LP-DDR2 memory controller, similar to TI's OMAP 4430.

ARM Cortex A9 Based SoC Comparison
|  | Apple A5 | TI OMAP 4 | NVIDIA Tegra 2 |
|---|---|---|---|
| Clock Speed | Up to 1GHz | Up to 1GHz | Up to 1GHz |
| Core Count | 2 | 2 | 2 |
| L1 Cache Size | 32KB/32KB | 32KB/32KB | 32KB/32KB |
| L2 Cache Size | 1MB | 1MB | 1MB |
| Memory Interface | Dual Channel LP-DDR2 (?) | Dual Channel LP-DDR2 | Single Channel LP-DDR2 |
| NEON Support | Yes (?) | Yes | No |

Had it not been for NVIDIA, Apple would've had the first shipping dual-core Cortex A9 SoC on the market. This is ultimately why Apple is producing its own SoCs - most of the players in the SoC space don't seem to be moving fast enough for Apple's hardware schedule. Given the aggressive yearly product cadence I wouldn't be too surprised to see a dual-core Cortex A15 in the Apple A6 a year from now. Remember that much of Apple's success has come from being able to control its hardware and software development. On the Mac side Apple has an extremely aggressive chip partner in Intel, but with the iDevices there is no equivalent (for now). Until that changes, Apple will continue to produce its own SoCs. It's not that Apple is designing any of the IP that goes into the SoC, it's that Apple is piecing together what it needs, when it needs it.

We've already gone through the performance offered by the A5 over the A4, but to quickly recap: it's a huge increase. While the original iPad felt slow, the new one feels much faster. I would be lying if I said it was fast enough, but it's way better than the original.

CPU Performance

Taken from our iPad 2 Performance Preview:

Geekbench 2 - Floating Point Performance
|  | Apple iPad | Apple iPad 2 |
|---|---|---|
| Overall FP Score | 456 | 915 |
| Mandelbrot (single-threaded) | 79.5 Mflops | 279.1 Mflops |
| Mandelbrot (multi-threaded) | 79.4 Mflops | 554.7 Mflops |
| Dot Product (single-threaded) | 245.7 Mflops | 221.7 Mflops |
| Dot Product (multi-threaded) | 247.2 Mflops | 436.8 Mflops |
| LU Decomposition (single-threaded) | 54.5 Mflops | 205.4 Mflops |
| LU Decomposition (multi-threaded) | 54.8 Mflops | 421.6 Mflops |
| Primality Test (single-threaded) | 71.2 Mflops | 177.8 Mflops |
| Primality Test (multi-threaded) | 69.3 Mflops | 318.1 Mflops |
| Sharpen Image (single-threaded) | 1.51 Mpixels/s | 1.68 Mpixels/s |
| Sharpen Image (multi-threaded) | 1.51 Mpixels/s | 3.34 Mpixels/s |
| Blur Image (single-threaded) | 760.2 Kpixels/s | 665.5 Kpixels/s |
| Blur Image (multi-threaded) | 753.2 Kpixels/s | 1.32 Mpixels/s |
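
As a quick sanity check on the single-threaded rows, the sketch below turns the Mflops figures from the table into speedup ratios. The helper name `speedup` is just an illustration, and only the four tests reported in Mflops are included so no unit conversion is needed.

```python
# Single-threaded Geekbench 2 FP results from the table above: (iPad, iPad 2).
# All four values are in Mflops, so the ratio needs no unit conversion.
single_threaded_mflops = {
    "Mandelbrot": (79.5, 279.1),
    "Dot Product": (245.7, 221.7),
    "LU Decomposition": (54.5, 205.4),
    "Primality Test": (71.2, 177.8),
}


def speedup(before, after):
    """Return how many times faster `after` is than `before`."""
    return after / before


speedups = {
    name: speedup(ipad, ipad2)
    for name, (ipad, ipad2) in single_threaded_mflops.items()
}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

Mandelbrot, LU Decomposition and Primality Test come out at roughly 3.5x, 3.8x and 2.5x, while Dot Product lands slightly below 1x, so the gain is large but not uniform across tests.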

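Single threaded FPU performance is multiples of what we saw with the original iPad in most of these tests (Mandelbrot, LU Decomposition and Primality Test), although Dot Product is slightly slower and the image filters move by only around 10% in either direction. This sort of an improvement in single-core performance is likely due to the pipelined Cortex A9 FPU. Looking at Linpack we see the same sort of huge improvement: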

Linpack

Whether this performance advantage matters is another question entirely. Although there aren't many FP intensive iPad apps available today, moving to the A5 is all about enabling developers - not playing catch up to software.

Geekbench reports the iPad 2 at 512MB of memory, double the original iPad's 256MB. Remember that Apple has to deal with lower profit margins than it'd like with the iPad, but it refuses to cut corners on screen quality so something else has to give.

L2 cache size has also apparently increased from 512KB to 1MB. The L2 cache is shared between the two cores and 1MB seems to be the sweet spot this generation.

Geekbench 2 - Memory Performance
|  | Apple iPad | Apple iPad 2 |
|---|---|---|
| Overall Memory Score | 644 | 787 |
| Read Sequential (single-threaded scalar) | 340.6 MB/s | 334.2 MB/s |
| Write Sequential (single-threaded scalar) | 842.4 MB/s | 1.07 GB/s |
| Stdlib Allocate (single-threaded scalar) | 1.74 Mallocs/s | 1.86 Mallocs/s |
| Stdlib Write (single-threaded scalar) | 1.20 GB/s | 2.30 GB/s |
| Stdlib Copy (single-threaded scalar) | 740.6 MB/s | 522.0 MB/s |

Geekbench's memory tests show an improvement in effective bandwidth as well. The biggest improvement is in the stdlib write test which shows a near doubling of bandwidth from 1.2GB/s to 2.3GB/s. Unfortunately this isn't enough data to draw conclusions about bus width or DRAM operating frequency. Given the increases in CPU and GPU performance, an increase in memory bandwidth to go along with the two isn't surprising.
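
Because the memory table mixes MB/s and GB/s, it helps to normalize units before comparing. The sketch below does that for the bandwidth rows; it assumes 1 GB/s = 1000 MB/s (Geekbench may use 1024, which would nudge the Write Sequential ratio slightly), and `to_mb_per_s` is a hypothetical helper, not anything from Geekbench.

```python
# Geekbench 2 memory results from the table above: (iPad, iPad 2).
# Assumption: 1 GB/s = 1000 MB/s. If Geekbench uses 1024 instead,
# the Write Sequential ratio shifts slightly.
MB_PER_GB = 1000.0


def to_mb_per_s(value, unit):
    """Normalize a bandwidth figure to MB/s."""
    return value * MB_PER_GB if unit == "GB/s" else value


results = {
    "Read Sequential": ((340.6, "MB/s"), (334.2, "MB/s")),
    "Write Sequential": ((842.4, "MB/s"), (1.07, "GB/s")),
    "Stdlib Write": ((1.20, "GB/s"), (2.30, "GB/s")),
    "Stdlib Copy": ((740.6, "MB/s"), (522.0, "MB/s")),
}

ratios = {
    name: to_mb_per_s(*ipad2) / to_mb_per_s(*ipad)
    for name, (ipad, ipad2) in results.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

Stdlib Write is the near doubling called out above, but Read Sequential is flat and Stdlib Copy actually drops, which fits the caution about not reading bus width or DRAM frequency into these numbers.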

Geekbench shows a healthy increase in integer performance, both in single and multithreaded scenarios. The multithreaded advantage makes sense (two are better than one), but the lead in single threaded tests shows the benefit the A9 can deliver thanks to its shorter pipeline and ability to reorder instructions around stalls.

Geekbench 2 - Integer Performance
|  | Apple iPad | Apple iPad 2 |
|---|---|---|
| Overall Integer Score | 365 | 688 |
| Blowfish (single-threaded) | 13.9 MB/s | 13.2 MB/s |
| Blowfish (multi-threaded) | 14.3 MB/s | 26.1 MB/s |
| Text Compression (single-threaded) | 1.23 MB/s | 1.50 MB/s |
| Text Compression (multi-threaded) | 1.20 MB/s | 2.82 MB/s |
| Text Decompression (single-threaded) | 1.11 MB/s | 2.09 MB/s |
| Text Decompression (multi-threaded) | 1.08 MB/s | 3.28 MB/s |
| Image Compress (single-threaded) | 3.36 Mpixels/s | 3.79 Mpixels/s |
| Image Compress (multi-threaded) | 3.41 Mpixels/s | 7.51 Mpixels/s |
| Image Decompress (single-threaded) | 6.02 Mpixels/s | 6.68 Mpixels/s |
| Image Decompress (multi-threaded) | 5.98 Mpixels/s | 13.1 Mpixels/s |
| Lua (single-threaded) | 172.1 Knodes/s | 273.4 Knodes/s |
| Lua (multi-threaded) | 171.9 Knodes/s | 542.9 Knodes/s |

On average Geekbench shows a 31% increase in single threaded integer performance over the A4 in the original iPad. NVIDIA told me they saw a 20% increase in instructions executed per clock for the A9 vs. A8 and if we remove the one outlier (text decompression) that's about what we see here as well.
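
The averaging method isn't spelled out above, so the following is a sketch under one assumption: a plain arithmetic mean of the per-test iPad 2 / iPad ratios from the single-threaded integer rows. Under that assumption both figures can be reproduced from the table.

```python
from statistics import mean

# Single-threaded Geekbench 2 integer results from the table above: (iPad, iPad 2).
single_threaded = {
    "Blowfish": (13.9, 13.2),
    "Text Compression": (1.23, 1.50),
    "Text Decompression": (1.11, 2.09),
    "Image Compress": (3.36, 3.79),
    "Image Decompress": (6.02, 6.68),
    "Lua": (172.1, 273.4),
}


def mean_gain_percent(results, exclude=()):
    """Arithmetic mean of per-test iPad 2 / iPad ratios, as a percent gain.

    Assumption: the article's average is a simple mean of ratios.
    """
    ratios = [
        ipad2 / ipad
        for name, (ipad, ipad2) in results.items()
        if name not in exclude
    ]
    return (mean(ratios) - 1.0) * 100.0


all_tests = mean_gain_percent(single_threaded)
without_outlier = mean_gain_percent(single_threaded, exclude=("Text Decompression",))

print(f"all tests: {all_tests:.1f}%")
print(f"without text decompression: {without_outlier:.1f}%")
```

With all six tests the mean gain rounds to 31%, and dropping text decompression brings it to about 20%, in line with the IPC figure NVIDIA quoted.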

Geekbench 2
|  | Overall | Integer | FP | Memory | Stream |
|---|---|---|---|---|---|
| Apple iPad | 448 | 365 | 456 | 644 | 325 |
| Apple iPad 2 | 750 | 688 | 915 | 787 | 324 |

The increases in integer performance and memory bandwidth are likely what will have the largest impact on your experience. The fact that we're seeing big gains in single as well as multi-threaded workloads means the performance improvement should be universal across all CPU-bound apps.
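
To put a number on the multi-threaded side, the sketch below divides each iPad 2 multi-threaded integer result by its single-threaded counterpart from the integer table earlier. A value near 2.0 means the second core is being used almost perfectly; the `scaling` helper is illustrative only.

```python
# iPad 2 Geekbench 2 integer results from the integer table earlier:
# (single-threaded, multi-threaded), same units within each row.
ipad2_integer = {
    "Blowfish": (13.2, 26.1),
    "Text Compression": (1.50, 2.82),
    "Text Decompression": (2.09, 3.28),
    "Image Compress": (3.79, 7.51),
    "Image Decompress": (6.68, 13.1),
    "Lua": (273.4, 542.9),
}


def scaling(single, multi):
    """Multi-threaded throughput relative to single-threaded throughput."""
    return multi / single


scalings = {
    name: scaling(single, multi)
    for name, (single, multi) in ipad2_integer.items()
}

for name, factor in scalings.items():
    print(f"{name}: {factor:.2f}x with two cores")
```

Most tests scale close to 2x across the two A9 cores; text decompression is the weakest at roughly 1.6x.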

What does all of this mean for performance in the real world? The iPad 2 is much faster than its predecessor. Let's start with our trusty JavaScript benchmarks: SunSpider and BrowserMark.

SunSpider Javascript Benchmark 0.9

Apple improved the Safari JavaScript engine in iOS 4.3, which right off the bat helped the original iPad become more competitive in this test. Even with both pads running iOS 4.3, the iPad 2 is 80% faster than the original iPad here.

The Motorola Xoom we recently reviewed scored a few percent slower than the iPad 2 in SunSpider as well. With the two tablets running different OSes and browsers, it's difficult to conclude much when comparing the A5 to Tegra 2.

A bug in BrowserMark kept us from running it for the Xoom review but it's since been fixed. Again we're looking at mostly JavaScript performance here. Rightware modeled its benchmark after the JavaScript frameworks and functions used by websites like Facebook, Amazon and Gmail among others. The results are simply one aspect of web browsing performance, but an important one:

Rightware BrowserMark

The move from the A4 in the iPad 1 to the A5 in the iPad 2 boosts scores by 47%. More impressive however is just how much faster the Xoom is here. I suspect this has more to do with Google's software optimizations in the Honeycomb browser than hardware, but let's see how these tablets fare in our web page loading tests.

We debuted an early version of our 2011 web page loading tests in the Xoom review. Two things have changed since then: 1) iOS 4.3 came out, and 2) we changed our timing methods to produce more accurate results. It turns out that Honeycomb's browser was stopping our page load timer sooner than iOS', which resulted in some funny numbers when we got to the 4.3/Honeycomb comparison. To ensure accuracy we went back to timing by hand (each test was repeated at least 5 times and we present an average of the results). We also added two more pages to the test suite (Digg and Facebook).

2011 Page Load Test - Average

The iPad 2 generally loads web pages faster than the Xoom. On average it's a ~20% increase in performance. I wouldn't say that the improvement is necessarily noticeable when surfing most sites, but it's definitely measurable.

Double the Memory, Still Not Enough

On a Mac or PC, if you don't have enough system memory and go to run a new application, you'll get a lot of swapping to disk. The OS will write least recently used pages of memory to disk and evict them from main memory, making room for the newly launched application. Memory management in iOS works differently. All applications are required to save their state as soon as they move from the foreground, because iOS can evict them from memory at any point in time.

Having more memory in iOS means you can have apps with larger memory footprints or you can keep more apps in memory without forcefully evicting them, but it generally doesn't mean you'll see improved performance.
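
The contrast between the two approaches can be sketched with a toy model. Everything here is an assumption for illustration: the class names, the app footprints, the "oldest background app goes first" policy, and treating all of RAM as available to apps. It is not how OS X or iOS actually implement memory management.

```python
from collections import OrderedDict


class DesktopMemory:
    """Toy model: when RAM is full, the least recently used page is swapped to disk."""

    def __init__(self, ram_pages):
        self.ram_pages = ram_pages
        self.ram = OrderedDict()  # page -> None, ordered from oldest to newest use
        self.swap = set()         # pages that have been written out to disk

    def touch(self, page):
        if page in self.ram:
            self.ram.move_to_end(page)  # mark as most recently used
            return
        self.swap.discard(page)         # bring the page back if it was swapped out
        if len(self.ram) >= self.ram_pages:
            victim, _ = self.ram.popitem(last=False)  # evict the LRU page
            self.swap.add(victim)
        self.ram[page] = None


class MobileMemory:
    """Toy model: no swap; whole background apps are evicted after saving state."""

    def __init__(self, ram_mb):
        self.ram_mb = ram_mb
        self.apps = OrderedDict()  # app -> footprint in MB, oldest first
        self.saved_state = set()   # apps that saved state when backgrounded

    def launch(self, app, footprint_mb):
        # Every resident app moves to the background and saves its state.
        self.saved_state.update(self.apps)
        self.apps.pop(app, None)
        # Hypothetical policy: evict the oldest background app until the new one fits.
        while self.apps and sum(self.apps.values()) + footprint_mb > self.ram_mb:
            self.apps.popitem(last=False)
        self.apps[app] = footprint_mb


desktop = DesktopMemory(ram_pages=2)
for page in ["a", "b", "a", "c"]:
    desktop.touch(page)

# Hypothetical app footprints in MB, launched in the same order on both devices.
ipad1 = MobileMemory(ram_mb=256)
ipad2 = MobileMemory(ram_mb=512)
for device in (ipad1, ipad2):
    for app, mb in [("Safari", 150), ("Mail", 100), ("Game", 200)]:
        device.launch(app, mb)

print("desktop swapped out:", sorted(desktop.swap))
print("256MB device keeps:", list(ipad1.apps))
print("512MB device keeps:", list(ipad2.apps))
```

In this model the larger device keeps all three apps resident while the smaller one has to throw out the two background apps to fit the game. Nothing runs faster on the larger device; apps simply survive in memory longer.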

With the iPad 2 Apple chose to only equip the device with 512MB of LP-DDR2 memory. That's half of what you get in the Motorola Xoom, but twice what you got in the original iPad. This does mean that (as we mentioned earlier) things like web pages can remain in memory longer, although there's no real impact on performance from what we can tell.

If Apple follows its short tradition, we may see more memory in the iPhone 5 and then more in the iPad 3 next year. Display resolution didn't increase so there's no pressure for additional memory there, but Apple is definitely holding developers back by not throwing even more hardware resources at the iPad 2.


189 Comments


  • JarredWalton - Sunday, March 20, 2011 - link

    Considering the source (ARMflix), you need to take that video with a huge grain of salt. It looks like they're running some Linux variant on the two systems (maybe Chromium?), and while the build may be the same, that doesn't mean it's optimized equally well for Atom vs. A9.

    Single-core Atom at 1.6GHz vs. dual-core A9 at 500MHz surfing the web is fine and all, but when we discuss Atom being faster than A9 we're talking about raw performance potential. A properly optimized web browser and OS experience with high-speed Internet should be good on just about any modern platform. Throw in some video playback as well, give us something more than a script of web pages in a browser, etc.

    Now, none of this means ARM's A9 is bad, but to show that it's as fast as Atom when browsing some web pages is potentially meaningless. What we really need to know is what one platform can do well that the other can't handle properly. Where does A9 fall flat? Where does Atom stumble?

    For me, right now, Atom sucks at anything video related. Sorry, but YouTube and Hulu are pretty important tools for me. That also means iOS has some concerns, as it doesn't support Flash at all, and there are enough places where Flash is still used that it creates issues. Luckily, I have plenty of other devices for accessing the web. In the end, I mostly play Angry Birds on my iPod Touch while I'm waiting for someone. :-)
  • Wilco1 - Sunday, March 20, 2011 - link

    The article is indeed wrong to suggest that the A9 has only half the performance of an Atom. There are cases where a netbook with a single core Atom might be faster, for example if it runs at a much higher frequency, uses hyperthreading, and has a fast DDR3 memory system. However in terms of raw CPU performance the out-of-order A9 is significantly faster than the in-order Atom. Benchmark results such as CoreMark confirm this, a single core Atom cannot beat an A9 at the same frequency - even with hyperthreading. So it would be good to clarify that netbooks are faster because they use higher frequency CPUs and a faster memory system - as well as a larger battery...
  • somata - Sunday, March 27, 2011 - link

    CoreMark is nearly as meaningless as MIPS. Right now the best cross-platform benchmark we have is Geekbench. It uses portable, multi-threaded, native code to perform real tasks. My experience with Geekbench on the Mac/PC over the years indicates that Geekbench scores correlate pretty well to average application performance (determined by my personal suite of app benchmarks). Of course there will be outliers, but Geekbench does a pretty good job at representing typical code.

    Given that, the fact that a single-core 1.6GHz Atom (with HT) scores about 28% higher than the iPad's dual-core 1GHz A9s in the integer suite leaves me little doubt that the Atom, despite being in-order, has as good or better per-clock performance than the A9s.

    Even the oft-maligned PowerPC G4 totally outclasses the dual A9s, with 43% better integer performance at 1.42GHz... and that's just with a single core competing against two!
  • tcool93 - Sunday, March 20, 2011 - link

    Tablets do have their advantages despite what the article claims. For one thing, their battery life far outlives any Netbook or Notebook. They also run a lot cooler, unlike Notebooks and Netbooks, which you can fry an egg on. Maybe they aren't as portable as a phone, but who wants to look at the super tiny print on a phone?

    Tablets don't replace computers, and never will. They are nice to sit in bed with at night and browse the web or read books on, or play a simple game on. Anything that doesn't require a lot of typing.

    Even a 10" tablet screen isn't real big to read text, but it's MUCH easier to zoom in on text to read it with tablets. Unlike any Notebook/Netbook, where it's a huge pain to zoom in.
  • tcool93 - Sunday, March 20, 2011 - link

    I do think the benchmarks shown here do show that there is quite an improvement over the iPad 1, despite what many seem to claim that there isn't much of an upgrade.
  • secretmanofagent - Sunday, March 20, 2011 - link

    Anand,
    Appreciate the article, and appreciating that you're responding to the readers as well. All three of you said that it didn't integrate into your workflow, and I have a similar problem (which has prevented me from purchasing one). One thing I'm very curious about: What is your opinion on what would have been the Courier concept? Do you feel that is the direction that tablets should have taken, or do you think that Apple's refining as opposed to paradigming is the way to go?
  • VivekGowri - Sunday, March 20, 2011 - link

    I still despise Microsoft for killing the Courier project. Honestly, I'd have loved to see the tablet market go that direction - a lot more focused on content creation instead of a very consumption-centric device like the iPad. A $4-500 device running that UI, an ARM processor, and OneNote syncing ability would have sold like hotcakes to students. If only...
  • tipoo - Sunday, March 20, 2011 - link

    Me too, the Courier looked amazing. They cancel that, yet go ahead with something like the Kin? Hard to imagine where their heads are at.
  • Anand Lal Shimpi - Monday, March 21, 2011 - link

    While I've seen the Courier video, and it definitely looked impressive, it's tough to say how that would've worked in practice.

    I feel like there are performance limitations that are at work here. Even though a pair of A9s are quick, they are by no means fast enough. I feel like as a result, evolutionary refinement is the only way to go about getting to where we need to be. Along the way Apple (and its competitors) can pick up early adopters to help fund the progress.

    I'm really curious to see which company gets the gaming side of it down. Clearly that's a huge market.

    Take care,
    Anand
  • Azethoth - Monday, March 21, 2011 - link

    Gaming side is a good question. Apple will have an advantage there due to limited hardware specs to code to. They are a lot more like a traditional console that way vs Android which will be anything but.

    Are actual game controls like in the psp phone necessary?

    I am also curious what additional UI tech will eventually make it to the pad space:
    * Speech, although it is forever not there yet.
    * 3D maybe if it's not a fad (glasses free)
    * Some form of the Kinect maybe to manipulate the 3D stuff and do magical Kinect gestures and incantations we haven't dreamed up yet.
    * Haptic as mentioned earlier in the thread.

    Speech could make a pad suitable for hip bloggers like the AnandTech posse.
