The Apple iPad 2 Review
by Brian Klug, Anand Lal Shimpi & Vivek Gowri on March 19, 2011 8:01 PM EST
The Right SoC at the Right Time: Apple's A5
Here's how I know Apple is masterful at marketing. After first showing off the new iPad, Apple had tons of press convinced that the company was no longer competing based on specs but rather was only interested in delivering an experience. In reality Apple is competing on hardware even more than before; it's just trying to give the public the impression that it's not. After all, Apple doesn't make the vast majority of the technology inside the iPad, but it does control the experience. A competitor may be able to ship a dual-core Cortex A9, but it can't ship the iOS experience. Is it really a surprise that Apple would downplay what it doesn't have exclusive rights to and instead try to get everyone to focus on what it does? Make no mistake, Apple is very much playing the specs game - in fact it's playing the game harder than anyone else in the industry today.
At the heart of the iPad 2 is a brand new SoC: the Apple A5. Built on what I assume is Samsung's 45nm process, the A5 is a much more powerful SoC than its predecessor, the A4.
| ||ARM11||ARM Cortex A8||ARM Cortex A9||Qualcomm Scorpion|
|Pipeline Depth||8 stages||13 stages||9 stages||13 stages|
|Out of Order Execution||N||N||Y||Partial|
|FPU||Optional VFPv2 (not-pipelined)||VFPv3 (not-pipelined)||Optional VFPv3-D16 (pipelined)||VFPv3 (pipelined)|
|NEON||N/A||Y (64-bit wide)||Optional MPE (64-bit wide)||Y (128-bit wide)|
|Typical Clock Speeds||412MHz||600MHz/1GHz||1GHz||1GHz|
While the A4 featured a single-core ARM Cortex A8, the A5 integrates two ARM Cortex A9s with a total of 1MB of L2 cache. That puts the A5 at a similar level of CPU performance to NVIDIA's Tegra 2 and TI's OMAP 4430. The only insider information I've managed to come across points to the A5 featuring ARM's MPE (SIMD/NEON engine) in its A9 cores.
Based on Chipworks' analysis of the Apple A5 die it looks like Apple implemented a dual-channel LP-DDR2 memory controller, similar to TI's OMAP 4430.
|ARM Cortex A9 Based SoC Comparison|
| ||Apple A5||TI OMAP 4||NVIDIA Tegra 2|
|Clock Speed||Up to 1GHz||Up to 1GHz||Up to 1GHz|
|L1 Cache Size||32KB/32KB||32KB/32KB||32KB/32KB|
|L2 Cache Size||1MB||1MB||1MB|
|Memory Interface||Dual Channel LP-DDR2 (?)||Dual Channel LP-DDR2||Single Channel LP-DDR2|
|NEON Support||Yes (?)||Yes||No|
Had it not been for NVIDIA, Apple would've had the first shipping dual-core Cortex A9 SoC on the market. This is ultimately why Apple is producing its own SoCs - most of the players in the SoC space don't seem to be moving fast enough for Apple's hardware schedule. Given the aggressive yearly product cadence I wouldn't be too surprised to see a dual-core Cortex A15 in the Apple A6 a year from now. Remember that much of Apple's success has come from being able to control its hardware and software development. On the Mac side Apple has an extremely aggressive chip partner in Intel, but with the iDevices there is no equivalent (for now). Until that changes, Apple will continue to produce its own SoCs. It's not that Apple is designing any of the IP that goes into the SoC, it's that Apple is piecing together what it needs, when it needs it.
We've already gone through the performance offered by the A5 over the A4, but to quickly recap: it's a huge increase. While the original iPad felt slow, the new one feels much faster. I would be lying if I said it was fast enough, but it's way better than the original.
Taken from our iPad 2 Performance Preview:
|Geekbench 2 - Floating Point Performance|
| ||Apple iPad||Apple iPad 2|
|Overall FP Score||456||915|
|Mandelbrot (single-threaded)||79.5 Mflops||279.1 Mflops|
|Mandelbrot (multi-threaded)||79.4 Mflops||554.7 Mflops|
|Dot Product (single-threaded)||245.7 Mflops||221.7 Mflops|
|Dot Product (multi-threaded)||247.2 Mflops||436.8 Mflops|
|LU Decomposition (single-threaded)||54.5 Mflops||205.4 Mflops|
|LU Decomposition (multi-threaded)||54.8 Mflops||421.6 Mflops|
|Primality Test (single-threaded)||71.2 Mflops||177.8 Mflops|
|Primality Test (multi-threaded)||69.3 Mflops||318.1 Mflops|
|Sharpen Image (single-threaded)||1.51 Mpixels/s||1.68 Mpixels/s|
|Sharpen Image (multi-threaded)||1.51 Mpixels/s||3.34 Mpixels/s|
|Blur Image (single-threaded)||760.2 Kpixels/s||665.5 Kpixels/s|
|Blur Image (multi-threaded)||753.2 Kpixels/s||1.32 Mpixels/s|
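As a quick sanity check on the table above, the single-threaded rows can be turned into speedup ratios. This is only a sketch: the numbers are copied straight from the table, while the `fp_single` dictionary and the `speedups` helper are names made up for illustration and have nothing to do with Geekbench itself.

```python
# Single-threaded Geekbench 2 FP results copied from the table above,
# stored as (original iPad, iPad 2). Units differ per test but cancel
# out in the ratio. The dictionary layout is an illustrative assumption.
fp_single = {
    "Mandelbrot": (79.5, 279.1),
    "Dot Product": (245.7, 221.7),
    "LU Decomposition": (54.5, 205.4),
    "Primality Test": (71.2, 177.8),
    "Sharpen Image": (1.51, 1.68),
    "Blur Image": (760.2, 665.5),
}


def speedups(results):
    """Return {test name: iPad 2 result / original iPad result}."""
    return {name: new / old for name, (old, new) in results.items()}


fp_ratios = speedups(fp_single)
for name, ratio in fp_ratios.items():
    print(f"{name:18s} {ratio:.2f}x")
```

Three of the six tests (Mandelbrot, LU Decomposition and Primality Test) come out between roughly 2.5x and 3.8x, Sharpen Image gains about 11%, and Dot Product and Blur Image land slightly below 1x.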
Single-threaded FPU performance is multiples of what we saw with the original iPad in tests like Mandelbrot, LU Decomposition and Primality Test, although Dot Product and Blur Image actually regress slightly. This sort of improvement in single-core performance is likely due to the pipelined Cortex A9 FPU. Linpack shows the same sort of huge improvement.
Whether this performance advantage matters is another question entirely. Although there aren't many FP intensive iPad apps available today, moving to the A5 is all about enabling developers - not playing catch up to software.
Geekbench reports the iPad 2 at 512MB of memory, double the original iPad's 256MB. Remember that Apple has to deal with lower profit margins than it'd like with the iPad, but it refuses to cut corners on screen quality so something else has to give.
L2 cache size has also apparently increased from 512KB to 1MB. The L2 cache is shared between both cores and 1MB seems to be the sweet spot this generation.
|Geekbench 2 - Memory Performance|
| ||Apple iPad||Apple iPad 2|
|Overall Memory Score||644||787|
|Read Sequential (single-threaded scalar)||340.6 MB/s||334.2 MB/s|
|Write Sequential (single-threaded scalar)||842.4 MB/s||1.07 GB/s|
|Stdlib Allocate (single-threaded scalar)||1.74 Mallocs/s||1.86 Mallocs/s|
|Stdlib Write (single-threaded scalar)||1.20 GB/s||2.30 GB/s|
|Stdlib Copy (single-threaded scalar)||740.6 MB/s||522.0 MB/s|
Geekbench's memory tests show an improvement in effective bandwidth as well. The biggest improvement is in the stdlib write test which shows a near doubling of bandwidth from 1.2GB/s to 2.3GB/s. Unfortunately this isn't enough data to draw conclusions about bus width or DRAM operating frequency. Given the increases in CPU and GPU performance, an increase in memory bandwidth to go along with the two isn't surprising.
Geekbench shows a healthy increase in integer performance, both in single and multithreaded scenarios. The multithreaded advantage makes sense (two are better than one), but the lead in single threaded tests shows the benefit the A9 can deliver thanks to its shorter pipeline and ability to reorder instructions around stalls.
|Geekbench 2 - Integer Performance|
| ||Apple iPad||Apple iPad 2|
|Overall Integer Score||365||688|
|Blowfish (single-threaded)||13.9 MB/s||13.2 MB/s|
|Blowfish (multi-threaded)||14.3 MB/s||26.1 MB/s|
|Text Compression (single-threaded)||1.23 MB/s||1.50 MB/s|
|Text Compression (multi-threaded)||1.20 MB/s||2.82 MB/s|
|Text Decompression (single-threaded)||1.11 MB/s||2.09 MB/s|
|Text Decompression (multi-threaded)||1.08 MB/s||3.28 MB/s|
|Image Compress (single-threaded)||3.36 Mpixels/s||3.79 Mpixels/s|
|Image Compress (multi-threaded)||3.41 Mpixels/s||7.51 Mpixels/s|
|Image Decompress (single-threaded)||6.02 Mpixels/s||6.68 Mpixels/s|
|Image Decompress (multi-threaded)||5.98 Mpixels/s||13.1 Mpixels/s|
|Lua (single-threaded)||172.1 Knodes/s||273.4 Knodes/s|
|Lua (multi-threaded)||171.9 Knodes/s||542.9 Knodes/s|
On average Geekbench shows a 31% increase in single threaded integer performance over the A4 in the original iPad. NVIDIA told me they saw a 20% increase in instructions executed per clock for the A9 vs. A8 and if we remove the one outlier (text decompression) that's about what we see here as well.
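The averages quoted above can be reproduced from the single-threaded rows of the integer table. The sketch below assumes a plain arithmetic mean of the per-test ratios, which is an assumption on my part about how the figure was derived; the variable names are illustrative only.

```python
from statistics import mean

# Single-threaded Geekbench 2 integer results from the table above,
# stored as (original iPad, iPad 2). Units cancel out in the ratio.
int_single = {
    "Blowfish": (13.9, 13.2),
    "Text Compression": (1.23, 1.50),
    "Text Decompression": (1.11, 2.09),
    "Image Compress": (3.36, 3.79),
    "Image Decompress": (6.02, 6.68),
    "Lua": (172.1, 273.4),
}

# Per-test speedup of the iPad 2 over the original iPad.
int_ratios = {name: new / old for name, (old, new) in int_single.items()}

# Assumption: a simple arithmetic mean of the ratios.
avg_all = mean(list(int_ratios.values()))
avg_no_outlier = mean(
    [r for name, r in int_ratios.items() if name != "Text Decompression"]
)

print(f"All tests:               +{(avg_all - 1) * 100:.0f}%")
print(f"Without text decompress: +{(avg_no_outlier - 1) * 100:.0f}%")
```

Under that assumption the two figures work out to about 31% across all six tests and about 20% once text decompression is dropped, which lines up with the per-clock gain NVIDIA quoted.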
| ||Overall||Integer||FP||Memory||Stream|
|Apple iPad 2||750||688||915||787||324|
The increases in integer performance and memory bandwidth are likely what will have the largest impact on your experience. The fact that we're seeing big gains in single as well as multi-threaded workloads means the performance improvement should be universal across all CPU-bound apps.
The Motorola Xoom we recently reviewed scored a few percent slower than the iPad 2 in SunSpider as well. With the two tablets running different OSes and browsers, it's difficult to conclude much when comparing the A5 to Tegra 2.
The move from the A4 in the iPad 1 to the A5 in the iPad 2 boosts scores by 47%. More impressive, however, is just how much faster the Xoom is here. I suspect this has more to do with Google's software optimizations in the Honeycomb browser than hardware, but let's see how these tablets fare in our web page loading tests.
We debuted an early version of our 2011 web page loading tests in the Xoom review. Two things have changed since then: 1) iOS 4.3 came out, and 2) we changed our timing methods to produce more accurate results. It turns out that Honeycomb's browser was stopping our page load timer sooner than iOS', which resulted in some funny numbers when we got to the 4.3/Honeycomb comparison. To ensure accuracy we went back to timing by hand (each test was repeated at least 5 times and we present an average of the results). We also added two more pages to the test suite (Digg and Facebook).
The iPad 2 generally loads web pages faster than the Xoom. On average it's a ~20% increase in performance. I wouldn't say that the improvement is necessarily noticeable when surfing most sites, but it's definitely measurable.
Double the Memory, Still Not Enough
On a Mac or PC if you don't have enough system memory and go to run a new application you'll get a lot of swapping to disk. The OS will write least recently used pages of memory to disk and evict them from main memory, making room for the newly launched application. Memory management in iOS works differently. All applications are required to save their state as soon as they move from the foreground as iOS can evict them from memory at any point in time.
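To make the desktop half of that comparison concrete, here is a toy model of least-recently-used eviction. It is a sketch only: real kernels use approximations of LRU and track far more state, and the `LRUMemory` class, its capacity and the page labels are all hypothetical.

```python
from collections import OrderedDict


class LRUMemory:
    """Toy model of main memory that holds a fixed number of pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # least recently used first, newest last
        self.swapped_out = []       # pages written to "disk", in eviction order

    def touch(self, page):
        """Access a page, evicting the least recently used one if memory is full."""
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            return
        if len(self.pages) >= self.capacity:
            victim, _ = self.pages.popitem(last=False)  # oldest access goes first
            self.swapped_out.append(victim)
        self.pages[page] = True


mem = LRUMemory(capacity=3)
for page in ["A", "B", "C", "A", "D"]:
    mem.touch(page)

print("resident:", list(mem.pages))
print("swapped out:", mem.swapped_out)
```

In this model the evicted page is still recoverable from disk. The iOS approach described above has no such fallback, which is why apps have to save their own state before they can be evicted.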
Having more memory in iOS means you can have apps with larger memory footprints or you can keep more apps in memory without forcefully evicting them, but it generally doesn't mean you'll see improved performance.
With the iPad 2 Apple chose to only equip the device with 512MB of LP-DDR2 memory. That's half of what you get in the Motorola Xoom, but twice what you got in the original iPad. This does mean that (as we mentioned earlier) things like web pages can remain in memory longer, although there's no real impact on performance from what we can tell.
If Apple follows its short tradition, we may see more memory in the iPhone 5 and then more in the iPad 3 next year. Display resolution didn't increase so there's no pressure for additional memory there, but Apple is definitely holding developers back by not throwing even more hardware resources at the iPad 2.