Apple iPad 2 Preview
by Anand Lal Shimpi, Brian Klug & Vivek Gowri on March 12, 2011 6:01 AM EST
I remember the speculation that led up to Apple's iPad launch. The list of things everyone expected the device to do was absurd, and the theories on the architecture behind Apple's first branded SoC were just as fantastic. The simplest answer is sometimes the right one and, as Ars Technica's Jon Stokes pointed out, the A4 was nothing more than a hardened ARM Cortex A8 core running at 1GHz in the iPad (and 800MHz in the iPhone 4).
The Cortex A8 is something we've covered extensively here so I won't go into great detail right now. It's a dual-issue, in-order architecture with a 13 stage integer pipeline and a non-pipelined FPU.
When Apple announced the iPad 2, it also briefly announced the A5 SoC. The only detail given? The A5 is a dual-core processor with a GPU that's 9x faster than what's in the A4.
There are only two recent ARM architectures that have multicore support: the ARM11 and the ARM Cortex A9. The A8 doesn't come in a multicore variant. Given how many other SoC vendors are shipping dual-core Cortex A9 SoCs, the A5 was likely no different than NVIDIA's Tegra 2, TI's OMAP 4 or Samsung's Exynos in that regard: armed with a pair of Cortex A9s running at 1GHz. Update: Geekbench reports clock speed at 900MHz. Update 2: Apple confirms 1GHz clock speed on the iPad 2 specs page.
|ARM11||ARM Cortex A8||ARM Cortex A9||Qualcomm Scorpion|
|Pipeline Depth||8 stages||13 stages||9 stages||13 stages|
|Out of Order Execution||N||N||Y||Partial|
|FPU||Optional VFPv2 (not-pipelined)||VFPv3 (not-pipelined)||Optional VFPv3-D16 (pipelined)||VFPv3 (pipelined)|
|NEON||N/A||Y (64-bit wide)||Optional MPE (64-bit wide)||Y (128-bit wide)|
|Typical Clock Speeds||412MHz||600MHz/1GHz||1GHz||1GHz|
The Cortex A9 is similar to the A8 but with an out-of-order execution engine and a shallower pipeline (9 stages). The result is better-than-A8 performance at the same clock speed. The A9 also adds a fully pipelined FPU.
Now it's unclear what the rest of the A5 SoC looks like, but from the CPU standpoint I think it's safe to say that there is a pair of ARM Cortex A9s in there. We can look at the increase in Geekbench Floating Point scores for some proof:
|Geekbench 2 - Floating Point Performance|
|Apple iPad||Apple iPad 2|
|Overall FP Score||456||915|
|Mandelbrot (single-threaded)||79.5 Mflops||279.1 Mflops|
|Mandelbrot (multi-threaded)||79.4 Mflops||554.7 Mflops|
|Dot Product (single-threaded)||245.7 Mflops||221.7 Mflops|
|Dot Product (multi-threaded)||247.2 Mflops||436.8 Mflops|
|LU Decomposition (single-threaded)||54.5 Mflops||205.4 Mflops|
|LU Decomposition (multi-threaded)||54.8 Mflops||421.6 Mflops|
|Primality Test (single-threaded)||71.2 Mflops||177.8 Mflops|
|Primality Test (multi-threaded)||69.3 Mflops||318.1 Mflops|
|Sharpen Image (single-threaded)||1.51 Mpixels/s||1.68 Mpixels/s|
|Sharpen Image (multi-threaded)||1.51 Mpixels/s||3.34 Mpixels/s|
|Blur Image (single-threaded)||760.2 Kpixels/s||665.5 Kpixels/s|
|Blur Image (multi-threaded)||753.2 Kpixels/s||1.32 Mpixels/s|
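As a quick check on the table above, here is a minimal Python sketch that turns the single-threaded results into iPad 2 vs. iPad ratios. The numbers are copied from the table; the names `fp_single`, `speedup` and `fp_speedups` are just illustrative and not part of Geekbench.

```python
# Single-threaded Geekbench 2 FP results from the table above, stored as
# (original iPad, iPad 2). Units differ per test, but each pair shares a
# unit, so the ratio is unit-free.
fp_single = {
    "Mandelbrot": (79.5, 279.1),        # Mflops
    "Dot Product": (245.7, 221.7),      # Mflops
    "LU Decomposition": (54.5, 205.4),  # Mflops
    "Primality Test": (71.2, 177.8),    # Mflops
    "Sharpen Image": (1.51, 1.68),      # Mpixels/s
    "Blur Image": (760.2, 665.5),       # Kpixels/s
}


def speedup(ipad, ipad2):
    """Return how many times faster the iPad 2 result is than the iPad's."""
    return ipad2 / ipad


fp_speedups = {name: speedup(a, b) for name, (a, b) in fp_single.items()}

for name, ratio in fp_speedups.items():
    print(f"{name:18s} {ratio:.2f}x")
```

A ratio above 1.0 means the iPad 2 is faster in that test, and a ratio below 1.0 means it is slower.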
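Single threaded FPU performance is up to several times what we saw with the original iPad: Mandelbrot, LU Decomposition and Primality Test run roughly 2.5x to 3.8x faster, although Dot Product and Blur Image are actually slightly slower. This sort of improvement in single-core performance is likely due to the pipelined Cortex A9 FPU. Linpack shows the same sort of huge improvement.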
Whether this performance advantage matters is another question entirely. Although there aren't many FP intensive iPad apps available today, moving to the A5 is all about enabling developers - not playing catch up to software.
Memory size, bandwidth and operating frequencies are all unknowns that I was hoping to find out more about once I put hands on the iPad 2. Geekbench reports the iPad 2 at 512MB of memory, double the original iPad's 256MB. Remember that Apple has to deal with lower profit margins than it'd like with the iPad, but it refuses to cut corners on screen quality so something else has to give.
L2 cache size has also apparently increased from 512KB to 1MB. The L2 cache is shared between the two cores and 1MB seems to be the sweet spot this generation.
|Geekbench 2 - Memory Performance|
|Apple iPad||Apple iPad 2|
|Overall Memory Score||644||787|
|Read Sequential (single-threaded scalar)||340.6 MB/s||334.2 MB/s|
|Write Sequential (single-threaded scalar)||842.4 MB/s||1.07 GB/s|
|Stdlib Allocate (single-threaded scalar)||1.74 Mallocs/s||1.86 Mallocs/s|
|Stdlib Write (single-threaded scalar)||1.20 GB/s||2.30 GB/s|
|Stdlib Copy (single-threaded scalar)||740.6 MB/s||522.0 MB/s|
Geekbench's memory tests show an improvement in effective bandwidth as well. The biggest improvement is in the stdlib write test which shows a near doubling of bandwidth from 1.2GB/s to 2.3GB/s. Unfortunately this isn't enough data to draw conclusions about bus width or DRAM operating frequency. Given the increases in CPU and GPU performance, an increase in memory bandwidth to go along with the two isn't surprising.
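To put the bandwidth figures on a common footing, the following rough Python sketch converts each result to MB/s and reports the percent change from the iPad to the iPad 2. It assumes Geekbench's GB/s means 1000 MB/s (a binary 1024 would shift the Write Sequential figure by a couple of points), and the helper names are made up for illustration. Stdlib Allocate is left out because it measures allocations per second rather than bandwidth.

```python
# Assumption: decimal units, i.e. 1 GB/s == 1000 MB/s.
MB_PER_UNIT = {"MB/s": 1.0, "GB/s": 1000.0}

# Geekbench 2 memory results from the table above, stored as
# ((iPad value, unit), (iPad 2 value, unit)).
memory = {
    "Read Sequential": ((340.6, "MB/s"), (334.2, "MB/s")),
    "Write Sequential": ((842.4, "MB/s"), (1.07, "GB/s")),
    "Stdlib Write": ((1.20, "GB/s"), (2.30, "GB/s")),
    "Stdlib Copy": ((740.6, "MB/s"), (522.0, "MB/s")),
}


def to_mb_per_s(value, unit):
    """Normalize a bandwidth figure to MB/s."""
    return value * MB_PER_UNIT[unit]


def percent_change(before, after):
    """Percent change from `before` to `after`, both (value, unit) pairs."""
    return (to_mb_per_s(*after) / to_mb_per_s(*before) - 1.0) * 100.0


memory_change = {name: percent_change(a, b) for name, (a, b) in memory.items()}

for name, change in memory_change.items():
    print(f"{name:18s} {change:+.1f}%")
```

Under that assumption the gains are uneven: the two write tests improve substantially while Read Sequential and Stdlib Copy come out lower on the iPad 2, which is another reason not to read bus width or DRAM frequency into these numbers.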
Geekbench shows a healthy increase in integer performance, both in single and multithreaded scenarios. The multithreaded advantage makes sense (two cores are better than one), but the lead in single threaded tests shows the benefit the A9 can deliver thanks to its shorter pipeline and ability to reorder instructions around stalls.
|Geekbench 2 - Integer Performance|
|Apple iPad||Apple iPad 2|
|Overall Integer Score||365||688|
|Blowfish (single-threaded)||13.9 MB/s||13.2 MB/s|
|Blowfish (multi-threaded)||14.3 MB/s||26.1 MB/s|
|Text Compression (single-threaded)||1.23 MB/s||1.50 MB/s|
|Text Compression (multi-threaded)||1.20 MB/s||2.82 MB/s|
|Text Decompression (single-threaded)||1.11 MB/s||2.09 MB/s|
|Text Decompression (multi-threaded)||1.08 MB/s||3.28 MB/s|
|Image Compress (single-threaded)||3.36 Mpixels/s||3.79 Mpixels/s|
|Image Compress (multi-threaded)||3.41 Mpixels/s||7.51 Mpixels/s|
|Image Decompress (single-threaded)||6.02 Mpixels/s||6.68 Mpixels/s|
|Image Decompress (multi-threaded)||5.98 Mpixels/s||13.1 Mpixels/s|
|Lua (single-threaded)||172.1 Knodes/s||273.4 Knodes/s|
|Lua (multi-threaded)||171.9 Knodes/s||542.9 Knodes/s|
On average Geekbench shows a 31% increase in single threaded integer performance over the A4 in the original iPad. NVIDIA told me they saw a 20% increase in instructions executed per clock for the A9 vs. A8 and if we remove the one outlier (text decompression) that's about what we see here as well.
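The averages quoted above can be reproduced from the single-threaded rows of the integer table. This is a small Python sketch, assuming a plain arithmetic mean of the per-test ratios (Geekbench's own score weighting may differ); the variable names are illustrative only.

```python
from statistics import mean

# Single-threaded Geekbench 2 integer results from the table above,
# stored as (original iPad, iPad 2).
int_single = {
    "Blowfish": (13.9, 13.2),            # MB/s
    "Text Compression": (1.23, 1.50),    # MB/s
    "Text Decompression": (1.11, 2.09),  # MB/s
    "Image Compress": (3.36, 3.79),      # Mpixels/s
    "Image Decompress": (6.02, 6.68),    # Mpixels/s
    "Lua": (172.1, 273.4),               # Knodes/s
}

# Per-test speedup of the iPad 2 over the iPad.
ratios = {name: ipad2 / ipad for name, (ipad, ipad2) in int_single.items()}

# Arithmetic mean over all six tests, then with the outlier removed.
avg_all = mean(ratios.values())
avg_no_outlier = mean(
    r for name, r in ratios.items() if name != "Text Decompression"
)

print(f"All tests:       {(avg_all - 1) * 100:.0f}% faster")
print(f"Without outlier: {(avg_no_outlier - 1) * 100:.0f}% faster")
```

Dropping Text Decompression pulls the average down from about 31% to about 20%, which is how the result lines up with the IPC figure NVIDIA quoted.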
|Overall||Integer||Floating Point||Memory||Stream|
|Apple iPad 2||750||688||915||787||324|
The increases in integer performance and memory bandwidth are likely what will have the largest impact on your experience. The fact that we're seeing big gains in single as well as multi-threaded workloads means the performance improvement should be universal across all CPU-bound apps.
The Motorola Xoom we recently reviewed scored a few percent slower than the iPad 2 in SunSpider. Because the two tablets run different OSes and browsers, it's difficult to conclude much when comparing the A5 to Tegra 2.
The move from the A4 in the iPad 1 to the A5 in the iPad 2 boosts scores by 47%. More impressive however is just how much faster the Xoom is here. I suspect this has more to do with Google's software optimizations in the Honeycomb browser than hardware, but let's see how these tablets fare in our web page loading tests.
We debuted an early version of our 2011 web page loading tests in the Xoom review. Two things have changed since then: 1) iOS 4.3 came out, and 2) we changed our timing methods to produce more accurate results. It turns out that Honeycomb's browser was stopping our page load timer sooner than iOS', which resulted in some funny numbers when we got to the 4.3/Honeycomb comparison. To ensure accuracy we went back to timing by hand (each test was repeated at least 5 times and we present an average of the results). We also added two more pages to the test suite (Digg and Facebook).
The iPad 2 generally loads web pages faster than the Xoom. On average it's a ~20% increase in performance. I wouldn't say that the improvement is necessarily noticeable when surfing most sites, but it's definitely measurable.
The move to iOS 4.3 really narrowed the gap between the original iPad and the Xoom. In some cases the two actually render pages in the same amount of time; however, that's typically for lighter pages that are easy to render. Up the complexity and the Xoom easily distances itself from the original iPad.
We'll touch on this more in the full review but it's not all about performance when talking about web browsing between the iPad 2 and the Xoom. Although the iPad 2 may have faster render times on average, the Xoom still supports tabbed browsing which definitely has its advantages.