GPU Performance

All of our discussions around the new iPad and its silicon thus far have been in the theoretical space. Unfortunately the state of Android/iOS benchmarking is abysmal at best today. Convincing game developers to include useful benchmarks and timedemo modes in their games is seemingly impossible without a suitably large check. I have no doubt this will happen eventually, but today we're left with some great games and no way to benchmark them.

Without suitable game benchmarks, we rely on GLBenchmark quite a bit to help us in evaluating mobile GPU performance. Although even the current most stressful GLBenchmark test (Egypt) is a far cry from what modern Android/iOS games look like, it's the best we've got today.

We'll start out with the synthetic tests, which should show us roughly a 2x increase in performance compared to the iPad 2. Remember that the PowerVR SGX 543MP4 simply bundles four SGX 543 cores instead of two. Since we're still on a 45nm LP process, GPU clocks haven't increased, so we're looking at a pure doubling of virtually all GPU resources.

GLBenchmark 2.1—Fill Test

GLBenchmark 2.1—Triangle Test (White)

GLBenchmark 2.1—Triangle Test (Textured, Fragment Lit)

Indeed we see a roughly 2x increase in triangle and fill rates. Below we have the output from GLBenchmark's low level tests. Pay particular attention to how, at 1024 x 768, performance doubles compared to the iPad 2, but at 2048 x 1536 performance can drop to well below what the iPad 2 was able to deliver at 1024 x 768. It's because of this drop in performance at the iPad's native resolution that we won't see many (if any at all) visually taxing games run at anywhere near 2048 x 1536.

GLBenchmark 2.1.3 Low Level Comparison

Test                                    iPad 2 (1024 x 768)    iPad 3 (1024 x 768)    iPad 3 (2048 x 1536)   ASUS TF Prime
Trigonometric test - vertex weighted    35 fps                 60 fps                 57 fps                 47 fps
Trigonometric test - fragment weighted  7 fps                  14 fps                 4 fps                  20 fps
Trigonometric test - balanced           5 fps                  10 fps                 2 fps                  9 fps
Exponential test - vertex weighted      59 fps                 60 fps                 60 fps                 41 fps
Exponential test - fragment weighted    25 fps                 49 fps                 13 fps                 18 fps
Exponential test - balanced             19 fps                 37 fps                 8 fps                  7 fps
Common test - vertex weighted           49 fps                 60 fps                 60 fps                 35 fps
Common test - fragment weighted         8 fps                  16 fps                 4 fps                  28 fps
Common test - balanced                  6 fps                  13 fps                 2 fps                  12 fps
Geometric test - vertex weighted        57 fps                 60 fps                 60 fps                 27 fps
Geometric test - fragment weighted      12 fps                 24 fps                 6 fps                  20 fps
Geometric test - balanced               9 fps                  18 fps                 4 fps                  9 fps
For loop test - vertex weighted         59 fps                 60 fps                 60 fps                 28 fps
For loop test - fragment weighted       30 fps                 57 fps                 16 fps                 42 fps
For loop test - balanced                22 fps                 43 fps                 11 fps                 15 fps
Branching test - vertex weighted        58 fps                 60 fps                 60 fps                 45 fps
Branching test - fragment weighted      58 fps                 60 fps                 30 fps                 46 fps
Branching test - balanced               22 fps                 43 fps                 16 fps                 16 fps
Array test - uniform array access       59 fps                 60 fps                 60 fps                 60 fps
Fill test - Texture Fetch               1001483136 texels/s    1977874688 texels/s    1904501632 texels/s    415164192 texels/s
Triangle test - white                   65039568 triangles/s   133523176 triangles/s  85110008 triangles/s   55729532 triangles/s
Triangle test - textured                56129984 triangles/s   116735856 triangles/s  71362616 triangles/s   54023840 triangles/s
Triangle test - textured, vertex lit    45314484 triangles/s   93638456 triangles/s   46841924 triangles/s   28916834 triangles/s
Triangle test - textured, fragment lit  43527292 triangles/s   92831152 triangles/s   39277916 triangles/s   26935792 triangles/s
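
To make the scaling in the table above easier to see, here is a rough Python sketch that computes two ratios for the fragment-weighted tests: the new iPad vs. the iPad 2 at the same resolution, and the new iPad at native resolution vs. the iPad 2 at 1024 x 768. The fps values are copied from the table; the function and variable names are ours and purely illustrative. The Branching test is left out because its 1024 x 768 results sit at or near 60 fps and are likely vsync-limited. Since the benchmark reports whole frames per second, treat the ratios as approximate.

```python
# Fragment-weighted results in fps, copied from the low level comparison table:
# (iPad 2 @ 1024x768, iPad 3 @ 1024x768, iPad 3 @ 2048x1536)
FRAGMENT_WEIGHTED_FPS = {
    "Trigonometric": (7, 14, 4),
    "Exponential": (25, 49, 13),
    "Common": (8, 16, 4),
    "Geometric": (12, 24, 6),
    "For loop": (30, 57, 16),
}


def pixel_count(width, height):
    """Number of pixels shaded per frame at a given resolution."""
    return width * height


def scaling(results):
    """Return {test: (same-resolution gain, native-resolution ratio vs. iPad 2)}."""
    ratios = {}
    for name, (ipad2, ipad3_low, ipad3_native) in results.items():
        ratios[name] = (ipad3_low / ipad2, ipad3_native / ipad2)
    return ratios


# 2048 x 1536 has four times the pixels of 1024 x 768.
PIXEL_RATIO = pixel_count(2048, 1536) / pixel_count(1024, 768)

if __name__ == "__main__":
    print(f"Pixel ratio, native vs. 1024 x 768: {PIXEL_RATIO:.1f}x")
    for name, (same_res, native) in scaling(FRAGMENT_WEIGHTED_FPS).items():
        print(f"{name:<14} same-res gain {same_res:.2f}x, "
              f"native vs. iPad 2 {native:.2f}x")
```

The same-resolution gain lands at about 2x in every case, while the native-resolution ratio sits close to 0.5x. That is what you would expect from twice the GPU resources being asked to shade four times the pixels.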

GLBenchmark also includes two tests designed to be representative of a workload you could see in an actual 3D game. The older Pro test uses OpenGL ES 1.0 while Egypt is an ES 2.0 test. These tests can either run onscreen at the device's native resolution with vsync enabled, or render offscreen at 1280 x 720 with vsync disabled. The latter offers us a way to compare GPUs without device screen resolution creating unfair advantages.
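
The vsync distinction matters more than it might seem. As a simplified sketch (assuming a 60 Hz display and ignoring the frame-time quantization that real vsync introduces), an onscreen result can never exceed the refresh rate, so two GPUs with very different headroom can report the same number. The function name and the sample frame rates below are hypothetical, chosen only to illustrate the cap.

```python
REFRESH_HZ = 60  # assumed display refresh rate


def reported_fps(uncapped_fps, vsync=True, refresh_hz=REFRESH_HZ):
    """Simplified model: with vsync on, the average frame rate is clamped
    to the display refresh rate; with vsync off it is reported as-is."""
    if vsync:
        return min(uncapped_fps, refresh_hz)
    return uncapped_fps


if __name__ == "__main__":
    # Hypothetical uncapped frame rates for three imaginary GPUs.
    for raw in (45, 90, 150):
        print(f"raw {raw:>3} fps -> onscreen {reported_fps(raw):>3} fps, "
              f"offscreen {reported_fps(raw, vsync=False):>3} fps")
```

In this model the 90 fps and 150 fps parts are indistinguishable onscreen, which is why the offscreen, vsync-disabled numbers are the ones to use for cross-device comparisons.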

Unfortunately there was a bug in the iOS version of GLBenchmark 2.1.2 that resulted in all on-screen benchmarks running at 1024 x 768 rather than the new iPad's native 2048 x 1536 resolution. This is why all of the native GLBenchmark scores from the new iPad are capped at 60 fps. It's not because the new GPU is fast enough to render at speeds above 60 fps at 2048 x 1536; it's because the benchmark is actually showing performance at 1024 x 768. Luckily, GLBenchmark 2.1.3 fixes this problem and delivers results at the new iPad's native screen resolution:

GLBenchmark 2.1—Egypt (Standard)

GLBenchmark 2.1—Pro (Standard)

Surprisingly enough, the A5X is actually fast enough to complete these tests at over 50 fps. Perhaps this is more of an indication of how light the Egypt workload has become, as the current crop of Retina Display enhanced 3D titles for the iPad all render offscreen to a non-native resolution due to performance constraints. The bigger takeaway is that with the 543MP4 and a quad-channel LP-DDR2 interface, it is possible to run a 3D game at 2048 x 1536 and deliver playable frame rates. It won't be the prettiest game around, but it's definitely possible.

The offscreen results give us the competitive analysis that we've been looking for. With a ~2x die size advantage, the fact that we're seeing a 2-3x gap in performance here vs. NVIDIA's Tegra 3 isn't surprising:

GLBenchmark 2.1—Egypt—Offscreen 720p

GLBenchmark 2.1—Pro—Offscreen 720p

The bigger worry is what happens when the first 1920 x 1200 enabled Tegra 3 tablets start shipping. With (presumably) no additional GPU horsepower or memory bandwidth under the hood, we'll see this gap widen.
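
For a sense of scale, here is a back-of-the-envelope Python sketch of what the move to 1920 x 1200 means for on-screen rendering. It assumes current Tegra 3 tablets use a 1280 x 800 panel (the article does not state this) and that a fill-bound workload's frame rate falls in inverse proportion to pixel count, which is a deliberate simplification. The 45 fps starting point is a hypothetical figure, not a measured result.

```python
def pixel_count(width, height):
    """Number of pixels rendered per frame at a given resolution."""
    return width * height


def estimate_native_fps(fps_at_base, pixel_ratio):
    """Naive estimate for a fill-bound workload: frame rate scales
    inversely with the number of pixels rendered per frame."""
    return fps_at_base / pixel_ratio


# Assumption: 1280 x 800 is the current Tegra 3 tablet panel resolution.
BASE_RES = (1280, 800)
NEW_RES = (1920, 1200)

PIXEL_RATIO = pixel_count(*NEW_RES) / pixel_count(*BASE_RES)

if __name__ == "__main__":
    hypothetical_fps = 45.0  # illustrative only, not a benchmark result
    print(f"Pixel increase: {PIXEL_RATIO:.2f}x")
    print(f"{hypothetical_fps:.0f} fps at 1280 x 800 -> "
          f"about {estimate_native_fps(hypothetical_fps, PIXEL_RATIO):.0f} fps "
          f"at 1920 x 1200")
```

Under those assumptions the same GPU has 2.25x as many pixels to fill, so a fill-bound title would run at well under half its previous frame rate unless it renders below native resolution.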


234 Comments


  • name99 - Friday, March 30, 2012 - link

    Just to clarify, this is NOT some Apple proprietary thing. The Apple ports are following the USB charging spec. This is an optional part of the spec, but any other manufacturer is also welcome to follow it --- if they care about the user experience.
  • darkcrayon - Thursday, March 29, 2012 - link

    All recent Macs (last 2-3 years) can supply additional power via their USB ports which is enough to charge an iPad that's turned on (though probably not if it's working very hard doing something). Most non-Mac computer USB ports can only deliver the standard amount of USB power, which is why you're seeing this.

    Your Lenovo *should* still recharge the iPad if the iPad is locked and sleeping, though it will do so very slowly.
  • dagamer34 - Friday, March 30, 2012 - link

    I did the calculations and it would take about 21 hours to recharge an iPad 3 on a normal non-fast charging USB port from dead to 100%. Keep in mind, we're talking about a battery that's larger in capacity than the 11" MacBook Air.
  • snoozemode - Thursday, March 29, 2012 - link

    http://www.qualcomm.com/media/documents/files/snap...
  • Aenean144 - Thursday, March 29, 2012 - link

    Anandtech: "iPhoto is a very tangible example of where Apple could have benefitted from having four CPU cores on A5X"

    Is iPhoto really the kind of app that can actually take advantage of 2 cores? If there is batch image processing functionality, certainly, though I don't know if iPhoto for iOS has this type of functionality. The slowness could just be from it being a 1.0 product, and further tuning and refinement will fix it.

    I'm typically highly skeptical of the generic "if the app is multithreaded, it can make use of all of the cores" line of thought. Basically all of the threads, save one, are typically just waiting on user input.
  • Anand Lal Shimpi - Thursday, March 29, 2012 - link

    It very well could be that iOS iPhoto isn't well written, but in using the editing tools I can typically use 60 - 95% of the A5X's two hardware threads. Two more cores, at the bare minimum, would improve UI responsiveness as it gives the scheduler another, lightly scheduled core to target.

    Alternatively, a 50% increase in operating frequency and an improvement in IPC could result in the same net benefit.

    Take care,
    Anand
  • shompa - Friday, March 30, 2012 - link

    *hint* Use top on an iOS/Android device and you will see 30-60 processes at all times. The single-threaded, single-program thinking is Windows specific and has been solved on Unix since the late 1960s. Today's Windows phones are all single threaded because the Windows kernel is not good at multithreading.

    With many processes running, it will always be beneficial to have additional cores. Apple has also solved it in OS X by adding Grand Central Dispatch to their development tools, making multithreaded programs easy.

    iPhoto for iPad: Editing 3 million pixels will demand a huge amount of CPU/GPU time + memory. Apple has so far been able to program elegant solutions around the limits of ARM CPUs by using NEON SIMD extensions and GPU acceleration. An educated guess is that iPhoto is not fully optimized and will be at a later time.

    (The integrated approach gives Apple a huge advantage over Android since Apple can accelerate stuff with SIMD. Google does not control the hardware and can therefore not optimize its code. That is one of the reasons why the single-core A4 was almost as fast as dual-core Tegras. I was surprised when Google managed to implement their own acceleration in Android 4.x. Instead of SIMD, Google uses GL, since all devices have graphics cards. This is the best feature in Android 4.x.)
  • name99 - Thursday, March 29, 2012 - link

    [quote]
    Apple’s design lifespan directly correlates to the maturity of the product line as well as the competitiveness of the market the product is in.
    [/quote]

    I think this is completely the wrong way to look at it. Look across the entire Apple product line.
    I'd say a better analysis of chassis is that when a product first comes out, Apple can't be sure how it will be used and perceived, so there is some experimentation with different designs. But as time goes by, the design becomes more and more perfected (yes yes, if you hate Apple we know your feelings about the use of this word) and so there's no need to change until something substantial drives a large change.

    Look, for example, at the evolution of iMac from the Luxo Jr version to the white all-in-one flatscreen, to the current aluminum-edged flatscreen which has been largely unchanged for what, five or six years now. Likewise for the MacBook Pro.
    Look at the MacBook Air. The first two revs showed the same experimentation, trying different curves and angles, but Apple (and I'd say customers) seems to feel that the current wedge shape is optimal --- a definite improvement on the previous MBA models, and without anything that obviously needs to be improved. (Perhaps the sharp edges could be rounded a little, and if someone could work out the mechanicals, perhaps the screen could tilt further back.)

    And people accept and are comfortable with this --- in spite of the "people buy Apple as a fashion statement" idiocy. No-one will be at all upset if the Ivy Bridge iMacs and MBAs and Mac Minis look like their predecessors (apart from minor changes like USB3 ports) --- in fact people expect it.

    So for iPhone and iPad. Might Apple keep using the same iPhone4 chassis for the next two years, with only minor changes? Why not? There's no obvious improvement it needs.
    (Except, maybe, a magnet on the side like iPad has, so you could slip a book-like case on it that covered the screen, and switched it on by opening the book.)
    Likewise for iPad.

    New must have features in phones/tablets (NFC? near-field charging? waterproof? built-in projector like Samsung Beam?) might change things. But absent those, really, the issue is not "Apple uses two year design cycles", it is "Apple perfects the design, then sticks with it".
  • mr_ripley - Thursday, March 29, 2012 - link

    "In situations where a game is available in both the iOS app store as well as NVIDIA's Tegra Zone, NVIDIA generally delivers a comparable gaming experience to what you get on the iPad... The iPad's GPU performance advantage just isn't evident in those cases..."

    Would you expect it to be if all the games you compare have not been optimized for the new iPad yet? They run at great frame rates but suffer in visuals or are only available at iPad 2 resolutions. The Tegra Zone games are clearly optimized for Tegra while their iOS counterparts are not optimized for the A5X, so of course the GPU advantage is not evident.

    This comparison does not seem fair unless there is a valid reason to believe that the Tegra Zone games cannot be further enhanced/optimized to take advantage of the new iPad hardware.

    I suspect that the Tegra Zone games optimized for A5X will offer tangibly superior performance and experience. And the fact that the real world performance suffers today does not mean we will not see it shortly.
  • Steelbom - Thursday, March 29, 2012 - link

    Exactly this.
