The GPU: Apple's Gift to Game Developers

The GPU side of the A5 is really what's most exciting. As we mentioned in our iPad 2 GPU Performance analysis, the A5 includes a dual-core PowerVR SGX 543 - also known as the SGX 543MP2. In our earlier article we showed the SGX 543MP2 easily beating both an iPad 1 and the Tegra 2 based Motorola Xoom.

To understand why the SGX 543MP2 has such a performance advantage we need to first remember that NVIDIA's Tegra 2 is nearly a year late. NVIDIA's first competitive ultra mobile GPU was supposed to be shipping in products in the first half of 2010; instead it found itself shipping in 2011. While NVIDIA is good at designing GPUs, it's not good enough that it can release a product and maintain a two-year performance advantage over the competition. Let's look at the architecture, shall we?

NVIDIA's Tegra 2 features a DirectX 9-class GPU. NVIDIA used to call it the GeForce ULP (Ultra Low Power) but now it's just GeForce. As a DX9 class GPU we're dealing with a conventional, non-unified shader architecture. While all OpenGL ES 2.0 GPUs can execute pixel and vertex shader instructions, the GeForce in Tegra 2 runs pixel and vertex shaders on separate groups of hardware.

NVIDIA calls each pixel and vertex shader ALU a core. The Tegra 2 has four pixel shader cores and four vertex shader cores. The four pixel shader ALUs make up a single Vec4 and the same goes for the four vertex shader ALUs. NVIDIA wouldn't elaborate on what limitations exist when dispatching operations to the cores. All pixel shader operations happen at 20 bits per component of precision, while all vertex shader operations happen at 32 bits per component.

Each core is capable of executing one multiply+add (MAD) operation per clock. Do the math and that works out to be a peak rate of 8 MADs per clock for the entire GPU. The maximum operating frequency for the Tegra 2 GeForce GPU is 300MHz, however device vendors may run the GPU at a lower frequency to save on power. At 300MHz this works out to be 4.8 GFLOPS (counting a MAD as two FLOPs).
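The arithmetic above is easy to check in a few lines of Python. This is only a sketch: the function name `peak_gflops` and its parameters are our own illustration, and it follows the same convention of counting one MAD as two FLOPs.

```python
def peak_gflops(alus, mads_per_alu_per_clock, clock_mhz):
    """Theoretical peak throughput in GFLOPS, counting one MAD as two FLOPs."""
    flops_per_clock = alus * mads_per_alu_per_clock * 2
    return flops_per_clock * clock_mhz / 1000.0

# Tegra 2: 4 pixel + 4 vertex shader cores, one MAD per core per clock
tegra2 = peak_gflops(alus=8, mads_per_alu_per_clock=1, clock_mhz=300)
print(f"Tegra 2 GeForce @ 300MHz: {tegra2:.1f} GFLOPS")
```

Keep in mind this is a theoretical peak. A vendor running the GPU below 300MHz scales the result down linearly.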

Imagination Technologies' PowerVR SGX 543MP2 is fundamentally a bigger GPU than the GeForce in NVIDIA's Tegra 2. Let's go through the math.

The SGX 543 features four USSE2 pipes. This is a unified shader architecture, so both vertex and pixel shader code run on the same set of hardware. The benefit of this approach is better performance in peaky situations, where you're running mostly vertex or mostly pixel shader code rather than a balance that's perfectly tailored to your architecture. The Tegra 2 will only run at peak efficiency if it encounters a mix of 50% vertex and 50% pixel shader code. The PowerVR SGX series never has to leave execution pipes idle because of the instruction mix.
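To put a rough number on the instruction mix argument, here is a toy model. It is our own simplification and not a description of how either GPU actually schedules work: it assumes shader work divides perfectly across ALUs, ignores everything except ALU throughput, and uses hypothetical function names.

```python
def split_utilization(vertex_fraction, vertex_alus=4, pixel_alus=4):
    """Fraction of ALU capacity kept busy on a split (non-unified) design.

    vertex_fraction is the share of shader work that is vertex work (0..1).
    The workload finishes when the slower of the two ALU groups finishes.
    """
    pixel_fraction = 1.0 - vertex_fraction
    time_split = max(vertex_fraction / vertex_alus, pixel_fraction / pixel_alus)
    time_ideal = 1.0 / (vertex_alus + pixel_alus)  # every ALU busy the whole time
    return time_ideal / time_split

def unified_utilization(vertex_fraction):
    """In this toy model a unified design can run any work on any ALU."""
    return 1.0

for v in (0.5, 0.25, 0.1):
    print(f"{v:.0%} vertex work: split {split_utilization(v):.0%}, "
          f"unified {unified_utilization(v):.0%}")
```

Under these assumptions the split design only reaches full utilization at a 50/50 mix, and the further the workload leans toward one shader type the more of the other group sits idle.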

Each USSE2 pipe has a 4-wide vector ALU capable of cranking out 4 MADs per clock. Two of these pipes are enough to equal the peak throughput of what NVIDIA built in Tegra 2, but the PowerVR SGX 543 has four of them. As for the MP2? Go ahead and double that number again. The SGX 543MP2 is simply two 543s placed next to one another.

All of this works out to be 16 MADs per clock for the SGX 543 and 32 MADs per clock for the SGX 543MP2. At 200MHz that's 12.8 GFLOPS and at 250MHz we're talking about 16 GFLOPS.
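The build-up from pipes to cores to GFLOPS can be sketched as follows. The constant and function names are ours, chosen for illustration, and the clock speeds are the two candidates discussed above rather than confirmed A5 clocks.

```python
MADS_PER_PIPE = 4      # each USSE2 pipe has a 4-wide vector ALU
PIPES_PER_CORE = 4     # one SGX 543 core has four USSE2 pipes
FLOPS_PER_MAD = 2      # a multiply plus an add

def sgx543_gflops(cores, clock_mhz):
    """Peak GFLOPS for an SGX 543 configuration with the given core count."""
    mads_per_clock = cores * PIPES_PER_CORE * MADS_PER_PIPE
    return mads_per_clock * FLOPS_PER_MAD * clock_mhz / 1000.0

for clock_mhz in (200, 250):
    print(f"SGX 543MP2 @ {clock_mhz}MHz: {sgx543_gflops(2, clock_mhz):.1f} GFLOPS")
```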

Mobile SoC GPU Comparison

| | PowerVR SGX 530 | PowerVR SGX 535 | PowerVR SGX 540 | PowerVR SGX 543 | PowerVR SGX 543MP2 | GeForce ULP | Kal-El GeForce |
|---|---|---|---|---|---|---|---|
| SIMD Name | USSE | USSE | USSE | USSE2 | USSE2 | Core | Core |
| # of SIMDs | 2 | 2 | 4 | 4 | 8 | 8 | 12 |
| MADs per SIMD | 2 | 2 | 2 | 4 | 4 | 1 | ? |
| Total MADs | 4 | 4 | 8 | 16 | 32 | 8 | ? |
| GFLOPS @ 200MHz | 1.6 GFLOPS | 1.6 GFLOPS | 3.2 GFLOPS | 6.4 GFLOPS | 12.8 GFLOPS | 3.2 GFLOPS | ? |
| GFLOPS @ 300MHz | 2.4 GFLOPS | 2.4 GFLOPS | 4.8 GFLOPS | 9.6 GFLOPS | 19.2 GFLOPS | 4.8 GFLOPS | ? |

At its lowest expected clock speed, the 543MP2 already has over twice the compute power of the Tegra 2's GPU at its highest operating frequency. Take into account that the A5 likely has more memory bandwidth than Tegra 2, and that the SGX 543MP2 is a tile-based architecture with lower bandwidth requirements, and the performance numbers we talked about last time shouldn't be all that surprising.
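A quick sanity check of that "over twice" claim, using only the per-clock figures from this page. The 200MHz value is the lowest clock considered here, not a confirmed A5 specification, and the variable names are ours.

```python
FLOPS_PER_MAD = 2

SGX543MP2_MADS_PER_CLOCK = 32
TEGRA2_MADS_PER_CLOCK = 8
TEGRA2_MAX_CLOCK_MHZ = 300
SGX543MP2_LOW_CLOCK_MHZ = 200   # lowest clock considered in the article

tegra2_peak = TEGRA2_MADS_PER_CLOCK * FLOPS_PER_MAD * TEGRA2_MAX_CLOCK_MHZ / 1000.0
sgx_low = SGX543MP2_MADS_PER_CLOCK * FLOPS_PER_MAD * SGX543MP2_LOW_CLOCK_MHZ / 1000.0

ratio = sgx_low / tegra2_peak
# Clock at which the 543MP2's peak would merely equal Tegra 2's peak
break_even_mhz = TEGRA2_MAX_CLOCK_MHZ * TEGRA2_MADS_PER_CLOCK / SGX543MP2_MADS_PER_CLOCK

print(f"543MP2 @ 200MHz vs Tegra 2 @ 300MHz: {ratio:.2f}x")
print(f"Break-even clock for the 543MP2: {break_even_mhz:.0f}MHz")
```

The break-even figure is the more telling one: on paper the 543MP2 would only need a small fraction of Tegra 2's clock to match its peak ALU throughput.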

The real competition for the SGX 543MP2 will be NVIDIA's Kal-El. That part is expected to ship on time and will feature a boost in core count: from 8 to 12. The ratio of pixel to vertex shader cores is not known at this point but I'm guessing it won't be balanced anymore. NVIDIA is promising 3x the GPU performance out of Kal-El so I suspect that we'll see an increase in throughput per core.

GPU Performance

Taken from our iPad 2 GPU Performance Preview:

As always we turn to GLBenchmark 2.0, a benchmark crafted by a bunch of developers who either have or had experience doing development work for some of the big dev houses in the industry. We'll start with some of the synthetics.

Over the course of PC gaming evolution we noticed a significant increase in geometry complexity. We'll likely see a similar evolution with games in the ultra mobile space, and as a result this next round of ultra mobile GPUs will seriously ramp up geometry performance.

Here we look at two different geometry tests amounting to the (almost) best and worst case triangle throughput measured by GLBenchmark 2.0. First we have the best case scenario - a textured triangle:

Geometry Throughput - Textured Triangle Test

The original iPad could manage 8.7 million triangles per second in this test. The iPad 2? 29 million. An increase of over 3x. Developers with existing titles on the iPad could conceivably triple geometry complexity with no impact on performance on the iPad 2.

Now for the more complex case - a fragment lit triangle test:

Geometry Throughput - Fragment Lit Triangle Test

The performance gap widens. While the PowerVR SGX 535 in the A4 could barely break 4 million triangles per second in this test, the PowerVR SGX 543MP2 in the A5 manages just under 20 million. There's just no competition here.

I mentioned an improvement in texturing performance earlier. The GLBenchmark texture fetch test puts numbers to that statement:

Fill Rate - Texture Fetch

We're talking about nearly a 5x increase in texture fetch performance. This has to be due to more than an increase in the amount of texturing hardware. An improvement in throughput? Increase in memory bandwidth? It's tough to say without knowing more at this point.

Apple iPad vs. iPad 2

| Test | Apple iPad (PowerVR SGX 535) | Apple iPad 2 (PowerVR SGX 543MP2) |
|---|---|---|
| Array test - uniform array access | 3412.4 kVertex/s | 3864.0 kVertex/s |
| Branching test - balanced | 2002.2 kShaders/s | 11412.4 kShaders/s |
| Branching test - fragment weighted | 5784.3 kFragments/s | 22402.6 kFragments/s |
| Branching test - vertex weighted | 3905.9 kVertex/s | 3870.6 kVertex/s |
| Common test - balanced | 1025.3 kShaders/s | 4092.5 kShaders/s |
| Common test - fragment weighted | 1603.7 kFragments/s | 3708.2 kFragments/s |
| Common test - vertex weighted | 1516.6 kVertex/s | 3714.0 kVertex/s |
| Geometric test - balanced | 1276.2 kShaders/s | 6238.4 kShaders/s |
| Geometric test - fragment weighted | 2000.6 kFragments/s | 6382.0 kFragments/s |
| Geometric test - vertex weighted | 1921.5 kVertex/s | 3780.9 kVertex/s |
| Exponential test - balanced | 2013.2 kShaders/s | 11758.0 kShaders/s |
| Exponential test - fragment weighted | 3632.3 kFragments/s | 11151.8 kFragments/s |
| Exponential test - vertex weighted | 3118.1 kVertex/s | 3634.1 kVertex/s |
| Fill test - texture fetch | 179116.2 kTexels/s | 890077.6 kTexels/s |
| For loop test - balanced | 1295.1 kShaders/s | 3719.1 kShaders/s |
| For loop test - fragment weighted | 1777.3 kFragments/s | 6182.8 kFragments/s |
| For loop test - vertex weighted | 1418.3 kVertex/s | 3813.5 kVertex/s |
| Triangle test - textured | 8691.5 kTriangles/s | 29019.9 kTriangles/s |
| Triangle test - textured, fragment lit | 4084.9 kTriangles/s | 19695.8 kTriangles/s |
| Triangle test - textured, vertex lit | 6912.4 kTriangles/s | 20907.1 kTriangles/s |
| Triangle test - white | 9621.7 kTriangles/s | 29771.1 kTriangles/s |
| Trigonometric test - balanced | 1292.6 kShaders/s | 3249.9 kShaders/s |
| Trigonometric test - fragment weighted | 1103.9 kFragments/s | 3502.5 kFragments/s |
| Trigonometric test - vertex weighted | 1018.8 kVertex/s | 3091.7 kVertex/s |
| Swapbuffer Speed | 600 | 599 |
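The multipliers quoted in the text fall straight out of the table. As a rough sketch, here is how to compute the iPad 2's speedup for a handful of rows; the values are copied from the table above, while the selection of rows and the names are our own.

```python
# (iPad, iPad 2) results copied from the table above
results = {
    "Fill test - texture fetch (kTexels/s)": (179116.2, 890077.6),
    "Triangle test - textured (kTriangles/s)": (8691.5, 29019.9),
    "Triangle test - textured, fragment lit (kTriangles/s)": (4084.9, 19695.8),
    "Branching test - balanced (kShaders/s)": (2002.2, 11412.4),
    "Branching test - vertex weighted (kVertex/s)": (3905.9, 3870.6),
}

def speedup(ipad, ipad2):
    """How many times faster the iPad 2 result is than the iPad result."""
    return ipad2 / ipad

speedups = {name: speedup(*pair) for name, pair in results.items()}

# Print the largest gains first
for name, s in sorted(speedups.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{s:5.2f}x  {name}")
```

Sorting this way also surfaces the outliers: not every row improves, and the vertex weighted branching test comes out marginally slower on the iPad 2.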

Enough with the synthetics - how much of an improvement does all of this yield in the actual GLBenchmark 2.0 game tests? Oh it's big.

GLBenchmark 2.0 Egypt

Without AA, the Egypt test runs at 5.4x the frame rate of the original iPad. It's even 3.7x the speed of the Tegra 2 in the Xoom running at 1280 x 800 (granted that's an iOS vs. Android comparison as well).

GLBenchmark 2.0 Egypt - FSAA

With AA enabled the iPad 2 advantage grows to 7x. In a game with the complexity of the Egypt test the original iPad wouldn't be remotely playable while the iPad 2 could run it smoothly.

The Pro test is a little more reasonable, showing a 3 - 4x increase in performance compared to the original iPad:

GLBenchmark 2.0 PRO

GLBenchmark 2.0 PRO - FSAA

While we weren't able to reach the 9x figure claimed by Apple (I'm not sure that you'll ever see 9x running real game code), a range of 3 - 7x in GLBenchmark 2.0 is more reasonable. In practice I'd expect something less than 5x but that's nothing to complain about.

189 Comments

  • synaesthetic - Sunday, March 20, 2011

    I have to agree, the 11" MBA is one extremely sexy piece of kit.

    I wish there was a similar option that wasn't branded with the half-eaten fruit of hipsterdom. And doesn't run OSX, which I don't particularly like.
  • snouter - Sunday, March 20, 2011

    iPad does have for real 10 hour battery life and is generally maintenance free. Charge it, pick it up, use it. But, the Air gets a solid 5 hours (gets me from coast to coast) and is also pretty much instant on and generates no heat and I never hear the fan. So, although the iPad has a clear advantage in battery life it has no clear advantage as a "consumption device" and it forces you to favor apps and it does not handle media files as well and it does not have flash, which, is still out there.

    As far as price, yeah, the 11" Air is 50% to 100% more expensive, but ULV Sandy Bridge will see a flood of products on the PC side of things that should have lower price tags and if some PC manufacturer would please step up and start taking product design seriously.

    I typed this on my Air, and I would probably type less and put less thought into it (the same dreaded way that BlackBerry effect has really been a setback for written communications with the half butt answers) on an iPad.

    Also, one last Air advantage, it has a screen on a hinge. I got so sick of holding the iPad or having to prop it up on things...

    The iPad is a +1 device, sure, but... I'm going to stick with the 2 pound laptops for a while.
  • nickdoc - Sunday, March 20, 2011

    Well, if I deserve to be called a hipster or dickhead by some poorly educated idiot with two brain cells (both of them obviously white) for owning an iPad along with a MacBook Air, Mac Mini Server, MacBook Pro 15 and 17", 27" Cinema Display, iPhone4 and something else I forgot, then so be it. I'm not offended in the knowledge who the comment came from. A really sad case. Can't help feeling sorry for you, Kuka-whatever-your-screen-name-was.

    It looks like the comments here have been written by people under the age of 45-50 because no one has ever mentioned glasses. Yes, those things people need to see what's in front of them, far and near. It's worse when you need both. Then you won't be so happy to do any kind of work on an iPhone or even surf the web. You would wish for a larger screen every time you are forced to switch from your normal glasses to your reading spectacles. Use a netbook? Even worse. A tablet is different and allows you to read with your nose practically replacing your fingers on that touch screen. Perfect!

    As a surgeon, I often have to show other people what I mean. This can be a scan, a plain radiograph, lab results and so on. Unless I have a big screen right there for all to see, the iPad is the gadget of choice. Give it to the team before surgery to look at scans with my notes right there on the screen, pass it around when on teaching rounds, give it to a frightened patient to reassure. Try doing the same with a smartphone or a netbook (useless toys that they are) and you will see how crazy that idea is.

    Basically, in my field, there is no end to the list of possible applications. This is combining consumption with creation. Therefore, before using such terms as dickheads, try to think a bit further than your own little world if your "processor" has that much power. If not, well... As I said, a very sad case.
  • Gunhedd - Sunday, March 20, 2011

    Thank you. I wish more folks would pipe in with the real-world capabilities and uses they're discovering. No matter though. Apple-hate isn't new. I dealt with it in the '90s when Apple really was in trouble. Apple currently firing on all cylinders just keeps giving haters more and more to bitch about. (Price of success perhaps?)

    Hipsters? Dickheads? WTF?

    This comment isn't about the review but the inane comments that invariably get trotted out by hater technogeeks that won't move out of their mother's basement, disappointed that all the flash-porn won't work on an iOS device. Instant "fail" (or whatever silly phrase the self-anointed, self-important digerati are using today) in their book. These folks need to get out and learn that most people are "not" like them. But that would require getting a life. (Which would probably be easier than getting a date...)

    (See? I can paint with the broad strokes too. ;)
  • softdrinkviking - Sunday, March 20, 2011

    I just wanted to say that I really enjoyed Alexander's glass article, it was a great read.
    My grandfather was a material scientist, so it brought back a lot of good memories.
  • AgeOfPanic - Sunday, March 20, 2011

    Thanks for the great review. Anandtech seems to be the best site for independent and in-depth reviews. Please keep that going, because there is too much fanboyism going around. Saying that, I have to admit that I lean towards the Android side, because I think it's much more suited towards the tech enthusiast. Right now my HD2 is running the newest Gingerbread 2.3.3 rom from XDA, something impossible with iOS. However, I'm typing this on my iPad and if you would ask me which tablet I would recommend to my parents right now, I would say the iPad.
    I myself will switch. The question is if I can hold out to the quad core SOC that have been announced for later this year or will go for a Xoom wifi only model. The iPad convinced me that a tablet is what I need most of the time. However, iOS is hopelessly outdated. No widgets, notifications are laughable and browsing is annoying. With no memory, switching between tabs means reloading almost every time. And loading is slow.
    That's also why I was so interested in your browser scores. Couple of things I noticed. First of all you switched back to manual measurements for the page loading, because the Honeycomb browser stopped the timer too early. Isn't that just a sign that it is fast or was it really, really early? Manual measurement has its own flaws though. Very susceptible to operator bias. I don't think you should report your scores in milliseconds then, because that implies an accuracy you just don't have. Furthermore, I would like to see error bars, so we can determine if these differences were really significant.
    Again, these are my comments. Thanks for the good work.
  • bjacobson - Sunday, March 20, 2011

    want it on android...
  • TEAMSWITCHER - Sunday, March 20, 2011

    I purchased an iPad 2 for my wife. I had been giving her my old MacBook Pro laptops, which at even four years old are complete overkill for her use. She adores the new iPad. It's far more portable and can be used in more situations than a laptop.

    Case in point, this week she created the family shopping list on her iPad 2 and brought it to the grocery store. She browses the WEB, FaceBook, games, EMAIL, and keeps all her favorite photos, movies, and music.

    From now on, I'll be hocking my used MacBooks on craigslist if I can. She doesn't even want a laptop anymore. That's the biggest issue I have - it's too good. Too many people will find that tablets are better and abandon their laptops altogether. Laptops will stop evolving, much like desktops did once laptops became popular.
  • Anand Lal Shimpi - Monday, March 21, 2011

    I agree that it's a better laptop for casual users. However the Flash limitation I believe is still a problem that prevents it from being a complete laptop replacement for even casual users (a lot of restaurant, automotive and photography websites are still unfortunately 100% flash based). As long as you have some access to a laptop however this is really a non-issue, except when traveling with only the iPad.

    Take care,
    Anand
  • alex2792 - Sunday, March 20, 2011

    I enjoyed reading the review, but it seemed a bit biased to me. While it's true that the iPad can't replace a laptop for content creation it works just fine in many fields. I sell annuities and the iPad has totally replaced my laptop when I'm on the go. I have designed presentations using Keynote before and it worked great. Whenever meeting a client I always bring my iPad instead of carrying paper brochures; in fact most of these clients end up getting an iPad themselves after playing with mine. Maybe Apple should pay me for advertising their product.