The GPU: Apple's Gift to Game Developers

The GPU side of the A5 is really what's most exciting. As we mentioned in our iPad 2 GPU Performance analysis, the A5 includes a dual-core PowerVR SGX 543 - also known as the SGX 543MP2. In our earlier article we showed the SGX 543MP2 easily beating both an iPad 1 and the Tegra 2 based Motorola Xoom.

To understand why the SGX 543MP2 has such a performance advantage we need to first remember that NVIDIA's Tegra 2 is nearly a year late. NVIDIA's first competitive ultra mobile GPU was supposed to be shipping in products in the first half of 2010; instead it found itself shipping in 2011. While NVIDIA is good at designing GPUs, it's not so good that it can release a product and maintain a two year performance advantage over the competition. Let's look at the architecture, shall we?

NVIDIA's Tegra 2 features a DirectX 9-class GPU. NVIDIA used to call it the GeForce ULP (Ultra Low Power) but now it's just GeForce. As a DX9 class GPU we're dealing with a conventional, non-unified shader architecture. While all OpenGL ES 2.0 GPUs can execute pixel and vertex shader instructions, the GeForce in Tegra 2 runs pixel and vertex shaders on separate groups of hardware.

NVIDIA calls each pixel and vertex shader ALU a core. The Tegra 2 has four pixel shader cores and four vertex shader cores. The four pixel shader ALUs make up a single Vec4 and the same goes for the four vertex shader ALUs. NVIDIA wouldn't elaborate on what limitations exist when dispatching operations to the cores. All pixel shader operations happen at 20 bits of precision per component while all vertex shader operations happen at 32 bits per component.

Each core is capable of executing one multiply+add (MAD) operation per clock. Do the math and that works out to be a peak rate of 8 MADs per clock for the entire GPU. The maximum operating frequency for the Tegra 2 GeForce GPU is 300MHz, however device vendors may run the GPU at a lower frequency to save on power. At 300MHz this works out to be 4.8 GFLOPS (counting a MAD as two FLOPs).
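Spelled out, the peak-rate arithmetic above looks like this. It only restates figures already given in the text (four pixel plus four vertex cores, one MAD per core per clock, a 300MHz maximum clock) and counts each MAD as two floating-point operations:

```latex
\[
(4_{\text{pixel}} + 4_{\text{vertex}})\ \text{cores}
\times 1\ \frac{\text{MAD}}{\text{core}\cdot\text{clock}}
= 8\ \frac{\text{MAD}}{\text{clock}}
\]
\[
8\ \frac{\text{MAD}}{\text{clock}}
\times 2\ \frac{\text{FLOP}}{\text{MAD}}
\times 300 \times 10^{6}\ \frac{\text{clock}}{\text{s}}
= 4.8 \times 10^{9}\ \frac{\text{FLOP}}{\text{s}}
= 4.8\ \text{GFLOPS}
\]
```

This is a theoretical peak; a device vendor running the GPU below 300MHz scales the result down linearly.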

Imagination Technologies' PowerVR SGX 543MP2 is fundamentally a bigger GPU than the GeForce in NVIDIA's Tegra 2. Let's go through the math.

The SGX 543 features four USSE2 pipes. This is a unified shader architecture so both vertex and pixel shader code runs on the same set of hardware. The benefit of this approach is better performance in peaky situations, where you're running mostly vertex or mostly pixel shader code rather than a balance that's perfectly tailored to the architecture. The Tegra 2 will only run at peak efficiency if it encounters a mix of 50% vertex and 50% pixel shader code. The PowerVR SGX series, by contrast, shouldn't have execution pipes sitting idle simply because of the instruction mix.
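To put rough numbers on the instruction-mix argument, here is a toy model in Python. It is a sketch under simplifying assumptions, not a description of how either GPU actually schedules work: the function name `split_utilization` is made up for illustration, the workload is reduced to a single vertex/pixel ratio, and texturing, memory stalls and scheduling overhead are ignored. A unified design is assumed to keep every ALU busy, which is the idealized case.

```python
def split_utilization(vertex_share, vertex_units=4, pixel_units=4):
    """Fraction of ALUs a fixed vertex/pixel split keeps busy (toy model).

    vertex_share is the fraction of total shader work that is vertex work;
    the remainder is pixel work. The job is done when the slower group of
    units finishes its share.
    """
    pixel_share = 1.0 - vertex_share
    total_units = vertex_units + pixel_units
    # Time each group needs for its share of the work (arbitrary units).
    time_needed = max(vertex_share / vertex_units, pixel_share / pixel_units)
    # An idealized unified design spreads all work over every unit.
    ideal_time = 1.0 / total_units
    return ideal_time / time_needed


for share in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"{share:.0%} vertex work: split design keeps "
          f"{split_utilization(share):.0%} of its ALUs busy")
```

In this model the 4+4 split only reaches full utilization at an even mix, and falls to half when the work is entirely vertex or entirely pixel code.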

Each USSE2 pipe has a 4-wide vector ALU capable of cranking out 4 MADs per clock. Two of these pipes are enough to equal the peak throughput of what NVIDIA built in Tegra 2, but the PowerVR SGX 543 has four of them. As for the MP2? Go ahead and double that number again. The SGX 543MP2 is simply two 543s placed next to one another.

All of this works out to be 16 MADs per clock for the SGX 543 and 32 MADs per clock for the SGX 543MP2. At 200MHz that's 12.8 GFLOPS and at 250MHz we're talking about 16 GFLOPS.

Mobile SoC GPU Comparison

| | PowerVR SGX 530 | PowerVR SGX 535 | PowerVR SGX 540 | PowerVR SGX 543 | PowerVR SGX 543MP2 | GeForce ULP | Kal-El GeForce |
|---|---|---|---|---|---|---|---|
| SIMD Name | USSE | USSE | USSE | USSE2 | USSE2 | Core | Core |
| # of SIMDs | 2 | 2 | 4 | 4 | 8 | 8 | 12 |
| MADs per SIMD | 2 | 2 | 2 | 4 | 4 | 1 | ? |
| Total MADs | 4 | 4 | 8 | 16 | 32 | 8 | ? |
| GFLOPS @ 200MHz | 1.6 | 1.6 | 3.2 | 6.4 | 12.8 | 3.2 | ? |
| GFLOPS @ 300MHz | 2.4 | 2.4 | 4.8 | 9.6 | 19.2 | 4.8 | ? |
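The GFLOPS rows in the table follow directly from the SIMD counts and per-SIMD MAD rates. The short Python sketch below recomputes them, counting a MAD as two FLOPs as the article does. The `GPUS` list and the `peak_gflops` helper are illustrative names, and Kal-El is left out because its per-core throughput isn't known.

```python
# (name, number of SIMDs, MADs per SIMD per clock), taken from the table.
GPUS = [
    ("PowerVR SGX 530", 2, 2),
    ("PowerVR SGX 535", 2, 2),
    ("PowerVR SGX 540", 4, 2),
    ("PowerVR SGX 543", 4, 4),
    ("PowerVR SGX 543MP2", 8, 4),
    ("GeForce ULP (Tegra 2)", 8, 1),
]


def peak_gflops(simds, mads_per_simd, clock_mhz):
    """Theoretical peak GFLOPS, counting one MAD as two FLOPs."""
    return simds * mads_per_simd * 2 * clock_mhz / 1000.0


for name, simds, mads in GPUS:
    print(f"{name:22s} {simds * mads:3d} MADs/clock  "
          f"{peak_gflops(simds, mads, 200):5.1f} GFLOPS @ 200MHz  "
          f"{peak_gflops(simds, mads, 300):5.1f} GFLOPS @ 300MHz")
```

These are peak figures only; they say nothing about how much of that throughput a real workload can use.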

At its lowest expected clock speed, the 543MP2 already has over twice the compute power of the Tegra 2's GPU at its highest operating frequency (12.8 GFLOPS at 200MHz vs. 4.8 GFLOPS at 300MHz). Add to that the fact that the A5 likely has more memory bandwidth than Tegra 2, and that the SGX 543MP2 is a tile based architecture with lower bandwidth requirements, and the performance numbers we talked about last time shouldn't be all that surprising.

The real competition for the SGX 543MP2 will be NVIDIA's Kal-El. That part is expected to ship on time and will feature a boost in core count: from 8 to 12. The ratio of pixel to vertex shader cores is not known at this point but I'm guessing it won't be balanced anymore. NVIDIA is promising 3x the GPU performance out of Kal-El so I suspect that we'll see an increase in throughput per core.

GPU Performance

Taken from our iPad 2 GPU Performance Preview:

As always we turn to GLBenchmark 2.0, a benchmark crafted by a bunch of developers who either have or had experience doing development work for some of the big dev houses in the industry. We'll start with some of the synthetics.

Over the course of PC gaming evolution we noticed a significant increase in geometry complexity. We'll likely see a similar evolution with games in the ultra mobile space, and as a result this next round of ultra mobile GPUs will seriously ramp up geometry performance.

Here we look at two different geometry tests amounting to the (almost) best and worst case triangle throughput measured by GLBenchmark 2.0. First we have the best case scenario - a textured triangle:

Geometry Throughput - Textured Triangle Test

The original iPad could manage 8.7 million triangles per second in this test. The iPad 2? 29 million. An increase of over 3x. Developers with existing titles on the iPad could conceivably triple geometry complexity with no impact on performance on the iPad 2.

Now for the more complex case - a fragment lit triangle test:

Geometry Throughput - Fragment Lit Triangle Test

The performance gap widens. While the PowerVR SGX 535 in the A4 could barely break 4 million triangles per second in this test, the PowerVR SGX 543MP2 in the A5 manages just under 20 million. There's just no competition here.

I mentioned an improvement in texturing performance earlier. The GLBenchmark texture fetch test puts numbers to that statement:

Fill Rate - Texture Fetch

We're talking about nearly a 5x increase in texture fetch performance. This has to be due to more than an increase in the amount of texturing hardware. An improvement in throughput? Increase in memory bandwidth? It's tough to say without knowing more at this point.

Apple iPad vs. iPad 2

| Test | Apple iPad (PowerVR SGX 535) | Apple iPad 2 (PowerVR SGX 543MP2) |
|---|---|---|
| Array test - uniform array access | 3412.4 kVertex/s | 3864.0 kVertex/s |
| Branching test - balanced | 2002.2 kShaders/s | 11412.4 kShaders/s |
| Branching test - fragment weighted | 5784.3 kFragments/s | 22402.6 kFragments/s |
| Branching test - vertex weighted | 3905.9 kVertex/s | 3870.6 kVertex/s |
| Common test - balanced | 1025.3 kShaders/s | 4092.5 kShaders/s |
| Common test - fragment weighted | 1603.7 kFragments/s | 3708.2 kFragments/s |
| Common test - vertex weighted | 1516.6 kVertex/s | 3714.0 kVertex/s |
| Geometric test - balanced | 1276.2 kShaders/s | 6238.4 kShaders/s |
| Geometric test - fragment weighted | 2000.6 kFragments/s | 6382.0 kFragments/s |
| Geometric test - vertex weighted | 1921.5 kVertex/s | 3780.9 kVertex/s |
| Exponential test - balanced | 2013.2 kShaders/s | 11758.0 kShaders/s |
| Exponential test - fragment weighted | 3632.3 kFragments/s | 11151.8 kFragments/s |
| Exponential test - vertex weighted | 3118.1 kVertex/s | 3634.1 kVertex/s |
| Fill test - texture fetch | 179116.2 kTexels/s | 890077.6 kTexels/s |
| For loop test - balanced | 1295.1 kShaders/s | 3719.1 kShaders/s |
| For loop test - fragment weighted | 1777.3 kFragments/s | 6182.8 kFragments/s |
| For loop test - vertex weighted | 1418.3 kVertex/s | 3813.5 kVertex/s |
| Triangle test - textured | 8691.5 kTriangles/s | 29019.9 kTriangles/s |
| Triangle test - textured, fragment lit | 4084.9 kTriangles/s | 19695.8 kTriangles/s |
| Triangle test - textured, vertex lit | 6912.4 kTriangles/s | 20907.1 kTriangles/s |
| Triangle test - white | 9621.7 kTriangles/s | 29771.1 kTriangles/s |
| Trigonometric test - balanced | 1292.6 kShaders/s | 3249.9 kShaders/s |
| Trigonometric test - fragment weighted | 1103.9 kFragments/s | 3502.5 kFragments/s |
| Trigonometric test - vertex weighted | 1018.8 kVertex/s | 3091.7 kVertex/s |
| Swapbuffer Speed | 600 | 599 |
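The speedups quoted in the text ("over 3x" for textured triangles, "nearly a 5x increase" in texture fetch) can be checked against the raw scores. The Python snippet below is a minimal sketch that copies three rows from the iPad vs. iPad 2 results and divides them; `RESULTS` and `speedup` are illustrative names, not part of GLBenchmark.

```python
# (test, original iPad score, iPad 2 score), copied from the results table.
RESULTS = [
    ("Triangle test - textured", 8691.5, 29019.9),
    ("Triangle test - textured, fragment lit", 4084.9, 19695.8),
    ("Fill test - texture fetch", 179116.2, 890077.6),
]


def speedup(old_score, new_score):
    """How many times faster the new score is than the old one."""
    return new_score / old_score


for test, ipad, ipad2 in RESULTS:
    print(f"{test}: {speedup(ipad, ipad2):.2f}x")
```

The same division works for any other row, which makes it easy to spot the vertex weighted tests where the iPad 2 gains much less.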

Enough with the synthetics - how much of an improvement does all of this yield in the actual GLBenchmark 2.0 game tests? Oh it's big.

GLBenchmark 2.0 Egypt

Without AA, the Egypt test runs at 5.4x the frame rate of the original iPad. It's even 3.7x the speed of the Tegra 2 in the Xoom running at 1280 x 800 (granted that's an iOS vs. Android comparison as well).

GLBenchmark 2.0 Egypt - FSAA

With AA enabled the iPad 2 advantage grows to 7x. In a game with the complexity of the Egypt test the original iPad wouldn't be remotely playable while the iPad 2 could run it smoothly.

The Pro test is a little more reasonable, showing a 3 - 4x increase in performance compared to the original iPad:

GLBenchmark 2.0 PRO

GLBenchmark 2.0 PRO - FSAA

While we weren't able to reach the 9x figure claimed by Apple (I'm not sure that you'll ever see 9x running real game code), a range of 3 - 7x in GLBenchmark 2.0 is more reasonable. In practice I'd expect something less than 5x but that's nothing to complain about.


189 Comments


  • Anand Lal Shimpi - Monday, March 21, 2011 - link

    The Xoom review was really written from the perspective of an iPad alternative, while I felt like we covered much of what made the iPad 2 different in our preview and wanted to focus on the bigger picture in the review.

    The Xoom's multitasking and notifications I believe make it easier to integrate into my workflow, but still not perfect. However Apple has better ergonomics than the Xoom, seemingly better (non-Flash) webpage compatibility, better stability and a smoother UI so it's a tradeoff.

    Personally, I'd probably carry the iPad 2 thanks to improved ergonomics (especially with a smart cover) and non-smooth UI frame rates do bother me. But given my workflow neither is sufficient for me to use exclusively when traveling. That's why I mention that both camps have things to work on, whichever gets there first should get your money if you're really on the fence.

    Take care,
    Anand
  • Death666Angel - Sunday, March 20, 2011 - link

    How do you get to the number in the chart? It would make sense to use the average of all 4 displays, but you don't seem to do that:
    406 + 409 + 352 + 354 = 1521
    1521 / 4 = 380,25 ~ 380
    Am I missing something here?
    Also, the contrast should be 861:
    966 + 842 + 778 + 859 = 3445
    3445 / 4 = 861,25 ~ 861
    Black levels should be better however:
    0,42 + 0,45 + 0,49 + 0,41 = 1,77
    1,77 / 4 = 0,4425 ~ 0,44
  • kmmatney - Sunday, March 20, 2011 - link

    The point of having the numbers separate was to show the difference between the WiFi and "WiFi+3G" versions.
  • Death666Angel - Monday, March 21, 2011 - link

    And my question wasn't about that at all. The numbers in the actual charts they use for comparison against the iPad1 and the Xoom are not corresponding to any of the 4 distinct iPad2s. So I was wondering where they got the numbers from, if they averaged them or whatnot. If they did average them, then they made a few mistakes in the process. :-) If they got them through some other means it would still be interesting to know which they used.
  • buff_samurai - Sunday, March 20, 2011 - link


    speaking about workflow.

    I am running a small consulting company for food/pharma industry, my expertise is in analytical instrumentation. Right now I'm using a beefy PC for CAD/backups and IP4/ipad for everything else: emails, project management, crm and documentation all squeezed into a small and portable device (terminal).

    Although I am not 100% happy with exchange support in iOS, security, syncing etc I see myself more efficient than ever and that simply means more time/money in my pocket.

    Let's try a common scenario: in a car, take a call, pull over, grab a laptop from a bag, power up, check some details, email couple of pdfs and do all that with customer hanging on the phone. Repeat the whole thing 10 times or more - you will see where I am coming from. Or try to carry your laptop around any mid size production line, control room and boardrooms and impress other engineers with questions like: 'where can I plug my laptop' at the same time.

    I can understand that for most of heavy laptop users ipad is just useless but let's face the fact that there are millions of professionals on the road and all they care for is better response time and flexibility.

    I could spend hours listing applications where no PC (portable or not) can match a tablet but the bottom line is: when moving to new tech we need to overcome our habits first. ipad, xoom and others are like a nice and shiny screwdriver but you will never find a use for it with pockets full of nails. That means no reviewer should ever comment on any device without actually making it a primary tool for couple of months: and if there is no time/money for it - just focus on things that are traceable or you may lose your reputation.
  • darwiniandude - Sunday, March 20, 2011 - link

    I always love Anandtech reviews, they cover 'everything' really well, advantages and flaws with equal gusto.

    Thanks!

    Two points:
    1) Page 19 I think you're referring to iDisk, not iDrive. Doesn't really matter unless someone tries to Google it.

    2) With regard to web browsing, I know you're comparing these units as shipped, but I strongly recommend pro users consider 'iCab' from the AppStore, I don't use Safari much anymore. Proper tabs, full screen, downloading, browser user agent ID spoofing, way more powerful. Scroll pad to quickly navigate huge pages, gestures etc. It's very anti-iOS in that it's insanely powerful rather than designed to be simple, but I love it, specifically options like 'open bookmark in new tab' and 'open links from different domain in new tab' very customisable, plugins, blah blah blah. Anyway. It has Desktop style tabs. I wouldn't suggest you change the article, or review this or other 3rd party browsers because it's kinda beyond the scope of the review of the device, but it would be nice if people knew there were alternatives to give a more desktop style (still sans-flash) browser.
  • pja - Sunday, March 20, 2011 - link

    I have really wanted an iPad ever since they were released. Several months ago I had the money (the previous barrier to acquisition) so I went looking. I didn't want the 3G version, WiFi would be fine. But I knew I would not be happy unless I got the one with maximum ram. Well in Australia that was going to cost me over AUD1,000 (I thought they were much cheaper :-( ).

    Just before this I had built myself a new desktop with an AMD processor and graphics card; see I'm a fan of AMD (but not a bigot). So might I be better off with a netbook rather than an iPad. AMD had recently released the Brazos range. So I started to do some research.

    The result was my purchase of a Toshiba NB550D (the sexy orange one) which is a "little under-done" with the C50 Fusion Processor, only 1 Gb of memory and Windows 7 Starter. I have upgraded the memory to 2 Gb (still not enough) and installed Windows 7 Home Pro.

    The Toshiba is about the same size as an iPad but is much more functional, it has all my desktop PC's apps installed (particularly my favourite text editor (EditPlus), my browser (Firefox) and all the same bookmarks, etc. etc.) so when I travel I have everything I need and I didn't need to learn how to use new software.

    I still think the iPad is a great bit of gear but that's when I use the right side of my brain. My left side says "where's the value proposition?" We are all different but for me the left side of my brain always tends to win over the right side. I am very happy with "my" iPad alternative; more memory and the C-350 processor would be good (but not the larger form factor that seems to entail). Oh! I forgot to say that the total cost of the Toshiba (including hardware and software upgrades) was about AUD675 - more than AUD325 saving!

    Regards,
    Peter
  • Deepcover96 - Sunday, March 20, 2011 - link

    Great review. Anandtech's reviews are always well worth the wait. They are always thorough and I always learn something. I agree that it is a luxury device and it is hard to justify it for getting work done. I still purchased an iPad 1. I recently sold it to buy the iPad 2, as soon as I can find one. I do think you downplay how important the app selection is on iOS as compared to Android.
  • name99 - Sunday, March 20, 2011 - link

    Reading the summary of what all three authors think of iPad feels to me like someone who buys an iPod because it has calendar and contact functionality, and then is upset/surprised that it isn't a Palm.
    iPad is not a REPLACEMENT for a laptop/desktop, it is an AUGMENTATION. You use each for what they are good at. If you find yourself spending most of your time traveling and you need a full-featured computer during that time then, sure, adding iPad to the mix is stupid. But if you already have a laptop, and can afford it, iPad makes certain tasks a lot more pleasant.

    For my part, for example, my primary use for iPad is reading technical PDFs using Good Reader. I could read these on a laptop, but the keyboard really gets in the way (not to mention that the aspect ratio of the screen is inappropriate). If you don't do much reading of technical PDFs, this might seem dumb to you --- but I DO spend many hours a day reading these PDFs and I appreciate a tool that does the job properly, just like a professional carpenter doesn't use a $5 saw he bought at Walmart.

    The future of computing is not one device that does everything; it is multiple devices all optimized to a particular human form factor, that all work together --- an iPod nano AND an iPhone AND an iPad AND a laptop AND a desktop. Criticism of something germane to this vision is legitimate and sensible (and Apple's flailing regarding how much of the file metaphor it wants to present to users is a legitimate part of this criticism.) But complaints whose primary structure is "this device doesn't work exactly like a device I already own" are just stupid --- like complaining that a bicycle isn't a car.

    It's perfectly reasonable to say that you don't have a use for a certain class of device, especially because you already use something more powerful. I, for example, have no use for a Tivo or a video streaming device like WD Live or Roku --- I have a full-fledged computer hooked up to my TV. But it is unreasonable to go further than that, and I've observed plenty of non-techy people who are very happy with their WD Lives or Tivos.
    It's even more unreasonable to complain that "Tivo sucks because it doesn't play DVDs".

    Use some sense. Don't keep trying to use iPad for things it is no good at. Keep it in the bedroom, and use it to read, or to look up something quickly on the net, or to play a movie just before you go to sleep. Don't be insane and try to write a novel on it.
  • pja - Monday, March 21, 2011 - link

    "iPad is not a REPLACEMENT for a laptop/desktop, it is an AUGMENTATION. You use each for what they are good at. If you find yourself spending most of your time traveling and you need a full-featured computer during that time then, sure, adding iPad to the mix is stupid. But if you already have a laptop, and can afford it, iPad makes certain tasks a lot more pleasant."

    You must have either too much cash or too much time on your hands or both. A good business class laptop is AUD1,500+ while a top of the range iPad is AUD1,000 + here in Australia.

    It seems to me when you think about the iPad with your left brain there is very little functionality that a good netbook does not do both better and cheaper. However, I would agree that when you let your right brain rule then all of a sudden the iPad becomes an irresistible thing that you must possess. Unfortunately for me my left brain clicks in when I pull out my credit card.

    Peter
