The A6: What's Next?

Apple has somehow managed to get a lot of the mainstream press to believe it doesn't care about specs and that it competes entirely based on user experience. Simply looking at the facts tells us a different story entirely:

Apple SoCs
            2007     2008     2009        2010        2011        2012
Process     90nm     90nm     65nm        45nm        45nm        28/32nm
µArch       ARM11    ARM11    Cortex A8   Cortex A8   Cortex A9   ?
CPU Clock   412MHz   412MHz   600MHz      800MHz      800MHz      ?

Apple has been at the forefront of the mobile hardware race, particularly if we look at the iOS platform as a whole (iPad + iPhone). Apple was among the first to move from the ARM11 to the Cortex A8, and again among the first to make the jump to the Cortex A9. On the GPU side, Apple has been even more aggressive.

Apple hasn't stayed on the same process node for more than two generations, echoing a philosophy maintained even by the high-end PC GPU vendors. It also hasn't shipped the same CPU microarchitecture for more than two generations in a row.

Furthermore, Apple even seems to be comfortable combining a process shrink with a new architecture, as we saw with the iPhone 3GS. Migrating to both a new process technology and a new architecture in the same generation is generally thought of as risky, but if you can pull it off the benefits are substantial.

The truth of the matter is Apple is very focused on user experience, but it enables that experience by using the fastest hardware available on the market. With that in mind, what comes in 2012 with Apple's sixth-generation SoC?

It's fairly obvious that we'll see a process node shrink. Apple has been on 45nm for two generations now, and the entire market will be moving to 28/32nm next year. If Apple sticks with Samsung, it'll be on Samsung's 32nm LP process.

The CPU architecture is a bit of a question at this point. We already know that Qualcomm will be shipping its next-generation Krait architecture in devices in the first half of 2012. TI, on the other hand, will deliver an ARM Cortex A15-based competitor by the end of next year. The aggressive move would be for Apple to once again migrate to a new process and a new architecture together and debut a Cortex A15 design at 32nm next year.

Looking purely at historical evidence, it would seem likely that we'd get a 32nm dual-Cortex A9 design at higher clocks first. If Apple wants to release an iPad update early next year, that's likely what we'll see. That still doesn't preclude a late-2012 release of a dual-Cortex A15 solution, perhaps for use in the next iPhone.

Note that we haven't talked much about potential GPU options for Apple's next silicon. Given the huge upgrade we saw going into the A5 and the likely resolution targets for next-generation tablets, we should see pretty big gains there as well.

Comments

  • doobydoo - Friday, December 2, 2011 - link

    It's still absolute nonsense to claim that the iPhone 4S can only use '2x' the power when it has 7x the power available.

    Not only does the iPhone 4S support wireless streaming to TVs, making performance very important, there are also games ALREADY out which require this kind of GPU in order to run fast at the superior resolution of the iPhone 4S.

    Not only that, but you failed to take into account the typical life-cycle of iPhones - this phone has to be capable of performing well for around a year.

    The bottom line is that Apple really got one over on all the Android manufacturers with the GPU in the iPhone 4S - it's the best there is, in any phone, full stop. Trying to turn that into a criticism is outrageous.
  • PeteH - Tuesday, November 1, 2011 - link

    Actually it is about the architecture. How GPU performance scales with size is in large part dictated by the GPU architecture, and Imagination's architecture scales better than the other solutions.
  • loganin - Tuesday, November 1, 2011 - link

    And I showed above that Apple's chip isn't larger than Samsung's.
  • PeteH - Tuesday, November 1, 2011 - link

    But chip size isn't relevant, only GPU size is.

    All I'm pointing out is that not all GPU architectures scale equivalently with size.
  • loganin - Tuesday, November 1, 2011 - link

    But you're comparing two different architectures here, not two chips carrying the same architecture, so the scalability doesn't really matter. Also, is Samsung's GPU significantly smaller than the A5's?

    Now that we've discussed back and forth about nothing, you can see the problem with Lucian's argument. It was simply an attempt to make Apple look bad, and the technical correctness didn't really matter.
  • PeteH - Tuesday, November 1, 2011 - link

    What I'm saying is that Lucian's assertion, that the A5's GPU is faster because it's bigger, ignores the fact that not all GPU architectures scale the same way with size. A GPU of the same size but with a different architecture would have worse performance because of this.

    Put simply, architecture matters. You can't just throw silicon at a performance problem to fix it.
  • metafor - Tuesday, November 1, 2011 - link

    Well, you can. But it might be more efficient not to. At least with GPUs, putting two in there will pretty much double your performance on GPU-limited tasks.

    This is true of desktops (SLI) as well as mobile.

    Certain architectures are more area-efficient. But the point is, if all you care about is performance and you can eat the die area, you can just shove another GPU in there.

    The same can't be said of CPU tasks, for example.
  • PeteH - Tuesday, November 1, 2011 - link

    I should have been clearer. You can always throw area at the problem, but the architecture dictates how much area is needed to add the desired performance, even on GPUs.

    Compare the GeForce and the SGX architectures. The GeForce provides an equal number of vertex and pixel shader cores, and thus can only achieve its theoretical maximum performance if it gets an even mix of vertex and pixel shader operations. The SGX, on the other hand, provides general purpose cores that can do either vertex or pixel shader operations.

    This means that as the SGX adds cores its performance scales linearly under all scenarios, while the GeForce (which adds a vertex and a pixel shader core as a pair) gains only half the benefit under some conditions. Put simply, if a GeForce is limited by the number of pixel shader cores available, the addition of a vertex shader core adds no benefit.

    Throwing enough core pairs onto silicon will give you the performance you need, but not as efficiently as general purpose cores would. Of course a general purpose core architecture will be bigger, but that's a separate discussion. (There's a rough numeric sketch of this scaling difference after the thread.)
  • metafor - Tuesday, November 1, 2011 - link

    I think you need to check your math. If you double the number of cores in a GeForce, you'll still gain 2x the relative performance.

    Double is a multiplier, not an adder.

    If a task was vertex-shader bound before, doubling the number of vertex-shaders (which comes with doubling the number of cores) will improve performance by 100%.

    Of course, in the case of 543MP2, we're not just talking about doubling computational cores.

    It's literally 2 GPUs (I don't think much is shared, maybe the various caches).

    Think SLI but on silicon.

    If you put 2 GeForce GPUs on a single die, the effect will be the same: double the performance for double the area.

    Architecture dictates the perf/GPU. That doesn't mean you can't simply double it at any time to get double the performance.
  • PeteH - Tuesday, November 1, 2011 - link

    But I'm not talking about relative performance; I'm talking about performance per unit area added. When bound by one operation, adding a core that supports a different operation is wasted space.

    So yes, doubling space always doubles relative performance, but adding 20 square millimeters means different things to the performance of different architectures.
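
To put rough numbers on the scaling argument in the thread above, here's a toy model in Python. It is purely illustrative: the function names, core counts, and per-frame work units are invented for this sketch rather than taken from any real GPU, and it assumes a unified design can keep every core busy on either kind of work.

# Toy model of GPU scaling: paired vertex/pixel shader cores vs. unified cores.
# All numbers are made up; each core is assumed to process 1 unit of work per second.

def paired_throughput(core_pairs, vertex_work, pixel_work):
    # Vertex shaders only run vertex work and pixel shaders only run pixel work,
    # so the frame time is set by whichever pool finishes last.
    vertex_time = vertex_work / core_pairs
    pixel_time = pixel_work / core_pairs
    return 1.0 / max(vertex_time, pixel_time)   # frames per second

def unified_throughput(cores, vertex_work, pixel_work):
    # Unified cores can run either kind of work, so no capacity sits idle.
    return cores / (vertex_work + pixel_work)   # frames per second

# A pixel-heavy frame: 1 unit of vertex work, 9 units of pixel work.
vertex, pixel = 1.0, 9.0

for cores in (2, 4):   # a "pair" costs two cores' worth of silicon
    paired = paired_throughput(cores // 2, vertex, pixel)
    unified = unified_throughput(cores, vertex, pixel)
    print(f"{cores} cores: paired {paired:.2f} fps, unified {unified:.2f} fps")

Doubling the core count doubles both designs' throughput, which is metafor's point about simply dropping in another GPU; but on this lopsided workload the unified design gets considerably more performance out of the same silicon, because no vertex-only hardware sits idle, which is PeteH's point about architecture dictating performance per unit of area added.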
