System Performance

One of the more popular and pervasive beliefs in this industry is that specs increasingly don’t matter. This review isn’t really the right place to address whether or not that is true, but the short answer is that things like SoC performance matter quite a bit. Outside of the display, the SoC and RF subsystems are among the biggest power consumers in a phone today, and unlike the display or RF systems, the CPU and GPU can cause short spikes of enormous power consumption. At this point, we’ve seen SoCs this year that consume anywhere from 6 to over 12 watts under full load. The important part here is that when an SoC uses that much power, it needs to deliver enough performance to justify the power consumption. To test aspects of the phone like the SoC, we use our standard suite of benchmarks, which are designed to test various real-world scenarios to get an idea of what peak performance looks like.

Kraken 1.1 (Chrome/Safari/IE)

Google Octane v2 (Chrome/Safari/IE)

WebXPRT 2013 (Chrome/Safari/IE)

WebXPRT 2015 (Chrome/Safari/IE)

In the standard web browser benchmarks, the iPhone 6s and iPhone 6s Plus are clearly in the lead. The difference in some cases is significant, and although the benchmarks that we’re running here are all enormous optimization targets, they are still a reasonable comparison point. In the interest of trying to avoid optimization targets, I decided to look at some new JavaScript benchmarks that aren’t regularly used right now. One interesting benchmark is Ember Performance, which exercises Ember, a JavaScript app framework that is used in a number of popular websites and applications. Ember isn’t as popular as AngularJS at the moment, but in the absence of a good mobile benchmark, EmberJS should be a reasonably good proxy.

EmberJS (Chrome/Safari/IE)

In this benchmark, we can see a pretty enormous performance uplift when you compare the iPhone 6s to anything else out there on the market. Weirdly enough, on average it looks like Samsung’s S-Browser ends up slower here than Chrome, but it’s likely that this is just because S-Browser is using an older build of Chromium, which negates the advantages of the platform-specific optimizations that Samsung is integrating into S-Browser.

Basemark OS II 2.0 - Overall

Basemark OS II 2.0 - System

Basemark OS II 2.0 - Memory

Basemark OS II 2.0 - Graphics

Basemark OS II 2.0 - Web

Looking at Basemark OS II, once again Apple is basically taking the lead across the board. The differences aren’t necessarily as enormous as they are in single-threaded browser benchmarks, but the iPhone 6s and iPhone 6s Plus retain a significant overall performance lead over the next best mobile devices.

Overall, in benchmarks where CPU performance is a significant influence, the iPhone 6s is pretty much at the very top of the stack. Of course, Apple has also had about 6-8 months since the launch of SoCs like the Snapdragon 810 and Exynos 7420, so this is at least partially to be expected. The real surprise and/or disappointment would be if future Exynos and Snapdragon SoCs continue to lag behind the A9 in CPU performance.

531 Comments

  • toukale - Monday, November 2, 2015 - link

    Damn, "Now."
  • Kevin G - Monday, November 2, 2015 - link

    Not only is it enough to scare all other ARM SoCs, but Intel has to be frightened by what Apple's engineers are capable of. Normalizing for clock speeds, it seems that the A9 is around Sandy/Ivy Bridge IPC, and now with FinFET, there is a clock speed overlap with those chips as well. Intel has two newer generations of core designs (Haswell and Skylake) but they don't offer huge leaps over Sandy Bridge/Ivy Bridge. I'm really, really curious how the A9X in the iPad Pro will perform against various Core M designs in tablets. It is very conceivable that Apple could take the performance crown.

    Against low power i3/i5/i7 Skylake chips, Intel should still have a performance lead. Granted, those chips have a higher power budget, but it makes me wonder what Apple could pull off with a similar power budget.

    As for the A9 itself, it is a very solid improvement and there is still room to grow. My personal prediction for the A9, SMT, appears to be absent. Considering the width of the A9 design, there should be some performance gains. Certainly while running in a 4T2C mode, power consumption will be higher, but 2T1C should be lower power than 2T2C.

    My predictions for the A10? I'm still sticking to the idea that SMT in Apple's CPU designs makes sense, so there is that. 4 MB of L2 cache and 12 MB of L3 cache are natural evolutions of their current topology. The GPU will grow to an 8 core Rogue 7 design. The real SoC change will be in the memory subsystem, with Apple adopting WideIO. I predict that the iPhone 7 will be the first product to drop the Lightning connector and offer a USB Type-C port, so USB and DisplayPort blocks will be included in the next iteration.
  • aliasfox - Monday, November 2, 2015 - link

    While Intel should be worried about the performance Apple's SoC engineers are capable of, what they should really be worried about is price. Sure, Apple might only offer 75% of the performance of a ULV Core chip, but when it comes at 20-30% of the price, that's serious competition.
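
A rough sketch of the arithmetic behind that comparison, taking the commenter's round figures (75% of the performance at 20-30% of the price) at face value. The function name and the normalization to a ULV Core baseline of 1.0 are assumptions made for illustration, not measured data.

```python
def perf_per_price(relative_perf: float, relative_price: float) -> float:
    """Performance per unit of price, relative to a baseline of 1.0 / 1.0."""
    return relative_perf / relative_price

# Hypothetical inputs taken from the comment: 75% performance, 20% to 30% price.
best_case = perf_per_price(0.75, 0.20)
worst_case = perf_per_price(0.75, 0.30)
print(f"Performance per dollar vs. baseline: {worst_case:.2f}x to {best_case:.2f}x")
```

Under those assumed figures, the cheaper part delivers between two and a half and just under four times the performance per dollar of the baseline.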
  • Kevin G - Monday, November 2, 2015 - link

    There is the whole dichotomy of Apple being an end product supplier with the iPhone/iPad vs. Intel being a parts supplier. There is also the difference that Apple needs a third party to manufacture the A9 chip, whereas Intel does this in house. Intel is more of a middle man here and thus inflates the end cost of the OEM handsets and tablets. Intel could make the same amount of profit if it were able to spur volume sales, but that trade-off has never appealed much to Intel, which has historically enjoyed healthy margins on component pricing.
  • name99 - Monday, November 2, 2015 - link

    Guys, it's time to stop this pretense that Apple is "almost" at Intel performance.
    Apple IPC has exceeded the best Intel has to offer by about 15%.
    (gcc SPEC)
    A9 vs Haswell = (3148/1.85) / (4800/3.3) ≈ 1.17
    http://gcc.opensuse.org/SPEC/CINT/sb-czerny-head-6...
    i5-4670T boost 3.3G ~4800
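
As a minimal sketch, the per-clock normalization above can be reproduced as follows. The scores and clock speeds are the figures quoted in this comment; the helper name is hypothetical, and score per GHz is only a rough stand-in for IPC.

```python
def score_per_ghz(score: float, clock_ghz: float) -> float:
    """Benchmark score divided by clock speed, a crude proxy for IPC."""
    return score / clock_ghz

# Figures as quoted in the comment (gcc SPEC score, clock in GHz).
a9 = score_per_ghz(3148, 1.85)       # Apple A9 at 1.85 GHz
haswell = score_per_ghz(4800, 3.3)   # i5-4670T at its 3.3 GHz boost clock

ratio = a9 / haswell
print(f"A9: {a9:.0f} per GHz, Haswell: {haswell:.0f} per GHz, ratio: {ratio:.2f}")
```

With these inputs the ratio comes out near 1.17. The result depends entirely on which clock the Intel part is assumed to sustain during the run.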

    Or compare against the Broadwell in a MacBook:
    https://browser.primatelabs.com/geekbench3/compare...
    (Note that while the Bwell is nominally at 1.3GHz, Geekbench is short enough that it can turbo at 2.9GHz)

    With the A7 Apple got an "inner" core that was equal to the best Intel has to offer. With the A9 they now have an uncore that matches Intel (look at all the memory dependent benchmarks in the Geekbench comparison above, things like Sobel, Sharpen, and FFT --- Apple now matches Intel pretty much exactly).

    The only place where Apple still lags behind Intel (as far as the mobile space is concerned) is turbo-ing (ie an accurate on-SoC thermal model that allows parts of the SoC to run faster than rated up until the thermal budget is exceeded).
    This does not necessarily mean that turbo is the feature Apple will implement next. There are other directions they could go which provide (in their opinion) a better tradeoff, at least for now, than turbo'ing. Possibilities include
    - het core (add a low power low performance core. This sounds like big.LITTLE, but done right. The core selection and switching is done by a dedicated microcontroller which is tracking various CPU statistics like branch mispredictions and cache misses and using those to decide which core to use. The OS only sees one CPU; the het core is purely an internal implementation detail.
    Done right, papers suggest this can buy you about a 20% power reduction.)

    - KIP (kilo-instruction processor). A set of ideas that extend OoO from its current ability to tolerate latency out to L3 (but not to RAM) so that it can tolerate latency all the way out to RAM. This requires a ROB of size 1000 or so, and numerous modifications to allow the physical register set and load-store queues to match this size.

    - post-rename loop buffer. Places the loop buffer not just after fetch, not just after decode, but all the way after rename. Requires various modifications (to handle the "frozen" renaming) but capable of a nice drop in power whenever executing out of the loop buffer.

    Apart from starting down these paths, the obvious visible change for the A10 would appear to be that they
    - drop 32-bit support (which should probably allow them to drop at least one pipeline stage, and simplify the decoder substantially)
    - add support for the ARMv8.1a instructions.

    SMT is (IMHO) a low priority for Apple. They can add more cores faster than they can design in SMT, and area won't be a critical constraint until the Moore's law scaling party stops.
  • vFunct - Monday, November 2, 2015 - link

    They basically already have big.LITTLE with their M9 co-processor.
  • doggface - Monday, November 2, 2015 - link

    I'm sorry but no. Your Geekbench scores mean nothing. Intel still has quite the lead. Otherwise Apple MacBook Pros would be using Apple SoCs.

    Apple will find that all the easy gains in CPU IPC/clocks are disappearing and, like Intel, will struggle to make speed improvements beyond a certain level. Then chip cost will start going up. It is inevitable, it is physics.

    All that aside. The A9 is impressive. Kudos to Apple.
  • IanHagen - Wednesday, November 4, 2015 - link

    Whilst I agree mostly with you, the MacBook Pros not sporting an Apple SoC is IMHO proof of nothing. The migration would be very costly and would break compatibility with a ton of software. They can't simply slap a nice ARM chip on that thing and call it a day.
  • DerekZ06 - Wednesday, November 4, 2015 - link

    Switching architecture on the MacBook Pros is like going from PowerPC to x86 all over again.
  • gonsolo - Tuesday, November 3, 2015 - link

    Interesting. Can you quote some of the mentioned papers?
