System Performance

Synthetic test performance is one thing, and hopefully we’ve covered that well with SPEC, but interactive performance in real use-cases behaves differently; here software can play a major role in perceived performance.

I will openly admit that our iOS system performance suite looks extremely meager: we are really left with only our web browser tests, as iOS lacks meaningful alternatives to the likes of PCMark on Android.

Speedometer 2.0 - OS WebView

Speedometer 2.0 is the most up-to-date industry-standard JavaScript benchmark, testing the performance of the most common and modern JS frameworks.

The A12 sports a massive 31% jump over the A11, again indicating that Apple’s advertised performance figures quite undersell the new chipset.

We’re also seeing a small boost from iOS 12 on the previous generation devices. Here the boost comes not only from a change in how iOS’s scheduler handles load, but also from further improvements in Apple’s ever-evolving JS engine.

WebXPRT 3 - OS WebView

WebXPRT 3 is also a browser test; however, its workloads are more widespread and varied, and also contain a lot of processing tests. Here the iPhone XS showcases a smaller 11% advantage over the iPhone X.

Older devices here also see a healthy boost in performance, with the iPhone X ticking up from 134 to 147 points, or 10%. The iPhone 7’s A10 sees a larger boost of 33%, something we’ll get into in more detail in a little bit.

iOS12 Scheduler Load Ramp Analyzed

Apple promised a significant performance improvement in iOS12, thanks to the way their new scheduler accounts for the loads from individual tasks. The operating system’s kernel scheduler tracks the execution time of threads and aggregates this into a utilisation metric, which is then used by, for example, the DVFS mechanism. The algorithm which decides how this load is accounted over time is generally simply a software decision – it can be tweaked and engineered to whatever a vendor sees fit.
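Apple doesn’t publicly document the algorithm, but the general mechanism is well understood: the kernel records how much of each scheduler tick a thread actually ran, decays that into a utilisation figure, and the DVFS governor picks an operating point from it. The sketch below only illustrates that idea – the decay factor, headroom margin and structure are my own illustrative choices, not Apple’s implementation.

```swift
// Illustrative sketch of scheduler load tracking feeding a DVFS decision.
// NOT Apple's implementation: the decay factor and headroom are invented values.

struct LoadTracker {
    private(set) var utilization: Double = 0.0   // 0.0 ... 1.0
    let decay: Double                            // weight given to history

    init(decay: Double = 0.8) { self.decay = decay }

    /// Called every scheduler tick with the fraction of that tick the thread ran.
    mutating func record(runFraction: Double) {
        utilization = decay * utilization + (1.0 - decay) * runFraction
    }
}

/// Pick the lowest available frequency (sorted ascending, in MHz) that still
/// leaves roughly 20% headroom over the tracked utilisation.
func targetFrequency(utilization: Double, available: [Double]) -> Double {
    let fmax = available.max() ?? 0
    let needed = utilization * fmax / 0.8
    return available.first(where: { $0 >= needed }) ?? fmax
}
```

The smaller the weight given to history, the faster the utilisation figure rises after idle and the sooner the governor requests the top frequency – that trade-off against frequency stability and power is exactly the kind of tuning knob a vendor has at its disposal.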

Because iOS’s kernel is closed source, we can’t really see what the changes are; however, we can measure their effects. A relatively simple way to do this is to track frequency over time in a workload going from idle to full performance. I did this on a set of iPhones ranging from the 6 to the X (and XS), before and after the iOS12 system update.
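One relatively simple way to capture this from user space is to time a fixed, ALU-bound chunk of work in a tight loop right after an idle pause: as DVFS ramps the core up, the per-chunk time shrinks. The probe below is a rough sketch of that approach – the function names and constants are my own, and it only yields a relative ramp unless calibrated against a run at a known peak clock.

```swift
import Foundation

// Rough user-space frequency-ramp probe. Names and constants are illustrative.

@inline(never)
func fixedWorkChunk(iterations: Int) -> UInt64 {
    // A simple integer LCG keeps the work ALU-bound and serially dependent.
    var x: UInt64 = 0x12345678
    for _ in 0..<iterations {
        x = x &* 6364136223846793005 &+ 1442695040888963407
    }
    return x
}

func measureRamp(chunks: Int = 200, iterations: Int = 200_000) -> [(elapsedMs: Double, chunkMs: Double)] {
    var samples: [(elapsedMs: Double, chunkMs: Double)] = []
    var checksum: UInt64 = 0                        // keeps the work observable
    Thread.sleep(forTimeInterval: 1.0)              // let the CPU drop back to idle
    let start = DispatchTime.now()
    for _ in 0..<chunks {
        let t0 = DispatchTime.now()
        checksum ^= fixedWorkChunk(iterations: iterations)
        let t1 = DispatchTime.now()
        samples.append((
            elapsedMs: Double(t1.uptimeNanoseconds - start.uptimeNanoseconds) / 1e6,
            chunkMs:   Double(t1.uptimeNanoseconds - t0.uptimeNanoseconds) / 1e6
        ))
    }
    print("checksum:", checksum)                    // prevent the loop being optimized out
    return samples
}
```

Plotting 1/chunkMs against elapsedMs then gives the shape of the frequency ramp from idle.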

Starting off with the iPhone 6 with the A8 chipset, I had some odd results on iOS11, as the scaling behaviour from idle to full performance was quite unusual. I repeated this a few times, yet it still came up with the same results. The A8’s CPU cores idled at 400MHz and remained there for 110ms until they jumped to 600MHz, and then 10ms later went on to the cores’ full 1400MHz.

iOS12 showcased a more step-wise behaviour, scaling up earlier and also reaching full performance after 90ms.

The iPhone 6S had a significantly different scaling behaviour on iOS11, and the A9 chip’s DVFS was insanely slow. Here it took a total of 435ms for the CPU to reach its maximum frequency. With the iOS12 update, this time has been massively slashed down to 80ms, giving a great boost to performance in shorter interactive workloads.

I was quite astonished to see just how slow the scheduler was before – this is currently the very same issue that is handicapping Samsung’s Exynos chipsets, and possibly other Android SoCs whose vendors don’t optimise their schedulers. While the hardware performance might be there, it just doesn’t manifest itself in short interactive workloads because the scheduler’s load tracking algorithm is simply too slow.
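To put rough numbers on why this matters for short interactive workloads, consider a toy model of a burst that needs 100ms of full-speed CPU time while performance ramps linearly from 40% up to peak. The ramp shape and starting fraction below are my own simplifications, not the measured curves, but they show the order of magnitude of the effect.

```swift
// Toy model: completion time of a burst needing `workMs` milliseconds of
// full-speed CPU time, when performance ramps linearly from `startFraction`
// up to 1.0 over `rampMs`. Purely illustrative – real DVFS ramps are step-wise.

func completionTime(workMs: Double, rampMs: Double, startFraction: Double = 0.4) -> Double {
    var done = 0.0
    var t = 0.0
    let dt = 0.1                                   // 0.1ms simulation step
    while done < workMs {
        let speed = t >= rampMs
            ? 1.0
            : startFraction + (1.0 - startFraction) * (t / rampMs)
        done += speed * dt
        t += dt
    }
    return t
}

// completionTime(workMs: 100, rampMs: 435)  ≈ 189ms  (A9-on-iOS11-like ramp)
// completionTime(workMs: 100, rampMs: 80)   ≈ 124ms  (A9-on-iOS12-like ramp)
```

In this simplified model the slow ramp adds roughly 65ms to a burst that would otherwise finish in about 124ms – exactly the kind of difference a user perceives when tapping around the UI.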

The A10 had similarly bad characteristics to the A9, with time to full performance well exceeding 400ms. In iOS12, the iPhone 7 cuts this roughly in half, to around 210ms. It’s odd to see the A10 being more conservative in this regard compared to the A9 – but this might have something to do with the little cores.

In this graph, it’s also notable to see the frequency of the small Zephyr cores – they start at 400MHz and peak at 1100MHz. The frequency in the graph then drops back to 758MHz because at this point the thread switches over to the big cores, which continue their frequency ramp up to maximum performance.

On the Apple A11 I didn’t see any major changes, and indeed any differences could just be random noise between measurements on the different firmwares. Both in iOS11 and iOS12, the A11 scales to full frequency in about 105ms. Please note the x-axis in this graph is a lot shorter than in the previous graphs.

Finally, on the iPhone XS’s A12 chipset, we can’t measure any pre- and post-update behaviour as the phone comes with iOS12 out of the box. Here again we see that it reaches full performance after 108ms, and we see the transition of the thread from the Tempest cores over to the Vortex cores.

Overall, I hope this serves as a clear visual representation of the performance differences that iOS12 brings to older devices.

In terms of the iPhone XS – I haven’t had any issues at all with the performance of the phone; it was fast. I have to admit I’m still a daily Android user, and I use my phones with animations completely turned off, as I find they get in the way of the speed of a device. There’s no way to completely turn animations off in iOS, and while this is just my subjective personal opinion, I found they quite hamper the perceived performance of the phone. In workloads that are not interactive, the iPhone XS just blazed through them without any issue or concern.

Comments

  • eastcoast_pete - Sunday, October 7, 2018 - link

    Apple's strength (supremacy) in the performance of their SoCs really lies in the fine-tuned match of apps and especially low-level software that make good use of excellent hardware. What happens when that doesn't happen was outlined in detail by Andrei in his reviews of Samsung's Mongoose M3 SoC - to use a famous line from a movie, one that "could've been a contender", but really isn't. Apple's tight integration is the key factor that a more open ecosystem (Android) has a hard time matching; however, Google and (especially) Qualcomm leave a lot of possible performance improvements on the table by really poor collaboration; for example, GPU-assisted computing is AWOL for Android - not a smart move when you try to compete against Apple.
  • varase - Tuesday, October 23, 2018 - link

    I have serious doubts that Android would even run on an A12 SoC - I thought Apple trashed ARMv7 when it went to A11.
  • Strafeb - Saturday, October 6, 2018 - link

    It would be interesting to see comparison of screen efficiency of iPhone XR's low res LCD screen, and also some of LG's pOLED screens like in V40.
  • Alistair - Saturday, October 6, 2018 - link

    The Xeon Platinum 8176 is a 28 core, $9000 Intel server CPU, based on Skylake. In single threaded performance, the iPhone XS outperforms it by 12 percent for integers, despite its lower clock speed. If the iPhone were to run at 3.8ghz, the Apple A12 would outperform Intel's CPU by 64 percent on average for integer tests.

    iPhone XS and A12 numbers from: https://www.anandtech.com/show/13392/the-iphone-xs...

    Xeon numbers from: https://www.anandtech.com/show/12694/assessing-cav...

    spreadsheet: https://docs.google.com/spreadsheets/d/1ipKIh4i56o...

    image of chart: https://i.imgur.com/IAupi9p.jpg

    Think about that, the iPhone's CPU IPC (performance per clock) is already higher in integer performance now. Those tests include: spam filter, compression, compiling, vehicle scheduling, game ai, protein seq. analyses, chess, quantum simulation, video encoding, network sim, pathfinding, and xml processing. Test takes hours to run.
  • SanX - Saturday, October 6, 2018 - link

    Yes, and while Apple and all other mobile processor manufacturers charge $5 per core, Intel charges $300
  • yeeeeman - Saturday, October 6, 2018 - link

    It might be faster in single thread, but in MT it gets toasted by the Xeon. The Xeon is 9000$ for a few reasons:
    - it is an enterprise chip;
    - it supports ecc;
    - it supports up to 8 cpus on a board;
    - it supports tons of ram, a LOT of memory channels;
    - it has almost 40MB of L3 cache, compared to 8MB in the A12;
    - it has a ring bus architecture meaning all those cores have very low latency between them and to memory;
    - it has CISC instructions, meaning that when you get out of basic phone apps and you start doing scientific/database/HPC stuff, you will see a lot of benefits and performance improvements from executing a single instruction for a specific operation, compared to the RISC nature of A12;
    - it supports AVX512, needed for high performance computing. In this, the A12 would get smashed;
    - and many more;
    So the Xeon 8180 is still a mighty impressive chip and Intel has invested some real thought and experience into making it. Things that Apple doesn't have.
    I get it, it is nice to see Apple having a chip with this much compute power in such a low TDP, and it is due to the fact that x86 chips have a lot of extra stuff added in for legacy. But don't get carried away with this; what Apple is doing now from a uArch point of view is not new. Desktop chips had this stuff 15 years ago. The difference is that Apple works on the latest fabrication process and doesn't care about x86 legacy.
  • Alistair - Saturday, October 6, 2018 - link

    "It might be faster in single thread, but in MT it gets toasted by the Xeon"

    That is totally irrelevant. Obviously Apple could easily make a chip with more cores. Just like Cavium's Thunder. 8 x A12 Vortex cores would beat an 8 core Xeon in integer calculations easily enough.
  • eastcoast_pete - Sunday, October 7, 2018 - link

    Agree on your points re. the XEON. However, I'd still like to see Apple launch CPUs/iGPUs based on their design especially in the laptop space, where Intel still rules and charges premium prices. If nothing else, Apple getting into that game would fan the flames under Intel's chair that AMD is trying to kindle (started to work for desktop CPUs). In the end, we all benefit if Chipzilla either gets off its enormous bottom(line) and innovates more, or gets pushed to the side by superior tech. So, even as a non-Apple user: go Apple, go!
  • Constructor - Sunday, October 7, 2018 - link

    "- it has CISC instructions, meaning that when you get out of basic phone apps and you start doing scientific/database/HPC stuff, you will see a lot of benefits and performance improvements from executing a single instruction for a specific operation, compared to the RISC nature of A12;"

    CISC instructions generally don't really do much more than RISC ones do – they just have more addressing modes while RISC is almost always register-to-register with separate Load & Store.

    That just doesn't make any difference any more because the bottleneck is not instruction fetching (as it once was in the old times) but actually execution unit pipeline congestion, including the Load & Store units.

    "- it supports AVX512, needed for high performance computing. In this, the A12 would get smashed;"

    There's already a scalable vector extension for ARM which Apple could adopt if that was actually a bottleneck. And even the existing vector units aren't anything to scoff at – the issue is more that Intel CPUs are forced to drop down to half their nominal clock once you actually use AVX512; it could actually be more efficient to optimize the regular vector units for full-speed operation to make up for it.

    "So the Xeon 8180 is still a mighty impressive chip and Intel has invested some real thought and experience into making it. Things that Apple doesn't have."

    We actually have no clue what Apple is investing in behind closed doors until they slam it on the table as a finished product ready for sale!
  • tipoo - Thursday, October 18, 2018 - link

    I'm hoping Apple takes the ARM switch as an opportunity to bring an ARM AVX-512 equivalent down to more products, like the iMac.
