Machine Learning Inference Performance

The new SoC generations also bring new AI capabilities, although the two chips differ considerably in what they offer. The Snapdragon 865 brings a whole lot of new Tensor core performance which should accelerate ML workloads, but the software still plays a big role in extracting that capability from the hardware.

Samsung’s Exynos 990 is quite odd in this regard: the company quoted the SoC’s NPU and DSP as being able to deliver 10 TOPS, but it’s not clear how this figure is broken down. SLSI has also been able to take advantage of the new Mali-G77 GPU and its ML abilities, exposing them through NNAPI.

We’re skipping AIMark for today’s test as the benchmark couldn’t use hardware acceleration on either device; it lacks updated support for both Qualcomm’s and SLSI’s ML SDKs. We thus fall back to AIBenchmark 3, which uses NNAPI acceleration.

AIBenchmark 3

AIBenchmark takes a different approach to benchmarking. Here the test uses the hardware-agnostic NNAPI to accelerate inferencing, meaning it doesn’t use any proprietary aspects of a given piece of hardware except for the drivers that actually enable the abstraction between software and hardware. This approach is more apples-to-apples, but because NNAPI is an Android API it also means that we can’t do cross-platform comparisons, like testing iPhones.

We’re publishing one-shot inference times. Unlike sustained-performance inference times, these figures include more timing overhead from the software stack, covering everything from initializing the test to actually executing the computation.
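
As a rough illustration of that distinction, the Python sketch below times a single cold inference against an average over warmed-up runs. It is not how AIBenchmark itself measures anything; `DummyModel`, its `infer` method, and the sleep durations are hypothetical stand-ins chosen only to make the setup overhead visible.

```python
import time


class DummyModel:
    """Hypothetical stand-in for an accelerated model; not a real API."""

    def __init__(self):
        self.initialized = False

    def _initialize(self):
        # Simulated one-time setup cost (model load, driver compilation).
        time.sleep(0.05)
        self.initialized = True

    def infer(self, x):
        # Setup happens lazily on the first call, as in a cold run.
        if not self.initialized:
            self._initialize()
        # Simulated cost of the computation itself.
        time.sleep(0.005)
        return x * 2


def one_shot_time(make_model, x):
    """Time a single inference on a fresh model, setup overhead included."""
    model = make_model()
    start = time.perf_counter()
    model.infer(x)
    return time.perf_counter() - start


def sustained_time(make_model, x, warmup=2, runs=10):
    """Average per-inference time after warm-up, setup overhead excluded."""
    model = make_model()
    for _ in range(warmup):
        model.infer(x)
    start = time.perf_counter()
    for _ in range(runs):
        model.infer(x)
    return (time.perf_counter() - start) / runs


one_shot = one_shot_time(DummyModel, 1.0)
sustained = sustained_time(DummyModel, 1.0)
print(f"one-shot: {one_shot * 1e3:.1f} ms, sustained: {sustained * 1e3:.1f} ms")
```

With these made-up costs, the one-shot figure comes out well above the sustained one, because it absorbs the one-time setup that the warmed-up runs no longer pay.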

AIBenchmark 3 - NNAPI CPU

We’re segregating the AIBenchmark scores by execution block, starting off with the regular CPU workloads that simply use TensorFlow libraries and do not attempt to run on specialized hardware blocks.

[Charts: AIBenchmark 3, CPU results: 1 - The Life (CPU/FP); 2 - Zoo (CPU/FP); 3 - Pioneers (CPU/INT); 4 - Let's Play (CPU/FP); 7 - Ms. Universe (CPU/FP); 7 - Ms. Universe (CPU/INT); 8 - Blur iT! (CPU/FP)]

In the purely CPU-accelerated workloads, both phones perform very well, but the Snapdragon 865’s A77 cores are evidently in the lead by a good margin. Note that the scores for the S10 phones have also been updated: I saw a big performance boost with the Android 10 updates and the newer NNAPI versions of the test.

AIBenchmark 3 - NNAPI INT8

[Charts: AIBenchmark 3, INT8 results: 1 - The Life (INT8); 2 - Zoo (INT8); 3 - Pioneers (INT8); 5 - Masterpiece (INT8); 6 - Cartoons (INT8)]

Integer ML performance on both phones is good, but because the Snapdragon 865 leverages the Hexagon DSP cores for such workloads, it’s well ahead of the Exynos 990 S20. The latter, however, also showcases some very big performance improvements over its predecessor. I still think that Samsung is only exposing the SoC’s GPU to NNAPI, but because the new microarchitecture is able to accelerate ML workloads, we’re seeing a big performance improvement compared to the Exynos 9820.

AIBenchmark 3 - NNAPI FP16

[Charts: AIBenchmark 3, FP16 results: 1 - The Life (FP16); 2 - Zoo (FP16); 3 - Pioneers (FP16); 5 - Masterpiece (FP16); 6 - Cartoons (FP16); 9 - Berlin Driving (FP16); 10 - WESPE-dn (FP16)]

In FP16 workloads, the Exynos 990’s GPU actually manages to outperform the Snapdragon 865’s Adreno unit more often than not. Where the workload allows it, HiSilicon’s NPU is still far in the lead, as it supports FP16 acceleration, which isn’t present on either the Snapdragon or Exynos SoCs; both fall back to their GPUs.

AIBenchmark 3 - NNAPI FP32

[Chart: AIBenchmark 3, FP32 results: 10 - WESPE-dn (FP32)]

Finally, FP32 again runs on the GPU of each SoC, and again the Exynos 990 holds quite a large performance lead over the Snapdragon 865.

It’s certainly encouraging to see the Samsung SoC keep up with the Snapdragon variant of the S20, suggesting that other vendors are now finally paying better attention to their ML capabilities. We don’t know much at all about the DSP or the NPU of the Exynos 990, as Samsung’s EDEN AI SDK is still not public. I hope that they finally open up more and allow third-party developers to take advantage of the available hardware.


137 Comments

  • Reflex78 - Friday, April 3, 2020

    I live in Europe, I like Samsung and have an S9 at the moment.
    But I will never pay +1000€ for the lower quality Exynos S20 version in Europe!
    This is a big mistake from the company management to allow such a difference between these 2 variants at the same price!
    And I just read that they have chosen to sell Snapdragon version even for their home country:
    https://www.phonearena.com/news/Samsung-chip-divis...
  • twtech - Friday, April 3, 2020

    The edge design of Samsung's more recent releases is just not good. There are no cases that can both properly protect the screen, and avoid blocking any of it. My older phones typically lasted for years without any significant damage. These new ones are one slip of the hand away from being garbage fodder.
  • FunBunny2 - Friday, April 3, 2020

    "These new ones are one slip of the hand away from being garbage fodder."

    rube!! :) it's a feature, not a bug. going back to Lotus/MS conflict: "DOS ain't done til 1-2-3 won't run."
  • Harysviewty - Friday, April 3, 2020

    Totally wrong calculation. It's 7.1mp when you crop 3x3. It's only possible to do full 12mp resolution with the help of Super resolution algorithm, which adds up to 75% more detail to 'normal bayer' setup
  • Andrei Frumusanu - Friday, April 3, 2020

    I don't know what you're talking about. The sensor is 108MP at 12000 x 9000. 3x3 binning results in 4000 x 3000, which is 12MP.
  • krazyfrog - Friday, April 3, 2020

    The 3x3 binning only happens on the 108MP sensor, not on the 64MP sensor.
  • s.yu - Saturday, April 4, 2020

    You said "crop 3x3" which confuses people, usually we just say 3x crop, but yes, the digital zoom doesn't provide native 12MP in any sense at 3x.
  • JDSP - Friday, April 3, 2020

    Image links on the iPhone are wrong, Wide links to zoom and night sight links to normal
  • Andrei Frumusanu - Friday, April 3, 2020

    Corrected that, thanks. There are probably a few other link issues there; I'll keep an eye out for them.
  • Quantumz0d - Friday, April 3, 2020

    Great analysis, will go through it slowly but looking at that Exynos 990, Seriously WTF is that. Higher power consumption, lower GPU performance, higher throttling. Very unfortunate. And losing to that copycat Chinese Huawei Kirin trash (EMUI garbage with LZ Play backdoor, Read only EROFS Filesystem and Google copying that into Pixel 4 and the proprietary garbage NMSDslot no 3.5mm jack, No Play Store, lies and deception) Samsung should be ashamed of themselves removing all their genius whatever PR ads for milking customers and offering mediocrity.

    The only good part for Exynos is the unlockable bootloader, since SD versions in the US are locked as hell and useless for customization, esp. given how they deprecated the SpO2 sensor in this phone's HW, and from the SW side also in the S10 and previous phones, to promote bullshit smartwatches.

    This phone sucks bad, their S10 has better features and looks better as well on top this phone camera sucks, no 3.5mm jack and ugliest design ever with insane price tag on top, LG's V60 is looking very good in comparison from Camera to Audio and other features/specs plus price vs this phone or even that Chinese OP8 Pro (despite lacking SD slot and 3.5mm jack as at-least it is cheaper and has hassle free Bootloader unlock), since Samsung dropping HW feature set which defines the all in one phones from Samsung they want greed and money from that shitty Buds and other garbage.

    S10+ Exynos is a better choice all rounder as it has BL unlock as well. On top I'd like to mention how DJ Koh also was removed from the Samsung mobile CEO office, I presume it also played an important role in S.LSI even if independent and the financial results of the conglomerate, esp how it impacted the design philosophy for sure as Note 10 provided a platform for removing features and cost cutting similar aspect in S20 a bit more worse.

    On the Foundry aspects, TSMC 7NP is not EUV and Samsung 7nm LPP is EUV since N7+ is the EUV one. Not sure 20% to 30% is valid ? I do not know since the uArch is garbage doesn't mean that the node is trash, esp last rumor was Ampere would be on Samsung node with EUV and Nvidia wouldn't afford a stupid decision tbh, and a big shame is S.LSI getting hacked. I think the PR marketing team and the budgets ruined them or such, we may never know. Shame indeed.