Machine Learning Inference Performance

The new SoC generations also bring new AI capabilities, though the two chips differ considerably in this regard. The Snapdragon 865 adds a large amount of new Tensor core performance that should accelerate ML workloads, but software still plays a big role in extracting that capability from the hardware.

Samsung’s Exynos 990 is quite odd in this regard: the company quotes the SoC’s NPU and DSP as delivering 10 TOPS, but it’s not clear how that figure is broken down. SLSI has also been able to take advantage of the new Mali-G77 GPU and its ML capabilities, exposing them through NNAPI.

We’re skipping AIMark for today’s test, as the benchmark doesn’t support hardware acceleration on either device, lacking updated support for both Qualcomm’s and SLSI’s ML SDKs. We thus fall back to AIBenchmark 3, which uses NNAPI acceleration.

AIBenchmark 3

AIBenchmark takes a different approach to benchmarking: the test uses the hardware-agnostic NNAPI to accelerate inferencing, meaning it doesn’t rely on any proprietary aspects of a given platform beyond the drivers that provide the abstraction between software and hardware. This approach is more apples-to-apples between Android devices, but it also means we can’t make cross-platform comparisons, such as testing iPhones.

We’re publishing one-shot inference times. Compared to sustained inference times, these figures include more timing overhead on the part of the software stack, from initializing the test to actually executing the computation.
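The distinction between one-shot and sustained figures can be illustrated with a toy timing harness (pure Python; the sleep and the trivial "model" are hypothetical stand-ins for driver and model initialization, not the benchmark’s actual code):

```python
import time

def init_runtime():
    # stand-in for NNAPI driver/model initialization (hypothetical cost)
    time.sleep(0.05)
    return lambda x: [v * 2 for v in x]  # stand-in "model"

def one_shot_ms(inputs):
    # one-shot: initialization and first inference are measured together
    start = time.perf_counter()
    model = init_runtime()
    model(inputs)
    return (time.perf_counter() - start) * 1000.0

def sustained_ms(inputs, runs=10):
    # sustained: initialize once, then average warmed-up runs
    model = init_runtime()
    start = time.perf_counter()
    for _ in range(runs):
        model(inputs)
    return (time.perf_counter() - start) * 1000.0 / runs

data = list(range(1000))
# the one-shot figure carries the full initialization overhead
assert one_shot_ms(data) > sustained_ms(data)
```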

AIBenchmark 3 - NNAPI CPU

We’re segregating the AIBenchmark scores by execution block, starting off with the regular CPU workloads that simply use TensorFlow libraries and do not attempt to run on specialized hardware blocks.

[Charts: AIBenchmark 3 – The Life (CPU/FP), Zoo (CPU/FP), Pioneers (CPU/INT), Let's Play (CPU/FP), Ms. Universe (CPU/FP), Ms. Universe (CPU/INT), Blur iT! (CPU/FP)]

In the purely CPU-accelerated workloads, both phones perform very well, but the Snapdragon 865’s A77 cores are evidently in the lead by a good margin. It’s to be noted that the scores for the S10 phones have also been updated – I noted a big performance boost with the Android 10 updates and the newer NNAPI versions of the test.

AIBenchmark 3 - NNAPI INT8

[Charts: AIBenchmark 3 – The Life (INT8), Zoo (INT8), Pioneers (INT8), Masterpiece (INT8), Cartoons (INT8)]

Integer ML performance on both phones is good, but because the Snapdragon 865 leverages the Hexagon DSP cores for these workload types, it’s well in the lead of the Exynos 990 S20. The latter, however, also showcases some very big performance improvements over its predecessor. I still think that Samsung here is only exposing the SoC’s GPU through NNAPI, but because the new microarchitecture is able to accelerate ML workloads, we’re seeing a big performance improvement compared to the Exynos 9820.
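For background, INT8 inference operates on affine-quantized values: each float is mapped to an 8-bit integer via a scale and zero-point. A minimal sketch of the scheme (illustrative only, not Qualcomm’s or Samsung’s implementation):

```python
def quant_params(xmin, xmax, qmin=-128, qmax=127):
    # derive scale and zero-point for affine (asymmetric) int8 quantization
    scale = (xmax - xmin) / (qmax - qmin)
    zero_point = round(qmin - xmin / scale)
    return scale, zero_point

def quantize(x, scale, zp, qmin=-128, qmax=127):
    # map a float into the int8 grid, clamping to the representable range
    return max(qmin, min(qmax, round(x / scale) + zp))

def dequantize(q, scale, zp):
    # recover an approximation of the original float
    return (q - zp) * scale

scale, zp = quant_params(-1.0, 1.0)
q = quantize(0.5, scale, zp)
# the round-trip error is bounded by the quantization step (the scale)
assert abs(dequantize(q, scale, zp) - 0.5) <= scale
```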

AIBenchmark 3 - NNAPI FP16

[Charts: AIBenchmark 3 – The Life (FP16), Zoo (FP16), Pioneers (FP16), Masterpiece (FP16), Cartoons (FP16), Berlin Driving (FP16), WESPE-dn (FP16)]

In FP16 workloads, the Exynos 990’s GPU actually manages to outperform the Snapdragon 865’s Adreno unit more often than not. In workloads that allow it, HiSilicon’s NPU remains far in the lead, as it supports FP16 acceleration that isn’t present on either the Snapdragon or Exynos SoCs – both fall back to their GPUs.
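As background on what the FP16 format trades away: IEEE 754 half precision keeps only a 10-bit mantissa, so values round noticeably compared to FP32. Python’s struct module can round-trip a value through half precision to show this (an illustration of the format itself, nothing device-specific):

```python
import struct

def to_fp16(x):
    # round-trip a Python float through IEEE 754 half precision ('e' format)
    return struct.unpack('e', struct.pack('e', x))[0]

# 0.1 is not exactly representable; fp16 rounds it more coarsely than fp32
assert to_fp16(0.1) != 0.1
assert abs(to_fp16(0.1) - 0.1) < 1e-4

# 65504 is the largest finite fp16 value and survives the round trip
assert to_fp16(65504.0) == 65504.0
```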

AIBenchmark 3 - NNAPI FP32

[Chart: AIBenchmark 3 – WESPE-dn (FP32)]

Finally, FP32 again uses the GPU of each SoC, and again the Exynos 990 presents quite a large performance lead over the Snapdragon 865 unit.

It’s certainly encouraging to see the Samsung SoC keep up with the Snapdragon variant of the S20, showing that other vendors are now finally paying better attention to their ML capabilities. We still don’t know much at all about the DSP or the NPU of the Exynos 990, as Samsung’s EDEN AI SDK is still not public – I hope the company finally opens up and allows third-party developers to take advantage of the available hardware.

137 Comments

  • Andrei Frumusanu - Friday, April 3, 2020 - link

    No, there's no software application notion of displaying something at a given refresh rate - things just render as fast as possible. 3D games might have an FPS cap, but that's not a refresh rate.
  • FunBunny2 - Friday, April 3, 2020 - link

    this is what I mean.

    "If you can run a game at 100 frames per second, you may see a tangible benefit from playing it on a monitor that can refresh that many times per second. But if you’re watching a movie at a classic 24 FPS (frames per second), a higher refresh rate monitor won’t make any difference."

    here: https://www.digitaltrends.com/computing/do-you-nee...

    IOW, unless the processor sending either video or coded application images does so 120 per second, all the 120hz screen does is re-scan each image multiple times. how can the refresh rate create modified images, between those sent by the processor? or do 90/120hz screens do just that?

    do you disagree with that author?
  • krazyfrog - Friday, April 3, 2020 - link

    The screen refreshes at a set rate regardless of the content being sent to it. In this case, it always refreshes at 120Hz. If the content is in 24fps, each frame of the video persists for 5 refreshes of the display. To the eye, it looks no different than watching the same 24fps video on a 60Hz display.
  • surt - Saturday, April 4, 2020 - link

    Not true. It does not look the same to your eye, and the difference is the latency from the time that information is ready to display to the time it reaches your eye. The 120hz display will show that transition from e.g. the 23rd to the 24th frame significantly faster.
  • FunBunny2 - Sunday, April 5, 2020 - link

    " It does not look the same to your eye"

    that's a may be. years ago I worked in a manufacturing plant, no windows and only fluorescent lights. one of the guys I worked with wore glasses that looked like very weak sunglasses, but no prescription. I asked him about them and he said his eye doctor prescribed them for his constant headaches. turns out that some folks rectify the 60hz flash of fluorescent light, and it hurts. the same phenomenon would occur with monitors. if you're not among the rectifiers, it's hard to see how you would see different at 120hz.
  • surt - Sunday, April 5, 2020 - link

    And yet, it's not hard to see at all. Response tests are undeniable. People's reactions are unquestionably faster on 120hz. Whether you notice the difference or not, it exists.
  • surt - Saturday, April 4, 2020 - link

    It matters to any game. If your game updates at 30fps, the 120hz display will get that information to your eye a fraction faster than the 60hz display, because the 'time to next frame' + 'time to display next frame' is always smaller on the 120hz.
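The arithmetic behind the two comments above is straightforward; a quick sketch of the frame-persistence and worst-case scan-out numbers (illustrative figures only):

```python
def max_wait_ms(refresh_hz):
    # worst case: a new frame just misses a refresh and waits one full interval
    return 1000.0 / refresh_hz

def refreshes_per_frame(content_fps, refresh_hz):
    # how many display refreshes each content frame persists for
    return refresh_hz / content_fps

# 24 fps video on a 120 Hz panel: each frame is scanned out 5 times
assert refreshes_per_frame(24, 120) == 5.0

# the worst-case wait before a new frame reaches the panel is halved at 120 Hz
assert max_wait_ms(120) * 2 == max_wait_ms(60)
```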
  • eastcoast_pete - Friday, April 3, 2020 - link

    Great review, thanks Andrei! Question: just how much power draw does the 5G modem add, especially the mmWave ones for us in the US? Along those lines, can the 5G function be disabled in software, so not just deselected, but actually shut off? I imagine that the phone hunting for mmWave connectivity when it's not there could eat quite a bit of battery life.
  • Andrei Frumusanu - Friday, April 3, 2020 - link

    I don't even have 5G coverage here so I wouldn't know!

    Yes, 5G can be disabled in the options. I would assume that actually shuts off the extra RF. Similarly, I don't know how the mmWave antenna power management works.
  • eastcoast_pete - Friday, April 3, 2020 - link

    Thanks for the reply! mm 5G coverage is supposedly "available" in some places here in the US, but I don't believe the carriers here have set up anywhere near enough cells for it to be viable. Plus, even if I'd get Gb download rates, they still have caps on their plans, unless one shells out for the premium unlimited ones. And those make the 20 Ultra's price tag look like a bargain (:
