The Snapdragon 865 Performance Preview: Setting the Stage for Flagship Android 2020
by Andrei Frumusanu on December 16, 2019 7:30 AM EST
Posted in: Mobile, Qualcomm, Smartphones, 5G, Cortex A77, Snapdragon 865
Machine Learning Inference Performance
AIMark 3
AIMark makes use of various vendor SDKs to implement the benchmarks. This means that the end results aren’t a proper apples-to-apples comparison; however, it represents an approach that will actually be used by some vendors in their in-house applications, or even by the rare third-party app.
In AIMark 3, the benchmark uses each vendor’s proprietary SDK in order to accelerate the NN workloads most optimally. For Qualcomm’s devices, this seemingly means that the benchmark is also able to take advantage of the new Tensor cores. Here, the performance improvements of the new Snapdragon 865 chip are outstanding, with the chip posting 2-3x the performance of its predecessor.
AIBenchmark 3
AIBenchmark takes a different approach to benchmarking. Here the test uses the hardware-agnostic NNAPI in order to accelerate inferencing, meaning it doesn’t use any proprietary aspects of a given hardware except for the drivers that actually enable the abstraction between software and hardware. This approach is more apples-to-apples, but also means that we can’t do cross-platform comparisons, such as testing iPhones.
We’re publishing one-shot inference times. Unlike sustained-performance inference times, these figures include more timing overhead from the software stack, covering everything from initialising the test to actually executing the computation.
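To make the distinction concrete, here is a minimal Python sketch of how a one-shot measurement differs from a sustained one. It is not how AIBenchmark itself measures anything: `make_model`, the 50 ms setup delay, and the toy workload are all hypothetical stand-ins chosen only to illustrate where the initialisation overhead ends up.

```python
import time


def make_model():
    """Hypothetical stand-in for a network that pays a setup cost on first use."""
    state = {"initialised": False}

    def infer(x):
        if not state["initialised"]:
            # Simulated one-time cost (driver start-up, model compilation, allocation).
            time.sleep(0.05)
            state["initialised"] = True
        # Simulated steady-state compute.
        return sum(v * v for v in x)

    return infer


def one_shot_time(x):
    """Time a single inference on a fresh model, setup overhead included."""
    infer = make_model()
    start = time.perf_counter()
    infer(x)
    return time.perf_counter() - start


def sustained_time(x, runs=20):
    """Average time per inference after an untimed warm-up run."""
    infer = make_model()
    infer(x)  # warm-up absorbs the one-time setup cost
    start = time.perf_counter()
    for _ in range(runs):
        infer(x)
    return (time.perf_counter() - start) / runs


data = [float(i) for i in range(1000)]
one_shot = one_shot_time(data)
sustained = sustained_time(data)
print(f"one-shot:  {one_shot * 1e3:.2f} ms")
print(f"sustained: {sustained * 1e3:.2f} ms")
```

In this toy setup the one-shot figure is dominated by the simulated setup delay, while the sustained figure reflects only the repeated compute.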
AIBenchmark 3 - NNAPI CPU
We’re segregating the AIBenchmark scores by execution block, starting off with the regular CPU workloads that simply use TensorFlow libraries and do not attempt to run on specialized hardware blocks.
Starting off with the CPU accelerated benchmarks, we’re seeing some large improvements from the Snapdragon 865. It’s particularly the FP workloads that are seeing big performance increases, and these improvements are likely linked to the microarchitectural improvements of the A77.
AIBenchmark 3 - NNAPI INT8
INT8 workload acceleration in AI Benchmark happens on the HVX cores of the DSP rather than on the Tensor cores, which the benchmark currently doesn’t support. The performance increases here are relatively in line with what we expect from iterative clock frequency increases of the IP block.
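For readers unfamiliar with what an INT8 workload involves, the following Python sketch shows a generic symmetric per-tensor quantization of floating-point values to 8-bit integers. This is an assumption for illustration only: the article does not describe which quantization scheme AI Benchmark or Qualcomm's drivers use, and the `weights` values are made up.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale factor for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.0, 0.25, 0.75]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The rounding error per value is bounded by half the scale factor, which is the precision traded away in exchange for cheaper integer arithmetic.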
AIBenchmark 3 - NNAPI FP16
FP16 acceleration on the Snapdragon 865 through NNAPI is likely facilitated through the GPU, and we’re seeing iterative improvements in the scores. Huawei’s Mate 30 Pro is in the lead in the vast majority of the tests as it’s able to make use of its NPU, which supports FP16 acceleration, and its performance here is quite significantly ahead of the Qualcomm chipsets.
AIBenchmark 3 - NNAPI FP32
Finally, the FP32 test should be accelerated by the GPU. Oddly enough, the QRD865 doesn’t fare as well here as some of the best S855 devices. It’s to be noted that today’s results were based on an early software stack for the S865; it’s possible and even very likely that things will improve over the coming months, and the results will be different on commercial devices.
Overall, AI benchmarks again present us with a conundrum today: the tests need to be continuously developed in order to properly support the hardware. The test currently doesn’t make use of the Tensor cores of the Snapdragon 865, so it’s not able to showcase one of the biggest areas of improvement for the chipset. In that sense, these benchmarks don’t really mean very much, and the true power of the chipset will only be exhibited by first-party applications, such as the camera apps, of the upcoming Snapdragon 865 devices.
178 Comments
eastcoast_pete - Monday, December 16, 2019 - link
Thanks Andrei! Amy chance to post the S855's QRD's figures also? These QRDs are "for example" demo units, and the final commercial handsets are often different (faster). Also, any word from QC on how much AI processing power will be needed to run 5G functionality? Huawei's Kirin 990 5G has twice the AI TOPs than their LTE version, and that seems to be due to their (integrated) 5G modem using about half the AI TOPs when actually working in 5G mode
eastcoast_pete - Monday, December 16, 2019 - link
Any chance, of course. Edit function would be nice.
Andrei Frumusanu - Monday, December 16, 2019 - link
I don't see the point in showing the QRD855 results, there's a large spectrum of S855 device results out there and likely we'll see the same with the S865. The QRD855 and QRD865 aren't exactly apples-to-apples configuration comparisons either so that comparison doesn't add any value.
ChitoManure - Monday, December 16, 2019 - link
Because QRDs from qualcomm might have the simikar cooling system and the OEMs usually have better thermal design which is why they are faster..
Andrei Frumusanu - Monday, December 16, 2019 - link
None of the tests were made under thermal stress scenarios, the cooling isn't a limitation on the QRDs, the performance showcased is the best the chip can achieve.
Kishoreshack - Monday, December 16, 2019 - link
Man the web benchmarks are DISAPPOINTING
feel like buying a S10+ now
Kishoreshack - Monday, December 16, 2019 - link
Just shows how Samsung does the best implementation of Qualcomm Soc's
even last years Samsung 855 devices are able to out perform Snapdragon 865 in many benchmarks
Can't wait for S11 now
Kishoreshack - Monday, December 16, 2019 - link
Anyone even expected Qualcomm beating Apple in performance?
You were dreaming then
don't know whom to blame Arm or Qualcomm
but the Android world is constantly receiving inferior chips
Karmena - Monday, December 16, 2019 - link
IMHO all these SOCs are at the level that average Joe can do with any of these and the device will feel snappy and good. Now it comes down to the OS delivering the performance and features that users crave.
doungmli - Monday, December 16, 2019 - link
the only benchmarks are the web, 3dmark and geekbench for the a13 chip the rest is in favor of the snapdragon. It should perhaps be remembered that this is a soc so cpu + isp + gpu + ... and when adding the snapdragon >>>> A13. just see the AI markers which take into account the entire soc. For gfx bench it would be necessary to explain why so much difference whereas in the other benchmarks GPU there is not this difference but gfx bench is not outdated for more than a year for me it is no longer a reference. For web performance just see the speed tests on youtube to see that this score is not justified