System Performance

Given the excellent raw CPU performance of the Kirin 980, we should see this largely carry over to actual system performance. System performance is what we call the performance of more realistic everyday workloads, which are mostly transactional in nature, in contrast to the longer, continuous SPEC tests of the previous pages.

The Mate 20’s come with Android 9/P out of the box. In terms of mechanisms that promise to improve system performance, Huawei/HiSilicon employ a custom scheduler for the Kirin 980 that is able to properly deal with the three CPU groups (performance A76s, efficiency A76s, and A55s).

Huawei has been locking things down quite tightly over the past year, so I wasn’t able to extract much information out of the kernel. What I did find is that they appear to be using a scheduler based on Google’s ACK (Android Common Kernel), with custom modifications built on top. Among the key features that look to be enabled in the kernel is WALT (Window-Assisted Load Tracking), which, if I’m not mistaken, would make this the first non-Qualcomm SoC to sport the more responsive load tracking mechanism out of the box.
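To give a rough feel for why window-based tracking is described as more responsive, here is a toy comparison against the mainline kernel's PELT signal. It is a sketch under stated assumptions and not Huawei's or Qualcomm's implementation: the 20 ms window, the five-window history, the "max of latest window and history average" policy, and the 1 ms time step are all illustrative choices.

```python
# Toy model: how fast two load-tracking signals react when a task
# goes from idle to fully busy. All parameters are assumptions.

PELT_HALF_LIFE_MS = 32   # PELT halves the weight of history roughly every 32 ms
WALT_WINDOW_MS = 20      # assumed window length (configurable in real kernels)
WALT_HISTORY = 5         # assumed number of past windows remembered


def pelt_trace(busy_per_ms):
    """Geometrically decayed utilisation, updated once per (approximate) ms."""
    y = 0.5 ** (1.0 / PELT_HALF_LIFE_MS)
    util, trace = 0.0, []
    for busy in busy_per_ms:
        util = util * y + (1.0 - y) * busy
        trace.append(util)
    return trace


def walt_trace(busy_per_ms):
    """Windowed demand: max of the latest window and the history average."""
    history, busy_in_window, demand, trace = [], 0.0, 0.0, []
    for t, busy in enumerate(busy_per_ms, start=1):
        busy_in_window += busy
        if t % WALT_WINDOW_MS == 0:  # a window just closed
            history.append(busy_in_window / WALT_WINDOW_MS)
            history = history[-WALT_HISTORY:]
            busy_in_window = 0.0
            demand = max(history[-1], sum(history) / len(history))
        trace.append(demand)
    return trace


def ms_to_reach(trace, threshold):
    """First millisecond (1-based) at which the signal reaches the threshold."""
    for t, value in enumerate(trace, start=1):
        if value >= threshold:
            return t
    return None


# 100 ms idle, then 200 ms fully busy.
workload = [0.0] * 100 + [1.0] * 200
print("PELT reaches 80% at ms", ms_to_reach(pelt_trace(workload), 0.8))
print("WALT reaches 80% at ms", ms_to_reach(walt_trace(workload), 0.8))
```

In this toy setup the windowed signal reports full demand as soon as the first busy window closes, while the decayed signal needs several half-lives to climb. That is the kind of behaviour that lets a governor raise frequency sooner on bursty, transactional workloads.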

It should be noted that after our recent article addressing less than honest benchmarking behaviour, Huawei has changed the behaviour of its battery power modes. The new “Performance mode” in the battery settings is off by default, which I found quite odd. To get the full performance of the SoC blocks, this setting should be turned on, and all our testing was done with performance mode enabled, something Huawei also recommended.

PCMark Work 2.0 - Web Browsing 2.0

Starting off with the PCMark Web Browsing 2.0 test, we see the Mate 20’s take a considerable lead among all Android devices. This is clearly a considerable generational leap in performance, all the more so compared to the previous generation Kirin 970 devices.

PCMark Work 2.0 - Video Editing

The video editing test has again become somewhat non-representative of performance, as most flagship devices hover within the same score range without much difference between them. I’m still not sure why some devices score ever so slightly higher or lower, but the absolute differences are quite minor.

PCMark Work 2.0 - Writing 2.0

The PCMark Writing test is one of the most representative in terms of putting a number on overall device snappiness and speed. Here the Mate 20’s again take the lead, although the delta to the second-best devices isn’t quite as big as in the web browsing test. The OnePlus 6 and Pixel 3 both seem to have an advantage over other devices because they’re running Android 9/P along with an up-to-date Qualcomm scheduler.

PCMark Work 2.0 - Photo Editing 2.0

The photo editing test consists of small workload bursts – applying photo filters via RenderScript APIs. Here both performance and responsiveness are key. The Mate 20’s again do very well, but they don’t quite match the performance of some of the best Snapdragon 845 devices, which feature the more up-to-date Qualcomm schedulers.

PCMark Work 2.0 - Data Manipulation

The Data Manipulation test is heavily influenced by single-threaded performance. Although the Mate 20’s don’t quite match the Pixel 3 in this particular test, they are still ahead of most other Android phones.

PCMark Work 2.0 - Performance

In the overall PCMark performance test, the Mate 20’s just land ahead of the Pixel 3 and OnePlus 6.

Speedometer 2.0 - OS WebView

In the WebView tests, where we first use Speedometer, a JS framework test, we see the Mate 20’s again take a good leap ahead of the second-best Android platforms based on the Snapdragon 845. Against the previous generation Kirin 970 phones, Huawei was again essentially able to double the performance. It’s still not enough to catch up to Apple, but at least we’re on par with the A10, a result that is also largely in line with the SPEC2006 results.

WebXPRT 3 - OS WebView

WebXPRT is a tad less microarchitecturally demanding than Speedometer, and here performance largely seems to scale with simple overall raw CPU execution power. Again we see a similar positioning as in Speedometer, with the Mate 20’s taking the lead among Android devices.

My experience with the devices pretty much matches the system benchmarks – the Mate 20’s are among the fastest devices on the market. Where the Kirin 980’s performance shines is in more complex and heavier workloads, such as loading a webpage or opening content in heavier apps.

In terms of overall feel and responsiveness, I do feel that the Mate 20’s weren’t quite as fast as the Pixel 3 or OnePlus 6, which feel a bit quicker when opening some applications or new activities. It’s possible that Huawei is lacking some OS framework related boosters that these phones might be using. I plan to reintroduce empirical and controlled app loading time testing in the future, so this might be a topic we’ll revisit soon enough.


141 Comments


  • Javert89 - Friday, November 16, 2018 - link

    Perhaps the most interesting part is missing :( How does the middle cluster at 1.92 GHz fare in performance and power? The same performance as a 2.8 GHz A75 at half the power usage?
  • Andrei Frumusanu - Friday, November 16, 2018 - link

    I couldn't test it without root.
  • ternnence - Friday, November 16, 2018 - link

    try syscall(__NR_sched_setaffinity, pid, sizeof(mask), &mask)
  • ternnence - Friday, November 16, 2018 - link

    FYI, https://stackoverflow.com/questions/7467848/is-it-...
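For readers unfamiliar with the call suggested above, the following is a minimal sketch of CPU pinning on Linux. It uses Python's `os.sched_setaffinity`, the standard-library wrapper around the same kernel facility, instead of the raw C `syscall`. The helper names `pin_to_cpu` and `run_pinned` are hypothetical. Which CPU numbers map to the Kirin 980's middle cluster, and whether the call is permitted without root on the Mate 20, are not established in this thread.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict `pid` (0 = the caller) to one CPU; return the previous mask."""
    previous = os.sched_getaffinity(pid)
    os.sched_setaffinity(pid, {cpu})
    return previous


def run_pinned(cpu, workload):
    """Run `workload()` while pinned to `cpu`, then restore the old mask."""
    previous = pin_to_cpu(cpu)
    try:
        return workload()
    finally:
        os.sched_setaffinity(0, previous)


if __name__ == "__main__":
    # Pick a CPU we are already allowed to run on (Linux only).
    target = min(os.sched_getaffinity(0))
    mask_while_pinned = run_pinned(target, lambda: os.sched_getaffinity(0))
    print("mask while pinned:", mask_while_pinned)
```

A benchmark run this way only executes on the chosen core, which is how a single cluster's performance and power would be isolated if the platform allows it.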
  • pjcamp - Friday, November 16, 2018 - link

    If it weren't for Huawei's aggressively belligerent stance against unlocked bootloaders . . . .
  • name99 - Friday, November 16, 2018 - link

    Andrei, can you please explain something that I just do not understand in any of these phone reviews (Apple or Android).
    The die shots always show 4x 16-wide LPDDR4 PHYs. OK, so 64-bit wide channel to DRAM, seems reasonable.

    Now the fastest normal LPDDR4 is LPDDR4-2133, which in any normal naming scheme would imply 2,133MT/s. So one transaction, 8 bytes wide, gives us guaranteed-not-to-exceed of 17GB/s.
    But of course Huawei's Geekbench4 memory bandwidth is ~22GB/s. Maybe Huawei are using slightly faster LPDDR4-2166 or whatever, but the details don't change --- the only way the numbers work out is if the "maximum bandwidth" of the DRAM is actually around 34 GB/s.

    Which implies that EITHER
    - LPDDR4-2133 does NOT mean 2133MT/sec. (But that's what common sense would suggest, and this recent AnandTech article on DDR5
    https://www.anandtech.com/show/13605/sk-hynix-deve... )

    OR

    - somehow there is 128-bits of width between all the high-end phone SoCs (either 2 independent 64-bit channels [more likely IMHO] or a single 128-bit wide channel).

    Can you clarify?
  • anonomouse - Friday, November 16, 2018 - link

    It’s 2133MHz IO and it’s DDR, so 4266MT/s. Each LPDDR4 channel is 16 bits. Hence the common listing of LPDDR4X-4266.

    Usually these are advertised/listed at the MT/s rate so DDR4-2666 has an IO clock of 1333MHz. Main difference being that DDR4 has a 64 bit channel width.
  • name99 - Friday, November 16, 2018 - link

    But then look at the article I gave, for DDR5
    https://www.anandtech.com/show/13605/sk-hynix-deve...

    This includes sentences like "The new DDR5 chip from SK Hynix supports a 5200 MT/sec/pin data transfer rate, which is 60% faster than the 3200 MT/s rate officially supported by DDR4."
    which strongly implies that a DDR4-3200 is NOT running at 6400 MT/s.

    WTF is going on here? Micron lists their LPDDR4, for example, as LPDDR4-2133, NOT as LPDDR4-4266?
  • N Zaljov - Sunday, November 18, 2018 - link

    I fail to see any issues with the current naming convention, apart from being confusing asf.

    "Micron lists their LPDDR4, for example, as LPDDR4-2133, NOT as LPDDR4-4266" - of course they are: https://www.micron.com/parts/dram/mobile-ddr4-sdra...

    Although there seems to be a typo in the specs of their partlists, which can be confusing, but they are clearly listing their LPDDR4(x) as LPDDR4-4266 (or, typoed, LPDDR4-4166), with an I/O clk of 2133 MHz and an actual memory clockspeed of around 533,3 MHz (on-demand modulation will keep the clock of the memory arrays somewhere between 533,25 and 533,35, depending on the load).
  • Andrei Frumusanu - Friday, November 16, 2018 - link

    The DSU's interface is limited at 2x 128bit per ACE interface to the memory subsystem/interconnect (32B/cycle in each direction) times the frequency of the DSU/L3 of which we aren't certain in the Kirin 980, but let's take the S845 which runs at 1478MHz IIRC: ~47GB/s. Plenty enough. We don't know the interconnect bandwidth from the DSU to the memory controller. The memory controllers themselves internally run at a different frequency (usually half) but what matters is talking about the DRAM speed. The Kirin 980/Mate 20's run on LPDDR4X at 2133MHz, or actually 4266MT/s because it's DDR. That's a peak of 4*16*4266/8=34.12GB/s.

    The actual answer is a lot simpler and more stupid. Geekbench 4's multi-threaded memory test just caps out at 2 threads, so in reality there's only ever two CPUs stressing the memory controller. Beyond this I've been told by some vendors that it doesn't scale in the test itself.

    My conclusion: Ignore all the GB4 memory tests.
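The bandwidth figures traded in this thread can be reproduced with a few lines of arithmetic. The sketch below uses only numbers quoted in the comments (four 16-bit channels, 2133 versus 4266 MT/s, 32 B/cycle at 1478 MHz). The function names are hypothetical and decimal units (1 GB = 1000 MB) are assumed.

```python
def peak_bandwidth_gbs(channels, bits_per_channel, transfers_mts):
    """Theoretical peak DRAM bandwidth in GB/s for a given transfer rate."""
    bytes_per_transfer = channels * bits_per_channel / 8
    return bytes_per_transfer * transfers_mts / 1000  # MB/s -> GB/s


def interface_bandwidth_gbs(bytes_per_cycle, clock_mhz):
    """Peak bandwidth of an on-chip interface moving N bytes per clock."""
    return bytes_per_cycle * clock_mhz / 1000  # MB/s -> GB/s


# Reading "2133" as the transfer rate gives the ~17 GB/s ceiling.
print(f"4 x 16-bit @ 2133 MT/s: {peak_bandwidth_gbs(4, 16, 2133):.2f} GB/s")

# Reading "2133" as the I/O clock of a DDR bus doubles the transfer rate.
print(f"4 x 16-bit @ 4266 MT/s: {peak_bandwidth_gbs(4, 16, 4266):.2f} GB/s")

# DSU interface example from the S845: 32 B/cycle at 1478 MHz.
print(f"32 B/cycle @ 1478 MHz:  {interface_bandwidth_gbs(32, 1478):.2f} GB/s")
```

The ~22GB/s Geekbench figure sits above the first ceiling and below the second, which is consistent with the 4266MT/s reading.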
