System Performance

As we’ve extensively covered in various articles over the last year and more, CPU performance alone doesn’t signify all that much if the system isn’t able to properly take advantage of it in order to achieve a better user experience.

Software here plays an incredibly important role, and we’ve seen some devices fall flat on their face in this regard. Last year’s Exynos 9810 powered Galaxy devices in particular made it abundantly clear just how much of a user experience difference this can make, vastly overshadowing the actual hardware performance.

For the Galaxy S10, we expand on our initial MWC coverage regarding system performance between the two chipset variants. There we saw clear indications that the Snapdragon 855 variant would again win in this set of benchmarks, and likely end up as the better device in terms of user-experienced performance.

We’ve retested the numbers with our own in-house devices, so let’s take a look at whether and how things have changed:

PCMark Work 2.0 - Web Browsing 2.0

Starting off with the PCMark web-browsing workloads, the Snapdragon 855 variant of the Galaxy S10 leads the benchmark scores. Even though we’re seeing the Qualcomm chipset in a production device now, it still looks to lose out to the Kirin 980 devices. Both Samsung and Huawei now have “performance” modes in the device settings. For Huawei, as per company product management explanations, this is actually the intended full performance of the device while the default mode is slightly more battery friendly for the mass-market users who aren’t as performance sensitive.

It’s not yet exactly clear what Samsung’s performance mode changes compared to the “optimised” default setting. However, I’ve also seen that this default setting can throttle performance, to the point where the Snapdragon’s numbers fall back to the range of the Exynos 9820 figures.

Samsung’s new “CPU Limiter” for the Galaxy S10 now works fundamentally differently from last year; it doesn’t actually limit the peak frequency the CPU can reach, but rather limits the total CPU capacity to 70% in the scheduler, meaning multi-threaded workloads will be throttled. Frankly I find this silly as the device doesn’t inherently become more efficient as it still has access to the higher frequencies, and thus the battery advantage of this function isn’t nearly as great as what users will have experienced with the Galaxy S9.

Correction edit: The CPU Limiter does seem to work similarly to last year’s and does limit CPU frequencies; however, for some reason its activation and effect are seemingly delayed and aren’t immediate following a change in settings.

PCMark Work 2.0 - Video Editing

The video editing workload is still something dominated by Qualcomm. The performance here seems to be dictated by the responsiveness of the little cores. In absolute terms, the differences aren’t big and this part of PCMark isn’t very indicative of overall system performance.

PCMark Work 2.0 - Writing 2.0

The one test that is actually most indicative of experienced device responsiveness is the Writing 2.0 workload. Here the Snapdragon 855 falls in at the top of the performance charts alongside the Kirin 980. The Exynos 9820 Galaxy S10 also does relatively well here, but just falls short of the competition.

What is however most important is that this is now the first Exynos chipset in several years where S.LSI was able to deliver a scheduler and DVFS system that wasn’t absolutely abysmal, as evidenced by the massive performance difference between the Galaxy S10 and past Exynos devices at the very bottom of the chart.

PCMark Work 2.0 - Photo Editing 2.0

The Photo Editing workload again showcases similar improvements on the part of the Exynos. Here Qualcomm still has a very large performance advantage, which might simply be due to better GPU acceleration for the RenderScript workloads that are part of the photo filter editing in this test.

PCMark Work 2.0 - Data Manipulation

Finally, the Data Manipulation workload sees the Snapdragon Galaxy S10 again at the top of the charts, with the Exynos 9820 slightly trailing it.

PCMark Work 2.0 - Performance

Overall, the Snapdragon 855 Galaxy S10 ends up as our top performing phone in PCMark. The Exynos 9820 unit trails behind, but does showcase significant improvements over previous generation Exynos phones.

Web JS Benchmarks

Switching over to web-browser based benchmarks, we’re testing inside a simple OS WebView shell as this is most representative of user-experience when browsing websites through third-party apps.

Speedometer 2.0 - OS WebView

In Speedometer 2.0, both Galaxy S10 devices are neck-and-neck within margins of error. The performance improvements of the Snapdragon 855 over last year’s Snapdragon 845 in this test seem relatively conservative, while the Kirin 980 is clearly able to showcase a bigger performance lead.

The Exynos 9820 showcases dramatic performance gains over last year’s Exynos 9810. The removal of the abysmal hot-plugging mechanism means that the big CPU cores are able to actually run at full speed all while having secondary threads on the other big and middle cores.

WebXPRT 3 - OS WebView

In WebXPRT 3, the Snapdragon 855 manages to match the Kirin 980 while the Exynos 9820 trails slightly behind.

JetStream 2 - OS WebView

Finally, I wanted to add in results of the brand-new JetStream 2 test that was released just two days ago and that we’ll likely be looking to adopt in the future (after more careful analysis). An interesting aspect of this test is that it’s using web workers to parallelise workloads, something we usually never had in the past for browser-based JavaScript benchmarks.

Here the Exynos 9820 is at a bigger disadvantage relative to the Snapdragon 855. The fact that the Kirin 980 scores identically to the Snapdragon means the performance shouldn’t be linked to the lower performance middle cores, but is still strongly dependent on the big core performance. Unfortunately this seems to be another benchmark that doesn’t agree with Samsung’s CPU microarchitecture, with the Exynos S10 falling behind by 22%, even scoring less than the Snapdragon 845 of last year. This workload also doesn’t seem like a constant sustained test so it’s likely that scheduler responsiveness will play a role.

Scheduler & DVFS responsiveness

To investigate scheduler responsiveness and device DVFS settings, we fall back to our scaling performance ramp test. This is a fixed instruction chain workload with fine-grained timing collection after every fixed number of instructions. By taking the time needed for every instruction block, we can derive the frequency of the CPU that the workload is currently scheduled on, giving us detailed frequency information over time.
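As a rough illustration of the conversion described above, here is a minimal Python sketch. It is not our actual test tool: the function names, the block size and the assumption of a fixed instructions-per-cycle (IPC) figure for the instruction chain are all hypothetical and chosen only for the example.

```python
def block_times_to_frequency(timestamps_ns, instructions_per_block, ipc=1.0):
    """Estimate the CPU frequency (in Hz) for each instruction block.

    timestamps_ns: timestamps taken at the start and after every block.
    instructions_per_block: fixed block size (assumed known).
    ipc: assumed constant instructions-per-cycle of the instruction chain.
    """
    freqs = []
    for start, end in zip(timestamps_ns, timestamps_ns[1:]):
        elapsed_ns = end - start
        cycles = instructions_per_block / ipc
        # cycles per nanosecond, scaled to cycles per second
        freqs.append(cycles * 1e9 / elapsed_ns)
    return freqs


def time_to_peak_ms(timestamps_ns, freqs, tolerance=0.02):
    """Time from the first timestamp until a block runs within
    `tolerance` of the highest observed frequency."""
    peak = max(freqs)
    for end, freq in zip(timestamps_ns[1:], freqs):
        if freq >= peak * (1.0 - tolerance):
            return (end - timestamps_ns[0]) / 1e6
    return None


if __name__ == "__main__":
    # Made-up trace: two slow blocks followed by two fast ones.
    trace_ns = [0, 1_000_000, 2_000_000, 2_500_000, 3_000_000]
    freqs = block_times_to_frequency(trace_ns, instructions_per_block=1_000_000)
    print([round(f / 1e6) for f in freqs], "MHz")
    print(time_to_peak_ms(trace_ns, freqs), "ms to peak")
```

The same per-block frequency trace also makes core migrations visible, since a switch between clusters shows up as a step change that doesn't match any frequency step of the previous cluster.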

We’re looking at both Galaxy S10 units as well as the Mate 20 with the Kirin 980. For the new Qualcomm chipset what stands out is that the Galaxy S10 is indeed more aggressive in its scaling than what we’ve seen in January on the Qualcomm reference platform, reaching peak performance in 67ms rather than 95ms. It’s interesting that now even though the Qualcomm chipset has a clear scheduler and DVFS speed advantage over the Kirin 980, it still only is able to match or slightly lose out to the HiSilicon chip.

The results of the Exynos 9820 aren’t nearly as performant as on the Snapdragon Galaxy S10. Here we’re seeing that the chip first scales and resides on the small A55 cores for up to 83ms before it switches over to the Cortex A75 cores. What is extremely weird here is that the workload is staying on the middle cores for a mere 15ms before it switches over to the big M4 cores, finally reaching peak performance at around the 143ms mark, essentially twice as slow as the Snapdragon chipset.

We’ve seen a similar story last year with the Exynos 9810, and the issue is inherently tied to the scheduler. Make no mistake here, the Exynos 9820 behaves infinitely better than the Exynos 9810, however I expected more of the new chip. One of the things I did last year was to introduce two big changes to the scheduler’s load tracking mechanism. The first was halving the PELT half-life from 32ms down to 16ms, which by itself doubled the responsiveness. On top of that I added util_est to increase the performance of short periodic workloads that otherwise would have lost their load utilisation faster.
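To show why halving the half-life doubles responsiveness, here is a small Python sketch of PELT’s geometric decay. It is a continuous-time approximation under my own simplifying assumptions: the real kernel accumulates load in roughly 1ms segments using integer arithmetic, and the function names and the 80% threshold are hypothetical and only there for illustration.

```python
import math


def pelt_utilisation(run_ms, half_life_ms):
    """Tracked utilisation (0..1) of a previously idle task after it has
    been running continuously for run_ms, with the given half-life."""
    return 1.0 - 0.5 ** (run_ms / half_life_ms)


def ms_to_reach(target, half_life_ms):
    """Continuous run time needed before the tracked utilisation
    reaches `target` (0 < target < 1)."""
    return half_life_ms * math.log2(1.0 / (1.0 - target))


if __name__ == "__main__":
    for half_life in (32, 16):
        # 0.8 is an arbitrary example threshold, not a kernel constant.
        print(f"half-life {half_life}ms: "
              f"{ms_to_reach(0.8, half_life):.1f}ms to reach 80% utilisation")
```

Because the time to reach any given utilisation scales linearly with the half-life, going from 32ms to 16ms halves the time a suddenly busy task needs before the scheduler sees it as heavy enough to request a higher frequency or a bigger core.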

For the Exynos 9820, what Samsung did was essentially also adopt these two important changes… and that’s about it. Although the new chipset comes with a new scheduler that does make efforts to schedule things around in an energy efficient way, the core issue of the scheduler’s load tracking being slow hasn’t been further improved beyond what I myself was able to achieve last year. Currently the Exynos 9820 is the only flagship Android SoC that still uses PELT in its scheduler as a load tracking mechanism, as both Qualcomm and HiSilicon are making use of WALT, which is massively more responsive. Google actually wants to drop WALT out of the Android Common Kernel; however, this happened only after PELT was made to be as responsive as WALT. One very important patch to achieve this is unfortunately missing from the Exynos 9820’s BSP, which means that as a delivered product the Exynos Galaxy S10 just has lower responsiveness. In next year’s Exynos we’ll probably finally see things equal out, however this will by then be 4 generations and years of Qualcomm SoCs being superior and giving better user experience simply because they have the better software stack.
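The responsiveness gap between the two tracking approaches can be sketched with a toy comparison. This is heavily simplified and rests on assumptions of mine rather than on any vendor’s kernel: the window-based tracker below only captures the general idea of WALT (busy time per fixed window, taking the larger of the most recent window and the recent average), and the 20ms window, the five-window history and all function names are illustrative.

```python
def pelt_utilisation(run_ms, half_life_ms=32.0):
    """Continuous-time approximation of PELT for a previously idle task
    that has been running non-stop for run_ms."""
    return 1.0 - 0.5 ** (run_ms / half_life_ms)


def windowed_demand(busy_ms_per_window, window_ms=20.0, history=5):
    """Toy window-based tracker: demand is the larger of the most recent
    window's busy fraction and the average over the recent history."""
    recent = busy_ms_per_window[-history:]
    latest = recent[-1] / window_ms
    average = sum(recent) / (len(recent) * window_ms)
    return max(latest, average)


if __name__ == "__main__":
    # A task that was idle for four windows, then ran for one full window.
    print("window-based:", windowed_demand([0, 0, 0, 0, 20]))
    print("PELT (32ms): ", round(pelt_utilisation(20, 32.0), 3))
    print("PELT (16ms): ", round(pelt_utilisation(20, 16.0), 3))
```

In this simplified model a task that suddenly becomes busy is seen as fully loaded after a single window, whereas the decaying average is still well short of that after the same 20ms, even with the shorter half-life.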

That being said, it’s not all doom and gloom for the new Exynos 9820 Galaxy S10. What the scheduler lacks is actually made up for by touch boosting as well as Android framework integrated boosters which are triggered by activity switches. These mechanisms actually help the user experience of the Exynos Galaxy S10 a lot, beyond what we can measure in standalone benchmarks such as PCMark or the web tests. In my subjective experience with both phones, yes, the Snapdragon unit was slightly faster, but if I didn’t have both devices side by side to compare, it would have been something quite hard to notice. What is important is that the experience on the Exynos 9820 is leagues ahead of what we’ve seen in the Exynos 9810 devices and past chipsets. Some of these OS-side boosters seem to have made it into the Android P update of Exynos 9810 devices, so while the kernel has remained largely unchanged, at least this part benefits last year’s devices.

Overall, both the Snapdragon 855 and Exynos 9820 Galaxy S10 deliver some of the very best performance experiences among current Android phones, even if the latter has some rough edges here and there.


229 Comments


  • Irish910 - Saturday, March 30, 2019 - link

    Where is that website link that shows how many MORE times android tracked someone over iOS over the course of a day.....
  • id4andrei - Sunday, March 31, 2019 - link

    Google is pretty transparent about what it gathers and how it is using it. You can download at anytime anything relating to your metdata. You can wipe history of that data. You can disable tracking, personalized advertising and more. These controls are available to you in Android and your google account. This is one thing. Saying that Android has ads and has inherent security issues is another and it's plain bullshit. Saying that the Google store is the wild west is also bullshit.
  • name99 - Friday, March 29, 2019 - link

    "Exynos 9820 is the first tri-CPU cluster/group SoC which actually consists of three different CPU microarchitectures"

    It's not exactly comparable, but the A12/A12X has (at least) three different ARMv8 cores on it, the big cores (Vortex), the small cores (Tempest) and the tiny controller cores (but still ARMv8) Chinook. There are doubtless some number of M0s and suchlike ARMv7 cores also scattered around, but it's interesting that there are three different Apple-designed cores.

    It's also interesting, in terms of area, to scan
    https://en.wikipedia.org/wiki/Apple-designed_proce...

    Notable comparisons, for example, are A9X vs A10 (nominally both 16mm FF, but A10 uses the resources more efficiently) and A10 vs A10X.
  • Andrei Frumusanu - Friday, March 29, 2019 - link

    Oh come on you know better than this. The Chinooks are not part of the CPU cluster and aren't userspace program visible.

    I'm also not counting the Cortex A5's in the Exynos' audio and ISPs or the multitude of Cortex M3s it has.
  • name99 - Friday, March 29, 2019 - link

    Don't want to argue about it; I just thought this was an interesting point :-)

    I'd be just as interested if we learned that QC (or ARM proper) were using ARMv8 devices (ie "interesting" cores, not tiny cores) to handle any of their "controller" type functionality, eg controlling the NPU or GPU.
  • tuxRoller - Friday, March 29, 2019 - link

    I've not finished the article so perhaps you address these issues elsewhere.

    "I wish Samsung at least would mimic the haptics with the fingerprint sensor."

    Coupling haptics and an ultrasonic sensor that also looks beyond surfaces seem like it would be more difficult than just measuring capacitance.

    Also, since the ultrasonic sensor works when the screen is off one should expect the apparent interaction time to go up. Did you happen to time it when the screen was on? The last scenario would be timing its unlock cycle when the phone is in use (any of the password managers should be fine).
  • Andrei Frumusanu - Friday, March 29, 2019 - link

    > Did you happen to time it when the screen was on?

    I didn't do high-speed camera testing of it, but it does look every so slightly faster to respond.
  • tuxRoller - Saturday, March 30, 2019 - link

    That's not too bad then. Most of the reviews I've seen haven't mentioned the new sensor being particularly slow, so, your experience stood out to me.
  • name99 - Friday, March 29, 2019 - link

    I don't know if the iOS 12.2 update had a change to scheduler or JS that has an important effect on web scheduling, but I got 124 for Jetstream 2 on my iPhone XS which is, of course, substantially better than the 98 that Andrei sees.

    FWIW I got a very similar number on my iPad Pro A12X, and on my iMac Pro (Xeon W turbos to 4.2 GHz) I got 142, which is remarkably close to the A12/A12X number...
  • tipoo - Sunday, March 31, 2019 - link

    How nuts is it that for largely ST bound tasks like Javascript, the A12 hangs right in there with the Xeon W, which turbos to 4.2GHz.

    Scale up the core count and memory bandwidth and I don't see why anyone would assume that wouldn't be a very competent chip even for higher end systems, if the software for ARM support was there.
