System Performance

Raw CPU performance doesn’t always translate into better performance in real-world applications. Synthetic benchmarks are constant and long-running by nature, so how quickly performance responds isn’t something they test. Real applications are far more bursty and might not only require high performance, but require it as fast as possible. Here a SoC’s scheduler and DVFS settings can have a large impact on perceived “responsiveness”.

On Android, Futuremark’s PCMark is currently the best tool we have at hand to measure reproducible UI workloads. PCMark on Android makes use of OS APIs and thus should be representative of workloads commonly found in applications.

PCMark Work 2.0 - Web Browsing 2.0

Starting off with the Web Browsing 2.0 benchmark, the Snapdragon Galaxy S9 performs very similarly to the QRD845 which we previewed in February. The performance boost for the Snapdragon 845 is 12% compared to the Pixel 2, which was the best-performing Snapdragon 835 platform.

We’re off to a troubling start for the Exynos 9810 as it posts some disappointing scores in the web test. We first covered the surprisingly bad performance of the new SoC in our MWC hands-on article, and while the final commercial device posted a very slight improvement, it’s still massively underwhelming compared to what we expected.

PCMark Work 2.0 - Video Editing

The video test is in part a test of CPU responsiveness and in part a test of the SoC’s decoder/encoder and OS APIs, with a mix of I/O performance thrown in. Flagship devices in this test have been grouped relatively tightly together for some time now, but the Snapdragon 845 still manages a slight lead over the Exynos 9810 Galaxy S9.

PCMark Work 2.0 - Photo Editing 2.0

The Photo Editing test uses OS APIs to apply common effects on pictures, which in turn make use of RenderScript to enable GPU acceleration. The Qualcomm Galaxy S9 again posts excellent performance here; while not quite matching the promised performance of the QRD845, it still leads all other commercial devices.

Huawei’s Kirin SoCs use the same Mali GPU in an even weaker configuration, yet they showcase over double the performance. The issue here on Samsung SoCs again seems to be software related, as the RenderScript workloads are relatively short, so the GPU never goes past its minimum 260MHz. Huawei optimises the DVFS driver for compute workloads and enacts a boost to ensure better performance.

PCMark Work 2.0 - Writing 2.0

PCMark Work 2.0 - Data Manipulation

The Writing and Data Manipulation tests are also heavily reliant on OS APIs and make use of Android UI rendering. The Snapdragon 845 here sees a larger discrepancy from the QRD platform, especially in the Data Manipulation test. The Writing test especially is, I feel, a workload that is able to accurately represent the “snappiness” of a device and, to date, the relative ranking between devices.

The Exynos Galaxy S9 again does not do well in any of these workloads. I’ll talk about how this translates into real-world performance in a bit.

PCMark Work 2.0 - Performance

The overall PCMark performance of the Galaxy S9 shows a stark contrast between the Snapdragon and Exynos variants. Neither variant shows that great an improvement over its predecessor, but the Exynos variant especially has such meagre improvements that it barely manages to distance itself from last year’s Exynos 8895 Galaxy S8 running Android 8.0 Oreo firmware. I was also disappointed in the PCMark performance of the Snapdragon 845, but at least this variant of the Galaxy S9 again manages to top the rankings of current commercial devices, even if it doesn’t quite match the QRD845 scores.

As an aside, I’m still disappointed that Google restricted Accessibility Events in Android 7 and consequently broke one of my favourite performance measuring tools: DiscoMark, which we last used in our Galaxy Note7 review. DiscoMark was able to empirically measure application startup times without the applications having to be compiled with debugging features on, and its results were the best we ever had in terms of device responsiveness.

Until I find a replacement to empirically measure responsiveness, what I can give is my subjective experience with both variants of the Galaxy S9. The Snapdragon variant of the S9 is extremely fast, up there with Google’s Pixel 2s, and is among the fastest Android devices I’ve used. I’ve got very little to complain about as it performed superbly. The Exynos 9810 was equally an extremely fast device, and make no mistake, it was among the fastest devices out there; however, I found that it didn’t quite match the UI responsiveness of the Snapdragon variant in some scenarios. I also recently upgraded my Exynos 8895 Galaxy S8 to Android 8.0 Oreo and that seemed to have improved responsiveness when switching between apps, which further closed the gap between it and the Exynos 9810 variant of the S9.

Continuing on to our latest set of browser benchmarks, we start with our newly adopted Speedometer 2.0, which is meant to replace past JavaScript micro-benchmarks with more representative JS framework tests showcasing web UI responsiveness. The benchmark is backed by the WebKit team at Apple and fully endorsed by Google’s Chrome team.

Speedometer 2.0 - OS WebView

The Snapdragon Galaxy S9 sees the excellent boost that we saw on Qualcomm’s reference platform and even manages to beat that score, leading all Android devices. Apple’s iPhone 7 and iPhone 8 generations are still ahead; this is due to Apple’s much faster Nitro JS engine, which keeps improving, but also thanks to the A-series chipsets’ excellent CPUs, which do great work in terms of raw performance.

The Exynos 9810 Galaxy S9 posts a massively disappointing score and just barely manages to outperform last year’s Galaxy S8s by a hair’s breadth. Certainly when I first heard of Samsung’s new big CPU I expected to finally have a SoC that could compete with Apple in performance; however, what we’re seeing here is just bad.

I hinted that we would be switching to WebXPRT 3 for our 2018 suite, and Principled Technologies released the new benchmark in late February. This will be the last article including WebXPRT 2015 as we compare the relative ranking of devices in both benchmarks.

WebXPRT 2015 - OS WebView

WebXPRT 3 - OS WebView

The Galaxy S9 in its Snapdragon variant sees excellent gains in both benchmarks and puts a clear generational gap between it and past Android devices, and actually manages to outperform the iPhone 7 in this test.

On the Exynos side of things we again see the trend of disappointing scores, as this variant of the Galaxy S9 cannot distinguish itself from an Exynos 8895 Galaxy S8 running Android 8.0, going as far as scoring less than both S8s in WebXPRT 2015.

Are the benchmarks correct, and why are they like that?

AnandTech is usually data-driven when making claims about performance, so the stark contrast between the Exynos’ synthetic performance and the system benchmarks more than ever calls the validity of both into question. There are two questions to answer here: are the benchmarks still working as intended and representative, and if they are, what happened to the Exynos 9810’s raw performance?

For the first question, I haven’t seen any evidence to contradict the results of our system benchmarks. The Exynos 9810 variant of the Galaxy S9 simply isn’t any faster in most workloads, and in one-on-one comparisons against the Snapdragon 845 variant it was indeed the less consistent one in performance and lost out in terms of responsiveness, even if that difference in absolute terms is very minor.

As to why this is happening on the Exynos, I attribute it to the scheduler and DVFS. Samsung’s new scheduler and DVFS just seem absolutely atrociously tuned for performance. I tested an interactive workload on both Snapdragon and Exynos devices and the contrast couldn’t be any greater. On the Snapdragon 845 Galaxy S9 a steady-state workload thread will seemingly migrate from a full idle state on the little CPUs onto the big CPUs after 65ms. At the migration moment the big CPUs kick into full gear at 2803MHz and will maintain that frequency for as long as the workload demands it.

On the Exynos 9810 Galaxy S9 the same workload will also migrate from the little cores up to the big cores at around the 60ms mark; however, once on the big cores the thread starts at the lowest frequencies of the big cluster, 650-741MHz. It takes the workload a whole 370ms until it reaches the 2314MHz state of the M3 cores, which according to the SPEC benchmarks is around the maximum single-threaded performance of the Snapdragon 845’s performance cores. To reach the full 2703MHz of the M3 cores the workload needs to have been active for a staggering 410ms before the DVFS mechanism starts switching to that frequency state.

UI workloads are highly transactional and very rarely is there something which takes longer than a few frames. The fact that the Exynos 9810 takes over 5x longer to reach the maximum performance state of the Snapdragon 845 basically invalidates absolutely everything about the performance of its cores. For workloads which are shorter than 400ms (which is a *lot* of time in computing terms) the Snapdragon will have already finished the race before the Exynos warms up. Only at higher workload durations would the Exynos then finally catch up. Acceleration versus maximum speed is the key distinction here. This is Samsung’s first EAS-based scheduler for Exynos devices, and the way the schedutil governor is tuned here is a great disappointment for performance.
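The acceleration-versus-top-speed point can be made concrete with a rough model. This is a sketch, not a measurement: it takes the migration and ramp times quoted above, assumes performance scales linearly with clock, assumes the Exynos ramps linearly between the quoted points (the intermediate steps aren't given here), ignores any work done on the little cores before migration, and treats the M3 at 2314MHz as equal to the Snapdragon 845 at 2803MHz, as the SPEC results suggest. All names are made up for illustration.

```python
# Toy model of DVFS ramp-up; every name here is illustrative.
# Relative performance 1.0 = Snapdragon 845 big core at its 2803MHz peak.
# Assumption: M3 at 2314MHz ~= 1.0, and performance scales linearly with clock.

# Breakpoints: (ms since workload start, relative performance on the big cores).
SNAPDRAGON_845 = [(65, 1.0)]
EXYNOS_9810 = [
    (60, 650 / 2314),    # lands on the big cluster at its lowest frequency
    (370, 1.0),          # reaches 2314MHz
    (410, 2703 / 2314),  # reaches 2703MHz
]


def relative_perf(profile, t_ms):
    """Performance at t_ms, interpolating linearly between breakpoints."""
    if t_ms < profile[0][0]:
        return 0.0  # assumption: little-core work before migration is ignored
    for (t0, p0), (t1, p1) in zip(profile, profile[1:]):
        if t0 <= t_ms < t1:
            return p0 + (p1 - p0) * (t_ms - t0) / (t1 - t0)
    return profile[-1][1]


def completion_time_ms(profile, work_units):
    """Wall-clock ms to finish a burst, stepping in 1ms increments.

    One work unit is 1ms of execution at relative performance 1.0.
    """
    done, t_ms = 0.0, 0
    while done < work_units:
        done += relative_perf(profile, t_ms)
        t_ms += 1
    return t_ms


if __name__ == "__main__":
    for work in (50, 100, 200, 2000):
        snap = completion_time_ms(SNAPDRAGON_845, work)
        exy = completion_time_ms(EXYNOS_9810, work)
        print(f"{work:>5} units: Snapdragon {snap}ms, Exynos {exy}ms")
```

Under these assumptions short bursts finish noticeably sooner on the Snapdragon, and the Exynos only pulls ahead once a burst is long enough to amortise its ramp.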

Beyond the Exynos’ overzealous “slow-and-steady” DVFS approach I’m also not happy with how the core count/maximum frequency mechanism is implemented. This is a simple HR timer task that checks the CPU runqueues and, based on a threshold of heavy threads, simply offlines or onlines the CPUs. The fixed interval here is 15ms when in a quad-core state and 30ms in dual- and single-core states. Beyond the fact that the whole offlining/onlining of the cores is extremely inefficient as a scheduler mechanism, it’s worrisome that when the SoC is in dual- or single-core mode and there’s suddenly a burst of threads, the CPUs will be highly underprovisioned in terms of capacity for up to 30ms until the mechanism turns the other cores back on.
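To illustrate why a fixed polling interval hurts bursty loads, here is a minimal sketch of such a timer-driven policy. Only the 15ms and 30ms intervals come from the description above; the thread-count thresholds and every function name are hypothetical, and this is not Samsung's actual code.

```python
# Polling intervals from the text: 15ms with four big cores online,
# 30ms in the dual- and single-core states.
POLL_INTERVAL_MS = {4: 15, 2: 30, 1: 30}


def target_cores(heavy_threads):
    """Hypothetical threshold policy: the smallest supported core count
    (1, 2 or 4) that gives each heavy thread its own core."""
    for cores in (1, 2, 4):
        if heavy_threads <= cores:
            return cores
    return 4


def wait_for_next_check_ms(online_cores, ms_since_last_check):
    """How long a burst arriving now waits before the timer can react."""
    interval = POLL_INTERVAL_MS[online_cores]
    return interval - (ms_since_last_check % interval)


if __name__ == "__main__":
    # Four heavy threads show up right after a check in single-core mode.
    print("cores wanted:", target_cores(4))
    print("ms until more cores come online:", wait_for_next_check_ms(1, 0))
```

The point of the sketch is that the reaction time is bounded by the timer period rather than by the arrival of the work itself, which is where the worst case of up to 30ms comes from.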

The fact that the DVFS mechanism is so slow completely invalidates the benefit of such a mechanism in the first place, as the one use-case where single-threaded performance trumps everything is web browsing and heavy JavaScript workloads, which by nature are short and bursty. Samsung should have simply ignored frequencies above 2.1-2.3GHz (matching the Snapdragon in ST performance), dropped this whole variable maximum frequency mechanism, and instead concentrated on getting performance through scheduler and DVFS response time, something which Qualcomm seems to have mastered. In the end S.LSI’s investment in a performant custom CPU core is sabotaged by very questionable software, and the Exynos’ CPU performance goals go largely unfulfilled in real interactive workloads.

190 Comments

  • id4andrei - Tuesday, March 27, 2018

    All reviewers go gaga for geekbench scores with iphones/ipads as well. In this case the GB scores prove that at least in chip design Samsung has made a huge leap. As the review has outlined, the problem lies with the scheduler and DVFS which Samsung can and should address.

    If "Samdung" is so bad at hardware design, what do you call Apple's high priced iphones of the last 3 years that could not sustain chip performance and had to be throttled so as to not crap out. All initial reviews were glowing but they were all oblivious to the impending throttling.
  • name99 - Tuesday, March 27, 2018

    Dude, you really do yourself no favors by struggling so hard to criticize Apple.
    Apple's throttling has NOTHING to do with the CPU per se (ie the CPU is not generating excessive heat beyond spec, or because it has been running too fast for too long), it has to do with the BATTERY and with a concern that, if CPU performance were to spike the battery could not supply enough current.

    Very different problem, nothing to do with the CPU design. A real problem yes but totally irrelevant to the issues being discussed here.
  • Matt Humrick - Wednesday, March 28, 2018

    Apple's big CPU and GPU are susceptible to thermal throttling when running sustained workloads too.

    Also, having to throttle a processor within a year of sale because its transient current requirements overwhelm the power delivery system is most definitely a design flaw.
  • Icehawk - Friday, April 6, 2018

    My wife’s 6S is still working at 100% after several years. I get the feeling the number of people affected is overblown, as pretty much anything anti-Apple is. I do think Apple needs to look at a better way of dealing with this but it’s also not the armageddon some make it out to be. I am far from an Apple fanboy, though I do like their iOS products, and I am sure someone will make a retort of that nature. I’d say the same thing about the Samsung chip - not great but it is performant; perhaps if we stop thinking each year a new phone should blow us away it would help us be more realistic.
  • Lavkesh - Tuesday, March 27, 2018

    "In this case the GB scores prove that at least in chip design Samsung has made a huge leap" - Please explain huge leap here? The new chip barely outperforms the older SOC.
  • ZolaIII - Monday, March 26, 2018

    I am very disappointed with both SoCs. Qualcomm wasted so much space on a bad L4 cache which only added to latency & generally wasted more. The 30% is enormous even if the new A75 cores are 35% bigger (would be 50% with ARM's L2 reference cache size). I don't know about A630 vs A540 size, but if it grew let's say 10%, the cores & GPU would together account for around 15~20%, leaving L3 & L4 responsible for the rest. It would have been much better had they used it for the GPU, as it could have been 2x the size then. I am also very disappointed with the new cache hierarchy as it turns out to be stupid and a waste of silicon. Seems to me neither SoC used a good scheduler nor scheduling. By the looks of things it seems Samsung used the CAF HPM sched settings for the Snapdragon SoC, a very aggressive patched interactive without any restraints whatsoever & no hotplug whatsoever, which is very far from optimal; the reference QC platform seems to have at least used hotplug (as there is no other way to explain the difference of almost 1W in GPU testing as two vs four A75s active). On the other hand it seems Samsung used a power-aware scheduler instead of HPM & very granular hotplug, producing very bad results, as those are two directly conflicting things & when splashed together can only result in a catastrophic result. I prefer HPM configured to be used with limited task packing and high priority tasks enabled with a significant increase of the time interval for it (so that it can skip the CPU sched limit), and for CPU sched, traditional unpatched interactive with three step load limitations (idle, so that it doesn't jump erratically on any background task; ideal, which is considered the best sustainable leakage for the given lithography; & max sustainable for two cores [only on big cores]; I also use boost enabled & set to the ideal frequency [same as in interactive]). I prefer to use core_ctl hotplug disabled for two little & two big cores so that they never get switched off by it.
    I won't go further into details about it here as it's pointless. I find this idea balanced between always available/needed/total performance, as most of the time two of each kind of core are enough for most tasks & if not it's not a biggie to wait for the other two to kick in. There is a minor drawback in responsiveness on light tasks but it actually works as fast as possible on hard ones flagged as heavy tasks, like for instance Chrome rendering. It's also very beneficial to GPU workloads, where even switching off two little cores and giving even 100~150mV headroom to the GPU means much.

    Sorry for getting a bit deep regarding how a complete scheduling mechanism should be done, but I had an urge to explain how it should be done as it's so terribly done in both cases examined here.
  • tuxRoller - Wednesday, March 28, 2018

    It's not at all clear that the hpm is meaningfully better (much faster or much more power efficient) than a proper schedtune + energy model implementation.
    Scheduling is just ridiculously hard. Adding the constraints of: soft-realtime requirements, minimal battery usage, AND an asmp and you've got the current situation where there's not yet a consensus design. We are, however, starting to see signs of convergence, imho.
  • zeeBomb - Monday, March 26, 2018

    I came...and I finally saw
  • phoenix_rizzen - Monday, March 26, 2018

    Ouch. The Exynos S9 is just barely better than the Exynos S7. :( And that's what Canada's going to get.

    Here's hoping they can improve things via software updates. Was considering the S9 to replace the wife's now dead S6. She's been using my S7 for the past two months while I limp along with a cracked-screen Note4. Other than the camera and screen, this isn't looking like much of an upgrade for being two generations newer.

    Maybe we'll give the ZTE, Huawei, and Xiaomi phones another look ...
  • mlauzon76 - Monday, March 26, 2018

    Samsung Exynos 9810 (Europe & Rest of World)

    Canada is the 'rest of [the] world', but we don't get that version, we never get anything with the Exynos processor, we get the following one:

    Qualcomm Snapdragon 845 (US, China, Japan)
