The Mobile CPU Core-Count Debate: Analyzing The Real World

Author: Andrei Frumusanu

by Andrei Frumusanu on September 1, 2015 8:00 AM EST

Posted in: Smartphones, CPUs, Mobile, SoCs

Chrome - AnandTech Frontpage

Chrome is the de facto browser on many Android devices. We again use it to load the AnandTech frontpage and analyse the CPU's behaviour.

Starting off with the little cores:

Right off the bat we see quite a large difference in the power state distribution graphs. Chrome seems to place a much higher load on the little cores than S-Browser does. Looking at the run-queue chart, we see that all of the little cores are indeed at almost full capacity for a large amount of time.
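A frequency distribution of this kind can be approximated from two snapshots of per-state time counters. The following is only a minimal sketch: it assumes input in the "frequency ticks" line format of Linux's cpufreq `time_in_state` statistics, covers frequency states only (not idle states), and uses made-up sample numbers. It is not necessarily how the graphs in this article were produced.

```python
def parse_time_in_state(text):
    """Parse lines of 'freq_khz ticks' into a {freq_khz: ticks} dict."""
    table = {}
    for line in text.strip().splitlines():
        freq, ticks = line.split()
        table[int(freq)] = int(ticks)
    return table


def residency(before, after):
    """Return {freq_khz: share of elapsed time} between two snapshots."""
    delta = {f: after[f] - before.get(f, 0) for f in after}
    total = sum(delta.values())
    if total == 0:
        return {f: 0.0 for f in delta}
    return {f: d / total for f, d in delta.items()}


# Hypothetical snapshots taken before and after a page load.
BEFORE = "400000 100\n800000 50\n1300000 10"
AFTER = "400000 150\n800000 75\n1300000 35"

shares = residency(parse_time_in_state(BEFORE), parse_time_in_state(AFTER))
for freq_khz, share in sorted(shares.items()):
    print(f"{freq_khz / 1e6:.1f} GHz: {share:.0%}")
```

Working on the difference between two snapshots, rather than on the raw counters, limits the result to the interval of interest instead of the time since boot.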

What stands out, though, is a very large peak around the 4s mark. Here the little cores' run-queue depth peaks at almost 7 threads, which is quite unexpected. This burst seems to overload the little cluster's capacity. The frequency also peaks at 1.3GHz at this point. The reason we don't see it go higher is probably that the threads are big enough to be picked up by the scheduler and migrated over to the big cluster at that point.

The big cores also see a fair amount of load. As with S-Browser, we have one very large thread that puts a consistent load on one CPU. Curiously enough, we also see significant activity on up to two other big cores. Again, in terms of burst loads, we see up to three big CPUs being used concurrently.

The total run-queue depth for the system looks very different for Chrome. We see consistent use of 4-5 cores and a large burst of up to 8 threads. This is a very surprising finding, and it has an impact on the way we perceive Chrome's core count usage.
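As a rough illustration of how a total run-queue depth can be derived, the sketch below sums per-CPU run-queue samples across both clusters and reports the peak and the share of samples at or above a threshold. The function names and sample values are hypothetical and are not taken from the article's measurements.

```python
def total_run_queue(little, big):
    """Sum per-CPU run-queue depths across both clusters for each sample."""
    return [sum(l) + sum(b) for l, b in zip(little, big)]


def summarize(depths, threshold):
    """Return (peak depth, share of samples with depth >= threshold)."""
    if not depths:
        return 0.0, 0.0
    peak = max(depths)
    share = sum(1 for d in depths if d >= threshold) / len(depths)
    return peak, share


# Hypothetical samples: one row per time step, one value per CPU.
little = [[1, 1, 1, 1], [2, 2, 2, 1], [1, 0, 1, 1]]
big = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0]]

depths = total_run_queue(little, big)
peak, share = summarize(depths, threshold=4)
print(f"depths={depths} peak={peak} share_at_or_above_4={share:.0%}")
```

A depth above the number of cores in a cluster means threads are waiting to run, which is the situation described for the burst on the little cluster.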

157 Comments

yankeeDDL - Tuesday, September 1, 2015 - link
Just wanted to say that it's a great article. Well done and very interesting: the use of 4+4 cores on a mobile platform, while on a PC we still have plenty of 2-core CPUs, seemed quite ridiculous. But no, clearly, it makes sense.
Tolwyns - Tuesday, September 1, 2015 - link
Very interesting article. These tests were done on Android 5, I take it. I know that this analysis is geared toward current hardware, but most of the "4 cores are only marketing" discussion was quite a while back, when most devices had some version of Android 4. I wonder if the benefits of more cores did show up then. The second thing I'm interested in is "How much of this is applicable to other SoCs". Not much, I gather. And related to that, "How much of this is limited to Samsung devices", because they made both the CPU and the firmware/software layer of the tested device.
SunLord - Tuesday, September 1, 2015 - link
I'm kinda curious how an 8 core version of the x20 with 2 lower power, 4 mid power and 2 high power cores would perform
Shadowmaster625 - Tuesday, September 1, 2015 - link
It is kind of a misleading analysis. One single Haswell core could juggle all of these processes and still have plenty of time to sleep. So you're not really telling us anything here. Is a wider, fatter core better than all these narrow, underpowered cores? Given the performance and power consumption of the Apple SoCs, I would still have to say yes.
IanHagen - Tuesday, September 1, 2015 - link
This! When developing for iOS I usually have to spawn several threads (queues in Apple's world) for things that would otherwise block the main queue, which would cause the UI to "freeze", and the dual core SoC inside the devices I'm targeting is munching my threads absolutely fine. Just by saying that the several extra cores found in Android phones aren't sleeping, you're not coming to any definitive conclusion about any clear advantage of having them.
nightbringer57 - Tuesday, September 1, 2015 - link
The thing is that when you have 4 threads, 4 cores can potentially do the job more efficiently with performance equal to a single core with 4 times the execution speed.
nightbringer57 - Tuesday, September 1, 2015 - link
*by efficiently, I mean, using less power*
metafor - Tuesday, September 1, 2015 - link
Potentially, but not necessarily. Threading and thread migration aren't free. It depends on how much performance you really need. The A57(R3), for instance, at very low frequencies is actually slightly more power efficient than the A53 at its peak frequency (surprising, I know).

If you have 4 threads that need absolutely-bare-minimum performance that a min-frequency single-core could handle, waking up 4 cores (even if they're smaller) and loading the code/data into the caches of each of those cores isn't necessarily a clear win. Especially if they share the same code.
lilmoe - Tuesday, September 1, 2015 - link
"The A57(R3), for instance, at very low frequencies is actually slightly more power efficient than the A53 at its peak frequency (surprising, I know)."

Cool story. Except that, in most of the smaller multithreaded workload cases, the little cores usually aren't near their saturation levels. Also, in most cases, when they _do_ get saturated, the workload is transferred and dealt with by a big core or two in short bursts.

Even if it isn't a "clear win", in *some* workloads mind you, saying that there isn't any apparent merit in these configurations is really irresponsible.
metafor - Tuesday, September 1, 2015 - link
I don't think I said there's no merit to such configurations. I simply said parallelizing a workload isn't always a clear win over using a single core. It depends on the required performance level and the efficiency curve of the small core and big core.

If 4 threads running on 4 small cores at 50% FMax can be done by one big core at FMin without wasting any cycles, the advantage actually goes to the big core configuration. The small core configuration works if there's a thread that requires so little performance, it'd be wasteful to run it on the big core even at FMin.

The conclusion about which is best for a given workload isn't as clear cut as saying "look, the small cores are being used by a lot of threads!". Rather, it comes from measuring power and perf using the two configurations.
