Chrome - BBC Frontpage

To verify the findings of the previous use-case, we take a look at a different web page. This time we load the BBC's mobile front page. It's a medium-sized page of moderate complexity, but one that is still representative of a large amount of web content on mobile.
 

The little core data doesn't look much different from what we saw on the AnandTech front page. The little cores see a consistently high load, with a fairly large peak towards the main rendering phase of the page.

Chrome again seems to cause the system to spawn more threads than the little cluster can accommodate.

The big cores also behave similarly to what we saw on the AnandTech front page. There's a consistent load from a single large thread, with some bursts where up to all 4 CPUs are doing some processing.

The total run-queue depths for the system again confirm what we saw in the previous scenario: Chrome is able to consistently make use of a large number of threads, so that we see use of up to 6 CPUs with small bursts of up to almost 9 threads.

What is interesting about the Chrome results is that most of the threads are placed on the little cores, meaning we have a large number of small threads. Because the migration mechanisms of HMP don't migrate threads below a certain performance threshold, this causes some oversaturation of the little CPU cluster.
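The effect described above can be illustrated with a minimal sketch of threshold-based migration. This is a toy model and not the actual scheduler code: the 0-1023 load scale, the `UP_THRESHOLD` and `DOWN_THRESHOLD` values, and the `Task`, `place` and `little_oversubscription` names are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical thresholds on an assumed 0-1023 load scale.
# Real values are tunable and device-specific.
UP_THRESHOLD = 700     # above this, a little-cluster task moves to a big core
DOWN_THRESHOLD = 256   # below this, a big-cluster task moves back to a little core


@dataclass
class Task:
    name: str
    load: int               # tracked load of the task (assumed 0..1023)
    cluster: str = "little"


def place(task: Task) -> str:
    """Apply threshold migration with hysteresis and return the task's cluster."""
    if task.cluster == "little" and task.load > UP_THRESHOLD:
        task.cluster = "big"
    elif task.cluster == "big" and task.load < DOWN_THRESHOLD:
        task.cluster = "little"
    return task.cluster


def little_oversubscription(tasks: list[Task], little_cpus: int = 4) -> int:
    """Count runnable tasks on the little cluster beyond its number of CPUs."""
    for task in tasks:
        place(task)
    on_little = sum(1 for task in tasks if task.cluster == "little")
    return max(0, on_little - little_cpus)


# Six small threads and one heavy thread, all starting on the little cluster.
tasks = [Task(f"small{i}", load=300) for i in range(6)] + [Task("heavy", load=900)]
print(little_oversubscription(tasks))  # small threads stay put; only "heavy" migrates
```

In this toy model the six small threads never cross the up-migration threshold, so they all remain on a 4-CPU little cluster while only the heavy thread moves to a big core, which mirrors the oversaturation seen in the run-queue data.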

This has an interesting implication for non-heterogeneous 8-core designs such as those seen from MediaTek. In such a scenario, having 8 little cores at more or less the same performance capacity would indeed make quite some sense. It's again MediaTek's X20 design, with 2 clusters of 4 cores and a cluster of 2 high-performance cores, which comes to mind when looking at these results, as I can't help but think that this is a use-case which would make perfect sense for that SoC.

157 Comments

  • V900 - Tuesday, September 1, 2015

    A question that DOESNT get answered however is: Does the fact that all cores get used, contribute to a better/faster user experience?

    If there was only 2 or 4 cores present, would they complete the tasks just as fast?

    In other words, Is there a gain from all 8 cores being used, or does all 8 cores get used just because they are there? (By low priority threads, which in a quad/dual core CPU would have been done sequentially, in just as fast a time?)

    Since Apples dual core iPhones, always outperform Android quad and octa core phones, I would think that the latter is closer to the truth.

    Read up on what some of the other posters here have written about low priority threads, and Microsofts research on the matter.

    And ignore anyone who tries to over-interpret this article!
  • frenchy_2001 - Wednesday, September 2, 2015

    > Does the fact that all cores get used, contribute to a better/faster user experience?
    It does not, as long as your CPU can process all the threads in a timely manner.
    It contributes to a lower power usage though, as power grows following the square of Voltage and voltage usage grows with frequency, while parallelization grows linearly.
    Basically, if 2 A53 @ 800MHz can do the same amount of work as 1 A53 @ 1.6GHz, the 2 slower cores will do it for less power (refer to the perf/W curve on the conclusion page).

    This was the goal of ARM when they designed big.LITTLE and this article shows that the S6 uses it correctly (by using small cores predominantly and keeping frequencies low). It is one more trick to deliver strong immediate computation, good perfs/W at moderate usage and great idle power while idling. I would not extrapolate beyond that as too many variables are in play (kernel/governor/HW/apps...)
  • name99 - Tuesday, September 1, 2015

    "When I started out this piece the goals I set out to reach was to either confirm or debunk on how useful homogeneous 8-core designs would be in the real world"

    You mean heterogeneous above rather than homogeneous.
  • Andrei Frumusanu - Tuesday, September 1, 2015

    No, I meant specifically 8x A53 SoCs.
  • lilmoe - Tuesday, September 1, 2015

    I've been waiting for this piece since the GS6 came out. I can't even imagine the amount of time and work you've put into it. THANK YOU Andre.

    Now I hope we can put to rest the argument that Android would do better with only 2 high performance cores VS more core configurations. Google has been promising this for years and they're finally _starting_ to deliver. They're not there yet, lots of work needs to be done to exterminate all that ridiculous overhead (evident in the charts).

    I'm also glad that it's finally evident that Chrome on Android VS SBrowser has significant impact on performance and battery life. It should only be fair to ask that Anandtech starts using the built-in browser for each respective device when benchmarking.

    We're _just_ reaping the benefits of properly implemented big.LITTLE configurations, in both hardware and software, after 2 years of waiting. What's funny is that both Qualcomm and Samsung are moving away from these implementations back to Quad-core CPUs with Kryo and Mongoose respectively... I personally hope we get the best of both worlds in the form of Mediatek's 10 core big.LITTLE implementation, except the 2 high perf cores being either Kryo or Mongoose for their relatively insane single-threaded performance.
  • V900 - Tuesday, September 1, 2015

    You're coming up with conclusions that aren't supported by the article.

    Can we put 2 vs 8 core argument to rest? Nope.

    This test only shows, that when there are 4 (or 8) cores available, Android occasionally uses them all.

    It says NOTHING about whether an 8 core CPU would be faster than with 2 wide cores. (Samsung and Qualcomm are moving towards Apple-like wide dual core designs. I doubt they'd do that, if 8 cores were really always faster/economical than 2)

    In fact, the article doesn't really tell us whether 8 small cores are faster/more economical than 2 or 4 small cores. Keep in mind what people have brought up about the priority of threads. Some of the threads you see occupying all 8 cores, are low priority threads, that could just as quickly be completed in sequence if there were only 2 or 4 low power cores available.
  • lilmoe - Tuesday, September 1, 2015

    "Can we put 2 vs 8 core argument to rest? Nope."

    Are you sure we're on the same page here? We're talking efficiency, right?

    "This test only shows, that when there are 4 (or 8) cores available, Android occasionally uses them all."

    No it doesn't. Android is capable of utilizing all cores, yes, but it only allocates threads to the amount of cores *needed*, which is much, MUCH more power efficient than elevating a smaller number of high performance cores to their max performance/freq states.

    "It says NOTHING about whether an 8 core CPU would be faster than with 2 wide cores. (Samsung and Qualcomm are moving towards Apple-like wide dual core designs. I doubt they'd do that, if 8 cores were really always faster/economical than 2)".

    True, it doesn't show direct comparisons with modern wide cores running Android, because there isn't any. But even taking MT overhead and core switching overhead into account, I believe it's safe to say that things should be comparable (since the small cluster is rarely saturated), except (again) much more efficient. And no, QC and Samsung aren't moving to any dual core configuration; they're both moving to Quad-core configuration (ie: the most optimal for Android), which further proves the argument that more cores running at a lower frequency (and lower power draw) is more efficient than having fewer cores running at their relative max for MOBILE DEVICES.

    The problem isn't the premise, it's the means. ARM's reference core designs aren't optimal in comparison to custom designs neither in performance nor in power consumption. Theoretically speaking, if Qualcomm or Samsung use little versions of their custom cores in 8-core configurations, or 4x4 big.LITTLE, we might theoretically see tremendous power savings in comparison. Again, this applies to Android based on this article.

    "In fact, the article doesn't really tell us whether 8 small cores are faster/more economical than 2 or 4 small cores."

    This article STRICTLY talks about the impact of 4x4 core big.LITTLE configuration has on ANDROID if you want BOTH performance and maximum efficiency. It clearly displays how Android (and its apps) is capable of dividing the load into multiple threads; therefore having more cores has its benefits. Also, you can clearly see that there is noticeable overhead here and there, and throwing more cores at the problem, running at lower frequency, is a better brute force solution to, AGAIN, maximize efficiency while maintaining high performance WHEN NEEDED, which is usually in relatively short bursts. Android still has ways with optimization, but its current incarnation proves that more cores are more efficient.

    You are making the wrong comparisons here. What you should be asking for is comparisons between a quad-core A57 chip, VS an 8-core A53 chip, VS a 4x4 A57/A53 big.LITTLE chip. That, and only that would be a valid apples-to-apples comparison, which in this case is only valid when tested with Android. Unfortunately, good luck finding these chips from the same manufacturer built on the same process...
  • lilmoe - Tuesday, September 1, 2015

    "they're both moving to Quad-core configuration (ie: the most optimal for Android)"

    In regard to large/wide core non-big.LITTLE designs that is.
  • lefty2 - Tuesday, September 1, 2015

    You are right. The article is deeply flawed. Nowhere is there any evidence of a 4 core device rendering a web page faster than a 2 core.
  • lopri - Wednesday, September 2, 2015

    Qualcomm's next custom core is 2+2 but Samsung's is 4+4. But I agree with the gist of your argument. Different core counts, but they all aim for the same goal - performance and efficiency.