Play Store App Updates

One of the most demanding real-world tasks on Android is installing and updating applications. Mass-updating several applications at once can heat up most devices, as there is heavy computational load not only from the install process itself, but also from ART's ahead-of-time compilation and optimization process.

I recorded a period of roughly 80 seconds during which the Play Store updated several packages, including Android's WebView component and a few other user applications.

The Play Store update process seems to be extremely liberal with spawning threads. The little cores are severely over capacity, as package updates load up to 8-9 threads onto the cluster. The two major peaks towards the end of the log demonstrate this especially well: all CPUs vastly exceed the optimal run-queue depth of 1 when under load, which forces the scheduler to preempt between multiple threads.
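To make the over-capacity argument concrete, here is a minimal Python sketch of how per-CPU run-queue depths could be checked against the optimal value of 1. The sample numbers, the CPU numbering (0-3 little, 4-7 big) and the helper names are assumptions made up for illustration; they are not values from the recorded log.

```python
from typing import Dict, List

OPTIMAL_DEPTH = 1.0  # one runnable thread per CPU, the optimal value cited above


def oversubscribed_cpus(depths: Dict[int, float],
                        optimal: float = OPTIMAL_DEPTH) -> List[int]:
    """Return the ids of CPUs whose run-queue depth exceeds the optimal value."""
    return sorted(cpu for cpu, depth in depths.items() if depth > optimal)


def cluster_depth(depths: Dict[int, float], cpus: List[int]) -> float:
    """Sum the run-queue depth over the CPUs belonging to one cluster."""
    return sum(depths[cpu] for cpu in cpus)


# Hypothetical snapshot of average run-queue depth per CPU (NOT measured data).
sample = {0: 2.5, 1: 2.0, 2: 2.25, 3: 1.75,   # assumed little cluster
          4: 1.5, 5: 1.0, 6: 1.25, 7: 0.5}    # assumed big cluster
LITTLE = [0, 1, 2, 3]
BIG = [4, 5, 6, 7]

little_total = cluster_depth(sample, LITTLE)
big_total = cluster_depth(sample, BIG)
busy = oversubscribed_cpus(sample)

print(f"little cluster depth: {little_total}")
print(f"big cluster depth:    {big_total}")
print(f"CPUs above optimal:   {busy}")
```

Any CPU that shows up in `busy` has more runnable threads than it can execute at once, so the scheduler has to time-slice between them.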

While it is intriguing to see the little cores loaded to such an extent, the big cluster is outright shocking, as it too sees very significant thread placement. This is one of the rare scenarios where having 4 big cores is not enough. As with the little cores, we see peaks where the run-queue depth vastly exceeds the optimal value of 1.

When looking at the total system run-queue depth, things look, for lack of a better description, quite ridiculous. We routinely see peaks where all 8 cores of the system are fully loaded and the depth exceeds 10 threads. It looks like Google is able to massively parallelize the app update process and take advantage of even the highest core-count SoCs. This scenario is all about maximum throughput and performance while utilizing all available hardware resources.


157 Comments


  • modulusshift - Tuesday, September 1, 2015 - link

    Heck yes. And of course I'm interested if anything like this is even remotely possible for Apple hardware, though likely it would require jailbreaks, at least.
  • Andrei Frumusanu - Tuesday, September 1, 2015 - link

    Unfortunately basically none of the metrics measured here would be possible to extract from an iOS device.
  • TylerGrunter - Tuesday, September 1, 2015 - link

    Add one more vote for the follow up with synthetics.
    I would also want to see how the multitasking compares with the Snapdragons as they use the different frequency and voltage planes per core instead of the big.LITTLE.
    But I guess that would be better to see with the SD 820, as the 810 uses big.LITTLE. Consider it a request for when it comes!
  • tuxRoller - Wednesday, September 2, 2015 - link

    Big.little can use multiple planes for either cluster. The issue is purely implementation, tmk.
  • TylerGrunter - Wednesday, September 2, 2015 - link

    big.LITTLE can use different planes for each cluster, but the same plane for all cores within a cluster. Qualcomm SoCs can use different planes for each core; that's the difference, and it's a big one.
    https://www.qualcomm.com/news/onq/2013/10/25/power...
    I'm not sure that can be done in big.LITTLE.
  • tuxRoller - Friday, September 4, 2015 - link

    I remember that but that doesn't say that big.LITTLE can't keep each core on its own power plane just that the implementations haven't.
  • soccerballtux - Tuesday, September 1, 2015 - link

    to balance everything out-- meh, that doesn't interest me. most of the time I'm concerned with battery life and every-day performance. Android isn't a huge gaming device so absolute performance doesn't interest me.
  • porphyr - Tuesday, September 1, 2015 - link

    Please do!
  • ppi - Tuesday, September 1, 2015 - link

    Go ahead. This is one of the most interesting performance digging on this site since the random-write speeds on SSDs.
  • jospoortvliet - Friday, September 4, 2015 - link

    Yes, this was an awesome and interesting read.
