Somehow both Anand and I ended up with international versions of Samsung’s Galaxy S 4, equipped with the first generation Exynos 5 Octa (5410) SoC. Anand bought an international model GT-I9500 while I held out for the much cooler SK Telecom Korean model SHV-E300S, including Samsung’s own SS222 LTE modem capable of working on band 17 (AT&T LTE) and Band 2,5 WCDMA in the US. Both of these came from Negri Electronics, a mobile device importer in the US.

For those of you who aren’t familiar with the Exynos 5 Octa in these devices, the SoC integrates four ARM Cortex A15 cores (1.6GHz) and four ARM Cortex A7 cores (1.2GHz) in a big.LITTLE configuration. GPU duties are handled by a PowerVR SGX 544MP3, capable of running at up to 533MHz.

We both had plans to do a deeper dive into the power and performance characteristics of one of the first major smartphone platforms to use ARM’s Cortex A15. As always, the insane pace of mobile got in the way and we both got pulled into other things.

More recently, a post over at Beyond3D from @AndreiF gave us reason to dust off our international SGS4s. Through some good old fashioned benchmarking, the poster alleged that Samsung was only exposing its 533MHz GPU clock to certain benchmarks - all other apps/games were limited to 480MHz. For the past few weeks we’ve been asked by many to look into this. What follows are our findings.

Characterizing GPU Behavior

Samsung awesomely exposes the current GPU clock without requiring root access. Simply run the following command over adb and it’ll return the current GPU frequency in MHz: 

adb shell cat /sys/module/pvrsrvkm/parameters/sgx_gpu_clk

Let’s hope this doesn’t get plugged, because it’s actually an extremely useful level of transparency that I wish more mobile platform vendors would offer. Running that command in a loop we can get real time updates on the GPU frequency while applications run different workloads.
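As a rough sketch of what that loop can look like, here is a small Python wrapper around the same adb command. The function names are made up for illustration, and it assumes adb is on your PATH with a single device attached. The reader is passed in as a parameter so the polling logic can be exercised without a phone.

```python
import subprocess
import time

# Path taken from the adb command above.
GPU_CLK_PATH = "/sys/module/pvrsrvkm/parameters/sgx_gpu_clk"


def read_gpu_clock_mhz():
    """Read the current GPU clock over adb (assumes adb on PATH, one device)."""
    out = subprocess.check_output(
        ["adb", "shell", "cat", GPU_CLK_PATH], text=True
    )
    return int(out.strip())


def poll_gpu_clock(read_fn=read_gpu_clock_mhz, samples=10, interval_s=0.5):
    """Collect `samples` readings, sleeping `interval_s` seconds between them."""
    readings = []
    for i in range(samples):
        readings.append(read_fn())
        if i < samples - 1:
            time.sleep(interval_s)
    return readings
```

With a device attached, something like `max(poll_gpu_clock(samples=20))` while an app is in the foreground gives the peak clock that app was allowed to reach.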

Running any games, even the most demanding titles, returned a GPU frequency of 480MHz - just like @AndreiF alleged. Samsung never publicly claimed max GPU frequencies for the Exynos 5 Octa (our information came from internal sources), so no harm no foul thus far.


Running Epic Citadel - 480 MHz

Firing up GLBenchmark 2.5.1 however triggers a GPU clock not available elsewhere: 532MHz. The same is true for AnTuTu and Quadrant.


Running AnTuTu – 532 MHz SGX Clock

Interestingly enough, GFXBench 2.7.0 (formerly GLBenchmark 2.7.0) is unaffected. We confirmed with Kishonti, the makers of the benchmark, that the low level tests are identical between the two benchmarks. The results of the triangle throughput test offer additional confirmation for the frequency difference:

GT-I9500 Triangle Throughput Performance
Benchmark | GPU Freq | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
GFXBench 2.7.0 (GLBenchmark 2.7.0) | 480MHz | 37.9M Tris/s | 37.9M Tris/s | 37.7M Tris/s | 37.7M Tris/s | 38.3M Tris/s | 37.9M Tris/s
GLBenchmark 2.5.1 | 532MHz | 43.1M Tris/s | 43.2M Tris/s | 42.8M Tris/s | 43.4M Tris/s | 43.4M Tris/s | 43.2M Tris/s
% Increase | 10.8% | | | | | | 13.9%

We should see roughly an 11% increase in performance in GLBenchmark 2.5.1 over GFXBench 2.7.0, and we end up seeing a bit more than that. The reason for the difference? GLBenchmark 2.5.1 appears to be singled out as a benchmark that is allowed to run the GPU at the higher frequency/voltage setting.
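For anyone who wants to check the arithmetic, the short Python snippet below recomputes both percentages from the per-run numbers in the table. It is only a worked calculation; the variable and function names are ours.

```python
from statistics import mean

# Triangle throughput in M Tris/s, copied from the table above.
gfxbench_270 = [37.9, 37.9, 37.7, 37.7, 38.3]      # GPU at 480MHz
glbenchmark_251 = [43.1, 43.2, 42.8, 43.4, 43.4]   # GPU at 532MHz


def percent_increase(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


# Gain expected from the clock difference alone.
clock_gain = percent_increase(532, 480)

# Gain actually measured, using the average of the five runs.
throughput_gain = percent_increase(mean(glbenchmark_251), mean(gfxbench_270))

print(f"clock: +{clock_gain:.1f}%  throughput: +{throughput_gain:.1f}%")
```

The clock ratio works out to about 10.8% and the measured throughput gain to about 13.9%, matching the last row of the table.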

The CPU is Also Affected

The original post on B3D focused on GPU performance, but I was curious to see if CPU performance responded similarly to these benchmarks.

Using System Monitor I kept an eye on CPU frequency while running the same tests. Firing up GLBenchmark 2.5.1 causes a switch to the ARM Cortex A15 cluster, with a default frequency of 1.2GHz. The CPU clocks never drop below that, even when just sitting idle at the menu screen of the benchmark.

 
Left: GLBenchmark 2.5.1 (1.2 GHz), Right: GFXBench 2.7 (250 MHz - 500 MHz)

Run GFXBench 2.7 however and the SoC switches over to the Cortex A7s running at 500MHz (250MHz virtual frequency). It would appear that only GLB2.5.1 is allowed to run in this higher performance mode.

A quick check across AnTuTu, Linpack, Benchmark Pi, and Quadrant reveals the same behavior. The CPU governor is fixed at a certain point when any of those benchmarks is launched.


Linpack for Android: Exynos 5 Octa all cores 1.6 GHz (left), Snapdragon 600 all cores 1.9 GHz (right)

Interestingly enough, the same behavior (on the CPU side) can be found on Qualcomm versions of the Galaxy S 4 as well. In these select benchmarks, the CPU is set to the maximum CPU frequency available at app launch and stays there for the duration. All cores are plugged in as well, regardless of load, as soon as the application starts.

Note that the CPU behavior is different from what we saw on the GPU side however. These CPU frequencies are available for all apps to use, they are simply forced to maximum (and in the case of Snapdragon, all cores are plugged in) in the case of these benchmarks. The 532MHz max GPU frequency on the other hand is only available to these specific benchmarks.
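We used System Monitor for the CPU observations, but the same information can usually be pulled over adb. The sketch below assumes the device exposes the standard Linux cpufreq files (/sys/devices/system/cpu/online and a per-core scaling_cur_freq, reported in kHz); not every kernel does, and the helper names are purely illustrative.

```python
import subprocess


def adb_cat(path):
    """Return the contents of a file on the device (assumes adb on PATH)."""
    return subprocess.check_output(["adb", "shell", "cat", path], text=True)


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3' or '0,2-3' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpu_snapshot(read_file=adb_cat):
    """Map each online core to its current frequency in MHz."""
    online = parse_cpu_list(read_file("/sys/devices/system/cpu/online"))
    freqs = {}
    for cpu in online:
        path = f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_cur_freq"
        freqs[cpu] = int(read_file(path).strip()) // 1000  # kHz -> MHz
    return freqs
```

Taking a snapshot while one of the affected benchmarks sits idle at its menu, and another in an unaffected app, should show both effects described above: how many cores are plugged in and where their clocks are pinned.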

Digging Deeper

At this point the benchmarks allowed to run at higher GPU frequencies would seem arbitrary. AnTuTu, GLBenchmark 2.5.1 and Quadrant get fixed CPU frequencies and a 532MHz max GPU clock, while GFXBench 2.7 and Epic Citadel don’t. Poking around I came across the application changing the DVFS behavior to allow these frequency changes – TwDVFSApp.apk. Opening the file in a hex editor and looking at strings inside (or just running strings on the .odex file) pointed at what appeared to be hard coded profiles/exceptions for certain applications. The string "BenchmarkBooster" is a particularly telling one:

You can see specific Android java naming conventions immediately in the highlighted section. Quadrant standard, advanced, and professional, linpack (free, not paid), Benchmark Pi, and AnTuTu are all called out specifically. Nothing for GLBenchmark 2.5.1 though, despite its similar behavior. 
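If you don't have the strings utility handy, the same kind of search is easy to approximate. The Python sketch below pulls runs of printable ASCII out of a binary blob and filters them by keyword. It is a simplified stand-in for strings rather than a reimplementation, the function names are ours, and the file name in the usage note is a placeholder.

```python
import re


def extract_strings(data, min_len=6):
    """Return runs of printable ASCII at least `min_len` bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def find_matches(strings, needles):
    """Keep strings containing any of `needles`, ignoring case."""
    lowered = [n.lower() for n in needles]
    return [s for s in strings if any(n in s.lower() for n in lowered)]
```

Usage would be along the lines of `find_matches(extract_strings(open("TwDVFSApp.odex", "rb").read()), ["boost", "benchmark"])`. Because the match is purely byte based, a printable length byte stored immediately in front of a string can show up glued to its first character.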

We can also see the files that get touched by TwDVFSApp while it is running: 

/sys/class/devfreq/exynos5-busfreq-int/min_freq
/sys/class/devfreq/exynos5-busfreq-mif/min_freq
/sys/class/thermal/thermal_zone0/boost_mode
/sys/devices/platform/pvrsrvkm.0/sgx_dvfs_min_lock

When the TwDVFSApp application grants special DVFS status to an application, the boost_mode file goes from value 0 to 1, making it easy to check if an affected application is running. For example, launching and closing Benchmark Pi:

shell@android:/sys/class/thermal/thermal_zone0 $ cat boost_mode                
1
shell@android:/sys/class/thermal/thermal_zone0 $ cat boost_mode                
0
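The same check can be scripted. The sketch below reads boost_mode over adb and, given a series of polled values, reports where the flag flipped. It assumes adb is on your PATH, and the helper names are invented for this example.

```python
import subprocess

BOOST_MODE_PATH = "/sys/class/thermal/thermal_zone0/boost_mode"


def read_boost_mode():
    """Return the current boost_mode value (0 or 1) read over adb."""
    out = subprocess.check_output(
        ["adb", "shell", "cat", BOOST_MODE_PATH], text=True
    )
    return int(out.strip())


def boost_transitions(values):
    """Return (index, old, new) for every change in a list of polled values."""
    changes = []
    for i in range(1, len(values)):
        if values[i] != values[i - 1]:
            changes.append((i, values[i - 1], values[i]))
    return changes
```

Polling read_boost_mode() once a second while launching and closing apps, then passing the collected values to boost_transitions(), shows which launches flipped the flag.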

There are strings for Fusion3 (the Snapdragon 600 + MDM9x15 combo) and Adonis (the codename for Exynos 5 Octa): 

doBoostAll
doBoostForAdonis
doBoostForAdonis::
doBoostForFusion3
doBoostForFusion3::

What's even more interesting is that TwDVFSApp appears to have an architecture that lets other benchmark applications, ones not specifically in the whitelist, request BenchmarkBoost mode via an intent, since the application is also a broadcast receiver.

Lcom/sec/android/app/twdvfs/TwDVFSBroadcastReceiver$1;
Lcom/sec/android/app/twdvfs/TwDVFSBroadcastReceiver$2;
Lcom/sec/android/app/twdvfs/TwDVFSBroadcastReceiver$IntentInfo;
Lcom/sec/android/app/twdvfs/TwDVFSBroadcastReceiver;
boostIntent
com.sec.android.intent.action.DVFS_FG_PROCESS_CHANGED
com.sec.android.intent.action.SSRM_REQUEST

So we not only can see the behavior and empirically test to see what applications are affected, but also have what appears to be the whitelist and how the TwDVFSApp application grants special DVFS to certain applications.

Why this Matters & What’s Next

None of this ultimately impacts us. We don’t use AnTuTu, BenchmarkPi or Quadrant, and moved off of GLBenchmark 2.5.1 as soon as 2.7 was available (we dropped Linpack a while ago). The rest of our suite isn’t impacted by the aggressive CPU governor and GPU frequency optimizations on the Exynos 5 Octa based SGS4s. What this does mean however is that you should be careful about comparing Exynos 5 Octa based Galaxy S 4s using any of the affected benchmarks to other devices and drawing conclusions based on that. This seems to be purely an optimization to produce repeatable (and high) results in CPU tests, and deliver the highest possible GPU performance benchmarks. 

We’ve said for years now that the mobile revolution has mirrored, and will continue to mirror, the PC industry, and thus it’s no surprise to see optimizations like this employed. Just because we’ve seen things like this happen in the past however doesn’t mean they should happen now.

It's interesting that this is sort of the reverse of what we saw GPU vendors do in FurMark. For those of you who aren't familiar, FurMark is a stress testing tool that tries to get your platform to draw as much power as possible. In order to avoid creating a situation where thermals were higher than they'd be while playing a normal game (and to avoid damaging graphics cards without thermal protection), we saw GPU vendors limit the clock frequency of their GPUs when they detected these power-virus style of apps. In a mobile device I'd expect even greater sensitivity to something like this. I suspect we'll eventually get to that point. I'd also add that just like we've seen this sort of thing many times in the PC space, the same is likely true for mobile. The difficulty is in uncovering when something strange is going on.

What Samsung needs to do going forward is either open up these settings for all users/applications (e.g. offer a configurable setting that fixes the CPU governor in a high performance mode, and unlocks the 532MHz GPU frequency) or remove the optimization altogether. The risk of doing nothing is that we end up in an arms race between all of the SoC and device makers where not-insignificant amounts of time and engineering effort are spent on gaming the benchmarks rather than improving user experience. Optimizing for user experience is all that’s necessary; good benchmarks benefit indirectly, and those that don’t will eventually become irrelevant.

108 Comments

  • Excors - Tuesday, July 30, 2013

    "Nothing for GLBenchmark 2.5.1 though, despite its similar behavior."

    TwDVFSApp has some hard-coded package names, but it also gets a list from android.os.DVFSHelper.PACKAGES_FOR_BOOST_ALL_ADJUSTMENT. That's set in /system/framework/framework2.odex and contains:

    com.aurorasoftworks.quadrant.ui.standard
    com.aurorasoftworks.quadrant.ui.advanced
    com.aurorasoftworks.quadrant.ui.professional
    com.redlicense.benchmark.sqlite
    com.antutu.ABenchMark
    com.greenecomputing.linpack
    com.glbenchmark.glbenchmark25
    com.glbenchmark.glbenchmark21
    ca.primatelabs.geekbench2
    com.eembc.coremark
    com.flexycore.caffeinemark
    eu.chainfire.cfbench
    gr.androiddev.BenchmarkPi
    com.smartbench.twelve
    com.passmark.pt_mobile

    In my opinion, the fundamental problem is that almost everybody who reviews smartphones has a very juvenile approach - they want to run a benchmark and get a single number, because that lets them say "phone X has a bigger number than phone Y, so it is better" without engaging their brains. (And those are the more advanced reviews, that go beyond simply counting cores.)

    But phones aren't designed for a single performance number - they're designed for power *and* performance, and it's impossible to separate one from the other.

    I think a good review would analyse two things:

    * The power/performance curve. If a phone has well-designed hardware and software, it will give better performance at the same power as another phone, across the whole curve. You could make an attempt at comparison by tweaking the kernel's cpufreq settings, to force it to run at a particular speed, then measuring benchmark scores and battery current. Do the same for GPU speeds if possible. Repeat at every available speed and draw the curve. That's hardly ideal (you probably need to compensate for display brightness etc), but it's a start, and it'd allow more meaningful comparisons than a single number per benchmark per phone.

    * The tuning. A good phone will switch sensibly to higher-performance modes to give a responsive user experience. A bad phone will use too much power in mostly-non-interactive scenarios (on the home screen, recording a video, playing Angry Birds, etc), or it will take too long to ramp up when the user interacts (e.g. taking a photo or switching between apps will be slow and jerky). I think this is especially a concern with big.LITTLE - switching on the A15 cores is a relatively big commitment in time and power, so the software will have to be very careful to use them at the right times. It would be great if someone could at least attempt to measure this kind of thing quantifiably.

    Looking at those two aspects would help understand how well a phone has chosen its power/performance tradeoffs.

    But while reviewers keep comparing SunSpider and AnTuTu scores, it's no surprise that phone developers will spend a lot of effort on maximising those scores - sometimes subtly (like tweaking GPU driver optimisations for a benchmark's unique usage patterns), sometimes blatantly (like Samsung here) - because that gets good reviews and sells more phones. There's little point in kindly asking Samsung to focus less on benchmarks and accept worse reviews and lower sales and less money - if you want them to focus more on the user experience, that's what has to be measured in reviews.
  • yhselp - Tuesday, July 30, 2013

    It's hard not to attack your comment since all you say is generally true. However, it seems that you've never read an AnandTech smartphone review -- the whole of it, or you've somehow missed what it says (and many accompanying pipeline stories). At AT they never say X is better than Y because of some benchmark number, on the contrary -- they always say benchmarks are subjective and are only a part of the story. I've read some shockingly objective pieces on this matter here on AT; of what a benchmark's role is, what it represents, how it should be read, what significance it has -- all benchmarks are useful as long as you read them 'right' and AT does an excellent job of stressing that very point in each review. Also, they were the first to say that comparing smartphones cross-platform is a very subjective thing.

    So you should've probably familiarized yourself with AT better before posting over-generalized comments like this one. I'm not hating on you, just saying.
  • abdealiv - Wednesday, July 31, 2013

    You missed his point completely! Sure, AT has different suites for benchmarking and also mentions that benchmarks are subjective, but the first thing the other 99.999% of tech sites, reviewers, bloggers and general users do is benchmark these new phones with the apps mentioned and instantly compare them with other flagships. So it's no surprise that Samsung feels the need to get positive reviews from that majority of reviewers. This is why the guy above suggested changing the review system as a whole, not for AT but for the others out there.
  • yhselp - Wednesday, July 31, 2013

    Wait a second, my reply was strictly aimed at Excors "the guy above" and not this article so I don't believe I've missed any point. I am not defending benchmark optimizations, on the contrary -- I've always been against this game of cat and mouse between silicon makers; I'm all for focusing on improving the hardware and user experience.

    It seems you've missed the point of my reply as a response to Excors' comment completely!! I agree that many other tech sites evaluate smartphones (and other products) solely by comparing benchmarks; however, Excors wrote his generalized comment (which I agreed with) here on AT, and not on other tech sites. Wouldn't you agree that it's pointless to leave feedback and advice at the wrong place, unless you believe AT doesn't review properly, which is not the case? That's what my comment was about.
  • kurkosdr - Tuesday, July 30, 2013

    When will benchmark developers learn to RANDOMIZE any strings that identify the app and the file size?

    Google "quack ati" for similar hilarity.

    PS: What if you run the benchmark again and again for hours and the phone dies a death from overheating? Will the samsung warranty cover it?
  • tcool93 - Tuesday, July 30, 2013

    Anandtech is an idiot for making such a big deal out of this. Now you got everyone trashing Samsung claiming they are "cheating". Which is total BS, because it's not any different than what EVERY other company does, which is write drivers so that certain games or whatever perform better with their hardware. And that includes benchmark programs, and Anandtech knows it.

    We are also talking about a measly 53mhz... big deal. This is just stupid, now this biased BS will get spread everywhere, thanks to Anandtech.

    I believe Android tablets even have an overclock option in their settings.
  • varad - Wednesday, July 31, 2013

    "biased BS"?? - So how much does Samsung pay you for writing comments like this?
  • Scannall - Wednesday, July 31, 2013

    It smells like....... Astroturf.
  • tcool93 - Tuesday, July 30, 2013

    It would be funny to see Samsung sue Anandtech for spreading this fud.
  • varad - Wednesday, July 31, 2013

    It would be funnier if Anandtech also tracked your a/c to Samsung
