Kirin 920 SoC & Platform Power Analysis

The centerpiece of the Honor 6 is the new HiSilicon Kirin 920. This is the first non-Samsung big.LITTLE chip to reach the market in consumer devices. The Kirin 920 is the successor to HiSilicon's Kirin 910T, which ships in the Huawei Ascend P7, but don't let the minor change in naming fool you: the 920, or more aptly the Hi3630, as its actual model number goes, is a major generational upgrade in every measurable aspect.

The Hi3630 is a fully HMP-enabled big.LITTLE design with 4x Cortex A7 and 4x Cortex A15 cores. HiSilicon has remained relatively conservative with the clock speeds: the little cluster tops out at 1.3GHz and the big cluster at 1.7GHz. The big CPUs use the newer r3 revision of the A15 IP, so we should expect better power management and power efficiency than in past A15 implementations.

On the GPU side we find a Mali T628MP4 clocked at 600MHz. This is nothing to write home about, as the T628 was already shipping in devices over a year ago in the form of the Exynos 5420. The MP4 configuration is also a downgrade from Samsung's MP6 implementation, so we should expect lower performance. I feel a bit underwhelmed by HiSilicon's GPU decision here, as it seems they are targeting a more mid-range performance segment rather than trying to compete with Samsung and Qualcomm. We'll see later in the benchmark section how this works out for the Honor 6.

HiSilicon "Kirin 920" Hi3630 vs Direct Competitors

| SoC | HiSilicon Hi3630 | Samsung Exynos 5422 | Samsung Exynos 5430 | Qualcomm MSM8974v3 |
|---|---|---|---|---|
| CPU | 4x Cortex A7 r0p5 @ 1.3GHz + 4x Cortex A15 r3p3 @ 1.7GHz | 4x Cortex A7 r0p5 @ 1.3GHz + 4x Cortex A15 r2p4 @ 1.9GHz | 4x Cortex A7 r0p5 @ 1.3GHz + 4x Cortex A15 r3p3 @ 1.8GHz | 4x Krait 400 @ 2.3GHz |
| Memory Controller | 2x 32-bit @ 800MHz DDR, 12.8GB/s b/w | 2x 32-bit @ 933MHz DDR, 14.9GB/s b/w | 2x 32-bit @ 1066MHz DDR, 17.0GB/s b/w | 2x 32-bit @ 933MHz DDR, 14.9GB/s b/w |
| GPU | Mali T628MP4 @ 600MHz | Mali T628MP6 @ 533MHz | Mali T628MP6 @ 600MHz | Adreno 330 @ 578MHz |
| Integrated Modem | "Balong" LTE Cat. 6 300Mbps | n/a | n/a | MDM 9x25 LTE Cat. 4 150Mbps |
| Video H/W | H264 1080p Enc- & Decoder | H264 2160p Enc- & Decoder | H264 2160p Enc- & Decoder + H265 4K Decoder | H264 2160p Enc- & Decoder |
| Mfc. Process | TSMC 28nm HPm | Samsung 28nm HKMG | Samsung 20nm HKMG | TSMC 28nm HPm |

The SoC is manufactured on TSMC's 28nm HPm process. Unfortunately I wasn't able to determine the running voltages of the chip, as it seems HiSilicon employs a separate microcontroller and a closed firmware layer for direct DVFS control (DVFS is still arbitrated by the kernel, though).

We have a standard 2x 32-bit LPDDR3 memory interface running at 800MHz DDR, making some 12.8GB/s of bandwidth available to the SoC. The hardware video encoder and decoder allow for H264 1080p recording and playback. The SoC employs some auxiliary accelerator blocks such as a JPEG hardware unit. We have little information on the ISP that HiSilicon employs, but it should be similar in design to what Samsung uses, meaning a Cortex A5 core with dedicated SIMD accelerators.
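The 12.8GB/s figure follows directly from the interface parameters. As a quick sanity check, here is a minimal sketch of the arithmetic, assuming the quoted clock is the memory clock and that data is transferred on both clock edges (which is what "DDR" implies); the function name is my own.

```python
def peak_bandwidth_gbs(channels: int, bus_width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak bandwidth in GB/s of a DDR memory interface."""
    bytes_per_transfer = channels * bus_width_bits / 8
    # Double data rate: two transfers per clock cycle.
    transfers_per_second = clock_mhz * 1e6 * 2
    return bytes_per_transfer * transfers_per_second / 1e9


# Hi3630: 2x 32-bit channels at 800MHz DDR
print(round(peak_bandwidth_gbs(2, 32, 800), 1))  # 12.8
```

This is a theoretical ceiling; the bandwidth actually achieved will be lower and depends on access patterns and controller efficiency.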

The NAND/MMC interfaces use the same DesignWare IP that we find on Exynos SoCs, deploying 3 controllers each handling the main eMMC NAND, the external SD card via SDIO, and also the Broadcom BCM4334 Wi-Fi chip via SDIO.

Probably the most important aspect of the Kirin 920 SoC is that it has a new LTE modem integrated into the same die. The "Balong" modem is capable of category 6 LTE speeds with carrier aggregation, which makes it not only one of the first Cat. 6 modems but also the very first integrated silicon of its kind available from any vendor. Looking back at the rest of the SoC's specifications, this might be one of the reasons why they appear conservative: modems take a long time to validate, and integrating one into a SoC delays the whole chip.

Unfortunately we couldn't review the modem in this Chinese unit as it lacks the RF front-end compatible with western FDD networks. For what it's worth, it runs 2G and EDGE seemingly well...

Power management

While knowing about the silicon employed gives us some notion of its expected performance, nowadays modern power management makes it pretty much unpredictable how efficient a SoC will be. In the future I'll be trying to expose more of how vendors implement their power management schemes and what we should expect of devices in daily use.

In the case of the HiSilicon Hi3630 there's a bit of a double-edged sword story going on.

As a fully HMP-enabled big.LITTLE chip, the OS employs a full Global Task Scheduling (GTS) scheme inside the Linux kernel (version 3.10.33) on the device. To understand GTS we need a short explanation of the core mechanism that decides how a task is migrated between the two clusters:

The kernel employs a mechanism to track load continuously for each scheduler entity (a process or a cgroup of processes). This per-entity load-tracking algorithm is at the core of the scheduler mechanic for GTS. A simplified overview defines three main control parameters: the up- and down-thresholds, and the load-average period, which acts as a window frame for the decision making. If a task's load exceeds the up-threshold, it is migrated over to the big cluster; similarly, if the task's load falls under the down-threshold, it is bounced back onto the little cluster.
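To make the threshold mechanism a bit more concrete, here is a heavily simplified sketch of it. This is not the kernel's code: the load scale (0 to 1023), the 32ms half-life standing in for the load-average period, and the default thresholds of 700 and 256 are illustrative assumptions, as are all the names.

```python
from dataclasses import dataclass

MAX_LOAD = 1023  # assumed full-scale load of a task that never sleeps


@dataclass
class Task:
    load: float = 0.0
    cluster: str = "little"


def update_load(task: Task, busy: bool, half_life_ms: float = 32.0) -> None:
    """Advance the tracked load by 1ms using a decaying average."""
    decay = 0.5 ** (1.0 / half_life_ms)
    sample = MAX_LOAD if busy else 0.0
    task.load = task.load * decay + sample * (1.0 - decay)


def migrate(task: Task, up: float = 700.0, down: float = 256.0) -> None:
    """Move the task between clusters when its load crosses a threshold."""
    if task.cluster == "little" and task.load > up:
        task.cluster = "big"
    elif task.cluster == "big" and task.load < down:
        task.cluster = "little"


def run(task: Task, busy: bool, duration_ms: int, **thresholds: float) -> None:
    """Simulate the task being busy or idle for duration_ms milliseconds."""
    for _ in range(duration_ms):
        update_load(task, busy)
        migrate(task, **thresholds)


task = Task()
run(task, busy=True, duration_ms=200)   # sustained load: ends up on the big cluster
print(task.cluster, round(task.load))
run(task, busy=False, duration_ms=200)  # long idle: bounced back to the little cluster
print(task.cluster, round(task.load))
```

The gap between the two thresholds provides hysteresis, so a task hovering around a single value does not ping-pong between clusters. Lowering the thresholds makes the same burst of work migrate sooner, which is why they work as a performance and power knob.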

In Huawei's case we see the HMP up- and down-thresholds used as variable control parameters, and this is the preferred method of controlling the chip's performance and power, as opposed to the usual clock-frequency limits. Keep this in mind for the battery life benchmarks, as it will impact them in substantial ways.

The chip of course comes with advanced clock- and power-gating mechanisms for the CPU cores. We have the usual ARM architectural core clock-gating state WFI (Wait-for-interrupt) on a per-CPU basis, as on all modern ARM chips. As a secondary-level CPUIdle state, HiSilicon power-gates each individual core for prolonged idle periods (C1), and finally, if all CPUs inside a cluster are sitting in extended idle periods, the whole cluster is shut down (C2). Keep in mind that we are talking about entry latencies of 500µs for C1 and 5000µs for C2, which represents a very fine-grained power-gating scheme compared to SoCs of the past. The little cluster may not enter the C2 state while the screen is enabled.
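The three-level scheme can be sketched roughly as follows. This is only an illustration of the rules described above, not HiSilicon's driver: I'm assuming that a state is chosen by comparing a predicted idle duration against the quoted entry latency, whereas real CPUIdle governors weigh more parameters than that. All names are my own.

```python
from typing import List

# Entry latencies quoted above, in microseconds
C1_LATENCY_US = 500    # per-core power-gating
C2_LATENCY_US = 5000   # whole-cluster shutdown


def cluster_idle_states(predicted_idle_us: List[int],
                        is_little: bool,
                        screen_on: bool) -> List[str]:
    """Pick an idle state for every core of one cluster."""
    # C2 requires every core in the cluster to expect a long idle period,
    # and the little cluster may not use it while the screen is on.
    c2_allowed = not (is_little and screen_on)
    if c2_allowed and all(t >= C2_LATENCY_US for t in predicted_idle_us):
        return ["C2"] * len(predicted_idle_us)
    # Otherwise each core decides for itself between WFI and C1.
    return ["C1" if t >= C1_LATENCY_US else "WFI" for t in predicted_idle_us]


# Big cluster with one core waking soon: no cluster shutdown
print(cluster_idle_states([100, 800, 6000, 7000], is_little=False, screen_on=True))
# Big cluster fully idle: whole cluster power-gated
print(cluster_idle_states([6000] * 4, is_little=False, screen_on=True))
```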

Because the power-gating is done via CPUIdle and not via classical hotplugging, the CPUs always appear online to the system, so don't be alarmed if that seems unusual. This avoids the overhead found in Qualcomm SoCs and past A9-based SoCs: hotplugging is a very expensive operation that requires a CPU to be taken out of coherency and mandates a full stop of the system for a certain amount of time. The vastly decreased latency of the CPUIdle approach also enables much finer-grained idling. One side effect is that classical monitoring tools may show the A15 cores stuck at some higher frequency in the CPUFreq statistics, while in reality the whole cluster is simply power-gated. This mode of operation is valid for all present and future big.LITTLE SoCs.

An interesting fact that I noticed while analysing the Hi3630's software stack is that it employs different CPUIdle drivers for the two clusters, with differing idle-state parameters. This is in contrast to what I've seen Samsung do, so in that regard HiSilicon employs a better software implementation.

The little cluster scales in frequency from 400MHz up to 1300MHz in 200MHz steps and is controlled by an Interactive-based governor. Google has standardized the "boostpulse" QoS mechanic in its Interactive governor and the Hi3630 takes full advantage of it, boosting up to 1200MHz when triggered by user-space events. We notice this when switching between applications in Android. In addition, the HMP thresholds are lowered for the duration of the boostpulse, making it easier for processes to be migrated over to the big cluster. DVFS switches happen on a coarser 80ms interval.

On the big cluster, the chip scales from 800MHz to 1700MHz, also in rough 200MHz steps. We have a more standard Ondemand governor with very conservative parameters so as to avoid unnecessary switches to high frequencies. We see an extremely small sampling interval of 10ms on the big cluster; this is the fastest default setting I've seen on any ARM-based SoC to date.

On the GPU side, the Mali T628MP4 scales from an idle 120MHz to 600MHz in 6 steps, employing an Ondemand algorithm on a 20ms sample interval. Again, because the SoCs share the same GPU IP, I can't stop myself from comparing it to Samsung's implementation of the GPU DVFS drivers: this is a much more aggressive algorithm than what we see in Exynos SoCs. While the latter can only reach the higher frequencies in sequential order, from frequency to frequency, the HiSilicon chip can jump directly from its minimum state to the full 600MHz with a much quicker response time. I'm still not sure how wise this is, as it appears to be a tad too aggressive and may impact power efficiency. Usually ARM licensees are responsible for implementing GPU power gating at the SoC level, so while I don't have any direct evidence of this without the driver sources, I'll assume this is the case for the Hi3630.
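The difference between the two ramp-up policies is easy to quantify in a toy model. The sketch below assumes six frequency levels and, purely for the sake of comparison, the same 20ms sample interval for both policies; the actual Exynos interval is not something I'm stating here, and the function names are my own.

```python
def samples_to_reach_top(levels: int, policy: str) -> int:
    """Count governor samples needed to go from the lowest to the highest level
    under sustained full load."""
    level, samples = 0, 0
    while level < levels - 1:
        if policy == "sequential":
            level += 1          # one step up per sample
        else:
            level = levels - 1  # "direct": jump straight to the top level
        samples += 1
    return samples


def ramp_time_ms(levels: int, policy: str, sample_interval_ms: int = 20) -> int:
    """Time spent ramping, given a fixed sample interval."""
    return samples_to_reach_top(levels, policy) * sample_interval_ms


print(ramp_time_ms(6, "sequential"))
print(ramp_time_ms(6, "direct"))
```

Under these assumptions the direct policy reaches full speed in a single sample period instead of five, which explains the quicker response. It also means any short burst of load can pull the GPU to its most power-hungry state.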

The memory controller's driver seems more or less identical to what Samsung deploys, scaling from 120MHz to 800MHz using the same governor algorithm as the GPU, but also employing a QoS scheme for use cases that demand a guaranteed minimum bandwidth.

Platform Power

Once in a while we get lucky and a device comes with a coulomb-counting fuel gauge that allows us to take precise power measurements without much hassle or external equipment. To my delight, the Honor 6 is one of these, and I promptly went on to do some power analysis of the phone.

Huawei Honor 6 Platform Power

First we see that the device's idle power at our standardised 200cd/m² measuring brightness comes in at 965mW. For comparison, Anand did a similar measurement for the Galaxy S5, which came in at 854mW with its AMOLED screen. Further measurements of 684mW at minimum brightness and 1466mW at maximum brightness give us a rough range for how efficient the JDI-manufactured panel is.

Continuing on, I tested out the camera's power usage as that is one of the most power intensive tasks for a smartphone besides playing games. At 2.5W for the preview screen and 3W for 1080p video recording we still see very reasonable values competitive with what Qualcomm and Samsung provide.

Similarly a run of Sunspider averages out at around 3W. Interesting to see here was the discrepancy between Chrome and the stock provided browser. In all test cases I was able to achieve a lower power usage on the stock browser than on Chrome. This may very well have to do with optimized CPU & GPU libraries that OEMs ship with the phone versus the more generic ones that Google bundles with Chrome.

GFXBench is where things start to get ugly: a T-Rex onscreen run averages out at 4.6W of power consumption, which is beyond what we find in any competing smartphone. This really piqued my interest and I tried to isolate where the power was coming from. I forcefully turned off the A15 cluster and was able to shave almost a full 1W off the power consumption while losing only 8% of performance in the benchmark. What's left is some minor power consumption on the A7 cluster and a large chunk going to the GPU and memory. When normalizing for power and performance, the Mali T628MP4 in the Kirin 920 comes in at only around half the perf/W of the Adreno 330 found in the Snapdragon 801, and performs very poorly.
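To put the A15 experiment into perf/W terms, here is a back-of-the-envelope calculation. It uses whole-platform power (screen included) rather than SoC power, and rounds the saving up to a full 1W, so treat the result as a rough indication only; the function name is my own.

```python
def perf_per_watt(relative_perf: float, platform_power_w: float) -> float:
    """Relative benchmark performance per watt of platform power."""
    return relative_perf / platform_power_w


baseline = perf_per_watt(1.00, 4.6)  # T-Rex onscreen, both clusters active
a15_off = perf_per_watt(0.92, 3.6)   # A15 cluster off: -8% perf, assumed -1W

gain = a15_off / baseline
print(f"perf/W with the A15 cluster disabled: {gain:.2f}x the default")
```

In other words, giving up 8% of the frame rate buys roughly 18% better platform efficiency in this test, under the assumptions above.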

ARM has promised a 400% energy efficiency improvement over the T604 in the T760, and we can see why that's desperately needed: the current generation of Midgard GPUs can't compete in either performance or power efficiency. For avid gamers, it's certainly better to look at a Qualcomm device, for lack of other options in current Android devices.

While the T-Rex numbers were bad, the CPU full-load ones are a disaster. Turning on a 4-thread stress test which fully loads the A15 cluster makes the device consume a whopping 7.5W. While we're going crazy, we might as well also try to see peak device power consumption: running both the stress test and T-Rex in tandem results in an average power consumption of 8.5W. Here we finally see thermal throttling putting a limit on the device power, as the SoC limits itself after a few seconds. Peak power comes in at 11.5W in the intervals where the thermal mechanism clears the limits, only to re-enable them seconds later.

For academic purposes, I again disabled the A15 cluster to try to isolate power consumption on the A7 cores. The frugal nature of the Cortex A7 barely manages to exceed 1W for the cluster + memory combined.

It is clear that HiSilicon employs no power budgeting algorithms at all, as the Kirin 920 leaves any kind of limiting solely to the thermal throttling driver. The problem with this approach is that you are trusting your application not to behave like a power virus. We've seen how disabling the big cluster in the T-Rex test case can massively improve power consumption while having only little impact on performance. We have also seen that it is possible to deploy a smart power allocation mechanism, such as the one found in Samsung's GTS-enabled Exynos SoCs, and remain within a TDP typical of a smartphone form factor. This is an enormous oversight in what otherwise seemed like an excellent software stack for the Kirin 920 - I hope HiSilicon will resolve this issue in the future, as it's solely a software problem that's easily fixable.

59 Comments

  • TekDemon - Thursday, September 18, 2014 - link

    I wonder if the "Rog" mode is a reference to ASUS' Republic of Gamers (ROG) hardware line-i.e. a gaming mode. Given the weak GPU maybe the mode is there for people who want to play 3D games to be able to run everything at 720P and thus get acceptable framerates instead of everything having to be rendered at 1080P. It's actually a pretty great idea, especially with the newer 1440P screens on high end phones even the beefiest GPUs will struggle for framerates in graphics intense games.
  • p51d007 - Wednesday, September 24, 2014 - link

    I don't care for a user replaceable battery in my Ascend Mate2...it's 400mAH and lasts days at a time, plus, I'm tech savvy enough (40 years in electronics) that I can get one and replace it myself.
    Huawei is starting to make some noise in the market, which "should" benefit consumers by causing the competition to either step up to the plate, or get left behind.
    Right now, I'm a big fan of Huawei, even though the Mate2 isn't "flagship" in the spec department, it runs perfectly, fast, bright screen and the 2-3 day battery life? LOVE IT!
  • cnanews - Tuesday, September 30, 2014 - link

    I experienced a few surveys and purchaser remarks in a Chinese shopping sites where individuals have complained about wifi gathering issues
    http://cnanews.in/huawei-honor-6-with-octa-core-so...
  • ritwik - Tuesday, October 14, 2014 - link

    Isn't it an amazing device? It's just awesome, 3GB RAM with 1.7Ghz Octa core processor it's just superfast http://goo.gl/4wojuW
  • siteOwner - Saturday, October 18, 2014 - link

    Hi,

    Do you know if scheduler and governor used in Huawei Honor 6 are custom made by Huawei or are default from Linux Kernel? So if I install other rom will I get those core/task/scheduler/governor settings??

    Best Regards
  • equanim1ty - Wednesday, October 22, 2014 - link

    Yes.. There is definitely some issue with the Bus Bandwidth config for Honor 6 .
    Honor 6 has real problem with using Bluetooth and Internet simultaneously. Whenever I connected my Bluetooth (Stereo Headset), the internet bandwidth drops drastically

    Use case: If I'm on Viber through (Wifi @ 16Mbps or H+) , the bandwidth drops and it works fine without the Bluetooth. In order to confirm this I did multiple speed test while streaming offline Music ( Note: Music on SD card) - The internet connection speed dropped drastically from 16Mbps to the range of 1- 1.2 Mbps. I paused the music and it again jumped back to 14- 16Mbps. This happens even if I'm on 3G. I'm suspecting this is some type of implementation issue either with the architecture / bus configuration? Just wish this gets resolved with future ROM updates for an otherwise great device
  • spixel - Saturday, October 25, 2014 - link

    "The 5" 1080p display is manufacutred by JDI. The display is a non-IPS display and the viewing angles are visibly suffering from this, however it's not terrible"

    Seriously??? Of course the display is IPS, what on earth are you talking about? IPS is the standard display type for all modern smartphones except extremely cheap budget phones or those with Amoled.
  • Bala63 - Friday, October 2, 2015 - link

    Well, I have been using Honor 6 for almost a year and I would say this is the best budget phone that I ever had! Kirin outperforms Snapdragon in most segments and the phone performs like a butter! I'm a hardcore gamer and I enjoyed playing MC4, Mortal Kombat X, Immortals and what not and I never witnessed any lag at any point of time. Camera is decent and yes, u can't expect a DSLR for 20k. But trust me, for this price, there's no better camera in the market. Battery backup is excellent! I use 4G and I get 30% charge left after using it for 5 hours continuous. Wi-Fi is a real boon! The connectivity is continuous and it is through Wi-Fi that I download movies from yify! Believe me, I wasn't disappointed with the speed and downloading of torrents, not even once. And yes, Huawei did an excellent job providing a Lollipop update for Honor 6. Now I'm able to record games in 720p and upload it to YouTube! Come on guys, Huawei is new to smart phones and we can't expect miracles in their initial attempts. EMUI offers a smooth interface with a lot of cool new themes from Huawei market. And I forgot to tell you, this is a mini-HDD! With all apps installed, I still have around 8 GB of internal storage and a mammoth 64 GB external, memory card option. The phone offers an inbuilt phone manager that scans apps, informs you about junk files, apps that take space and stuff like that! So no need for an external anti virus app. Video calling works so well and flawless in 4G.In addition, Huawei offers special features like backup, touch functions for calls, gestures for apps and what not! Honor 6 is nothing short of a marvel and I'm proud to say this is the best budget phone that I've ever had!
