The Samsung Exynos 7420 Deep Dive - Inside A Modern 14nm SoC

Author: Andrei Frumusanu

by Andrei Frumusanu on June 29, 2015 6:00 AM EST

CPU Power Consumption

The power consumption measurements are probably the most eagerly awaited and sought-after part of this piece, as they’re crucial for determining just how much of an effect the 14nm manufacturing process has on power efficiency. To get the numbers, we hook up the Galaxy S6 to an external power supply and energy meter.

Results labeled “load power” represent the difference between idle power consumption and the total power of a given scenario. This means for a given test, we measure the power consumption of the device while it is not doing any activity other than displaying the appropriate screen content. This method allows us to compensate for screen and miscellaneous device component power consumption. By controlling power management and performance of the device we can thus recreate very accurate active power figures for the SoC. One has to keep in mind though that this methodology doesn’t allow us to granularly separate always-used blocks of an SoC such as interconnects or DRAM – so there’s always a slight overhead on top of the IP block we’re interested in measuring power on.
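The subtraction behind a "load power" figure can be sketched as follows. This is only a minimal illustration: the function name and the sample readings are assumptions, not measurements from this article.

```python
def load_power_mw(total_mw: float, idle_mw: float) -> float:
    """Return load power: total scenario power minus the idle (screen-on) baseline."""
    if total_mw < idle_mw:
        raise ValueError("total power should not be below the idle baseline")
    return total_mw - idle_mw


# Hypothetical readings, chosen only to illustrate the arithmetic.
idle_reading_mw = 400.0    # device idling with the test's screen content shown
total_reading_mw = 1400.0  # same screen content while the test load is running

print(load_power_mw(total_reading_mw, idle_reading_mw))  # 1000.0
```

Whatever stays constant between the two readings (screen, miscellaneous components) cancels out, while always-on SoC blocks such as interconnect and DRAM remain part of the result.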

We start by looking at the A53’s cluster and core power consumption. We use a power-virus that creates an artificial load on the CPU cores. This method gives us a good representation of the maximum power consumption at a given frequency, thus detailing the power curve of the silicon at various frequencies and voltage levels. Real-world use-cases will seldom be able to fully load the CPU to such an extent, as even high loads will only reach 80-90% of the CPU’s capacity at a given frequency, and thus only consume about the same percentage in power.

Measured power consumption largely follows the P = C * f * V² formula for dynamic power consumption, where power is the product of frequency, voltage squared, and a constant representing the capacitance of the IP block. Semiconductor vendors also use this formula in their thermal management drivers to model estimated power consumption.

We see that the little cores on the Exynos 7420 use up to 1W when loading up 4 threads on the cluster. This is a slightly higher value than what we saw on the Exynos 5433, but could be explained by the fact that the CPU is running at a 200MHz higher frequency state. Top voltage on the Note 4 unit I measured reached 1150mV while the Galaxy S6 tops out at 1037mV. A quick calculation of the fV² factor of the dynamic power formula yields a value of 1613 for the 7420 and 1719 for the 5433, meaning that if we considered only voltage and frequency, the 7420 should definitely consume less power even at the higher clock rate. The logical explanation is that we’re seeing increased capacitance due to the new chipset's implementation and layout. Capacitance can be deduced by verifying that the term remaining after dividing out fV² is a steady constant value across all measurement points – and indeed it looks like the A53 cores on the Exynos 7420 have 30% higher capacitance than what we saw on the 5433.
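As a rough sketch of that calculation, the snippet below computes the fV² factor for both chips and backs out an effective capacitance term from a measured power figure. The function names are illustrative. The 1500MHz and 1300MHz top A53 frequencies are assumptions inferred from the undervolting table and the stated 200MHz gap, and the capacitance value is in arbitrary units (mW per MHz·V²), useful only for comparing measurement points against each other.

```python
def fv2_factor(freq_mhz: float, voltage_v: float) -> float:
    """Frequency times voltage squared: the non-constant part of P = C * f * V^2."""
    return freq_mhz * voltage_v ** 2


def effective_capacitance(power_mw: float, freq_mhz: float, voltage_v: float) -> float:
    """Back out the constant C (arbitrary units) from a measured power value."""
    return power_mw / fv2_factor(freq_mhz, voltage_v)


# Top A53 states; the frequencies are inferred assumptions, the voltages are from the text.
factor_7420 = fv2_factor(1500, 1.037)
factor_5433 = fv2_factor(1300, 1.150)
print(round(factor_7420), round(factor_5433))  # 1613 1719

# 1026mW is the stock 4-thread A53 figure at 1500MHz from the undervolting table.
c_7420 = effective_capacitance(1026, 1500, 1.037)
print(f"{c_7420:.3f}")
```

Repeating the second calculation at every frequency state, each with its own voltage, should give a roughly constant value if the dynamic power model holds.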

An odd behavior that I’ve already measured on the Exynos 5430 is that the power increase diminishes with every added thread. ARM at the time had explained to me that this was caused by the A7 cores fighting for cluster resources and that each added thread would result in diminishing returns as each core would do less work (and thus consume less power). Supposedly the A53’s new architecture in the 5433 was able to handle the load much better and avoid this bottleneck, and that is why we were able to see even increases in power with each added thread. Yet the 7420 exhibits the same issue as seen on the 5430, suggesting this may not have been an architectural characteristic of the cores after all. I’m not too sure what to make of this behavior and probably only Samsung knows the exact behind-the-scenes changes that led to it.

The core average maximum power consumption is the average between the power differences of core 1-2, 2-3 and 3-4. This metric is lower than the 1-core results of the power curve graph because it tries to account for the power overhead of the non-CPU consumption such as cluster, interconnect and memory which come out of their low-power states when the CPU is doing work. Even though the maximum power for the Exynos 7420’s A53 cores is higher than the 5433’s, it manages to beat the 5433 by 30-40% on a per-frequency efficiency basis. The massive voltage drop that the new 14nm FinFET brings to the table is enough to outweigh the increased capacitance of the cores.
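The averaging described above can be sketched like this. The function name and the four sample readings are hypothetical and serve only to show the arithmetic.

```python
def core_average_power(cluster_power_by_threads: list[float]) -> float:
    """Average the power increments between successive thread counts (1->2, 2->3, 3->4)."""
    if len(cluster_power_by_threads) < 2:
        raise ValueError("need readings for at least two thread counts")
    deltas = [
        after - before
        for before, after in zip(cluster_power_by_threads, cluster_power_by_threads[1:])
    ]
    return sum(deltas) / len(deltas)


# Hypothetical load power (mW) with 1, 2, 3 and 4 threads at a single frequency.
readings_mw = [450, 650, 830, 990]
print(core_average_power(readings_mw))  # 180.0
```

In this made-up data set the 1-thread reading (450mW) is far above the 180mW per-core average, which is the effect the paragraph describes: the first active core also pays for waking the cluster, interconnect and memory.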

A non-trivial part of the power figures that I’m not able to properly measure is the static leakage of the SoC. I tried to reach out to Samsung to comment on the improvement, but wasn’t able to get a concrete answer in regards to their SoC products.

Device Minimum Screen-on Power (~2 cd/m² Brightness)
Device	Power Consumption (mW)
Galaxy S5 (Snapdragon 801)	258mW
Galaxy S5 LTE-A (Snapdragon 805)	354mW
Galaxy S6 (Exynos 7420)	358mW
Galaxy Note 4 (Exynos 5433)	452mW
Meizu MX4Pro (Exynos 5430)	530mW
Huawei P8 (Kirin 930)	~500mW

With the screen on, displaying black, and idling, the Note 4 Exynos consumed a minimum of 440mW while the S5 LTE-A (S805) uses 354mW. Other devices such as the Meizu MX4Pro or the Huawei P8 bottom out at 530mW and 500mW respectively. The S6 on the other hand reaches down to only 330mW, a significant 25%+ reduction over other handsets, but still not enough to beat the efficiency of last year's Galaxy S5 which came in at only 258mW. This is an important metric as it represents an unavoidable constant drain whenever you actively use the device (deep sleep states when the screen is off will power-gate most of the SoC and turn off other device components).

Besides the SoC, the display controller IC is one of the main power drains while it drives the pixel matrix of either LCD or AMOLED devices. ARM had previously shared with us that measuring the dedicated voltage rail of the display assembly on a Galaxy S5 led to power values of around 90mW when displaying pure black. This value must have subsequently gone up as devices moved to 1440p resolution screens.

Moving on to the A57 cores we should be seeing some big improvements in power consumption. I’ve mentioned in the Note 4 Exynos review that I thought Samsung shipped the 5433 with too high clocks as the increased power consumption may not have been worth the small performance boost of the last 200-300 MHz. We first have a look at the variable thread-count power curves:

Maximum power consumption of the A57 cores comes in at 5.49W – a much more reasonable figure than the 7.39W seen in the 5433. When we look at the per-frequency power numbers this difference becomes even more significant as 1.9GHz on the 7420 uses only 4.12W compared to the 7.39W of the 5433. Similarly to the A53 cores, Samsung was able to take full advantage of the new process node as the maximum CPU voltage drops from 1.235V at 1.9GHz down to 1.037V at 2.1GHz (0.962V at equivalent 1.9GHz). The bottom frequencies see even larger reductions as we go from 900mV down to 675mV on the 700/800MHz states.

The core average maximum power consumption gives a simplified view of the power curve. Here we see the drastic reduction in power the Exynos 7420 is able to provide, with an overall decrease of 35-45% throughout the frequency curve. At 1900MHz the 7420 falls just a bit short of half the power of the 5433, which is impressive. Capacitance on the A57 also went up a bit; I was able to derive an average of 10% higher capacitance on the new chip, which isn’t quite as high an increase as on the A53 cores, but still a curious change in the physical characteristics of the new implementation.

PCMark is a great benchmark that shows off the different kinds of use-cases that one would encounter daily when using a smartphone. Thus it offers a great repeatable test-bench which can measure overall device efficiency. We measure the whole device's power as we cannot factor out the screen's power for on-screen dynamic tests, so this is also an apples-to-apples comparison to other devices we have figures on such as the Note 4 and MX4Pro.

Overall device power during the tests is very good. It's especially the web test which offers the largest improvement over other devices as total power comes in at only 1.42W, over 1W less than the MX4Pro and Note 4. Overall the Galaxy S6 is currently the least power-consuming device I've measured so far, which should be very encouraging for the power metric of the device and SoC.

When taking into account the scores the device was able to achieve, we see an even greater improvement over past devices. The performance per Watt figures which depict efficiency are across the board 1.5-2x better than what we see in other devices. Of course the Galaxy S6 also benefits from improved OLED efficiency as part of the whole package, but to be able to post such significant improvements is nonetheless impressive. It's now understandable why Samsung deemed that a 2550mAh battery was enough for the Galaxy S6 as the device is able to use the available energy much more efficiently.
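A performance per Watt figure of this kind is simply a benchmark score divided by the average device power during the run. The sketch below uses hypothetical scores and power values for two unnamed devices. It does not reproduce the article's PCMark data.

```python
def perf_per_watt(score: float, avg_power_w: float) -> float:
    """Benchmark score divided by average whole-device power in Watts."""
    if avg_power_w <= 0:
        raise ValueError("average power must be positive")
    return score / avg_power_w


# Hypothetical values, chosen only to illustrate the ratio.
device_a = perf_per_watt(score=4000, avg_power_w=2.0)
device_b = perf_per_watt(score=3000, avg_power_w=3.0)

print(device_a, device_b)   # 2000.0 1000.0
print(device_a / device_b)  # 2.0
```

Because the power term here covers the whole device, display efficiency is folded into the result along with the SoC.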

One of the first things I did when receiving my S6 review unit was to compile a custom kernel with access to the SoC’s voltage tables and try to see how far the chip allowed me to reduce voltages. Undervolting, much like overclocking in the PC space, is a popular modification for enthusiast users that like to tinker with their devices to try to squeeze out as much potential as possible. For mobile devices we’re aiming for more power efficiency instead of more performance, as today’s devices in a way already come overclocked, at much higher maximum frequencies than what they’re able to sustain in terms of thermal loads.

Exynos 7420 Undervolting Results 4-Core Load Power (mW)
	A53 Cluster			A57 Cluster
Freq. (MHz)	Stock voltage	-50mV	-75mV	Stock voltage	-50mV	-75mV
2100	-			5481	4911	4661
2000	-			4781	4331	3991
1900	-			4111	3671	3441
1800	-			3641	3111	2944
1700	-			3089	2677	2500
1600	-			2621	2312	2186
1500	1026	916	894	2254	1928	1882
1400	859	768	743	1964	1791	1664
1300	699	634	625	1793	1577	1444
1200	606	536	509	1590	1351	1259
1100	491	459	424	1330	1151	1069
1000	391	354	337	1153	1009	921
900	340	298	277	969	829	761
800	270	230	221	843	695	690
700	225	192	180	-
600	172	139	128	-
500	132	108	98	-
400	104	79	71	-

To keep things simple, I measured power on the A53 and A57 cores when applying a global -50 and -75mV undervolt over the stock voltages of the individual power planes. As can be seen in the table, one can gain significant power efficiency as one reduces voltage. The theoretical reduction in power is easily calculated if one has the stock voltages and original power consumption at hand. It is possible to estimate the power after undervolting by using the following formula:

P_Undervolt = P_Original / (V_Original² / V_Undervolt²)

For example, on the 2100MHz state of the A57 this would come to: 5481mW / (1.037V² / 0.987V²) = 4965mW. The measured power indeed comes near that value at 4911mW. The difference is explained by factors we’re not taking into account in the simplified formula for power consumption, as we’re disregarding static power leakage scaling and, most importantly in this case, temperature scaling.

This can be verified in the lower frequency states which dissipate a lot less power, such as the 1GHz A57 state: 1153mW / (0.712V² / 0.662V²) = 996mW, closer to the measured 1009mW.
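The two worked examples above can be reproduced with a short sketch of the formula. The function name is illustrative. The power and voltage inputs are the ones quoted in the text and the undervolting table.

```python
def estimate_undervolt_power(p_original_mw: float, v_original: float, v_undervolt: float) -> float:
    """Estimate dynamic power after an undervolt at a fixed frequency.

    Implements P_Undervolt = P_Original / (V_Original^2 / V_Undervolt^2).
    Static leakage and temperature effects are ignored, as in the text.
    """
    return p_original_mw / (v_original ** 2 / v_undervolt ** 2)


# A57 at 2100MHz, stock 1.037V, -50mV undervolt; measured value was 4911mW.
est_2100 = estimate_undervolt_power(5481, 1.037, 0.987)
# A57 at 1000MHz, stock 0.712V, -50mV undervolt; measured value was 1009mW.
est_1000 = estimate_undervolt_power(1153, 0.712, 0.662)

for label, estimate, measured in (("2100MHz", est_2100, 4911), ("1000MHz", est_1000, 1009)):
    print(f"{label}: estimated {estimate:.0f}mW, measured {measured}mW, "
          f"difference {measured - estimate:+.0f}mW")
```

At 2100MHz the measurement lands below the estimate, while at 1000MHz it sits slightly above it and closer in absolute terms.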

I was able to go down to a global -87.5mV undervolt before the device would crash. It is generally difficult to find the minimum stable voltages for undervolting, as it takes weeks to fully test stability for a given voltage at each frequency. Again, it’s SoC temperature which is the big unknown variable here, as a transistor’s threshold voltage rises the colder the silicon gets. An undervolt can be unstable and crash the device if it is left to cool down below a certain level, while the same voltage can be perfectly usable in active use or when the device is not allowed to cool down too much, such as in one’s jeans pocket. For actual usage it’s always preferable to raise the voltages back up a step or two once an instability has been identified. Samsung’s closed-loop voltage control is an interesting new mechanism for undervolting as it allows the safety margin to be reduced further without sacrificing stability. Since reassembling the S6 I’ve been using it as a daily device with a static -50mV undervolt across most frequencies, and I raised the limit on how far the APM is allowed to undervolt to -37.5mV, providing the best of both worlds.

114 Comments


Andrei Frumusanu - Monday, June 29, 2015 - link
Frankly, I don't know. I tried to ask Samsung a similar question but they refused to comment on customer relations. Meizu so far seems to be the only major vendor consistently using Exynos parts but as to why we haven't seen other vendors adopt them can be attributed to anything going from pricing to volume availability. Only the companies themselves know the details of these contracts.
gnx - Monday, June 29, 2015 - link
Thanks! The SoC market is really strange.
id4andrei - Monday, June 29, 2015 - link
This is Samsung's chance to eat Qualcomm's lunch. Close down node manufacturing for others (including Apple) and be like Intel. Either use Exynos or be satisfied with inferior nodes from other fabs.
CiccioB - Monday, June 29, 2015 - link
And that means starting to compete on PP only, like Intel did.
That is, if you force others to go to other foundries, you have to be sure you have the best one, or in case TSMC comes up with a better PP (like a 16+nm revision) you have just thrown all your customers to your fab competitors, making double damage (or total one). Or just think if Intel tomorrow suddenly opens to ARM customers in order to saturate its now rusting 14nm machinery. Samsung would be in great trouble after that eventual (and IMHO stupid) move.
Investing in PP is really expensive and there are other foundries capable of doing so. Samsung can't be sure to always be the best one on the market, and to invest tons of billions of dollars every year to make sure to be the number one (for SoC of course).
ZeDestructor - Wednesday, July 1, 2015 - link
Samsung is part of a common platform alliance/agreement with GloFo, so while they could lock down and close others out, GloFo would not, so there's little commercial benefit from doing so.

They could of course coerce GloFo into doing the same, but that lands them into hot water with regulatory watchdogs like the FTC regarding anti-competitive practices and collusion, which while Samsung wouldn't really mind (no, really), GloFo would.
eh_ch - Monday, June 29, 2015 - link
How will it take for Samsung's process to trickle down to AMD via GloFo? Could it bridge the efficiency gap to nvidia / Intel? Holding out hope that ATI/MD will be competitive once more.
eh_ch - Monday, June 29, 2015 - link
How long will it take, that is
Adding-Color - Monday, June 29, 2015 - link
No, AMD won't have a technology advantage over Nvidia on next gen GPUs. Currently it looks like nvidia will choose Samsung for their next node, and as Samsung and GloFo have some kind of alliance and share processes (glofo licenses some Samsung processes AFAIK), the technology should be very similar for both. Yet AMD should have a small HBM advantage, as they have better relations with hynix (and helped to develop HBM) than nvidia.
jjj - Monday, June 29, 2015 - link
There won't be a HBM advantage from a technological point of view, at best AMD could get slightly better pricing but even that is unlikely since Nvidia has much higher volume. The first gen HBM was late and both Nvidia and AMD had plenty of time to prepare for it.
As for the process, we don't really know what foundry each will use and what version of the process. On the GPU side both are more likely to go TSMC or use both. On the CPU side AMD will likely go GloFo but not this early version of the process and Intel might go 10nm not long after AMD has 14nm. On 10nm TSMC and Samsung do seem to be catching up with Intel but doubt AMD will have 10nm early.
fluxtatic - Tuesday, June 30, 2015 - link
Hell, at this point I'd be happy to see AMD at < 28nm
