Conclusion & End Remarks

It’s been a tumultuous and busy week: we’ve had the new Galaxy S21 Ultra in Snapdragon and Exynos variants for only a few days now. That’s still sufficient to come to a representative conclusion as to how Qualcomm’s and Samsung’s new-generation flagship SoCs will play out in 2021, and for the most part, it’s probably not what people were expecting.

Starting off with the most hyped-up part of the new SoCs (mea culpa), both SoCs are the first to employ Arm’s newest Cortex-X1 cores, the first CPU generation in which Arm really went for a more “performance first” design philosophy. In general, the new CPU IP does live up to its claims; however, Arm’s and our own performance projections weren’t met by the new SoCs, as they didn’t quite reach the configurations and clock frequencies we had hoped for in 2021 designs. Neither Qualcomm nor Samsung invested in an 8MB L3 cache, and Samsung in particular didn’t even equip its X1 core with a full 1MB of L2 cache. This does seem to be noticeable in performance, as the Snapdragon 888 has a small performance edge over the Exynos 2100. Samsung’s choice here seems rather puzzling given its years of wasting lots of silicon on humongous custom CPUs, but generally both vendors aren’t as aggressive as Apple in investing die area into caches.

Qualcomm still has a clear memory subsystem advantage, as the company has made large strides in latency this generation, and this results in additional performance. The Exynos surprised us this year with a much larger system level cache, which however also seems to add latency and reduce performance.

More worrisome for the Exynos is its weird clock behaviour, with the new chip really struggling to maintain its peak frequencies for anything other than very brief moments; the Snapdragon 888’s X1 core had no such issues. My Exynos S21 Ultra chip bin was quite terrible here, but the better silicon on my second S21 doesn’t improve things too much either.

The Exynos 2100’s Cortex-A78 cores are clocked higher than the Snapdragon 888’s, and this shows up in performance. In everyday workloads, however, the DVFS of the Exynos actually behaves more similarly to the Snapdragon, as it generally scales things to 2600MHz and only uses the 2808MHz peak frequency of these cores in brief multi-threaded workloads. Even then, that is only as long as thermals allow it, as even these middle cores can get quite power hungry this generation.

Although both are using the same IP on the same process node, the Exynos 2100’s CPUs just look to be more power hungry than the Snapdragon 888’s implementations. Given the apples-to-apples comparison, the only remaining possibility is a weaker physical design implementation on Samsung LSI’s part, which is actually a point of concern, as we had hoped Exynos SoCs would catch up this year after ditching their custom CPU cores. Make no mistake: the new X1 cores are massively improved in performance and efficiency over last year’s M5 cores; it’s just that Qualcomm shows that it can be done even better.

On the GPU side of things, this generation feels wrong to me, and that’s solely due to the peak power levels these new SoCs reach, and which vendors actually left enabled in commercial devices.

Qualcomm had advertised 35% improved GPU performance this generation with the Snapdragon 888, and that might indeed be valid for peak performance, but certainly for Samsung devices that figure is absolutely unreachable for any reasonable length of gaming session, as the power consumption is through the roof at over 8W. I don’t see how other vendors might be able to design phones with thermal dissipation that allows such power levels to actually be maintained without the phone’s skin temperature exceeding 50°C (122°F); it’s just utterly pointless in my opinion.

In terms of sustained performance, the Snapdragon 888 is generally a 10-15% improvement over the Snapdragon 865 and 865+ - at least in these Samsung devices whose thermal limits and thermal envelopes are similar this generation, attempting to target 42°C peak skin temperatures, although the phones failed to stay below that threshold during the initial few minutes of the performance burn.

On the Exynos 2100 side, Samsung’s +40% performance claim can be considered accurate just for the fact that it generally applies to both peak and sustained performance figures. At peak performance, the SoC is just as absurd at 8W load, which is impossible to maintain. The good news here though, is that when throttling down, the Exynos 2100 is notably better than the Exynos 990 – however that’s not sufficient to catch up to last year’s Snapdragon 865, much less the new Snapdragon 888.

Samsung’s 5LPE process appears to be lacking

We don’t have deeper technical insights as to how Samsung’s process node compares to TSMC’s nodes other than the actual performance of the chips we have in our hands, so I’m basing my arguments on the measured data that I’m seeing here.

At lower performance levels, we noted that the 5LPE node doesn’t look to be any different than TSMC’s N7P node, as the A55 cores in the Snapdragon 888 performed and used up exactly the same amount of power as in the Snapdragon 865. At higher performance levels however, we’re seeing regressions – the middle Cortex-A78 cores of the S888 should have been equal power, or at least similar, to the identically clocked A77 cores of the S865, however we’re seeing a 25% power increase this generation.

Similarly, in theory, the Exynos 2100 Cortex-A78 cores at 2.81GHz should have been somewhat similar in power to the 2.84GHz A77 cores of a Snapdragon 865, but it’s again at a 20-25% disadvantage in efficiency.
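To make the relationship between a power increase and an efficiency loss concrete, here is a minimal sketch of the arithmetic. It assumes, purely for illustration, that performance is exactly equal between the two parts being compared (the text above only says the cores are identically or similarly clocked), and the function name `relative_efficiency` is a hypothetical helper, not something from our test tooling.

```python
def relative_efficiency(perf_ratio: float, power_ratio: float) -> float:
    """Return perf-per-watt of the new part relative to the old one.

    perf_ratio  -- new performance / old performance (1.0 = equal)
    power_ratio -- new power / old power (1.25 = 25% more power)
    A result of 1.0 means efficiency parity.
    """
    if perf_ratio <= 0 or power_ratio <= 0:
        raise ValueError("ratios must be positive")
    return perf_ratio / power_ratio


# Illustrative assumption: equal performance at 25% higher power,
# roughly the A78-vs-A77 situation described above.
eff = relative_efficiency(perf_ratio=1.0, power_ratio=1.25)
print(f"{eff:.2f}x perf/W, i.e. {(1 - eff) * 100:.0f}% lower efficiency")
```

Under that equal-performance assumption, a 25% power increase works out to 0.80x the performance per watt, which is one way to read a "20-25%" efficiency disadvantage: the lower figure is the efficiency drop, the higher figure is the extra power.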

In fact, both SoCs on the CPU side don’t seem to be able to reach the Kirin 9000’s lower power levels and efficiency even though that chip is running at 3.1GHz – it’s clear to me that TSMC’s N5 node is quite superior in terms of power efficiency.

There are two conclusions here. For Samsung’s Exynos 2100, it doesn’t really change the situation all that much. 5LPE does seem to be better than 7LPP, and the new chip is definitely more energy efficient than the Exynos 990, although it does look like the new, much more aggressive behaviour of the CPUs, while benefiting performance, can have an impact on battery life. We need more time with the phones to get to a definitive conclusion in that regard.

For Qualcomm’s Snapdragon 888, the new chip’s manufacturing seems to be giving it headwinds. At best, we’re seeing flat energy efficiency, and at worst, we’re seeing generational regressions. This all depends on the operating point, but generally, the new chip seems to be slightly more power hungry than its predecessor – although again, performance has indeed improved. On the CPU side, the performance boost could be noticeable, but more problematic is the sustained GPU performance increase, which is still quite minor. It’s at this point where we have to talk about things other than CPU and GPU, such as Qualcomm’s new Hexagon accelerator, or new camera and ISP capabilities. We weren’t able to test the AI/NPUs today as the software frameworks on the S21 Ultra aren’t complete so it’s something we’ll have to revisit in the future. Looking at all these results, it suddenly makes sense as to why Qualcomm launched another bin/refresh of the Snapdragon 865 in the form of the Snapdragon 870.

Overall, this generation seems a bit lacklustre. Samsung LSI still has work ahead of them in improving fundamental aspects of the Exynos SoCs: maturing the CPU cluster integration with the memory subsystem and adopting AMD’s RDNA architecture GPU seem to be the two top items on the to-do list for the next generation, along with general power efficiency improvements. Qualcomm, while seemingly having executed things quite well this generation, seems to be limited by the process node. We can’t really blame them for this if they couldn’t get the required TSMC volume, but it also means we’re nowhere near closing the gap with Apple’s SoCs.

In general, I’m sure this year’s devices will be good – but one should have tempered expectations. We'll be following up with full device reviews of the Galaxy S21 Ultras as well as the smaller Galaxy S21 soon - so stay tuned.

123 Comments

  • eastcoast_pete - Monday, February 8, 2021 - link

    Andrei, also special thanks for the power draw comparison of the A55 Little Cores in the (TSMC N7) 865 vs the (Samsung 5 nm) 888! That one graph tells us everything we need to know about what Samsung's current "5 nm" is really comparable to. I really wonder if QC's decision to choose Samsung's fabbing was more based on availability (or absence thereof for TSMC's 5 nm) or on price?
  • DanD85 - Monday, February 8, 2021 - link

    Well, seems like Apple hogging most of TSMC's 5nm node leaves others with no other choice but going with the lesser foundry.
  • heraldo25 - Monday, February 8, 2021 - link

    For such a thorough review it is shocking to see that software versions (build number) used during tests are not stated.
    It is absolutely essential that the review contains software versions, so that others can try to replicate results, and for the reviewing site, to have references during re-tests.
  • name99 - Monday, February 8, 2021 - link

    The milc win is certainly from the data prefetcher. In simulation milc also benefits massively from runahead execution, ie same principle (bring in data earlier).

    Has anyone identified a paper or patent that indicates what ARM are doing? A table driven approach (markov prefetcher) still seems impractical, and ARM don't go in for blunt solutions that just throw area at the problem. They might be doing something like scanning lines as they enter L2 for what look like plausible addresses, and prefetching based on those, which would cover a large range of pointer-based use cases, and seems like the sort of smart low area solution they tend to favor.
  • trivik12 - Monday, February 8, 2021 - link

    Hope Qualcomm moves next gen flagship SOC to TSMC again. Cannot be at so much disadvantage. Of course Samsung 3nm could narrow the gap, but that is more for 2023 flagships.
    Disappointing to see Exynos disappoint again. How is Exynos1080 as a mid range chipset?
  • geoxile - Monday, February 8, 2021 - link

    Their 3nm is expected to be on par with TSMC N5. The expected gains over 7nm are only 30% higher performance, 35% die area reduction, and 40-50% power reduction. Considering 5LPE is still behind N7P it's not much and will barely be on par with N5 in density let alone efficiency.
  • jeremyshaw - Monday, February 8, 2021 - link

    In other words, Samsung strangled then killed SARC for their failures, only to find the failures were with SSI itself.
  • geoxile - Monday, February 8, 2021 - link

    You must be kidding... The Exynos 2100 is at least somewhat close to the Snapdragon 888 in CPU performance. Mali continues to be a problem, and remains so even for the Kirin 9000 on TSMC N5. Mongoose was an abomination that belonged maybe in 2015. Samsung Semiconductor is less competent than TSMC but SARC's mongoose team was a joke.
  • EthiaW - Monday, February 8, 2021 - link

    All those attempts to spend transistors niggardly and boost performance by high frequency have failed miserably.
    Single transistor performance seems to be decaying from node to node now. A flat or insufficiently increased transistor count = performance regression.
  • eastcoast_pete - Monday, February 8, 2021 - link

    Andrei, when you're testing the actual phone, could you check the battery life with the 5G modem on and off, respectively? 5G modems are supposedly quite power hungry also, and, if it's possible to turn 5G off (but leaving 4G LTE on), it would be interesting to see just how much power 5G really consumes.
