GPU Performance

Snapdragon 835’s updated Adreno 540 GPU shares the same basic architecture as Snapdragon 820’s Adreno 530, but receives some optimizations to remove bottlenecks along with some tweaks to its ALUs and register file. The Adreno 540 also reduces the amount of work done per pixel by using improved depth rejection, which could further improve performance and reduce power consumption.

Qualcomm is claiming a general 25% increase in 3D rendering performance relative to the Adreno 530 in S820. While not officially confirmed, it appears that Qualcomm is using the move to 10nm to increase peak GPU frequency to 710MHz, a roughly 14% increase over S820’s peak operating point, which would account for a significant chunk of the claimed performance boost.
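As a rough back-of-the-envelope check (a sketch, not Qualcomm's own breakdown), the claimed 25% gain can be split into a clock-speed component and a remainder. This assumes performance scales linearly with GPU frequency, which real workloads only approximate, and the function name is hypothetical.

```python
def residual_gain(total_gain: float, clock_gain: float) -> float:
    """Return the speedup left after dividing out the clock increase.

    Both arguments are fractional gains (0.25 means 25% faster).
    Assumes performance scales linearly with frequency, which is an
    idealization.
    """
    return (1.0 + total_gain) / (1.0 + clock_gain) - 1.0


# Figures from the text: 25% claimed gain, roughly 14% higher peak clock.
remainder = residual_gain(0.25, 0.14)
print(f"Gain not explained by frequency: {remainder:.1%}")
```

Under that assumption, a little under 10% of the claimed gain would be left for the architectural changes.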

GFXBench T-Rex HD (Onscreen)

GFXBench T-Rex HD (Offscreen)

GFXBench T-Rex is an older OpenGL ES 2.0-based game simulation that’s not strictly limited by shader performance like the newer tests, which is one reason why flagship phones have been hitting the 60fps V-Sync limit for a while now in the onscreen portion of the test. More recently, we’ve seen the iPhone 7 Plus and Mate 9, which both have 1080p displays, average 60fps over the duration of the test. Now the Snapdragon 835 MDP/S becomes the first 1440p device to reach this milestone.

The Snapdragon 835 MDP/S outperforms the iPhone 7 Plus and Mate 9 when running offscreen at a fixed 1080p resolution. It’s also 25% faster than the Pixel XL, the highest performing Snapdragon 820 phone, exactly matching Qualcomm’s performance claim. Sliding a little further back along Adreno’s roadmap shows the Adreno 540 with almost a 2x advantage over the Nexus 6P’s Adreno 430 and a 4.5x advantage over the ZUK Z1’s Adreno 330.

GFXBench Car Chase ES 3.1 / Metal (On Screen)

GFXBench Car Chase ES 3.1 / Metal (Off Screen 1080p)

The GFXBench Car Chase game simulation uses a modern rendering pipeline with the latest features found in OpenGL ES 3.1 plus Android Extension Pack (AEP), including tessellation. Like many current games, it stresses ALU performance to deliver advanced effects.

Lower-resolution 1080p displays paired with modern GPUs elevate the LeEco Le Pro3 (S821), OnePlus 3T (S821), and Huawei Mate 9 (Kirin 960) to the top of the chart in the onscreen portion of the test. The Snapdragon 835 MDP/S is the fastest 1440p device, besting the second-place Pixel XL by 29%.
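To put the onscreen resolution gap in numbers, the sketch below compares pixel counts. It assumes the usual 16:9 panel sizes of 2560x1440 for 1440p and 1920x1080 for 1080p. Frame rate does not scale strictly with pixel count, so this is only an indication of the extra per-frame work.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels rendered per frame at a given resolution."""
    return width * height


# Assumed panel sizes for the two display classes discussed in the text.
qhd = pixel_count(2560, 1440)   # "1440p"
fhd = pixel_count(1920, 1080)   # "1080p"

ratio = qhd / fhd
print(f"1440p renders {ratio:.2f}x the pixels of 1080p")
```

A 1440p device therefore has to shade roughly 78% more pixels per frame than a 1080p device in the onscreen tests, which is why the offscreen results at a fixed resolution are the fairer GPU comparison.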

Moving to the offscreen test shows the Adreno 540 GPU with a 25% lead over the Adreno 530 in S820. I do not usually put too much stock in performance claims on marketing slides, but so far Qualcomm’s claim is surprisingly accurate. Even more impressive is its 55% lead over the Mate 9’s Mali-G71MP8 GPU, which is based on ARM’s latest Bifrost microarchitecture and is running at 960MHz to 1037MHz during this test.

3DMark Sling Shot 3.1 Extreme Unlimited - Overall

3DMark Sling Shot 3.1 Extreme Unlimited - Graphics

3DMark Sling Shot 3.1 Extreme Unlimited - Physics

3DMark Sling Shot Extreme uses either OpenGL ES 3.1 on Android or Metal on iOS and stresses the GPU and memory system by rendering offscreen at 1440p (instead of 1080p like our other tests).

The Snapdragon 835’s 30% better overall score is pretty significant, considering that there’s only an 8% difference between all the phones using the Apple A10, Exynos 8890, Kirin 960, and S820/S821 SoCs. Diving into the graphics segment shows the Snapdragon 835 MDP/S outperforming the iPhone 7 Plus by 10% and both the S820 and Exynos 8890 versions of the Galaxy S7 by 24%.

Unlike the Adreno 530, which saw a significant uplift in geometry processing from changes to its microarchitecture, the Adreno 540 does not appear to have received any further geometry changes, judging by its similar performance in 3DMark Sling Shot’s first graphics test. ARM’s Mali GPUs have done comparatively well in geometry processing tasks in the past, and in the first graphics test the Adreno 540 is only 11% faster than the Mate 9’s Mali-G71 GPU.

It’s in the second graphics test, which emphasizes shader performance, where we see the biggest gains from Adreno 540, with a 34% lead over the Galaxy S7’s Adreno 530 and a 50% lead over the Mate 9’s Mali-G71. Qualcomm’s changes to its ALUs and register file seem to pay dividends here.

The Physics test runs on the CPU and is heavily influenced by how well an SoC’s memory controllers handle random access patterns. The Snapdragon 835 MDP/S finishes ahead of the Mate 9 by 14% despite their similar CPU performance. The S835’s memory controllers deliver lower latency and higher bandwidth than Kirin 960’s, which could explain its better result in this test.

Basemark ES 3.1 / Metal

Basemark ES 3.1 / Metal Onscreen Test

Basemark ES 3.1 / Metal Offscreen Test

The Basemark ES 3.1 game simulation uses either OpenGL ES 3.1 on Android or Metal on iOS. It includes a number of post-processing, particle, and lighting effects, but does not include tessellation like GFXBench 4.0 Car Chase.

Until Vulkan support is added to benchmarks later this year, Android devices will continue to rely on OpenGL, putting them at a huge disadvantage to iPhones running Apple’s Metal graphics API, which dramatically reduces driver overhead when issuing draw calls. In this particular test, Metal helps push the iPhone 7 Plus in front of the Snapdragon 835 MDP/S by 73%.

ARM’s Mali GPUs perform better than their Adreno counterparts when running Basemark ES 3.1’s workloads; the Exynos 8890’s Mali-T880MP12 is 15% faster than S820’s Adreno 530 and Kirin 960’s Mali-G71MP8 is 25% faster than S835’s Adreno 540 in the offscreen test. The Snapdragon 835 MDP/S does perform 40% faster than the S820 in the Pixel XL, which is quite a bit more than the 25% gain it sees in our other tests.

GFXBench ALU 2 (Offscreen)

The common theme in all of the game simulation tests is the Adreno 540’s better ALU performance, so I thought it would be interesting to see how well it performs in GFXBench’s synthetic ALU test. Surprisingly, its microarchitecture improvements are of no help here. The S835’s 14% advantage over the S820 and 8% advantage over the S821 exactly mirror their differences in GPU frequency, assuming 710MHz for S835, suggesting this workload is bottlenecked elsewhere. It still manages to outperform the Mate 9’s Kirin 960 by 32%, however.
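One way to read this result is to divide each measured lead by the clock advantage: a ratio near 1.0 means the speedup is explained by frequency alone. The sketch below applies that to two leads over the S820's Adreno 530 quoted in this article. The 14% clock advantage rests on the unconfirmed 710MHz figure, the two leads come from different test devices, and the function name is hypothetical.

```python
def per_clock_ratio(perf_gain: float, clock_gain: float) -> float:
    """Per-clock throughput of the newer GPU relative to the older one.

    Both arguments are fractional gains (0.14 means 14% higher).
    A result near 1.0 means the speedup is explained by clock alone.
    """
    return (1.0 + perf_gain) / (1.0 + clock_gain)


# Assumed clock advantage of S835 over S820 (710MHz is unconfirmed).
CLOCK_GAIN_VS_S820 = 0.14

# Performance leads over the S820's Adreno 530, as quoted in the text.
leads = {
    "GFXBench ALU 2": 0.14,
    "3DMark Sling Shot graphics test 2": 0.34,
}

for test_name, perf_gain in leads.items():
    ratio = per_clock_ratio(perf_gain, CLOCK_GAIN_VS_S820)
    print(f"{test_name}: {ratio:.2f}x per clock")
```

On these numbers the synthetic ALU test shows no per-clock improvement, while the shader-heavy 3DMark test shows a per-clock gain in the high teens, consistent with the ALU 2 workload being limited by something other than the changes Qualcomm made.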


128 Comments


  • zeeBomb - Wednesday, March 22, 2017

    It's that time of year again!
  • name99 - Wednesday, March 22, 2017

    "The [3DMark] Physics test runs on the CPU and is heavily influenced by how well an SoC’s memory controllers handle random access patterns. "

    No it isn't, at least not to an extent that matters in any modern CPU. Why do you keep posting this rubbish in review after review?

    The source code is available for examination. It basically tests (frequency)*(number of cores) and is useless for learning anything beyond that. That's why it's always the only test in which Apple looks bad --- because Apple's running two cores as opposed to 4/6/8/10 on Android, and, at least in the past, those cores were under-clocked relative to the Android cores.

    If people want to post the 3DMark Physics numbers, whatever, I don't care. But I do think doing so is a waste of reviewers' and readers' time --- there is simply no useful additional information provided by that benchmark.
    The fact that 3DMark continues to push it (as opposed to the way GeekBench every year or two tries to respond to complaints and concerns about its benchmarks) tells you something about the relative professionalism of the two companies.
  • Matt Humrick - Wednesday, March 22, 2017

    "It basically tests (frequency)*(number of cores)"

    Both of these are factors, but it's not the whole story according to the developer I spoke with at Futuremark. If you have additional information to prove your claim, please share it with me via email. My mind is always open :)
  • name99 - Thursday, March 23, 2017

    I looked into this in detail years ago when there was a big kerfuffle about the iPhone 5S score.
    I'm not interested in spending another day doing the exact same thing. I'll just point out that what I am saying matches the data.
    Sure, I'm not saying that THE ONLY THING is (frequency)*(number of cores), there's some small 5 to 10% variation around that; but that variation is unimportant --- the big picture is embedded in what I said.

    Now, does this mean it's a good benchmark? Well, how much code that people care about is multi-threaded (on Android and otherwise)?
    I'm not interested in relitigating that (given what I consider to be the astonishing incompetence and ignorance we saw on Anandtech the last time this was discussed, with a VAST proportion of readers apparently unaware of such concepts as timesharing, or how to accurately calculate the thread level parallelism of an executing piece of code).
    I will say that the most recent academic papers I've read, dated 2016, referring to work in around 2014, show that it's higher than you might expect, not as high as you might hope. Across a very wide range of Android apps the thread level parallelism is slightly larger than 2, showing, basically (in my interpretation)
    - an Android controlled thread doing misc stuff that's pretty busy
    - a main app thread
    - various small completion routines, async routines, and interrupts
    So basically two cores get you almost all the value in real-world code; a third core occasionally picks up a small amount of extra available work.

    Now read what I am saying before getting upset. I'm NOT saying that ARM is stupid to ship 4 (performance) cores. ARM cores are tiny, they can be of (very occasional) value to a few talented developers today, and the only way we'll EVER get the mass market to code more parallel is to have the hardware out there as the default. So I'm happy that ARM is flooding the world with hexacore, octacore, decacore chips. (And I think Apple is being penny-wise and pound-foolish by not making every SoC they ship a triple core ala A8X --- the extra area would be small, and it would likewise provide an incentive for developers to get off their asses.)

    But that's a different argument from whether core-count provides "visible performance" today.
    I think the answer to that is clearly no. The first thing that matters to most users is snappiness (which depends, primarily, on flash performance, GPU [and the quality of the graphics code], and the performance governor (so does the CPU "start off" fast, or "start off" slow and only get fast after 0.2 seconds of UI interaction?)). Then there are a few places where overall "endurance" performance matters (like much browser stuff, or viewing complicated PDFs --- both of which are very poorly threaded even as of 2017). Finally, the cases where all cores all the time matter, and almost nothing else (the sort of thing 3DMark Physics is testing), are REALLY few and far between.

    Or to put it another way. Most CPU microarchitecture improvement since 2000 has been about discovering and exploiting the stochastic structure of REAL-WORLD computation. There are re-uses and patterns in branching, in memory access, in instruction execution that are exploited ever more aggressively in branch prediction, in cache insertion and liveness tracking, in prefetching, in loop buffers, etc. A benchmark that prides itself on randomness and in providing no way for all those smarts to add value is saying SOMETHING about the worst case performance of a CPU, but it's not clear that that something is of any value to almost everyone.
  • Frenetic Pony - Wednesday, March 22, 2017

    How disappointing that yet another of the very few custom CPU designers is now gone. Looking at the general performance of the CPU now, I see no reason whatsoever to choose Qualcomm over some other, generic ARM hawker that's probably cheaper. They could at least stop pretending and just become a module seller, selling their GPU/modems/etc. separately as there doesn't seem to be any reason to choose a Qualcomm SOC as a whole.

    Other than ditching their stock (if you haven't already) none of this looks good for Qualcomm. Or for ARM for that matter, the A73 doesn't offer any performance boost over the A72 and is still trounced by Apple. Maybe the rumors of Google building its own CPU will come true and we'll see it in the Pixel 2.
  • serendip - Wednesday, March 22, 2017

    No reason? They're one of the few developer and open source friendly chip manufacturers around, even if that relationship ventures into frenemy territory once in a while. Qualcomm modems and imaging blocks are pretty good too.

    Intel are developer friendly but their GPUs can be an abomination to work with. They've also effectively abandoned the mobile space. Mediatek, Huawei and Samsung either give a cold shoulder or a middle finger to devs.
  • StrangerGuy - Thursday, March 23, 2017

    They dropped their own custom cores because why even bother when vanilla ARM does a better job while being cheaper... Plus the economics for a non-Apple, non-Samsung bleeding edge SoC no longer make much sense, and 99.99% of the population buying these phones doesn't and won't give a shit about the SoC or the benchmarks.
  • Meteor2 - Thursday, March 23, 2017

    It wouldn't surprise me if Qualcomm had multiple core design teams competing with each other. We've seen ARM cores come before; maybe full-custom will come back next year.
  • SyukriLajin - Thursday, March 30, 2017

    I think they are just shifting their focus. The fact that they "rebranded" the Snapdragon as a platform instead of just a processor is one indicator. An SoC is more than just CPU cores and they want people to know that. My guess is, they think that investing more money to develop a more unique platform is more important than spending valuable time and money on redoing the CPU core. ARM already invested tons of money to develop it, so there is no reason to reinvent the wheel when you can use the resources to provide a platform that will help them be more different than the others. I think it's wise for them. The resources would be better spent creating a better DSP, modem, GPU etc., which will give them more return than a custom CPU core.
  • MrPhilo - Wednesday, March 22, 2017

    So the Exynos 8895 GPU should in theory be faster than the 540 by quite a bit? Since the Huawei Mate 9 uses 8 GPU cores, whereas the Exynos uses 20, but of course with a lower clock speed. I can see it being at least 20-30% faster than the 540.
