GPU Performance

Snapdragon 835’s updated Adreno 540 GPU shares the same basic architecture as Snapdragon 820’s Adreno 530, but receives some optimizations to remove bottlenecks along with some tweaks to its ALUs and register file. The Adreno 540 also reduces the amount of work done per pixel by using improved depth rejection, which could further improve performance and reduce power consumption.

Qualcomm is claiming a general 25% increase in 3D rendering performance relative to the Adreno 530 in S820. While not officially confirmed, it appears that Qualcomm is using the move to 10nm to increase peak GPU frequency to 710MHz, a roughly 14% increase over S820’s peak operating point, which would account for a significant chunk of the claimed performance boost.
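As a rough sanity check on that estimate, the claimed gain can be split into a clock-speed part and a per-clock part. The sketch below uses only the 25% and 14% figures quoted above and assumes performance scales linearly with GPU frequency, which is a simplification; the function name is purely illustrative.

```python
def per_clock_gain(total_gain: float, clock_gain: float) -> float:
    """Return the fractional gain left after removing the clock-speed gain.

    Assumes performance scales linearly with frequency, so gains multiply:
    (1 + total) = (1 + clock) * (1 + per_clock).
    """
    return (1 + total_gain) / (1 + clock_gain) - 1


total_gain = 0.25  # Qualcomm's claimed 3D rendering improvement
clock_gain = 0.14  # approximate frequency increase, assuming a 710MHz peak

residual = per_clock_gain(total_gain, clock_gain)
print(f"Gain not explained by clock speed: {residual:.1%}")
```

Under these assumptions, a little under 10% of the claimed improvement would have to come from the microarchitecture changes rather than from the higher clock.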

GFXBench T-Rex HD (Onscreen)

GFXBench T-Rex HD (Offscreen)

GFXBench T-Rex is an older OpenGL ES 2.0-based game simulation that’s not strictly limited by shader performance like the newer tests, which is one reason why flagship phones have been hitting the 60fps V-Sync limit for a while now in the onscreen portion of the test. More recently, we’ve seen the iPhone 7 Plus and Mate 9, which both have 1080p displays, average 60fps over the duration of the test. Now the Snapdragon 835 MDP/S becomes the first 1440p device to reach this milestone.

The Snapdragon 835 MDP/S outperforms the iPhone 7 Plus and Mate 9 when running offscreen at a fixed 1080p resolution. It’s also 25% faster than the Pixel XL, the highest performing Snapdragon 820 phone, exactly matching Qualcomm’s performance claim. Sliding a little further back along Adreno’s roadmap shows the Adreno 540 with almost a 2x advantage over the Nexus 6P’s Adreno 430 and a 4.5x advantage over the ZUK Z1’s Adreno 330.

GFXBench Car Chase ES 3.1 / Metal (On Screen)

GFXBench Car Chase ES 3.1 / Metal (Off Screen 1080p)

The GFXBench Car Chase game simulation uses a modern rendering pipeline with the latest features found in OpenGL ES 3.1 plus Android Extension Pack (AEP), including tessellation. Like many current games, it stresses ALU performance to deliver advanced effects.

Lower resolution 1080p displays paired with modern GPUs elevate the LeEco Le Pro3 (S821), OnePlus 3T (S820), and Huawei Mate 9 (Kirin 960) to the top of the chart in the onscreen portion of the test. The Snapdragon 835 MDP/S is the fastest 1440p device, besting the second-place Pixel XL by 29%.
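To put the onscreen resolution gap in perspective, the short calculation below compares how many pixels each panel class has to render per frame. It assumes standard 16:9 panels (1920x1080 and 2560x1440); the helper name is illustrative, and per-frame cost does not scale perfectly with pixel count in practice.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels rendered per frame at the given resolution."""
    return width * height


fhd = pixel_count(1920, 1080)  # assumed 1080p panel dimensions
qhd = pixel_count(2560, 1440)  # assumed 1440p panel dimensions

ratio = qhd / fhd
print(f"1080p: {fhd:,} pixels, 1440p: {qhd:,} pixels, ratio: {ratio:.2f}x")
```

A 1440p device pushes about 1.78x as many pixels per frame, which is why the offscreen tests at a fixed resolution are the fairer comparison between GPUs.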

Moving to the offscreen test shows the Adreno 540 GPU with a 25% lead over the Adreno 530 in S820. I do not usually put too much stock in performance claims on marketing slides, but so far Qualcomm’s claim is surprisingly accurate. Even more impressive is its 55% lead over the Mate 9’s Mali-G71MP8 GPU, which is based on ARM’s latest Bifrost microarchitecture and is running at 960MHz to 1037MHz during this test.

3DMark Sling Shot 3.1 Extreme Unlimited - Overall

3DMark Sling Shot 3.1 Extreme Unlimited - Graphics

3DMark Sling Shot 3.1 Extreme Unlimited - Physics

3DMark Sling Shot Extreme uses either OpenGL ES 3.1 on Android or Metal on iOS and stresses the GPU and memory system by rendering offscreen at 1440p (instead of 1080p like our other tests).

The Snapdragon 835’s 30% better overall score is pretty significant, considering that there’s only an 8% difference between all the phones using the Apple A10, Exynos 8890, Kirin 960, and S820/S821 SoCs. Diving into the graphics segment shows the Snapdragon 835 MDP/S outperforming the iPhone 7 Plus by 10% and both the S820 and Exynos 8890 versions of the Galaxy S7 by 24%.

The Adreno 530 saw a significant uplift in geometry processing from changes to its microarchitecture, but the Adreno 540 does not appear to have received any additional changes in this area, judging by its similar performance in 3DMark Sling Shot’s first graphics test. ARM’s Mali GPUs have done comparatively well in geometry processing tasks in the past, and in the first graphics test the Adreno 540 is only 11% faster than the Mate 9’s Mali-G71 GPU.

It’s in the second graphics test, which emphasizes shader performance, where we see the biggest gains from Adreno 540, with a 34% lead over the Galaxy S7’s Adreno 530 and a 50% lead over the Mate 9’s Mali-G71. Qualcomm’s changes to its ALUs and register file seem to pay dividends here.

The Physics test runs on the CPU and is heavily influenced by how well an SoC’s memory controllers handle random access patterns. The Snapdragon 835 MDP/S finishes ahead of the Mate 9 by 14% despite their similar CPU performance. The S835’s memory controllers deliver lower latency and higher bandwidth than Kirin 960’s, which could explain its better result in this test.

Basemark ES 3.1 / Metal

Basemark ES 3.1 / Metal Onscreen Test

Basemark ES 3.1 / Metal Offscreen Test

The Basemark ES 3.1 game simulation uses either OpenGL ES 3.1 on Android or Metal on iOS. It includes a number of post-processing, particle, and lighting effects, but does not include tessellation like GFXBench 4.0 Car Chase.

Until Vulkan support is added to benchmarks later this year, Android devices will continue to rely on OpenGL, putting them at a huge disadvantage to iPhones running Apple’s Metal graphics API, which dramatically reduces driver overhead when issuing draw calls. In this particular test, Metal helps push the iPhone 7 Plus in front of the Snapdragon 835 MDP/S by 73%.

ARM’s Mali GPUs perform better than their Adreno counterparts when running Basemark ES 3.1’s workloads; the Exynos 8890’s Mali-T880MP12 is 15% faster than S820’s Adreno 530 and Kirin 960’s Mali-G71MP8 is 25% faster than S835’s Adreno 540 in the offscreen test. The Snapdragon 835 MDP/S does perform 40% faster than the S820 in the Pixel XL, which is quite a bit more than the 25% gain it sees in our other tests.

GFXBench ALU 2 (Offscreen)

The common theme in all of the game simulation tests is the Adreno 540’s better ALU performance, so I thought it would be interesting to see how well it performs in GFXBench’s synthetic ALU test. Surprisingly, its microarchitecture improvements are of no help here. The S835’s 14% advantage over the S820 and 8% advantage over the S821 exactly mirror their differences in GPU frequency, assuming 710MHz for S835, suggesting this workload is bottlenecked elsewhere. It still manages to outperform the Mate 9’s Kirin 960 by 32%, however.
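If the ALU 2 result really does track clock speed alone, the quoted advantages imply specific peak clocks for the older parts. The back-of-envelope sketch below works backward from the assumed 710MHz figure and the 14% and 8% leads; the function name is illustrative, and the outputs are implied values under linear scaling, not confirmed specifications.

```python
S835_CLOCK_MHZ = 710  # assumed peak GPU clock, not officially confirmed


def implied_clock_mhz(advantage: float, reference_mhz: float = S835_CLOCK_MHZ) -> float:
    """Clock an older GPU would need if the score gap were purely clock-driven."""
    return reference_mhz / (1 + advantage)


# Advantages of the S835 in the offscreen ALU 2 test, as quoted above.
for name, advantage in (("S820", 0.14), ("S821", 0.08)):
    print(f"{name}: ~{implied_clock_mhz(advantage):.0f}MHz implied peak GPU clock")
```

If the implied clocks line up with the parts' actual peak operating points, the score gap is explained by frequency alone, consistent with the conclusion that the microarchitecture changes do not help in this test.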


128 Comments


  • Drumsticks - Wednesday, March 22, 2017 - link

    On iOS or Windows, sure. Android has widely different design parameters.

    Instead of just dismissing a 16-page analysis offhand, you should give it a read.

    http://www.anandtech.com/show/9518/the-mobile-cpu-...

    Single-threaded performance is king on iOS and Windows. Android seems to very much prefer having access to many threads in a lot of use cases.
  • AnandTechReader2017 - Friday, April 21, 2017 - link

    Completely disagree for the Android OS.

    A nice thing Android does, if you'd like to try a simple Java application, is that it automatically optimizes applications to use multiple threads even if you as a developer don't design it to do so. I noticed this the other day while building a quick prototype for a network application that I just wanted to test out: it never went above 20% on each core (Android has a nice feature under Developer > Show CPU usage), even though the app should have frozen while waiting for the network thread to complete. The lovely libraries provided definitely have an impact, and CPU designers take advantage of that fact when they create a CPU for an Android system, the same way Apple does when it focuses on single-thread performance.
  • MrSpadge - Wednesday, March 22, 2017 - link

    They showed the Android browser using many threads. What was missing from my POV was the performance gain from these additional threads. One can assume Google wouldn't do it like that if it wasn't worth it, but I'd prefer measurements.
  • lefty2 - Wednesday, March 22, 2017 - link

    Browsers use many threads to stop I/O requests from blocking the main thread, but those I/O threads are not doing any work, just waiting for the request to return from the server.
  • melgross - Wednesday, March 22, 2017 - link

    No, what they found was that battery life wasn't affected. Sometimes using all cores gave a small boost to performance, and sometimes it degraded performance. It's mostly marketing hype. The more the better.
  • Gasaraki88 - Wednesday, March 22, 2017 - link

    This is so wrong... Phone apps are almost all multithreaded.
  • tuxRoller - Wednesday, March 22, 2017 - link

    ^
    |

    That person knows what they are talking about.
  • tuxRoller - Wednesday, March 22, 2017 - link

    "most smartphone apps don't use multiple threads"

    Please show me the data backing up that statement.
  • melgross - Wednesday, March 22, 2017 - link

    Multicore performance isn't real world on phones, and likely on tablets as well. Very few apps, almost none of them in fact, support more than two cores. Even multitasking, which isn't done on phones the way it might be on desktops, doesn't benefit much from more cores. And the case for using all big/little cores at once is even weaker.

    Maybe, someday that will change, but not yet.
  • BurntMyBacon - Thursday, March 23, 2017 - link

    I do think multicore performance is more important than you seem to believe, but as I said above, single core performance is also important. It is generally more important than multicore performance, but not so much that I can just dismiss multicore performance. The A10 still does well in most multithreaded use cases despite the lower number of cores.

    I've never been a fan of big.LITTLE anyways (particularly with the large clusters). It seems like the wrong way to handle the efficiency issue to me. Without going into a long discussion, I'll point out that the A9 (and predecessors) and Intel's lineup do just fine without it. If Android could assign tasks to individual cores based on need rather than swapping entire clusters in and out, then there may be some benefit to keeping background processes on low power cores to improve battery life and responsiveness of foreground applications, but you still wouldn't need a 4+4 configuration. In any case, that's a discussion for another time.
