Today Qualcomm is disclosing a set of benchmark results for their new Snapdragon 888 SoC that’s set to power next year’s flagship Android devices. Usually, as in years past, we would have had opportunities to benchmark Qualcomm’s reference designs ourselves during the chipset launch event, or a few weeks later during CES. However, due to obvious circumstances, this wasn’t possible this year.

As an alternative, Qualcomm is instead sharing with the press a set of benchmark results from their new Snapdragon 888 reference design phone. Usually, the point of having the press benchmark the devices themselves is that it adds independent verification of the benchmark scores. This time around we’ll have to make a little leap of faith in the accuracy of Qualcomm’s numbers – of course we still pretty much expect the figures to be accurate and be reproduced in commercial devices.

Unfortunately, because the majority of our more interesting mobile test suite around SoCs consists of custom internal benchmarks, those will be missing from today’s rather brief coverage.

Among the benchmarks that Qualcomm has run are AnTuTu, GeekBench, GFXBench Aztec Normal and Manhattan 3.0, Ludashi AiMark, AITuTu, MLPerf and UL Procyon. We really only run a subset of those benchmarks as part of our regular coverage, so I’ll just be focusing on the very basics with GeekBench, GFXBench and Procyon.

GeekBench 5

Starting off we have GeekBench 5, which in my opinion is generally a good overall performance benchmark for CPUs, and generally scales in line with SPEC. Here we see the new Snapdragon 888’s first-time use of Cortex-X1 cores in action.

The single-threaded performance score has gone up from 919 points on the Snapdragon 865 to 1135 on the new SoC, a 23.5% performance uplift versus its direct predecessor. This is relatively in line with Qualcomm’s promoted performance boost of 25%, and generally is what we expected given Qualcomm’s implementation of the Cortex-X1 in the new chipset. As a reminder, the new X1 cores are clocked at 2.84GHz – the same frequency as the A77 cores on the S865, but lower than the 3.09GHz A77 cores of the Snapdragon 865+. As a result, against the 865+ the 888's performance advantage is only 15.4%, which doesn’t sound quite as exciting.
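The percentages above are simple ratios, and a minimal sketch makes the arithmetic easy to re-check. The function and variable names here are our own illustration. The Snapdragon 865+ score is not quoted in the text, so the last step only back-computes the value implied by the stated 15.4% advantage; treat it as an inference rather than a measured result.

```python
def uplift_percent(new_score: float, old_score: float) -> float:
    """Return the relative gain of new_score over old_score, in percent."""
    return (new_score / old_score - 1.0) * 100.0


# Single-threaded GeekBench 5 scores quoted in the text.
S865_SINGLE = 919
S888_SINGLE = 1135

gain_vs_865 = uplift_percent(S888_SINGLE, S865_SINGLE)
print(f"888 vs 865: {gain_vs_865:.1f}%")  # prints "888 vs 865: 23.5%"

# The 865+ score is not quoted in the text. This back-computes the score
# implied by the stated 15.4% advantage; it is an inference, not a measurement.
STATED_GAIN_VS_865_PLUS = 15.4
implied_865_plus = S888_SINGLE / (1.0 + STATED_GAIN_VS_865_PLUS / 100.0)
print(f"Implied 865+ score: roughly {implied_865_plus:.0f}")
```

The same helper can be applied to any pair of scores to see how a result compares with a vendor's advertised uplift.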

Multi-threaded performance of the new chip comes in at 16.9% better than its predecessors. This actually was a bit odder to see as I was expecting larger improvements. Thinking more about it, I guess it makes sense – the new Cortex-A78 core, which is being used as the 3x middle cores of the new SoC, is only advertised as offering a 7% IPC advantage over its predecessor.

Meanwhile, Qualcomm did increase their L2 cache size on the middle cores from 256KB to 512KB, but otherwise left their clock frequencies unchanged at 2.42GHz. Together with the unchanged 4x Cortex-A55 cores at 1.8GHz I guess the overall performance for the complete cluster hasn't really changed all that much, with the X1 prime core being the hero of the show for this generation.

Moving on to GPU performance, the new Snapdragon 888 features the new Adreno 660 GPU, where Qualcomm promises a 35% performance uplift. Qualcomm published GFXBench Aztec Normal and Manhattan 3.0 scores. We moved on from Manhattan 3.0 to Manhattan 3.1 a long time ago, so we don’t have comparison scores against Qualcomm’s 169fps figure, but we do run Aztec Normal.

In this benchmark, Qualcomm’s listed score of 86fps is over 55% faster than previous generation Snapdragon 865 devices. This might be an outlier score, or it could be a sign of the benefits of the additional memory bandwidth afforded by the SoC's faster LPDDR5-6400 support – Qualcomm did say that this generation the GPU will be able to stress that part of the chip much harder.
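To put the gap between the promised 35% and the observed "over 55%" into frame rates, here is a rough sketch. It assumes nothing beyond the figures in the text: the Snapdragon 865 baseline is not quoted directly, so the code derives only an upper bound for it from the 55% statement, and the constant names are our own.

```python
# Figures taken from the text.
QUALCOMM_AZTEC_FPS = 86.0    # Qualcomm's listed Aztec Normal score
STATED_MIN_GAIN = 0.55       # "over 55% faster" than Snapdragon 865 devices
PROMISED_GPU_UPLIFT = 0.35   # Qualcomm's advertised GPU uplift

# "Over 55% faster" means the previous-generation score sits below this value.
# This is an inferred bound, not a measured Snapdragon 865 result.
baseline_upper_bound = QUALCOMM_AZTEC_FPS / (1.0 + STATED_MIN_GAIN)

# What the advertised 35% uplift alone would yield from that same baseline.
expected_at_promised = baseline_upper_bound * (1.0 + PROMISED_GPU_UPLIFT)

print(f"Implied 865 baseline: below {baseline_upper_bound:.1f} fps")
print(f"Score at a 35% uplift: about {expected_at_promised:.1f} fps")
print(f"Qualcomm's listed score: {QUALCOMM_AZTEC_FPS:.0f} fps")
```

Under these assumptions the listed score lands several frames per second above what the advertised uplift alone would predict, which is why it reads as either an outlier or a bandwidth-driven gain.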

While the Snapdragon 888 doesn’t look like it’ll match the peak performance scores of the A13 or A14 SoCs used in Apple's iPhones, sustained performance will depend quite a bit on the power consumption of the chip. If this lands in at between 4 and 4.5W, then the majority of flagship Android phones in 2021 will likely be able to sustain this peak performance figure and allow Qualcomm to regain the mobile performance crown from Apple. Otherwise if the chip has to significantly throttle, then 888 will probably fall short of retaking the crown. But even if that's the case, for Android users it shouldn't matter too much: the generational leap over 2020 phones would still be immense, and by far one of the largest GPU performance leaps Qualcomm has been able to achieve to date.

UL Procyon

In terms of AI Benchmarks, Qualcomm didn’t really present anything in the same manner that we do, so this is a good opportunity to add UL’s new Procyon AI Inference benchmark to our suite.

The benchmark is able to run on various accelerator blocks within an SoC, and it is also able to take advantage of custom TensorFlow Delegates, such as Samsung’s EDEN framework.

Here the new Snapdragon 888 is posting outstandingly good performance, delivering almost 3x the score of the Snapdragon 865+ and outright exceeding the theoretical throughput rate increases of the new Hexagon 780. The new Hexagon is a completely new IP and pretty much the single biggest improvement of the whole Snapdragon 888 as it promises great advancements in performance and power efficiency – not only against previous generation Snapdragons but also against competitor designs which don’t yet have such a flexible DSP/ML hardware block in their SoCs.

[Charts: MLPerf 0.7.1 - Image Classification, Image Classification (Offline), Object Detection, Image Segmentation, Language Processing]

Qualcomm surprisingly also published MLPerf results on the new chip. The Android version of MLCommons' benchmark suite is fresh out of the oven, and among other things, gives us a new standardized test that’s more aligned across the industry.

The new Snapdragon 888 is showcasing tremendous performance leaps compared to its predecessor, with gains of up to 4x in some of the tests. Again, this is well beyond just the theoretical computational throughput improvements of the execution units of the IP blocks, and very likely is tied to the new memory architecture of the new Hexagon block as a whole.

Overall Good First Impressions – Waiting For First Devices

Following up on its announcement just a few weeks ago, today’s benchmark score release helps to further validate our first impressions of (and expectations for) Qualcomm's new Snapdragon 888 SoC.

On the CPU side we’re seeing good improvements, even with Qualcomm's conservative claims. And meanwhile the new Adreno GPU seems to perform as well as Qualcomm has promised – if not a bit better. So as things stand, the missing piece of the puzzle is power consumption; if it ends up being competitive there, then Qualcomm has a shot at regaining the performance crown in mobile.

Finally, the new Hexagon DSP really stood out as being the most exciting piece of new hardware in the Snapdragon 888. These performance figures underscore just how far Qualcomm has come in a single generation, as evidenced by the new SoC's tremendous performance leaps over earlier chips.

Ultimately, while this isn't really one of our traditional performance previews – seeing as how we have to place trust in Qualcomm that their figures will be reproducible on commercial devices – it's at least a starting point for talking about performance. And, with that taken at face value, it’s looking like the new Snapdragon 888 won’t disappoint, setting up Qualcomm for another solid year of execution on the SoC front.


75 Comments


  • johnathanblade - Saturday, January 30, 2021 - link

    What this statement ignores is that there are many major competitors in the SOC world. All of them lag far behind Apple, but in second is pretty consistently Qualcomm. Qualcomm does peak better, it does sustained better, and it does graphics better than any of its non-Apple competition. Huawei comes close occasionally. Samsung, MediaTek, and Rockchip don't.

    Apple makes expensive chips that they know they can sell, and they fab farther away from the ARM reference than the other manufacturers.
  • s.yu - Thursday, December 24, 2020 - link

    >a tech super power with endless pockets
    You mean Intel, with world No.1 R&D expenditure? ;)
  • RobJoy - Tuesday, January 5, 2021 - link

    Qualcomm is loaded, so money is no issue.
    I think it is the patents.
  • Wilco1 - Friday, December 18, 2020 - link

    Android SoC vendors aren't willing to spend a lot of area on big cores and huge caches: here we have 4x512KB L2 plus a tiny 4MB L3 vs A14's 12MB L2 and 16MB L3.
  • fishingbait - Friday, December 18, 2020 - link

    That has nothing to do with it at all. Android SOC vendors - really just Qualcomm, MediaTek and Samsung unless you want to count Huawei - chose to use more cores instead of bigger ones. While the competition - Apple - was until recently still using 2 cores and no one was making a real effort to make ARM laptops beyond a few cheap Chromebooks, it wasn't a problem. But when Apple went to 4 and then 6 cores and started crushing Android SOCs practically overnight, then there was a problem. MediaTek tried to address it with a 10 core design with 2 fastest, 4 fast and 4 efficiency cores, but it only gave them a 10% performance increase over their octacore designs. So ARM and Samsung went back to the drawing board to create a much better single core design: the X1. But this is a first generation design. There are still issues that they need to work out in order to have this core perform even faster. Also, they need to go from 1+3+4 design to a 2+2+4 design, from 1 Cortex X1 to 2, if they are going to have any chance of rivaling the A14. The problem is that by the time they perfect the design, Apple may well have 4 Firestorm performance cores instead of 2 for their smartphone chips, and on a 3nm process at that. But the biggest problem - coming up with their own big core design - has been solved.

    Ultimately, there is no "need" to play catchup with Apple smartphone speeds anyway. The vast majority of Android phones sold cost under $400, and the expensive Android phones are so because they offer a ton of features that iPhones won't have for 3-4 years. The real need is to come out with a PC chip for laptops and desktops to compete with the M1. That is the real value of the Cortex X1. We might even see Qualcomm make another try at being a supplier for ARM servers again - they began to in 2015 but gave up and pulled the plug in 2018 - with Samsung potentially joining them.
  • Wilco1 - Friday, December 18, 2020 - link

    No - multithreaded performance is not an issue. That's already within 10% of A14 as the results above show. However single-threaded GB performance is 40% lower. The big issue is that the big cores don't have nearly enough cache, so they don't run as fast as they could. That can be solved by adding more cache, and lots and lots of it. L2 could become 1MB per core like in the Neoverse cores, L3 quadrupled to 16MB and 8-16MB as a system cache. That with an improved X1 core at similar frequency as SD865+ should match A14 both single and multithreaded.

    I'd say 4 big cores is more than enough (whether it is 1+3 or 2+2). Replacing the 4 little cores with one A78 variant optimized for best perf/Watt would be a good idea too.

    Whether it is worth chasing A14 is a good question indeed, but the key problem is willingness to spend lots of area on big cores and big caches (which either means higher cost or having to cut down GPU or AI performance).
  • dudedud - Sunday, December 20, 2020 - link

    "That with an improved X1 core at similar frequency as SD865+ should match A14 both single and multithreaded."

    A13 single yes. A14 single nop.
  • ichaya - Monday, December 21, 2020 - link

    You can see the diminishing returns in increasing on-die cache with A14 vs M1 L2 cache increase from 8 to 12MB. One core can't access all of the increase, and there are more big cores in the M1, but still a 33% increase in what one core can access (6 vs 8MB?) only gives a 7% increase in single core performance.

    That 7% might be worth it on a MB Pro, but it's most definitely not worth it for a phone SoC. QC SoCs are smaller than Apple SoCs, the 845 was the first to have a SLC/L3, and the 6/700-series SoCs still don't have it. More on-die cache is always good, it's a matter of price, though I would agree QC could do better.

    The A14 also has 4 memory channels, 16K pages, wider cores fed by more OoO execution and larger caches, and the perf/watt is impressive enough for low power, that I would say we should be seeing some of the same things from other ARM designs and even AMD/Intel eventually.
  • TheinsanegamerN - Saturday, December 19, 2020 - link

    "That has nothing to do with it at all"

    You are completely ignorant. Larger caches absolutely play a part in higher IPC. What do you think contributes to "larger" cores? They don't just throw more transistors into an FPU and call it a day.
  • RobJoy - Tuesday, January 5, 2021 - link

    They don't really.
    AMD has had 2x the cache Intel had and Intel always had better IPC than them.
    Until recently when cores were drastically redesigned.
