Qualcomm's New Snapdragon S4: MSM8960 & Krait Architecture Explored
by Brian Klug & Anand Lal Shimpi on October 7, 2011 12:35 PM EST- Posted in
- Smartphones
- Snapdragon
- Arm
- Qualcomm
- Krait
- MDP
- Mobile
- SoCs
The Adreno 225 GPU
Qualcomm has historically been pretty silent about its GPU architectures. You'll notice that specific details of Adreno GPU execution resources have been absent from most of our SoC comparisons. Starting with MSM8960, however, that is beginning to change.
The MSM8960 uses a current generation Adreno GPU with a couple of changes. Qualcomm calls this GPU the Adreno 225, a follow-on to Adreno 220. Subsequent Krait designs will use Adreno 3xx GPUs based on a brand new architecture.
As we discussed in our Samsung Galaxy S 2 review, Qualcomm's Adreno architecture is a tile based immediate mode renderer with early-z rejection. By Qualcomm's own admission, Adreno is somewhere in the middle of the rendering spectrum between IMRs and Imagination Technologies' TBDR architectures. One key difference is Adreno's tiling isn't as fine grained as IMG's.
Architecturally the Adreno 225 and 220 are identical. Adreno 2xx is a DX9-class unified shader design. There's a ton of compute on-board with eight 4-wide vector units and eight scalar units. Each 4-wide vector unit is capable of a maximum of 8 MADs per clock, while each scalar unit is similarly capable of 2 MADs per clock. That works out to 160 floating point operations per clock, or 32 GFLOPS at 200MHz.
Update: Qualcomm has clarified the capabilities of its 4-wide Vector ALUs. Similar to the PowerVR SGX 543, each 4-wide vector ALU is capable of four MADs (one per component). The scalar units cannot be combined to do any MADs. They are helpful, but we haven't been tracking them in this table (IMG has something similar), so we've excluded them for now.
Mobile SoC GPU Comparison
| | Adreno 225 | PowerVR SGX 540 | PowerVR SGX 543 | PowerVR SGX 543MP2 | Mali-400 MP4 | GeForce ULP | Kal-El GeForce |
| SIMD Name | - | USSE | USSE2 | USSE2 | Core | Core | Core |
| # of SIMDs | 8 | 4 | 4 | 8 | 4 + 1 | 8 | 12 |
| MADs per SIMD | 4 | 2 | 4 | 4 | 4 / 2 | 1 | ? |
| Total MADs | 32 | 8 | 16 | 32 | 18 | 8 | ? |
| GFLOPS @ 200MHz | 12.8 GFLOPS | 3.2 GFLOPS | 6.4 GFLOPS | 12.8 GFLOPS | 7.2 GFLOPS | 3.2 GFLOPS | ? |
| GFLOPS @ 300MHz | 19.2 GFLOPS | 4.8 GFLOPS | 9.6 GFLOPS | 19.2 GFLOPS | 10.8 GFLOPS | 4.8 GFLOPS | ? |
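The GFLOPS rows in the table can be reproduced from the SIMD counts and MADs per SIMD. The sketch below is illustrative only: the function and variable names are our own, it assumes one MAD counts as two floating point operations (which is what the table's figures imply), and it reads the Mali-400 MP4 entry as four 4-MAD cores plus one 2-MAD core. Kal-El is omitted because its MADs per SIMD are unknown.

```python
# Peak-throughput arithmetic behind the comparison table.
# Assumption: one MAD (multiply-add) counts as 2 floating point operations,
# which is what the table implies (32 MADs -> 12.8 GFLOPS at 200MHz).
FLOPS_PER_MAD = 2


def peak_gflops(total_mads, clock_mhz):
    """Theoretical peak GFLOPS for a GPU issuing total_mads MADs per clock."""
    return total_mads * FLOPS_PER_MAD * clock_mhz / 1000.0


# Total MADs per clock: SIMD count x MADs per SIMD, as listed in the table.
# The Mali-400 MP4 line is our reading of "4 + 1" SIMDs at "4 / 2" MADs.
total_mads = {
    "Adreno 225": 8 * 4,
    "PowerVR SGX 540": 4 * 2,
    "PowerVR SGX 543": 4 * 4,
    "PowerVR SGX 543MP2": 8 * 4,
    "Mali-400 MP4": 4 * 4 + 1 * 2,
    "GeForce ULP": 8 * 1,
}

for name, mads in total_mads.items():
    print(f"{name}: {mads} MADs, "
          f"{peak_gflops(mads, 200):.1f} GFLOPS @ 200MHz, "
          f"{peak_gflops(mads, 300):.1f} GFLOPS @ 300MHz")
```

These are theoretical peaks only; they say nothing about how much of that throughput a shader compiler or memory system can actually keep fed.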
Looking at the table above you'll see that this is the same amount of computing power as even IMG's PowerVR SGX 543MP2. However, as we've already seen in our tests, Adreno 220 isn't anywhere near as quick.
Shader compiler efficiency and data requirements to actually populate a Vec4+1 array are both unknowns, and I suspect both significantly gate overall Adreno performance. There's also the fact that the Adreno 22x family only has two TMUs compared to four in the 543MP2, limiting texturing performance. Combine that with the fact that most Adreno 220 GPUs have been designed into single-channel memory controller systems and you've got a recipe for tons of compute potential limited by other bottlenecks.
With Adreno 225, Qualcomm improves performance along two vectors, the first being clock speed. While Adreno 220 (used in the MSM8660) ran at 266MHz, Adreno 225 runs at 400MHz thanks to 28nm. Secondly, Qualcomm tells us Adreno 225 is accompanied by "significant driver improvements". Keeping in mind the sheer amount of compute potential of the Adreno 22x family, it only makes sense that driver improvements could unlock a lot of performance. Qualcomm expects the 225 to be 50% faster than the outgoing 220.
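As a rough sanity check on the clock speed bump, the quoted frequencies can be plugged into the same peak-throughput arithmetic used in the comparison table. This is a sketch under stated assumptions: the names are ours, a MAD is counted as two floating point operations, and the 32 MADs per clock come from the table. It only shows how theoretical peak scales with clock; it does not model drivers, texturing or memory bandwidth.

```python
# Clock speeds quoted in the article.
ADRENO_220_MHZ = 266
ADRENO_225_MHZ = 400

# From the comparison table: 8 SIMDs x 4 MADs each.
TOTAL_MADS = 32
# Assumption: a MAD counts as 2 floating point operations.
FLOPS_PER_MAD = 2


def peak_gflops(clock_mhz):
    """Theoretical peak GFLOPS at the given clock for the Adreno 22x family."""
    return TOTAL_MADS * FLOPS_PER_MAD * clock_mhz / 1000.0


clock_ratio = ADRENO_225_MHZ / ADRENO_220_MHZ

print(f"Clock ratio: {clock_ratio:.2f}x")
print(f"Adreno 220 peak: {peak_gflops(ADRENO_220_MHZ):.1f} GFLOPS")
print(f"Adreno 225 peak: {peak_gflops(ADRENO_225_MHZ):.1f} GFLOPS")
```

The clock ratio alone works out to roughly 1.5x, so the theoretical peak rises by about half before any driver improvements are counted.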
Qualcomm claims that MSM8960 will be able to outperform Apple's A5 in GLBenchmark 2.x at qHD resolutions. We'll have to wait until we have shipping devices in hand to really put that claim to the test, but if true it's good news for Krait as the A5 continues to be the high end benchmark for mobile GPU performance.
While Adreno 225 is only Direct3D feature level 9_3 compliant, Qualcomm insisted that when the time is right it will have a D3D11 capable GPU using its own IP - putting to rest rumors of Qualcomm looking to license a third party GPU in order to be competitive in Windows 8 designs. Although Qualcomm committed to delivering D3D11 support, it didn't commit to a timeframe.
108 Comments
introiboad - Friday, October 7, 2011 - link
Really? I wasn't aware of anyone else in the industry not using ARM's RTL and designing their cores from scratch.
z0mb13n3d - Friday, October 7, 2011 - link
Well then, perhaps you haven't heard of Marvell and their Armada line of SoC's?
introiboad - Friday, October 7, 2011 - link
Yes, I have heard of Marvell and Armada, isn't that what's left of XScale? Honestly I thought they had given up on what was XScale and licensed the RTL like everyone else instead, but it looks like I was wrong.
metafor - Friday, October 7, 2011 - link
Which is probably why Anand specified tablet/smartphones. Marvell is, for all practical purposes, not a major or even relevant player in tablet/smartphones.
It is worth noting that both nVidia and (so it is believed) Apple are utilizing their architectural licenses and are cooking up their own cores currently. But none will likely launch in 2012.
Anand Lal Shimpi - Friday, October 7, 2011 - link
The qualification there was "in the smartphone/tablet space". Marvell hasn't had any significant design wins in the high end Android, iOS, Windows Phone or QNX OS space that we cover.
Is there another company you are referring to?
Take care,
Anand
Mike1111 - Friday, October 7, 2011 - link
What about the ST-Ericsson Nova A9600?
http://www.stericsson.com/press_releases/NovaThor....
It's a 28nm dual-core Cortex-A15 (up to 2.5 GHz) with an Imagination Rogue GPU (Series 6, 210 GFLOPS). Taped out and set to ship in 2012:
http://www.eetimes.com/electronics-news/4226942/ST...
And I'm sure we will see an Apple A6 in the next 12 months (IMHO could be quite similar to the Nova A9600 in terms of CPU and GPU).
Anand Lal Shimpi - Friday, October 7, 2011 - link
Neither of those options are custom designs using the ARM ISA, they are full IP licenses.
You are correct on ST-E's announcement though, I simply haven't been factoring them into discussions lately as they have been pretty much not present in the high-end smartphone space as of late.
Take care,
Anand
partylikeits1999 - Saturday, October 8, 2011 - link
I'm hearing that this one has slipped, and now the ST-Ericsson chipset with Rogue won't sample to OEMs until the first half of next year and therefore won't be in commercial products until the very end of 2012 if at all, whereas the MSM8960 is already sampling to OEMs according to Qualcomm. In other words, schedule-wise, you're probably comparing apples to oranges. I do agree with you though that we'll likely see A6 from Apple, in some form, by this time next year, but I think it'll be higher spec'd and will blow the doors off anything from STE. The more interesting question is whether the quad core 8064 that Qualcomm has mentioned for next year, can keep up with A6, both from a CPU and a GPU standpoint.
ArunDemeure - Friday, October 7, 2011 - link
Marvell indeed hasn't had much luck in the high-end so far, but the same latest PJ4 core has been fairly successful both as the HSPA Pantheon 920 at BlackBerry (including some new BlackBerry 7 devices) and the TD-SCDMA Pantheon 910 for various China Mobile-specific phones. So I'm not sure it's a good idea to exclude them completely although they're certainly not in the same league as Qualcomm so I really don't blame you.
macs - Friday, October 7, 2011 - link
Are you really sure that the upcoming Nexus Prime will use OMAP 4?
Seems unlikely to me... A SGX 540 to power a 720p display when Samsung has their own better SOC with Mali 400?? And the rival iPhone 4S uses a GPU that is 7/8 times faster than SGX540.
Sounds really stupid... I can't believe that OMAP 4 is the reference SOC for Android ICS