Samsung Galaxy S 2 (International) Review - The Best, Redefined
by Brian Klug & Anand Lal Shimpi on September 11, 2011 11:06 AM EST
The Fastest Smartphone SoC Today: Samsung Exynos 4210
Samsung has been Apple's sole application processor supplier since the release of the original iPhone. It's unclear how much Samsung contributes to the design process, especially with later SoCs like the A4 and A5 carrying the Apple brand. It's possible that Samsung is now no more than a manufacturing house for Apple.
Needless to say, the past few years of supplying SoCs for the iPhone and iPad have given Samsung a good idea of what the market wants from an application processor. We first got the hint that Samsung knew what it was up to with its Hummingbird SoC, used in the Galaxy S line of smartphones.
Hummingbird featured a 1GHz ARM Cortex A8 core and an Imagination Technologies PowerVR SGX 540 GPU. Although those specs don't seem very impressive today, Hummingbird helped Samsung ship more Android smartphones than any of its competitors in 2010. At a high level, Hummingbird looked a lot like Apple's A4 used in the iPad and iPhone 4. Its predecessor looked a lot like Apple's 3rd generation SoC used in the iPhone 3GS.
Hummingbird's successor however is Samsung's first attempt at something different. This is the Exynos 4210 application processor:
We first met the Exynos back when it was called Orion at this year's Mobile World Congress. Architecturally, the Exynos 4210 isn't too far from Apple's A5, NVIDIA's Tegra 2 or TI's OMAP 4. This is the same CPU configuration as all of the aforementioned SoCs, with a twist. While the A5, Tegra 2 and OMAP 4 all have a pair of ARM Cortex A9 cores running at 1GHz, Exynos pushes the default clock speed up to 1.2GHz. Samsung is able to hit higher clock speeds either through higher than normal voltages or as a result of its close foundry/design relationship.
ARM's Cortex A9 has configurable cache sizes. To date all of the A9 implementations we've seen use 32KB L1 caches (32KB instruction cache + 32KB data cache) and Samsung's Exynos is no exception. The L2 cache size is also configurable; however, we haven't seen any variance there either. Apple, NVIDIA, Samsung and TI have all standardized on a full 1MB L2 cache shared between both cores. Only Qualcomm is left with a 512KB L2 cache but that's for a non-A9 design.
Where we have seen differences among A9 based SoCs is in the presence of ARM's Media Processing Engine (NEON SIMD unit) and in the memory controller configuration. Apple, Samsung and TI all include an MPE unit in each A9 core. ARM doesn't make MPE a requirement for the A9 since it has a fully pipelined FPU, however it's a good idea to include one given most A8 designs featured a similar unit. Without MPE support you run the risk of delivering an A9 based SoC that occasionally has lower performance than an A8 w/ NEON solution. Given that Apple, Samsung and TI all had NEON enabled A8 SoCs in the market last year, it's no surprise that their current A9 designs include MPE units.
NVIDIA on the other hand didn't have an SoC based on ARM's Cortex A8. At the same time it needed to be aggressive on pricing to gain some traction in the market. As a result of keeping die size to a minimum, the Tegra 2 doesn't include MPE support. NEON code can't be executed on Tegra 2. With Tegra 3 (Kal-El), NVIDIA added in MPE support but that's a discussion we'll have in a couple of months.
Although based on Qualcomm's own design, the Snapdragon cores include NEON support as well. Qualcomm's NEON engine is 128 bits wide vs. 64 bits wide in ARM's standard implementation. Samsung lists the Exynos 4210 as supporting both 64-bit and 128-bit NEON. However, given this is a seemingly standard A9 implementation, I believe the MPE datapath is only 64 bits wide. In other words, 128-bit operations can be executed but not at the same throughput as 64-bit operations.
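As a rough illustration of that last point, the sketch below models a NEON operation as needing one pass through the datapath per datapath-width chunk. This is a deliberately simplified model assumed for illustration: the function name and the one-pass-per-chunk rule are hypothetical, and it does not describe how either ARM's or Qualcomm's pipeline actually schedules instructions.

```python
import math


def datapath_passes(op_width_bits: int, datapath_width_bits: int) -> int:
    """Passes needed to push one SIMD operation through the datapath.

    Simplified model (an assumption, not vendor documentation): an
    operation wider than the datapath is split into datapath-sized
    chunks, and each chunk takes one pass.
    """
    return math.ceil(op_width_bits / datapath_width_bits)


# 64-bit datapath, as in ARM's standard MPE implementation per the text
passes_64_on_64 = datapath_passes(64, 64)
passes_128_on_64 = datapath_passes(128, 64)

# 128-bit datapath, as in Qualcomm's NEON engine per the text
passes_128_on_128 = datapath_passes(128, 128)

print(passes_64_on_64, passes_128_on_64, passes_128_on_128)
```

Under this model a 128-bit operation on a 64-bit datapath takes twice as many passes as a 64-bit one, which is why supporting 128-bit NEON instructions doesn't by itself imply 128-bit throughput.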
The same designs that implemented MPE also implemented a dual-channel memory controller. Samsung's Exynos features two 32-bit LPDDR2 memory channels, putting it on par with Apple's A5, Qualcomm's Snapdragon and TI's OMAP 4. Only NVIDIA's Tegra 2 features a single 32-bit LPDDR2 memory channel.
ARM Cortex A9 Based SoC Comparison

| |Apple A5|Samsung Exynos 4210|TI OMAP 4|NVIDIA Tegra 2|
|---|---|---|---|---|
|Clock Speed|Up to 1GHz|Up to 1.2GHz|Up to 1GHz|Up to 1GHz|
|L1 Cache Size|32KB/32KB|32KB/32KB|32KB/32KB|32KB/32KB|
|L2 Cache Size|1MB|1MB|1MB|1MB|
|Memory Interface|Dual Channel LP-DDR2|Dual Channel LP-DDR2|Dual Channel LP-DDR2|Single Channel LP-DDR2|
Like most of its competitors, Samsung's memory controller does allow for some flexibility when choosing memory types. In addition to LPDDR2, the Exynos 4210 supports standard DDR2 and DDR3. Maximum data rate is limited to 800MHz regardless of memory type.
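To put the dual-channel versus single-channel difference in numbers, here is a minimal sketch of the theoretical peak bandwidth calculation. It assumes the "800MHz" figure above can be read as 800 million transfers per second per channel, and that every SoC runs its memory at that maximum rate; both are assumptions for illustration, and shipping devices may well clock their memory lower. The function and constant names are hypothetical.

```python
def peak_bandwidth_gbps(channels: int, channel_width_bits: int,
                        data_rate_mtps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = channels * channel_width_bits / 8
    return bytes_per_transfer * data_rate_mtps * 1e6 / 1e9


# Assumption: the article's "800MHz" maximum is treated as 800 MT/s.
ASSUMED_DATA_RATE_MTPS = 800

# Two 32-bit channels (Exynos 4210, A5, OMAP 4 per the table)
dual_channel = peak_bandwidth_gbps(2, 32, ASSUMED_DATA_RATE_MTPS)

# One 32-bit channel (Tegra 2 per the table)
single_channel = peak_bandwidth_gbps(1, 32, ASSUMED_DATA_RATE_MTPS)

print(f"dual: {dual_channel:.1f} GB/s, single: {single_channel:.1f} GB/s")
```

These are upper bounds only. Sustained bandwidth depends on access patterns, controller efficiency and the memory speed a given device actually ships with.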
Based on everything I've said thus far, the Exynos 4210 should be among the highest performing SoCs on the market today. It has the same clock for clock performance as an Apple A5, NVIDIA Tegra 2 and TI OMAP 4430. Samsung surpassed those designs by delivering a 20% higher operating frequency, which should be tangible in typical use.
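The 20% figure is simply the ratio of the two clock speeds, and it sets a rough expectation for CPU-bound workloads with identical per-clock performance. The sketch below computes that ratio and shows one way to check whether a benchmark result scales with clock speed. The helper names are hypothetical, the example scores are made-up placeholders rather than results from this review, and the comparison assumes a benchmark where a higher score is better.

```python
def clock_ratio(clock_ghz: float, baseline_ghz: float) -> float:
    """Ratio of two clock speeds; 1.2 means 20% higher."""
    return clock_ghz / baseline_ghz


def scaling_vs_clock(score: float, baseline_score: float,
                     clock_ghz: float, baseline_ghz: float) -> float:
    """Measured speedup divided by the clock ratio.

    About 1.0 means the result scales with clock speed alone; above 1.0
    suggests other factors (software, memory, etc.) are contributing.
    Assumes higher scores are better.
    """
    speedup = score / baseline_score
    return speedup / clock_ratio(clock_ghz, baseline_ghz)


# Exynos 4210 at 1.2GHz vs. the 1GHz A9 designs
exynos_ratio = clock_ratio(1.2, 1.0)

# Hypothetical scores for illustration only, NOT measurements
example = scaling_vs_clock(score=130.0, baseline_score=100.0,
                           clock_ghz=1.2, baseline_ghz=1.0)

print(f"clock ratio: {exynos_ratio:.2f}, scaling vs clock: {example:.2f}")
```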
To find out let's turn to our CPU performance suite. We'll start with our browser benchmarks: SunSpider and BrowserMark:
Where we do see big gains from the Exynos' higher clock speed is in our Linpack tests. The single-threaded benchmark actually shows more scaling than clock speed alone would explain, indicating that there are other (possibly software?) factors at play. Either way it's clear that the 20% increase in clock speed can show up as a tangible gain if the conditions are right:
A clock speed advantage today is nice but it's something that Samsung's competitors will be able to deliver in the not too distant future. Where Samsung chose to really differentiate itself was in the graphics department. The Exynos 4210 uses ARM's Mali-400 MP4 GPU.
Shipping in smartphones today we have GPUs from three vendors: Qualcomm (Adreno), Imagination Technologies (PowerVR SGX) and NVIDIA (GeForce). Of those vendors, only Qualcomm and NVIDIA produce SoCs - Imagination simply licenses its technology to SoC vendors.
Both Apple and Intel hold significant amounts of Imagination stock, presumably to prevent an eager SoC vendor from taking control of the company.
ARM also offers GPU IP in addition to its CPU designs, however we've seen very little uptake until now. Before we get to Mali's architecture, we need to talk a bit about the different types of GPUs on the market today.