If you've been following our SoC coverage, you've probably come across our previews of Qualcomm's upcoming SoCs in their Mobile Development Platforms (MDPs). It's an interesting way both to get a feel for the performance of a given platform before things are final and to see how much OEMs affect the final performance.

Qualcomm flew us out to San Francisco to take a look at its newest part, APQ8064, a quad core Krait v2 design running at up to 1.5 GHz with Qualcomm's new Adreno 320 GPU and no baseband. This is an SoC destined primarily for tablets, although the combination of APQ8064 and MDM9615 will likely also be a common upcoming platform for the highest end phones.

At present, this is the same Krait CPU as what we've seen in MSM8960 in phones like the USA versions of the Galaxy S 3 and HTC One X. Later on, Krait v3 will emerge with higher IPC and shorter critical paths (and clocks up to 1.7 or 2 GHz) and a resulting 10-15% boost in performance. For now however we're looking at 1.5 GHz APQ8064 with a Krait v2 inside and Qualcomm's newest scalar GPU architecture with Adreno 320. We're going to talk more about Adreno 320 closer to devices shipping, when Qualcomm feels comfortable talking architecture.

Probably the single biggest notable change is the option in Adreno 320 to change from a TBR (Tile Based Renderer) to an immediate mode renderer on the fly. By default, the render mode is still TBR, however an API is exposed to allow applications to request immediate mode. In the future, some heuristics will be used to determine which mode is faster, including rendering some frames in immediate mode, some frames in TBR mode. Initial shipping devices with Adreno 320 will however just expose an API until the switching system is finalized. Update: Adreno is still a TBR not TBDR as stated earlier. 
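To make the idea of heuristic mode switching concrete, here is a minimal sketch of how a driver might compare sampled frame times from each mode and pick the faster one. This is purely illustrative: Qualcomm has not described its heuristics, and the function name, the mode labels and the frame times below are all hypothetical.

```python
from statistics import median

def pick_render_mode(tbr_frame_ms, immediate_frame_ms, default="tbr"):
    """Return the mode with the lower median frame time.

    Hypothetical heuristic, not Qualcomm's implementation. Falls back to
    `default` when either sample list is empty, mirroring the note that
    TBR remains the default render mode.
    """
    if not tbr_frame_ms or not immediate_frame_ms:
        return default
    if median(immediate_frame_ms) < median(tbr_frame_ms):
        return "immediate"
    return "tbr"

# Frame times in milliseconds, invented for illustration only.
print(pick_render_mode([16.9, 17.2, 16.8], [15.1, 15.4, 15.0]))
```

Using the median rather than the mean keeps a single slow frame from flipping the decision, which seems a reasonable choice when only a handful of frames are sampled in each mode.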

In terms of features, Adreno 320 adds OpenGL ES 3.0 (codename Halti) support, and GPGPU capabilities with OpenCL 1.2 and RenderScript. In terms of Windows APIs, Adreno 320 is Direct3D 11 feature level 9_3.

After a morning of sessions about benchmarks and how they reflect different areas of performance (which is another big discussion), we were given hands on time with the mobile development platform for APQ8064, the MDP/T APQ8064. MDP for Mobile Development Platform, T for Tablet. The MDP/T includes a 10.1" WXGA display (1366 x 768), 2 GB of LPDDR2 at 533 MHz (2x32 bits, PoP), 13 MP rear camera, 7 microphones, and all the usual ports and buttons. The tablet was running Android 4.0.4, and although the software is understandably not final, things were pretty stable. In addition, the MDP/T will be sold through Bsquare at some later date for $1299.
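As a back-of-the-envelope check on that memory configuration, the sketch below estimates peak theoretical bandwidth. It assumes the quoted 533 MHz is the memory clock and that data moves on both clock edges (two transfers per clock), which is our reading of the spec rather than something Qualcomm stated; real-world throughput will be lower.

```python
def peak_bandwidth_gb_s(clock_mhz, bus_bits, transfers_per_clock=2):
    """Peak theoretical bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=2 is an assumption: double data rate signalling.
    """
    bytes_per_transfer = bus_bits / 8
    return clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9

# Two 32-bit channels at 533 MHz, as listed for the MDP/T.
print(round(peak_bandwidth_gb_s(533, 2 * 32), 2))  # prints 8.53
```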

Performance

Before we get too far in our performance testing, a refresher of the usual caveats is a good idea. We were allowed unsupervised benchmarking time with the APQ8064 MDP/T, however this is still a reference platform. Final shipping devices may run at different speeds or deliver different performance based on their software configuration. While the MSM8960 MDP ended up performing very close to HTC's One X/S, anything can happen in the final implementation of an SoC.

We'll start our performance analysis with GLBenchmark, more specifically, some of the raw feature tests to see just how things have improved over the MSM8960:

GLBenchmark 2.1 - Fill Test

Raw fill rate almost tripled over the Adreno 225 in the HTC One X, and there's a healthy advantage over NVIDIA's Tegra 3 as well. Imagination's PowerVR SGX 543MP2 still manages a higher fill rate, and the MP4 in the new iPad can't be touched either.

GLBenchmark 2.1 - Triangle Test (White)

Raw polygon throughput is higher than everything aside from the 543MP4, an impressive step forward from the Adreno 225 but still not enough to outpace the high end ImgTec solution.

GLBenchmark 2.1 - Triangle Test (Textured, Fragment Lit)

Here we see nearly 2x the triangle throughput of the Adreno 225, and better performance than the 543MP2. The MP4 continues to be a monster though.

These next two tests are rather meaningless as they're bound by vsync. Hopefully we'll see a newer version of GLBenchmark soon enough that will stress these devices more at native resolutions:

GLBenchmark 2.1 - Pro (Standard)

GLBenchmark 2.1 - Egypt (Standard)
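To see why vsync makes the two onscreen results above uninformative, consider a simplified double-buffered model in which every frame has to wait for the next display refresh. The 60 Hz refresh rate and the raw frame rates below are assumptions for illustration, not measurements from these devices.

```python
import math

def vsynced_fps(raw_fps, refresh_hz=60.0):
    """Displayed frame rate under a simple double-buffered vsync model.

    A frame that takes longer than one refresh interval occupies a whole
    number of intervals, so the result snaps to refresh_hz / n.
    """
    intervals = max(math.ceil(refresh_hz / raw_fps), 1)
    return refresh_hz / intervals

# Hypothetical GPUs that could render at 150, 90 and 45 fps unconstrained.
for raw in (150, 90, 45):
    print(raw, "->", vsynced_fps(raw))
```

In this model any GPU fast enough to beat the refresh rate reports the same number, so the differences between fast parts disappear from the chart.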

GLBenchmark gets around the default vsync requirement by rendering to an offscreen buffer at 720p, giving us a true apples-to-apples comparison of game-like performance among all of these SoCs. The quad-core S4 Pro with Adreno 320 does incredibly well:

GLBenchmark 2.1 - Pro - Offscreen 720p

In the older Pro test frame rates are insanely high for most of the devices, indicating the age of the benchmark, but the Adreno 320's standings are very good - second only to the PowerVR SGX 543MP4. Compared to the Adreno 225, the 320 is almost twice as fast.

GLBenchmark 2.1 - Egypt - Offscreen 720p

Egypt, the newer of the two "game" tests in GLBenchmark is a bit more stressful. Here the Adreno 320 gets extremely close to the SGX 543MP4 in the new iPad. Apple maintains a 6.8% performance advantage at 720p in this, a largely compute bound benchmark. Performance here is more than double that of the Adreno 225, and 72% faster than NVIDIA's fastest Tegra 3. 

Overall Adreno 320 looks to be a good step forward in performance, although still a bit slower than the latest and greatest from Imagination Technologies. Compared to what everyone else is shipping in Android based tablets/smartphones however, Adreno 320 is easily the new king of the hill.

Qualcomm integrated four Krait v2 cores in the APQ8064 running at 1.5GHz, so CPU performance should range from very similar to significantly better than the dual-core Snapdragon S4 depending on the workload. Just as we've seen with Tegra 3, heavily threaded workloads will scale quite nicely while lightly threaded workloads will look mostly the same:

SunSpider JavaScript Benchmark 0.9.1

SunSpider performance is excellent on the MDP/T, actually delivering a better score than the Medfield based Lava Xolo X900 (1279.4ms). It's unclear how much of this performance increase over the dual-core S4 is due to the added cores vs. software optimizations to the MDP/T's browser.

Rightware BrowserMark

BrowserMark tells a much more conservative story; however, the S4 Pro is still able to outpace the dual-core S4 based One X by 22%. Again we're doing a bit of apples-to-oranges here since the browser and the rest of the software stack aren't identical between devices.

Vellamo Overall Score

Vellamo is Qualcomm's in-house developed web performance and scrolling benchmark, which soon will expand to include some more testing beyond JavaScript and scrolling performance (more on that later). For now, APQ8064 ekes out a lead over the rest of the pack, but not by a huge margin. This again is using WebView, which isn't heavily threaded.

BaseMark OS Performance

BaseMark OS includes a heavily threaded benchmark that can hit all four cores in the MDP/T as well as in the Tegra 3 based devices. The overall score incorporates the SMP test but doesn't weight it too heavily. The end result is still good for the MDP/T; it's the fastest Android device we've tested here.

We ran the multithreaded Linpack Android test to confirm quad-core scaling and indeed we saw just that. While the HTC One X is good for a score of around 210 MFLOPS, the MDP/T with twice the cores hit 413 MFLOPS. We were able to get numbers as high as 514 MFLOPS, which is more a demonstration of the volatility of the test than anything else. 
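The scaling arithmetic behind those Linpack numbers is simple enough to spell out. The helper below is our own illustration, not part of the Linpack app; it takes the two scores quoted above and the ratio of core counts.

```python
def scaling(baseline_score, scaled_score, core_ratio):
    """Return (speedup, efficiency) for a multi-core scaling comparison."""
    speedup = scaled_score / baseline_score
    return speedup, speedup / core_ratio

# ~210 MFLOPS on the dual-core One X vs. 413 MFLOPS on the quad-core MDP/T.
speedup, efficiency = scaling(210, 413, 2)
print(f"{speedup:.2f}x speedup, {efficiency:.0%} scaling efficiency")
# prints: 1.97x speedup, 98% scaling efficiency
```

Plugging in the 514 MFLOPS outlier instead gives a speedup above 2x from a doubling of cores, which is another way of seeing that run-to-run variance, rather than real scaling, explains that figure.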

Overall the quad-core S4 Pro should deliver everything we love about the dual-core S4's performance, just with more cores. As individual cores can be power gated, there shouldn't be much of a power penalty unless you actually need the extra performance. The extra cores should come in handy with heavy multitasking (something we may see even more of on Windows RT tablet/notebook hybrids) or with the rare heavily threaded application.
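Amdahl's law gives a rough feel for why lightly threaded workloads barely move while heavily threaded ones scale well when going from two cores to four. The parallel fractions below are hypothetical values chosen for illustration, not measurements of any of the benchmarks above.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over one core when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

def gain_two_to_four(parallel_fraction):
    """Relative gain from moving the same workload from 2 cores to 4."""
    return amdahl_speedup(parallel_fraction, 4) / amdahl_speedup(parallel_fraction, 2)

# Hypothetical workloads: mostly serial, half parallel, almost fully parallel.
for p in (0.10, 0.50, 0.95):
    print(f"parallel fraction {p:.2f}: {gain_two_to_four(p):.2f}x from 2 to 4 cores")
```

The model ignores memory bandwidth, thermals and scheduler behavior, so treat it as an upper bound on what the extra cores can contribute.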

35 Comments

  • MantasPakenas - Wednesday, July 25, 2012

    I was trying to search for an argument along the lines that maybe it's hard for them to review international models due to "peculiarities" of the US wireless market (as those handsets might not be suitable for daily use on their carrier, etc), or maybe their need to cater to US audience makes this too low a priority, but then I remembered they did a review of the Medfield based Lava's Xolo X900 and I can't think of any other reason to justify their discrimination.
  • Stuka87 - Wednesday, July 25, 2012

    As I recall, the X900 was tested by one of the guys in Europe, not the US.

    But my guess is that it comes down to time, and that the bulk of the readers are in the US. But that's just a guess really.
  • ltcommanderdata - Tuesday, July 24, 2012

    I wonder when we will start getting details on OpenGL ES 3.0. I believe it's based on OpenGL 3.2, which should be pretty equivalent to DX10. However, the Adreno 320 only supports DirectX feature level 9_3 which is based on DX9.0b/SM2.0b, not even DX9.0c/SM3.0 much less DX10. It'd be interesting to see the feature differences between OpenGL ES 3.0, DX10/OpenGL 3.x, OpenGL ES 2.0, and DX9/OpenGL 2.x.
  • B3an - Tuesday, July 24, 2012

    I thought a DirectX feature level of 9_3 was basically DX9.0c? And that 9_2 was equal to DX9.0b? Either way it's disappointing. I hope they upgrade this before it's used in Win8/RT tablets.

    It's confusing how in the article it says "Adreno 320 is Direct X 11 feature level 9_3" when it's not DX11 at all. Mistake?
  • ltcommanderdata - Tuesday, July 24, 2012

    http://msdn.microsoft.com/en-us/library/windows/de...

    9_3 only requires VS2.0a and PS2.0b. I believe this resulted from the differences in ATI and nVidia's implementation of SM3.0 where there were arguments over what was and was not required for SM3.0 compliance, notably ATI not implementing vertex shader texture fetch in their X1000 series. So 9_3 is a slightly relaxed SM3.0 or an enhanced SM2.0b encompassing what everyone could agree on.

    Direct X 11 feature level 9_3 is the correct terminology since 9_3 is available through the DX11 API bringing some of the stability and speed improvements of DX11 although not the new features obviously. Presumably true DX9.0c is still available separately (maybe not on Windows RT?) for compatibility.
  • Ryan Smith - Wednesday, July 25, 2012

    My expectation is that we won't get an approachable rundown of features until OpenGL ES 3.0 is finalized. Khronos's operations are largely in the open so we could put together a list based on draft revisions, but after the craziness that was OpenGL 3.0 it's proven to not be a good idea to assume anything about a new OpenGL standard until that standard is finalized.
  • aryonoco - Tuesday, July 24, 2012

    "It's unclear how much of this performance increase over the dual-core S4 is due to the added cores vs. software optimizations to the MDP/T's browser."

    This keeps coming up in your review of Android devices. Considering that going forward, I doubt you will be testing many (any) Android devices running Gingerbread or earlier, Chrome should now be available on all Android devices you'll be testing. Also considering that the version of Chrome downloaded from the Play store is going to be without OEM modifications, you can take out this software optimization variable when comparing SoCs if you switched to Chrome.
  • lilmoe - Tuesday, July 24, 2012

    ^this

    Or they can use the new Firefox browser and compare results with Chrome... The latest Firefox browser for Android is VERY fast.
  • jleach1 - Monday, August 13, 2012

    I agree, which was odd, because Firefox was GARBAGE before the UI and PX improvements came.

    But a .apk of the Chrome browser would be needed, because update rates are through the roof for gapps, and a new version would likely be out by the time the next architecture or SoC review would be up for grabs.

    Bottom line, Anandtech really can't rely on its own software. Relative performance is important, but looking at a chart of numbers representing a browser that has 2x the JavaScript performance due to an update and comparing it to the old version is a pain.

    It'd also be relevant to have updated, native browser data and benchmarks as well.

    Sorry for the typos, and grammatical errors, I'm typing this on a tablet.
  • tipoo - Wednesday, July 25, 2012

    Agreed, or really any non-default browser. Use the same software to isolate the hardware.