Performance

When I sat down with the 2X at CES, naturally the first thing we did was run our usual suite of benchmarking tools on the phone. The phone was running Android 2.2.1 then, and even though the numbers were good, LG hadn’t quite finalized the software and didn’t think those numbers were representative. We didn’t publish, but knew that performance was good.

Now we don’t have such limitations, and won’t keep you waiting any longer to see how Tegra 2 compares to other phones we’ve benchmarked already. We’ve already talked about performance a bit in our initial preview from back when we got the 2X, but there’s a lot more to cover now.

Before we start that discussion, however, we need to talk about multithreading in Android. Android itself is already natively multithreaded; in fact, that’s part of delivering a speedy UI. The idea is to render the UI on one thread and push slow tasks onto background threads as necessary. In the best-case multithreaded scenario on Android, the main thread communicates with child threads using a handler class, and hums along until they come back with results and messages. It’s nothing new from a UI perspective: keep the thread drawing the screen active and speedy while longer processes run in the background. The even better part is that multiprocessor smartphones can immediately take advantage of multiple cores and distribute threads appropriately with Android. That said, Android 3.x (Honeycomb) brings a much tighter focus on multithreading, moving things like garbage collection off of the first CPU and onto the second.

In case you haven't figured it out by now, Android releases generally pair with and are tailored to a specific SoC. If you had to assign those, it'd look something like this: 2.0-2.1 with TI OMAP3, 2.2 with Qualcomm Snapdragon, 2.3 with Samsung Hummingbird, and 3.0 with Tegra 2.
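
As a rough illustration of the main-thread-plus-handler pattern described above, here is a minimal Python sketch. It is an analogy, not Android code: on Android the equivalent plumbing is written in Java, and every name here (`slow_task`, `worker`, `run_main_loop`) is made up for the example. A queue stands in for the main thread's message queue, and the workers post results back to it.

```python
import queue
import threading


def slow_task(n):
    """Stand-in for slow work (network, disk, decoding) kept off the main thread."""
    return sum(i * i for i in range(n))


def worker(job_id, n, inbox):
    # Runs on a background thread and posts its result back as a message.
    inbox.put((job_id, slow_task(n)))


def run_main_loop(jobs):
    """Start one worker per job and collect results on the calling ("main") thread."""
    inbox = queue.Queue()  # plays the role of the main thread's message queue
    threads = [
        threading.Thread(target=worker, args=(job_id, n, inbox))
        for job_id, n in enumerate(jobs)
    ]
    for t in threads:
        t.start()

    results = [None] * len(jobs)
    for _ in jobs:
        # A real UI thread would keep drawing between messages instead of blocking.
        job_id, value = inbox.get()
        results[job_id] = value

    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_main_loop([10, 1000]))
```

The sketch only shows the structure. Standard CPython will not run CPU-bound threads in parallel, so it says nothing about the speedup a second core provides.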

Back to the point, however: the same caveats we saw with multithreading on the PC apply in the mobile space. Applications need to be developed with the express intent of being multithreaded to feel faster. The big question on everyone's mind is whether Android 2.2.x can take advantage of those multiple cores. Turns out, the answer is yes.

First off, we can check that Android 2.2.1 on the 2X is indeed seeing the two Cortex-A9 cores by checking dmesg, which thankfully is quite easy to do over adb shell after a fresh boot. Sure enough, inside we can see two cores being brought up during boot by the kernel:

<4>[  118.962880] CPU1: Booted secondary processor
<6>[  118.962989] Brought up 2 CPUs
<6>[  118.963003] SMP: Total of 2 processors activated (3997.69 BogoMIPS).
<7>[  118.963025] CPU0 attaching sched-domain:
<7>[  118.963036]  domain 0: span 0-1 level CPU
<7>[  118.963046]   groups: 0 1
<7>[  118.963063] CPU1 attaching sched-domain:
<7>[  118.963072]  domain 0: span 0-1 level CPU
<7>[  118.963079]   groups: 1 0
<6>[  118.986650] regulator: core version 0.5

The 2X runs the same 2.6.32.9 Linux kernel common to all of Android 2.2.x, but built with a different configuration. Check out the first line of dmesg from the Nexus One:

<5>[ 0.000000] Linux version 2.6.32.9-27240-gbca5320 (android-build@apa26.mtv.corp.google.com) (gcc version 4.4.0 (GCC) ) #1 PREEMPT Tue Aug 10 16:42:38 PDT 2010

Compare that to the 2X:

<5>[ 0.000000] Linux version 2.6.32.9 (sp9pm_9@sp9pm2pl3) (gcc version 4.4.0 (GCC) ) #1 SMP PREEMPT Sun Jan 16 20:58:43 KST 2011

The major difference is the inclusion of “SMP”, which shows definitively that symmetric multiprocessing support is enabled in the kernel, meaning the entire platform can use both CPUs. PREEMPT of course shows that kernel preemption is enabled, which both phones have turned on. Again, having an SMP kernel isn’t going to magically make everything faster, but it lets applications with multiple threads automatically spread them out across multiple cores.
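
For anyone who wants to repeat these checks on saved dmesg output, a small Python sketch along these lines should do. It assumes the log has already been captured to text, and the helper names (`cpus_brought_up`, `build_flags`) are hypothetical. The sample strings are copied from the output above.

```python
import re

BOOT_LOG = """\
<4>[  118.962880] CPU1: Booted secondary processor
<6>[  118.962989] Brought up 2 CPUs
<6>[  118.963003] SMP: Total of 2 processors activated (3997.69 BogoMIPS).
"""

NEXUS_ONE = (
    "<5>[ 0.000000] Linux version 2.6.32.9-27240-gbca5320 "
    "(android-build@apa26.mtv.corp.google.com) (gcc version 4.4.0 (GCC) ) "
    "#1 PREEMPT Tue Aug 10 16:42:38 PDT 2010"
)
OPTIMUS_2X = (
    "<5>[ 0.000000] Linux version 2.6.32.9 (sp9pm_9@sp9pm2pl3) "
    "(gcc version 4.4.0 (GCC) ) #1 SMP PREEMPT Sun Jan 16 20:58:43 KST 2011"
)


def cpus_brought_up(dmesg_text):
    """Return N from the 'Brought up N CPUs' line, or None if it is absent."""
    match = re.search(r"Brought up (\d+) CPUs?", dmesg_text)
    return int(match.group(1)) if match else None


def build_flags(banner):
    """Return the flag words between the '#N' build number and the weekday."""
    match = re.search(
        r"#\d+((?: [A-Z_]+)*) (?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) ", banner
    )
    return match.group(1).split() if match else []


if __name__ == "__main__":
    print(cpus_brought_up(BOOT_LOG))
    print(build_flags(NEXUS_ONE))
    print(build_flags(OPTIMUS_2X))
```

The flag parser simply reads whatever uppercase words sit between the build number and the date, so it makes no assumption about which flags a given kernel reports.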

Though there are task managers on Android, seeing how many threads a given process has running isn’t quite as easy as it is on the desktop. There are still ways of gauging multithreading, however. The two tools we have are checking “dumpsys cpuinfo” over adb shell, and looking at the historical CPU use reported by System Panel, a monitoring program we use, which likely looks at the same data.

The other interesting gem we can glean from the dmesg output is the set of clocks NVIDIA has chosen for most of the interesting bits of Tegra 2 in the 2X. There’s a section of output during boot which looks like the following:

<4>[  119.026337] ADJUSTED CLOCKS:
<4>[  119.026354] MC clock is set to 300000 KHz
<4>[  119.026365] EMC clock is set to 600000 KHz (DDR clock is at 300000 KHz)
<4>[  119.026373] PLLX0 clock is set to 1000000 KHz
<4>[  119.026379] PLLC0 clock is set to 600000 KHz
<4>[  119.026385] CPU clock is set to 1000000 KHz
<4>[  119.026391] System and AVP clock is set to 240000 KHz
<4>[  119.026400] GraphicsHost clock is set to 100000 KHz
<4>[  119.026408] 3D clock is set to 100000 KHz
<4>[  119.026415] 2D clock is set to 100000 KHz
<4>[  119.026423] Epp clock is set to 100000 KHz
<4>[  119.026430] Mpe clock is set to 100000 KHz
<4>[  119.026436] Vde clock is set to 240000 KHz

We can see the CPU set to 1 GHz, but the interesting bits are that LPDDR2 runs at 300 MHz (so with double data rate we get an effective 600 MHz), and that the GPU is clocked at a relatively conservative 100 MHz, compared to the majority of PowerVR GPUs, which run somewhere around 200 MHz. It turns out that AP20H lets the GPU clock up to 300 MHz under load, as we'll discuss later. The other clocks are a bit more mysterious: Vde could be the Video Decode Engine and Mpe could be the Media Processing Engine (which is odd, since Tegra 2 uses the A9 FPU instead of MPE), while the others are even less clear.
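
To turn that block of output into something easier to compare, a short Python sketch like the following could be used. It assumes the lines keep the exact "X clock is set to N KHz" wording seen above; the function name `parse_clocks_mhz` and the trimmed sample are just for illustration.

```python
import re

CLOCK_LOG = """\
<4>[  119.026337] ADJUSTED CLOCKS:
<4>[  119.026354] MC clock is set to 300000 KHz
<4>[  119.026365] EMC clock is set to 600000 KHz (DDR clock is at 300000 KHz)
<4>[  119.026385] CPU clock is set to 1000000 KHz
<4>[  119.026391] System and AVP clock is set to 240000 KHz
<4>[  119.026408] 3D clock is set to 100000 KHz
"""

# Captures the clock name after the timestamp bracket and its value in KHz.
CLOCK_LINE = re.compile(r"\]\s+(.+?) clock is set to (\d+) KHz")


def parse_clocks_mhz(dmesg_text):
    """Return {clock name: frequency in MHz} for every matching line."""
    clocks = {}
    for line in dmesg_text.splitlines():
        match = CLOCK_LINE.search(line)
        if match:
            clocks[match.group(1)] = int(match.group(2)) / 1000
    return clocks


if __name__ == "__main__":
    for name, mhz in parse_clocks_mhz(CLOCK_LOG).items():
        print(f"{name}: {mhz:g} MHz")
```

The parenthetical DDR figure on the EMC line is deliberately ignored, since it uses different wording ("is at") from the other entries.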

The other interesting bit is how RAM is allocated on the 2X: there’s 512 MB of it, of which 384 MB is accessible by applications and Android, while 128 MB is dedicated entirely to the GPU. You can pull that directly out of dmesg as well:

mem=383M@0M nvmem=128M@384M

The first 384 MB are for general RAM (the mem= argument actually hands the kernel 383 MB of that), and the remaining 128 MB is NVIDIA memory, which we can only assume is dedicated entirely to the GPU.
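
The arithmetic is easy to check with a few lines of Python. This sketch assumes both arguments follow a size@start layout in megabytes, which is how the line reads but is not something we have confirmed for the NVIDIA-specific nvmem argument. The function name `parse_regions` is hypothetical.

```python
import re

BOOT_ARGS = "mem=383M@0M nvmem=128M@384M"


def parse_regions(boot_args):
    """Map each 'name=SIZEM@STARTM' argument to (start_mb, size_mb)."""
    regions = {}
    for name, size, start in re.findall(r"(\w+)=(\d+)M@(\d+)M", boot_args):
        regions[name] = (int(start), int(size))
    return regions


if __name__ == "__main__":
    for name, (start, size) in parse_regions(BOOT_ARGS).items():
        print(f"{name}: {size} MB starting at {start} MB, ends at {start + size} MB")
```

Read this way, the NVIDIA region starts at 384 MB and ends at 512 MB, which matches the 512 MB total.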

75 Comments

  • rpmrush - Monday, February 7, 2011 - link

    Solid review, but please at least use spell check. I'm not a grammar or typo freak, but there were way too many simple typos that spell check wouldn't even let you get by with. At least have someone proof read it before you publish to the public.
  • zowie - Tuesday, February 8, 2011 - link

    who can create a new type battery, who will be the richest man in the world
  • uhuznaa - Tuesday, February 8, 2011 - link

    Yeah, and until then those who manage to come up with some decent power management will be the richest...

    Seriously, every improvement on the battery front almost always just leads to devices drawing more power. It's somewhat ironic that last year's iPhone still leads the pack when it comes to battery life. Power management (that is: don't draw more power than absolutely necessary by throttling or shutting down components that aren't needed or aren't fully needed in a given moment) is hard and boring design work nobody seems to care for. And with devices and software getting replaced with the next iteration every few months this is even understandable, it's just not worth the effort, especially when nobody seems to care and benchmarks are so much more important to the crowd.
  • DanNeely - Tuesday, February 8, 2011 - link

    How is it typically played back: cropped, or vertically resampled?
  • Wilco1 - Tuesday, February 8, 2011 - link

    Tegra 3 has 4 1.5GHz Cortex-A9's according to a leaked slide.

    That was a great article! A few minor corrections: The ARM11 VFP is fully pipelined (so it can beat the A8 on FP performance). Like the A8, Scorpion is 2-way in-order, not limited out-of-order. In-order cores issue instructions in-order but may complete them out-of-order. On the other hand, OoO cores use register renaming to issue instructions out-of-order but complete them in-order.

    Note none of the micro benchmarks used emits Neon instructions. JIT compilers don't have enough time to generate high quality code, let alone autovectorize! For proper benchmarking you will need to run native code compiled with a quality compiler (not GCC - it is still far behind the state of the art on ARM, especially Thumb-2).
  • metafor - Tuesday, February 8, 2011 - link

    I would argue with that definition of OoO. A design does not need register renaming in order to issue any arbitrary instruction OoO. It's simply a trade-off of whether to centralize hazard tracking on register accesses or on retirement.
  • PWRuser - Tuesday, February 8, 2011 - link

    Excellent review. Please, in your future reviews don't stop including gems like this one:

    "Generally while browsing I can feel when Flash ads are really slowing a page down - the 2X almost never felt that way."

    That's what matters! Including hands on observations along with a full volley of synthetic benchmarks.

    This review comes as close as humanly possible to portraying a handset's ability to readers without the said readers trying it out.

    Your attention to detail puts other reviews to shame. Keep up the good work.
  • sarge78 - Tuesday, February 8, 2011 - link

    Don't forget about ST-Ericsson's U8500 A9. They could be a major player in 2011/2012 with potential design wins from Nokia and Sony Ericsson.
  • warisz00r - Tuesday, February 8, 2011 - link

    What equipment do you use to test the phone's audio quality?
  • phut- - Tuesday, February 8, 2011 - link

    "NVIDIA tells us that the Tegra 2 SoC is fully capable of a faster capture rate for stills and that LG simply chose 2MP as its burst mode resolution. For comparison, other phones with burst modes capture at either 1 MP or VGA. That said, unfortunately for NVIDIA, a significant technological advantage is almost meaningless if no one takes advantage of it. It'll be interesting to see if the other Tegra 2 phones coming will enable full resolution burst capture."

    LG have probably made this decision based on the sensitivity of the invariably minuscule sensor they will have used. Having 6 frames of 12mp is pointless if they are 12 incomprehensible megapixels due to the lacklustre sensitivity of the pixels in their chosen part.

    The kind of sensor you find delivering a meaningful burst in something like a 5D mk2 is enormous and power hungry, in comparison to an operating environment such as a phone.
