CPU Performance

The big news with Tegra 3 is that you get four ARM Cortex A9 cores with NEON support, instead of just two (sans NEON in the case of Tegra 2) as in most other smartphone-class SoCs. In the short time I had to test the tablet I couldn't draw many definitive conclusions, but I did come away with some observations.

Linpack showed us healthy gains over Tegra 2 thanks to full NEON support in Tegra 3:

Linpack - Single-threaded

Linpack - Multi-threaded

As expected, finding applications and usage models that task all four cores is pretty difficult. That being said, it's not hard to use the tablet in a way that stresses more than two cores. You won't see 100% CPU utilization across all four cores, but there is a tangible benefit to having more than two. Whether or not the benefit is worth the cost in die area is irrelevant to you as a buyer: the price of the end product is already pretty much capped, so the extra silicon only means that NVIDIA (and/or its partners) has to pay more.

SunSpider JavaScript Benchmark 0.9.1

Rightware BrowserMark

The bigger benefit I saw to having four cores vs. two is that you're pretty much never CPU limited in anything you do when multitasking. Per-core performance can always go up, but I found myself bound either by the broken WiFi or by NAND speed. In fact, the only thing that would bring the Prime to a halt was doing a lot of writing to NAND over USB. Keyboard and touch interrupts were a low priority at that point, something I hope to see addressed as we finally enter the era of performance good enough to bring on some I/O crushing multitasking workloads.

Despite having many cores at its disposal, NVIDIA appears to have erred on the side of caution when it comes to power consumption. While I often saw the third and fourth cores fire up when browsing the web or just using the tablet, NVIDIA did a good job of powering them down when their help wasn't needed. Furthermore, NVIDIA also seems to prefer running more cores at lower voltage/frequency settings over running fewer cores at a higher point in the v/f curve. This makes sense given the non-linear relationship between voltage and power: dynamic power rises roughly with the square of voltage.
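
To make the voltage/power argument concrete, here is a minimal sketch using the textbook approximation for dynamic CMOS power, P = C * V^2 * f. The capacitance value and both operating points are hypothetical numbers picked for illustration, not measured Tegra 3 settings, and the comparison assumes a workload that splits perfectly across cores while ignoring leakage.

```python
# Rough model of dynamic CMOS power: P = C * V^2 * f.
# All constants below are hypothetical and chosen only for illustration;
# they are NOT measured Tegra 3 values.

C_EFF = 1.0e-9  # assumed effective switched capacitance per core, in farads


def dynamic_power(capacitance, voltage, frequency_hz):
    """Return approximate dynamic power in watts for one core."""
    return capacitance * voltage ** 2 * frequency_hz


def total_power(cores, voltage, frequency_hz):
    """Return approximate dynamic power for several identical cores."""
    return cores * dynamic_power(C_EFF, voltage, frequency_hz)


# Two hypothetical configurations with the same aggregate clock rate (2.8 GHz):
# fewer cores high on the v/f curve vs. more cores low on it.
two_fast = total_power(cores=2, voltage=1.2, frequency_hz=1.4e9)
four_slow = total_power(cores=4, voltage=0.9, frequency_hz=0.7e9)

print(f"2 cores @ 1.4 GHz, 1.2 V: {two_fast:.2f} W")
print(f"4 cores @ 0.7 GHz, 0.9 V: {four_slow:.2f} W")
print(f"power ratio (four_slow / two_fast): {four_slow / two_fast:.4f}")
```

Because the aggregate clock rate is identical in both cases, the power ratio reduces to the square of the voltage ratio, which is why the lower point on the curve wins in this idealized model. Real workloads rarely scale perfectly across cores, so the actual saving would be smaller.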

From a die area perspective I'm not entirely sure having four (technically, five) A9 cores is the best way to deliver high performance, but without a new microprocessor architecture it's surely more efficient than just ratcheting up clock speed. I plan on providing a more thorough look at Tegra 3 SoC performance as I spend more time with a fixed Prime, but my initial impressions are that the CPU performance isn't really holding the platform back.


204 Comments


  • Anand Lal Shimpi - Thursday, December 01, 2011 - link

    Thank you, I appreciate the kind words :)

    Take care,
    Anand
  • cotak - Thursday, December 01, 2011 - link

    Everyone seems so impressed, but for me the big elephant in the room is: why is the GPU, coming from a GPU company, slower than the iPad 2's? And to boot, the CPU performance isn't significantly faster either. What's going on?
  • Death666Angel - Thursday, December 01, 2011 - link

    CPU is pretty fast when you look at multi-core enabled Linpack. Other programs probably don't handle the 4 cores very efficiently.
    As for the GPU, Apple has been very aggressive in marketing iOS (specifically the iPads) as mobile consoles, so they really delivered in the GPU department. The downside of that is that the die size of the A5 is 122mm² according to Anand (4s review), whereas Tegra3 even with 5 CPU cores only has 80mm² (Tegra3 launched article). :-)
  • thunng8 - Thursday, December 01, 2011 - link

    Not sure if they are equivalent tests, but in the ipad2 review, ipad2 scored 170.9 MFLOP which is higher than the Transformer Prime's score of 135.9.

    I don't think the average consumer cares about how big the die size is, they will however notice the extra GPU performance.

    Also, even with the bigger die size, it doesn't seem to have affected battery life either.
  • Blaster1618 - Thursday, December 01, 2011 - link

    GPU:
    Power VR SGX 543MP2 (iPAD 2) 60 nm
    8 Pixel processor * maximum 4 separate address per vector per cycle= 32 addresses per cycle.
    Tegra 3 40 nm
    12 Pixel processors x 1 separate address per vector=12 addresses per cycle.

    Isn't there a secret slot where you can slip in a NV104 processor and give this story a happy ending? The last Apple I bought was an Apple IIc (google it), but in this case Power's simultaneous multi-threading beats the brawn of 12 processors (darn). Maybe Wayne will get smart and go 28nm.
  • vision33r - Thursday, December 01, 2011 - link

    There's nothing to be impressed with. It's another poor attempt by Nvidia to rush a product out the door, and it's getting its ass handed to it by the iPad 2's more highly optimized design.

    How embarrassing to let a 1GHz dual-core SoC spank a 1.4GHz quad-core Tegra 3.

    I don't know why people are excited, especially since from what we know of Apple's upcoming A6 designs, the iPad 3 will make this thing forgotten very soon.
  • GmanMD - Thursday, December 01, 2011 - link

    Any idea as to whether you would be able to hook up a 4G wireless USB modem to the dock on this? It would be awesome to have that flexibility.
  • medi01 - Thursday, December 01, 2011 - link

    I hope one day Anand will stop judging screens only on min/max brightness and do a proper test that also compares gamut.
  • Anand Lal Shimpi - Thursday, December 01, 2011 - link

    That day will come very soon... ;)

    Take care,
    Anand
  • Toadster - Thursday, December 01, 2011 - link

    How do these stack up against each other?
