The Tegra 3 GPU: 2x Pixel Shader Hardware of Tegra 2

Tegra 3's GPU is very much an evolution of what we saw in Tegra 2. The GeForce in Tegra 2 featured four pixel shader units and four vertex shader units; in Tegra 3 the number of pixel shader units doubles while the vertex processors remain unchanged. This brings Tegra 3's GPU core count up to 12. NVIDIA still hasn't embraced a unified architecture, but given how closely it's mimicking the evolution of its PC GPUs I wouldn't expect such a move until the next-gen architecture - possibly in Wayne.

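Mobile SoC GPU Comparison

|                 | Adreno 225  | PowerVR SGX 540 | PowerVR SGX 543 | PowerVR SGX 543MP2 | Mali-400 MP4 | GeForce ULP | Kal-El GeForce |
|-----------------|-------------|-----------------|-----------------|--------------------|--------------|-------------|----------------|
| SIMD Name       | -           | USSE            | USSE2           | USSE2              | Core         | Core        | Core           |
| # of SIMDs      | 8           | 4               | 4               | 8                  | 4 + 1        | 8           | 12             |
| MADs per SIMD   | 4           | 2               | 4               | 4                  | 4 / 2        | 1           | 1              |
| Total MADs      | 32          | 8               | 16              | 32                 | 18           | 8           | 12             |
| GFLOPS @ 200MHz | 12.8 GFLOPS | 3.2 GFLOPS      | 6.4 GFLOPS      | 12.8 GFLOPS        | 7.2 GFLOPS   | 3.2 GFLOPS  | 4.8 GFLOPS     |
| GFLOPS @ 300MHz | 19.2 GFLOPS | 4.8 GFLOPS      | 9.6 GFLOPS      | 19.2 GFLOPS        | 10.8 GFLOPS  | 4.8 GFLOPS  | 7.2 GFLOPS     |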
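
The GFLOPS figures in the comparison follow from the MAD counts: each MAD is counted as two floating point operations (a multiply and an add) per clock. The Python sketch below reproduces that arithmetic; the function name and the dictionary layout are illustrative assumptions rather than anything from NVIDIA or the other vendors.

```python
# Minimal sketch: peak GFLOPS from total MADs and clock speed.
# Assumption: one MAD = 2 floating point operations per clock.
FLOPS_PER_MAD = 2


def peak_gflops(total_mads, clock_mhz):
    """Return theoretical peak GFLOPS for a GPU with `total_mads` MAD units."""
    return total_mads * FLOPS_PER_MAD * clock_mhz / 1000


# Mali-400 MP4: 4 cores with 4 MADs each plus 1 core with 2 MADs.
MALI_400_MP4_MADS = 4 * 4 + 1 * 2

# Total MADs per GPU, taken from the comparison above.
total_mads = {
    "Adreno 225": 32,
    "PowerVR SGX 540": 8,
    "PowerVR SGX 543": 16,
    "PowerVR SGX 543MP2": 32,
    "Mali-400 MP4": MALI_400_MP4_MADS,
    "GeForce ULP": 8,
    "Kal-El GeForce": 12,
}

for name, mads in total_mads.items():
    at_200 = peak_gflops(mads, 200)
    at_300 = peak_gflops(mads, 300)
    print(f"{name}: {at_200:.1f} GFLOPS @ 200MHz, {at_300:.1f} GFLOPS @ 300MHz")
```

These are theoretical peaks only; they say nothing about cache behavior, drivers or memory bandwidth.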

Per core performance has improved a bit. NVIDIA worked on timing of critical paths through the GPU's execution units to help it run at higher clock speeds. NVIDIA wouldn't confirm the target clock for Tegra 3's GPU other than to say it was higher than Tegra 2's 300MHz. Peak floating point throughput per core is unchanged (one MAD per clock), but each core should be more efficient thanks to larger caches in the design.

A combination of these improvements and newer drivers is what gives Tegra 3's GPU its 2x - 3x performance advantage over Tegra 2 despite only a 50% increase in overall execution resources. In pixel shader bound scenarios, there's an effective doubling of execution horsepower, so the 2x gains are more believable there. I don't expect many games will be vertex processing bound, so the lack of significant improvement there shouldn't be a big issue for Tegra 3.

Ready for Gaming: Stereoscopic 3D and Expanded Controller Support

Tegra 3 now supports stereoscopic 3D for displaying content from YouTube, NVIDIA's own 3D Vision Live website and some Tegra Zone games. In its port of Android, NVIDIA has also added expanded controller support for PS3, Xbox 360 and Wii controllers among others.

Tegra 3 Video Encoding/Decoding and ISP

There's unfortunately not too much to go on here, especially not until we have some testable hardware in hand, but NVIDIA is claiming a much improved video decoder and more efficient video encoder in Tegra 3.

Tegra 3's video decoder can accelerate 1080p H.264 high profile content at up to 40Mbps, although device vendors can impose their own bitrate caps and file limitations on the silicon. NVIDIA wouldn't go into greater detail as to what's changed since Tegra 2, other than to say that the video decoder is more efficient. The video encoder is capable of 1080p H.264 baseline profile encode at 30 fps.

The Image Signal Processor (ISP) in Tegra 3 is twice as fast as what was in Tegra 2 and NVIDIA promised more details would be forthcoming (likely alongside the first Tegra 3 smartphone announcements).

Memory Interface: Still Single Channel, DDR3L-1500 Supported

Tegra 3 supports higher frequency memories than Tegra 2 did, but the memory controller itself is mostly unchanged from the previous design. While Tegra 2 supported LPDDR2 at data rates of up to 600MHz, Tegra 3 increases that to LPDDR2-1066, and DDR3L is supported at data rates of up to 1500MHz. The memory interface is still only 32 bits wide, resulting in far less theoretical bandwidth than Apple's A5, Samsung's Exynos 4210, TI's OMAP 4, or Qualcomm's upcoming MSM8960. This is particularly concerning given the increase in CPU core count as well as GPU execution resources. NVIDIA doesn't expect memory bandwidth to be a limitation, but I can't see how that wouldn't be the case in 3D games. Perhaps it's a good thing that Infinity Blade doesn't yet exist for Android.
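
To put rough numbers on the bandwidth concern, peak theoretical bandwidth is the bus width in bytes multiplied by the data rate. The Python sketch below treats the data rates quoted above as millions of transfers per second, which is an assumption about how the figures are meant. The function name and the dual-channel comparison case are hypothetical and do not describe any specific competing SoC.

```python
# Minimal sketch: peak theoretical memory bandwidth.
# Assumption: the quoted data rates are in millions of transfers per second.


def peak_bandwidth_gbs(bus_width_bits, data_rate_mtps, channels=1):
    """Return peak bandwidth in GB/s (1 GB = 10^9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mtps * channels / 1000


configs = {
    "Tegra 2, 32-bit LPDDR2-600": peak_bandwidth_gbs(32, 600),
    "Tegra 3, 32-bit LPDDR2-1066": peak_bandwidth_gbs(32, 1066),
    "Tegra 3, 32-bit DDR3L-1500": peak_bandwidth_gbs(32, 1500),
    # Hypothetical comparison point: two 32-bit channels at the same data rate.
    "Hypothetical 2 x 32-bit LPDDR2-1066": peak_bandwidth_gbs(32, 1066, channels=2),
}

for name, gbs in configs.items():
    print(f"{name}: {gbs:.2f} GB/s")
```

On these assumptions a single 32-bit channel tops out at about 4.3 GB/s with LPDDR2-1066 and 6.0 GB/s with DDR3L-1500, shared between four CPU cores and the GPU.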

SATA II Controller: On Die

Given Tegra 3 will find itself in convertible Windows 8 tablets, this next feature makes a lot of sense. NVIDIA's latest SoC includes an on-die SATA II controller, a feature that wasn't present on Tegra 2.


94 Comments

  • Itaintrite - Wednesday, November 09, 2011

    That HD decode processor will make a lot of people happy.
  • JoeTF - Thursday, November 10, 2011

    No it bloody won't. Tegra2 already has a hardware video decode unit, and its main trademark is that it cannot even decode properly (no prediction, but more importantly - not enough power to decode anything higher than L3.0 at 30fps).

    The hardware video decoder in Tegra3 is pretty much unchanged from T2. Hell, you can see light framedrops even in their marketing video.

    The good thing is that they added NEON instructions. Sadly, it means we will have to use all four cores at 100% utilization to play back our videos correctly, and under those conditions runtime will be severely constrained (the 8h they talk about is for hardware decode, not NEON-based CPU decode).
  • 3DoubleD - Thursday, November 10, 2011

    This is what I've been holding out for, so I really hope you're wrong.
  • psychobriggsy - Friday, November 11, 2011

    The article states that the video decoder has been significantly enhanced in Tegra 3. Where do you get your information from?
  • Jambe - Wednesday, November 09, 2011

    "Die size has almost doubled from 49mm^2 to somewhere in the 80mm^2 range."

    ~80mm^2 is considerably more than double the area of 49mm^2, isn't it?
  • eddman - Wednesday, November 09, 2011

    Umm, no!! 80 is 63% bigger than 49. Simple as that.
  • MamiyaOtaru - Friday, November 11, 2011

    c'mon man it's not 49 millimeters squared, it's 49 square millimeters. 49/80 is well less than 1/2
  • MamiyaOtaru - Friday, November 11, 2011

    see this for some review: http://img379.imageshack.us/img379/6015/squaremmhe...
  • vision33r - Wednesday, November 09, 2011

    The Tegra 3, by being evolutionary, left a huge opening for other SoCs to surpass it in a matter of months.

    I don't think the performance jump will be as huge as Apple's A4 to A5, which is on the order of 9x faster.

    It will be worthless by April when the Apple A6 comes out and spanks it silly, and rumor has it that Apple may be using a 1600x1200 10" display to up the ante.

    If this is true, it means Nvidia has to bring out a Tegra 4 by summer or fall 2012.

    It will be a big iPad 2 X-mas for sure and iPad 3 will easily trump Tegra 3.
  • metafor - Wednesday, November 09, 2011

    I honestly don't think the biggest decision-maker for people considering an iOS tablet or Android tablet has to do with a ~40% difference in GPU performance.

    When comparing Android tablets to each other -- since the OS is the same -- many people will fall back on "well, x is faster than y". But a 2x performance difference isn't going to change my mind if I like Android better than iOS, or vice versa.

    Things like a high-res screen, battery life and usability of the OS have a much bigger impact; so I'd say nVidia, or really any Android SoC vendor, isn't really competing with Apple's silicon group.