At CES last week, NVIDIA announced its Tegra 4 SoC featuring four ARM Cortex A15s running at up to 1.9GHz and a fifth Cortex A15 running at 700 - 800MHz for lighter workloads. Although much of CEO Jen-Hsun Huang's presentation focused on the improvements in CPU and camera performance, GPU performance should also see a significant boost over Tegra 3.

The big disappointment for many was that NVIDIA maintained the non-unified architecture of Tegra 3, and won't fully support OpenGL ES 3.0 with the T4's GPU. NVIDIA claims the architecture is better suited for the type of content that will be available on devices during the Tegra 4's reign.
 
Despite the similarities to Tegra 3, components of the Tegra 4 GPU have been improved. While we're still a bit away from a good GPU deep-dive on the architecture, we do have more details than were originally announced at the press event.

Tegra 4 features 72 GPU "cores", which are really individual components of Vec4 ALUs that can work on both scalar and vector operations. Tegra 2 featured a single Vec4 vertex shader unit (4 cores) and a single Vec4 pixel shader unit (4 cores). Tegra 3 doubled up on the pixel shader units (4 + 8 cores). Tegra 4 features six Vec4 vertex units (FP32, 24 cores) and four 3-deep Vec4 pixel units (FP20, 48 cores). The result is 6x as many ALUs as Tegra 3, all running at a max clock speed higher than the 520MHz at which NVIDIA ran the T3 GPU. NVIDIA did hint that the pixel shader design is, in some unspecified way, more efficient than the one used in Tegra 3.
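
To make the core counting concrete, here is a minimal Python sketch of the arithmetic above. The function name `gpu_cores` and its parameters are ours for illustration, not NVIDIA terminology. The unit counts come straight from the paragraph above, and we assume each level of depth in a "3-deep" pixel unit adds one more Vec4 ALU.

```python
LANES_PER_VEC4 = 4  # each Vec4 ALU is counted as four "cores"


def gpu_cores(vertex_units, pixel_units, pixel_depth=1):
    """Count marketing "cores" for a non-unified Tegra GPU.

    pixel_depth models the "3-deep" pixel units described for Tegra 4
    (assumption: each level of depth is one more Vec4 ALU per unit).
    Returns (vertex_cores, pixel_cores, total_cores).
    """
    vertex_cores = vertex_units * LANES_PER_VEC4
    pixel_cores = pixel_units * pixel_depth * LANES_PER_VEC4
    return vertex_cores, pixel_cores, vertex_cores + pixel_cores


tegra2 = gpu_cores(vertex_units=1, pixel_units=1)
tegra3 = gpu_cores(vertex_units=1, pixel_units=2)
tegra4 = gpu_cores(vertex_units=6, pixel_units=4, pixel_depth=3)

for name, (vertex, pixel, total) in [
    ("Tegra 2", tegra2),
    ("Tegra 3", tegra3),
    ("Tegra 4", tegra4),
]:
    print(f"{name}: {vertex} vertex + {pixel} pixel = {total} cores")

# Ratio of total ALU lanes, Tegra 4 over Tegra 3
print(f"Tegra 4 vs Tegra 3: {tegra4[2] / tegra3[2]:.0f}x")
```

This reproduces the 8, 12 and 72 core totals and the 6x ratio quoted above.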
 
If we assume a 520MHz max frequency (where Tegra 3 topped out), a fully featured Tegra 4 GPU can offer more theoretical compute than the PowerVR SGX 554MP4 in Apple's A6X. The advantage comes from a higher clock speed rather than a larger die area. This won't necessarily translate into better performance, particularly given Tegra 4's non-unified architecture. NVIDIA claims that at final clocks, it will be faster than the A6X both in 3D games and in GLBenchmark. The leaked GLBenchmark results are apparently from a much older silicon revision running nowhere near final GPU clocks.
 
Mobile SoC GPU Comparison

GPU                         | GeForce ULP (2012) | PowerVR SGX 543MP2 | PowerVR SGX 543MP4 | PowerVR SGX 544MP3 | PowerVR SGX 554MP4 | GeForce ULP (2013)
Used In                     | Tegra 3            | A5                 | A5X                | Exynos 5 Octa      | A6X                | Tegra 4
SIMD Name                   | core               | USSE2              | USSE2              | USSE2              | USSE2              | core
# of SIMDs                  | 3                  | 8                  | 16                 | 12                 | 32                 | 18
MADs per SIMD               | 4                  | 4                  | 4                  | 4                  | 4                  | 4
Total MADs                  | 12                 | 32                 | 64                 | 48                 | 128                | 72
GFLOPS @ Shipping Frequency | 12.4 GFLOPS        | 16.0 GFLOPS        | 32.0 GFLOPS        | 51.1 GFLOPS        | 71.6 GFLOPS        | 74.8 GFLOPS
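
The GFLOPS row follows from the MAD counts. Assuming each MAD counts as two floating point operations (a multiply and an add) per clock, peak throughput is total MADs x 2 x clock. The rough sketch below reproduces the two GeForce ULP figures at the 520MHz assumed above and back-derives the clock that each quoted figure implies. The function names are ours, and the implied clocks are derived from the table rather than vendor-confirmed numbers.

```python
FLOPS_PER_MAD = 2  # assumption: one multiply-add counts as 2 floating point ops


def peak_gflops(total_mads, clock_mhz):
    """Theoretical peak GFLOPS for a given MAD count and clock in MHz."""
    return total_mads * FLOPS_PER_MAD * clock_mhz / 1000.0


def implied_clock_mhz(total_mads, gflops):
    """Clock in MHz that a quoted GFLOPS figure implies for a MAD count."""
    return gflops * 1000.0 / (total_mads * FLOPS_PER_MAD)


# (total MADs, quoted GFLOPS) pairs copied from the table
table = {
    "Tegra 3": (12, 12.4),
    "A5": (32, 16.0),
    "A5X": (64, 32.0),
    "Exynos 5 Octa": (48, 51.1),
    "A6X": (128, 71.6),
    "Tegra 4": (72, 74.8),
}

print(f"Tegra 3 @ 520MHz: {peak_gflops(12, 520):.2f} GFLOPS")
print(f"Tegra 4 @ 520MHz: {peak_gflops(72, 520):.2f} GFLOPS")

for soc, (mads, gflops) in table.items():
    print(f"{soc}: ~{implied_clock_mhz(mads, gflops):.0f}MHz implied")
```

At 520MHz this gives 12.48 GFLOPS for Tegra 3 and 74.88 GFLOPS for Tegra 4, which the table truncates to 12.4 and 74.8. The 71.6 GFLOPS quoted for the A6X implies a clock of roughly 280MHz, which is why Tegra 4's theoretical lead here comes down to frequency.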
 
Tegra 4 does offer some additional enhancements over Tegra 3 in the GPU department. Real multisampling AA is finally supported as well as frame buffer compression (color and z). There's now support for 24-bit z and stencil (up from 16 bits per pixel). Max texture resolution is now 4K x 4K, up from 2K x 2K in Tegra 3. Percentage-closer filtering is supported for shadows. Finally, FP16 filter and blend is supported in hardware. ASTC isn't supported.
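
To put the texture size change in perspective, here is a quick back-of-the-envelope sketch. It assumes "4K x 4K" means 4096 x 4096 and "2K x 2K" means 2048 x 2048, and it assumes an uncompressed 4-byte RGBA8 texel with no mipmaps. None of those details come from NVIDIA, and the helper name `texture_bytes` is ours.

```python
MIB = 1024 * 1024


def texture_bytes(width, height, bytes_per_texel=4):
    """Uncompressed size of one mip level (assumes RGBA8 by default)."""
    return width * height * bytes_per_texel


old_max = texture_bytes(2048, 2048)  # assumed Tegra 3 maximum
new_max = texture_bytes(4096, 4096)  # assumed Tegra 4 maximum

print(f"2K x 2K: {old_max // MIB} MiB")
print(f"4K x 4K: {new_max // MIB} MiB")
print(f"Ratio: {new_max // old_max}x")
```

Under those assumptions a single maximum-size texture grows from 16 MiB to 64 MiB, four times the texel count.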
 
If you're missing details on Tegra 4's CPU, be sure to check out our initial coverage. 

59 Comments

  • Formul - Monday, January 14, 2013

    nVidia has a history of seriously overhyping the performance of their products. They did it with Tegra 2 and they did it with Tegra 3, so let's wait for some independent testing before believing them this time.
  • JomaKern - Tuesday, January 15, 2013

    But they've gone public with a statement that Tegra 4 will beat A6X in GLBenchmark, they will look very douchey if they fail.

    Of course there is the remote possibility that they will benchmark a 1080p device against the iPad 4 in Egypt HD and claim victory, even though they lose at Egypt 1080p offscreen.
  • CeriseCogburn - Sunday, January 20, 2013

    LOL - anti-nVidia fanboys are the ones who constantly entertain their own distorted world view hype...

    A5X vs. Tegra 3 in the Real World

    " In situations where a game is available in both the iOS app store as well as NVIDIA's Tegra Zone, NVIDIA generally delivers a comparable gaming experience to what you get on the iPad. In some cases you even get improved visual quality as well. The iPad's GPU performance advantage just isn't evident in those cases—likely because the bulk of iOS devices out there still use far weaker GPUs. That's effectively a software answer to a hardware challenge, but it's true.

    NVIDIA isn't completely vindicated however. " by ANANDTECH

    http://www.anandtech.com/show/5688/apple-ipad-2012...

    You umm... nVidia haterz were saying what ? Oh that's right, you got it 100% incorrect but why not continue repeating lies in the hopes that many peer group fools will believe it too....
  • Wolfpup - Tuesday, January 15, 2013

    On the CPU side it's 4 A15s, all of which are more powerful than Swift, only there's 2x as many of them, and they're clocked faster...that's a no brainer there.

    On the GPU side, if Nvidia is saying the segmented architecture makes sense, I believe them. They've got well over half a decade's experience making high end unified designs...they could do so easier than anyone else could, so that they're not means they're almost certainly right about that.

    And to the "I'll wait for benchmarks" people, well sure, but just look at the physical size of this versus Tegra 3.

    Like...well, every version of Tegra prior to launch, I'm fairly excited about this (save for the issue that I don't know whether I'll ever own anything that runs on this!)
  • prashanth041 - Tuesday, January 15, 2013

    something nice to see
  • shodanshok - Tuesday, January 15, 2013

    It seems quite similar to a 12 pipeline NV40 ;)

    Anyway it will be interesting to see the first benchmarks...
  • Commodus - Wednesday, January 16, 2013

    You need a shipping product.

    Show me where I can buy a Tegra 4-based device today. Oh, right... they don't exist. The current iPad, meanwhile, has been on the market since October. By the time there's a Tegra 4 tablet, we may be hearing murmurs of an iPad with an A7X inside, and NVIDIA will at best have had a few months' lead before it's eclipsed again. If, of course, it leads at all.

    There's no doubt that the Tegra 4 is a big step forward for NVIDIA, but part of why Apple dominates tablets is because it knows how to walk the walk: the interval between announcement and shipping is usually 10 days, and it knows this because it's building the finished hardware. It doesn't have to throw a processor into the wilderness and pray someone decides to use it.
  • Krysto - Wednesday, January 16, 2013

    I wonder if GLBenchmark 3.0 with support for OpenGL ES 3.0 will arrive soon. Either way, Anand, you should at least try to get the 3Dmark11 benchmark, which should come out soon (no OpenGL ES 3.0 support, unfortunately), and test the new GPUs on these new benchmarks.
  • MikeHonet - Sunday, January 20, 2013

    Oh, that's right, I can't yet. I love how someone posts specs on a component and compares it to a shipping product that has been out for months. Too little too late AGAIN NVidia. By the time this is in a shipping product Apple will have its next gen product out already.
