NVIDIA's Tegra 2 Take Two: More Architectural Details and Design Wins

Author: Anand Lal Shimpi

by Anand Lal Shimpi on January 5, 2011 2:51 PM EST


The GeForce ULV GPU

Tegra 2 integrates the GeForce ULV, an NVIDIA designed OpenGL ES 2.0 GPU.

At a high level NVIDIA is calling the GeForce ULV an 8-core GPU; however, it’s not a unified shader GPU. Each core is an ALU, but half of them are used for vertex shaders and the other half for pixel shaders. You can expect the GeForce ULV line to take a similar evolutionary path to desktop GeForce in the future.

The four vertex shader cores/ALUs can do a total of 4 MADDs per clock; the same is true for the four pixel shader ALUs (4 MADDs per clock).
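To put those per-clock figures in perspective, here is a rough sketch of the peak arithmetic rate they imply. It counts each MADD as two operations (one multiply, one add). NVIDIA's figures above don't include the GPU clock, so `clock_mhz` is a caller-supplied parameter and the value used in the example is a placeholder, not a Tegra 2 spec.

```python
# Per-clock figures taken from the article: 4 MADDs per clock
# for the vertex ALUs and 4 MADDs per clock for the pixel ALUs.
VERTEX_MADDS_PER_CLOCK = 4
PIXEL_MADDS_PER_CLOCK = 4


def peak_madds_per_second(clock_mhz: float) -> float:
    """Peak MADD rate across all eight ALUs at the given clock.

    clock_mhz is an assumption supplied by the caller; the article
    does not state the GeForce ULV clock speed.
    """
    madds_per_clock = VERTEX_MADDS_PER_CLOCK + PIXEL_MADDS_PER_CLOCK
    return madds_per_clock * clock_mhz * 1e6


def peak_ops_per_second(clock_mhz: float) -> float:
    """Count each MADD as two operations (one multiply, one add)."""
    return 2 * peak_madds_per_second(clock_mhz)


if __name__ == "__main__":
    hypothetical_clock_mhz = 300.0  # placeholder, not a published figure
    print(peak_madds_per_second(hypothetical_clock_mhz))
    print(peak_ops_per_second(hypothetical_clock_mhz))
```

Because the shaders aren't unified, this combined peak is only reachable when vertex and pixel work are both keeping their half of the ALUs busy; a pixel-bound scene can't borrow the vertex ALUs.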

For those of you who follow NVIDIA’s desktop architecture: at the front end the machine works on a 4-thread warp, where each thread is a single pixel.

Architecturally, the GeForce ULV borrows several technologies that only recently debuted on desktop GPUs. GeForce ULV has a pixel cache, a feature that wasn’t introduced in GeForce on the desktop until Fermi. This is purely an efficiency play as saving any trips to main memory reduces power consumption considerably (firing up external interfaces always burns watts quicker than having data on die).

NVIDIA also moved the register files closer to the math units, again in the pursuit of low power consumption. GeForce ULV is also heavily clock gated, although that’s not something we’re able to quantify.

NVIDIA did reduce the number of pipeline stages compared to its desktop GPUs by a factor of 2.5 to keep power consumption down.

The GeForce ULV supports Early Z culling, a feature first introduced on the desktop with G80. While G80 could throw away around 64 pixels per clock, early Z on GeForce ULV can throw away 4 pixels per clock.
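For readers who haven't run into early Z before, the idea can be sketched in a few lines. This is a conceptual model only, not NVIDIA's implementation: it assumes a "less than" depth test in which a smaller depth value means closer to the camera, and the function names are made up for illustration. The second helper simply turns the 64 and 4 pixels-per-clock figures into a clock count for a given number of hidden pixels.

```python
from typing import List, Tuple

# (x, y, depth); a smaller depth is assumed to mean closer to the camera.
Fragment = Tuple[int, int, float]


def early_z_cull(fragments: List[Fragment],
                 depth_buffer: List[List[float]]) -> List[Fragment]:
    """Drop fragments hidden behind existing depth before shading them.

    Survivors update the depth buffer and are the only fragments that
    would go on to the (expensive) pixel shader.
    """
    survivors = []
    for x, y, depth in fragments:
        if depth < depth_buffer[y][x]:
            depth_buffer[y][x] = depth
            survivors.append((x, y, depth))
    return survivors


def clocks_to_reject(hidden_pixels: int, pixels_per_clock: int) -> int:
    """Clocks needed to discard hidden pixels at a given cull rate."""
    return -(-hidden_pixels // pixels_per_clock)  # ceiling division


if __name__ == "__main__":
    depth = [[1.0, 1.0], [1.0, 1.0]]
    frags = [(0, 0, 0.5), (0, 0, 0.8), (1, 1, 0.2)]
    print(early_z_cull(frags, depth))  # the 0.8 fragment is culled
    print(clocks_to_reject(1024, 64), clocks_to_reject(1024, 4))
```

The payoff is that culled pixels never reach the pixel shader ALUs at all, which matters more on a part with only four of them.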

The ROPs are integrated into the pixel shader, making what NVIDIA calls a programmable blend unit. GeForce ULV uses the same ALUs for ROPs as it does for pixel shaders. This hardware reuse saves die size although it adds control complexity to the design. The hardware can perform one texture fetch and one ROP operation per clock.
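To illustrate why blending maps naturally onto shader ALUs, here is a minimal sketch of a classic "source over" alpha blend written in terms of multiply-add operations. The choice of blend mode and the `madd` and `blend_source_over` helpers are assumptions for illustration; NVIDIA hasn't detailed which blend operations the hardware runs or how they are scheduled.

```python
from typing import Tuple

Color = Tuple[float, float, float]


def madd(a: float, b: float, c: float) -> float:
    """A multiply-add, the basic operation of the shader ALUs."""
    return a * b + c


def blend_source_over(src: Color, src_alpha: float, dst: Color) -> Color:
    """Blend a shaded pixel over the existing frame buffer color.

    result = src * alpha + dst * (1 - alpha), computed per channel
    with one multiply plus one multiply-add.
    """
    inv_alpha = 1.0 - src_alpha
    return tuple(madd(s, src_alpha, d * inv_alpha)
                 for s, d in zip(src, dst))


if __name__ == "__main__":
    red = (1.0, 0.0, 0.0)
    blue = (0.0, 0.0, 1.0)
    print(blend_source_over(red, 0.5, blue))
```

Since the blend is just more arithmetic of the same kind the pixel shader already does, running it on the same ALUs avoids dedicated fixed-function blend hardware, at the cost of the extra control logic mentioned above.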

While GeForce ULV supports texture compression, it doesn’t support frame buffer compression.

Both AA and AF are supported by GeForce ULV. NVIDIA supports 5X coverage sample AA (same CSAA as we have on the desktop) and up to 16X anisotropic filtering.

We’ve done a little bit of performance testing of Tegra 2’s GPU vs. the PowerVR SGX 530/540 and Qualcomm’s Adreno 200/205. It’s too soon for us to publish Tegra 2 smartphone performance numbers, but the numbers NVIDIA is sharing with us show a huge advantage over competitive GPUs. In our own experience we’ve found that Tegra 2’s GPU is a bit faster than the PowerVR SGX 540, but nothing lines up with the numbers NVIDIA has shown us thus far. We’re still a couple of months away from final software and driver builds so it’s possible that things will change.


21 Comments


tumbleweed - Wednesday, January 5, 2011 - link
Any OMAP 4 phone design wins announced yet? I really want to see how that battle plays out.
bplewis24 - Wednesday, January 5, 2011 - link
When will you guys be able to post the benchmarking numbers?

Also, if the 2X is running a beta build of 2.2 Froyo, why is the status bar black? Are those stock/borrowed photos?

Brandon
strikeback03 - Thursday, January 6, 2011 - link
Manufacturers could skin the status bar if they wanted, the Galaxy S phones with 2.1 have a black status bar
snoozemode - Wednesday, January 5, 2011 - link
Man, why did they have to do that.. After a few months the images get so bad when the plastic is all messy.
strikeback03 - Thursday, January 6, 2011 - link
Because otherwise it is the front lens of the camera getting nasty?
TareX - Wednesday, January 5, 2011 - link
Why did you describe the Optimus 2X as "one of the fastest smartphones we’ve encountered"??

That's very disappointing. What phone can be faster?
KidneyBean - Thursday, January 6, 2011 - link
The iPhone? Seriously, it does seem to be fast for what it can do. I'm an Android type of guy myself.
TareX - Thursday, January 6, 2011 - link
I understand the iPhone 4 "appearing" to be the fastest. But they said the aforementioned phrase right after mentioning that they can't reveal benchmarking scores. So it sounds like another phone beat it in benchmarks.
Aloonatic - Friday, January 7, 2011 - link
Well, even if it was/is the fastest phone that they have tested/benchmarked, they couldn't say it here as that would probably be revealing a little too much. As such they probably have to be as vague as they are, so that we are left guessing.
Cali3350 - Thursday, January 6, 2011 - link
Probably the iPhone 4. When it comes to being smooth iOS is untouched at this point in time (probably because everything is GPU accelerated).
