Tegra K1 ISP & Video

NVIDIA’s Tegra K1 SoC also makes some dramatic improvements on the ISP side. We saw SoCs start arriving with two ISPs sometime in 2013, which allowed OEMs to deliver a host of new imaging experiences, like shot-in-shot video and simultaneous use of both front and rear cameras. With Tegra K1, NVIDIA is not only moving to two ISPs, it’s also making the ISP more of a first-class citizen.

For those not familiar, the ISP (Image Signal Processor) handles the imaging pipeline for still photos and video, performing tasks like Bayer-to-RGB conversion (demosaicing), 3A (autofocus, auto exposure, auto white balance), noise reduction, lens correction, and so on. Although NVIDIA has always included an ISP onboard, I couldn’t shake the feeling that still imaging performance could’ve been better, especially in the few cases that allowed direct comparison (HTC One X). With Tegra K1, there’s more die area dedicated to the ISP than in the past, and there are two of them to support the kind of dual camera applications that have quickly become popular.
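
To make the demosaicing step a bit more concrete, here’s a minimal bilinear Bayer-to-RGB sketch in Python/NumPy. This is purely illustrative: the RGGB layout, the 3x3 interpolation, and the SciPy dependency are my own simplifications, not a description of NVIDIA’s fixed-function pipeline, which uses far more sophisticated, edge-aware interpolation.

```python
import numpy as np
from scipy.ndimage import convolve  # simple 2D convolution used for interpolation

def demosaic_bilinear(bayer):
    """Bilinear demosaic of an RGGB Bayer mosaic (H x W) into an RGB image.

    Toy illustration of what an ISP does in fixed-function hardware; real
    pipelines use edge-aware interpolation and operate at much higher rates.
    """
    h, w = bayer.shape
    rgb = np.zeros((h, w, 3), dtype=np.float32)
    masks = np.zeros((h, w, 3), dtype=np.float32)

    # Assumed RGGB layout: R at (even, even), G at (even, odd) and (odd, even), B at (odd, odd)
    masks[0::2, 0::2, 0] = 1  # red sample locations
    masks[0::2, 1::2, 1] = 1  # green
    masks[1::2, 0::2, 1] = 1  # green
    masks[1::2, 1::2, 2] = 1  # blue

    kernel = np.ones((3, 3), dtype=np.float32)
    for c in range(3):
        samples = bayer * masks[..., c]
        # Normalized convolution: average the known neighbors of each pixel.
        num = convolve(samples, kernel, mode="mirror")
        den = convolve(masks[..., c], kernel, mode="mirror")
        rgb[..., c] = num / np.maximum(den, 1e-6)
    return rgb

# Example: demosaic a synthetic 8x8 sensor readout of 14-bit samples.
raw = np.random.randint(0, 2**14, size=(8, 8)).astype(np.float32)
print(demosaic_bilinear(raw).shape)  # (8, 8, 3)
```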

Tegra K1 includes the third generation of NVIDIA’s ISP, capable of processing 600 MP/s per ISP with 14-bit input, and support for up to 100 MP cameras. There are two of them, so NVIDIA quotes the total pixel throughput as up to 1.2 GP/s. This is dramatically increased from Tegra 4, which supported up to 400 MP/s at 10 bits per pixel. In addition, the K1’s ISP now supports up to 4,096 focus points (a 64 x 64 array) for its autofocus routine. The ISP also gains better noise reduction and local tone mapping, a feature we’ve seen become popular for combining parts of images and recovering some of the dynamic range lost to ever-shrinking pixel sizes.
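
For a sense of scale, here’s a quick back-of-the-envelope look at what 1.2 GP/s of aggregate ISP throughput buys at a few common readout sizes. The sensor sizes and frame-rate ceilings below are my own illustrative arithmetic, ignoring blanking and pipeline overheads; they aren’t NVIDIA figures.

```python
ISP_THROUGHPUT_PIXELS_PER_SEC = 600e6 * 2   # two ISPs at 600 MP/s each = 1.2 GP/s
AF_GRID = 64 * 64                           # 4,096 autofocus points

# Rough ceiling on full-sensor frame rate at a few readout sizes (illustrative only).
sensors = {
    "13 MP still": 13e6,
    "41 MP still": 41e6,
    "100 MP still (max supported)": 100e6,
    "4K/UHD video frame": 3840 * 2160,
}
for name, pixels in sensors.items():
    print(f"{name}: ~{ISP_THROUGHPUT_PIXELS_PER_SEC / pixels:.0f} frames/s")
print(f"Autofocus grid: {AF_GRID} points")
```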

Tegra K1 retains compatibility with the Chimera 1.0 features we just saw in the Tegra Note 7, like object tracking, always-on HDR, slow-motion capture, and full-resolution burst, and adds more. NVIDIA has kept the Chimera brand for the K1 SoC, calling it Chimera 2.0, and envisions this architecture enabling things like better temporal pixel binning (combining 8 exposures from the CMOS sensor to drive noise down further), faster panorama, video stabilization, and even better live preview with effects applied. The high-level architecture of Chimera seems to be the same – kernels that run on either the CPU or the GPU (ostensibly in CUDA this time), before or after the ISP, and in a variety of image spaces (Bayer or RGB, depending).
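
The temporal pixel binning idea is straightforward to sketch: averaging N aligned exposures of the same scene drives random noise down by roughly the square root of N, so 8 exposures cut it to about a third. The toy below makes that concrete; the flat grey test patch and noise level are made up, and a real Chimera-style pipeline obviously has to align frames and handle motion first.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scene: a flat grey patch captured as 8 noisy exposures (arbitrary units).
true_signal = 2000.0
read_noise_sigma = 50.0
exposures = true_signal + rng.normal(0, read_noise_sigma, size=(8, 256, 256))

single = exposures[0]
binned = exposures.mean(axis=0)   # temporal binning: average the 8 exposures

print(f"single-frame noise:   {single.std():.1f}")
print(f"8-frame binned noise: {binned.std():.1f} "
      f"(~ single / sqrt(8) = {single.std() / np.sqrt(8):.1f})")
```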

On the video side, Tegra K1 continues to support 2160p30 (4K/UHD video at 30 FPS) encode and decode. Broken down another way, that’s H.264 High Profile Level 5.1 decode and 4K H.264 High Profile 4.2 encode. The fact that there’s a Kepler GPU next door made me suspect that NVENC was used for most of these tasks, but it turns out that NVIDIA still has discrete blocks for video encode of H.264, VP8, VC-1, and others. These are the same video encode and decode blocks used in Tegra 4, with some further optimizations for power and efficiency. The Tegra K1 platform includes support for H.265 video decode as well, but this isn’t fully accelerated in hardware; rather, the decode is split across NVENC and the CPU.
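
As a quick sanity check on why the decode spec is quoted at Level 5.1: the H.264 standard caps each level at a maximum macroblock (16x16 block) rate and frame size. The sketch below, using the MaxMBPS/MaxFS constants from the spec’s level table, shows that 2160p30 only fits from Level 5.1 upward. This is my own check, not NVIDIA’s decoder logic.

```python
import math

# MaxFS (macroblocks per frame) and MaxMBPS (macroblocks per second)
# from the H.264 spec's level table for the two levels of interest.
H264_LEVELS = {
    "5.0": {"max_mb_per_frame": 22_080, "max_mb_per_sec": 589_824},
    "5.1": {"max_mb_per_frame": 36_864, "max_mb_per_sec": 983_040},
}

def fits_level(width, height, fps, level):
    """Return True if width x height @ fps fits within the given H.264 level."""
    mb_per_frame = math.ceil(width / 16) * math.ceil(height / 16)
    mb_per_sec = mb_per_frame * fps
    lim = H264_LEVELS[level]
    return (mb_per_frame <= lim["max_mb_per_frame"]
            and mb_per_sec <= lim["max_mb_per_sec"])

for level in H264_LEVELS:
    print(f"2160p30 within Level {level}: {fits_level(3840, 2160, 30, level)}")
# -> Level 5.0: False, Level 5.1: True — hence the Level 5.1 decode spec.
```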

NVIDIA showed off a K1 reference board doing 4Kp30 H.264 decode to an attached display, and I didn’t notice any dropped frames. Of course that’s more or less a given considering we saw the same thing on Tegra 4, but it’s still worth noting that the SoC is capable of driving 4K/UHD displays over eDP 1.4, LVDS, and HDMI 1.4b.

The full GPIO breakdown for Tegra K1 includes essentially all the requisite connectivity you’d expect from a mobile SoC. For USB there are three USB 2.0 ports and two USB 3.0 ports. For storage, Tegra K1 supports eMMC up to version 4.5.1, and there’s a PCIe x4 interface which can be configured

Comments

  • Nenad - Monday, January 13, 2014

    That is not a real picture of the GPUs/CPUs; it is photoshopped, so we do not know the relative size of the A15 and Denver cores.
  • chizow - Monday, January 6, 2014

    Of course they would; designating it as simply dual-core would intimate it's a downgrade when it clearly is not.
  • MrdnknnN - Monday, January 6, 2014

    "As if that wasn’t enough, starting now, all future NVIDIA GeForce designs will begin first and foremost as mobile designs."

    I guess I am a dinosaur because this makes me want to cry.
  • nathanddrews - Monday, January 6, 2014

    Why? It was the best thing that ever happened to Intel (Core). Desktop graphics are in a rut. Too expensive, not powerful enough for the coming storm of high frame rate 4K and 8K software and hardware.
  • HammerStrike - Monday, January 6, 2014

    From a gaming perspective, Intel's focus on mobile has led to 10%-15% performance increases in their desktop line whenever they release a new chip series. That's pretty disappointing, from a gaming performance perspective, even though I understand why they are focusing there.

    Also, I disagree with you on desktop graphics - this is a golden time for them. The competition in the $200-$300 card range is fierce, and there is a ton of great value there. Not sure why you think there is a "storm" of 4K and 8K content coming any time soon, as there isn't, but even 2x R9 290, $800 at MSRP (I know the mining craze has distorted that, but it will correct), can drive 4K today. Seeing as most decent 4K monitors are still $3000+, I'd argue it is the cost of the displays, and not the GPUs, that is holding back wider adoption.

    As long as nVidia keeps releasing competitive parts I really don't care what their design methodology is. That being said, power efficiency is the #1 priority in mobile, so if they are going to be devoting mindshare to that my concern is top line performance will suffer in desktop apps, where power is much less of an issue.
  • OreoCookie - Monday, January 6, 2014

    Since Intel includes relatively powerful GPUs in their CPUs, discrete GPUs are needed only for special purposes (gaming, GPU compute and various special applications). And the desktop market has been contracting for years in favor of mobile computers and devices. In the notebook space, thanks to Intel finally including decent GPUs in their CPUs, only high-end notebooks come with discrete GPUs. Hence, the market for discrete GPUs is shrinking (which is one of the reasons why nVidia and AMD are both in the CPU game as well as the GPU game).
  • MrSpadge - Monday, January 6, 2014

    > From a gaming perspective, Intel's focus on mobile has led to 10%-15% performance increases in their desktop line whenever they release a new chip series. That's pretty disappointing

    That's not because of their power-efficiency-oriented design; it's because their CPU designs are already pretty good (difficult to improve upon) and there's no market pressure to push harder. And as socket 2011 shows us: pushing 6 of these fat cores flat out still requires 130+ W, making these PCs the dinosaurs of old again (-> not mass compatible).
  • Sabresiberian - Monday, January 6, 2014

    I think you are misunderstanding the situation here. What will go in a mobile chip will be the equivalent of one SMX core, while what will go in the desktop version will be as many as they can cool properly. With K1 and Kepler we have the same architecture, but there is one SMX in the coming mobile solution, and 15 SMXs in a GeForce 780Ti. So, 15x the performance in the 780Ti (roughly) using the same design.

    Maxwell could end up being made up of something like 20 SMXs designed with mobile efficiency in mind; that's a good thing for those of us playing at the high end of video quality. :)
  • MrSpadge - Monday, January 6, 2014

    This just means they'll be optimized for power efficiency first. Which makes a lot of sense - look at Hawaii: it cannot even reach "normal" clock speeds with the stock cooler because it eats so much power. Improving power efficiency automatically results in higher performance becoming achievable via bigger dies. What they decide to offer us is a different story altogether.
  • kpb321 - Monday, January 6, 2014

    My initial reaction was a little like MrdnknnN's, but when I thought about it I realized that may not be a bad thing. Video cards at this point are primarily constrained on the high end by power and cooling limitations more than anything else. The R9 is a great example of this. Optimizing for mobile should result in a more efficient design which can scale up to good desktop and high-end performance by adding the appropriate memory interfaces and putting down enough "blocks" (SMXs in NVIDIA's case). They already do this to cover the range from barely-better-than-integrated video cards to top-end $500+ cards. I don't think the mobile focus is so far below today's low-end cards that it would cause major problems here.
