Automotive: DRIVE CX and DRIVE PX

While NVIDIA has been a GPU company throughout their entire history, they will be the first to tell you that they can't remain strictly a GPU company forever, and that they must diversify if they are to survive over the long run. That need has driven NVIDIA over the last half-decade or so to offer a wider range of hardware and even software. Tegra SoCs have been a big part of that plan so far, but in recent years NVIDIA has grown increasingly discontent with being a pure hardware provider, leading the company to branch out in unusual ways: not just selling hardware, but selling buyers on whole solutions and experiences. GRID, GameWorks, and NVIDIA's Visual Computing Appliances have all been part of this branching-out process.

Meanwhile, with unabashed car enthusiast Jen-Hsun Huang at the helm of NVIDIA, it's hardly a coincidence that the company has also been branching out into automotive technology. Though still an early field for NVIDIA, the company's automotive Tegra sales have been a bright spot amid the larger struggles Tegra has faced. Now, against the backdrop of CES 2015, the company is taking its next step into automotive technology by expanding beyond selling Tegras to automobile manufacturers and into selling manufacturers complete automotive solutions. To this end, NVIDIA is announcing two new automotive platforms: NVIDIA DRIVE CX and DRIVE PX.

DRIVE CX is NVIDIA’s in-car computing platform, which is designed to power in-car entertainment, navigation, and instrument clusters. While it may seem a bit odd to use a mobile SoC for such an application, Tesla Motors has shown that this is more than viable.

With NVIDIA's DRIVE CX, automotive OEMs get a Tegra X1 on a board that provides the Bluetooth, modem, audio, camera, and other interfaces needed to integrate the SoC into a car. This makes it possible to drive up to 16.6MP of display resolution, which works out to roughly two 4K displays or eight 1080p displays, although each DRIVE CX module can only drive three displays. In press photos the platform also appears to have a fan, which is likely necessary for the Tegra X1 to run continuously at maximum performance without throttling.
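
For those curious how the 16.6MP figure maps to display counts, here is a quick back-of-the-envelope check. This is just our own arithmetic using standard 4K UHD and 1080p panel resolutions, not anything NVIDIA provided:

    # Sanity check on the DRIVE CX display claim: 16.6MP total pixel budget.
    MEGAPIXEL = 1_000_000

    panels = {
        "4K UHD (3840x2160)": 3840 * 2160,  # ~8.3 MP per panel
        "1080p (1920x1080)": 1920 * 1080,   # ~2.1 MP per panel
    }

    budget_mp = 16.6
    for name, pixels in panels.items():
        count = budget_mp / (pixels / MEGAPIXEL)
        print(f"{name}: ~{count:.1f} panels within the {budget_mp} MP budget")
    # Prints roughly 2 panels for 4K UHD and 8 for 1080p, matching the figures above.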

NVIDIA showed off some examples of where DRIVE CX would improve over existing car computing systems, in the form of advanced 3D rendering for navigation that better conveys information, and 3D instrument clusters that are said to better suit cars with premium designs. Although the latter is a bit gimmicky, DRIVE CX does seem to have a strong selling point: it provides an in-car computing platform with a large amount of compute while driving down the time and cost of developing such a platform.

While DRIVE CX is a fairly logical application of a mobile SoC, DRIVE PX puts mobile SoCs to work in car autopilot applications. To do this, the DRIVE PX platform uses two Tegra X1 SoCs to support up to twelve cameras with an aggregate bandwidth of 1,300 megapixels per second. This means it's possible to have all twelve cameras capturing 1080p video at around 60 FPS, or 720p video at 120 FPS. NVIDIA has also already built most of the software stack needed for autopilot applications, so comparatively little time and cost would be needed to implement features such as surround vision, auto-valet parking, and advanced driver assistance.
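
To put the 1,300MP/s figure in perspective, a similar back-of-the-envelope check (again our own arithmetic, simply splitting the aggregate budget evenly across twelve cameras) shows where those frame rates come from:

    # Per-camera frame rate implied by DRIVE PX's 1,300 MP/s aggregate bandwidth.
    AGGREGATE_PIXELS_PER_S = 1300 * 1_000_000
    NUM_CAMERAS = 12

    modes = {
        "1080p (1920x1080)": 1920 * 1080,
        "720p (1280x720)": 1280 * 720,
    }

    per_camera = AGGREGATE_PIXELS_PER_S / NUM_CAMERAS
    for name, pixels in modes.items():
        print(f"{name}: ~{per_camera / pixels:.0f} FPS on each of {NUM_CAMERAS} cameras")
    # ~52 FPS at 1080p and ~118 FPS at 720p, consistent with the "around 60 FPS"
    # and 120 FPS figures quoted above.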

In the case of surround vision, DRIVE PX is said to deliver a better experience by improving stitching of video to reduce visual artifacts and compensate for varying lighting conditions.
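
NVIDIA hasn't published how its stitching pipeline works, but as a rough illustration of the general technique, OpenCV's high-level Stitcher class performs the same kind of job on still frames, handling exposure compensation and seam blending internally. The sketch below is illustrative only, and the camera file names are placeholders:

    # Minimal surround-view-style stitching sketch using OpenCV.
    # Illustrative only; this is not NVIDIA's implementation.
    import cv2

    # Overlapping frames captured at the same instant (placeholder file names).
    frames = [cv2.imread(p) for p in ("cam_front.jpg", "cam_left.jpg", "cam_right.jpg")]

    stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
    status, composite = stitcher.stitch(frames)

    if status == cv2.Stitcher_OK:
        # Exposure compensation and seam blending inside stitch() are what smooth
        # over the varying lighting between cameras.
        cv2.imwrite("surround_view.jpg", composite)
    else:
        print(f"Stitching failed with status {status}")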

The auto-valet parking feature seems to build upon this surround vision system: it uses the cameras to build a 3D representation of the garage, combined with feature detection, to drive through the garage looking for a valid parking spot (no handicap logo, parking lines present, etc.), and then autonomously parks the car once a valid spot is found.

NVIDIA has also developed an auto-valet simulator system with five GTX 980 GPUs to make it possible for OEMs to rapidly develop self-parking algorithms.

The final feature of DRIVE PX, advanced driver assistance, is possibly the most computationally intensive of the three features discussed here. In order to deliver a truly useful driver assistance system, NVIDIA is leveraging neural network technologies, which allow for object recognition with extremely high accuracy.

While we won't dive deep into how such neural networks work, in essence a neural network is composed of perceptrons, which are analogous to neurons. A perceptron receives various inputs and, given the stimulus level on each input, returns a Boolean (true or false). By combining perceptrons into a network, it becomes possible to teach the network to recognize objects in a useful manner. It's also important to note that such neural networks are easily parallelized, which means GPUs can dramatically accelerate them. For example, DRIVE PX would be able to detect whether a traffic light is red, whether an ambulance has its sirens on or off, whether a pedestrian is distracted or aware of traffic, and the content of various road signs. Such networks would also be able to detect these objects even when they are partially occluded by other objects, or under differing lighting conditions and viewpoints.
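
As a concrete, heavily simplified illustration of the perceptron described above, the following sketch implements one by hand; the weights and inputs are arbitrary values chosen for the example, not anything trained:

    # A single perceptron: weighted inputs plus a bias, thresholded to a Boolean.
    def perceptron(inputs, weights, bias):
        activation = sum(i * w for i, w in zip(inputs, weights)) + bias
        return activation > 0  # "fires" (True) only if the stimulus crosses the threshold

    # Two stimulus inputs, e.g. feature detector outputs scaled to 0..1.
    print(perceptron([0.9, 0.2], weights=[1.0, -0.5], bias=-0.3))  # True
    print(perceptron([0.1, 0.8], weights=[1.0, -0.5], bias=-0.3))  # False

In a real network the weights are learned from labeled training images, and it's the evaluation of many such units in parallel that maps so well to a GPU.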

While honing such a system takes millions of test images to reach high accuracy levels, NVIDIA is leveraging Tesla GPUs in the cloud to train the neural networks, which are then loaded onto DRIVE PX, rather than training locally. In addition, failed identifications are logged and uploaded to the cloud in order to further improve the neural network. Updated networks can be delivered either over the air or at service time, which should mean that driver assistance improves over time. It isn't a far leap to see how such technology could also be leveraged in self-driving cars.

Overall, NVIDIA seems to be planning for the DRIVE platforms to be ready next quarter, with production systems ready for 2016. This should make it possible for vehicles launching in 2016 to have some sort of DRIVE system present, although it may take until 2017 to see this happen.

194 Comments

  • chizow - Monday, January 5, 2015 - link

    Nvidia is only catching up on process node, because what they've shown, comparing apples to apples, is:

    1) They have a much faster custom 64-bit CPU (A8X needed 50% more CPU to edge Denver K1)
    2) They have a much faster GPU architecture (A8X also needed 50% more GPU cores to edge Denver K1, but gets destroyed by Tegra X1 on the same 20nm node).

    As we can see, once it is an even playing field at 20nm, A8X isn't going to be competitive.
  • GC2:CS - Monday, January 5, 2015 - link

    They just postponed their "much faster custom 64-bit CPU" in favor of an off-the-shelf design, which compared to A8X is much higher clocked.

    A8X has just 33% more "cores" than K1, and again the GXA6850 GPU is probably clocked miles under the ~1GHz that nvidia targets.

    And what's wrong with using a wider CPU/GPU ?

    And yeah, Tegra X1 is up to 2x faster than A8X, but considering it also runs at the same power as K1, it is not a lot more efficient.
  • chizow - Monday, January 5, 2015 - link

    How do you get only 33% for A8X? A8 = 2 core, Denver K1 = 2 core, A8X = 3 core. 1/2 = 50% increase.

    Same for A8X over A8. GPU cores went from 4 to 6, again, 2/4 = 50% increase. Total transistors went from 2Bn to 3Bn, again 50% increase.

    In summary, Apple fully leveraged 20nm advantage to match Denver K1 GPU and edge in CPU (still losing in single-core) using a brute-force 50% increase in transistors and functional units.

    Obviously they won't be able to pull the same rabbit out of the hat unless they go to FinFET early, which is certainly possible, but then again, it's not really a magic trick when you pay a hefty premium for early access to the best node, is it?

    Bottomline is Nvidia is doing more on the same process node as Apple, simple as that, and that's nothing to be ashamed of from an engineering standpoint.
  • GC2:CS - Monday, January 5, 2015 - link

    A8X got 8 GPU clusters. And I still can't get your idea, you think that A8X is worse because it's brute force ~50% faster? Yeah it is brute force, but I don't know how you can perceive that as a bad thing.

    They will certainly try to push finfet and rather hard I think.

    And how can you say that nvidia is doing more on the same node while boasting just above about how Apple is the one who is doing more and how that's bad?
  • chizow - Monday, January 5, 2015 - link

    Wow, A8X is 8 clusters and doesn't even offer a 100% increase over A8? Even worse than I thought, I guess I missed that update at some point over the holiday season.

    The point is that in order to match the "disappointing" Denver K1, Apple had to basically redouble their efforts to produce a massive 3Bn transistor SoC while fully leveraging 20nm. You do understand that's really not much of an accomplishment when you are on a more advanced process node right?

    Sure Apple may push FinFET hard, but from everything I've read, FinFET will be more widely available for ramp compared to the problematic 20nm, which was always limited capacity outside of the premium allocation Apple pushed for (since they obviously needed it to distinguish their otherwise unremarkable SoCs).

    It should be obvious why I am saying Nvidia is doing more on the same process node, because when you compare apple to Apples, Nvidia's chip on the 28nm node is more than competitive with the 20nm Apple chips, and when both are on 20nm, it's going to be no contest in Nvidia's favor.

    Logical conclusion = Nvidia is doing more on the same process node, ie. outperforming their competition when the playing field is leveled.
  • lucam - Tuesday, January 6, 2015 - link

    Chizow, the more I read the more I laugh. You compare clusters with cores when they are different technologies, and you still state this crap. Maybe it would be better to compare what both of them are capable of in terms of GFLOPS at the same frequency? That's what counts. Regarding your absurd discussion of process node, since the Nvidia chip is so efficient, I look forward to seeing it in smartphones.
  • aenews - Saturday, January 24, 2015 - link

    The A8X isn't on any phones either. In fact, they left it out of both iPhones AND the iPad Mini.

    And keep in mind, even the Qualcomm Snapdragon 805 had few design wins... only the Kindle Fire HDX among tablets. It scored two major phones (Nexus 6 and Note 4), but other manufacturers haven't used it.
  • squngy - Monday, January 5, 2015 - link

    He did not say it is worse, his whole point is that Apple most likely will not be able to do the same thing again.
  • tipoo - Tuesday, May 17, 2016 - link

    Core counts are irrelevant across GPU architectures, they're just different ways of doing something.
    If someone gets to the same power draw, performance, and die size with 100 cores as someone else does with 10, what does it matter?
  • Jumangi - Monday, January 5, 2015 - link

    Uh, the A8 is an actual product that exists and, wait for it, you can actually BUY a product with it in there. This is another mobile paper launch by Nvidia, with the consumer having no idea when or where it will actually show up. The only thing real enthusiasts should care about is the companies that can actually deliver parts people can use. Nvidia still has a loooong way to go in that department. Paper specs mean shit.
