NVIDIA GTC 2010 Wrapup

by Ryan Smith on October 10, 2010 12:56 AM EST

CUDA x86

One of the things that got a lot of attention in the comments on our Day 1 article was the announcement of a CUDA-x86 compiler by The Portland Group (PGI), a subsidiary of STMicroelectronics. PGI is not a name we normally throw around here, as the company’s primary focus is on building compilers for High Performance Computing, the CUDA Fortran compiler in particular. A lot of people asked about PGI’s compiler and what it meant for consumer CUDA applications, so we stopped by and asked them about their CUDA-x86 compiler.

The long and short of it is that the CUDA-x86 compiler is another PGI product meant for use in the HPC market. Specifically, the company is targeting national laboratories and other facilities that do HPC work with massive clusters and would like to better integrate their CPU clusters with their GPU clusters. With the CUDA-x86 compiler, these facilities can compile their CUDA HPC applications to run on their CPU clusters without any source changes, putting to use clusters that might otherwise go idle because they can’t normally run applications written for CUDA. CPUs of course won’t be nearly as fast as NVIDIA GPUs at executing CUDA code, but in some cases this may be better than letting those clusters sit idle.
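
To make the recompile-don’t-rewrite idea concrete, below is a minimal sketch of the kind of self-contained CUDA code such a compiler targets. The vector-add kernel is our own illustration, not PGI sample code; built with NVIDIA’s nvcc it runs on the GPU, while PGI’s compiler would map the same thread blocks onto CPU cores and vector units with no changes to the source.

    // vecadd.cu - hypothetical CUDA vector add; the same source could be fed
    // to PGI's CUDA-x86 compiler to run on a CPU cluster instead of a GPU.
    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    __global__ void vecAdd(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // one element per thread
        if (i < n)
            c[i] = a[i] + b[i];
    }

    int main(void)
    {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);

        float *h_a = (float *)malloc(bytes);
        float *h_b = (float *)malloc(bytes);
        float *h_c = (float *)malloc(bytes);
        for (int i = 0; i < n; i++) { h_a[i] = (float)i; h_b[i] = 2.0f * i; }

        float *d_a, *d_b, *d_c;
        cudaMalloc((void **)&d_a, bytes);
        cudaMalloc((void **)&d_b, bytes);
        cudaMalloc((void **)&d_c, bytes);
        cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

        // 256 threads per block, with enough blocks to cover all n elements
        vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
        cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

        printf("c[42] = %.1f\n", h_c[42]);  // expect 126.0
        cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
        free(h_a); free(h_b); free(h_c);
        return 0;
    }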

What this means for consumers, however, is that the compiler isn’t targeted at consumer applications. Conceivably someone could go to PGI and purchase the CUDA-x86 compiler for that purpose, but it isn’t a market the company is currently chasing. It also goes without saying that PGI’s compilers are expensive software, normally licensed on a per-node basis, which makes this scenario as costly as it is unlikely.

NVIDIA’s VDPAU for *nix

NVIDIA’s Video Decode and Presentation API for Unix (VDPAU) was also the subject of a session at GTC. VDPAU was first introduced in 2008 to expose NVIDIA’s video decode hardware to *nix operating systems, which, unlike Windows with DXVA, lacked an operating-system-level API for the task. NVIDIA has been adding to the library since then, and earlier this year they added a batch of interoperability features focused on providing new ways to access the GPU as a graphical device beyond what OpenGL can do. This entails things such as being able to directly set and get bits on the GPU, along with enhancements specifically for desktop compositing.
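
For the curious, here is a minimal sketch of how an application binds to VDPAU; it is our own illustration rather than NVIDIA sample code, and all error handling is omitted. The design is worth noting: a device is created against an X11 display, and every other entry point is then fetched by ID at runtime through a proc-address callback, which is what lets other vendors supply their own backends.

    /* vdpau_sketch.c - create a VDPAU device and request an H.264 decoder.
       Build with: gcc vdpau_sketch.c -lvdpau -lX11
       The profile and resolution values are illustrative. */
    #include <X11/Xlib.h>
    #include <vdpau/vdpau.h>
    #include <vdpau/vdpau_x11.h>

    int main(void)
    {
        Display *dpy = XOpenDisplay(NULL);
        VdpDevice device;
        VdpGetProcAddress *get_proc_address;

        /* Bind VDPAU to the X display; the driver hands back a dispatcher. */
        vdp_device_create_x11(dpy, DefaultScreen(dpy), &device, &get_proc_address);

        /* Every other entry point is resolved by function ID at runtime. */
        VdpDecoderCreate *decoder_create;
        get_proc_address(device, VDP_FUNC_ID_DECODER_CREATE,
                         (void **)&decoder_create);

        /* Ask the driver for a hardware H.264 decoder for 1080p content. */
        VdpDecoder decoder;
        decoder_create(device, VDP_DECODER_PROFILE_H264_HIGH, 1920, 1080,
                       16 /* max reference frames */, &decoder);
        return 0;
    }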

We’re not going to dwell on interoperability much, but while we were there we did ask NVIDIA whether we’re going to see a unified video decode API for *nix any time soon. While NVIDIA has VDPAU and has gotten parties like S3 to use it, AMD and Intel are backing the rival Video Acceleration API (VA API). As a result, any *nix software that wants to support video decode acceleration on all GPUs currently needs to support two different APIs.

The short answer, according to NVIDIA, is that we should not expect a unified API any time soon. There isn’t significant motivation on either side of the aisle to come together on a single API, and given the ad-hoc nature of *nix development, there isn’t a single guiding force that can mandate anything. For the time being, *nix developers looking to take advantage of video decode acceleration will have to keep covering both APIs if they wish to cover the complete spectrum of GPUs.
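
As a rough illustration of that burden, the sketch below (our own, with error handling trimmed) shows what a cross-GPU application has to do: probe for each API in turn, then maintain a full decode backend for whichever one answers.

    /* probe_decode.c - detect which video decode API the driver offers.
       Build with: gcc probe_decode.c -lvdpau -lva -lva-x11 -lX11 */
    #include <X11/Xlib.h>
    #include <va/va.h>
    #include <va/va_x11.h>          /* VA API: Intel, and AMD via its shim */
    #include <vdpau/vdpau.h>
    #include <vdpau/vdpau_x11.h>    /* VDPAU: NVIDIA, S3 */

    int main(void)
    {
        Display *dpy = XOpenDisplay(NULL);

        VdpDevice vdp_dev;
        VdpGetProcAddress *vdp_gpa;
        if (vdp_device_create_x11(dpy, DefaultScreen(dpy),
                                  &vdp_dev, &vdp_gpa) == VDP_STATUS_OK) {
            /* NVIDIA/S3 path: drive the VDPAU backend. */
            return 0;
        }

        int major, minor;
        VADisplay va_dpy = vaGetDisplay(dpy);
        if (va_dpy && vaInitialize(va_dpy, &major, &minor) == VA_STATUS_SUCCESS) {
            /* Intel/AMD path: drive the VA API backend. */
            return 0;
        }

        /* Neither API present: fall back to software decoding. */
        return 1;
    }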

GPU.NET: Making GPU Programming Easier

When NVIDIA and AMD started their GPU computing initiatives, both recognized that developers needed to be able to write code at a level higher than the raw hardware. Not only was assembly-level programming mind-numbingly hard for most developers when faced with the complexity of a GPU, but allowing it would have locked AMD and NVIDIA into providing low-level backwards compatibility, a la x86. Instead, both companies did an end-run around the problem and focused on getting developers to use C-style languages (CUDA and Brook+, and later OpenCL), with these programs compiling down to bytecode, which could then be further compiled down to the appropriate native instructions by the driver itself. These bytecode languages are PTX for NVIDIA and IL for AMD.
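
On NVIDIA’s side the two stages are easy to see for yourself. Compiling the toy kernel below with nvcc’s -ptx flag stops at the bytecode stage and emits human-readable PTX, which is what the driver later compiles down to the native instruction set of whatever GPU is actually installed; the kernel itself is just our illustration.

    // scale.cu - compile with "nvcc -ptx scale.cu -o scale.ptx" to stop at
    // the PTX bytecode stage; at runtime the driver compiles that PTX down
    // to the native ISA of the installed GPU.
    __global__ void scale(float *data, float k, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            data[i] *= k;  // each thread scales one element
    }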

Furthermore, by having a bytecode level, additional programming languages could in time be retrofitted to work with the GPU by writing new compilers that compile these languages down to the right bytecode. This is what made OpenCL and DirectCompute possible, as well as full languages such as Fortran. And while back in my day C++ was a perfectly reasonable high level language (ed: get off my lawn), the reality of the situation is that both NVIDIA and AMD have so far focused almost exclusively on low-level programming, leaving programmers who have come to rely on modern high level features such as inheritance and garbage collection out in the cold. As it turns out, today’s GPUs are plenty capable of running code written in a high level language, but until now no one had written the compiler for it.

While at GTC we had a chance to talk to Tidepowerd, a small company developing such a compiler in the form of GPU.NET. GPU.NET is a compiler, runtime, and library combination that interfaces with the .NET languages, allowing developers to write GPU computing code in C#, F#, and even VB.NET. As lower level languages have been displaced by .NET and its peers, this lets developers who are only accustomed to such environments write GPU code in their higher level language of choice, making GPU programming far more accessible to the high level crowd.

Ultimately GPU.NET is not a panacea, but it does help with accessibility. Programmers still need to be able to write parallel code – which is in and of itself hard – but they no longer need to deal with lower level constructs such as manual memory management, as GPU.NET follows its .NET heritage and uses garbage collection. It also lets developers write code within Microsoft Visual Studio, and as we mentioned in our discussion of NVIDIA’s Parallel Nsight 1.5, Visual Studio support is a huge deal when it comes to reaching many programmers.

As a GPU enthusiast, it’s my hope that new languages and runtimes such as GPU.NET will make GPU programming more accessible to the rank-and-file programmers who put together so many of today’s consumer applications, as the GPU future that NVIDIA, AMD, and Apple believe in requires more than a handful of programming wizards writing code for a few major applications. GPU.NET in particular has a hurdle to clear because it requires a runtime, and runtimes are rarely looked upon favorably by users, but for the time being at least, this is what it takes to write GPU compute code in the .NET languages. If not OpenCL, then hopefully the proliferation of GPU compute-capable high level languages will be what’s needed to kick-start the development of GPU compute-enabled applications for consumers.

Comments

  • dtdw - Sunday, October 10, 2010 - link

    we had a chance to Adobe, Microsoft, Cyberlink, and others about where they see GPU computing going in the next couple of years.

    shouldnt you add 'the' before adobe ?

    and adding 'is' after computing ?
  • tipoo - Sunday, October 10, 2010 - link

    " we had a chance to Adobe, Microsoft, Cyberlink, and others about where they see GPU computing going "

    Great article, but I think you accidentally the whole sentence :-P
  • Deanjo - Sunday, October 10, 2010 - link

    "While NVIDIA has VDPAU and also has parties like S3 use it, AMD and Intel are backing the rival Video Acceleration API (VA API)."

    Ummm wrong, AMD is using XvBA for its video acceleration API. VAAPI provides a wrapper library to XvBA, much like there is a VAAPI wrapper for VDPAU. Also VDPAU is not proprietary; it is part of Freedesktop, and the open source library package contains a wrapper library and a debugging library allowing other manufacturers to implement VDPAU support in their device drivers. In short, every device manufacturer out there is free to include VDPAU support, and it is up to the driver developer to add that support to a free and truly open API.
  • Ryan Smith - Sunday, October 10, 2010 - link

    AMD is using XvBA, but it's mostly an issue of semantics. They already had the XvBA backend written, so they merely wrote a shim for VA API to get it in there. In practice XvBA appears to be dead, and developers should use VA API and let AMD and the others work on the backend. So in that sense, AMD is backing VA API.

    As for NVIDIA, proprietary or not doesn't really come into play. NVIDIA is not going to give up VDPAU (or write a VA API shim) and AMD/Intel don't want to settle on using VDPAU. That's the stalemate that's been going on for a couple of years now, and it doesn't look like there's any incentive on either side to come together.

    It's software developers that lose out; they're the ones that have to write in support for both APIs in their products.
  • electroju - Monday, October 11, 2010 - link

    Deanjo, that is incorrect. VA API is not a wrapper. It is the main API from freedesktop.org. It was created by Intel, unfortunately, but it helped extend the stalled XvMC project into a more flexible API. VDPAU and XvBA came later to provide their own ways of doing about the same thing. They also include backward compatibility with VA API. VDPAU is not open source; the package just provides structs to be able to use VDPAU, so VDPAU cannot be changed by the open source community to implement new features.
  • AmdInside - Sunday, October 10, 2010 - link

    Good coverage. Always good to read new info. Often looking at graphics card reviews can get boring, as I tend to sometimes just glance at the graphs and that's it. I sure wish Adobe would use the GPU more for photography software. Lightroom is one piece of software that works alright on desktops but is too slow for my taste on laptops.
  • AnnonymousCoward - Monday, October 11, 2010 - link

    Holodeck? C'mon. It's a 3D display. You can't create a couch and then lie on it.
  • Guspaz - Tuesday, October 12, 2010 - link

    I'm sort of disappointed with RemoteFX. It sounds like it won't be usable remotely by consumers or small businesses who are on broadband-class connections; with these types of connections, you can probably count on half a megabit of throughput, and that's probably not enough to be streaming full-screen MJPEG (or whatever they end up using) over the net.

    So, sure, works great over a LAN, but as soon as you try to, say, telecommute to your office PC via a VPN, that's not going to fly.

    Even if you're working for a company with a fat pipe, many consumers (around here, at least) are on DSL lines that will get them 3 or 4 megabits per second; that might be enough for lossy motion-compensated compression like h.264, but is that enough for whatever Microsoft is planning? You lose a lot of efficiency by throwing away iframes and mocomp.
  • ABR - Tuesday, October 19, 2010 - link

    Yeah, it also makes no sense from an economic perspective. Now you have to buy a farm of GPUs to go with your servers? And the video capability now and soon to be built into every Intel CPU just goes for nothing? More great ideas from Microsoft.
