NVIDIA GTC 2010 Wrapup

by Ryan Smith on October 10, 2010 12:56 AM EST

CUDA x86

One of the things that got a lot of attention in the comments on our Day 1 article was the announcement of a CUDA-x86 compiler by The Portland Group (PGI), a subsidiary of STMicroelectronics. PGI is not a name we normally throw around here, as the company's primary focus is on building compilers for High Performance Computing, most notably its CUDA Fortran compiler. A lot of people asked about PGI's compiler and what it meant for consumer CUDA applications, so we stopped by and asked them about their CUDA-x86 compiler.

The long and the short of it is that the CUDA-x86 compiler is another PGI product meant for the HPC market. Specifically, the company is targeting national laboratories and other facilities that do HPC work on massive clusters and would like to better integrate their CPU clusters and their GPU clusters. With the CUDA-x86 compiler, these facilities can compile their CUDA HPC applications to run on their CPU clusters without any changes, putting to work clusters that might otherwise sit idle because they can't normally run code written for CUDA. CPUs of course won't be nearly as fast as NVIDIA GPUs at CUDA code, but in some cases this beats letting those clusters go unused.
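
To put the "without any changes" claim in more concrete terms, consider a trivial CUDA kernel like the one below (our own example, not PGI code). Today this source is compiled with NVIDIA's nvcc and runs on a GPU; PGI's pitch is that the identical source could instead be fed to its CUDA-x86 compiler and executed across the cores (and presumably the vector units) of an ordinary x86 cluster node.

    // saxpy.cu - y = a*x + y, one element per thread. Standard CUDA C;
    // nothing in this kernel is specific to PGI or to x86.
    __global__ void saxpy(int n, float a, const float *x, float *y)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
        if (i < n)
            y[i] = a * x[i] + y[i];
    }

    // For a GPU:  nvcc -c saxpy.cu
    // For x86:    the same file, unmodified, fed to a CUDA-x86 compiler, which
    //             would map thread blocks onto CPU cores instead of GPU
    //             multiprocessors (our description, not a PGI command line).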

What this means for consumers, however, is that the compiler isn't targeted at consumer applications. Conceivably someone could go to PGI and purchase the CUDA-x86 compiler for that purpose, but it isn't something the company is currently chasing. It also goes without saying that PGI's compilers are expensive pieces of software, normally licensed on a per-node basis, making that scenario both costly and unlikely.

NVIDIA’s VDPAU for *nix

NVIDIA's Video Decode and Presentation API for Unix (VDPAU) was also the subject of a session at GTC. VDPAU was first introduced in 2008 to expose NVIDIA's video decode hardware to *nix operating systems, which, unlike Windows with DXVA, lacked an OS-level API for the job. Since 2008 NVIDIA has continued to build out the library, and earlier this year it added a set of interoperability features focused on providing new ways to access the GPU as a graphical device beyond what OpenGL can do, such as directly setting and getting bits on the GPU, along with enhancements aimed specifically at desktop compositing.
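
For readers who haven't seen it, VDPAU itself is a small, plain C API: an application creates a VDPAU device against an X11 display, then pulls in the rest of the API (decoders, video mixers, presentation queues) through a function-pointer lookup. A rough sketch of the bring-up, based on the public vdpau headers (our own example, with all error handling omitted):

    #include <X11/Xlib.h>
    #include <vdpau/vdpau.h>
    #include <vdpau/vdpau_x11.h>

    /* Minimal VDPAU bring-up: create a device, then fetch one entry point. */
    int main(void)
    {
        Display *dpy = XOpenDisplay(NULL);   /* connect to the X server          */
        VdpDevice dev;
        VdpGetProcAddress *get_proc;         /* gateway to the rest of the API   */

        vdp_device_create_x11(dpy, DefaultScreen(dpy), &dev, &get_proc);

        /* Every other entry point - decoder creation, presentation queues,
           and so on - is obtained through get_proc rather than linked directly. */
        VdpDecoderCreate *decoder_create;
        get_proc(dev, VDP_FUNC_ID_DECODER_CREATE, (void **)&decoder_create);

        return 0;
    }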

We’re not going to dwell on interoperability much, but while we were there we did ask NVIDIA whether we're going to see a unified video decode API for *nix any time soon. While NVIDIA has VDPAU, and other parties such as S3 have adopted it, AMD and Intel are backing the rival Video Acceleration API (VA-API). As a result, any *nix software that wants to support video decode acceleration on all GPUs currently has to support two different APIs.

The short answer, according to NVIDIA, is that we should not expect a unified API any time soon. There isn't significant motivation on either side of the aisle to come together on a single API, and given the ad-hoc nature of *nix development, there isn't a single guiding force that can mandate anything. For the time being, *nix developers looking to take advantage of video decode acceleration will have to keep covering both APIs if they wish to cover the complete spectrum of GPUs.

GPU.NET: Making GPU Programming Easier

When NVIDIA and AMD started their GPU computing initiatives, both recognized that developers needed to be able to write code at a level higher than the raw hardware. Not only was assembly-level programming mind-numbingly hard for most developers when faced with the complexity of a GPU, but letting developers work at that level would have locked AMD and NVIDIA into providing low-level backwards compatibility, à la x86. Instead, both companies did an end-run around the problem and focused on getting developers to use C-style languages (CUDA and Brook+, and later OpenCL) that compile down to a bytecode, which the driver then further compiles into the appropriate machine code for the GPU at hand. These bytecode languages are PTX for NVIDIA and IL for AMD.
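
The pipeline is easy to see firsthand with NVIDIA's own toolchain: nvcc can be told to stop at the PTX stage, and at run time the driver JIT-compiles that PTX down to the native instruction set of whatever GPU it finds. A quick sketch using a throwaway kernel of our own:

    // add_one.cu - a trivial kernel, just to have something to compile
    __global__ void add_one(float *data)
    {
        data[threadIdx.x] += 1.0f;
    }

    // Stop at the bytecode stage instead of producing a device binary:
    //   nvcc --ptx add_one.cu -o add_one.ptx
    // The resulting .ptx file is human-readable virtual assembly; the driver
    // compiles it to the installed GPU's actual machine code when the program runs.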

Furthermore, by having a bytecode layer, additional programming languages can in time be retrofitted to work with the GPU by writing new compilers that target the right bytecode. This is what made OpenCL and DirectCompute possible, along with full languages such as Fortran. And while back in my day C++ was a perfectly reasonable high-level language (ed: get off my lawn), the reality of the situation is that both NVIDIA and AMD have so far focused almost exclusively on low-level programming, leaving programmers who have come to rely on modern high-level features such as inheritance and garbage collection out in the cold. As it turns out, today's GPUs are plenty capable of running code written in a high-level language, but until now no one had written the compiler for it.

While at GTC we had a chance to talk to Tidepowerd, a small company developing just such a compiler in the form of GPU.NET. GPU.NET is a compiler, runtime, and library combination that interfaces with the .NET languages, allowing developers to write GPU computing code in C#, F#, and even VB.NET. As lower-level languages have increasingly been displaced by the .NET languages and their kin, this lets developers who are only accustomed to those languages write GPU code in their higher-level language of choice, making GPU programming more accessible to the high-level crowd.

Ultimately GPU.NET is not a panacea, but it does help with accessibility. Programmers still need to be able to write parallel code – which is hard in and of itself – but they no longer need to deal with lower-level chores such as manual memory management, as GPU.NET follows its .NET heritage and uses garbage collection. It also lets developers write their code within Microsoft Visual Studio, and as we mentioned in our discussion of NVIDIA's Parallel Nsight 1.5, Visual Studio support is a huge deal when it comes to reaching many programmers.
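
To put the memory management point in perspective, the sketch below shows the host-side bookkeeping an ordinary CUDA C program carries for even a single kernel launch: explicit allocation, copies in both directions, and an explicit free. This is generic CUDA code of our own, not GPU.NET's API (GPU.NET kernels are written in C#, and its runtime is pitched as handling the equivalent of all of this automatically).

    #include <cuda_runtime.h>

    __global__ void scale(float *data, int n, float factor)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            data[i] *= factor;
    }

    // Manual housekeeping around one kernel launch - the kind of code a
    // garbage-collected runtime is meant to take off the programmer's hands.
    void scale_on_gpu(float *h_data, int n, float factor)
    {
        float *d_data;
        size_t bytes = n * sizeof(float);

        cudaMalloc((void **)&d_data, bytes);                        // allocate on the GPU
        cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice);  // copy input up

        scale<<<(n + 255) / 256, 256>>>(d_data, n, factor);         // launch

        cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);  // copy results back
        cudaFree(d_data);                                           // free manually
    }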

As a GPU enthusiast, it's my hope that new languages and runtimes such as GPU.NET will make GPU programming more accessible to the rank & file programmers who put together so many of today's consumer applications, as the GPU future that NVIDIA, AMD, and Apple believe in requires more than a handful of programming wizards writing code for a few major applications. GPU.NET in particular has a hurdle to get over because it requires a runtime, and runtimes are rarely looked upon favorably by users, but for the time being at least this is what it takes to write GPU compute code in the .NET languages. If not OpenCL, then hopefully the proliferation of GPU compute-capable high-level languages will be what's needed to kick-start the development of GPU compute-enabled applications for consumers.

Comments

  • adonn78 - Sunday, October 10, 2010 - link

    This is pretty boring stuff. I mean the projectors on the curved screens were cool, but what about gaming? Anything about NVIDIA's next gen? They are really falling far behind and are not really competing when it comes to price. I for one cannot wait for the debut of AMD's 6000 series. CUDA and PhysX are stupid proprietary BS.
  • iwodo - Sunday, October 10, 2010 - link

    What? This is GTC, it is all about the Workstation and HPC side of things. Gaming is not the focus of this conference.
  • bumble12 - Sunday, October 10, 2010 - link

    Sounds like you don't understand what CUDA is, by a long mile.
  • B3an - Sunday, October 10, 2010 - link

    "teh pr0ject0rz are kool but i dun understand anyting else lolz"

    Stupid kid.
  • iwodo - Sunday, October 10, 2010 - link

    I was about to post that rendering on the server is fundamentally flawed, but the more I think about it the more it makes sense.

    However, defining a codec takes months, and actually refining and implementing one takes YEARS.

    I wonder what the client would consist of. Do we need a CPU to do any work at all? Or would EVERYTHING be done on the server other than booting up and acquiring an IP?

    If that is the case, maybe an ARM A9 SoC would be enough to do the job.
  • iwodo - Sunday, October 10, 2010 - link

    Just started digging around. LG has a Network Monitor that lets you use RemoteFX with just an Ethernet cable!

    http://networkmonitor.lge.com/us/index.jsp

    And x264 can already encode at sub-10ms latency! I can imagine IT management would be like a trillion times easier with centrally managed VMs like RemoteFX. No more upgrading every client's computer. Stuff in a few HDSL Revo Drives and let everyone enjoy the benefits of SSDs.

    I do have a question about how it will scale; with over 500 machines you have effectively used up all your bandwidth...
  • Per Hansson - Sunday, October 10, 2010 - link

    I've been looking forward to this technology since I heard about it some time ago.
    Will be interesting to test how well it works with the CAD/CAM software I use, most of which is proprietary machine-builder-specific software...
    There was no mention of OpenGL in this article, but from what I've read that is what it is supposed to support (OpenGL rendering offload).
    At least that's what like 100% of the CAD/CAM software out there uses, so it better be if MS wants it to be successful :)
  • Ryan Smith - Sunday, October 10, 2010 - link

    Someone asked about OpenGL during the presentation and I'm kicking myself for not writing down the answer, but I seem to recall that OpenGL would not be supported. Don't hold me to that, though.
  • Per Hansson - Monday, October 11, 2010 - link

    Well I hope OpenGL will be supported, otherwise this is pretty much a dead tech as far as enterprise industries are concerned.

    This article has a reply by the author Brian Madden in the comments regarding support for OpenGL; http://www.brianmadden.com/blogs/brianmadden/archi...

    "For support for apps that require OpenGL, they're supporting apps that use OpenGL v1.4 and below to work in the VM, but they don't expect that apps that use a higher version of OpenGL will work (unless of course they have a DirectX or CPU fallback mode)."
  • Sebec - Sunday, October 10, 2010 - link

    Page 5 -"... and the two companies are current the titans of GPU computing in consumer applications."

    Current the titans?

    "Tom believes that ultimately the company will ultimately end up using..."
