Why NVIDIA Thinks CUDA for C and Brook+ Are Viable Alternatives

While OpenCL is a high level API, it does require the programmer to perform certain tasks that don't have much to do with the parallel algorithm being implemented. OpenCL devices in the system need to be found and set up to properly handle the task at hand. This requires a lot of overhead like creation of a context, device selection, creating the command cue(s), management of buffers for supplying and collecting data on the OpenCL device, and dynamically compiling OpenCL kernels within the program. This is all in addition to writing kernels (data parallel functions) and actually using them in a program that does useful work.

The overhead and management work required is similar to what goes on with OpenGL. This makes sense considering the fact that both use GPUs, they can share data with eachother, and that the same standards body that manages OpenGL is now managing OpenCL. But the fact remains that this type of overhead is cumbersome and can be a real headache for anyone who is more interested in the algorithm. Like scientists working on HPC code who know the theory much better than the programming most of the time.

Both Brook+ and CUDA for C hide the complexity of setting up the hardware by allowing the driver to handle the details. This allows developers to write a kernel, use it, and forget about what's actually going on in the hardware for the most part. Going with something like this as a first move for both NVIDIA and AMD was a good move, as it allows developers to get familiar with the type of programming they will be doing in the future for data parallel problems without tacking higher levels of complexity than necessary.

NVIDIA, for one, believes a language extension as opposed to an API like OpenCL has major benefits and will always have a place in GPU computing (and especially in the HPC space where scientists don't want to be programmers any more than they need to). When asked if they would submit their language to a standards body, NVIDIA said that was highly unlikely as there are other language efforts out there and NVIDIA has been advancing CUDA for C much more rapidly than a standards body would.

On the down side, putting more control in the hands of the developer can result in better, faster code. There is a bit of a "black box" feeling to these solutions: you put code in and get results out, but you can't be sure what goes on in the middle to make it happen. OpenCL gives you better ability to fine tune the software and make sure that exactly what you want to happen happens. Despite NVIDIA's assertions that scientists interested in coding for HPC solutions will have a better experience with CUDA, the cost/benefit of ultra-fine tuning code for HPC machines leans heavily in favor of spending the time and money on optimizing. This means that OpenCL will likely be the choice for performance sensitive HPC applications. CUDA for C and Brook+ will likely have more of a place in just trying out ideas before settling on a final direction.

So there you have it. OpenCL will enable applications in the consumer space to take advantage of data parallel hardware, while Brook+ and CUDA may still have a place in the industry as well (but not on the consumer side of things). That is, until some other more popular standard data parallel language extensions come along and pushes both CUDA for C and Brook+ out of the market.

Open, Closed, Proprietary ... Sorting out the Confusion OpenCL Extending OpenGL
Comments Locked

37 Comments

View All Comments

  • yyrkoon - Saturday, January 3, 2009 - link

    Apparently I *am* more knowledgeable than some here. How you can twist the context of comments to your misguided reasoning ( that I favor Microsoft ) is beyond me. Do I prefer Windows to OSX ? Yes. Why? Because maybe Microsoft is not perfect, but at least they do not force unwanted hardware on me to use their software.

    Windows is the only real gaming OS. Period. And I suppose my comment about Cross platform applications, and other good strong possible uses in a *NIX environment fell on deaf ears too( uses for OpenCL ).

    There is nothing wrong with OSX, it is after all based on BSD. However I will not over pay for hardware *just* to use it either. There are too many free operating systems that are just as good. If I need Windows application compatibility, I will just run Windows. Apple offers me *nothing* I have to have.

    Now, who here is truly blind ?

  • melgross - Saturday, January 3, 2009 - link

    You just want to think you are.

    You have gaming on the brain. I guess you must BE a gamer as that's all they think about anyway.
  • Penti - Saturday, January 3, 2009 - link

    Really who cares about the gaming? This isn't a physics framework or engine.

    It can be used in games, but this isn't really about a discussion on Apple gaming. That's not really why it can "speak" to each other.

    Apple got a lot of professional applications that today uses the open standard OpenGL like photo editing, video editing, VFX and others (scientific apps etc) on their platform, for not only graphics but for gpgpu, from not only them selfs but from vendors such as Adobe and Avid. Most of the apps also use OpenGL for acceleration in Windows too. Besides that, OpenCL will be available for handheld devices such as mobile phones. Even though Microsoft does software for phones you won't see DX11 or GPGPU there. Not that I'm an Apple fanboy, but I can see why Apple builds on what's already around and extends OpenGL and free standards. They can't rely on close standards, most of their apps (other vendors for OS X) are to some degree cross platform as they should be. CUDA is already available on the Mac too. But you can't expect them to run DX. This isn't about Apple as an OEM either. It's about software (Microsoft does hardware too). It's engineered to fit a wider picture and a wider array of devices including Windows, there isn't anything bad about that. There isn't anything bad about getting consumer and professional apps a boost in using GPGPU. It's certainly what some ISVs want. Theres more then gaming in the world. Microsoft are free to do whatever and nobody has said that they aren't best on games, but people are also free to criticizes and complain about Microsoft, just as they are about Apple and there certainly is a lot to be criticizing both about. Apple for certain can't just be catering to it selfs, not when they and their software vendors want something else. Microsoft essentially can. As most are already deeply invested in Microsoft tech and soft. That doesn't mean Windows users can't benefit from the Apple developed OpenCL. Their certainly is Windows only apps that will use it. Even non OpenGL ones. It's not only a cross platform library.
  • Atechie - Friday, January 2, 2009 - link

    Drop the Apple-preaching, it's uninteresting as Apple is neither HPC nor the mainstay platform for CUDA/Brook+/OpenCL.

    .oO(I swear, Apple-jocks are like religious zealots, they can stop pushing their religion down everbody elses throat...interested or not.)
  • melgross - Saturday, January 3, 2009 - link

    Yeah, just like people like you who do the opposite?

    Why mention the company who did all the work, as long as it's Apple? Right? That' makes people fanboys if we think a proper mention should be made?
  • Shadowself - Friday, January 2, 2009 - link

    So anyone says anything positive about Apple and immediately that equates to being an Apple zealot? It appears more likely that your personal bias is showing.

    It is absolutely true that Apple's Mac has NEVER been a gamer's platform -- and it probably never will be. Additionally, Apple has never fully supported (or even properly supported, IMHO) any development other than their core groups (K-12, Undergraduate to some extent, graphics and motion picture artist communities, and publishing). Thus Apple supports low to mid range graphics card and very high end 3D cards -- but absolutely nothing for the moderate to high end gamer.

    However, Apple did do the vast majority of OpenCL before submitting it to become an open standard. Apple wants to expand its role in the graphics and motion picture communities. The only way to do this was to do something like OpenCL. Additionally, Apple knew that a completely closed set of APIs was not going to gain any traction. Thus they submitted it as an open standard and gave up control of it.

    Not mentioning that Apple did the majority of OpenCL is wrong. For anyone to claim Apple did this altruistically is wrong. To bash Apple for coming up with something that has become a cross platform standard that can utilize both AMD and nVidia cards as well as a host of other hardware is wrong.
  • yyrkoon - Thursday, January 1, 2009 - link

    I never said it wasn't true. Let us just say that I am less than inspired to even bother looking. OpenGL is very low on my personal list of priorities, and I could care less what Apple does( unless perhaps if someday they compete head to head with Microsoft ).

    Still, no matter how much I like or dislike OpenCL, chances are pretty good that on Windows platforms, it is going to be rendered( pun? ) moot. Maybe it will make the next greatest XGL even more powerful, so all those people who like to play with their application windows in linux can spend all day every day bragging/ making youtube videos about how their desktop UI can do *this*, and *that* while remaining even less productive than before ; )

    Yes, the above is sarcasm to some extent, but it also true to an extent as well. OpenCL will help those who prefer and alternative to Windows do similar things without having to own Windows. Scientists who want to use GPGPU(s) to crunch some serious numbers, etc. What it will not do however is make the majority of gamers out there happy. *Unless* the majority of game developers start using OpenGL/CL on the Windows platform( Which is very unlikely ). Certain cross platform applications however could benefit, sure.
  • Penti - Friday, January 2, 2009 - link

    So OpenCL and OpenGL is bad because it's cross platform and open standard? If you look at who's involved you see companies like ARM and embedded computing companies, they can't really use anything like DX11. This isn't just for games but GPGPU in general.

    It's not like there isn't apps using OpenGL on Windows either. But it's rather about a broader spectrum then owning or not owning Windows. It's for a wider category of devices then DX11 is. You won't have DX11 cellphones. But you will have OpenCL on the next gen Sony and Nintendo consoles, handhelds, settopboxes etc. In HPC too, there will be libraries/frameworks to help you out.

    Of course theres professional apps such as Photo-editing, video-editing and encoding, VFX, CAD / GIS, math and other engineering software that could benefit widely from Open CL. And a lot of them are cross-platform. Or at least would need the OpenCL on for example the Mac. Where they might have many customers.
  • kevinkreiser - Wednesday, December 31, 2008 - link

    a while back i published a paper that involved performing an iterative deconvolution on the GPU. the point of the paper was that we could do it in real-time and use it on videos with arbitrary spatially varying blur kernels.

    anyway the largest overhead was copying the render target (single iteration of the algorithm) to initialize the next iteration. if dx11 and opencl allow the gpu and cpu to work with the same memory, without the need to copy between the two, this will speed up gpgpu apps tremendously.
  • has407 - Monday, January 12, 2009 - link

    OpenCL itself is neutral; it provides both explicit copy and map functions, in both synchronous and asynchronous forms. Obviously what works best will depend on platform capabilities and run-time intelligence (e.g., copy/map optimizations based on platform capabilities and program behavior).

    However, that still doesn't necessarily allow for a large mapped/shared memory between the CPU and CPU. That and its efficacy is going to be implementation dependent and OpenCL has simply defined a model that should be portable and useful, even if suboptimal on a given implementation--but if you know enough about the implementation, gives you sufficient optimization choices.

    That requires some constraints on the memory model, in particular the consistency/correctness of various memory regions with respect to computational elements at different points and times, and especially with respect to mapped memory (NB: sec 5.2.8.1 of the spec).

Log in

Don't have an account? Sign up now