OpenCL 1.0: The Road to Pervasive GPU Computing

Author: Derek Wilson

by Derek Wilson on December 31, 2008 6:40 PM EST

Posted in
GPUs

Why is Parallel Computing Hard?

There are plenty of issues with parallel programming. Breaking up the problem is often the most important and complex step, especially when the parallelism is not obvious. As we are rooted in a world of sequential programming, conceptualizing the parallelization of tasks that lend themselves to sequential programming is tough. This can require not only the reworking of code, but redesigning the entire process of solving a problem.

Even in problems that lend themselves to parallelism, exploiting the parallelism can be tough. Even if you know the best and fastest algorithm for solving a data parallel problem, it isn't always possible to translate that to an efficient program. For instance, if I want to multiply two matrices with dimensions of 100k x 100k, I can't just spawn all the threads I would need. If I were using POSIX threads to calculate one cell of the result matrix each, I would spend more time creating threads and allocating resources than actually doing the computation. I've got to take the resources I have and use them to the best of my ability. Though I can do matrix multiplication in parallel, I have to be careful about how I break up the problem, and I can't exploit all the parallelism possible because of the tools I normally work with.
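To make the trade-off concrete, here is a minimal sketch of the "be careful about how I break up the problem" approach: instead of one thread per result cell, the rows of the result are split into a handful of chunks and handed to a small, fixed pool of workers. It is written in Python rather than with POSIX threads purely for brevity, and the names `multiply_rows`, `parallel_matmul` and `workers` are made up for this illustration. Note that in CPython the threads will not actually speed up this arithmetic; the point is only the shape of the partitioning.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply_rows(a, b, row_start, row_end):
    """Compute rows [row_start, row_end) of the matrix product a * b."""
    inner = len(b)      # shared dimension (columns of a, rows of b)
    cols = len(b[0])    # columns of the result
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(row_start, row_end)
    ]


def parallel_matmul(a, b, workers=4):
    """Multiply a * b using a few coarse chunks of rows, not one task per cell."""
    n = len(a)
    if n == 0:
        return []
    # Never create more workers than there are rows to hand out.
    workers = max(1, min(workers, n))
    chunk = (n + workers - 1) // workers  # rows per chunk, rounded up
    bounds = [(start, min(start + chunk, n)) for start in range(0, n, chunk)]

    result = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns the blocks in the same order as bounds.
        for block in pool.map(lambda se: multiply_rows(a, b, *se), bounds):
            result.extend(block)
    return result
```

The number of tasks here is bounded by `workers`, not by the size of the matrix, which is exactly the compromise described above: the overhead stays small, but most of the available parallelism goes unused.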

We are also limited in terms of hardware resources. With only a few processors available for general purpose programming, even if the software overhead weren't an issue we couldn't actually get any speedup from parallelizing beyond a certain point. This not only means that we can't exploit tons of parallelism even when the algorithm lends itself to it, but it also discourages programmers from thinking in terms of parallelism.

How Does OpenCL Help?

What if we had a pool of hardware resources hundreds wide that could handle thousands of threads in flight at a time with no software overhead? Well, we do: it's called a GPU. And if we could use the GPU for processing, then we could spawn a bunch of threads and really chew through the matrix multiplication we talked about earlier (or whatever). We might still have to be concerned about how many hardware resources we have in order to best map the problem to the specific device in the system. And we still have the problem of actually spawning, managing and running threads on the GPU hardware.

But what if we could write a special function, called a kernel, that can instantly be spawned hundreds or thousands or millions of times and run on different data, all without needing to handle creating and managing all the threads ourselves? And what if we didn't need to worry about how to break up our problem, and left determining how to allocate threads to the runtime? Well, now we have a solution: that's OpenCL.
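The kernel idea can be sketched without any OpenCL at all. The following is a plain Python emulation, not OpenCL code: `matmul_kernel` and `launch` are hypothetical names invented here, and the serial loop in `launch` merely stands in for the runtime that would schedule the instances on a device. In real OpenCL C the kernel would obtain its index from `get_global_id()`, and the host would request the launch with `clEnqueueNDRangeKernel`.

```python
def matmul_kernel(global_id, a, b, out, n):
    """Kernel body: compute exactly one cell of the n x n result.

    Each instance only looks at its own global_id to decide which
    cell it owns; it knows nothing about threads or scheduling.
    """
    row, col = divmod(global_id, n)
    out[row][col] = sum(a[row][k] * b[k][col] for k in range(n))


def launch(kernel, global_size, *args):
    """Stand-in for the runtime: run one kernel instance per index.

    A real runtime decides how to spread these instances over the
    hardware; this emulation simply runs them one after another.
    """
    for global_id in range(global_size):
        kernel(global_id, *args)


def matmul(a, b):
    """Multiply two square matrices by launching one instance per cell."""
    n = len(a)
    out = [[0] * n for _ in range(n)]
    launch(matmul_kernel, n * n, a, b, out, n)
    return out
```

Calling `matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` launches four kernel instances, one per result cell. The programmer writes only the per-cell work and states how many instances are wanted; how those instances map onto hardware is left to whatever plays the role of `launch`.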

The GPU is the vehicle for exploiting data parallelism. But before now our vehicle has run like a train on a track called real-time 3D graphics acceleration. OpenCL removes the track and the limitations and builds in a steering wheel developers can use to take the GPU (and other parallel devices) anywhere a programmer can imagine.

37 Comments

yyrkoon - Saturday, January 3, 2009 - link
Apparently I *am* more knowledgeable than some here. How you can twist the context of comments to your misguided reasoning ( that I favor Microsoft ) is beyond me. Do I prefer Windows to OSX ? Yes. Why? Because maybe Microsoft is not perfect, but at least they do not force unwanted hardware on me to use their software.

Windows is the only real gaming OS. Period. And I suppose my comment about Cross platform applications, and other good strong possible uses in a *NIX environment fell on deaf ears too( uses for OpenCL ).

There is nothing wrong with OSX, it is after all based on BSD. However I will not over pay for hardware *just* to use it either. There are too many free operating systems that are just as good. If I need Windows application compatibility, I will just run Windows. Apple offers me *nothing* I have to have.

Now, who here is truly blind ?
melgross - Saturday, January 3, 2009 - link
You just want to think you are.

You have gaming on the brain. I guess you must BE a gamer as that's all they think about anyway.
Penti - Saturday, January 3, 2009 - link
Really who cares about the gaming? This isn't a physics framework or engine.

It can be used in games, but this isn't really about a discussion on Apple gaming. That's not really why it can "speak" to each other.

Apple has a lot of professional applications on its platform that today use the open standard OpenGL, such as photo editing, video editing, VFX and others (scientific apps etc.), not only for graphics but for GPGPU, and not only from Apple itself but from vendors such as Adobe and Avid. Most of those apps also use OpenGL for acceleration on Windows. Besides that, OpenCL will be available for handheld devices such as mobile phones. Even though Microsoft does software for phones, you won't see DX11 or GPGPU there. Not that I'm an Apple fanboy, but I can see why Apple builds on what's already around and extends OpenGL and free standards. They can't rely on closed standards; most of their apps (from other vendors for OS X) are to some degree cross platform, as they should be. CUDA is already available on the Mac too. But you can't expect them to run DX. This isn't about Apple as an OEM either. It's about software (Microsoft does hardware too). It's engineered to fit a wider picture and a wider array of devices, including Windows; there isn't anything bad about that. There isn't anything bad about giving consumer and professional apps a boost by using GPGPU. It's certainly what some ISVs want. There's more than gaming in the world. Microsoft is free to do whatever, and nobody has said that they aren't best at games, but people are also free to criticize and complain about Microsoft, just as they are about Apple, and there certainly is a lot to criticize both about. Apple for certain can't just cater to itself, not when it and its software vendors want something else. Microsoft essentially can, as most are already deeply invested in Microsoft tech and software. That doesn't mean Windows users can't benefit from the Apple-developed OpenCL. There certainly are Windows-only apps that will use it, even non-OpenGL ones. It's not only a cross platform library.
Atechie - Friday, January 2, 2009 - link
Drop the Apple-preaching, it's uninteresting as Apple is neither HPC nor the mainstay platform for CUDA/Brook+/OpenCL.

.oO(I swear, Apple-jocks are like religious zealots, they can't stop pushing their religion down everybody else's throat...interested or not.)
melgross - Saturday, January 3, 2009 - link
Yeah, just like people like you who do the opposite?

Why mention the company who did all the work, as long as it's Apple? Right? That makes people fanboys if we think a proper mention should be made?
Shadowself - Friday, January 2, 2009 - link
So anyone says anything positive about Apple and immediately that equates to being an Apple zealot? It appears more likely that your personal bias is showing.

It is absolutely true that Apple's Mac has NEVER been a gamer's platform -- and it probably never will be. Additionally, Apple has never fully supported (or even properly supported, IMHO) any development other than their core groups (K-12, Undergraduate to some extent, graphics and motion picture artist communities, and publishing). Thus Apple supports low to mid range graphics cards and very high end 3D cards -- but absolutely nothing for the moderate to high end gamer.

However, Apple did do the vast majority of OpenCL before submitting it to become an open standard. Apple wants to expand its role in the graphics and motion picture communities. The only way to do this was to do something like OpenCL. Additionally, Apple knew that a completely closed set of APIs was not going to gain any traction. Thus they submitted it as an open standard and gave up control of it.

Not mentioning that Apple did the majority of OpenCL is wrong. For anyone to claim Apple did this altruistically is wrong. To bash Apple for coming up with something that has become a cross platform standard that can utilize both AMD and nVidia cards as well as a host of other hardware is wrong.
yyrkoon - Thursday, January 1, 2009 - link
I never said it wasn't true. Let us just say that I am less than inspired to even bother looking. OpenGL is very low on my personal list of priorities, and I could care less what Apple does( unless perhaps if someday they compete head to head with Microsoft ).

Still, no matter how much I like or dislike OpenCL, chances are pretty good that on Windows platforms, it is going to be rendered( pun? ) moot. Maybe it will make the next greatest XGL even more powerful, so all those people who like to play with their application windows in linux can spend all day every day bragging/ making youtube videos about how their desktop UI can do *this*, and *that* while remaining even less productive than before ; )

Yes, the above is sarcasm to some extent, but it is also true to an extent as well. OpenCL will help those who prefer an alternative to Windows do similar things without having to own Windows. Scientists who want to use GPGPU(s) to crunch some serious numbers, etc. What it will not do however is make the majority of gamers out there happy. *Unless* the majority of game developers start using OpenGL/CL on the Windows platform( Which is very unlikely ). Certain cross platform applications however could benefit, sure.
Penti - Friday, January 2, 2009 - link
So OpenCL and OpenGL is bad because it's cross platform and open standard? If you look at who's involved you see companies like ARM and embedded computing companies, they can't really use anything like DX11. This isn't just for games but GPGPU in general.

It's not like there aren't apps using OpenGL on Windows either. But it's rather about a broader spectrum than owning or not owning Windows. It's for a wider category of devices than DX11 is. You won't have DX11 cellphones. But you will have OpenCL on the next gen Sony and Nintendo consoles, handhelds, settopboxes etc. In HPC too, there will be libraries/frameworks to help you out.

Of course there are professional apps such as photo-editing, video-editing and encoding, VFX, CAD / GIS, math and other engineering software that could benefit widely from OpenCL. And a lot of them are cross-platform. Or at least would need OpenCL on, for example, the Mac, where they might have many customers.
kevinkreiser - Wednesday, December 31, 2008 - link
a while back i published a paper that involved performing an iterative deconvolution on the GPU. the point of the paper was that we could do it in real-time and use it on videos with arbitrary spatially varying blur kernels.

anyway the largest overhead was copying the render target (single iteration of the algorithm) to initialize the next iteration. if dx11 and opencl allow the gpu and cpu to work with the same memory, without the need to copy between the two, this will speed up gpgpu apps tremendously.
has407 - Monday, January 12, 2009 - link
OpenCL itself is neutral; it provides both explicit copy and map functions, in both synchronous and asynchronous forms. Obviously what works best will depend on platform capabilities and run-time intelligence (e.g., copy/map optimizations based on platform capabilities and program behavior).

However, that still doesn't necessarily allow for a large mapped/shared memory between the CPU and GPU. That and its efficacy is going to be implementation dependent, and OpenCL has simply defined a model that should be portable and useful, even if suboptimal on a given implementation--but if you know enough about the implementation, it gives you sufficient optimization choices.

That requires some constraints on the memory model, in particular the consistency/correctness of various memory regions with respect to computational elements at different points and times, and especially with respect to mapped memory (NB: sec 5.2.8.1 of the spec).
