Cyberlink & Adobe: Views On Consumer GPU Computing
GTC is a professional show, but that doesn’t just mean it’s for professional software, or even consumer software for that matter. It’s also a show for professional software developers looking into how to better utilize GPUs in their products. This brought Adobe and Cyberlink to the show, and the two companies are the current titans of GPU computing in consumer applications. One of our major goals with GTC was to speak with each of them and to see what their expectations were for GPU computing in consumer applications in the future (something NVIDIA has heavily promoted for years), and we didn’t leave disappointed. In fact, if anything, we got some conflicting views, which showcases just where the use of GPU computing in consumer applications stands today.
We’ll start with Cyberlink, who was on hand to showcase the latest version of MediaExpresso (née MediaShow Expresso), their GPU-powered video encoding suite. Tom Vaughan, their director of business development, was there to show off the software and answer our questions.
As it stands, Cyberlink develops products against both AMD’s Stream API and NVIDIA’s CUDA API, largely as a legacy of the fact that they’ve offered such products for a few years now, which means they predate common APIs like OpenCL and DirectCompute. So our first questions were about the progress of these APIs, and where the company sees itself going with them.
Cyberlink is primarily a Windows company, so they’re in a position where they can use either OpenCL or DirectCompute as they deem necessary. They’re already using DirectCompute for face detection in MediaShow (the only commercial application using DirectCompute that we’re aware of right now), although it sounds like the company is likely to do the bulk of their GPU computing work with OpenCL in the future. For products such as MediaShow Expresso, Tom believes that the company will ultimately end up with a single OpenCL codebase and a single codepath for the GPU acceleration of encoding and related features. From our perspective this is very forward-looking, as there are still hurdles with OpenCL for Cyberlink (or anyone else intent on shipping an OpenCL product) to deal with, such as bugs to work out in the OpenCL drivers and the lack of a common ICD (installable client driver); and it doesn’t help that not everyone currently ships OpenCL drivers with their regular driver set.
Furthermore, as Tom noted to us, it’s not in Cyberlink’s best interests to jump ship to DirectCompute or OpenCL for everything right away. As it stands there’s no good reason for them to reinvent the wheel by throwing out their existing work with CUDA and Stream, so those codebases could be in use for quite a while longer. New features will be written in OpenCL (when the drivers are ready), but for the time being the company remains attached to CUDA/Stream. But on that note, when they do go to OpenCL they really do expect to have a single codebase, running the same routines on both AMD and NVIDIA GPUs, as they believe it’s unnecessary to write separately optimized paths for AMD and NVIDIA even with the significant differences between the two companies’ GPU architectures.
In the wider view, Tom definitely sees additional companies finally getting into GPU compute, and for more things than just video encoding, once the OpenCL driver situation stabilizes. I don’t believe Tom (or really anyone else) knows what else GPU computing will be used for in the consumer space, but it’s clear the groundwork for it is finally starting to come together. Truthfully we expected the OpenCL driver and application situation to be in much better shape more than a year after OpenCL was finalized, but it looks like that time is fast approaching.
The second group we talked to was Adobe, who was giving a session based around their experiences in shipping commercial software with GPU acceleration over the last three iterations of the Adobe Creative Suite. In Adobe’s case they’ve had GPU graphics acceleration since CS3, and CS5 added their first example of GPU compute acceleration with the CUDA-powered Mercury Playback Engine in Premiere Pro CS5. Giving the talk was Kevin Goldsmith, the engineering manager of their Image Foundation group, which provides the GPU-accelerated libraries the other groups use in their products.
Over the hour they listed a number of specific incidents, but a few common themes ran through the talk. The first was that Adobe can’t ignore Intel’s CPUs or GPUs: the latter because they are the #1 source of crashes for Adobe’s GPU-accelerated software, and the former because the CPU is the fallback when Intel’s GPUs aren’t stable or a suitable GPU simply isn’t available. In fact Kevin was rather frank about how much grief Intel’s GPUs and drivers had caused them, and that as a result they’ve had to blacklist a great number of Intel product/driver combinations whose drivers advertise features that aren’t working correctly or aren’t stable. The strong message was that for the time being, any kind of GPU acceleration is really only viable on NVIDIA and AMD GPUs, which in turn is a much smaller market, and one that requires the company to have CPU codepaths for everything they do on a GPU.
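In practice, blacklisting of the kind Kevin describes amounts to checking the detected hardware/driver pair against a known-bad list before enabling the GPU path. The sketch below is purely illustrative; the vendor names and driver version strings are made up, not Adobe's actual list or code.

```python
# Hypothetical sketch of driver blacklisting. The entries below are
# invented for illustration; a real list would come from QA findings.
BLACKLIST = {
    # (vendor, driver_version) pairs known to misreport features
    ("ExampleVendor", "8.15.10.1855"),
    ("ExampleVendor", "8.15.10.2021"),
}

def use_gpu_path(vendor, driver_version):
    """Fall back to the CPU codepath for known-bad combinations."""
    return (vendor, driver_version) not in BLACKLIST

print(use_gpu_path("ExampleVendor", "8.15.10.1855"))  # False -> CPU path
print(use_gpu_path("OtherVendor", "1.0"))             # True  -> GPU path
```

The important design point is the one Adobe stressed: the CPU codepath must exist anyway, so a blacklist hit simply routes around the GPU rather than failing.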
This led into a discussion on testing, the second common theme. The unfortunate reality is that GPUs are death to QA at the moment. CPUs are easy to test against, and while the underlying OS can change, it very rarely does; GPUs, meanwhile, change far more often, come in far more varieties, and have a driver layer that is constantly in flux. Case in point: Apple caused a lot of trouble for Adobe with the poorly performing NVIDIA drivers in the Mac OS X 10.6.4 update. For all practical purposes, using GPU acceleration in commercial software requires Adobe to maintain a significant automated GPU test suite to catch new driver regressions and other issues that can hamstring their applications, as manual testing alone would never catch everything the automated testing does.
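The core of such a regression suite is comparing the accelerated path against a trusted CPU reference within a tolerance. This is a generic sketch of the technique, not Adobe's suite; the "GPU path" here is a stand-in function, since real GPU code would depend on the installed driver.

```python
import math

# Generic regression check: run the same operation through the trusted
# CPU reference and the fast path, then flag any drift beyond a
# tolerance. gpu_path() is a placeholder for a driver-dependent path.
def cpu_reference(xs):
    return [math.sqrt(x) for x in xs]

def gpu_path(xs):
    # stand-in for an accelerated implementation
    return [x ** 0.5 for x in xs]

def regression_check(xs, tol=1e-6):
    ref, fast = cpu_reference(xs), gpu_path(xs)
    return all(abs(a - b) <= tol for a, b in zip(ref, fast))

print(regression_check([1.0, 2.0, 9.0, 100.0]))  # True
```

Run automatically against every new driver release, a check like this catches the silent numerical regressions that manual spot-testing misses.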
The third and final common theme was that GPUs aren’t a panacea and won’t solve everything. Amdahl’s Law is of particular importance here, as only certain functions can be parallelized. If a routine is inherently serial and cannot be done any other way, then parallelizing the rest of the code only makes sense if that serial routine is not already the biggest bottleneck; and using the GPU makes sense in the first place only if the bottlenecks can easily be made to run on a GPU and heavy CPU/GPU communication is not required, as those transfers are expensive.
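The ceiling Amdahl's Law imposes is easy to see with a few lines of arithmetic; this is a generic illustration of the law, not anything from Adobe's talk:

```python
# Amdahl's Law: if a fraction p of the runtime can be accelerated by a
# factor s (say, by moving it to the GPU), the overall speedup is
# bounded by the serial remainder (1 - p).
def amdahl_speedup(p, s):
    return 1.0 / ((1.0 - p) + p / s)

# Even a 1000x-faster GPU on 80% of the work yields under 5x overall,
# because the remaining 20% still runs serially.
print(round(amdahl_speedup(0.8, 1000), 2))  # 4.98
print(round(amdahl_speedup(0.8, 10), 2))    # 3.57
```

This is why Adobe's point matters: if the serial portion already dominates, no amount of GPU horsepower on the parallel portion will move the needle.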
The audience of an application also plays a big part in whether writing GPU code makes sense – most users won’t upgrade their computers for most programs, and most computers don’t have a suitable GPU for significant GPU computing. In Adobe’s case they focused on adding GPU computing first to the application most used by professionals likely to upgrade their hardware for the software, and that was Premiere Pro. Premiere Elements, on the other hand, is a hobbyist application, and hobbyists won’t upgrade their hardware for it.
Finally, as for where the company is going with GPU computing, they’re in much the same boat as Cyberlink: they want to use OpenCL but they’re waiting for AMD and NVIDIA to get their drivers in order (Adobe finds new bugs almost daily). Furthermore, unlike Cyberlink, they find that the architecture of the GPU has a vast impact on the performance of the codepaths they use, and when they do move to OpenCL they will likely maintain different codepaths in some cases to better match AMD’s and NVIDIA’s different architectures.
In the meantime Adobe believes that it’s quite alright to pick a vendor-specific API (i.e. CUDA) even though it limits what hardware is supported, as long as the benefits are worth it. This once more falls under the umbrella of a developer knowing their market: going back to Premiere Pro, it’s a product targeted at a professional market that’s likely to replace its hardware anyhow, and that market has no significant problem with the fact that Premiere Pro only supports a handful of NVIDIA cards. Adobe doesn’t want to be in this situation forever, but it’s a suitable compromise until OpenCL is ready for widespread use.
Ultimately Adobe’s view on GPU computing is much like Cyberlink’s: there’s a place for GPU computing in consumer applications, OpenCL is the way to get there, but OpenCL isn’t ready yet. The biggest difference between the two is their perspective on whether different codepaths for different architectures are necessary: Adobe says yes, Cyberlink says no. Cyberlink’s position is more consistent with the original cross-platform goals of OpenCL, while Adobe’s position is more pragmatic in recognizing that compilers can’t fully compensate when code is a poor match for the underlying architecture. As the consumer GPU computing market matures, it will be important to keep an eye on whether one, both, or neither of these positions is right, as the ease of writing high-performing GPU code is going to be a crucial factor in getting consumer GPU computing to take off for more than a handful of specialty applications.