DirectCompute, OpenCL, and the Future of CAL

As a journalist, GPGPU stuff is one of the more frustrating things to cover. The concept is great, but the execution makes it difficult to accurately cover, exacerbated by the fact that until now AMD and NVIDIA each had separate APIs. OpenCL and DirectCompute will unify things, but software will be slow to arrive.

As it stands, neither AMD nor NVIDIA have a complete OpenCL implementation that's shipping to end-users for Windows or Linux. NVIDIA has OpenCL working on the 8-series and later on Mac OS X Snow Leopard, and AMD has it working under the same OS for the 4800 series, but for obvious reasons we can’t test a 5870 in a Mac. As such it won’t be until later this year that we see either side get OpenCL up and running under Windows. Both NVIDIA and AMD have development versions that they're letting developers play with, and both have submitted implementations to Khronos, so hopefully we’ll have something soon.

It’s also worth noting that OpenCL is based around DirectX 10 hardware, so even after someone finally ships an implementation we’re likely to see a new version in short order. AMD is already talking about OpenCL 1.1, which would add support for the hardware features that they have from DirectX 11, such as append/consume buffers and atomic operations.

DirectCompute is in comparatively better shape. NVIDIA already supports it on their DX10 hardware, and the beta drivers we’re using for the 5870 support it on the 5000 series. The missing link at this point is AMD’s DX10 hardware; even the beta drivers we’re using don’t support it on the 2000, 3000, or 4000 series. From what we hear the final Catalyst 9.10 drivers will deliver this feature.

Going forward, one specific issue for DirectCompute development will be that there are three levels of DirectCompute, derived from DX10 (4.0), DX10.1 (4.1), and DX11 (5.0) hardware. The higher the version the more advanced the features, with DirectCompute 5.0 in particular being a big jump as it’s the first hardware generation designed with DirectCompute in mind. Among other notable differences, it’s the first version to offer double precision floating point support and atomic operations.

AMD is convinced that developers should and will target DirectCompute 5.0 due to its feature set, but we’re not sold on the idea. To say that there’s a “lot” of DX10 hardware out there is a gross understatement, and all of that hardware is capable of supporting at a minimum DirectCompute 4.0. Certainly DirectCompute 5.0 is the better API to use, but the first developers testing the waters may end up starting with DirectCompute 4.0. Releasing something written in DirectCompute 5.0 right now won’t do developers much good at the moment due to the low quantity of hardware out there that can support it.

With that in mind, there’s not much of a software situation to speak about when it comes to DirectCompute right now. Cyberlink demoed a version of PowerDirector using DirectCompute for rendering effects, but it’s the same story as most DX11 games: later this year. For AMD there isn’t as much of an incentive to push non-game software as fast or as hard as DX11 games, so we’re expecting any non-game software utilizing DirectCompute to be slow to materialize.

Given that DirectCompute is the only common GPGPU API that is currently working on both vendors’ cards, we wanted to try to use it as the basis of a proper GPGPU comparison. We did get something that would accomplish the task, unfortunately it was an NVIDIA tech demo. We have decided to run it anyhow as it’s quite literally the only thing we have right now that uses DirectCompute, but please take an appropriately sized quantity of salt – it’s not really a fair test.

NVIDIA’s ocean demo is a fairly simple proof of concept program that uses DirectCompute to run Fast Fourier transforms directly on the GPU for better performance. The FFTs in turn are used to generate the wave data, forming the wave action seen on screen as part of the ocean. This is a DirectCompute 4.0 program, as it’s intended to run on NVIDIA’s DX10 hardware.

The 5870 has no problem running the program, and in spite of whatever home field advantage that may exist for NVIDIA it easily outperforms the GTX 285. Things get a little more crazy once we start using SLI/Crossfire; the 5870 picks up speed, but the GTX 295 ends up being slower than the GTX 285. As it’s only a tech demo this shouldn’t be dwelt on too much beyond the fact that it’s proof that DirectCompute is indeed working on the 5800 series.

Wrapping things up, one of the last GPGPU projects AMD presented at their press event was a GPU implementation of Bullet Physics, an open source physics simulation library. Although they’ll never admit it, AMD is probably getting tired of being beaten over the head by NVIDIA and PhysX; Bullet Physics is AMD’s proof that they can do physics too. However we don’t expect it to go anywhere given its very low penetration in existing games and the amount of trouble NVIDIA has had in getting developers to use anything besides Havok. Our expectations for GPGPU physics remains the same: the unification will come from a middleware vendor selling a commercial physics package. If it’s not Havok, then it will be someone else.

Finally, while AMD is hitting the ground running for OpenCL and DirectCompute, their older APIs are being left behind as AMD has chosen to focus all future efforts on OpenCL and DirectCompute. Brook+, AMD’s high level language, has been put out to pasture as a Sourceforge project. Compute Abstract Layer (CAL) lives on since it’s what AMD’s OpenCL support is built upon, however it’s not going to see any further public development with the interface frozen at the current 1.4 standard. AMD is discouraging any CAL development in favor of OpenCL, although it’s likely the High Performance Computing (HPC) crowd will continue to use it in conjunction with AMD’s FireStream cards to squeeze every bit of performance out of AMD’s hardware.

The First DirectX 11 Games Eyefinity
Comments Locked

327 Comments

View All Comments

  • ClownPuncher - Wednesday, September 23, 2009 - link

    Absolutely, I can answer that for you.

    Those 2 "ports" you see are for aesthetic purposes only, the card has a shroud internally so those 2 ports neither intake nor exhaust any air, hot or otherwise.
  • Ryan Smith - Wednesday, September 23, 2009 - link

    ClownPuncher gets a cookie. This is exactly correct; the actual fan shroud is sealed so that air only goes out the front of the card to go outside of the case. The holes do serve a cooling purpose though; allow airflow to help cool the bits of the card that aren't hooked up to the main cooler; various caps and what have you.
  • SiliconDoc - Wednesday, September 23, 2009 - link

    Ok good, now we know.
    So the problem now moves to the tiny 1/2 exhaust port on the back, did you stick your hand there and see how much that is blowing ? Does it whistle through there ? lol
    Same amount of air(or a bit less) in half the exit space... that's going to strain the fan and or/reduce flow, no matter what anyone claims to the contrary.
    It sure looks like ATI is doing a big favor to aftermarket cooler vendors.

  • GhandiInstinct - Wednesday, September 23, 2009 - link

    Ryan,

    Developers arent pushing graphics anymore. Its not economnical, PC game supports is slowing down, everything is console now which is DX9. what purpose does this ATI serve with DX11 and all this other technology that won't even make use of games 2 years from now?

    Waste of money..
  • ClownPuncher - Wednesday, September 23, 2009 - link

    Clearly he should stop reviewing computer technology like this because people like you are content with gaming on their Wii and iPhone.

    This message has been brought to you by Sarcasm.
  • Griswold - Wednesday, September 23, 2009 - link

    So you're echoing what nvidia recently said, when they claimed dx11/gaming on the PC isnt all that (anymore)? I guess nvidia can close shop (at least the gaming relevant part of it) now and focus on GPGPU. Why wait for GT300 as a gamer?

    Oh right, its gonna be blasting past the 5xxx and suddenly dx11 will be the holy grail again... I see how it is.
  • SiliconDoc - Wednesday, September 23, 2009 - link

    rofl- It's great to see red roosters not crowing and hopping around flapping their wings and screaming nvidia is going down.
    Don't take any of this personal except the compliments, you're doing a fine job.
    It's nice to see you doing my usual job, albiet from the other side, so allow me to compliment your fine perceptions. Sweltering smart.
    But, now, let's not forget how ambient occlusion got poo-pooed here and shading in the game was said to be "an irritant" when Nvidia cards rendered it with just driver changes for the hardware. lol
    Then of course we heard endless crowing about "tesselation" for ati.
    Now it's what, SSAA (rebirthed), and Eyefinity, and we'll hear how great it is for some time to come. Let's not forget the endless screeching about how terrible and useless PhysX is by Nvidia, but boy when "open standards" finally gets "Havok and Ati" cranking away, wow the sky is the limit for in game destruction and water movement and shooting and bouncing, and on and on....
    Of course it was "Nvidia's fault" that "open havok" didn't happen.
    I'm wondering if 30" top resolution will now be "all there is!" for the next month or two until Nvidia comes out with their next generation - because that was quite a trick switching from top rez 30" DOWN to 1920x when Nvidia put out their 2560x GTX275 driver and it whomped Ati's card at 30" 2560x, but switched places at 1920x, which was then of course "the winning rez" since Ati was stuck there.
    I could go on but you're probably fuming already and will just make an insult back so let the spam posting IZ2000 or whatever it's name will be this time handle it.
    BTW there's a load of bias in the article and I'll be glad to point it out in another post, but the reason the red rooster rooting is not going beyond any sane notion of "truthful" or even truthiness, is because this 5870 Ati card is already percieved as " EPIC FAIL" !
    I cannot imagine this is all Ati has, and if it is they are in deep trouble I believe.
    I suspect some further releases with more power soon.



  • Finally - Wednesday, September 23, 2009 - link

    Team Green - full foam ahead!
    *hands over towel*
    There you go. Keep on foaming, I'm all amused :)
  • araczynski - Wednesday, September 23, 2009 - link

    is DirectX11 going to be as worthless as 10? in terms of being used in any meaningful way in a meaningful amount of games?

    my 2 4850's are still keeping me very happy in my 'ancient' E8500.

    curious to see how this compares to whatever nvidia rolls out, probably more of the same, better in some, worse in others, bottom line will be the price.... maybe in a year or two i'll build a new system.

    of course by that time these'll be worthless too.
  • SiliconDoc - Wednesday, September 23, 2009 - link

    Well it's certainly going to be less useful than PhysX, which is here said to be worthless, but of course DX11 won't get that kind of dissing, at least not for the next two months or so, before NVidia joins in.
    Since there's only 1 game "kinda ready" with DX11, I suppose all the hype and heady talk will have to wait until... until... uhh.. the 5870's are actually available and not just listed on the egg and tiger.
    Here's something else in the article I found so very heartwarming:
    ---
    " Wrapping things up, one of the last GPGPU projects AMD presented at their press event was a GPU implementation of Bullet Physics, an open source physics simulation library. Although they’ll never admit it, AMD is probably getting tired of being beaten over the head by NVIDIA and PhysX; Bullet Physics is AMD’s proof that they can do physics too. "
    ---
    Unfortunately for this place,one of my friends pointed me to this little expose' that show ATI uses NVIDIA CARDS to develope "Bullet Physics" - ROFLMAO
    -
    " We have seen a presentation where Nvidia claims that Mr. Erwin Coumans, the creator of Bullet Physics Engine, said that he developed Bullet physics on Geforce cards. The bad thing for ATI is that they are betting on this open standard physics tech as the one that they want to accelerate on their GPUs.

    "ATI’s Bullet GPU acceleration via Open CL will work with any compliant drivers, we use NVIDIA Geforce cards for our development and even use code from their OpenCL SDK, they are a great technology partner. “ said Erwin.

    This means that Bullet physics is being developed on Nvidia Geforce cards even though ATI is supposed to get driver and hardware acceleration for Bullet Physics."
    ---
    rofl - hahahahahha now that takes the cake!
    http://www.fudzilla.com/content/view/15642/34/">http://www.fudzilla.com/content/view/15642/34/
    --
    Boy do we "hate PhysX" as ati fans, but then again... why not use the nvidia PhysX card to whip up some B Physics, folks I couldn't make this stuff up.

Log in

Don't have an account? Sign up now