Lucid's Multi-GPU Wonder: More Information on the Hydra 100

Author: Derek Wilson

by Derek Wilson on August 22, 2008 4:00 PM EST


Moving Machine Code Around

Lastly, there is another elephant in the room. Hydra takes in API calls, but what does it send out to the GPUs? It can't send API calls to the hardware itself, because the GPUs wouldn't know what to do with them. It must send machine code generated by the graphics driver. So all the difficult analysis of API calls, grouping of tasks, and load balancing has to happen in software, in the driver.

We really don't have any idea how this would work in the real world, but it seems like they'd have to send batches of either tagged or timed API calls to the driver and tell their chip which GPU is going to get each set. The silicon would then send the newly generated machine code down the pipe to the appropriate GPU until it was told otherwise. And of course, the chip would also request and composite the pixel data and send it back to the display device.

But that would have to incur a CPU load, right? We really want and need more details before we can truly understand what is happening in this software and hardware combination. It is absolutely certain, though, that the only practical way to do this is for the hardware itself to be switching machine code rather than API calls. And since the hardware also has almost no memory, it can't be doing the analysis either.

The progression has to be: game -> Hydra software -> graphics card driver -> Hydra 100 hardware -> multiple graphics cards. Managing this seems like it would be a bit cumbersome. It seems like they'd have to either set a register on the hardware to tell it which direction to send the next set of commands, or embed something in the commands being sent to the GPUs that would help the hardware figure out where to send the data. Or maybe it can trick the graphics driver into thinking the destination of the one graphics card it is rendering to changes multiple times in a frame. We don't know, but we really want to.
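To make the first two routing options concrete, here is a toy Python model. It is purely a sketch of our speculation: nothing here reflects Lucid's actual design, and `Port`, `RegisterSwitch`, `TaggedSwitch`, and the packet layout are hypothetical names invented for illustration. The first switch forwards every command buffer to whichever port a software-set register currently selects; the second reads a destination tag embedded in each buffer.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    """Stands in for one GPU hanging off the switch (hypothetical)."""
    name: str
    received: list = field(default_factory=list)


class RegisterSwitch:
    """Option 1: software writes a register; all traffic follows it."""

    def __init__(self, ports):
        self.ports = ports
        self.target = 0  # the "register" selecting the current port

    def set_target(self, index):
        if not 0 <= index < len(self.ports):
            raise ValueError("no such port")
        self.target = index

    def send(self, payload):
        self.ports[self.target].received.append(payload)


class TaggedSwitch:
    """Option 2: each buffer carries its own destination tag."""

    def __init__(self, ports):
        self.ports = ports

    def send(self, packet):
        tag, payload = packet  # assumed layout: (port index, commands)
        self.ports[tag].received.append(payload)


def demo():
    """Route the same two command buffers through both kinds of switch."""
    a = [Port("gpu0"), Port("gpu1")]
    reg = RegisterSwitch(a)
    reg.send("draw terrain")
    reg.set_target(1)  # extra software step between batches
    reg.send("draw characters")

    b = [Port("gpu0"), Port("gpu1")]
    tagged = TaggedSwitch(b)
    for packet in [(0, "draw terrain"), (1, "draw characters")]:
        tagged.send(packet)

    return [p.received for p in a], [p.received for p in b]
```

Both switches deliver the same buffers to the same GPUs. The difference is where the bookkeeping lives: the register approach needs an extra write from software every time the destination changes, while the tagged approach needs the hardware to parse something out of every buffer.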

We do also get the added benefit that it offers two dedicated x16 PCIe links to the graphics hardware. This means that it can be very efficient in handling data movement between the two cards, and it can do full upstream and downstream communication with both at the same time. Since it doesn't have memory of its own, to composite the frame it needs to do two reads and then a write while it's doing the next two reads. It's gotta keep things moving, which it can do very quickly and efficiently with all the PCIe lanes available to it and both graphics cards.
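The "two reads, then a write" pattern can be sketched as a streaming merge that never holds more than one chunk from each card. This is only an illustration under our own assumptions: we don't know how Hydra decides which card's pixel wins, so the sketch assumes a pixel value of `None` means "this GPU did not render this pixel." The names `composite_stream` and `scanlines` are made up for the example.

```python
def composite_stream(stream_a, stream_b):
    """Merge two pixel streams chunk by chunk, holding one chunk of each.

    Assumption: a pixel value of None means "this GPU did not render
    this pixel", so the other GPU's value is used instead.
    """
    for chunk_a, chunk_b in zip(stream_a, stream_b):  # two reads
        yield [b if a is None else a                  # one write
               for a, b in zip(chunk_a, chunk_b)]


def scanlines(frame):
    """Yield a frame one row at a time, like a read over the bus."""
    for row in frame:
        yield row


# Two partial frames: each GPU rendered a different part of the scene.
gpu0 = [[1, 1, None, None],
        [1, None, None, None]]
gpu1 = [[None, None, 2, 2],
        [None, 2, 2, 2]]

display = list(composite_stream(scanlines(gpu0), scanlines(gpu1)))
# display == [[1, 1, 2, 2], [1, 2, 2, 2]]
```

A generator runs its steps one after another, so it doesn't show the overlap of writing one chunk while reading the next two. What it does show is that the merge needs no frame buffer on the chip, only enough storage for the chunks in flight.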

Also note that this really does dedicate all the graphics resources available (memory, geometry, pixel, etc.) to the processing of the scene. Each card even sees full PCIe x16 bandwidth (when the Hydra 100 is talking to it anyway), and the switching back and forth can actually act as another way to hide latency (one card continues to process data while the other is receiving and then back again).
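For a rough sense of how much of an x16 link the pixel readback might use, here is some back-of-the-envelope arithmetic in Python. The resolution, 4 bytes per pixel, 60 frames per second, and the choice of a first-generation PCIe link are all our assumptions, not numbers from Lucid. The per-lane rates are the standard ones for PCIe 1.x (2.5 GT/s) and 2.0 (5 GT/s) with 8b/10b encoding, and the calculation ignores protocol overhead.

```python
def x16_bandwidth_bytes(gt_per_s, lanes=16):
    """Raw bytes/s per direction for a PCIe 1.x/2.0 link (8b/10b encoding)."""
    bits_per_s = gt_per_s * 1e9 * lanes * 8 / 10  # 8 data bits per 10 on the wire
    return bits_per_s / 8


def readback_bytes_per_s(width, height, bytes_per_pixel=4, fps=60):
    """Bytes/s needed to pull one full frame per refresh from a card."""
    return width * height * bytes_per_pixel * fps


gen1 = x16_bandwidth_bytes(2.5)            # 4e9 bytes/s per direction
need = readback_bytes_per_s(2560, 1600)    # assumed 2560x1600 at 60 fps
fraction = need / gen1                     # share of one direction of the link
```

Under these assumptions a full-frame readback takes roughly a quarter of one direction of a first-generation x16 link, and less if each card only returns the part of the frame it rendered. That leaves most of the link free for the command traffic heading the other way.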


57 Comments


jeff4321 - Sunday, August 24, 2008
If you think that NVIDIA and AMD have been stagnant, you haven't seen the graphics industry change. The basic graphics pipeline hasn't changed. It simply got smaller. A current NVIDIA or ATI GPU probably has as much computation power as an SGI workstation from the 90's. GPGPU is a natural extension of graphics hardware. Once the graphics hardware becomes powerful enough, it starts to resemble a general purpose machine, so you build it that way. It's possible because the design space for the GPU can do more (Moore's Law).

Since it's early in the deployment of using a GPU as an application-defined co-processor, I would expect there to be competing APIs. Believe it or not, in the late eighties, x87 wasn't the only floating point processor available for x86's. Intel's 387 was slower than Weitek's floating point unit. Weitek lost because the next generation CPUs at the time started integrating floating point. Who will win? The team that has better development tools or the team that exclusively runs the next killer app.

Dynamically changing between AFR and splitting the scene is hard to do. I'm sure that ATI and NVIDIA have experimented w/ this in-house and they are either doing it now, or they have decided that it kills performance because of the overhead to change it on the fly. How Lucid can do better than the designers of the device drivers and ASICs, I don't know.

Lucid Hydra is not competition for either NVIDIA or ATI. The Lucid Hydra chip is a mechanism for the principals of the company to get rich when Intel buys them to get access to Multi-GPU software for Larrabee. It'll be a good deal for the principals, but probably a bad deal for Intel.

Licensing Crossfire and SLI is a business decision. Both technologies cost a bundle to develop. Both companies want to maximize return.

AnnonymousCoward - Saturday, August 23, 2008
I'm afraid this solution will cause unacceptable lag. If the lag isn't inherent, maybe the solution will require a minimum "max frames to render ahead / Prerender limit". I don't buy their "negligible" BS answer.

Does SLI require a minimum? I got the impression it does, from what I've read in the past. I don't have SLI, and use RivaTuner to set mine to "1".

Aethelwolf - Saturday, August 23, 2008
Let's pretend, if only for a moment, that I was a GPU company interested in giving a certain other GPU company a black eye. And let's say I have this strategy where I design for the middle range and then scale up and down. I would be seriously haggling with Lucid right now to become a partner in supplying me, and pretty much only me, besides Intel, with their Hydra engine.

DerekWilson - Saturday, August 23, 2008
that'd be cool, but lucid will sell more parts if they work with everyone.

they're interested in making lots of money ... maybe amd and intel could do that for them, but i think the long term solution is to support as much as possible.

Sublym3 - Saturday, August 23, 2008
Correct me if I am wrong, but isn’t this technology still dependent on making the hardware specifically for each DirectX version?

So when a new DirectX or OpenGL version comes out, not only will we have to update our video cards but also our motherboard at the same time?

Not to mention this will probably jack up the price on already expensive motherboards.

Seems like a step backwards to me...

DerekWilson - Saturday, August 23, 2008
you are both right and wrong --

yes, they need to update the technology for each new directx and opengl release.

BUT

they don't need to update the hardware at all. the hardware is just a smart switch with a compositor.

to support a new directx or opengl version, you would only need to update the driver / software for the hydra 100 ...

just like a regular video card.

magao - Saturday, August 23, 2008
There seems to be a strong correlation between Intel's claims about Larrabee, and Lucid's claims about Hydra.

This is pure speculation, but I wouldn't be surprised if Hydra is the behind-the-scenes technology that makes Larrabee work.

Aethelwolf - Saturday, August 23, 2008
I think this is the case. Hydra and Larrabee appear to be made for each other. I won't be surprised if they end up mating.

From a programmer's view, Larrabee is very, very exciting tech. If it fails in the PC space, it might be resurrected when next-gen consoles come along, since it is fully programmable and claims linear performance (thanks to Hydra?).

DerekWilson - Saturday, August 23, 2008
i'm sure intel will love hydra for allowing their platforms to support linear scaling with multigpu solutions.

but larrabee won't have anything near the same scaling issues that nvidia and amd have in scaling to multi-gpu -- larrabee may not even need this to get near linear scaling in multigpu situation.

essentially they just need to build an smp system and it will work -- shared mem and all ...

their driver would need to optimize differently, but that would be about it.

GmTrix - Saturday, August 23, 2008
If larrabee doesn't need hydra to get near linear scaling isn't hydra just providing a way for amd and nvidia to compete with it?
