Moving Machine Code Around

Lastly, there is another elephant in the room. Hydra takes in API calls, but what does it send out to the GPUs? It can't send the hardware itself API calls, because the GPUs wouldn't know what to do with them. It must send machine code generated by the graphics driver. So all the difficult analysis of API calls, grouping of tasks, and load balancing has to happen in software, in the driver stack.

We really don't have any idea how this would work in the real world, but it seems like they'd have to send batches of either tagged or timed API calls to the driver and tell their chip which GPU is going to get each set. The silicon would then send the newly generated machine code down the pipe to the appropriate GPU until it was told otherwise, or something along those lines. And of course, the chip would also request and composite the pixel data and send it back to the display device.

But that would have to put some load on the CPU, right? We really want and need more details before we can truly understand what is happening in this software and hardware combination. It is absolutely certain, though, that the only practical way to do this is for the hardware itself to be switching machine code rather than API calls. And since the hardware also has almost no memory, it can't be doing analysis either.

The progression has to be: game -> Hydra software -> graphics card driver -> Hydra 100 hardware -> multiple graphics cards. Managing this seems like it would be a bit cumbersome. It seems like they'd have to either set a register on the hardware to tell it which direction to send the next set of commands, or embed something in the commands being sent to the GPUs that would help the hardware figure out where to send the data. Or maybe it can trick the graphics driver into thinking the destination of the one graphics card it is rendering to changes multiple times in a frame. We don't know, but we really want to.
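To make the first two guesses concrete, here is a minimal Python sketch of a switch that does no analysis and only forwards driver-generated command blobs. Everything in it is our invention for illustration: the `Batch` and `Switch` names, the `gpu_tag` field, and the `target_register` are assumptions, not anything Lucid has described.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Batch:
    """A blob of driver-generated machine code (hypothetical layout)."""
    payload: bytes
    gpu_tag: Optional[int] = None  # None means "no embedded tag, use the register"


@dataclass
class Switch:
    """Hypothetical stand-in for the Hydra 100: it forwards, it never analyzes."""
    num_gpus: int
    target_register: int = 0
    queues: Dict[int, List[bytes]] = field(default_factory=dict)

    def set_target(self, gpu: int) -> None:
        # Guess 1: software writes a register before sending the next commands.
        if not 0 <= gpu < self.num_gpus:
            raise ValueError("no such GPU")
        self.target_register = gpu

    def forward(self, batch: Batch) -> int:
        # Guess 2: a tag embedded in the commands overrides the register.
        gpu = self.target_register if batch.gpu_tag is None else batch.gpu_tag
        if not 0 <= gpu < self.num_gpus:
            raise ValueError("no such GPU")
        self.queues.setdefault(gpu, []).append(batch.payload)
        return gpu
```

Either way, the decision about which card gets which work is made upstream in software; the switch only needs enough state to hold a single destination.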

We do also get the added benefit that it offers a dedicated pair of x16 PCIe links to the graphics hardware. This means that it can be very efficient in handling the data movement between the two cards, and it can do full upstream and downstream communication with both at the same time. Since it has almost no memory of its own, to composite the frame it needs to do two reads and then a write while it's doing the next two reads. It's gotta keep things moving, which it can do very quickly and efficiently with all the PCIe lanes available to it and both graphics cards.
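As a rough illustration of compositing without a frame buffer, the Python sketch below merges two pixel streams one chunk at a time. The merge rule is purely our assumption: we pretend each card marks pixels it did not render as `None`, and we fall back to 0 when neither card has a value. How Hydra actually combines the two images has not been disclosed.

```python
from typing import Iterable, Iterator, List, Optional

Pixel = Optional[int]  # None marks a pixel this card did not render (assumed)


def composite_stream(
    card_a: Iterable[List[Pixel]],
    card_b: Iterable[List[Pixel]],
) -> Iterator[List[int]]:
    """Merge two chunked pixel streams while holding one chunk per card."""
    for chunk_a, chunk_b in zip(card_a, card_b):  # the two reads
        merged = []
        for a, b in zip(chunk_a, chunk_b):
            if a is not None:
                merged.append(a)
            elif b is not None:
                merged.append(b)
            else:
                merged.append(0)  # neither card rendered it; assumed default
        yield merged  # the one write
```

The generator never holds more than one chunk from each card, which is the point. Real hardware would presumably overlap the write with the next pair of reads; this sequential sketch does not model that overlap.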

Also note that this really does dedicate all the graphics resources available (memory, geometry, pixel, etc.) to the processing of the scene. Each card even sees full PCIe x16 bandwidth (when the Hydra 100 is talking to it anyway), and the switching back and forth can actually act as another way to hide latency (one card continues to process data while the other is receiving and then back again).


57 Comments

  • Spivonious - Monday, August 25, 2008 - link

    I don't think nVidia or AMD will try to force Lucid out of the market. If I can actually get a 100% increase in performance from purchasing a second video card, I will.

    This chip only means more sales for nVidia and AMD.
  • 7Enigma - Tuesday, August 26, 2008 - link

    But that doesn't help their bottom line in the end. Right now CF and SLI are not very popular due to their scaling and custom profile issues. Because of that, many people spring for the highest priced single card they can afford. This keeps the market segment basically tiered the way any business would like. You have low end parts, mid-grade, and uber parts.

    Now throw in the possibility that this Hydra chip works as specified. That 3 tier system just fell apart. When you look at most of the non-mainstream parts from both sides (for example Nvidia's 280, 200, and say 9800/8800GTS), you'll notice that while the prices of those chips are drastically different, the performance is not nearly as different. This makes sense from an R&D standpoint to recoup costs, but from a logical standpoint shelling out $650 for the 280 when it debuted WOULD NOT make sense if 2 200's or 2 9800's were significantly faster for the same or less total $$$.

    That's why both ATI/AMD and Nvidia don't want them in the market. It destroys the pricing structure, and would place much more influence on the bang for the buck part (currently this would hurt Nvidia with their 280 and favor slightly ATI/AMD with their cheaper 4870 and 4850).

    Why would I spend twice as much for a 30% increase in performance with a top of the line single card solution, when I could just get two of the cheaper version for a near 60% increase over the single top card (using general performance of the latest cards)? Sure I'd need a board to support it, but it would make the SLI/CF mobos MUCH MUCH more attractive than they currently are (I have no plans to purchase a dual-slot mobo with my upcoming build....unless we can get some actual data before Jan09...not likely).
  • jnanster - Tuesday, August 26, 2008 - link

    This is terrible!
    I was all set to buy a new system in a few months.
    Now I have to wait again, again.
  • shin0bi272 - Tuesday, August 26, 2008 - link

    lol sorry dude... but hey this way you can wait for 8 core nehalem cpus too.
  • TheDoc9 - Monday, August 25, 2008 - link

    This article reads like the same sort of hype-machine drivel that many of the dot-com wonder companies used before the 2001 collapse so they could get investors interested.

    The writer of this piece is fortunately skeptical and he should be, more so even. I hope I'm wrong and we see this technology in a year or so, but it reminds me of Constellation 3D.
  • shin0bi272 - Sunday, August 24, 2008 - link

    The way they outlined it in one of their diagrams is, an instruction usually goes from the cpu to the northbridge to the gpus and then the gpus sort out which card should render the command. The Hydra changes that to, cpu to northbridge to hydra to whichever gpu is ready for a new instruction. Which means it's essentially taking the place of the little bridge between the gpus and the chip that makes the decision on which card is rendering the scene.

    Nvidia and AMD could have put a chip like this on their motherboards yeah but then you wouldn't need to buy 2 of the same card (and it would possibly work for the competitors' cards too like the hydra does). Nvidia never tried a motherboard chip to my knowledge and ati did at one time do a y cable and software controlled card selection. But I don't believe that they had a chip on the motherboard either. That reminds me of the difference between a software raid5 card and a hardware raid5 card. The hardware raid card has much better performance but it costs 3x as much. Cost could still be a factor with this chip too. I mean if it adds an extra 20 or 50 dollars to a motherboard gamers will have no problem with that. But if it's an extra 200 dollars would they? Gotta make back all that R&D money somehow even if Intel backed them.

    Another question is will this solution require a multi y-cable type device like ati used to do? If the different cards are rendering the scene at different times it would stand to reason. Or will one card be designated as the output card and all finished scenes be sent to that card? That would probably be a bad idea latency wise but who wants to buy a 4way y-split cable? Then again if im going to get linear performance out of sli I can spring for a cable. Could even make a 4 way hub sort of device so that all of the cards would feed into it and then one into the monitor. Could also do a multi-in and multi-out hub to do multiple monitors (though you might not need to do that it could be easier to add and subtract monitors with one).
  • computerfarmer - Sunday, August 24, 2008 - link

    It is nice to hear about new products. I hope to see this work.

    I am still waiting for the AMD 790GX/SB750 review.
  • MamiyaOtaru - Sunday, August 24, 2008 - link

    What are the odds this will be cross platform? If it relies on drivers for doing a lot of stuff odds are it will not be, which would make it a nonstarter for me. And yes I know close to no one cares ;) I do though and I'd be interested to know.
  • metro15 - Sunday, August 24, 2008 - link

    hey. they do not need any motherboard manufacturer. Imagine an Intel Larrabee graphics card with many cores synchronized with the Lucid chip. The performance would be unbeatable.
  • pool1892 - Sunday, August 24, 2008 - link

    larrabee does not need hydra. it will reconfigure itself to suit the load. and with something like larrabee gen2 it will have qpi, which results in much lower latencies and much higher bandwidth.
    larrabee could even achieve more than linear scaling. (theoretically more cores could result in fewer context changes, which means more cache hits and fewer waiting cycles - this will of course not happen in reality)
