Where AFR Is Mediocre, and How Hydra Can Be Better

It’s best to start with how NVIDIA and AMD handle modern multi-GPU linking. After some earlier experimentation, both have settled on a method called Alternate Frame Rendering (AFR), which, as the name implies, has each card render a different frame.

The advantage of AFR is that it’s relatively easy to implement – each card doesn’t need to know what the other card is doing beyond simple frame synchronization. The driver in turn needs to do some work managing things in order to keep each GPU fed and timed correctly (not to mention coaxing another frame out of the CPU for rendering).

However, even as simple as AFR is, it isn’t foolproof and it isn’t flawless. Making it work at its peak level of performance requires some understanding of the game being run, which is why even such a “dumb” method still needs game profiles. Furthermore, it comes with a few inherent drawbacks:

  1. Each GPU needs to get a frame done in the same amount of time as the other GPUs.
  2. Because of the timing requirement, the GPUs can’t differ in processing capabilities. AFR works best when they are perfectly alike.
  3. Dealing with games where the next frame is dependent on the previous one is hard.
  4. Even with matching GPUs, if your driver gets the timing wrong, it can render frames at an uneven pace. Frames need to be spaced apart equally – when this fails to happen you get microstuttering.
  5. AFR has multiple GPUs working on different frames, not the same frame. This means that frame throughput increases, but the latency of any individual frame does not improve. So if a single card gets 30fps and takes 33.3ms to render a frame, a pair of cards in AFR get 60fps but still take 33.3ms to render each frame.
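
The throughput-versus-latency point in item 5 and the pacing problem in item 4 can be worked through numerically. What follows is a minimal Python sketch of an idealized two-GPU AFR setup. The function names and the simple timing model (a fixed render time per frame, no CPU or driver overhead) are assumptions made for illustration, not anything NVIDIA or AMD document.

```python
def afr_stats(frame_time_ms, num_gpus):
    """Return (frames per second, per-frame latency in ms) for idealized AFR."""
    # Each GPU still spends the full frame_time_ms on its own frame,
    # so the latency of any single frame does not change.
    latency_ms = frame_time_ms
    # With perfect pacing, a finished frame arrives every frame_time_ms / num_gpus.
    fps = 1000.0 / (frame_time_ms / num_gpus)
    return fps, latency_ms


def frame_intervals(frame_time_ms, offset_ms, frames=6):
    """Gaps between successive finished frames for two GPUs in AFR.

    offset_ms is how far the second GPU's start lags behind the first
    (a hypothetical knob standing in for the driver's frame pacing).
    """
    finish_times = sorted(
        gpu * offset_ms + (n + 1) * frame_time_ms
        for n in range(frames // 2)
        for gpu in range(2)
    )
    return [b - a for a, b in zip(finish_times, finish_times[1:])]


# One card at 30fps versus two cards in AFR: fps doubles, latency does not move.
print(afr_stats(1000.0 / 30.0, 1))
print(afr_stats(1000.0 / 30.0, 2))

# 40ms frames: a 20ms offset gives even pacing, a 5ms offset gives microstutter.
print(frame_intervals(40, 20))  # [20, 20, 20, 20, 20]
print(frame_intervals(40, 5))   # [5, 35, 5, 35, 5]
```

In the second pacing example, the average frame rate is identical to the first, but frames arrive in tight pairs followed by long gaps, which is what is perceived as microstutter.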

Despite those drawbacks, for the most part AFR works. Particularly if you’re not highly sensitive to lag or microstuttering, it can get very close to doubling the framerate in a 2-card configuration (though scaling becomes less efficient with more cards).

Lucid believes it can do better, particularly when it comes to matching cards. AFR needs matching cards for timing reasons, because it can’t actually split up a single frame. Hydra does split up frames, which gives it two big advantages over AFR: rendering can be done by dissimilar GPUs, and rendering latency is reduced.

Right now, the ability to use dissimilar GPUs is the primary marketing focus behind the Hydra technology, and the big selling point for the Hydra and Fuzion. Lucid and MSI will both be focusing almost exclusively on that ability when it comes to pushing the two products. What you won’t see them focusing on is the performance versus AFR, the difference in latency, or game compatibility for that matter.

So how does the Hydra work? We covered this last year when Lucid first announced the Hydra, so we’re not going to cover it in depth again. However, here’s a quick refresher.

As the Hydra technology is based upon splitting up the job of rendering the objects in a frame, the first task is to intercept all Direct3D or OpenGL calls and to make some determinations about what is going to be rendered. This is the job of Lucid’s driver, and this is where most of the “magic” is in the Hydra technology. The driver needs to determine roughly how much work each object will need, look at inter-frame dependencies, and finally look at the relative power of each GPU.

Once the driver has determined how to best split up the frame, it interfaces with each video card’s driver and hands it a partial frame composed of only the bits that card needs to render. The Hydra then reads back the partial frames and composites them into one whole frame. Finally, the complete frame is sent out to the primary GPU (the GPU the monitor is plugged into) to be displayed.
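
Lucid does not disclose how its driver actually divides the work, so the following Python sketch is only an illustration of the general idea: handing out objects to dissimilar GPUs in proportion to their relative power. The greedy heuristic used here (give each object, largest first, to whichever GPU would finish it soonest), the `split_frame` name, the cost numbers and the power ratings are all hypothetical, and the sketch ignores inter-frame dependencies and compositing entirely.

```python
def split_frame(object_costs, gpu_power):
    """Assign each object to a GPU so that estimated finish times stay balanced.

    object_costs: estimated rendering work per object (arbitrary units).
    gpu_power:    relative throughput per GPU (work units per millisecond).
    Returns a dict mapping GPU name -> list of object names.
    """
    assignment = {gpu: [] for gpu in gpu_power}
    load = {gpu: 0.0 for gpu in gpu_power}

    # Place the most expensive objects first; small ones fill the gaps later.
    by_cost = sorted(object_costs.items(), key=lambda kv: kv[1], reverse=True)
    for name, cost in by_cost:
        # Pick the GPU whose projected finish time is lowest if it takes this object.
        best = min(gpu_power, key=lambda g: (load[g] + cost) / gpu_power[g])
        assignment[best].append(name)
        load[best] += cost
    return assignment


# Hypothetical frame: five objects, one GPU twice as fast as the other.
objects = {"terrain": 6, "characters": 4, "particles": 3, "sky": 2, "hud": 1}
gpus = {"fast": 2.0, "slow": 1.0}

print(split_frame(objects, gpus))
# {'fast': ['terrain', 'particles', 'sky'], 'slow': ['characters', 'hud']}
```

With these made-up numbers the fast GPU gets 11 units of work and the slow GPU 5, so both finish in roughly the same time (5.5ms versus 5ms). If the cost estimates are wrong, one GPU ends up waiting on the other.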

All of this analysis and compositing is quite difficult to do, which is in part why AMD and NVIDIA moved away from frame-splitting schemes, and it is what makes Hydra’s method the “hard” method. Compared to AFR, it takes a great deal more work to split up a frame by objects and to render them on different GPUs.

As with AFR, this method has some drawbacks:

  1. You can still microstutter if you get the object allocation wrong. Some frames may put too much work on the weaker GPU.
  2. Since you can use mismatched cards, you can’t always use “special” features like Coverage Sampling Anti-Aliasing unless both cards have the feature.
  3. Synchronization still matters.
  4. Individual GPUs need to be addressable. This technology doesn’t work with multi-GPU cards like the Radeon 5970 or the GeForce GTX 295.

This is also a good time to quickly mention the hardware component of the Hydra. The Hydra 200 is a combination PCIe bridge chip, RISC processor, and compositing engine. Lucid won’t tell us too much about it, but we know the RISC processor contained in it runs at 300MHz and is based on Tensilica’s Diamond architecture. The version of the Hydra being used in the Fuzion is Lucid’s highest-end part, the LT24102, which features 48 PCIe 2.0 lanes (16 up, 32 down). This chip is 23mm2 and consumes 5.5W. We do not have any pictures of the die, nor do we know the transistor count, but you can count on it using relatively few transistors (perhaps 100M?).
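
To put the 16-up, 32-down lane split in perspective, here is a short Python calculation of the raw link bandwidth on each side of the chip. The per-lane figures (5 GT/s signaling with 8b/10b encoding) come from the PCIe 2.0 specification rather than from Lucid, and the result is a theoretical one-direction maximum; real throughput is lower once protocol overhead is counted.

```python
# PCIe 2.0 signaling figures (from the PCIe 2.0 specification, not from Lucid).
MEGATRANSFERS_PER_SEC = 5000  # 5 GT/s per lane
PAYLOAD_BITS_PER_10 = 8       # 8b/10b encoding: 8 payload bits per 10 bits sent


def link_bandwidth_mb_s(lanes):
    """Raw one-direction bandwidth of a PCIe 2.0 link, in MB/s."""
    payload_mbit_per_lane = MEGATRANSFERS_PER_SEC * PAYLOAD_BITS_PER_10 // 10
    return lanes * payload_mbit_per_lane // 8  # bits -> bytes


upstream = link_bandwidth_mb_s(16)    # link toward the CPU/chipset
downstream = link_bandwidth_mb_s(32)  # combined links toward the video cards
print(upstream, downstream)  # 8000 16000
```

In other words, two cards on full x16 downstream links share an upstream link with half their combined raw bandwidth.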

Ultimately in a perfect world, the Hydra method is superior – it can be just as good as AFR with matching cards, and you can use dissimilar cards. In a practical world, the devil’s in the details.


47 Comments

  • krneki457 - Friday, January 08, 2010 - link

    Sorry Ryan just noticed you wrote the article. Well it was just an idea how to get at least some SLI results with as little hassle as possible. Presuming Hydra can be turned off to work only as PCIe bridge, than this ought to work. Reply
  • chizow - Thursday, January 07, 2010 - link

    Have you tried flashing the Trinergy BIOS for SLI support? It might kill off Hydra capabilities in the meantime and deprecate the Hydra 200 to its basest form, a PCIe controller but for purposes of measuring N-mode performance that should suffice. The other alternative would be to simply use the Trinergy with SLI results as a plug-in doppelganger since it is identical to the Fuzion, save for the NF200 vs. Hydra 200 serving as PCIe switches. Reply
  • jabber - Thursday, January 07, 2010 - link

    I think it has some promise. I think the ultimate aim is to be able to 'cobble' together a couple of GPUs of similar capability, have them work efficiently together and not have to worry about profiles. The profiles could just be handled seamlessly in the background.

    If they can push towards that then I'll give them the time.
    Reply
  • chizow - Thursday, January 07, 2010 - link

    The technology does still rely on profiles though. You don't need to set-up game specific profiles like with Nvidia, even if that kind of granularity is probably the best option, your choices are limited to a handful of somewhat generic performance/optimization profiles provided by Lucid.

    The scariest part of it all is that these profiles will rely on specific profiles/drivers from both Nvidia and AMD too. I'm pretty sure its covered in this article, but its covered for sure in Guru3D's write-up. Hydra only plans to release updates *QUARTERLY* and those updates will only support specific drivers from Nvidia and ATI.

    Obviously, depending on Lucid's turnaround time, you're looking at signficant delays in their compatibilities with Nvidia/ATI, but you're also looking at potentially 3 months before an update for an Nvidia/ATI driver that supports a newer game you're interested in playing. Just way too many moving parts, added complexity and reliance on drivers/profiles, all for a solution that performs worst and costs more than the established AFR solutions.
    Reply
  • danger22 - Thursday, January 07, 2010 - link

    maybe the amd 5000 cards are to new to have support for hyrda? what about trying some older lower end cards? just for interest... i know you wouldn't put them in a $350 mobo
    Reply
  • vol7ron - Thursday, January 07, 2010 - link

    I like the way this technology is headed.

    Everyone is saying "fail" and maybe they're right because they want more from the release, but I think this still has potential. I would say either, keep the funding going, or open it up to the community at large to hopefully adopt/improve.

    The main thing is that down the road this will be cheaper, faster, better. When SSDs came out stuttering, people were also saying "fail."
    Reply
  • shin0bi272 - Thursday, January 07, 2010 - link

    I know how you feel but their original claim was scalar performance with a 7watt chip on the mobo. It's not even as good as standard crossfire (and probably not even standard sli) so that's what's prompting the fail comments. Instead of getting 75fps on call of juarez with a pair of 5850's they should be getting 99 or 100 according to their original claim. Dont get me wrong it functions and for a chip thats literally a couple of months old (maybe 24 since its announcement) thats great but the entire point of hydra was to do it better out of the box than the card makers were doing it. Reply
  • shin0bi272 - Thursday, January 07, 2010 - link

    I had high hopes for this technology but alas it appears it is just not meant to be. Maybe its the single pci-e 16x lane they are using to try to feed 2 pci-e 2.0 16x lane video cards... just saying. Would have been nice to be able to keep my 8800gtx and add in a 5870 but oh well. Reply
  • AznBoi36 - Thursday, January 07, 2010 - link

    Why would you spend $350 on this mobo and then spend another $350 for a 5870, just so you can use your old 800GTX with a minimal gain? You could spend $150 on a CF mobo, plus 2 4890's at $150 each for a total of $350 that would give a 5870 a run for it's money. Reply
  • shin0bi272 - Thursday, January 07, 2010 - link

    oh and the reason for the 5850's is because I am really wanting the dx11 capabilities... I could go with 2 4890's and end up paying less yes but it wouldnt be dx11. Reply
