AMD & Multi-GPU Stuttering: A Work In Progress

When it comes to stuttering there’s really two classes of stuttering that need to be discussed. The first is single-GPU stuttering as we’ve discussed in the previous pages, where driver issues, application issues, and the context buffer all interact to influence the pace for how frames are doled out before finally being rendered and presented. The second type of stuttering, micro-stuttering, is endemic to multi-GPU configurations. In micro-stuttering on top of all of the other issues encountered with single-GPU stuttering, there are also further variance and frame interval issues introduced due to how multi-GPU configurations split their workloads.

In brief, in multi-GPU setups, be it single-card products like the GTX 690 or multiple cards such as a pair of 7970s, the primary mode of splitting up work is a process called Alternate Frame Rendering (AFR). In AFR, rather than have multiple GPUs working on a single frame, each GPU gets its own frame. This method has over time proven to be the most reliable method, as attempting to split up a single frame over multiple GPUs (with their relatively awful interconnect) has proven to be unreliable and difficult to get working. AFR in contrast is by no means perfect and has to deal with inter-frame dependency issues – where the next frame relies in part on the previous frame – but this is still easier to implement and more consistent than previous efforts at splitting frames.

Moving on, due to the mechanisms of AFR, it can further impact the frame intervals and as a result whether stuttering is perceived. To do AFR well it’s necessary to pace the output of each GPU such that each GPU is delivering a rendered frame at as even a rate as possible; not too soon after the previous frame, and not too late such that the following frame comes up quickly. In a 2 GPU setup, which is going to be the most common, this means the second GPU needs to produce a finished frame when the first GPU is roughly half-way done with its current frame. Should this fail to happen then we have micro-stuttering.

Micro-stuttering has been a longstanding issue on multi-GPU setups. Both NVIDIA and AMD have worked on the issue to various degrees, but at the end of the day multi-GPU setups have never proven to be as reliable as single-GPU setups, which is why our editorial position on the matter has been to always favor single powerful GPUs over multiple GPUs when at all possible. To that end, just as FRAPS has ignited an interest in single-GPU stuttering issues, it has also ignited an interest in multi-GPU stuttering issues.

The bulk of AMD’s presentation – and consequently our own article – has been focused on single-GPU issues. AMD is working on multi-GPU issues too, but the team that is handling microstuttering is not the team we were speaking to. The team we were speaking to is the team that has been handling single-GPU issues. As such AMD’s statements on the matter are meaningful, but brief.

Multi-GPU stuttering has become an important issue for AMD just as single-GPU stuttering has, and AMD is working on a resolution for it. That resolution will come in or around a July driver drop, at which point AMD will introduce some new driver options to control how their cards deal with the issue. In the meantime however micro-stuttering and how AMD’s multi-GPU technology compares to NVIIDA’s multi-GPU technology is likely to become a bigger issue before AMD can push out their new driver. So AMD may be spending the next couple of months on the defensive.

AMD’s current position on micro-stuttering is that they are favoring latency (and not frame intervals) above all else. By keeping their latency low and as even as possible the resulting input lag from multi-GPU setups is reduced, making the experience more responsive for the user. This is a position that’s essentially in alignment with how they’re handling single-GPU stuttering too, but in the single-GPU world there isn’t a deliberate frame pacing aspect to take into consideration since they merely need to render frames as fast as they receive them.

In any case, AMD’s position on the issue has been one where they clearly still think they’re right, but also one where they’re going to lighten up on their position regardless. The alternative approach to favoring latency is to favor the frame interval, which in the case of multi-GPU setups means focusing on frame pacing. By deliberately delaying frames AMD can ensure they arrive more evenly, but in doing so they would increase the latency in the rendering pipeline, and ultimately the latency the user experiences. AMD already does this to some degree, but today it’s not being done explicitly and favored over latency concerns, which is what’s going to change.

In a typical AMD move, AMD will ultimately be leaving this up to the user. In their July driver AMD will be introducing a multi-GPU stuttering control that will let the user pick between an emphasis on latency, or an emphasis on frame pacing. The former of course being their current method, while the latter would be their new method to reduce micro-stuttering at the cost of latency.

We don’t have any more details on this driver or AMD’s frame pacing method at this time, but in our conversation with AMD they didn’t sound like they were worried about having any problems implementing explicit frame pacing, and that it was merely a matter of the process taking a while. Frame pacing itself can cause its own stutter issues – holding back one frame but not another can sometimes make the frame display evenly, but from a simulation step only a few milliseconds after the previous step – but ultimately the pacing process will cause the simulation to try to match the GPU and pace itself accordingly, so it would be a transient issue.

On a final note, more so than even single-GPU stuttering, multi-GPU stuttering is something that unfortunately FRAPS is poorly prepared for. By looking at Present calls it’s completely blind to how the GPU is doing any frame pacing, which means it’s currently difficult to see the impact of frame pacing short of a high-speed camera. As further tools are developed that let us analyze the end of the rendering chain, this will allow us to more properly analyze how frame pacing works and what its true impact on the user is.

AMD & Single-GPU Stuttering: Causes & Solutions Final Words
Comments Locked

103 Comments

View All Comments

  • Juddog - Tuesday, March 26, 2013 - link

    What the hell you talking about? Network latency is an entirely different subject.
  • Juddog - Tuesday, March 26, 2013 - link

    I had meant the above as a reply to the guy talking about network fragmentation; I'm not sure why the reply in the new format doesn't auto-nest the response.
  • danielkza - Tuesday, March 26, 2013 - link

    Because then the measurement wouldn't be representative of the performance users will actually see?
  • polaco - Tuesday, March 26, 2013 - link

    Thanks a lot for this interesting article. Is astonishing to see how minimal software issues can severely degrade performance and efforts done in other areas, turning a company less competitive with the money losses this takes with it.
    Also is a reminder of how important is to implement deep quality and performance evaluations in software development. Is a shame that in today software industry the delivery dates are more important than quality many times and programmers end up delivering half baked applications from also half baked requirements.
    Thanks again.
  • sudz - Tuesday, March 26, 2013 - link

    Good to know I'm not going crazy. Almost every game I play has a decent frame rate, but still doesn't seem smooth. (Gigabyte Windforce 6850 OC) Tried underclocking, overclocking, Different PC's... I thought I had a dud card.
  • DemBones79 - Tuesday, March 26, 2013 - link

    Reading through the whole article, I became increasingly convinced that it's not that FRAPS is necessarily a bad tool for measuring this, but that people need guidance in how to interpret the graphs correctly.

    The first time I saw a frame latency (or whatever you're calling them now) graph, my first impression wasn't, "Wow, look at all these little latency spikes." It was, "Holy sh*t! Look at those huge freaking spikes!" It was a simple matter of severity. I think anyone can take a look at the "heartbeat", see that it is a recurring pattern with a relatively consistent frequency, and- while they may not be able to say for certain if it is indicative of a problem- they can say that it is "normal" for that particular card. It's the huge spikes, the ones that aren't occurring at consistent intervals, that are so much more severe than the "heartbeat", that are the issue.

    How hard would it be for a reviewer to draw a pair of horizontal lines across the graph to indicate the limits of "normal" stuttering, where anything beyond the lines in either direction would be considered "abnormal"? A method of separating the signal from the noise.

    Furthermore, I thought it was reviewers noticing a difference- that framerate alone couldn't explain- in the way games played between ATI and NVIDIA that prompted the whole investigation into latency. Several sections in the article mention how FRAPS results may not be indicative of user experience. But it was user experience that prompted using FRAPS to try and explain what was being observed.
  • JPForums - Tuesday, March 26, 2013 - link

    Thre are two things you need to keep in mind:
    1) Nvidia also agrees with the limitation of FRAPS. In fact, IIRC they were the first to voice the issue that FRAPS recordings are in the wrong place and can only infer what actually needs to be recorded. The author is correct, when Ati and Nvidia agree, we should at least pay attention.

    2) Though your your points are AFAIK correct and well articulated, they still point to the issue of FRAPS inferring, rather than recording the the targeted information. The difference is, rather than consistency of output frames, you are looking for consistency of simulation steps. I agree that this is a metric that really needs to be covered. In fact, I would even go as far as matching simulation steps to their corresponding frame times to expose issues when short steps are accompanied by long frames or vice versa.

    Unfortunately, FRAPS can't measure any of this directly and even for your points proves to be limited to inference. That said, until a reviewer gets tools that can reveal this information, inference via FRAPS is better than no information at all. Pcperspective's comments on AMD's stuttering issues are related (as they state) to crossfire setups. I could see the differences between CF and SLI in blind tests (though SLI also has some microstutter) and this only confirms it. The runt frames only add fuel to the fire. I'm open to using AMD in single GPU builds, but only use Nvidia for multiGPU builds. Perhaps this will change in July, but I'm guessing there will still be plenty of work to do.
  • JPForums - Tuesday, March 26, 2013 - link

    Thre are two things you need to keep in mind:
    1) Nvidia also agrees with the limitation of FRAPS. In fact, IIRC they were the first to voice the issue that FRAPS recordings are in the wrong place and can only infer what actually needs to be recorded. The author is correct, when Ati and Nvidia agree, we should at least pay attention.

    2) Though your your points are AFAIK correct and well articulated, they still point to the issue of FRAPS inferring, rather than recording the the targeted information. The difference is, rather than consistency of output frames, you are looking for consistency of simulation steps. I agree that this is a metric that really needs to be covered. In fact, I would even go as far as matching simulation steps to their corresponding frame times to expose issues when short steps are accompanied by long frames or vice versa.

    Unfortunately, FRAPS can't measure any of this directly and even for your points proves to be limited to inference. That said, until a reviewer gets tools that can reveal this information, inference via FRAPS is better than no information at all. Pcperspective's comments on AMD's stuttering issues are related (as they state) to crossfire setups. I could see the differences between CF and SLI in blind tests (though SLI also has some microstutter) and this only confirms it. The runt frames only add fuel to the fire. I'm open to using AMD in single GPU builds, but only use Nvidia for multiGPU builds. Perhaps this will change in July, but I'm guessing there will still be plenty of work to do.
  • hero1 - Tuesday, March 26, 2013 - link

    Long time reader first timer commentor. I really liked this article, and have liked most of the articles here. What I want to say is, I hope that AMD fixes their drivers and address both single and dual gpu issues. I personally didn't have any stuttering when I had 2x7970s but they sometimes lost the link to each other and my system would only see one. I switched to the Titan since I got it for a reasonable price. Now this articles makes me wonder whether I should go back and grab the 2x7970s and save some cash in hopping that AMD has the mutliple GPUs issue solved by early summer. It's good to see them working to address the issue and hope we never have to encounter this again once it's done with. Next step should be how their mutli gpu solutions scale. Thanks Ryan and keep up the good work.
  • Hrel - Tuesday, March 26, 2013 - link

    That was a good breakdown of Direct3D. I'd like to see another one for OpenGL if we could. A side by side comparison would be nice.

Log in

Don't have an account? Sign up now