AMD & Single-GPU Stuttering: Causes & Solutions

Thus far we’ve discussed stuttering and the rendering pipeline in theory, and taken a look at an example of the rendering pipeline in practice through GPUView. With a basic understanding of those principles, we can finally get into explaining AMD’s specific situation. Why did AMD have a single-GPU stuttering problem?

The shortest answer also the bluntest answer: AMD had a stuttering problem because AMD wasn’t looking for a stuttering problem. AMD does a great deal of competitive analysis (read: seeing what NVIDIA is doing) on overall performance, but AMD was never doing competitive analysis for stuttering.

Because stuttering is such a complex issue and AMD had such great knowledge into their drivers, AMD assumed that stuttering was occurring due to the applications and the OS, things that were out of their control. Furthermore because those things were out of their control, AMD assumed that they were happening to NVIDIA and Intel GPUs too. After all, there wasn’t any kind of competitive analysis to scientifically confirm this. AMD never saw that NVIDIA cards weren’t experiencing as much stuttering, and consequently never saw that they did in fact have more control over stuttering than they first thought.

Ultimately it wasn’t until Scott Wasson and other journalists went to work with FRAPS and kept it up that it became obvious to AMD that they had a problem. FRAPS may be a coarse tool, but even it could see some of AMD’s stuttering issues.

Since that time AMD has been hard at work on fixing the issue, producing new driver builds later last year and in the first part of this year to address the issue. AMD’s latest drivers have been fixing bugs, engaging workarounds, and otherwise taking care of this issue so that they can be competitive with NVIDIA when it comes to stuttering.

There is still work to do – AMD quickly fixed their DX9 issues, while DX10 fixes are in the process of being rolled out – but in many ways this is a post-mortem on the issue rather than being an explanation of what AMD will do in the future. Not every game is fixed yet, but many are. Scott Wasson’s most recent results show an incredible improvement for AMD compared to where they were even 6 months ago.

The biggest changes AMD has made from here on out are that they’re now doing competitive analysis on stuttering and they’re explicitly looking for it in their tools, to ensure that stuttering issues don’t return (at least in as much as they are able to control). With many of the bugs and issues that lead to stuttering in the first place already fixed, AMD can use what they’ve learned to analyze future games and try to catch issues before the game is released, or at the very least fix it as quickly as possible.

AMD’s gameplan aside, there are two remaining questions on the subject that need to be dealt with. The first is what happened at a technical level to cause the stuttering in the first place, and what, if anything, can be done about the remaining stuttering.

The answer to the first question is that what went wrong depends on the game. AMD did not go into specific detail about individual games, but they did lay out the types of issues they ran across. For example, resource limits may occur in the application or the driver, triggering a stall that in turn triggers stuttering. Discarding the constant or vertex buffers too often was one such cause of this, as it would mean the driver would need to wait for one of the buffers to actually become freed up before the job could proceed.

Other times the issue was the driver itself misbehaving in a way AMD never expected. In one such case AMD’s driver was sporadically consuming far more CPU time than AMD intended, something AMD never even realized was possible. The end result was yet another block that triggers a stall that triggers stuttering.

Yet still other problems are in the application and the OS itself. As we’ve mentioned before AMD cannot fix these issues because they’re not under AMD’s control, but as it turns out AMD can effectively trick the OS and applications into behaving better. So AMD has implemented workarounds in their drivers for these application/OS issues, which doesn’t strictly fix the problem but will mitigate it.

A recurring theme in all of these issues was that they were easy to fix. It may only take AMD an hour to find the cause of a stuttering instance, and then even less time to make a driver change to deal with it. Stuttering on the whole is still a complex issue, but in AMD’s case they were easy fixes once AMD started looking for the problem.

Perhaps the most interesting thing about this entire process – and the most embarrassing thing for AMD – is not just that stuttering was occurring and they weren’t looking for it, but by not looking for stuttering they were leaving performance on the table. Stuttering doesn’t just impact the frame intervals, but many of those stalls where stuttering was occurring were also stalling the GPU entirely, reducing overall performance. One figure AMD threw around was that when they fixed their stuttering issue on Borderlands 2, overall performance had increased by nearly 13%, a very significant increase in performance that AMD would normally have to fight for, but instead exposed by an easy fix for stuttering. So AMD’s fixing their stuttering has not only resolved that issue, but in certain cases it has helped performance too.

This isn’t to say that AMD can fix all forms of stuttering. As we’ve already discussed, Windows isn’t a real time operating system and the PC platform itself is highly variable. Especially in resource constrained scenarios it’s simply not going to be possible to fix all forms of stuttering. If the CPU gets busy and the Present call from the application gets held up, then there’s nothing AMD can do other than to process it once it does arrive. This is the purpose of the context queue, to help smooth things out at the cost of some latency.

Moving on, though it’s outside the scope of this article for both a lack of time and a lack of tools, we will be looking at stuttering on AMD cards and NVIDIA cards as the necessary tools become available. AMD hasn’t fixed all of their issues yet and they waste no time admitting to it, so we will want to track their progress and see just how far along they are in bringing this issue under control.

Finally, we wanted to spend a bit more time talking about FRAPS in relation to what AMD discovered, and why FRAPS may still see issues that are not present.

The above is what AMD is calling the heartbeat pattern, and it’s something FRAPS is reporting even in some of the games they’ve fixed. This highlights one of the problems with trying to monitor frame intervals based on Present calls, as the context queue is absorbing the uneven frame dispatch, but FRAPS doesn’t realize it.

In a heartbeat situation the next Present gets delayed coming out of the application for whatever reason, which results in the rendering pipeline feeding from the context queue for a bit while nothing new comes in. Eventually the block is cleared and the application submits the next Present, at which point FRAPS records the Present as having come relatively later. Furthermore, since the context queue has been at least partially drained, there’s still room for one more frame, so rather than idling for a bit the application immediately gets to work on the next frame. As a result the next Present hits the context queue sooner than average, resulting in the early frame as picked up by FRAPS.

In this scenario, at the end of the rendering pipeline every frame could be displayed at an even pace despite the unevenness at the input, but FRAPS would never know. This doesn’t mean it’s not an issue, as uneven presents will cause the gap in time between the simulation steps to suddenly become uneven as well. But unless the heartbeat pattern occurs with high regularity or the size of the beat is enough to let the context queue drain completely, the impact from this scenario is far less than having the frames come out of the end of the rendering pipeline unevenly. Ultimately it’s another form of stuttering, but in the case of FRAPS looks far worse than it would be if we were measuring the end of the rendering pipeline and what the user was actually seeing.

The Tools of the Trade: FRAPS & GPUView AMD & Multi-GPU Stuttering: A Work In Progress
Comments Locked

103 Comments

View All Comments

  • mi1stormilst - Tuesday, March 26, 2013 - link

    All of us will benefit from the light shed on the subject with better testing and companies paying closer attention to issues and work arounds related to the subject. Still we would not even be talking about better testing methods right now without the attention it got from The Tech Report. I look forward to more sites implementing some type of real world testing methods that results in a true user experience evaluation. I reread the article and still standby my original conclusion. The Tech Report gets credit, but rather then stopping there this article seems to attack their methodology when they themselves had already admitted that it was less then perfect. To date there are still not better tools being used for reviews and The Tech Report still got the point across with what was available. I am a huge fan for what they did over there as I could not pinpoint why my AMD experience was less than optimal. It forced me to early retire my 6950 grab a very affordable 660 OC and enjoy a much smoother game experience. This is my first nVidia card since my trusty 4200ti and I am not looking back until AMD is on par with nVidia in the stuttering department...it was literally making me motion sick )-:
  • SPBHM - Tuesday, March 26, 2013 - link

    "holding back one frame but not another can sometimes make the frame display evenly, but from a simulation step only a few milliseconds after the previous step"

    wouldn't this also happen with the single GPU "heartbeat stuttering"?
  • BrightCandle - Tuesday, March 26, 2013 - link

    Yes it would, which is exactly the problem with the heartbeat pattern that AMD's problem causes. You can deliver the frames evenly out to the monitor but their contents has a noticeable stutter due to the graphics driver accepting the frames unevenly. The heartbeat is a sign of a real problem without a doubt, all non smooth frame time captures are. What they are not is a sign that the DVI monitor is seeing frames at those periods, but then no one ever said that was what was being measured anyway.

    The best way to think about it is that this is the problem going into the pipeline, measuring the output also needs to be done to get the smoothness on the output. Only with both can you understand the impact. We have half the picture, and that half is accurately measured by fraps.
  • Gunbuster - Tuesday, March 26, 2013 - link

    Design and launch a product. Ignore user feedback.

    Did we forget about those people with $2000 laptops sporting AMD mobile card drivers that didn't work correctly for over a year due to some bug with the graphics switching MUX? This seems to be a pattern that revolves around AMD software people being wholly out of their depth, overworked, or just not caring. They don’t even seem to be able to figure out when they have a fix. The laptop GPU story here on AT was presented as AMD sending over beta drives and asking “Did we fix it this time?”
  • rootheday - Tuesday, March 26, 2013 - link

    One minor correction to the description of the submission of commands through the stack - the DirectX runtime under Windows Vista and later does NOT accumulate a frames' worth of draw calls before sending them to the UMD. I believe it sends state and draw calls to the UMD immediately.

    The UMD accumulates commands in the command buffer and flushes them to the KMD either when a present call occurs, when the command buffer is full, or when the application requests to read back the results of enqueued rendering (Map/Lock/read Query result).

    It used to be true under Windows XP that the dx runtime accumulated calls and dispatched them to the driver - but that is because in XP, the driver ran in kernel mode and it was too expensive to make the user mode->kernel mode transition on every "SetState", etc call.
  • tynopik - Tuesday, March 26, 2013 - link

    "frame latter than it would have" -> later (pg 3)
  • cactusdog - Tuesday, March 26, 2013 - link

    As a long time ATI/AMD fan this report doesn't fill me with confidence. It appears AMD is using anandtech for their public relations spin on the stuttering issue. I don't blame anandtech for running the story, AMD's comments are newsworthy and anandtech deserves credit for being honest about AMD's intentions. On the negative side, the explanation about fraps not being an effective tool only need to be said once, it seems (by the number of times it was mentioned) that AMD's message is to make sure everyone knows Fraps its not accurate, but doesn't explain why Nvidia performs better.

    On the issue, it sounds like AMD is conceding and preparing us for much of the same. No where in the explanation do they mention why Nvidia performs better in the latency tests, other than to say its not what the end user is seeing. Well I disagree, users have been complaining about stuttering for years. I just don't believe that AMD have never looked into this issue before. Also with the multi-gpu stuttering. It has been an issue since crossfire/SLI first appeared and nothing has really happened there.

    Im a fan of AMD cards but I use both brands and personally I have noticed Nvidia do a better job with latency and general responsiveness in game, whereas ATI/AMD has the edge with image quality. Its subtle, and probably not something the average user notices but a lot of people do notice.. If AMD can solve this issue they would sell many more cards but by the sounds of this article, its too big and complex for them to solve completely without major work. Hence the excuses. Nvidia has to play by the same rules, the same OS etc and they do a better job at latency/stuttering, hopefully AMD can fix it enough to at least perform as good as a NVidia card.
  • WaltC - Tuesday, March 26, 2013 - link

    "NVIDIA made a big deal about moving away from timedemos and average frame rates during the early GeForce FX (NV30) days, when its cards might have delivered a decent gaming experience but were slaughtered in most benchmarks."

    Well, that's not really what happened at all...;) The chip "slaughtering" everything nVidia made in those days was the ATi R300. Seems rather strange to tell just half of that story. And the problem nVidia had with benchmarks wasn't technical--it was that nVidia was found to be actively cheating in 3dMark (camera on rails), among other cheats/shortcuts/optimizations in their drivers. The benchmarks told a story nVidia couldn't abide, and that was how much better the R300 was than anything nVidia had at the time. R300 was in every sense a revolution in the 3d gpu markets, blowing everything else away. All gpus on the market today are descended from R300 (just as all Intel and AMD x86 cpus are descended from AMD's original 64-bit Opterons.) nVidia did eventually own up to all of it, right before cancelling the nV30 after a month or two in production, however. People kept publishing proof after proof of what nVidia was doing until finally the company said "uncle." nVidia has been a better company since, imo. At least, its products are certainly better.

    I'm using a single ATi gpu and over the last few years I have to say that I haven't seen any stuttering worth mentioning. Whenever I have seen stuttering it is usually due to some software condition or other, and rectified by the appropriate patch. I do appreciate your pointing out that Fraps isn't perfect and I think TR should stop pretending that it is. Fraps as you point out was never intended to measure this kind of latency and so using it to produce data other than frame-rate data is an "off-label" use of the program, imo. And also as you point out, I use vsync more often than not.

    Really, though, I would loathe seeing AMD optimizing its drivers just to look better in TR's off-label Fraps usage...!...;) Let's hope that doesn't happen as I got quite a belly full of that sort of thing back in the nV30 days--enough to last me a lifetime.
  • beginner99 - Tuesday, March 26, 2013 - link

    How can FRAPS detect any vendor-specific stuttering if it injects itself before the gpu-driver is called?
    The second thing is that v-sync is just crap. I'm not a professional gamer, not even close but in certain games turning it off made me a much better player and the difference is huge. even more annoyingly it was not directly noticeable. I did not "feel" anything changed. Except that my stats were better. Tearing and stuttering: no issue for me so far.
  • DanNeely - Tuesday, March 26, 2013 - link

    The timing at the point it's measuring is normally blocked until the queue the GPUs feeding from has an open slot?

Log in

Don't have an account? Sign up now