The Start: The Rendering Pipeline In Detail

Before we can discuss stuttering and other frame timing anomalies, we first need to take a high-level look at the Windows rendering pipeline. The pipeline isn’t particularly complex, but knowing which stages of the process are in the hands of Windows, the CPU, the driver, and the video card is necessary to understand where bottlenecks and delays can occur.

At its most fundamental level, rendering a frame is a three-part process. An application needs to pass data to Windows, Windows needs to manage the process and interface with the drivers, and finally, once Windows and driver preparation is complete, a frame can be passed off to the GPU for final rendering and display.

At the top of the chain is the application itself. This is where user input is handled and where, in the context of a game, the simulation is executed. From a technical perspective, the application is the first arbiter of game smoothness; applications are responsible for adjusting the simulation rate in order to keep the flow of frames smooth. If the application cannot ensure an even rate, then nothing else that follows will really matter.

The reality of course is that this is harder than it sounds. It is not an insurmountable problem, but PCs are devices with a wide spectrum of performance and capabilities. A dual-core processor with an iGPU performs very differently from a hex-core processor with a small army of GPUs, and an application needs to be able to accommodate this so that the simulation operates as evenly as possible in both CPU and GPU-bottlenecked scenarios.

Ultimately any timing model is going to be reactive, adjusting itself in response to prior events and how long previous frames took to render. Another option, though, is to shortcut this process entirely and operate at a fixed (or capped) simulation rate, either basing a game around 30Hz/60Hz operation or decoupling rendering from the simulation entirely. Anyone who has uncapped id Software’s Rage, for example, will find that the game simply does not behave correctly without its 60Hz cap.
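To make the fixed-rate idea concrete, here is a minimal sketch of one common way to decouple a fixed simulation rate from a variable frame rate, the so-called accumulator loop. This is our own illustration in Python, not code from any particular game or engine; the function name, the parameters, and the clamp on catch-up steps are all assumptions made for the example.

```python
def run_fixed_timestep(frame_times, sim_dt=1.0 / 60.0, max_steps_per_frame=5):
    """Advance a simulation at a fixed rate while frames take variable time.

    frame_times: durations (in seconds) of successive rendered frames.
    Returns the number of fixed simulation steps executed for each frame.
    All names here are hypothetical and exist only for this illustration.
    """
    accumulator = 0.0  # simulation time owed but not yet simulated
    steps_per_frame = []
    for frame_time in frame_times:
        # Clamp very long frames so the simulation never tries to catch up
        # on more than max_steps_per_frame steps at once.
        accumulator += min(frame_time, max_steps_per_frame * sim_dt)
        steps = 0
        while accumulator >= sim_dt:
            accumulator -= sim_dt  # one fixed-size simulation step
            steps += 1
        steps_per_frame.append(steps)
    return steps_per_frame


if __name__ == "__main__":
    # A 60Hz simulation fed by frames that alternate between ~16.7ms and ~33.3ms.
    print(run_fixed_timestep([1 / 60, 1 / 30, 1 / 60, 1 / 30]))
```

The simulation always advances in steps of the same size; a slow frame simply triggers more than one step, and a fast frame may trigger none. A game built around a hard cap instead assumes one step per frame, which is consistent with a title misbehaving once the cap is removed.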

Static or dynamic, once a simulation has a suitable timing model in place we can then begin to look further down the chain, which is where we first encounter Direct3D, Windows’ primary 3D rendering API. Direct3D is nothing short of an enormous, complex structure of API calls and features. We tend to reduce it to version numbers and marquee features for the sanity of ourselves and our readers, as we will here, but it goes without saying that Direct3D takes years to master. For a GPU manufacturer it’s made all the more complex by the simultaneous existence of the modern iteration of Direct3D (DX10+) and the classic iteration that is DX9 and its predecessors.

For the purpose of the rendering pipeline, Direct3D has a few different jobs. First and foremost, it collects draw calls from the application, combines them, and processes them for further work. Once a complete frame’s worth of draw calls has been collected, Direct3D passes its processed work over to the first component of the video card driver stack, the User Mode Driver (UMD).

It’s the UMD that is primarily responsible for taking the output of Direct3D and turning it into work batches the GPU can handle. These work batches, command buffers (aka Display Lists), are collections of instructions and data suitable for processing by the target GPU. Among other things, the UMD is responsible for shader compilation and assigning rendering elements to the correct (and best) surface formats for the GPU.


A logical view of a single command buffer; from Microsoft's Direct3D documentation

When the UMD’s work is complete, it passes its command buffer back over to Direct3D. Direct3D in turn passes that command buffer to the context queue, our first real bottleneck. We’ll get back to why this is a bottleneck in a bit, but briefly, the context queue is responsible for queuing the individual command buffers in order to smooth out the rendering process. Queuing command buffers at this stage increases frame rendering latency, but by providing a buffer of buffers it allows the rendering pipeline to absorb any variances in rendering time or simulation time to more smoothly render frames.

The context queue has also gone by other names over the years, such as the flip queue and the pre-rendered frames queue. This is the source of the 3-frame render-ahead limit in Windows that is sometimes exposed in games and drivers, as Windows will by default queue up to 3 frames in this manner. This can be controlled by application developers, but most will leave it at 3 so long as a game is moving along smoothly.
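The trade-off between latency and smoothness can be sketched with a toy model. The Python below is our own simplification, not a description of how Windows actually schedules work: it assumes a single CPU thread that prepares one frame at a time, a GPU that renders frames strictly in order, and a limit of `depth` frames that may be submitted but not yet finished. The function names and the time units are hypothetical.

```python
def simulate_context_queue(cpu_times, gpu_times, depth=3):
    """Toy model of a render-ahead queue (an assumption-laden sketch).

    cpu_times[i] / gpu_times[i]: CPU preparation and GPU render time of frame i.
    depth: maximum number of submitted-but-unfinished frames.
    Returns (gpu_done, latency): GPU completion time of each frame, and the
    time from the start of CPU work on a frame to its GPU completion.
    """
    gpu_done, latency = [], []
    cpu_free = 0.0  # time at which the CPU can start the next frame
    for i, (cpu_t, gpu_t) in enumerate(zip(cpu_times, gpu_times)):
        cpu_start = cpu_free
        submit = cpu_start + cpu_t
        if i >= depth:
            # Queue full: the CPU stalls until frame i - depth has finished.
            submit = max(submit, gpu_done[i - depth])
        cpu_free = submit
        # The GPU starts a frame once it is submitted and the previous is done.
        gpu_start = max(submit, gpu_done[i - 1]) if i > 0 else submit
        gpu_done.append(gpu_start + gpu_t)
        latency.append(gpu_done[i] - cpu_start)
    return gpu_done, latency


def frame_intervals(gpu_done):
    """Time between consecutive frame completions."""
    return [b - a for a, b in zip(gpu_done, gpu_done[1:])]


if __name__ == "__main__":
    cpu = [1, 1, 1, 5, 1, 1]  # one slow simulation step on the fourth frame
    gpu = [2] * 6
    for depth in (1, 3):
        done, lat = simulate_context_queue(cpu, gpu, depth)
        print(depth, frame_intervals(done), lat)
```

In this model a single slow CPU frame produces a longer gap between completed frames with a depth of 1 than with a depth of 3, because the deeper queue still holds work for the GPU while the CPU is busy. The cost is that in the steady GPU-bound case each frame waits longer between the start of its simulation and its completion. On the API side, the application control referred to above is, to our understanding, exposed through calls such as DXGI's IDXGIDevice1::SetMaximumFrameLatency; the sketch does not call it and only models the effect.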

Beyond the context queue we have Windows’ GPU scheduler, which is what regulates the popping of command buffers off of the context queue to be fed to the kernel mode GPU driver (KMD). Beyond this point the rest of the pipeline is rather simple, with the KMD taking the command buffer and feeding it to the GPU, all the while the KMD and GPU work together to manage the operation of the GPU. When a frame is finally completed, the GPU generates an interrupt to inform the KMD and OS about the completion.

At the end of this process we have a rendered frame sitting in the GPU’s back buffer, but the frame itself is not displayed automatically. At the end of a batch of command buffers, effectively marking the beginning and end of frames, is the Direct3D Present() call. Present is the command that is responsible for telling the GPU to flip the back buffer to the front and to present the rendered frame to the user. Only once the Present call executes does a frame get displayed. The Present call, though not a command buffer object, still follows the same rendering path as the command buffers, including queuing up in the context queue.
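The ordering described here can be illustrated with a few lines of Python. This is a deliberately simplified model of our own: the queue is a plain FIFO, each entry is a hypothetical `(kind, frame)` pair, and "executing" an entry just updates which frame sits in the back and front buffers. Nothing in it corresponds to real Direct3D objects.

```python
from collections import deque


def front_buffer_after(entries, n_processed):
    """Return the frame on screen after the first n_processed queue entries.

    entries: ("cmd", frame) for a command buffer, ("present", frame) for the
    Present call that ends that frame. Both travel through the same FIFO.
    Returns None if nothing has been presented yet.
    """
    queue = deque(entries)
    back_buffer = None   # most recently rendered, not yet visible
    front_buffer = None  # what the user actually sees
    for _ in range(min(n_processed, len(queue))):
        kind, frame = queue.popleft()
        if kind == "cmd":
            back_buffer = frame          # rendering lands in the back buffer
        elif kind == "present":
            front_buffer = back_buffer   # the flip happens only here
    return front_buffer


if __name__ == "__main__":
    entries = [("cmd", 0), ("cmd", 0), ("present", 0), ("cmd", 1), ("present", 1)]
    for n in range(len(entries) + 1):
        print(n, front_buffer_after(entries, n))
```

Because Present waits its turn behind the command buffers ahead of it, the moment an application calls Present and the moment the frame reaches the screen can be separated by however much work is still queued.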


103 Comments


  • Galidou - Saturday, March 30, 2013 - link

    Nope, never, I remember Nvidia back in the days of the 6800 GT that caused INFINITE stuttering(worse I've ever seen) with Nforce 3 or was it nforce 4 motherboard that I had. Only thing I could do to fix it was to underclock the video card, go back to older drivers. That made me lose 30-40% performance.

    They never ever fixed the problem or admitted it, EVER. I had to change video card after 6 months of trying everything. Nvidia forums were full of it not even an answer from them that they were fixing that issue. Some were able to fix this by disabling AGP fastwrites or other tricks but others had no choice doing what I did and lose the performance...
  • HisDivineOrder - Tuesday, March 26, 2013 - link

    It's great that AMD admitted to a problem, but wow what a big problem to have totally missed. I guess they were so busy laying off engineers and R&D they didn't keep ahead of the game.
  • haplo602 - Tuesday, March 26, 2013 - link

    all nice and fine, but now please get your arse moving and do something for OpenGL performance AMD !!!
  • kzinti1 - Tuesday, March 26, 2013 - link

    If Windows is a major problem with stuttering, then why can't they develop a user-switchable "gaming mode" to make the OS prioritize the resources of the OS in favor of the games and their rendering processes?
  • HisDivineOrder - Tuesday, March 26, 2013 - link

    Microsoft is the company that might work something like that out. Unfortunately, Microsoft is also one of the companies that wants you to go buy a console. So I don't think they're going to facilitate what you suggest.

    I also suspect it's not as simple as what you suggest since it'd require game support, low level changes, etc. But ultimately, it doesn't matter how easy or hard it is because MS won't do it. They have no reason to.

    If they cared about PC gaming in the slightest, I think they'd have ported Halo 3, ODST, Halo Reach, Halo 4, Gears of War 2, Gears of War 3, or Fable 2 to PC. Face it. MS gave up on PC gaming. Steam is what kept it going and Steam is what will carry it forward.

    And the Steam Box may do exactly what you're suggesting.
  • mikato - Wednesday, March 27, 2013 - link

    I'm pretty sure they care a bit because gaming is the only reason many people still use Windows.
  • mgambrell - Wednesday, March 27, 2013 - link

    methinks you place too much confidence in their acumen. As an exercise, find one thing microsoft has done lately which can be spun as plausibly in service of windows gamers.
  • Dribble - Tuesday, March 26, 2013 - link

    Fundamentally AMD failed because instead of making a driver to play games well, they make one that's there to give the highest fps at the expense of everything else. They were the first, for example, to customize the driver for every game - which makes the driver an order of magnitude more complex and introduced a lot more bugs to everything for a few % more performance.

    They did this because they care about the bottom line numbers shown in reviews more than actually playing the game well. Only now a reviewer has focused on stuttering are they focusing on it. It's not the only problem either - runt frames were also exposed by another tool, which if anything is a cheat to exploit fraps - but AMD haven't got as far as discussing that yet.

    This is a problem - AMD should be making drivers to play games well, not to look good in reviews. Journalists shouldn't be the ones having to do AMD's driver QA. I can't believe AMD didn't know about the stuttering - it's obvious even with a slow cam, they just didn't think it was important because it didn't affect their sales because journalists weren't reporting on it.
  • Spoelie - Tuesday, March 26, 2013 - link

    Read the article again, your assumptions are wrong.

    Fixing the stuttering provided an increase in averaged framerates (in cases up to 13%), so it would've made them look a lot better even in traditional reviews not reporting on stuttering. And that's a huge delta for a small software change.

    If anything, you could blame them for ineptitude, but there's no ill-will here.
  • Dribble - Tuesday, March 26, 2013 - link

    The increase in fps was a surprise to them. The article suggests that if they had known it would increase fps they would have done it ages ago. Fact is there was stuttering, they knew about it but ignored it - the "well we assumed everyone else stuttered too" excuse isn't great. Clearly it was fixable, and a side effect was it even increased fps, but they were so fixated on fps charts in reviews that it was never deemed important enough to look at until the reviews started castigating them for it.

    If they had actually been trying to make the card as good as possible for gamers to play with they would have fixed it years ago as stuttering really matters to people trying to play the games.
