Digging Deeper: Galloping Horses Example

Rather than pull out a bunch of math and traditional timing diagrams, we've decided to put together a more straightforward presentation. The diagrams we will use show the frames of an actual animation that would be generated over time as well as what would be seen on the monitor for each method. Hopefully this will help illustrate the quantitative and qualitative differences between the approaches.

Our example is a fabricated one (based on an animation example courtesy of Wikipedia) of a "game" rendering a horse galloping across the screen. The basics of this timeline are that our game is capable of rendering at 5 times our refresh rate (it can render 5 different frames before a new one gets swapped to the front buffer). This perfectly consistent frame rate is not realistic either, as in practice some frames will take longer to render than others. We cut down on these and other variables for simplicity's sake. We'll talk about timing and lag in more detail based on a 60Hz refresh rate and 300 FPS performance, but we didn't want to clutter the diagram too much with times and labels. Obviously this is a theoretical example, but it does a good job of showing the idea of what is happening.
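
For those who want the arithmetic behind the diagrams spelled out, here is a tiny Python snippet (purely illustrative of the theoretical numbers above) that computes the intervals involved:

```python
# Illustrative timing for the theoretical example: 60Hz display, 300 FPS renderer.
REFRESH_HZ = 60
RENDER_FPS = 300

refresh_interval_ms = 1000.0 / REFRESH_HZ   # ~16.67 ms between vertical refreshes
frame_time_ms = 1000.0 / RENDER_FPS         # ~3.33 ms to render one frame

frames_per_refresh = refresh_interval_ms / frame_time_ms  # 5 frames rendered per refresh

print(f"refresh interval: {refresh_interval_ms:.2f} ms")
print(f"frame time:       {frame_time_ms:.2f} ms")
print(f"frames rendered per refresh: {frames_per_refresh:.0f}")
```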

First up, we'll look at double buffering without vsync. In this case, the buffers are swapped as soon as the game is done drawing a frame. This immediately preempts what is being sent to the display at the time. Here's what it looks like in this case:

[Diagram: double buffering without vsync. Good performance, but with quality issues.]

The timeline is labeled 0 to 15, and for those keeping count, each step is 3 and 1/3 milliseconds. The timeline for each buffer has a picture on it in the 3.3 ms interval during which a frame is completed, corresponding to the position of the horse and rider at that moment in real time. The large pictures at the bottom of the image represent the image displayed at each vertical refresh on the monitor. The only images we actually see are the frames that get sent to the display. The benefit of all the other frames, in this case, is to minimize input lag.

We can certainly see, in this extreme case, what bad tearing could look like. For this quick and dirty example, I chose to composite only three frames of animation, but in reality there could be more or fewer tears. The number of different frames drawn to the screen corresponds to the length of time it takes for the graphics hardware to send the frame to the monitor. This happens in less time than the entire interval between refreshes, but I'm not well versed enough in monitor technology to know exactly how long it takes. For the purposes of this illustration, I threw my dart at about half the interval being spent sending the frame (and thus parts of three completed frames are displayed). If I had to guess, I think I overestimated the time it takes to send a frame to the display.
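
A rough way to see where the tear lines land is to model scanout as taking some fraction of the refresh interval (half, as assumed in the illustration) and check which back-buffer frame is current as each portion of the screen is sent out. This is only a sketch of the idea, not how any particular driver or monitor behaves:

```python
# Sketch: which rendered frame is visible in each slice of the scanned-out image
# when buffer swaps preempt scanout (double buffering, no vsync).
# Assumptions (matching the illustration): ~3.33 ms per rendered frame, and the
# scanout of one refresh takes ~8.33 ms (about half the 16.67 ms interval).
frame_time_ms = 1000.0 / 300
scanout_ms = (1000.0 / 60) / 2
slices = 10  # divide the visible image into 10 horizontal slices for the example

for s in range(slices):
    t = scanout_ms * s / slices               # time into scanout when this slice is sent
    frame_on_front = int(t // frame_time_ms)  # a swap has happened every frame_time_ms
    print(f"slice {s}: shows frame #{frame_on_front}")

# With these numbers, roughly three different frames contribute to one displayed
# image, which is where the visible tear lines come from.
```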

For the above, the FRAPS-reported framerate would be 300 FPS, but the actual number of full images that get flashed up on the screen is always capped at the refresh rate (in this example, 60 frames every second). The latency between when a frame is finished rendering and when it starts to appear on screen (this is input latency) is less than 3.3ms.

When we turn on vsync, the tearing goes away, but our real performance goes down and input latency goes up. Here's what we see.

[Diagram: double buffering with vsync. Good quality, but bad performance and input lag.]

If we consider each of these diagrams to be systems rendering the exact same thing starting at the exact same time, we can see how far "behind" this rendering is. There is none of the tearing that was evident in our first example, but we pay for that with outdated information. In addition, both the actual framerate and the reported framerate are now 60 FPS. The computer ends up doing a lot less work, of course, but that comes at the expense of realized performance, despite the fact that we cannot actually see more than the 60 images the monitor displays every second.

Here, the price we pay for eliminating tearing is an increase in latency from a maximum of 3.3ms to a maximum of 13.3ms. With vsync on a 60Hz monitor, the maximum latency between when a rendered frame is finished and when it is displayed is a full 1/60 of a second (16.67ms), but the effective latency that can be incurred will be higher. Since no more drawing can happen after the next frame to be displayed is finished until it is swapped to the front buffer, the real effect of vsync on latency will be more than a full vertical refresh when rendering takes longer than one refresh to complete.
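
To put rough numbers on that (a back-of-the-envelope sketch using the same 60Hz / 300 FPS example; the exact figures depend on where in the refresh interval a frame happens to finish):

```python
# Back-of-the-envelope numbers for double buffering with vsync at 60Hz.
# Illustrative only; exact latency depends on when a frame finishes rendering.
refresh_ms = 1000.0 / 60       # ~16.67 ms between vertical refreshes
frame_time_ms = 1000.0 / 300   # ~3.33 ms to render a frame in this example

# A finished frame must wait for the next vertical refresh before it is shown,
# so the wait between "render complete" and "on screen" can approach one full refresh.
max_wait_after_completion_ms = refresh_ms

# In the diagram above, the game finishes its frame ~3.33 ms after a swap and then
# stalls, so that frame reaches the screen about one refresh minus one frame time later.
example_latency_ms = refresh_ms - frame_time_ms   # ~13.3 ms, versus ~3.3 ms without vsync

print(f"worst-case wait after completion: ~{max_wait_after_completion_ms:.1f} ms")
print(f"latency in this example:          ~{example_latency_ms:.1f} ms")
```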

Moving on to triple buffering, we can see how it combines the advantages of both double buffering approaches.

[Diagram: triple buffering. The best of both worlds.]

And here we are. We are back down to a maximum of 3.3ms of input latency, but with no tearing. Our actual performance is back up to 300 FPS, but this may not be reported correctly by a frame counter that only monitors front buffer flips. Again, only 60 frames actually get pasted up to the monitor every second, but in this case, each of those 60 frames is the most recent frame fully rendered before its refresh.
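
As a rough sketch of the page-flipping logic just described (illustrative Python only, not any real graphics API; it ignores scanout time and other driver details, and the buffer names are made up):

```python
# Illustrative sketch of triple buffering as described in the article.
# One front buffer is scanned out to the display; the game alternates between
# the two back buffers, so it never has to wait for a refresh. At each vertical
# refresh, the newest completed frame is flipped to the front.

front = "A"      # buffer currently being scanned out to the display
drawing = "B"    # buffer the game is rendering into right now
ready = None     # back buffer holding the newest fully completed frame, if any

def on_frame_completed():
    """The game finished a frame: it becomes the 'ready' frame, and rendering
    immediately continues into whichever back buffer is now unused."""
    global drawing, ready
    previous_ready = ready
    ready = drawing
    # The buffer that is neither the front nor the newly ready one is free to draw into.
    drawing = previous_ready or ({"A", "B", "C"} - {front, ready}).pop()

def on_vertical_refresh():
    """Vertical refresh: flip the newest completed frame to the front, if there is one."""
    global front, ready
    if ready is not None:
        front, ready = ready, None   # the old front buffer rejoins the back-buffer pool

# Five frames finish between two refreshes (as in the 300 FPS / 60Hz example);
# only the newest of them is ever displayed.
for _ in range(5):
    on_frame_completed()
on_vertical_refresh()
print("displayed buffer:", front)
```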

While there may be parts of the frames in double buffering without vsync that are "newer" than corresponding parts of the triple buffered frame, the price that is paid for that is potential visual corruption. The real kicker is that, if you don't actually see tearing in the double buffered case, then those partial updates are not different enough from the previous frame(s) to have really mattered visually anyway. In other words, only when you see the tear are you really getting any useful new information. But how useful is that new information if it only comes with tearing?

Comments

  • DerekWilson - Friday, June 26, 2009 - link

    Really, the argument against including the option is more complex ...

    In the past, that extra memory required might not have been worth it -- using that memory on a 128mb card could actually degrade performance because of the additional resource usage. we've really recently gotten beyond this as a reasonable limitation.

    Also, double buffering is often seen as "good enough" and triple buffering doesn't add any additional eye candy. triple buffering is at best only as performant as double buffering. enabling vsync already eliminates tearing. neither of these options requires any extra work and triple buffering (at least under directx) does.

    Developers typically try to spend time on the things that they determine will be most desired by their customers or that will add the largest impact. Some developers have taken the time to start implementing triple buffering.

    but the "drawback" is development time... part of the goal here is to convince developers that it's worth the development time investment.
  • Frumious1 - Friday, June 26, 2009 - link

    ...this sounds fine for games running at over 60FPS - and in fact I don't think there's really that much difference between double and triple buffering in that case; 60FPS is smooth and would be great.

    The thing is, what happens if the frame rate is just under 60 FPS? It seems to me that you'll still get some benefit - i.e. you'd see the 58 FPS - but there's a delay of one frame between when the scene is rendered and when it appears on the display. You neglected to spell out that the maximum "input latency" is entirely dependent on frame rate... though it will never be more than 17ms I don't think.

    I'm not one to state that input lag is a huge issue, provided it stays below around 20ms. I've used some LCDs that have definite lag (Samsung 245T - around 40ms I've read), and it is absolutely distracting even in normal windows use. Add another 17ms for triple buffering... well, I suppose the difference between 40ms and 57ms isn't all that huge, but neither is desirable.
  • GourdFreeMan - Friday, June 26, 2009 - link

    Derek has already mentioned that the additional delay is at most one screen refresh, not exactly one refresh, but let me add two more points. First, the additional delay will be dependent on the refresh rate of your monitor. If you have a 60 Hz LCD then, yes, it will be ~16.6ms. If you have a 120 Hz LCD the additional delay would be at most ~8.3ms. Second, if you are running without vsync, the screen you see will be split into two regions -- one region is the newly rendered frame, the other will be the previous frame, which will be the same age as the entire frame you would be getting with triple buffering. Running without vsync only reduces your latency if what you are focusing on is in the former.

    Also, we should probably call this display lag, not input lag, as the rate at which input is polled isn't necessarily related to screen refresh (it is for some games like Oblivion and Hitman: Blood Money, however).
  • DerekWilson - Friday, June 26, 2009 - link

    you are right that maximum input latency is very dependent on framerate, but I believe I mentioned that maximum input latency with triple buffering is never more than 16.67ms, while with double buffering and vsync it could potentially climb to an additional 16.67ms due to the fact that the game has to wait to start rendering the next frame. If a frame completes just after a refresh, the game must artificially wait until after the next refresh to start drawing again, giving something like an upper limit of input lag of (frametime + 33.3ms).

    With triple buffering, input lag is no longer than double buffering without vsync /for at least part of the frame/ ... This is never going to be longer than (frametime + 16.7ms) in either case.

    triple buffering done correctly does not add more input lag than double buffering in the general case (even when frametime > 17ms) unless/until you have a tear in the double buffered case. and there again, if the frames are similar enough that you don't see a tear, then there was little need for an update halfway through a frame anyway.

    i tried to keep the article as simple as i could, and getting into every situation of where frames finish rendering, how long frames take, and all that can get very messy ... but in the general case, triple buffering still has the advantages.
  • DerekWilson - Friday, June 26, 2009 - link

    sorry, i meant input lag /due/ to triple buffering is never more than 16.67ms ... but the average case is shorter than this.

    total input lag can be longer than this because frame data is based on input when the frame began rendering so when framerate is less than 60FPS, frametime is already more than 16.67ms ... at 30 FPS, frametime is 33.3ms.
  • Edirol - Friday, June 26, 2009 - link

    The wiki article on the subject mentions that it depends on the implementation of triple buffering. Can frames be dropped or not? Also there may be limitations to using triple buffering in SLI setups.
  • DerekWilson - Friday, June 26, 2009 - link

    i'm not a fan of wikipedia's article on the subject ... they refer to the DX 3 frame render ahead as a form of "triple buffering" ... I disagree with the application of the term in this case.

    sure, it's got three things that are buffers, but the implication in the term triple buffering (just like in the term double buffering) when applied to displaying graphics on a monitor is more specific than that.

    just because something has two buffers to do something doesn't mean it uses "double buffering" in the sense that it is meant when talking about drawing to back buffers and swapping to front buffers for display.

    In fact, any game has a lot more than two or three buffers that it uses in its rendering process.

    The DX 3 frame render ahead can actually be combined with double and triple buffering techniques when things are actually being displayed.

    I get that the wikipedia article is trying to be more "generally" correct in that something that uses three buffers to do anything is "triple buffered" in a sense ... but I submit that the term has a more specific meaning in graphics that has specifically to do with page flipping and how it is handled.
  • StarRide - Friday, June 26, 2009 - link

    Very Informative. WoW is one of those games with inbuilt triple buffering, and the ingame tooltip to the triple buffering option says "may cause slight input lag", which is the reason why I haven't used triple buffering so far. But by this article, this is clearly false, so I will be turning triple buffering on from now on, thanks.
  • Bull Dog - Friday, June 26, 2009 - link

    Now, how do we enable it? And when we enable it, how do we make sure we are getting triple buffering and not double buffering?

    ATI has an option in the CCC to enable triple buffering for OpenGL. What about D3D?
  • gwolfman - Friday, June 26, 2009 - link

    What about nVidia? Do we have to go to the game profile to change this?
