Digging Deeper: Galloping Horses Example

Rather than pull out a bunch of math and traditional timing diagrams, we've decided to put together a more straightforward presentation. The diagrams we will use show the frames of an actual animation as they would be generated over time, as well as what would be seen on the monitor for each method. Hopefully this will help illustrate the quantitative and qualitative differences between the approaches.

Our example (based on an animation courtesy of Wikipedia) is a fabricated "game" rendering a horse galloping across the screen. The basics of this timeline are that our game is capable of rendering at 5 times our refresh rate (it can render 5 different frames before a new one gets swapped to the front buffer). The perfectly consistent frame rate is not realistic, as in practice some frames will take longer than others. We cut down on these and other variables for simplicity's sake. We'll talk about timing and lag in more detail based on a 60Hz refresh rate and 300 FPS performance, but we didn't want to clutter the diagram too much with times and labels. Obviously this is a theoretical example, but it does a good job of showing the idea of what is happening.

First up, we'll look at double buffering without vsync. In this case, the buffers are swapped as soon as the game is done drawing a frame. This immediately preempts what is being sent to the display at the time. Here's what it looks like in this case:

 


Good performance but with quality issues.


 

The timeline is labeled 0 to 15, and for those keeping count, each step is 3 and 1/3 milliseconds. The timeline for each buffer has a picture on it in the 3.3 ms interval during which a frame is completed, corresponding to the position of the horse and rider at that moment in real time. The large pictures at the bottom of the image represent the image displayed at each vertical refresh on the monitor. The only images we actually see are the frames that get sent to the display. The benefit of all the other frames, in this case, is to minimize input lag.
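For those who do want to keep count, the arithmetic behind the timeline can be sketched in a few lines of Python. This is only a back-of-the-envelope check of the numbers used in our example (60Hz refresh, 300 FPS rendering, 15 timeline steps); the variable names are our own and nothing here models real hardware.

```python
from fractions import Fraction

REFRESH_HZ = 60       # monitor refresh rate used in the example
RENDER_FPS = 300      # idealized, perfectly steady render rate
TIMELINE_STEPS = 15   # the diagram is labeled 0 to 15

# Exact fractions avoid rounding noise in values like 3 1/3 ms.
refresh_interval_ms = Fraction(1000, REFRESH_HZ)           # 50/3 ms between refreshes
frame_time_ms = Fraction(1000, RENDER_FPS)                 # 10/3 ms per rendered frame
frames_per_refresh = refresh_interval_ms / frame_time_ms   # frames finished per refresh
timeline_ms = TIMELINE_STEPS * frame_time_ms               # total time the diagram covers

print(f"refresh interval: {float(refresh_interval_ms):.2f} ms")
print(f"frame time:       {float(frame_time_ms):.2f} ms")
print(f"frames/refresh:   {frames_per_refresh}")
print(f"timeline length:  {float(timeline_ms):.1f} ms")
```

In other words, each diagram spans 50 ms, or three vertical refreshes, with five rendered frames per refresh.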

We can certainly see, in this extreme case, what bad tearing could look like. For this quick and dirty example, I chose only to composite three frames of animation, but there could be more or fewer tears in reality. The number of different frames drawn to the screen corresponds to the length of time it takes for the graphics hardware to send the frame to the monitor. This will happen in less time than the entire interval between refreshes, but I'm not well versed enough in monitor technology to know how long that is. I sort of threw my dart at about half the interval being spent sending the frame for the purposes of this illustration (and thus parts of three completed frames are displayed). If I had to guess, I think I overestimated the time it takes to send a frame to the display.
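To make the "parts of three completed frames" figure concrete, here is a minimal sketch of how the count falls out of the scanout time. It assumes a perfectly steady render rate, a buffer swap landing exactly as scanout begins, and a scanout that lasts a fixed fraction of the refresh interval (one half is the dart throw from above). The function name and parameters are hypothetical, not part of any graphics API.

```python
from fractions import Fraction
import math

def frames_in_scanout(render_fps, refresh_hz, scanout_fraction):
    """Count the distinct frames that end up in one displayed image without vsync.

    Assumptions (ours, for illustration): frames complete at a steady rate,
    one swap coincides with the start of scanout, and scanout takes
    `scanout_fraction` of the refresh interval.
    """
    frame_time = Fraction(1, render_fps)
    scanout_time = Fraction(scanout_fraction) / refresh_hz
    # Every swap that lands strictly inside the scanout window adds one tear.
    swaps_during_scanout = math.ceil(scanout_time / frame_time) - 1
    return swaps_during_scanout + 1

# Half the refresh interval spent sending the frame, as in the illustration.
print(frames_in_scanout(300, 60, Fraction(1, 2)))
```

Under these assumptions a shorter scanout yields fewer tears, which is why an overestimated scanout time would also overstate the tearing.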

For the above, the framerate reported by FRAPS would be 300 FPS, but the actual number of full images that get flashed up on the screen is never more than the refresh rate (in this example, 60 frames every second). The latency between when a frame is finished rendering and when it starts to appear on screen (this is input latency) is less than 3.3ms.

When we turn on vsync, the tearing goes away, but our real performance goes down and input latency goes up. Here's what we see.

 


Good quality, but bad performance and input lag.


 

If we consider each of these diagrams to be systems rendering the exact same thing starting at the exact same time, we can see how far "behind" this rendering is. There is none of the tearing that was evident in our first example, but we pay for that with outdated information. In addition, both the actual framerate and the reported framerate are 60 FPS. The computer ends up doing a lot less work, of course, but that comes at the expense of realized performance, despite the fact that we cannot actually see more than the 60 images the monitor displays every second.

In this example, the price we pay for eliminating tearing is an increase in latency from a maximum of 3.3ms to a maximum of 13.3ms. In general, with vsync on a 60Hz monitor, the maximum latency between when a frame is finished rendering and when it is displayed is a full 1/60 of a second (16.67ms), but the effective latency that can be incurred will be higher. Since no more drawing can happen after the next frame to be displayed is finished until it is swapped to the front buffer, the real effect of latency when using vsync will be more than a full vertical refresh when rendering takes longer than one refresh to complete.
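The 13.3ms figure can be reproduced with a simple model: rendering begins right at a refresh, takes exactly one frame time, and the finished frame then sits in the back buffer until the next refresh. The sketch below encodes only that model; the function name is hypothetical, and real games with variable frame times will not behave this neatly.

```python
from fractions import Fraction
import math

def vsync_wait_and_rate(render_fps, refresh_hz):
    """Model double buffering with vsync under idealized, steady frame times.

    Returns (wait_ms, displayed_fps): how long a finished frame waits for its
    swap, and how many new frames per second actually reach the display.
    Assumes rendering starts exactly at a refresh (our simplification).
    """
    refresh_interval = Fraction(1000, refresh_hz)
    render_time = Fraction(1000, render_fps)
    # The swap happens on the first refresh at or after the frame is done.
    refreshes_needed = math.ceil(render_time / refresh_interval)
    wait_ms = refreshes_needed * refresh_interval - render_time
    displayed_fps = Fraction(refresh_hz, refreshes_needed)
    return wait_ms, displayed_fps

wait_ms, displayed_fps = vsync_wait_and_rate(300, 60)
print(f"{float(wait_ms):.1f} ms wait at {displayed_fps} FPS displayed")
```

Trying a render rate just below the refresh rate (say 50 FPS on a 60Hz display) shows the other cost in this model: the frame misses one refresh, so only every second refresh gets a new image.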

Moving on to triple buffering, we can see how it combines the best advantages of the two double buffering approaches.

 


The best of both worlds.


 

And here we are. We are back down to a maximum of 3.3ms of input latency, but with no tearing. Our actual performance is back up to 300 FPS, but this may not be reported correctly by a frame counter that only monitors front buffer flips. Again, only 60 frames actually get pasted up to the monitor every second, but in this case, those 60 frames are the most recent frames fully rendered before the next refresh.
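As a rough check on the 3.3ms bound, we can model triple buffering as "at every refresh, show the newest frame that has finished." The sketch below assumes steady frame times and lets the render loop be offset from the refresh by an arbitrary phase; the function name and the `phase_ms` parameter are our own inventions for illustration.

```python
from fractions import Fraction

def triple_buffer_ages_ms(render_fps, refresh_hz, phase_ms, refreshes=3):
    """Age of the displayed frame at each refresh under idealized triple buffering.

    Frames are assumed to complete at phase_ms, phase_ms + frame_time, and so
    on (0 <= phase_ms < frame_time). At each refresh the newest completed
    frame is flipped to the front buffer.
    """
    frame_time = Fraction(1000, render_fps)
    refresh_interval = Fraction(1000, refresh_hz)
    ages = []
    for k in range(1, refreshes + 1):
        refresh_time = k * refresh_interval
        # Whole frame times elapsed since the first completion, rounded down.
        completed = (refresh_time - phase_ms) // frame_time
        newest_done = phase_ms + completed * frame_time
        ages.append(refresh_time - newest_done)
    return ages

for age in triple_buffer_ages_ms(300, 60, Fraction(1)):
    print(f"{float(age):.2f} ms old when displayed")
```

Whatever phase is chosen, the displayed frame in this model is always less than one frame time (3.3ms here) old, because a newer frame would otherwise have finished in the meantime.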

While there may be parts of the frames in double buffering without vsync that are "newer" than corresponding parts of the triple buffered frame, the price that is paid for that is potential visual corruption. The real kicker is that, if you don't actually see tearing in the double buffered case, then those partial updates are not different enough from the previous frame(s) to have really mattered visually anyway. In other words, only when you see the tear are you really getting any useful new information. But how useful is that new information if it only comes with tearing?

184 Comments

  • profoundWHALE - Monday, January 19, 2015 - link

    You'll need backlight strobing to get CRT-like performance on LCDs. Take a look at http://www.blurbusters.com/
  • texkill - Friday, June 26, 2009 - link

    First, let me sum up the actual advantage of triple buffering: smoothing out variable draw times when game framerate < monitor refresh. That's it.

    This article severely overstates the case for triple buffering when it says "there is an option that combines the best of both worlds with no sacrifice in quality or actual performance." Okay so you want "the best of both worlds" which would be no tearing and minimum input lag? And the example used to prove this is 300 fps on 60hz. Well guess what, I can give you the best of both worlds with something called "waiting a while." See those horse figures at the beginning of each frame in the double-buffer figure? Move them from the beginning of the frame to near the end and voila, input lag is looking good again.

    But actually it gets even better when you add multithreading to a double-buffered solution. Now you not only don't have to draw frames that will *never be seen by any living creature on Earth* (not the default behavior in DirectX btw), you can actually make use of the CPU time that would otherwise be spent in the graphics api to do something useful like physics or AI. You also then don't need to have frames that are drawing when the v-sync happens and causing the input lag and smoothness to vary every single frame (again, not the default DX behavior).

    Triple buffering has its place when drawing times vary and smooth animation is desired. But it should definitely not be blindly demanded of all game developers when most of them already know the tradeoffs and have already made very good judgments on this decision.
  • DerekWilson - Friday, June 26, 2009 - link

    this is more of an additional advantage. without vsync, double buffering still starts drawing the same frame that triple buffering would start drawing but changes frames in between. throw in vsync and you still get a doubling of worst case added input lag (and an increase in average case input lag too).

    and it's not about drawing the frames that will never be seen -- it's about not seeing frames that are outdated when newer frames can be finished before the next refresh (reducing input lag).

    multithreading still helps triple buffering ... i don't see why that even enters into the situation.

    the game can't know for sure how long a frame will take to render when it starts rendering (otherwise it would know how long it could wait to start the process so that the frame is as new as possible before the next refresh). there is no way to avoid having frames that are being worked on during a vertical refresh.
  • JarredWalton - Friday, June 26, 2009 - link

    VSYNC is really the absolutely worst solution to this problem in my opinion. Let's say you have a game that runs at ~75FPS on average on your system, with VSYNC off. Great. Enable triple buffering and you still get 75FPS average, though some frames will never be seen. Use double buffering with VSYNC and you'll render 60FPS... ideally, at least.

    The problem with VSYNC is that you get lower minimum frame rates, and those become very noticeable. If you're running at 60FPS most of the time, then drop to 30FPS or 20FPS or 15FPS (notice how all of those are an even divisor of 60), those lows become even more distracting. Far more common, unfortunately, is that maintaining 60FPS with many games is very difficult, even with high-end hardware. Rather than getting a smooth 60FPS, what you usually end up with is 30FPS.

    Finally, in cases where the frame rate is much higher than the refresh rate, triple buffering does give you reduced image latency relative to double buffering with VSYNC - though as Derek points out it still has a worst case of 16.7ms (lower than double with VSYNC).
  • zulezule - Friday, June 26, 2009 - link

    Your comment made me realize that I'd prefer my GPU to render the 60 vsync-ed frames and stay cool, instead of rendering 300 fps (out of which 4/5 are useless), overheat, become noisy and maybe even crash. The only case when I'd want more frames rendered would be when they are used to insert something in the one visible frame, as for example if the 4 invisible frames are averaged with the visible one to create motion blur. However, I'm pretty sure beautiful motion blur can be obtained much more easily.
  • DerekWilson - Friday, June 26, 2009 - link

    The advantages still exist at a sub 60 FPS level. I just chose 300 FPS to illustrate the idea more easily.

    At less than 60 FPS, the triple buffered case still shows the same performance as double buffering -- they both start rendering the same frame after a refresh. double buffering with vsync still adds more input lag on average than the other cases.
  • Mills - Friday, June 26, 2009 - link

    You made a good case of something currently impossible (if I understand you correctly) being better than triple buffering but I don't see where you made the case that triple buffering isn't better than double buffering in the case of FPS being much greater than refresh rate.

    The point is, when we are given a choice between double and triple, is there a reason not to choose triple?
  • texkill - Friday, June 26, 2009 - link

    What's impossible about it?

    Yes, there are drawbacks to triple buffering. Implement it the way DirectX does by default and you get input lag. Implement it the way the article suggests and you get wasted cpu and jerky animation. And either way you are sacrificing video memory that could have been used for something else.
  • DerekWilson - Friday, June 26, 2009 - link

    1) DirectX does not implement triple buffering (render-ahead is not the same and should not be referred to as "triple buffering" when set to 3 frames). The way to think of the DX mess is that they set up a queue for the back buffer, but there is only one real back buffer and one front buffer (even with 3 frame render ahead, it is essentially double buffered if we're talking about page flipping).

    2) The triple buffering approach described in this article is the only thing that should actually be called "triple buffering" if we are contrasting it with "double buffering" and referring to page flipping. Additionally, it does not create jerky animation -- the animation will be much smoother than either double buffering with or without vsync (either because frames have less lag or because they don't tear).
  • toyota - Friday, June 26, 2009 - link

    yeah it makes me wonder why both card companies dont even allow it straight from the cp for DX games if there are no drawbacks. also it seems like all game developers would incorporate it in their games if again there were no drawbacks.
