Digging Deeper: Galloping Horses Example

Rather than pull out a bunch of math and traditional timing diagrams, we've decided to put together a more straightforward presentation. The diagrams we will use show the frames of an actual animation as they would be generated over time, as well as what would be seen on the monitor with each method. Hopefully this will help illustrate the quantitative and qualitative differences between the approaches.

Our example is a fabricated one (based on an animation courtesy of Wikipedia) of a "game" rendering a horse galloping across the screen. The basic premise of this timeline is that our game is capable of rendering at 5 times our refresh rate (it can render 5 different frames before a new one gets swapped to the front buffer). The perfectly consistent frame rate is not realistic, as in practice some frames will take longer than others. We cut down on this and other variables for simplicity's sake. We'll talk about timing and lag in more detail based on a 60Hz refresh rate and 300 FPS performance, but we didn't want to clutter the diagram too much with times and labels. Obviously this is a theoretical example, but it does a good job of showing the idea of what is happening.
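
For those who do want the numbers, the timing used throughout this page falls out of the two rates above. The short Python sketch below just does that arithmetic; the variable names are ours and are not part of any graphics API.

```python
# Rates taken from the example in the text.
REFRESH_HZ = 60    # monitor refresh rate
RENDER_FPS = 300   # how fast the "game" can render

refresh_interval_ms = 1000 / REFRESH_HZ        # time between vertical refreshes
frame_time_ms = 1000 / RENDER_FPS              # time to render a single frame
frames_per_refresh = RENDER_FPS // REFRESH_HZ  # frames finished per refresh

print(f"refresh interval: {refresh_interval_ms:.2f} ms")
print(f"frame time: {frame_time_ms:.2f} ms")
print(f"frames rendered per refresh: {frames_per_refresh}")
```

This gives a refresh interval of about 16.67 ms, a frame time of about 3.33 ms, and 5 rendered frames per refresh, which are the figures the rest of the discussion relies on.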

First up, we'll look at double buffering without vsync. In this case, the buffers are swapped as soon as the game is done drawing a frame. This immediately preempts what is being sent to the display at the time. Here's what it looks like in this case:

 


Good performance but with quality issues.


 

The timeline is labeled 0 to 15, and for those keeping count, each step is 3 and 1/3 milliseconds. The timeline for each buffer has a picture on it in the 3.3 ms interval during which a frame is completed, corresponding to the position of the horse and rider at that moment in real time. The large pictures at the bottom of the image represent the image displayed at each vertical refresh on the monitor. The only images we actually see are the frames that get sent to the display. In this case, the benefit of all the other frames is to minimize input lag.

We can certainly see, in this extreme case, what bad tearing could look like. For this quick and dirty example, I chose only to composite three frames of animation, but there could be more or fewer tears in reality. The number of different frames drawn to the screen corresponds to the length of time it takes for the graphics hardware to send the frame to the monitor. This will happen in less time than the entire interval between refreshes, but I'm not well versed enough in monitor technology to know how long that is. I sort of threw my dart at about half the interval being spent sending the frame for the purposes of this illustration (and thus parts of three completed frames are displayed). If I had to guess, I think I overestimated the time it takes to send a frame to the display.
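
To make the "parts of three frames" figure concrete, here is a rough Python sketch of the counting involved. It assumes, as in the illustration, that a new frame is swapped in every 3.33 ms and that sending a frame to the display takes half the refresh interval; the function name and its `start` parameter are invented for this sketch, and real scanout timing will differ.

```python
import math
from fractions import Fraction

FRAME_TIME = Fraction(1000, 300)  # ms per rendered frame (300 FPS)
REFRESH = Fraction(1000, 60)      # ms between refreshes (60Hz)

def frames_during_scanout(scanout, frame_time, start=Fraction(0)):
    """Count the distinct frames that occupy the front buffer while one
    image is being sent to the display, with vsync off.

    Swaps are assumed to happen at every multiple of frame_time. The
    frame already in the front buffer at `start` counts as one, and each
    swap strictly inside the scanout window adds one more (and one tear).
    """
    end = start + scanout
    first_swap = math.floor(start / frame_time) + 1  # first swap after start
    last_swap = math.ceil(end / frame_time) - 1      # last swap before end
    swaps = max(0, last_swap - first_swap + 1)
    return 1 + swaps

# The dart-throw from the text: scanout takes half the refresh interval.
print(frames_during_scanout(REFRESH / 2, FRAME_TIME))
```

With these assumptions the count comes out to three frames, so two tear lines. Starting the scanout at a different point relative to the swaps (the `start` argument) or changing the assumed scanout time changes the count, which is why the real number of tears can be higher or lower.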

For the above, the framerate reported by FRAPS would be 300 FPS, but the actual number of full images that get flashed up on the screen is never more than the refresh rate (in this example, 60 frames every second). The latency between when a frame is finished rendering and when it starts to appear on screen (this is input latency) is less than 3.3ms.

When we turn on vsync, the tearing goes away, but our real performance goes down and input latency goes up. Here's what we see.

 


Good quality, but bad performance and input lag.


 

If we consider each of these diagrams to be systems rendering the exact same thing starting at the exact same time, we can see how far "behind" this rendering is. There is none of the tearing that was evident in our first example, but we pay for that with outdated information. In addition, the actual framerate, like the reported framerate, is 60 FPS. The computer ends up doing a lot less work, of course, but that comes at the expense of realized performance, even though we cannot actually see more than the 60 images the monitor displays every second.

Here, the price we pay for eliminating tearing is an increase in latency from a maximum of 3.3ms to a maximum of 13.3ms in this example. With vsync on a 60Hz monitor, the maximum latency between when a rendering is finished and when it is displayed is a full 1/60 of a second (16.67ms), but the effective latency that can be incurred will be higher. Since no more drawing can happen after the next frame to be displayed is finished until it is swapped to the front buffer, the real effect of latency when using vsync will be more than a full vertical refresh when rendering takes longer than one refresh to complete.
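
The 13.3ms figure can be reproduced with a very small model. The Python sketch below assumes that rendering of a frame starts right at a vertical refresh, that the finished frame then sits in the back buffer until the next refresh, and that nothing else is drawn in the meantime; the function name is made up for illustration.

```python
import math
from fractions import Fraction

FRAME_TIME = Fraction(1000, 300)  # ms per rendered frame (300 FPS)
REFRESH = Fraction(1000, 60)      # ms between refreshes (60Hz)

def vsync_double_buffer(frame_time, refresh):
    """Model one frame under double buffering with vsync.

    Rendering is assumed to start at a refresh (time 0). Returns a tuple
    (wait, displayed_at): how long the finished frame waits for its swap,
    and how long after rendering began it reaches the screen.
    """
    refreshes_needed = math.ceil(frame_time / refresh)
    displayed_at = refreshes_needed * refresh  # first refresh at/after completion
    wait = displayed_at - frame_time           # time spent finished but unseen
    return wait, displayed_at

wait, displayed_at = vsync_double_buffer(FRAME_TIME, REFRESH)
print(f"finished frame waits {float(wait):.1f} ms before it is shown")

# A frame that takes longer than one refresh misses a swap entirely.
slow_wait, slow_displayed_at = vsync_double_buffer(Fraction(20), REFRESH)
print(f"20 ms frame reaches the screen {float(slow_displayed_at):.1f} ms after it was started")
```

For the 3.33 ms frame the wait is 13.3 ms. For the hypothetical 20 ms frame, the image does not appear until 33.3 ms after rendering began, which is the "more than a full vertical refresh" case described above.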

Moving on to triple buffering, we can see how it combines the advantages of the two double buffering approaches.

 


The best of both worlds.


 

And here we are. We are back down to a maximum of 3.3ms of input latency, but with no tearing. Our actual performance is back up to 300 FPS, but this may not be reported correctly by a frame counter that only monitors front buffer flips. Again, only 60 frames actually get pasted up to the monitor every second, but in this case, those 60 frames are the most recent frames fully rendered before the next refresh.
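
The same kind of back-of-the-envelope model works for triple buffering. The Python sketch below assumes the game keeps completing frames back to back and that each refresh simply picks up the newest completed frame; a frame finishing exactly at the refresh is treated as available, which is a simplification. The function name is invented for this sketch.

```python
import math
from fractions import Fraction

FRAME_TIME = Fraction(1000, 300)  # ms per rendered frame (300 FPS)
REFRESH = Fraction(1000, 60)      # ms between refreshes (60Hz)

def triple_buffer_schedule(frame_time, refresh, num_refreshes):
    """For each refresh, report which frame is shown and how stale it is.

    Frames are assumed to complete at every multiple of frame_time, with
    no stalls. Returns a list of (refresh_number, frame_number, latency),
    where latency is the age of the frame at the moment it is swapped in.
    """
    rows = []
    for n in range(1, num_refreshes + 1):
        t = n * refresh
        newest = math.floor(t / frame_time)  # frames completed by time t
        latency = t - newest * frame_time    # age of the newest frame
        rows.append((n, newest, latency))
    return rows

for n, frame, latency in triple_buffer_schedule(FRAME_TIME, REFRESH, 3):
    print(f"refresh {n}: shows frame {frame}, {float(latency):.2f} ms old")
```

In this model the frame shown is never older than one frame time (3.33 ms here), because a newer completed frame would have replaced it. With the idealized rates from the example, every fifth frame lines up exactly with a refresh, so the displayed frames are numbers 5, 10 and 15.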

While there may be parts of the frames in double buffering without vsync that are "newer" than corresponding parts of the triple buffered frame, the price that is paid for that is potential visual corruption. The real kicker is that, if you don't actually see tearing in the double buffered case, then those partial updates are not different enough from the previous frame(s) to have really mattered visually anyway. In other words, only when you see the tear are you really getting any useful new information. But how useful is that new information if it only comes with tearing?


184 Comments


  • Schmide - Friday, June 26, 2009

    My conceptions.

    Triple Buffering is 2 back buffers that alternate a copying (BLT) to a front buffer(primary/screen) while the other is rendering.

    Double Buffering is two surfaces that trade places between front and back buffer by switching states. Only works in full screen mode.

    Back Buffering where one surface is rendered to then copied to the front buffer(primary/screen). Often falsely called Double Buffering.

    Triple Buffering is designed to avoid the surface lock during a copy to the front buffer (BLT) in windowed mode so the next rendering cycle can start early. In full screen mode it just adds an extra step (BLT) in the rendering cycle, since a hardware is swap moves no memory just pointers.

    I would imagine the only reason a Triple Buffer would reduce tearing is, on average the back buffer copy is playing catchup to the primary surface update and the chances of half rendered frames is a bit less.

    So proper use would be

    Double Buffer - Full Screen Rendering.
    Back Buffer - Simple Full/Windowed Rendering
    Triple Buffer - Complex Windowed Rendering.
  • Schmide - Friday, June 26, 2009

    PrinceGaz explained it so I understand below.
  • Schmide - Friday, June 26, 2009

    -"is"

    I want to add. Vsync can be a problem because of the synchronous nature between game code and rendered frames. The more frames you get the better your character moves. If you lock down/cap your frames you may be losing some response.

    Example. In cod4 crash, the wall by the dumpster near the 3 story building, you can only jump over it if your frames get above 125. I assume there is some round off error and Euler like calculations going on.

    The ideal rendering cycle, other than a fixed or capped game play engine, would be: vsync, update, render a frame, do game code without rendering over and over, repeat.
  • DerekWilson - Friday, June 26, 2009

    triple buffering does not use a blit to move a back buffer to a front buffer -- it is still done with buffer renaming.

    i.e. you'll have three pointers: one to the frame currently being rendered, one to the most recently completed frame (these are both back buffers), and one to the front buffer.

    after a vertical refresh completes, if there is not a more recently completed frame than the current front buffer, the current front buffer locks again and the same frame is drawn. If there is a more recently completed frame newer than the one that was just drawn, then this buffer becomes the front buffer and the old front buffer becomes the other back buffer.

    when the GPU finishes rendering into one back buffer, it marks that buffer as the most recently completed and swaps the pointers so that its current buffer was the previous most recently completed buffer that is not the front buffer.

    ...

    i know, clear as mud, right?

    but really, there is no blit involved in a sane triple buffering implementation.
  • nvmarino - Friday, June 26, 2009

    Hey Derek, thanks for the article. Any chance you could provide more detail about the issues with SLI and triple buffering? Such as why it's an issue, can the issues be overcome by game developers or is it an issue at the driver level, and also what are the typical problems an end-user would experience?
  • Compddd - Friday, June 26, 2009

    Or can I turn Vsync off and just leave triple buffering on? Like in L4D or TF2 for instance?
  • DerekWilson - Friday, June 26, 2009

    it is not possible to run triple buffering without vsync.

    the purpose of triple buffering is to provide a buffer that can remain locked during the vertical redraw (so that there is no corruption); this IS vsync.

    but the advantage is that there are still two buffers left over so that you can always save the most recently completed frame while working on the next one (and also not corrupting what is currently being displayed).

    think of it like this: there is one current work space, one most recently completed frame, and one vsync'd buffer.
  • Compddd - Friday, June 26, 2009

    Why do these games like L4D and TF2 have the option to turn off Vsync or Triple buffering then? Or turn them both on, or turn one on and leave the other one off?
  • JonP382 - Saturday, June 27, 2009

    They don't. There's an option to turn on vsync with double buffering, or vsync with triple buffering. Or no vsync.
  • Atechie - Friday, June 26, 2009

    Thanks for showing me why still keeping my 2x21" CRTs is a good choice, so I don't get less IQ, fake black, tearing suckt 60Hz refresh and all the other crap that make LCDs less than stellar for gaming.
