Scanout and the Display

Alright. So depending on the game, we are up to somewhere between 13ms and 58ms after our mouse was moved. The GPU just finished rendering and swapped the finished frame to the front buffer. What happens next is called scanout: the frame is sent out the DVI-I port over the cable and to the monitor.

If our monitor's refresh rate is 60Hz (as is typical these days), it will actually take something like 16ms to send the full frame to the monitor (plus there's about half a millisecond of "blanking" between frames being sent), giving us 16.67ms of transmission delay. In this case we are limited by the bandwidth capabilities of DVI, HDMI and DisplayPort and the timing standards put forth by VESA. So to send a full frame of anything to the display we will have 16.67ms of input lag added. Some monitors will display this data as it is received, but others will latch input, meaning the full frame must be sent before it can be displayed (but let's not get too far ahead of ourselves). Either way, we will consider the latency of this step to be at least one frame, as the monitor will still take 16ms to draw the image.
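To see where the roughly 16ms of transmission and the fraction of a millisecond of blanking come from, here is a minimal sketch that derives them from display timing totals. The numbers plugged in (a 148.5 MHz pixel clock, 2200 total pixels per line, 1125 total lines, 1080 active lines) are the common 1920x1080 at 60Hz timing and are an assumption for illustration; other modes will split the refresh period slightly differently.

```python
def scanout_times_ms(pixel_clock_hz, h_total, v_total, v_active):
    """Return (frame_ms, active_ms, blanking_ms) for one refresh."""
    # Time to send one line, including its horizontal blanking.
    line_ms = h_total / pixel_clock_hz * 1000.0
    frame_ms = line_ms * v_total    # whole refresh period
    active_ms = line_ms * v_active  # lines that carry visible pixels
    return frame_ms, active_ms, frame_ms - active_ms


# Assumed timing: the common 1920x1080 @ 60Hz mode.
frame_ms, active_ms, blanking_ms = scanout_times_ms(148.5e6, 2200, 1125, 1080)
print(f"frame {frame_ms:.2f} ms, active {active_ms:.2f} ms, "
      f"vertical blanking {blanking_ms:.2f} ms")
```

Under those assumed timings the visible lines take about 16ms and the vertical blanking interval makes up the rest of the 16.67ms refresh period.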

So now we need to talk about vsync. Let's pretend we aren't using it. Let's pretend our game runs at a rock solid exact 60 FPS and our refresh rate is 60Hz, but the buffer swap happens halfway between each vertical sync. This means every frame being scanned out would be split down the middle. The top half of the frame will be an additional 16.67ms behind (for a total of 33.3ms of lag). Of course, the bottom half, while 16.67ms newer than the top, won't have its own top half sent until the next frame 16.67ms later.

In this particular case, the math works out so that if we average the latency of all the pixels on a split frame, we get the same average latency as if we had enabled vsync. Unfortunately, when framerate is either higher or lower than refresh rate, vsync has the potential to cause tons of problems and this equivalence doesn't hold at all.
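The averaging claim can be checked with a small model. This is only a sketch under simplifying assumptions: the game runs at exactly the refresh rate, blanking is ignored, and latency is measured from the buffer swap to the moment each scanline of that frame is sent. The function names and the `swap_phase` parameter (how far through the scanout the swap lands, 0 = at vsync, 0.5 = halfway down) are made up for this illustration.

```python
def scanline_latency_ms(line_frac, swap_phase, refresh_hz=60.0):
    """Delay from swap until the line at line_frac (0=top, 1=bottom) is sent."""
    period_ms = 1000.0 / refresh_hz
    delta = line_frac - swap_phase
    if delta < 0.0:
        # Lines above the tear were already sent from the old frame,
        # so this frame's copy goes out on the next refresh.
        delta += 1.0
    return delta * period_ms


def average_latency_ms(swap_phase, refresh_hz=60.0, lines=1080):
    """Average the per-line delay over every scanline of one frame."""
    total = 0.0
    for i in range(lines):
        line_frac = (i + 0.5) / lines  # sample each line at its center
        total += scanline_latency_ms(line_frac, swap_phase, refresh_hz)
    return total / lines


vsync_avg = average_latency_ms(0.0)  # swap exactly at vsync
torn_avg = average_latency_ms(0.5)   # swap halfway through scanout
print(f"vsync-aligned: {vsync_avg:.2f} ms, mid-frame swap: {torn_avg:.2f} ms")
```

In this model both averages come out to half a refresh period. The tear only changes which lines are fresh and which are stale, not the mean.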

If our frametime is just longer than 16.67ms with vsync enabled, we will add a full additional frame of latency (with no work being done on the GPU) before we are able to swap the finished buffer to the front for scanout. The wasted time can cause our next frame not to come in before the next vsync, giving us up to two frames of latency (one because we wait to swap and one because of the delay in starting the next frame). If our framerate is higher than 60 FPS, our GPU will have to stop working after rendering until the next vsync. This is a waste of resources and decreases overall performance, but definitely not by as much as if we use vsync at less than the monitor refresh. The upper limit of additional delay is 16.67ms minus frametime (less than one frame) rather than two full frames.
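A quick way to see both halves of this is to compute how long a finished frame sits idle before the next vsync. The sketch below assumes plain double-buffered vsync and that rendering starts exactly on a vsync; `vsync_wait_ms` is a name invented for this example.

```python
import math


def vsync_wait_ms(frame_time_ms, refresh_hz=60.0):
    """Idle time between finishing a frame and the vsync where it can swap."""
    period_ms = 1000.0 / refresh_hz
    # Number of refresh periods that must elapse before the swap can happen.
    refreshes_needed = math.ceil(frame_time_ms / period_ms)
    return refreshes_needed * period_ms - frame_time_ms


slow = vsync_wait_ms(17.0)  # just misses the first vsync
fast = vsync_wait_ms(10.0)  # finishes early and waits for the first vsync
print(f"17ms frame waits {slow:.2f} ms, 10ms frame waits {fast:.2f} ms")
```

The 17ms frame just misses its vsync and idles for almost a whole refresh, while the 10ms frame waits only 16.67ms minus its frametime.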

When framerate is lower than refresh rate, using either a 1 frame flip queue with vsync or triple buffering will allow the graphics hardware to continue doing rendering work while adding between 0 and 16.67ms of additional latency (the average will be between the two extremes). So you get the potential benefits of vsync (no tearing and synchronization) without the additional decrease in performance that occurs when no work gets done on the GPU. At framerates higher than refresh rate, when using a render queue, we do end up adding an additional frame of latency per number of frames we render ahead, so this solution isn't a very good one for mitigating input latency (especially in twitch shooters) in high framerate games.
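The performance difference at framerates below refresh can be sketched by comparing how often a new frame reaches the screen. This is a simplified model, not a description of any particular driver: with double-buffered vsync the GPU is assumed to stall until the swap, while with triple buffering (or a 1 frame flip queue) it is assumed to keep rendering. Both function names are hypothetical.

```python
import math


def double_buffered_interval_ms(frame_time_ms, refresh_hz=60.0):
    """Each frame occupies a whole number of refreshes because the GPU stalls."""
    period_ms = 1000.0 / refresh_hz
    return math.ceil(frame_time_ms / period_ms) * period_ms


def triple_buffered_interval_ms(frame_time_ms, refresh_hz=60.0):
    """Average interval between new frames on screen when the GPU never stalls."""
    period_ms = 1000.0 / refresh_hz
    # The display cannot show new frames faster than it refreshes.
    return max(frame_time_ms, period_ms)


double_ms = double_buffered_interval_ms(20.0)
triple_ms = triple_buffered_interval_ms(20.0)
print(f"double buffered: {1000.0 / double_ms:.0f} FPS, "
      f"triple buffered: {1000.0 / triple_ms:.0f} FPS")
```

With a 20ms frametime, the stalled double-buffered case falls to one new frame every two refreshes, while the triple-buffered case keeps delivering a new frame every 20ms on average.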

Once the data is sent to the monitor, we've got more delay in store.

We've already mentioned that some LCDs latch the entire frame before display. Beyond this delay, some displays will perform image processing on the input (including scaling if this is not done on the graphics hardware). In some cases, monitors will save two frames to overdrive LCD cells to get them to respond faster. While this can improve the speed at which the picture on the monitor changes, it can add another 16.67ms to 33.3ms of latency to the input (depending on whether one frame is processed or two). Monitors with a game mode or true 120Hz monitors should definitely add less input lag than monitors that require this sort of processing.

Add, on top of all this, the fact that it will take between 2ms and 16ms for the pixels on the LCD to actually switch (response time varies between panels and depending on what levels the transition is between) and we are done: the image is now on the screen.

So what do we have total after the image is flipped to the front buffer?

One frame of lag for transmission (to display a full frame); up to one frame of lag if we enable triple buffering (or use 1 frame of render ahead and run at less than refresh rate); up to two frames of lag if we just turn on vsync; an additional frame of lag for every frame we render ahead with vsync on at framerates higher than the refresh rate; and zero to two frames of lag for the monitor to display the image (if it does extensive image processing).

So after crazy speed from the mouse to the front buffer, here we are waiting ridiculous amounts of time to get the image to appear on the screen. We add at the very least 16.67ms of lag in this stage. At worst we're taking on between 66.67ms and 83.3ms, which is totally unacceptable. And that's after the computer is completely done working on the image.
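The best and worst cases quoted here are just sums of whole refresh periods. As a rough sketch, assuming a 60Hz display and treating each contribution as a count of frames (the function and parameter names are invented for this example):

```python
def display_stage_lag_ms(sync_frames=0.0, monitor_frames=0.0, refresh_hz=60.0):
    """Lag added after the flip: transmission + sync waiting + monitor processing."""
    period_ms = 1000.0 / refresh_hz
    transmission_frames = 1.0  # a full frame always has to be sent
    return (transmission_frames + sync_frames + monitor_frames) * period_ms


best = display_stage_lag_ms()                                    # no vsync, no processing
worst_low = display_stage_lag_ms(sync_frames=2, monitor_frames=1)
worst_high = display_stage_lag_ms(sync_frames=2, monitor_frames=2)
print(f"best {best:.2f} ms, worst {worst_low:.2f} to {worst_high:.2f} ms")
```

Four frames and five frames at 60Hz are one way to arrive at the 66.67ms and 83.3ms figures; render ahead at high framerates would add a further frame per queued frame on top of this.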

This brings our totals up to about 33ms to 80ms of input lag for typical cases. Our worst case for what we've outlined, however, is about 135ms of latency between mouse movement and final display, which could be discernible and might start to feel mushy. Sometimes game developers stray a bit and incur a little more input lag than is reasonable. Oblivion and Fallout 3 come to mind.

But don't worry, we'll take a look at some specific cases next.

83 Comments

  • DerekWilson - Monday, July 20, 2009

    This is how we disable vsync.

    We got the same results in lag with present interval set to either 1 or 0 ... it really didn't make a measurable difference in our testing.
  • DerekWilson - Monday, July 20, 2009

    To clarify a little, this is why I think that Gamebryo (or Bethesda) must do some sort of internal timing that strictly enforces framerate, CPU time, or something based on some other factor than present interval.
  • NetSoerfer - Monday, July 20, 2009

    On page 5, the fifth paragraph begins with "If our frametime is just longer than 16.67ms...". The next paragraph is meant to describe the opposite but begins with "When framerate is lower than refresh rate...".

    Longer frametime equals lower framerate. The second paragraph should read "When framerate is higher than refresh rate..." or "When frametime is shorter than refresh rate...".
  • DerekWilson - Monday, July 20, 2009

    No, the next paragraph is not meant to describe the opposite case ...

    The first paragraph you cite describes the effects of double-buffered vsync on framerates both lower than refresh (first half of the paragraph) and higher than refresh (second half of the paragraph).

    The second paragraph you cite describes the effects of a 1 frame flip-queue with vsync or triple buffering on framerates that are lower than refresh.

    Sorry if that wasn't clear.
  • Per Hansson - Sunday, July 19, 2009

    Hi, I tried your recommendation with "overclocking" the mouse (erm, we are really just changing the speed of the USB port, not the mouse right?)

    Anyway, I've got a MS IntelliMouse Explorer v3.0
    When I run "Direct Input Mouse Rate" it shows my lag as 8ms at 125hz...

    So I used the driver hidusbf and changed the frequency to 1000hz, this resulted in 1.4ms and 700hz with my mouse...

    But now to begin with I had the mouse speed set to max in the Intellipoint mouse setup, and also "enhance pointer precision" enabled...

    And at 125hz / 8ms lag that gave me a good speed, a bit slower than I had in Win2K but still acceptable (current os is XP x64)
    But now with my "overclocked" mouse the movement is waaay too slow, I need a bigger mousepad to move the mousepointer all across my monitor
    Is this intended or just due to MS drivers or whatever?

    I was planning on getting the Microsoft Habu gaming mouse developed by Razer because the current iteration of the Explorer 3.0 is a POS with crap microbuttons that keep failing, think I've been through 3 of these in the last 2 years, even replaced them with ones bought at Elfa but they also failed after a couple months
    Anyway, will all mice have this speed issue at high mouse rates? (above 125hz)
  • MarktheC - Monday, July 27, 2009

    Re: "But now with my "overclocked" mouse the movement is waaay too slow, I need a bigger mousepad to move the mousepointer all across my monitor. Is this intended or just due to MS drivers or whatever?"

    Yes, this is "how it works" (but it can be fixed).

    What's happening is this: At 125 Hz and a given on-the-pad mouse speed, each mouse report might be returning (say) 16 counts/report.
    The XP/Vista/7 "Enhance pointer precision" code uses the "16" value to lookup an acceleration curve (SmoothMouseXCurve/SmoothMouseYCurve) and apply a scaling factor to the mouse input (approx x 1.4 when the mouse count is 16). The pointer moves ~1.4 * 16 = ~22 pixels.

    If the report rate is changed to 1000 Hz, each mouse report returns 2, 2, 2, 2, 2, 2, 2, 2 instead (same gross movement of 16, but spread over 8 times as many reports). Now the XP/Vista/7 "Enhance pointer precision" code uses "2" to look up the acceleration curve and returns a scaling factor (~0.6 when the mouse count is 2). The pointer moves ~0.6 * 2 * 8 = ~9 pixels and you perceive the mouse as slow.
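The arithmetic in this explanation can be reproduced in a few lines. This is only an illustration using the two approximate scale factors quoted in the comment (about 1.4 at 16 counts and about 0.6 at 2 counts); the real curve is stored in SmoothMouseXCurve/SmoothMouseYCurve, and the lookup table and function name here are hypothetical stand-ins for it.

```python
# Approximate scale factors taken from the comment above, keyed by
# counts per report. A stand-in for the real acceleration curve.
SCALE_BY_COUNT = {16: 1.4, 2: 0.6}


def pointer_pixels(reports):
    """Total pointer travel when each report is scaled by its own count."""
    return sum(count * SCALE_BY_COUNT[count] for count in reports)


at_125hz = pointer_pixels([16])      # one report carrying 16 counts
at_1000hz = pointer_pixels([2] * 8)  # same 16 counts spread over 8 reports
print(f"125 Hz: ~{at_125hz:.0f} px, 1000 Hz: ~{at_1000hz:.0f} px")
```

The physical movement is identical in both cases; only the per-report count that feeds the curve changes, which is why the pointer feels slower at the higher report rate.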

    This is (somewhat) described here:
    http://www.codinghorror.com/blog/archives/000977.h...
    http://www.microsoft.com/whdc/archive/pointer-bal....

    BUT Microsoft made a silly design mistake!:
    http://donewmouseaccel.blogspot.com/2009/06/out-of...

    A solution is to tweak the Registry: the SmoothMouseXCurve and SmoothMouseYCurve values under HKEY_CURRENT_USER\Control Panel\Mouse.
    Treat each group of 4 bytes as a 32-bit integer, and divide by 8 (for 1000 Hz). AFAIK, doing this for both SmoothMouseYCurve & SmoothMouseXCurve should return the acceleration back to normal.

    A BETTER solution may be to stick with "Enhance pointer precision" and 125 Hz for normal Windows work, and use 1000 Hz only for gaming AND TURN OFF "Enhance pointer precision" when gaming (if required by the game: most modern games use DirectX to read the mouse, which ignores the "Enhance pointer precision" checkbox anyway).

    Re: "I was planning on getting the Microsoft Habu ... will all mouse have this speed issue at high mouse rates? (above 125hz)"

    I don't know: I expect the Habu driver will do the right thing and not need any fix as above, but I don't know...
  • DerekWilson - Monday, July 20, 2009

    Actually ... the report / second rate should have zero impact on the speed of the pointer. I do say should -- something odd could be happening like it could be dropping counts in order to assemble reports that fast (i.e. your mouse could be too overclocked and might be doing things wrong). But I am not a hardcore mouse overclocker myself so I'd do a little research on it.

    I would recommend, if your mouse can't actually hit 1000Hz, to drop it down to 500 reports/second instead of 1000 ... it should be more consistent that way, and maybe it will fix your pointer speed issue.

    The CPI (reported as DPI) will have an impact on pointer speed. But so will things like setting mouse speed to maximum and using "enhance pointer precision" ... though these latter two don't really have desirable results.

    I strongly recommend leaving mouse speed at the middle notch ... setting it higher actually skips pixels (though "enhance pointer precision" makes your mouse able to move one pixel at a time if you move it really slowly). And I also recommend not using "enhance pointer precision" ...

    These MS pointer ballistics can cause problems in older games, but if the developer did the "right" thing and used either DirectInput or raw input devices then the pointer speed settings shouldn't affect games (only the sensitivity slider in the game should affect pointer speed if it's done right). In most cases going forward you should be able to use the OS to manipulate your pointer speed without negatively impacting your game ... but there is a chance that these settings could negatively impact your gaming experience if the developer used a less desirable way to access the mouse data.
  • Per Hansson - Monday, July 20, 2009

    Thanks, the behaviour is the same at 250hz and 500hz
    Those rates just slow down the mouse more...

    There would be no way at all that I could set the mouse speed slider to the middle and get used to that, same for not having enhance pointer precision on

    Guess sometimes you just can't win eh? ;)
    In fact I was quite annoyed by the change in ballistics going from Win2K which supported acceleration which I used and really liked to WinXP which only has this "enhance pointer precision" option
  • valnar - Sunday, July 19, 2009

    "It is possible to overclock your mouse."

    Now I've seen everything. :)
  • DerekWilson - Sunday, July 19, 2009

    It was bound to happen wasn't it?

    This has been around for a few years now, but (for obvious reasons) never made it into the mainstream gaming community. And, really, now that high performance mice are much more available it isn't as much of an issue.
