Combating Input Lag and Final Words

Now that we know what's going on and what the factors are, what can we do about it?

Sometimes 1ms can be the difference between your input reaching the software in time to be included in the next frame; most of the time it won't matter. Of course, the difference between an 8ms and a 2ms polling interval (125Hz vs. 500Hz) could actually make a frame of difference (up to 16.67ms) in input lag. A mouse that can handle 500 reports per second is what we recommend as a good balance.
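To put those polling numbers in perspective, here is a quick back-of-the-envelope calculation (a rough worst-case sketch, not a measurement) comparing the maximum delay a USB mouse report can add against a 60Hz frame interval:

```python
# Worst-case delay added by mouse polling before the input even reaches the
# game, compared against a 60Hz frame interval.
FRAME_TIME_MS = 1000.0 / 60.0  # ~16.67ms between frames at 60Hz

for polling_hz in (125, 500, 1000):
    interval_ms = 1000.0 / polling_hz  # worst-case wait for the next USB report
    share_of_frame = interval_ms / FRAME_TIME_MS
    print(f"{polling_hz:4d} Hz polling: up to {interval_ms:5.2f} ms added "
          f"({share_of_frame:.0%} of a 60Hz frame)")
```

At 125Hz the worst case (8ms) is nearly half a frame, which is enough to occasionally push input into the next frame; going from 500Hz to 1000Hz only shaves off another millisecond.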

It is possible to overclock your mouse. You will still be limited by the physical capabilities of the mouse, but running the USB port for the mouse at a higher rate can help, especially if you don't want to invest in a more expensive mouse. There are tools available both to check your mouse report rate (Direct Input Mouse Rate) and to change the rate by replacing the USB driver. When changing the rate on Vista SP1 or Windows 7, drivers will need to be signed. This can be accomplished by using test signatures and forcing Windows to load them. NGOHQ offers a good tutorial on this.
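For those who would rather sanity check their report rate outside of Windows, something like the sketch below can approximate it on Linux with the python-evdev package (the package choice and the device path are our assumptions; adjust both for your system, and note that reading input devices usually requires appropriate permissions):

```python
# Rough mouse report-rate check on Linux using python-evdev (pip install evdev).
# The device path is a hypothetical example -- find your mouse with
# `python -m evdev.evtest` or by reading /proc/bus/input/devices.
import time
from evdev import InputDevice, ecodes

DEVICE_PATH = "/dev/input/event3"  # assumed path, adjust for your system
SAMPLE_SECONDS = 5.0

dev = InputDevice(DEVICE_PATH)
reports = 0
start = time.monotonic()

print(f"Move the mouse continuously for about {SAMPLE_SECONDS:.0f} seconds...")
for event in dev.read_loop():
    # Each SYN_REPORT marks the end of one complete report from the device.
    if event.type == ecodes.EV_SYN and event.code == ecodes.SYN_REPORT:
        reports += 1
    if time.monotonic() - start >= SAMPLE_SECONDS:
        break

elapsed = time.monotonic() - start
print(f"~{reports / elapsed:.0f} reports/second while the mouse was moving")
```

The estimate only makes sense while the mouse is in constant motion, since an idle mouse simply stops sending reports.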

CPU, memory and GPU will all impact the input lag between the mouse and the display. The GPU, as the main internal bottleneck in games, will likely have the largest single impact (higher framerate means less time between frames and less lag), but this is heavily dependent on game design. The basic recommendation is a modestly priced dual core CPU, inexpensive RAM, and a fast GPU. Faster CPUs and RAM could potentially help but are not likely to provide a large return on investment in this case.

For input lag reduction in the general case, we recommend disabling vsync. For NVIDIA card owners running OpenGL games, forcing triple buffering in the driver will provide a better visual experience with no tearing and will always start rendering the same frame that would start rendering with vsync disabled. Only input sampled after the point where a tear would otherwise have appeared carries extra latency, and even then by less than a full frame.

Unfortunately, all other implementations that call themselves triple buffering are actually one frame flip queues at this point. One frame of render-ahead is fine at framerates lower than the monitor refresh, but if the framerate ever goes past refresh you will experience much more input lag than with vsync alone. For everyone without multiGPU solutions, we recommend setting flip queue or max pre-rendered frames to either 1 or 0: set it to 1 if framerate is always less than monitor refresh, and set it to 0 if framerate is always greater than or equal to monitor refresh. If it goes back and forth, only NVIDIA's OpenGL triple buffering will provide the best of both worlds without tearing and will further reduce input lag in high framerate situations.
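To make the difference concrete, here is a simplified steady-state model (our own back-of-the-envelope sketch, not a description of any particular driver) of the average latency from input sampling to the start of scanout at 60Hz, assuming the GPU can render well above the refresh rate:

```python
# Simplified model of average input-sample-to-scanout latency at 60Hz when the
# GPU renders much faster than refresh. Input is assumed to be sampled when a
# frame starts rendering; the numbers are illustrative, not measured.
REFRESH_MS = 1000.0 / 60.0  # 60Hz scanout interval
RENDER_MS = 5.0             # assumed GPU time per frame (~200 FPS capable)

# Double buffering + vsync: a frame starts right after a vsync and is shown
# at the next one.
double_buffer_vsync = REFRESH_MS

# 1 frame flip queue ("triple buffering" in most drivers) + vsync: the new
# frame waits behind a frame that is already queued, so it misses one extra
# refresh before being shown.
flip_queue_vsync = 2 * REFRESH_MS

# True triple buffering: the GPU renders continuously and the most recently
# completed frame is flipped at each vsync, so on average the displayed frame
# was started about 1.5 render times before scanout.
true_triple_buffering = 1.5 * RENDER_MS

print(f"double buffer + vsync : ~{double_buffer_vsync:.1f} ms")
print(f"1 frame flip queue    : ~{flip_queue_vsync:.1f} ms")
print(f"true triple buffering : ~{true_triple_buffering:.1f} ms")
```

With these assumed numbers the flip queue roughly doubles the latency of plain vsync, while true triple buffering stays close to what you would see with vsync disabled.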

Improperly handling vsync (enabling or disabling a 1 frame flip queue at the wrong time) can cost at least one additional whole frame. But with multiGPU setups, we really don't have a choice: with more than one GPU in the system, you will want to leave maximum pre-rendered frames set to the default of 3 and allow the driver to handle everything. Input lag with multiGPU systems is something we will want to explore at a later time.

You will want a monitor that doesn't do much (if any) image processing, preferably one with a "game" mode. We recently took a look at a few monitors to get a feel for the difference in input processing. While we didn't test displays in this article, adding another 16ms to 33ms of input lag is just not a good idea.

One of the largest benefits for games that don't inherently carry a lot of input lag is refresh rate. A real 120Hz refresh rate can significantly reduce input lag, especially in twitch shooters. While the impact is smaller in games where the framerate can't keep up, the portion of input lag incurred between the computer and the monitor will still be significantly reduced. Additionally, vsync (even in the worst case) is much cheaper on a high refresh rate monitor. Triple buffering (or even 1 frame flip queues with performance lower than refresh) and 120Hz monitors are a match made in heaven.
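The arithmetic behind the vsync point is worth spelling out: the worst-case cost of missing a refresh is one full refresh interval, which is only half as long at 120Hz.

```python
# Worst case for a missed vsync: the finished frame waits for the next refresh,
# so the maximum added delay is one full refresh interval.
for refresh_hz in (60, 120):
    interval_ms = 1000.0 / refresh_hz
    print(f"{refresh_hz:3d} Hz: a missed vsync adds up to {interval_ms:.2f} ms of input lag")
```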

Final Words

What started out as a short article on the concept of input lag ended up touching on quite a few key issues in gaming. We got into a few of the concepts of game design and program flow in addition to looking at hardware impact. While we hadn't planned on it, picking up a camera that can do 1200 FPS allowed us to actually measure the input lag of a couple of real games.

There are quite a few good nuggets to take away. First, input lag is hugely dependent on the game. There will be games that optimize for reducing input lag and others that do not. In some games it is more important than others. For games that incur huge amounts of input lag, there is only so much that can be done. Using the tips we provided will definitely help get people on the path to lower input lag.

Unfortunately, sometimes reducing input lag to its minimum requires spending money, especially on the display side of things. Just make sure to read reviews that look at display lag, as avoiding a display that adds an extra 16ms to 33ms of input lag is definitely a good start. Beyond that, a faster GPU is the next most important upgrade, and a mouse that can do at least 500 reports per second is a good idea.

Comments

  • DerekWilson - Sunday, July 19, 2009 - link

    It was bound to happen wasn't it?

    This has been around for a few years now, but (for obvious reasons) never made it into the mainstream gaming community. And, really, now that high performance mice are much more available it isn't as much of an issue.
  • Kaihekoa - Saturday, July 18, 2009 - link

    From the conclusion this point wasn't clear to me.
  • DerekWilson - Sunday, July 19, 2009 - link

    at present triple buffering in DirectX == a 1 frame flip queue in all cases ...

    so ... it is best to disable triple buffering in DirectX if you are over refresh rate in performance (60FPS generally) ...

    and it is better to enable triple buffering in DirectX if you are under 60 FPS.
  • Squall Leonhart - Wednesday, March 30, 2011 - link

    This is not always the case actually; there are some DirectX engines, the Age of Empires 3 engine being one example, that have hitching when moving around the map unless triple buffering is forced for the game.
  • billythefisherman - Saturday, July 18, 2009 - link

    First of all I'd like to say well done on the article you're probably the first person outside of game industry developers to have looked at this rather complex topic and certainly the first to take into account the whole hardware pipeline as well.

    Sadly though there are some gaping holes in your analysis, mainly focused around the CPU stage. Your CPU isn't going to run any faster than your GPU (and actually the same is true in reverse) as one is dependent on the other (the GPU is dependent on the CPU). As such the CPU may finish all of its tasks faster than the GPU, but the CPU will have to wait for the GPU to finish rendering the last frame before it can start on the next frame of logic.

    No game team in the world developing for a console is going to triple buffer their GPU command list.

    I intentionally added 'developing for a console' as this is also an important factor: I'd say around 75% (being very conservative) of mainstream PC games now are based on cross-platform engines. As such developers will more than likely gear their engines to the consoles, as these make up the largest market segment by far.

    The consoles all have very limited memory capacities in comparison to their computational power, so developers will more than likely try to save memory over computation; thus a double buffered command list is the norm. Some advanced console-specific engines actually drop down to a single command buffer and use CPU - GPU synchronisation techniques because the CPU is faster than the GPU. This kind of thing isn't going to happen on the PC because the GPU is invariably faster than the CPU.

    When porting a game to PC a developer is very unlikely to spend the money re-engineering the core pipeline because of the massive problems that can cause. This can be seen in most 'DirectX 10' games, as they simply add a few more post-processing effects to soak up the extra power - you may call it lazy coding, I don't; it's just commercial reality, these are businesses at the end of the day.

    So both your diagrams on the last page are wrong with regards to the CPU stage, as it will take roughly the same amount of time as the GPU in the vast majority of frames because of frame locality, i.e. one frame differs little from the next as the player tends not to jump around in space, and so neighbouring frames take similar amounts of time to render.

    Onto my next complaint :
    "If our frametime is just longer than 16.67ms with vsync enabled, we will add a full additional frame of latency (with no work being done on the GPU) before we are able to swap the finished buffer to the front for scanout. The wasted work can cause our next frame not to come in before the next vsync, giving us up to two frames of latency (one because we wait to swap and one because of the delay in starting the next frame)."

    What are you talking about man!?! You don't drop down to 20fps (i.e. two more frames of latency) because you take 17ms to render your frame - you drop down to 30fps! With vsync enabled your graphics processor will be stalled until the next frame, but that's all, and you could possibly kick off your CPU to calculate the next frame to take advantage of that time. Not that that's going to make the slightest jot of difference if you're GPU bound, because you have to wait for the GPU to finish with the command buffer it's rendering (as you don't know where in the command buffer the GPU is).

    As I've said, on the consoles there are tricks you can do to synchronise the GPU with the CPU, but you don't have that low level control of the GPU on the PC as Nvidia/ATI don't want the internals of their drivers exposed to one another.

    And as I've said, not that you'd want to do such a thing on PC, as the CPU is usually going to be slower than the GPU and would cause the GPU to stall constantly - hence the reason to double buffer the command buffer in the first place.

    I've also tried to explain in my posts to your triple buffering article why there's a lot of cobblers in the next few paragraphs.
  • DerekWilson - Sunday, July 19, 2009 - link

    Fruit pies? ... anyway...

    Thanks for your feedback. On the first issue, the console development is one of growing importance as much as I would like for it not to be. At some point, though, I expect there will be an inflection point where it will just not be possible to build certain types of games for consoles that can be built on PCs ... and we'll have this before the next generation of consoles. Maybe it's a pipedream, but I'm hoping the development focus will shift back to the PC rather than continue to pull away (I don't think piracy is a real factor in profitability though I do believe publishers use the issue to take advantage of developers and consumers).

    And I get that with GPU as bottleneck you have that much time to use the CPU as well ... but you /could/ decouple CPU and GPU and gain performance or reduce lag. Currently, it may make sense that if we are GPU limited the CPU stage will effectively equal the GPU stage in latency -- and likewise that if we are CPU limited, the GPU stage effectively equals the CPU stage (because of stalling) in input latency.

    Certainly it is a more complex topic than I illustrated, and if I didn't make that clear then I do apologize. I just wanted to get across the general idea rather than a "this is how it always is" kind of thing ... clearly Fallout 3 has even more input lag than any of my worst case scenarios account for, even with 2 frames of image processing on the monitor ... I have no idea what they are doing ...

    ...

    As for the second issue -- you can get up to two frames of INPUT LAG with vsync enabled and 17ms GPU time.

    you will get up to these two frames (60Hz frames) of input lag at 30FPS ...

    I'm not talking about the frame rate dropping to 2 frames then 1 frame (20 FPS) ... I'm talking about the fact that, at best, your input is gathered 17ms before your frame completes on the GPU (1 frame of input lag) and (because it missed vsync) it will take another frame for that to hit the screen (for a total of two).
  • billythefisherman - Monday, July 20, 2009 - link

    I have to re-iterate: well done on tackling this rather complex issue, I applaud you! (I just wish you hadn't whipped up your punters so much in the benefits of triple buffering!)
  • Gastra - Saturday, July 18, 2009 - link

    For information (quite a lot if you follow the links) on what an optical mouse sees:
    http://hackedgadgets.com/2008/10/15/optical-mouse-...
  • DerekWilson - Sunday, July 19, 2009 - link

    That's pretty cool stuff ... And it lines up pretty well with our guess at mouse sensor resolution for the G9x.

    It'd still be a lot nicer if we could get the specs straight from the manufacturer though ...
  • PrinceGaz - Friday, July 17, 2009 - link

    "For input lag reduction in the general case, we recommend disabling vsync. For NVIDIA card owners running OpenGL games, forcing triple buffering in the driver will provide a better visual experience with no tearing and will always start rendering the same frame that would start rendering with vsync disabled."

    I'm going to ask this again I'm afraid :) Are you sure Derek? Does nVidia's triple-buffer OpenGL driver implementation do that, or is it just the same as what most people take triple-buffer rendering to be, that is, having one additional back buffer to render to so as to provide a steady supply of frames when the framerate dips below the refresh rate? Have you got confirmation, either from screenshots or something else (like nVidia saying that is how it works), that OpenGL triple-buffering is any different from Direct3D rendering, or from how AMD handle it?

    Because if you don't, then all you are saying is that triple-buffering is a second back-buffer which is filled to prevent lags when the framerate falls below the refresh rate. Do you know for sure that nVidia OpenGL drivers render constantly when in triple-buffer mode or are you only assuming they do so?
