Better AA: Dynamic Super Resolution & Multi-Frame Sampled Anti-Aliasing

On a personal note, the subject of anti-aliasing has always been near and dear to my heart. When you review video cards for a living you start to see every minor defect, and this is especially the case for jaggies and other forms of aliasing. So when new anti-aliasing modes are being introduced it is always a time of great interest.

Dynamic Super Resolution

With the launch of Maxwell 2, NVIDIA is introducing two new anti-aliasing technologies. The first of these is called Dynamic Super Resolution, and it is a sort of brute-force anti-aliasing method targeted at games that do not support real anti-aliasing or do not support it well.

In the case of Dynamic Super Resolution (DSR), NVIDIA achieves anti-aliasing by rendering a frame at a resolution higher than the user’s monitor (the Super Resolution of DSR), and then scaling the image back down to the monitor’s native resolution. This process of rendering at a higher resolution and then blending pixels together when the image is scaled down results in a higher quality image that is less aliased than an image rendered at a native resolution, owing to the additional detail attained from rendering at a higher resolution.
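To make the render-high-then-scale-down idea concrete, here is a minimal sketch in Python. It assumes a grayscale image stored as a list of rows and uses a plain box average over each block of pixels; the function name `downsample_box` and the tiny test image are our own illustration, not NVIDIA's implementation.

```python
def downsample_box(image, factor):
    """Average each factor x factor block of a 2D grayscale image (list of rows)."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be multiples of factor")
    out = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            # Gather every high-resolution pixel that lands in this output pixel.
            block = [image[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# A hard diagonal edge "rendered" at 4x4, destined for a 2x2 "monitor".
high_res = [
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
low_res = downsample_box(high_res, 2)
print(low_res)  # [[0.75, 0.0], [1.0, 0.75]]
```

The two output pixels that straddle the edge come out as 0.75 rather than a hard 0 or 1, which is exactly the blending that softens jaggies.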

Although NVIDIA is first introducing DSR with Maxwell 2 GPUs, the technique is actually much older than that. For enthusiasts this process is better known as downsampling, and while it has been around for years it has been relatively inaccessible to the masses due to the hacky nature of unsupported downsampling, which among other things requires tweaking settings for monitors, drivers, and games all alike. As a result while NVIDIA can’t lay claim to the idea of downsampling, this is still a significant improvement in the downsampling process because downsampling is now being promoted to a first-class feature, which means it brings with it the full development backing of NVIDIA and the wider accessibility that will bring.

Of course it should also be noted that NVIDIA and enthusiasts aren’t the only parties who have been engaging in downsampling; game developers have periodically been adding the feature directly to their games as well. Among our benchmarking suite, Battlefield 4, Company of Heroes 2, and Thief all support the equivalent of downsampling; BF4 and CoH2 allow a game to be internally rendered at a higher resolution, and Thief has SSAA modes that do the same thing. As a result there are already some games on the market that utilize downsampling/DSR, with the advantage of NVIDIA’s implementation being that it makes the technique accessible to games that do not implement it on their own.

Digging a bit deeper, the image quality advantage of downsampling/DSR is that it’s fundamentally a form of Super Sample Anti-Aliasing (SSAA). By rendering an image at a higher resolution and then scaling it down, DSR is essentially sampling each pixel multiple times, improving the resulting image quality by removing geometry, texture, and shader aliasing. And like true SSAA, DSR is going to be very expensive from a rendering standpoint – you’re potentially increasing your frame resolution by 4x – but if you have the performance to spare then DSR will be worth it, and this is the basis of NVIDIA’s inclusion of DSR as a first-class feature.

Meanwhile from an image quality standpoint DSR should be a decent but not spectacular form of SSAA. Because it’s simply rendering an image at a larger size, DSR functionally uses an ordered pixel grid. For anti-aliasing purposes ordered grids are suboptimal because near-vertical and near-horizontal geometry doesn’t get covered well, which is why true AA techniques will use rotated grids or sparse grids. Nonetheless, while DSR’s resulting sample pattern isn’t perfect, it is going to be much better than the alternative of forgoing anti-aliasing entirely.
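The weakness of an ordered grid on near-vertical edges can be shown with a few lines of Python. The sketch below sweeps a perfectly vertical edge across a single pixel and records how many distinct coverage values each 4-sample pattern can produce. The sample positions are illustrative assumptions chosen by us, not any vendor's actual pattern.

```python
# Four samples per pixel, positions given as (x, y) inside a unit pixel.
ORDERED_GRID = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
ROTATED_GRID = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]


def coverage_levels(samples, steps=64):
    """Distinct coverage values seen as a vertical edge sweeps across one pixel."""
    levels = set()
    for i in range(steps + 1):
        edge_x = i / steps
        # A sample counts as covered when it lies to the left of the edge.
        covered = sum(1 for x, _ in samples if x < edge_x)
        levels.add(covered / len(samples))
    return sorted(levels)


print(coverage_levels(ORDERED_GRID))  # [0.0, 0.5, 1.0]
print(coverage_levels(ROTATED_GRID))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Because the ordered grid only has two distinct x positions, four samples buy just one intermediate shade on a vertical edge, whereas the rotated grid gets three out of the same sample count.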



Anti-Aliasing Example: Ordered Grid vs. Rotated Grid (Images Courtesy Beyond3D)

DSR to that end can be considered a sort of last-resort method of SSAA. For games that support proper RG/SG SSAA, those anti-aliasing methods will produce superior results. However as a number of games do not support native anti-aliasing of any kind due to the use of deferred renderers, DSR provides a way to anti-alias these games that is compatible with their rendering methods.

Moving on, under the hood NVIDIA is implementing DSR as a form of high resolution rendering combined with a 13-tap Gaussian filter. In this process NVIDIA’s drivers present a game with a fake resolution higher than the actual monitor (e.g. 3840x2160 for a true 1080p monitor), and then have the game render to that higher resolution while using the Gaussian filter to blend the results down to the lower resolution. The fact that NVIDIA is using a Gaussian filter here as opposed to a simple box filter definitely raises a few eyebrows due to the potential for unwanted blurring, and this is something we will be taking a look at next week in our image quality analysis of GTX 980.
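To illustrate why a Gaussian filter invites blurring concerns where a box filter does not, here is a rough one-dimensional Python sketch. NVIDIA has not detailed the layout or weights of its 13-tap filter, so the `sigma` and `reach` parameters below are purely our assumptions and this should not be read as the driver's actual filter.

```python
import math


def box_downsample(signal, factor):
    """Average each run of `factor` samples (a simple box filter)."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal), factor)]


def gaussian_downsample(signal, factor, sigma, reach):
    """Gaussian-weighted downsample that also pulls in `reach` samples either side."""
    out = []
    for i in range(0, len(signal), factor):
        center = i + (factor - 1) / 2  # centre of the output pixel in high-res coordinates
        total = weight_sum = 0.0
        for j in range(max(0, i - reach), min(len(signal), i + factor + reach)):
            w = math.exp(-((j - center) ** 2) / (2 * sigma ** 2))
            total += w * signal[j]
            weight_sum += w
        out.append(total / weight_sum)  # normalise so flat areas stay unchanged
    return out


# A sharp edge that happens to line up with an output pixel boundary.
step = [0.0] * 8 + [1.0] * 8
box = box_downsample(step, 2)
soft = gaussian_downsample(step, 2, sigma=1.5, reach=2)
print(box)
print([round(v, 3) for v in soft])
```

In this case the box filter keeps the edge perfectly sharp, while the Gaussian spreads it across the neighboring output pixels because each output pixel also samples from outside its own footprint. That is the trade-off in miniature: smoother results at non-integer factors, at the risk of softening detail.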

In the meantime the use of downsampling in this fashion means that DSR will have a high but less-than-perfect compatibility rate. Games that can’t render at very high resolutions will not be usable with DSR, and games that render incorrectly at those resolutions will similarly be problematic. In practice many games should be able to render at 4K-like resolutions, but some fraction of those games will not know how to scale up the UI accordingly, resulting in a final UI that is too small after the image is scaled down.


Looking at the broader picture, from a marketing and product perspective DSR is another tool for NVIDIA for dealing with console ports. Games that are ported from current-gen and last-gen consoles and don’t make significant (if any) use of newer GPU features will as a rule of thumb look little-if-any better on the PC than they do their original console. This in turn leaves more powerful GPUs underutilized and provides little incentive to purchase a PC (and an NVIDIA GPU) over said consoles. But by implementing DSR, NVIDIA and NVIDIA users can attain a leg-up on consoles by improving image quality through SSAA. And while this can’t make up for a lack of texture or model quality, it can convincingly deal with the jaggies that would otherwise be present on both the PC and the console.

With that in mind, it should be noted that DSR is primarily geared towards low DPI monitor users: 1080p, 900p, 1200p, etc. High DPI monitor users can simply run a game natively at 4K, at which point they likely won’t have much performance left over for any further anti-aliasing anyhow. Meanwhile DSR for its part will support resolution factors of between 1.2x (1.1 x 1.1) and 4x (2 x 2), allowing the resolution used to vary depending on the desired quality level and resulting performance. From a quality perspective 4x will in turn be the best factor to use, as this is the only factor that allows for potentially clean integer scaling (think Retina display). The other factors cannot map rendered pixels evenly onto the native grid, and for this reason DSR also offers a smoothness control, which allows the user to control the intensity of the Gaussian filter used.
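The relationship between a DSR factor and the resolution a game actually renders at is simple arithmetic, sketched below in Python. We are assuming, as the "4x (2 x 2)" notation suggests, that the factor is the total pixel multiplier and that each axis is scaled by its square root. The function names are our own, and the exact resolutions the driver exposes may be rounded differently.

```python
import math


def dsr_resolution(native_width, native_height, factor):
    """Render resolution for a DSR factor expressed as a total pixel multiplier."""
    scale = math.sqrt(factor)  # per-axis scale, e.g. 4x total pixels -> 2x per axis
    return round(native_width * scale), round(native_height * scale)


def is_integer_scale(factor):
    """True when every native pixel maps to a whole number of rendered pixels."""
    scale = math.sqrt(factor)
    return math.isclose(scale, round(scale))


print(dsr_resolution(1920, 1080, 4.0))   # (3840, 2160)
print(dsr_resolution(1920, 1080, 1.21))  # (2112, 1188)
print(is_integer_scale(4.0), is_integer_scale(2.0))  # True False
```

At 4x a 1080p monitor is fed a 3840x2160 frame, so every native pixel corresponds to a clean 2x2 block. At any smaller factor the blocks no longer line up with the native grid, which is where the filter has to do more of the work.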

Meanwhile for end users NVIDIA will be exposing DSR at two points. DSR is currently implemented in the NVIDIA control panel, which allows for direct control of the scaling factor and the smoothness on a per-game basis. Meanwhile DSR will also be exposed in GeForce Experience, which can enable DSR for games that NVIDIA has vetted to work with the technology and are running on computers fast enough to render at these higher resolutions.

Finally, while DSR is currently limited to Maxwell 2 video cards, NVIDIA has not-so-subtly been hinting that DSR will in time be ported to NVIDIA’s previous generation cards. The technique itself does not require any special Maxwell 2 hardware and should easily work on Kepler hardware as under the hood it’s really just a driver trick. However whether Kepler cards are fast enough to use DSR with an adequate resolution factor will be another matter entirely.

Multi-Frame Sampled Anti-Aliasing

NVIDIA’s other new anti-aliasing technology for Maxwell 2 is the unfortunately named Multi-Frame Sampled Anti-Aliasing. Whereas DSR was targeted at the quality segment of the market as a sort of last resort AA method for improving image quality, Multi-Frame Sampled Anti-Aliasing is targeted at the opposite end of the spectrum and is designed to be a more efficient form of MSAA that achieves similar results with half as many samples and half of the overhead.

Unlike DSR, Multi-Frame Sampled Anti-Aliasing is implemented on and requires new Maxwell 2 hardware, namely NVIDIA’s new programmable MSAA sampling pattern ability in their ROPs. This feature allows NVIDIA to dynamically alter their MSAA sample patterns, which is a key feature of Multi-Frame Sampled Anti-Aliasing, and therefore cannot easily be backported to existing hardware.

In any case, Multi-Frame Sampled Anti-Aliasing is based on the concept of changing the MSAA sample pattern in every frame, in practice using a 2x (2 sample) MSAA pattern and combining the results from multiple frames to mimic a 4x (4 sample) MSAA pattern. If it’s done right then you should receive results comparable to 4x MSAA with the cost of 2x MSAA.
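The static-image case can be worked through in a short Python sketch. We split an assumed 4-sample pattern into two 2-sample halves, use one half per frame, and average the two frames. The sample positions and the edge location are hypothetical values picked for illustration and are not NVIDIA's actual patterns.

```python
# An assumed 4x pattern, split into the two 2-sample patterns used on alternating frames.
PATTERN_4X = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]
PATTERN_EVEN = PATTERN_4X[0::2]  # used on even frames
PATTERN_ODD = PATTERN_4X[1::2]   # used on odd frames


def coverage(samples, inside):
    """Fraction of samples that fall inside the geometry."""
    return sum(1 for s in samples if inside(s)) / len(samples)


def left_of_edge(sample):
    """Hypothetical static geometry: everything left of x = 0.3 is covered."""
    x, _ = sample
    return x < 0.3


even_frame = coverage(PATTERN_EVEN, left_of_edge)  # 0.5
odd_frame = coverage(PATTERN_ODD, left_of_edge)    # 0.0
combined = (even_frame + odd_frame) / 2
reference = coverage(PATTERN_4X, left_of_edge)

print(even_frame, odd_frame, combined, reference)  # 0.5 0.0 0.25 0.25
```

Neither 2-sample frame gets the pixel right on its own, but the average of the two matches what the full 4-sample pattern would have produced. This only holds because nothing moved between the two frames.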

Once you can grasp the concept of changing sample patterns, the idea is actually relatively simple. And in fact like DSR it has been done before in a lesser form by none other than AMD (or at the time, ATI). In 2004 with their X800 series of cards, AMD launched their Temporal Anti-Aliasing technology, which was based on the same sampling concept but importantly without any kind of frame combining/blending. Over the years Temporal AA never did see much use, and was ultimately discontinued by AMD.


Compare & Contrast: AMD's Discontinued Temporal AA

What sets Multi-Frame Sampled Anti-Aliasing apart from Temporal AA and similar efforts – and why NVIDIA thinks they will succeed where AMD failed – is the concept of temporal reprojection, or as NVIDIA calls it their temporal synthesis filter. By reusing pixels from a previous frame (to use them as pseudo-MSAA samples), the resulting frame can more closely match true 4x MSAA thanks to the presence of multiple samples. The trick is that you can’t simply reuse the entire last frame, as this would result in a much less jagged image that also suffered from incredible motion blur. For this reason the proper/best form of temporal reprojection requires figuring out which specific pixels to reproject and which to discard.
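The keep-or-discard decision can be sketched with a deliberately naive heuristic in Python: blend a pixel with its counterpart from the previous frame only when the two values are close enough to plausibly belong to the same surface. NVIDIA has not described how its temporal synthesis filter makes this decision, so the threshold test and the name `temporal_resolve` below are our assumptions, not the real algorithm.

```python
def temporal_resolve(current, history, threshold=0.5):
    """Blend each pixel with the previous frame unless the two disagree too much."""
    resolved = []
    for cur, prev in zip(current, history):
        if abs(cur - prev) <= threshold:
            # Plausibly the same surface: treat the old value as extra samples.
            resolved.append((cur + prev) / 2)
        else:
            # Too different (likely motion): discard history to avoid ghosting.
            resolved.append(cur)
    return resolved


# Per-pixel coverage values as a 2-sample pattern would report them.
previous_frame = [0.0, 1.0, 1.0]
current_frame = [0.5, 1.0, 0.0]
print(temporal_resolve(current_frame, previous_frame))  # [0.25, 1.0, 0.0]
```

The first pixel is a static edge whose two frames disagree only because the sample pattern changed, so it is blended. The last pixel flipped from fully covered to empty, which suggests something moved, so the old value is thrown away rather than smeared into the new frame. A production filter would need far better evidence than a simple difference test, which is why the quality of the reprojection matters so much once the camera is in motion.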

From an image quality standpoint, in the ideal case of a static image this would correctly result in image quality rivaling 4x MSAA. As a lack of camera motion means that the pixels being sampled never changed, the samples would line up perfectly and would fully emulate 4x MSAA. However once in motion the overall image quality is going to be heavily reliant on the quality of the temporal reprojection. In the best case scenario for motion Multi-Frame Sampled Anti-Aliasing still will not perfectly match 4x MSAA, and in the worst case scenario for motion it could still result in either 2x MSAA-like anti-aliasing, significant blurring, or even both outcomes.

Multi-Frame Sampled Anti-Aliasing also has one other catch that has to be accounted for, and that’s frame rates. At low frame rates – below 30fps – the time between frames grows so large that temporal reprojection would become increasingly inaccurate and the human eye would pick up on the sample pattern changes, which means that this anti-aliasing technique is only usable with high frame rates. Importantly this is actually one of the benefits of Multi-Frame Sampled Anti-Aliasing, as the lower overhead of a 2x sample pattern makes it easier to maintain higher frame rates.

For what it’s worth, while NVIDIA is the first GPU vendor to implement temporal AA with temporal reprojection in their drivers, they are not the first to do so overall. Over the years a few different game engines have implemented AA with temporal reprojection, the most notable of which is Crytek’s CryEngine 3. In Crysis 3 temporal reprojection was implemented as part of the SMAA anti-aliasing technique. The result was effective at times, but SMAA does result in some blurring, though this is difficult to separate from the effects of morphological filtering in SMAA. In any case the point is that while we will reserve our final comments for our evaluation of Multi-Frame Sampled Anti-Aliasing, we are expecting that it will result in some degree of blurring compared to the 4x MSAA it is emulating.

Moving on, while Multi-Frame Sampled Anti-Aliasing can potentially be used in a number of scenarios, there are two specific scenarios NVIDIA will be targeting with the technology, both of which are performance-critical situations. The first is 4K gaming, where the strain of 8 million pixels alone leaves little room for anti-aliasing. In this case Multi-Frame Sampled Anti-Aliasing can be enabled for a relatively low performance penalty. Meanwhile NVIDIA’s other usage scenario is VR headset gaming, where frame latency is critical and yet jaggies are highly visible. 4x MSAA is fully usable here, however the increase in frame rendering time may not be desirable, so Multi-Frame Sampled Anti-Aliasing would allow for similar quality without quite as large an increase in frame rendering times.

In both cases Multi-Frame Sampled Anti-Aliasing could be enabled at the driver level, with NVIDIA’s drivers intercepting the call for MSAA and instead providing their new anti-aliasing technique. At this point we don’t know for sure what compatibility will be like, so it remains to be seen what games it will work with. NVIDIA for their part is noting that they “plan to support […] a wide range of games” with the technology.

Wrapping things up, while NVIDIA is publicly announcing Multi-Frame Sampled Anti-Aliasing and has shown it to the press, it is not in shipping condition yet and is unavailable in NVIDIA’s current driver set. NVIDIA is still classifying it as an upcoming technology, so there is currently no set date or ETA for when it will finally be shipped to GTX 900 series owners.

274 Comments

  • kron123456789 - Friday, September 19, 2014

    Look at "Load Power Consumption — Furmark" test. It's 80W lower with 980 than with 780Ti.
  • Carrier - Friday, September 19, 2014

    Yes, but the 980's clock is significantly lowered for the FurMark test, down to 923MHz. The TDP should be fairly measured at speeds at which games actually run, 1150-1225MHz, because that is the amount of heat that we need to account for when cooling the system.
  • Ryan Smith - Friday, September 19, 2014

    It doesn't really matter what the clockspeed is. The card is gated by both power and temperature. It can never draw more than its TDP.

    FurMark is a pure TDP test. All NVIDIA cards will reach 100% TDP, making it a good way to compare their various TDPs.
  • Carrier - Friday, September 19, 2014

    If that is the case, then the charts are misleading. GTX 680 has a 195W TDP vs. GTX 770's 230W (going by Wikipedia), but the 680 uses 10W more in the FurMark test.

    I eagerly await your GTX 970 report. Other sites say that it barely saves 5W compared to the GTX 980, even after they correct for factory overclock. Or maybe power measurements at the wall aren't meant to be scrutinized so closely :)
  • Carrier - Friday, September 19, 2014

    To follow up: in your GTX 770 review from May 2013, you measured the 680 at 332W in FurMark, and the 770 at 383W in FurMark. Those numbers seem more plausible.
  • Ryan Smith - Saturday, September 20, 2014

    680 is a bit different because it's a GPU Boost 1.0 card. 2.0 included the hard TDP and did away with separate power targets. Actually what you'll see is that GTX 680 wants to draw 115% TDP with NVIDIA's current driver set under FurMark.
  • Carrier - Saturday, September 20, 2014

    Thank you for the clarification.
  • wanderer27 - Friday, September 19, 2014

    Power at the wall (AC) is going to be different than power at the GPU - which is coming from the DC PSU.

    There are losses and efficiency differences in converting from AC to DC (PSU), plus a little wiggle from MB and so forth.
  • solarscreen - Friday, September 19, 2014

    Here you go:

    http://books.google.com/books?id=v3-1hVwHnHwC&...
  • PhilJ - Saturday, September 20, 2014

    As stated in the article, the power figures are total system power draw. The GTX980 is throwing out nearly double the FPS of the GTX680, so this is causing the rest of the system (mostly the CPU) to work harder to feed the card. This in turn drives the total system power consumption up, despite the fact the GTX980 itself is drawing less power than the GTX680.
