Another New Anti-Aliasing Mode: Enhanced Quality AA

With the 6800 series AMD introduced Morphological Anti-Aliasing (MLAA), a low-complexity post-processing anti-aliasing filter. As a post-processing filter it worked with a wide variety of games and APIs, and in most cases the performance overhead was not very severe. However it’s not the only new anti-aliasing mode that AMD has been working on.

New with the 6900 series is a mode AMD is calling Enhanced Quality Anti-Aliasing. If you recall NVIDIA’s Coverage Sample Anti-Aliasing (CSAA) introduced with the GeForce 8800GTX, then all of this should sound quite familiar – in fact it’s basically the same thing.

Under traditional MSAA, for a pixel covered by 2 or more triangles/fragments, 2, 4, or 8 subpixel samples are taken to determine what the final pixel should be. In the process the color of the triangle and the Z/depth of the triangle are both sampled and stored, and at the end of the process the results are blended together to determine the final pixel value. This process works well for resolving aliasing along polygon edges at a fraction of the cost of true super sampling, but it’s still expensive. Collecting and storing the Z and color values requires extra memory to store the values and extra memory bandwidth to work with the values. Ultimately while we need enough samples to determine colors of the involved triangles, we do not always need a great deal of them. With a few color/Z samples we have all of the color data we need in most cases, however the “hard” part of anti-aliasing becomes what the proper blending of color values should be.


1 Pixel Covred by 2 Triangles/Fragments

Thus we have EQAA, a compromise on the idea. Color/Z samples are expensive, but just checking if a triangle covers part of a subpixel is very cheap. If we have enough color/Z samples to get the necessary color information, then just doing additional simple subpixel coverage checks would allow us better determine what percentage of a pixel is covered by a given polygon, which we can then use to blend colors in a more accurate fashion. For example with 4x MSAA we can only determine if a pixel is 0/25/50/75/100 percent covered by a triangle, but with 4x EQAA where we take 4 color samples and then 4 additional coverage-only samples, we can determine blending values down to 0/12/25/37/50/62/75/87/100 percent coverage, the same amount of accuracy as using 8x MSAA. Thus in the right situation we can have quality similar to 8x MSAA for only a little over 4x MSAA’s cost.


MSAA & EQAA Sample Patterns

In reality of course this doesn’t always work out as well. The best case scenario is that the additional coverage samples are almost as good as having additional color/Z samples, while the worst case scenario is that additional coverage samples are practically worthless. This depends on a game-by-game, if not pixel-by-pixel basis. In practice additional coverage samples are a way to slightly improve MSAA quality for a very, very low cost.

While NVIDIA has had the ability to take separate coverage samples since G80, AMD has not had this ability until now. With the 6900 hardware their ROPs finally gain this ability.

Beyond that, AMD and NVIDIA’s implementations are nearly identical except for the naming convention. Both can take a number of coverage samples independent of the color/Z samples based on the setting used; the only notable difference we’re aware of is that like AMD’s other AA modes, their EQAA mode can be programmed to use a custom sample pattern.

As is the case with NVIDIA’s CSAA, AMD’s EQAA mode is available to DirectX applications or can be forced through the drivers. DirectX applications can set it through the Multisample Quality attribute, which is usually abstracted to list the vendor’s name for the mode in a game’s UI. Otherwise it can be forced via the Catalyst Control Center, either by forcing an AA mode, or as is the case with NVIDIA, enhancing the AA mode by letting the game set the AA mode while the driver overrides the game and specifies different Multisample Quality attribute. Thus the “enhance application settings” AA mode is new to AMD with the 6900 series.

To be honest we’re a bit ruffled by the naming choice. True, NVIDIA did go and have to pick daft names for their CSAA modes (when is 8x not 8 sample MSAA?), but ultimately CSAA and EQAA are virtually identical. NVIDIA has a 4 year lead on AMD here, and we’d just as well use NVIDIA’s naming conventions for consistency. Instead we have the following.

Coverage Sampling Modes: CSAA vs EQAA
NVIDIA Mode
(Color + Coverage)
AMD
2x 2+0 2x
N/A 2+2 2xEQ
4x 4+0 4x
8x 4+4 4xEQ
16x 4+12 N/A
8xQ 8+0 8x
16xQ 8+8 8xEQ
32x 8+24 N/A

AMD ends up having 1 mode NVIDIA doesn’t, 2xEQ, which is 2x MSAA + 2x cover samples; meanwhile NVIDIA has 16x (4x MSAA + 12 cover samples) and 32x (8x MSAA + 24 cover samples). Finally, as we’ll see, just as is the case for NVIDIA additional coverage samples are equally cheap for AMD.

Tweaking PowerTune Meet the 6970 & 6950
POST A COMMENT

167 Comments

View All Comments

  • B3an - Thursday, December 16, 2010 - link

    Very stupid uninformed and narrow-minded comment. People like you never look to the future which anyone should do when buying a graphics card, and you completely lack any imagination. Theres already tons of uses for GPU computing, many of which the average computer user can make use of, even if it's simply encoding a video faster. And it will be use a LOT more in the future.

    Most people, especially ones that game, dont even have 17" monitors these days. The average size monitor for any new computer is at least 21" with 1680 res these days. Your whole comment is as if everyone has the exact same needs as YOU. You might be happy with your ridiculously small monitor, and playing games at low res on lower settings, and it might get the job done, but lots of people dont want this, they have standards and large monitors and needs to make use of these new GPU's. I cant exactly see many people buying these cards with a 17" monitor!
    Reply
  • CeepieGeepie - Thursday, December 16, 2010 - link

    Hi Ryan,

    First, thanks for the review. I really appreciate the detail and depth on the architecture and compute capabilities.

    I wondered if you had considered using some of the GPU benchmarking suites from the academic community to give even more depth for compute capability comparisons. Both SHOC (http://ft.ornl.gov/doku/shoc/start) and Rodinia (https://www.cs.virginia.edu/~skadron/wiki/rodinia/... look like they might provide a very interesting set of benchmarks.
    Reply
  • Ryan Smith - Thursday, December 16, 2010 - link

    Hi Ceepie;

    I've looked in to SHOC before. Unfortunately it's *nix-only, which means we can't integrate it in to our Windows-based testing environment. NVIDIA and AMD both work first and foremost on Windows drivers for their gaming card launches, so we rarely (if ever) have Linux drivers available for the launch.

    As for Rodinia, this is the first time I've seen it. But it looks like their OpenCL codepath isn't done, which means it isn't suitable for cross-vendor comparisons right now.
    Reply
  • IdBuRnS - Thursday, December 16, 2010 - link

    "So with that in mind a $370 launch price is neither aggressive nor overpriced. Launching at $20 over the GTX 570 isn’t going to start a price war, but it’s also not so expensive to rule the card out. "

    At NewEgg right now:

    Cheapest GTX 570 - $509
    Cheapest 6970 - $369

    $30 difference? What are you smoking? Try $140 difference.
    Reply
  • IdBuRnS - Thursday, December 16, 2010 - link

    Oops, $20 difference. Even worse. Reply
  • IdBuRnS - Thursday, December 16, 2010 - link

    570...not 580...

    /hangsheadinshame
    Reply
  • epyon96 - Thursday, December 16, 2010 - link

    This was a very interesting discussion to me in the article.

    I'm curious if Anandtech might expand on this further in a future dedicated article comparing what NVIDIA is using to AMD.

    Are they also more similar to VLIW4 or VLIW5?

    Can someone else shed some light on it?
    Reply
  • Ryan Smith - Thursday, December 16, 2010 - link

    We wrote something almost exactly like you're asking for for our Radeon HD 4870 review.

    http://www.anandtech.com/show/2556

    AMD and NVIDIA's compute architectures are still fundamentally the same, so just about everything in that article still holds true. The biggest break is VLIW4 for the 6900 series, which we covered in our article this week.

    But to quickly answer your question, GF100/GF110 do not immediately compare to VLIW4 or VLIW5. NVIDIA is using a pure scalar architecture, which has a number of fundamental differences from any VLIW architecture.
    Reply
  • dustcrusher - Thursday, December 16, 2010 - link

    The cheap insults are nothing but a detriment to what is otherwise an interesting argument, even if I don't agree with you.

    As far as the intellect of Anandtech readers goes, this is one of the few sites where almost all of the comments are worth reading; most sites are the opposite- one or two tiny bits of gold in a big pan of mud.

    I'm not going to "vastly overestimate" OR underestimate your intellect though- instead I'm going to assume that you got caught up in the moment. This isn't Tom's or Dailytech, a little snark is plenty.
    Reply
  • Arnulf - Thursday, December 16, 2010 - link

    When you launch an application (say a game), it is likely to be the only active thread running on the system, or perhaps one of very few active threads. CPU with Turbo function will clock up as high as possible to run this main thread. When further threads are launched by the application, CPU will inevitably increase its power consumption and consequently clock down.

    While CPU manufacturers don't advertise this functionality in this manner, it is really no different from PowerTune.

    Would PowerTune technology make you feel any better if it was marketed the other way around, the way CPUs are ? (mentioning lowest frequencies and clock boost provided that thermal cap isn't met yet)
    Reply

Log in

Don't have an account? Sign up now