Original Link: http://www.anandtech.com/show/1205
Image Quality Analysis Fall 2003: A Glance Through the Looking Glass
by Derek Wilson on December 10, 2003 11:14 PM EST
The question of image quality is much more complicated than determining which video card renders a scene the fastest. Years ago, we could say that the image that came out of two different computer systems should be exactly the same because developers controlled every aspect of how their program ran with software, rather than leaving some decisions to the hardware on which the program was running. With the advent of hardware acceleration, developers could get impressive speed gains from their software. As a side effect, the implementation of very basic functionality was defined completely by the designers of the hardware (e.g. ATI and NVIDIA). For example, a developer no longer needs to worry about the mathematics and computer science behind mapping a perspective correct texture onto a surface; now, all one needs to do is to turn on the hardware texturing features that they want and assign textures to surfaces. In addition to saving the developer from having to code these kinds of algorithms, this took away some control and made it so different hardware could produce different output (there is more than one correct way to implement every feature).
Obviously, there are many more pros to hardware acceleration than cons. The speed gains that we are able to make in real-time 3D rendering alone excuse any problems caused. Since the developer doesn't need to worry about writing code worthy of a Ph.D. in mathematics (as that is left to the GPU designers), games can be developed faster or more time can be spent on content. The only real con is the loss of control over how everything is done.
Different types of hardware do things differently. There is more room for choice in how things are done in 3D hardware than in something like an x86 processor. For one thing, IHVs have to support APIs (DirectX and OpenGL) rather than an instruction set architecture. There is much more ambiguity in asking a GPU to apply a perspective correct lighted mipmap to a surface with anisotropic filtering than in asking a CPU to multiply two numbers. Of course, we see this as a very good thing. The IHVs will be in constant competition to provide the best image quality at the fastest speed with the lowest price.
Unfortunately, defining image quality is a more difficult task than it seems. Neither ATI nor NVIDIA produces images that match the DX9 reference rasterizer (Microsoft's tool to estimate what image should be produced by a program). There is, in fact, no “correct” image for any given frame of a game. This makes it very hard to draw a line in the sand and say that one GPU does something the right way and the other one does not.
There is the added problem that taking screenshots in a game isn't really the best place to start when looking for a quantitative comparison. Only a handful of tests will allow us to grab the exact same frame of a game for use in a direct comparison. We are always asking developers to include benchmarks in their games, and the ability to capture the exact same frame on every run is a feature that we would love to see in every benchmark.
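When the same frame can be captured on both cards, one simple form of direct comparison is a per-pixel difference image. The sketch below is only an illustration of that idea; the function names and the representation of an image as nested lists of (r, g, b) tuples are assumptions made for this example, not the tooling used for this article.

```python
def difference_image(image_a, image_b):
    """Per-channel absolute difference of two same-sized images.

    Each image is a list of rows, and each pixel is an (r, g, b) tuple.
    """
    if len(image_a) != len(image_b) or len(image_a[0]) != len(image_b[0]):
        raise ValueError("images must have the same dimensions")
    return [
        [tuple(abs(a - b) for a, b in zip(pixel_a, pixel_b))
         for pixel_a, pixel_b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


def max_difference(diff):
    """Largest single-channel difference anywhere in a difference image."""
    return max(channel for row in diff for pixel in row for channel in pixel)
```

A difference image that is black everywhere means the two captures are identical; bright regions show where the two renderings disagree, though not which one is "right".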
The other issue with screenshots is trying to be sure that the image we grab from the framebuffer (the part of the GPU's memory that holds information about the screen) is the same as the image we see on the screen. For instance, NVIDIA saves some filtering and post-processing (work done on the 2D image produced from the 3D scene) until data is being sent out from the framebuffer to the display device. This means that the data in the framebuffer is never what we see on our monitors. In order to make it so people could take accurate screenshots of their games, NVIDIA does the same post-processing effects on the framebuffer data when a screenshot is taken. While screenshot post-processing is necessary at the moment, using this method introduces another undesirable variable into the equation. To this end, we are working very hard on finding alternate means of comparing image quality (such as capturing images from the DVI port).
When trying to render scenes, it is very important to minimize the amount of useless work a GPU does. This has led to a great number of optimizations being implemented in hardware that attempt to do less work whenever possible. Implementing such optimizations is absolutely necessary for games to run smoothly. The problem is that some optimizations make a slight difference in how a scene is rendered (such as approximating things like sine and inverse square root using numerical methods rather than calculating the exact answer). The perceptibility (or lack thereof) of the optimization should be an important factor in which optimizations are used and which are not. Much leeway is allowed in how things are done. In order to understand what's going on, we will attempt to explain some of the basics of real-time 3D rendering.
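To make the idea of a numerical approximation concrete, here is a minimal sketch of estimating an inverse square root with a cheap starting guess refined by Newton-Raphson steps. The starting guess, its 1.2 scale factor, and the function names are assumptions chosen for illustration; this is not how any particular GPU computes the value.

```python
import math


def approx_inv_sqrt(x: float, iterations: int = 2) -> float:
    """Approximate 1/sqrt(x) for x > 0 using Newton-Raphson refinement."""
    # Cheap starting guess from the binary exponent: x = m * 2**e, 0.5 <= m < 1.
    # The 1.2 factor is an illustrative constant, not taken from real hardware.
    _, exponent = math.frexp(x)
    y = 1.2 * 2.0 ** (-exponent / 2.0)
    for _ in range(iterations):
        # Newton-Raphson step for f(y) = 1/y**2 - x
        y = y * (1.5 - 0.5 * x * y * y)
    return y


def relative_error(x: float, iterations: int) -> float:
    """Relative error of the approximation against math.sqrt."""
    exact = 1.0 / math.sqrt(x)
    return abs(approx_inv_sqrt(x, iterations) - exact) / exact
```

Fewer iterations mean less work and a larger error, which is exactly the kind of speed versus accuracy trade-off described above.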
Color and Alpha
There is an incredible volume of technology that goes into producing a 3D scene in real-time. Due to the vast amount of information on this topic, we will only be covering the points necessary to help understand the visual differences we saw in the games that we tested. And what better place to start than one of the most basic aspects of any image: color.
In a computer, the color of a pixel is determined by four values: red, green, blue, and alpha. Red, green, and blue are the primary colors of light and can be combined in different intensities to create millions of other colors. The alpha value is a specification of how opaque or translucent a color should be, which allows for some very complex layering effects and translucent objects that can't be achieved with color alone (this is called alpha blending).
The translucent floor in this scene is a good example of what alpha blending can do.
Using a lot of alpha blending can bring just about any GPU to a crawl. In light of the limitations of human visual perception and the accuracy of color representation in a computer, very translucent objects can be discarded, since the alpha blending won't significantly change the color of the final result. It is common practice in the world of 3D graphics to put a threshold on how opaque something needs to be before it is considered for drawing at all. This really helps to speed up rendering, and (ideally) doesn't impact the experience of the game at all.
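As a rough sketch of the two ideas above, the following shows a standard "source over destination" alpha blend together with an opacity threshold below which a fragment is skipped. The function names and the 0.02 default threshold are hypothetical values picked for the example; the thresholds that real drivers or games use are not stated here.

```python
def blend(src, dst, alpha):
    """Blend src over dst; colors are (r, g, b) tuples with channels in [0, 1]."""
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))


def draw_fragment(src, dst, alpha, alpha_threshold=0.02):
    """Skip the blend entirely when the fragment is nearly transparent.

    alpha_threshold is an assumed value for illustration only.
    """
    if alpha < alpha_threshold:
        return dst  # too translucent to matter: keep what is already there
    return blend(src, dst, alpha)
```

Raising the threshold saves more blending work but risks dropping faint effects that a viewer might have noticed, which is where two implementations can visibly diverge.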
Texture Mapping and Filtering
In the beginning was the wire frame, and it was good. But texture mapping completely changed the face of computer generated 3D graphics. Everything from coloring, to lighting and shadows, to bumping and displacement can be done with texture mapping. For this discussion, we will be talking about mipmapping, bilinear, trilinear, and anisotropic filtering.
To set the stage, our 3D scene has an object and a texture map (an image to be applied to the object). When we stare “through the looking glass,” we see that each screen pixel maps to a particular area of our object. Likewise, each area on the object maps to an area in the texture map. Unfortunately, a texture map has a fixed number of data points (pixels in the image), while the surface of the object is continuous. This means that it is possible for an area on the object to map to a position that lies between pixels in the texture image. In order to fill in the gaps, developers need to choose a method to interpolate existing data. The first method developers used to solve this problem was simply to make the color of an area equal to that of the nearest pixel in the texture map.
The image quality resulting from using the nearest texel (a pixel in a texture map) is something commonly referred to as pixelization (large blocks of color visible when objects with low resolution textures fill the screen). Anyone who's played early first-person shooters (like Doom, Duke3D, etc.) will know this from how the screen looks when pressed up against a wall.
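Nearest-texel sampling can be sketched in a few lines. The texture layout (a list of rows) and the convention that texture coordinates u and v run from 0 to 1 are assumptions made for this example.

```python
def sample_nearest(texture, u, v):
    """Return the texel whose cell contains the texture coordinate (u, v).

    texture is a list of rows; u and v are in the range [0, 1].
    """
    height = len(texture)
    width = len(texture[0])
    # Scale to texel space and clamp so that u == 1.0 or v == 1.0 stays in range.
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]
```

Because every coordinate inside a texel's cell returns the same value, a low resolution texture stretched across the screen shows up as large flat blocks of color.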
To solve this, rather than using the nearest texel in the texture map, we can do linear interpolation. We take the 4 surrounding texels, and interpolate between the two pairs of texels in one direction. With the two resulting points, we then do another linear interpolation in the other direction to get something closer to what the color should be. This is called bilinear filtering.
|The light blue color is interpolated linearly from the other two.|
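The two-step interpolation described above can be sketched as follows for a greyscale texture. Placing texel centers at half-integer positions and clamping at the texture border are assumptions of this example; real APIs offer several addressing modes.

```python
import math


def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def sample_bilinear(texture, u, v):
    """Bilinearly filter a greyscale texture (list of rows) at (u, v) in [0, 1]."""
    height, width = len(texture), len(texture[0])
    # Continuous texel coordinates, with texel centers at integer + 0.5.
    x = u * width - 0.5
    y = v * height - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def texel(ix, iy):
        # Clamp to the border so edge lookups stay inside the texture.
        ix = min(max(ix, 0), width - 1)
        iy = min(max(iy, 0), height - 1)
        return texture[iy][ix]

    # First interpolate each horizontal pair, then between the two results.
    top = lerp(texel(x0, y0), texel(x0 + 1, y0), fx)
    bottom = lerp(texel(x0, y0 + 1), texel(x0 + 1, y0 + 1), fx)
    return lerp(top, bottom, fy)
```

Sampling exactly on a texel center returns that texel unchanged, while positions in between return a weighted mix of the four neighbors.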
Another problem with what we have already talked about is that when high resolution textures are used on a surface that is far away, one screen pixel can be mapped to an area in the texture map, while the neighboring screen pixels are mapped to entirely different areas of the texture. Essentially, the size of the screen pixel is much greater than that of the texel. The visual result of this is a shimmering or sparkling in distant textures.
We can fix this by making multiple versions of texture maps with different resolutions. These multiple resolution texture maps are called mipmaps, and the technique is called mipmapping. A Level of Detail (LOD) calculation based on the distance of the area on the object that we are texturing to the viewer is used to choose a mipmap level with texels close to the size of (but not smaller than) screen pixels.
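A minimal sketch of mipmapping follows, assuming a square, power-of-two greyscale texture and a simple 2x2 box filter for each smaller level. The `texels_per_pixel` parameter is a stand-in invented for this example: it represents how many level 0 texels span one screen pixel, a value that grows as the surface moves away from the viewer.

```python
import math


def build_mipmaps(texture):
    """Return [level0, level1, ...], halving the resolution at each level."""
    levels = [texture]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        size = len(prev) // 2
        levels.append([
            [(prev[2 * y][2 * x] + prev[2 * y][2 * x + 1]
              + prev[2 * y + 1][2 * x] + prev[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(size)]
            for y in range(size)
        ])
    return levels


def select_mip_level(texels_per_pixel, num_levels):
    """Pick the first level whose texels are at least as large as a screen pixel."""
    if texels_per_pixel <= 1.0:
        return 0  # texels already as large as pixels: use full resolution
    level = math.ceil(math.log2(texels_per_pixel))
    return min(level, num_levels - 1)
```

Each level doubles the texel size, so rounding the base-2 logarithm upward gives the first level whose texels are not smaller than a screen pixel.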
One of the things that is completely different between ATI and NVIDIA is the way LOD is calculated for mipmapped textures. NVIDIA uses the Euclidean distance (sqrt(x^2 + y^2 + z^2)) in their calculations, while ATI uses a weighted Manhattan distance calculation (0.28*x + 0.53*y + 0.19*z). This causes the way textures look to vary a great deal between GPUs when looking at anything but very near or very far surfaces.
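The practical difference between the two distance measures can be seen with a short sketch. The weights are the ones quoted above; taking the absolute value of each component is an assumption added here so that the result behaves like a distance for negative coordinates, and the function names are hypothetical.

```python
import math


def euclidean_distance(x, y, z):
    """Straight-line distance: the same for any direction of equal length."""
    return math.sqrt(x * x + y * y + z * z)


def weighted_manhattan_distance(x, y, z, weights=(0.28, 0.53, 0.19)):
    """Weighted sum of component magnitudes (abs() is an assumption here)."""
    wx, wy, wz = weights
    return wx * abs(x) + wy * abs(y) + wz * abs(z)
```

Two unit-length offsets along different axes have the same Euclidean distance, but the weighted sum gives 0.28 along x and 0.53 along y, so a LOD choice based on it depends on orientation as well as on how far away the surface is.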
Of course, we still have problems that we need to solve. When the LOD calculation dictates a change in the mipmap being used on a surface, we can see a discontinuity in the texturing of that surface. To combat this, trilinear filtering was devised. First, we do filtering on the two surrounding mipmap levels, then interpolate between the resulting values.
|This diagram shows one way to do trilinear filtering.|
The result is a smooth transition between mipmap levels. Of course, there are plenty of different algorithms for doing all this interpolation, and there is not a GPU on the market that does full trilinear filtering all the time. This operation is very expensive to do for every pixel on the screen. There are plenty of optimizations to make trilinear faster, such as only doing bilinear filtering where banding isn't an issue. This is acceptable as long as there is no perceptible loss in visual quality.
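One straightforward way to express trilinear filtering is to filter the two neighboring mipmap levels bilinearly and then blend the results by the fractional part of the LOD. The sketch below assumes greyscale textures stored as lists of rows and a mip chain passed in as a list of levels; it makes no attempt to model the shortcuts that real hardware takes.

```python
import math


def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def sample_bilinear(texture, u, v):
    """Bilinear lookup with texel centers at integer + 0.5 and clamped edges."""
    height, width = len(texture), len(texture[0])
    x, y = u * width - 0.5, v * height - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def texel(ix, iy):
        return texture[min(max(iy, 0), height - 1)][min(max(ix, 0), width - 1)]

    top = lerp(texel(x0, y0), texel(x0 + 1, y0), fx)
    bottom = lerp(texel(x0, y0 + 1), texel(x0 + 1, y0 + 1), fx)
    return lerp(top, bottom, fy)


def sample_trilinear(mip_levels, u, v, lod):
    """Blend bilinear samples from the two mip levels that bracket lod."""
    lod = min(max(lod, 0.0), len(mip_levels) - 1)
    lower = int(math.floor(lod))
    upper = min(lower + 1, len(mip_levels) - 1)
    blend = lod - lower  # 0 means all lower level, close to 1 means mostly upper
    return lerp(sample_bilinear(mip_levels[lower], u, v),
                sample_bilinear(mip_levels[upper], u, v),
                blend)
```

Each trilinear sample here costs two bilinear lookups plus a blend, roughly twice the work of plain bilinear filtering, which is why implementations look for places where the second lookup can be skipped.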
But the madness doesn't stop there. As it happens, trilinear filtering is isotropic. What this means to us is that trilinear filtering produces blurry images when not viewed straight on. The reason this happens is that the pixels in the mipmap levels are laid out in a square grid. Anisotropic filtering allows us to change the shape of the region we sample and interpolate between each mipmap based on the viewing angle. (Think of this as needing to be done for the same reason tilting a square object makes the far edge look narrower than the near edge.) The way this is done is also up to the implementation, thus adding another level of complexity in determining the color of one pixel due to one texture on one surface.
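Since the details are left to each implementation, the following is only one simplified way to picture anisotropic filtering: take several isotropic samples spread along the long axis of the pixel's footprint in texture space and average them. The function name, its parameters, and the idea of passing in any isotropic sampler as a callable are all assumptions made for this sketch; neither vendor's actual algorithm is described here.

```python
def sample_anisotropic(sample, u, v, axis_du, axis_dv, taps):
    """Average `taps` samples spread along the footprint's long axis.

    sample              -- any isotropic sampler, called as sample(u, v)
    (axis_du, axis_dv)  -- full extent of the long axis in texture coordinates
    taps                -- number of samples (loosely comparable to 2x, 4x, 8x)
    """
    total = 0.0
    for i in range(taps):
        # Offsets are centered on (u, v) and span the range (-0.5, 0.5).
        t = (i + 0.5) / taps - 0.5
        total += sample(u + t * axis_du, v + t * axis_dv)
    return total / taps
```

With a single tap this reduces to the underlying isotropic sampler; more taps cover a longer, thinner region without having to drop to a blurrier mipmap level.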
Antialiasing
Non-horizontal and non-vertical straight lines and edges need to be approximated via multiple small horizontal or vertical lines. This is due to the fact that pixels on the screen are laid out in a grid. The resulting problem is called aliasing and it can be seen as jagged edges in an image.
To combat this, supersample antialiasing was created. The principle behind this is that a larger resolution image is generated for each scene. This larger image is then downsampled and filtered. Of course, generating this image is very expensive, and to combat the performance hit, multisample antialiasing was devised.
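The downsampling step of supersampling can be sketched with a plain box filter, assuming a greyscale image stored as a list of rows that was rendered at an integer multiple of the target resolution. The function name and the choice of a box filter are assumptions for this example; other filters are possible.

```python
def downsample(image, factor):
    """Average each factor x factor block of a high resolution image."""
    out_height = len(image) // factor
    out_width = len(image[0]) // factor
    result = []
    for y in range(out_height):
        row = []
        for x in range(out_width):
            block = [image[y * factor + dy][x * factor + dx]
                     for dy in range(factor)
                     for dx in range(factor)]
            row.append(sum(block) / len(block))
        result.append(row)
    return result
```

Pixels that straddle an edge in the large image end up with intermediate values in the small one, which is what softens the jagged steps. The cost is that every one of the extra pixels had to be fully rendered first.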
Multisample antialiasing works on the same principle as supersample without having to generate the image at a larger resolution. The idea is to generate multiple, slightly-shifted versions of each pixel at the same time for use in the filtering process. The level of antialiasing has to do with the number of pixels generated (subsamples) to calculate the final pixel color value.
The differences in image quality come in when we look at what subsamples are generated for each pixel. Generating subsamples that form a square with horizontal and vertical edges within each pixel means we are using a method called ordered grid antialiasing. This doesn't lend itself to smoothing very close to horizontal or very close to vertical edges. To address such edges, we can slightly rotate or skew the square (giving us the rotated grid antialiasing scheme).
|The squares represent pixels while the dots within are multisample subsamples.|
As it happens, ATI uses rotated grid antialiasing, while NVIDIA sticks with ordered. This creates a lot of room for variation between the cards' antialiased images.
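A small sketch shows why the sample layout matters for near-vertical edges. The two 4-sample patterns below are illustrative positions chosen for this example, not the exact patterns used by either vendor, and the helper names are hypothetical.

```python
# Subsample positions inside a unit pixel, as (x, y) pairs.
ORDERED_GRID = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
ROTATED_GRID = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]


def coverage(samples, edge_x):
    """Fraction of subsamples lying to the left of a vertical edge at x = edge_x."""
    return sum(1 for x, _ in samples if x < edge_x) / len(samples)


def distinct_levels(samples, steps=100):
    """Count the different coverage values seen as the edge sweeps across the pixel."""
    return len({coverage(samples, i / steps) for i in range(steps + 1)})
```

In the ordered pattern the four samples share only two distinct x positions, so a vertical edge sweeping across the pixel produces just three coverage values (none, half, all). The rotated pattern has four distinct x positions and yields five values, giving a smoother gradient along edges that are nearly vertical; the same argument applies to y positions and nearly horizontal edges.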
The Hardware
For this test, we used the high-end ATI Radeon 9800XT and NVIDIA GeForce FX 5950. The drivers employed were CATALYST 3.9 for ATI and ForceWare 53.03 for NVIDIA. Where automated screenshots could be taken, they were, and in other games, HyperSnap was used in order to help ensure that the image we captured from the cards was as close to the image on the screen as possible.
The screenshots were taken on the same AMD Athlon64 FX51 system that we used in the performance tests. We have made full-sized png images available for all of our screenshots, and we disabled resampling and used the highest quality lossy jpeg compression that Photoshop offers when resizing and saving the inline versions.
D3D AF Tester
For these tests, we used the D3D AF Tester that can be found at 3dcenter. Our tunnel test was taken by turning up the anisotropic filtering level to 8x and using full-colored mipmaps (all other settings were default). The plane tests were done using a distance of 15 and an angle of 75 degrees at each level of anisotropy and stage tested.
|ATI Tunnel 8x Anisotropic Filtering|
|NVIDIA Tunnel 8x Anisotropic Filtering|
We can see the effect of ATI's weighted Manhattan distance calculation on the tunnel test. The LOD calculation is heavily dependent on x and y positions. This really gives us a good idea of just how differently NVIDIA and ATI GPUs render textures. The resolution of the texture used varies a great deal with respect to the angle of the surface being textured. This means that just by rotating something around, the quality of the filtering being used is compromised (a smaller resolution texture than needed will be used). As there is a lot of interpolation going on here and horizontal and vertical surfaces are very common, the effects aren't particularly noticeable in game play. Of course, this helps to explain why in some screenshots, NVIDIA's textures look sharper than ATI's, while in others, ATI's textures look sharper than NVIDIA's.
We can see the differences between trilinear and the different levels of anisotropic filtering on the plane test. Anisotropic filtering is able to use higher resolution textures better at greater distances when the texture is at an angle to the viewer. Also, the differences between levels (2x, 4x, and 8x) of anisotropic filtering can be seen. Enabling a higher level of anisotropic filtering extends the distance from the viewer at which anisotropic filtering stops and trilinear filtering starts.
From these pictures, it is also evident that NVIDIA does more bilinear filtering when texels are near screen pixel sizes. This has been dubbed “brilinear” filtering by some, as the method blends bilinear filtering more heavily than usual into higher order filtering techniques. We have yet to see any perceptible image quality issues arise from this, and as long as banding is not evident and the highest possible resolution textures are used, this filtering method serves its purpose. NVIDIA assures us that if we can point out a loss in image quality in any game, they will correct it.
In general, these images look very similar. The biggest difference we noted was that ATI seems to have a higher threshold for how opaque something needs to be in order to be drawn. Originally, we had noticed this issue in frame 4000, and NVIDIA pointed out an instance of the problem in frame 5100.
The most noticeable thing in this comparison is that the flashlight on the NV card has soft edges, while the ATI has hard edges. Since soft edges on lighting or shadowing effects are generally harder to calculate, it seems like the NVIDIA card is doing a little more work in this scene. Gearbox has stated that there are some problems with ATI's image quality, but we don't know exactly what was being referenced.
Jedi Knight: Jedi Academy
In this game, with or without anisotropic filtering and antialiasing, there is a noticeable difference in the quality of the light sabre. We saw slight issues with ATI's alpha blending algorithms in Aquamark3, but this is a much more pronounced difference. It is possible that there are other factors behind this discrepancy, and we will try to look into the matter further.
We can say from our 4xAA 8xAF screenshots that ATI does a better job in eliminating jagged edges as the rotated grid scheme suggests it should.
Tomb Raider: Angel of Darkness
It proved too tricky to capture close enough to the same frame on both cards for a difference image. We can see, however, that the ground on the ATI card has more lighting effects. We can see the same thing in the frame with anisotropic filtering and antialiasing turned on.
Again, ATI does a better job at antialiasing in this game than the NVIDIA card.
Also, ATI has helped us track down the motion issue that we've been seeing on their cards. ATI supports using a separate filtering scheme for texture magnification (filtering when the screen pixels are smaller than texels) and minification (filtering when the screen pixels are larger than texels). NVIDIA hardware requires that magnification be done using the same filtering method as minification. Since TRAOD only requests anisotropic filtering be done on texture minification, NVIDIA does anisotropic filtering just fine while ATI doesn't do anisotropic filtering on magnification. The difference in filtering methods causes the flickering effect by which we have been bothered, and could be fixed via a patch from EIDOS; though, ATI reports that EIDOS is unwilling to do so. ATI could fix the problem themselves by doing application detection (determining that TRAOD is running and then adjusting settings specific to that game), but ATI is unwilling to take this step (even though it would be valid and helpful) in order to avoid the controversy.
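The distinction between the two behaviors described above can be modeled in a few lines. This is a toy model of the behavior as reported in this article, not driver code; the function names, the string labels for filters, and the `texels_per_pixel` parameter are all invented for the illustration.

```python
def choose_filter_separate(texels_per_pixel, min_filter, mag_filter):
    """Hardware model with independent minification and magnification filters."""
    if texels_per_pixel > 1.0:
        return min_filter  # screen pixels larger than texels: minification
    return mag_filter      # screen pixels smaller than texels: magnification


def choose_filter_shared(texels_per_pixel, min_filter, mag_filter):
    """Hardware model that applies the minification filter in both cases."""
    return min_filter
```

If a game requests anisotropic filtering only for minification, the first model switches to a different filter as soon as a surface is magnified, while the second keeps using anisotropic filtering throughout. A surface that crosses that boundary from frame to frame would change filtering method under the first model but not the second.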
The two GPUs render pretty similar images in this game. This is quite interesting considering that Tron 2.0 makes very heavy use of a glow effect (with high transparency) driven by a pixel shader program. It is possible that this has something to do with the way the shader was written, but whatever the reason, we were very happy with the image quality in this game on both cards.
There really isn't any use in enabling anisotropic filtering in this game, as the style of the textures just doesn't benefit from the additional filtering. We can see, however, that the ATI card does a better job on antialiasing.
Unreal Tournament 2003
These images are very similar. Even if we were to look at a different image and see some variation, there isn't a perceptible difference in the way each card renders this scene.
Here, we have an 800% zoomed image in order to take a look at antialiasing. As we can see, both cards do a pretty good job of eliminating any jagged edges in this example, but we aren't looking at near horizontal or near vertical lines (which would look better with ATI's rotated grid antialiasing).
At NVIDIA's Editor's Day this year, it was pointed out that under certain conditions, ATI cards will not render detail textures on objects very close to the viewer, while NVIDIA cards will. This issue has been clarified for us by EPIC. The problem occurs when the LOD setting in UT2K3 is above 0.5, which affects ATI's mipmapping algorithm differently than NVIDIA's (ATI cards never rendered the highest detail mipmap under these conditions). With the LOD set below 0.5, there is no problem with ATI rendering detail textures.
X2: The Threat
Both cards look fairly similar in this game, until we turn on AA.
The ATI card does a much better job of smoothing out edges in this game. The difference in quality is definitely noticeable. The stuttering issue that we noted in earlier NVIDIA drivers has been significantly smoothed out with the 53.03 ForceWare release. While the motion is still smoother under ATI, we are glad to see that NVIDIA has worked to improve their image quality in this game. The fix has also resulted in lower performance numbers (which we will revisit in an upcoming article).
Final Words
Making useful sense of all this information is tricky at best.
There is no real-time 3D engine or hardware in existence that does everything the “right way” to all things visible all the time. In order to achieve real-time 3D rendering, trade-offs must be made. Only when these trade-offs become highly perceptible are they a problem.
It is the GPU makers' responsibility to implement optimizations in ways that don't negatively impact the image quality of a scene, but there really isn't a way to quantitatively make a decision about any given optimization. That which is acceptable to one person may not be acceptable to another, and it is a tough call to make.
One stopgap is the end user community's perspective on the issues. If the people who own (or might buy) a particular card decide that a particular optimization shouldn't be done, it is in the GPU makers' best interest to make some changes.
It is in game developers' best interest to work with GPU makers to keep image quality top notch for their game's sake. In fact, rather than concentrating on getting raw frame rate to the end user, IHVs should focus on getting powerful and easy-to-use features to the developer community. So far, ATI has a leg up on ease of use (developers have said that programming has gone quickly and smoothly with ATI cards, with NVIDIA code paths taking longer to tweak), while NVIDIA's hardware offers more flexibility (NVIDIA allows much longer shader programs than ATI and offers functionality above the minimum of current APIs). At this point, ATI is in a better position because it doesn't matter if NVIDIA offers more functionality if the only code that can take advantage of it runs incredibly slowly after taking a very long time to develop. Hopefully, we will see more flexibility from ATI and fewer nuances in how programs need to be written from NVIDIA in next year's hardware.
At this point, ATI uses a more visually appealing algorithm for antialiasing, while NVIDIA does a better job calculating texture LOD and does more alpha blending. The question we are asking now is whether or not these optimizations degrade image quality in any real way. We feel that NVIDIA needs to refine its antialiasing algorithms, and ATI needs to do a better job of rendering alpha effects. We are still looking into the real world effects of the distance calculations that ATI uses in determining LOD, but the problem definitely manifests itself in a more subtle way than the other two issues that we have raised.
The decision on what is acceptable is out of our hands, and we can't really declare a clear winner in the area of image quality. We can say that it appears from the tests we've done that, generally, NVIDIA hardware does more work than ATI. Honestly, it is up to the reader to determine what aspects of image quality are important, and how much of what we covered is relevant.
We really don't have a good way to compare pixel shader rendering quality yet. The possible issues with different shader implementations have yet to be seen in a game, and we hope they never will. It is a developer's responsibility to create a game that gives a consistent experience across the two most popular GPUs on the market, and both ATI and NVIDIA have the ability to produce very high quality shader effects. Each architecture has different limitations that require care when programming, and we will still have to wait and see whether or not there will be image quality differences when more DX9 games hit the shelves.
For now, we are committed to bringing to light as much information as possible about image quality and optimizations in graphics hardware. Armed with this information, individuals will be able to come to their own conclusions about which optimizations go too far and which serve their intended purpose. We hope that all the details that we have brought to light have served their purpose in helping our readers to make informed decisions about graphics hardware.