Workstation Graphics: AGP Cross Section 2004
by Derek Wilson on December 23, 2004 4:14 PM EST
ATI FireGL X3-256 Technology
The FireGL X3-256 is based on ATI's R420 architecture. While this isn't a surprise, it is interesting that the highest end AGP offering that ATI has on the table is based on the X800 Pro. On the PCI Express side, ATI is offering a higher performance part, but for now, the FireGL on AGP is a little more limited than on PCI Express. When we tackle the PCI Express workstation market, we'll bring out a clearer picture of how ATI's highest end workstation component stacks up against the rest of the competition. As the ATI part isn't positioned as an ultra high end workstation solution, we'll be focusing more on price performance. Unfortunately for ATI, the street price of the 3Dlabs Wildcat Realizm 200 comes in at just about the same as the Radeon FireGL X3-256 and is targeted at a higher performance point. But we'll have to see how that pans out when we've taken a look at the numbers. For now, let's pop open the hood on the ATI FireGL X3-256.
We will start out with the vertex pipeline as we did with the NVIDIA part. The overall flow of data is very similar to the Quadro, except, of course, that the ATI part runs with 12 pixel pipelines rather than 16. The internals are the differentiating factor.
We can see that the ATI vector unit supports the parallel operation of a 4x 32-bit vector unit and a 32-bit scalar unit. This allows the same type of operation that the NVIDIA GPU supports, but the FireGL lacks the VS 3.0 capabilities and support for vertex textures. Interestingly, in the documents that list the features of the FireGL X3, we see that "Full DX9 vertex shader support with 4 vertex units" is mentioned in addition to its "6 geometry engines". This obviously indicates that 2 of the geometry engines don't handle full DX9 functionality. This isn't of as much importance in a workstation part, as the fixed function path will be more often stressed, but it's worth noting that this core is based on the desktop part and we didn't pick up this information from any of our desktop briefings or data sheets.
The FireGL X3-256 employs the HyperZ HD engine that the Radeon uses, which combines early/hierarchical z hardware with a z/stencil cache and z compression. The hierarchical z engine looks at tiles of pixels (16x16 blocks in the case of the FireGL), and if the entire block is occluded, up to 256 pixels can be eliminated in one clock. These pixels never need to touch the fragment/pixel processing hardware, which saves a lot of processing power. When we look at the pixel engine, we can see that ATI divides its pixels into "quad" pipes as well, but NVIDIA and ATI define a quad slightly differently. On ATI hardware, data out of setup is tiled into those 16x16 blocks for the hierarchical z pass. It is across these blocks that the quad pipes share their efforts.
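To make the tile rejection idea concrete, here is a minimal software sketch of a hierarchical z test over 16x16 tiles. It is only an illustration, not ATI's implementation: the depth convention (smaller values are nearer, with a less-than depth test), the per-tile "farthest depth" summary, and all function names are assumptions made for this example. It also assumes the buffer dimensions are multiples of the tile size.

```python
TILE = 16  # tile edge in pixels, matching the 16x16 blocks described above


def build_tile_max_depth(depth, width, height, tile=TILE):
    """Summarize a depth buffer as {(tile_x, tile_y): farthest depth in that tile}."""
    tiles = {}
    for y in range(height):
        for x in range(width):
            key = (x // tile, y // tile)
            d = depth[y][x]
            if key not in tiles or d > tiles[key]:
                tiles[key] = d
    return tiles


def tile_is_occluded(tile_max_depth, incoming_min_depth):
    """Conservative test: if even the nearest incoming depth is not in front of
    the farthest stored depth, no pixel in the tile can pass a less-than test."""
    return incoming_min_depth >= tile_max_depth


def count_rejected_pixels(tiles, incoming_min_depth, tile=TILE):
    """Count pixels that can skip pixel processing entirely (whole tiles only)."""
    rejected_tiles = sum(
        1 for max_d in tiles.values() if tile_is_occluded(max_d, incoming_min_depth)
    )
    return rejected_tiles * tile * tile


# Hypothetical 32x16 depth buffer: left tile holds near geometry, right tile far geometry.
depth = [[0.2] * 16 + [0.9] * 16 for _ in range(16)]
tiles = build_tile_max_depth(depth, width=32, height=16)

# A new primitive whose nearest point is at depth 0.5 covers both tiles.
rejected = count_rejected_pixels(tiles, incoming_min_depth=0.5)
print(rejected)  # 256: the whole left tile is rejected with a single comparison
```

The point of the sketch is that one comparison per tile stands in for up to 256 per-pixel depth tests, which is where the savings described above come from.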
Inside each of the pixel pipes, we have something that also looks similar to the NVIDIA architecture. ATI can complete two 3-component vector operations and two scalar operations in combination with a texture operation every clock cycle. This is what the hardware ends up looking like:
Since the texture unit does not share hardware with either of the shader math units, ATI can theoretically handle more math per clock cycle in its pixel shaders than NVIDIA. That said, the 3 + 1 arrangement is not as flexible as NVIDIA's, as NVIDIA is also capable of handling 2 vector + 2 vector operations.
With only PS 2.0 support, ATI's pixel shader hardware is not as robust as either NVIDIA's or 3Dlabs'. The FireGL can only support between 512 and 1536 shader instructions depending on the conditions, and uses fp24 for processing. The Radeon architecture has traditionally favored DirectX over OpenGL, so we will be very interested to see where these predominantly OpenGL benchmarks end up.
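As a rough illustration of what fp24 processing means for precision, the sketch below rounds a value to a given number of stored mantissa bits. The fp24 layout used here (1 sign bit, 7 exponent bits, 16 mantissa bits) is an assumption based on how this format is commonly described and is not stated in this article; the 23-bit case is IEEE single precision, included for comparison. The function name is hypothetical, and exponent range limits and denormals are deliberately ignored.

```python
import math

FP24_MANTISSA_BITS = 16  # assumed fp24 layout: 1 sign, 7 exponent, 16 mantissa
FP32_MANTISSA_BITS = 23  # IEEE 754 single precision, for comparison


def quantize_mantissa(value, mantissa_bits):
    """Round value to the nearest number with the given stored mantissa width.

    Only the mantissa is modeled; exponent range and denormals are ignored.
    """
    if value == 0.0 or not math.isfinite(value):
        return value
    m, e = math.frexp(value)          # value == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (mantissa_bits + 1)  # stored bits plus the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)


# A small offset added to 1.0 survives in fp32 but is rounded away in fp24.
x = 1.0 + 2 ** -20
print(quantize_mantissa(x, FP32_MANTISSA_BITS) == x)    # True
print(quantize_mantissa(x, FP24_MANTISSA_BITS) == 1.0)  # True
```

Under these assumptions, the smallest step above 1.0 is 2^-16 at fp24 versus 2^-23 at fp32, which gives a feel for how much sooner rounding error can accumulate in long shader calculations.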
As far as rasterization is concerned, ATI does not support any floating point framebuffer display types. The highest accuracy framebuffer that the FireGL X3-256 supports is a 10-bit integer format, which is good enough for many applications today. As with both 3Dlabs' and NVIDIA's parts, the FireGL X3-256 includes dual 10-bit RAMDACs and 2 dual-link DVI-I connections allowing support of up to 9MP displays. Unlike the Wildcat Realizm and Quadro FX lines, there is no way to get any sort of genlock, framelock, or SDI output support for the FireGL line. This puts ATI behind when it comes to video editing, video walls, multi-system displays, and broadcast solutions.
The added features that ATI's FireGL X3-256 supports beyond the Radeon include:
- Anti-aliased points and lines - Lines and points are smoothed as they're drawn in wireframe mode. This is much higher quality and faster than FSAA when used for wireframe graphics, and is of the utmost importance to designers who use workstations for wireframe manipulation (the majority of the 3D workstation market).
- Two-sided lighting - In the fixed function pipeline, enabling two-sided lighting allows hardware lights to illuminate both sides of an object. This is useful for viewing cut-away objects. SM 3.0 supports two-sided lighting registers for programmable shaders, but these don't apply to the fixed function light sources.
- OpenGL overlay planes - Overlays are useful for adding to a 3D accelerated viewport without making the buffer dirty. This can significantly speed up things like displaying pop-up windows or selection highlights in 3D applications.
- 6 user defined clip planes - User defined clip planes allow the cutting away of surfaces in order to look inside objects in applications that support their creation.
- Quad-buffered stereo 3D support - This enables smooth real-time stereoscopic image output by supporting a front-left, back-left, front-right, and back-right buffer for display.
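To make the user defined clip planes item above more concrete, here is a minimal sketch of the test such a plane performs. In OpenGL, a clip plane is specified by four coefficients (A, B, C, D), and geometry is kept where A*x + B*y + C*z + D >= 0 for a point with w = 1. The helper names below are hypothetical, and the sketch filters individual points rather than clipping triangles the way real hardware does. The limit of 6 planes mirrors the feature list above.

```python
MAX_USER_CLIP_PLANES = 6  # per the FireGL X3-256 feature list above


def is_kept(plane, point):
    """Return True if the point lies on the kept side of the plane (w assumed 1)."""
    a, b, c, d = plane
    x, y, z = point
    return a * x + b * y + c * z + d >= 0.0


def clip_points(planes, points):
    """Keep only the points that pass every active clip plane."""
    if len(planes) > MAX_USER_CLIP_PLANES:
        raise ValueError("more clip planes than the hardware limit assumed here")
    return [p for p in points if all(is_kept(pl, p) for pl in planes)]


# Hypothetical cut-away: keep the slab 0 <= x <= 1 of a model.
planes = [
    (1.0, 0.0, 0.0, 0.0),   # keeps x >= 0
    (-1.0, 0.0, 0.0, 1.0),  # keeps x <= 1
]
points = [(-1.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 1.0, 1.0)]
print(clip_points(planes, points))  # [(0.5, 0.0, 0.0)]
```

Stacking several planes like this is what lets a CAD application slice into an assembly from multiple directions at once.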