Original Link: http://www.anandtech.com/show/536
Although they have come very close in recent times, ATI has always fallen short of being able to deliver a truly high performance gaming video card. The example that comes to mind is the ATI Rage 128. When we previewed the solution back in December of 1998, the Rage 128 was able to give NVIDIA’s Riva TNT a run for its money while at the same time providing us with the illusion of 32-bit color rendering with virtually no performance penalty. We later found out that the performance penalty was in fact there, just masked by poor 16-bit color performance, but at the time, the Rage 128 seemed to be the best 2D/3D gaming solution.
Had the Rage 128 been available in December when we first looked at it, it would have definitely found its way into the systems of holiday buyers that were looking for a better gaming experience. However, it wasn’t until a few months later that the Rage 128 finally made its way onto store shelves, and by that time, 3dfx and NVIDIA were ready with their Voodoo3 and TNT2 products that had no problem dominating the aging Rage 128 architecture.
ATI’s most recent attempt was the Rage Fury MAXX, a brute force attempt to compete with NVIDIA’s high performing GeForce. While the MAXX performed much more competitively than the Rage 128 at its release, and while the MAXX did come out in a reasonable time frame, the solution was plagued by the usual ATI driver problems and a number of other issues such as not being able to work properly on VIA Apollo Pro 133A/KX133 motherboards until a recent beta driver was made publicly available.
ATI is at it again, this time making some extremely powerful statements about what their next-generation product is capable of doing. At this year’s WinHEC, ATI unveiled their Rage6C core which will be sold under the name Radeon 256.
Corresponding to a new series of products, the Radeon name will replace the Rage name as the next-generation of graphics solutions from ATI. ATI is betting quite a bit on the success of the Radeon 256, and it will have to be more successful than the Rage line it is replacing in order for that bet to pay off.
While we don’t have a card to dazzle you with, or pages of benchmarks for you to surf through, we do have an interesting preview of exactly what is behind what ATI claims will be the fastest desktop level graphics accelerator upon its release.
Let’s take a look at the technology behind the Radeon 256.
Another T&L Supporter
The Radeon 256 marks a new move for ATI, and with this step, they have chosen the path of NVIDIA, believing that the future lies in Hardware T&L and not entirely in the cinematic effects boasted by 3dfx’s Voodoo5.
The Radeon name will be used by ATI from this point forward for all of their chips that feature their “Charisma Engine,” which is their Hardware Transform, Clipping & Lighting Engine.
The Charisma Engine handles much more than NVIDIA’s Hardware T&L engine does because, in addition to performing transforming and lighting calculations on the chip, the engine also allows for clipping operations as well as two interesting features, vertex skinning and keyframe interpolation, to be performed on-chip instead of offloading those calculations onto the host CPU.
Vertex skinning allows for more realistic bending/moving of polygons in games that use skeletal animation so that characters appear to move more realistically than the blocky motion we’re used to. The perfect example would be Half Life, which uses skeletal animation; unfortunately, the game’s engine doesn’t support vertex skinning.
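To make the idea concrete, here is a minimal sketch of weighted vertex blending, which is one common way skinning is done. It assumes 2D points and bones that only rotate about the origin; the function names and the elbow example are ours for illustration and say nothing about how the Charisma Engine implements the feature.

```python
import math

def rotate(point, angle_deg):
    """Rotate a 2D point about the origin by angle_deg degrees."""
    a = math.radians(angle_deg)
    x, y = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def skin_vertex(point, bone_angles, weights):
    """Blend the positions each bone would give the vertex, using per-vertex weights."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    x = y = 0.0
    for angle, w in zip(bone_angles, weights):
        px, py = rotate(point, angle)
        x += w * px
        y += w * py
    return (x, y)

# A vertex near an elbow: the upper arm stays put (0 degrees),
# the forearm bends 90 degrees. Both hypothetical bones pivot at the origin.
vertex = (1.0, 0.0)
rigid = skin_vertex(vertex, [0.0, 90.0], [0.0, 1.0])    # follows the forearm only
blended = skin_vertex(vertex, [0.0, 90.0], [0.5, 0.5])  # shared between both bones
```

The rigid vertex snaps all the way round with the forearm, while the blended vertex lands halfway between the two bone positions, which is what smooths out the joint.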
Keyframe interpolation is another animation acceleration feature that takes the starting and ending frames of an animation and, by measuring the changes between the two frames, can interpolate intermediate frames so that you don’t have to generate as many frames for a single animation.
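As a rough sketch of what interpolating between two keyframes means, the following assumes plain linear interpolation of vertex positions. The function names and sample data are hypothetical, and the hardware may well use a different scheme.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

def interpolate_keyframes(start, end, steps):
    """Return `steps` intermediate frames between two keyframes.

    Each keyframe is a list of (x, y, z) vertex positions.
    """
    frames = []
    for i in range(1, steps + 1):
        t = i / (steps + 1)
        frames.append([tuple(lerp(a, b, t) for a, b in zip(v0, v1))
                       for v0, v1 in zip(start, end)])
    return frames

start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
end = [(0.0, 4.0, 0.0), (1.0, 4.0, 8.0)]
frames = interpolate_keyframes(start, end, 3)  # frames at t = 0.25, 0.5, 0.75
```

Only the two keyframes need to be authored and stored; the three in-between frames are generated on demand.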
Just as with the GeForce, this frees up the host CPU to handle other operations. A potential downside is that, as CPUs get faster, the host CPU may be able to handle these tasks more quickly than the Charisma Engine unless a game specifically takes advantage of the features the engine supports.
ATI is betting on the most popular games coming out this Christmas taking advantage of these specific features so that their Charisma Engine will actually have some purpose to it, which hasn’t been the case thus far with NVIDIA’s Hardware T&L since very few currently available games take advantage of hardware T&L.
Luckily for ATI (and for NVIDIA), it seems like they put their money on the right feature since there will be quite a bit of support for hardware T&L in games that will be available towards the end of this year. And since NVIDIA has already done quite a bit of promotion for their hardware T&L engine, ATI should have a much easier time promoting the Charisma Engine since they can just point to NVIDIA’s GeForce/GeForce 2 and basically say “we’re following their lead.” This also weakens 3dfx’s argument that hardware T&L isn’t necessary; then again, by the time the Radeon 256 is actually available, 3dfx’s next-generation product (Rampage) should be available, and it may boast a hardware T&L engine of its own.
ATI basically took it upon themselves to add as many DirectX 8 features as possible without having a hard copy of the specification to go by. In doing so, they are betting quite a bit on their Radeon 256 and its Charisma Engine.
Exactly how powerful is this Charisma Engine?
ATI claims that the Radeon 256 is the “most advanced GPU [Graphics Processing Unit] ever designed” and its 30 million transistor count will lead you to believe that they aren’t lying.
Capable of processing 30 Million Triangles per Second (compared to ~10 million triangles per second for the GeForce and about double that for the GeForce 2 GTS), the Charisma Engine will be more powerful than the GeForce’s T&L engine and the GeForce 2 GTS’ T&L engine. The same can’t be said of NVIDIA’s forthcoming NV20, which should be due out sometime in September, but for now, and when it first becomes available, the Radeon 256 will have the most powerful T&L engine on the market.
One of the major advantages ATI is claiming for their Charisma Engine over NVIDIA’s T&L solution is that, as more light sources are added, it retains close to 25% more of its original performance than the GeForce’s T&L engine does.
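To see why performance falls off as lights are added, consider a minimal sketch of per-vertex diffuse lighting. It assumes simple directional lights and a Lambertian (N dot L) term; the names are hypothetical and it does not model either vendor’s hardware or the 25% figure.

```python
def normalize(v):
    """Scale a vector to unit length."""
    length = sum(c * c for c in v) ** 0.5
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def diffuse_intensity(normal, light_dirs):
    """Sum the N dot L term for every light.

    Returns (intensity clamped to 1.0, number of per-light evaluations).
    """
    n = normalize(normal)
    total = 0.0
    evaluations = 0
    for light in light_dirs:
        total += max(0.0, dot(n, normalize(light)))
        evaluations += 1
    return min(total, 1.0), evaluations

one_light = diffuse_intensity((0, 0, 1), [(0, 0, 1)])
three_lights = diffuse_intensity((0, 0, 1), [(0, 0, 1), (1, 0, 0), (0, 0, -1)])
```

Every extra light adds another normalize-and-dot step for every vertex in the scene, so a T&L engine’s triangle throughput is bound to drop as lights are added; ATI’s claim is about how gracefully it drops.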
The 0.18-micron Radeon 256 features an interesting architecture, consisting of two rendering pipelines with three texture units per rendering pipeline.
Each pipeline is spec’d to run at up to 200MHz (core clock), which will result in a fill rate of 400 megapixels per second and 1.2 gigatexels per second. ATI feels that having three texture units per pipeline is going to help them in future games, and although it will hold them back with dual textured games, they strongly believe that developers will begin using three or more textures per pixel in future games.
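The fill rate figures follow directly from the pipeline layout, as this short calculation shows. The helper names are ours, and the dual-texturing case assumes the third texture unit simply sits idle, which is our simplification rather than something ATI has stated.

```python
def fill_rates(pipelines, texture_units_per_pipeline, core_mhz):
    """Return (pixels per second, texels per second) at the given core clock."""
    pixels_per_sec = pipelines * core_mhz * 1_000_000
    texels_per_sec = pixels_per_sec * texture_units_per_pipeline
    return pixels_per_sec, texels_per_sec

def effective_texel_rate(pixels_per_sec, textures_per_pixel, units_per_pipeline):
    """Texels actually fetched per second when a game uses fewer textures than units."""
    return pixels_per_sec * min(textures_per_pixel, units_per_pipeline)

pixels, texels = fill_rates(pipelines=2, texture_units_per_pipeline=3, core_mhz=200)
dual_textured = effective_texel_rate(pixels, textures_per_pixel=2, units_per_pipeline=3)

print(f"{pixels / 1e6:.0f} megapixels/s, {texels / 1e9:.1f} gigatexels/s")
print(f"dual textured: {dual_textured / 1e6:.0f} megatexels/s")
```

Under that assumption, a dual textured game would use only two thirds of the peak texel rate, which is the sense in which the design holds the chip back today.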
According to ATI, even if you’re using 200MHz DDR SDRAM (effectively 400MHz), you’re limited to a fill rate of 300 megapixels per second at 32-bit color with a 32-bit Z-Buffer, so adding more pixel pipelines wouldn’t help; this is why they focused on having three texture units per pipeline instead. Whether or not that limit actually holds is another issue entirely.
One thing you will notice about the Radeon 256 is that there is a strong sense of solid developer consulting on the part of ATI. Instead of stabbing in the dark regarding what features the Carmacks and Sweeneys of the industry will be implementing in the future, ATI focused on improving their developer relations, consulting with the developers that are going to be driving the industry and asking them what features they would like to see.
Each Radeon 256 chip (indicating the possibility of a Radeon 256 MAXX product with more than one Radeon 256 chip) is capable of driving up to 128MB of SDRAM. Depending on the availability of specific SDRAM parts, we may see Radeon 256 based products emerge with more than 32MB per chip.
The Radeon 256 architecture features a 128-bit memory bus capable of interfacing with both SDR and DDR SDRAM. As of now, we can expect to see 200MHz DDR SDRAM (effectively 400MHz) on the shipping Radeon 256 based product; however, depending on the availability of 200MHz DDR SDRAM, this could change. Considering that you can’t expect to see a Radeon 256 based product until August 2000, getting 200MHz DDR SDRAM shouldn’t be a problem for ATI closer to the shipping date of the card.
Assuming that the Radeon 256 is outfitted with 200MHz DDR SDRAM, this gives it a fairly large amount of memory bandwidth to play around with. With a 128-bit memory bus, running at 200MHz DDR, we’re talking about 6.4GB/s of peak available memory bandwidth. For comparison purposes, a DDR GeForce has 4.8GB/s of available memory bandwidth, and the upcoming GeForce 2 GTS will have around 10% more than that.
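The bandwidth figures can be checked with a few lines of arithmetic. The bus width and the 200MHz DDR clock come from ATI’s specifications above; the 150MHz DDR clock for the GeForce is our assumption, chosen because it reproduces the 4.8GB/s figure on a 128-bit bus.

```python
def peak_bandwidth_gb_s(bus_bits, clock_mhz, ddr=True):
    """Peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    transfers_per_sec = clock_mhz * 1e6 * (2 if ddr else 1)
    return bus_bits / 8 * transfers_per_sec / 1e9

radeon = peak_bandwidth_gb_s(128, 200)       # 128-bit bus, 200MHz DDR
geforce_ddr = peak_bandwidth_gb_s(128, 150)  # assumed clock, see note above

# Memory traffic per pixel implied by ATI's 300 megapixels/s limit.
bytes_per_pixel = radeon * 1e9 / 300e6

print(f"Radeon 256: {radeon:.1f} GB/s, DDR GeForce: {geforce_ddr:.1f} GB/s")
print(f"implied traffic per pixel: {bytes_per_pixel:.1f} bytes")
```

Dividing the peak bandwidth by the 300 megapixels per second limit ATI quotes works out to a little over 21 bytes of memory traffic per pixel. That is only a division on our part, not ATI’s own accounting of where the bytes go.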
As we have seen over the past few months, the memory bandwidth of today’s graphics accelerators is severely limiting their performance. The perfect example is the incredible performance advantage switching to DDR SGRAM offered the GeForce over its initial SDR SDRAM memory solution. This should definitely be an area where the Radeon 256 excels, but then again, with the features that ATI is promising with this new core, they’re going to need all the memory bandwidth they can get.
One of the more interesting features of the Radeon 256 is its support for something ATI likes to call HyperZ technology.
When talking about conserving memory bandwidth, a major hog is the Z-Buffer, which determines how “deep” objects on the screen are supposed to be. Especially when dealing with a 32-bit Z-Buffer, which requires twice the storage space of a 16-bit Z-Buffer, you lose quite a bit of memory bandwidth because of the Z-Buffer.
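For a sense of scale, here is the storage a Z-Buffer needs at each depth. The 1024 x 768 resolution is just an example we picked, not a figure from ATI.

```python
def z_buffer_bytes(width, height, bits_per_pixel):
    """Storage needed for one Z-Buffer at the given resolution and depth precision."""
    return width * height * bits_per_pixel // 8

for bits in (16, 32):
    size = z_buffer_bytes(1024, 768, bits)
    print(f"{bits}-bit Z-Buffer at 1024x768: {size / 2**20:.1f} MB")
```

The storage itself is modest; the cost comes from reading and writing those values for every pixel drawn, every frame, which is where the doubled size turns into lost bandwidth.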
ATI’s HyperZ technology borrows its theory from tile based rendering architecture (such as that found on the PowerVR2), which basically renders any given scene in tiles instead of on a per polygon basis, which allows objects that aren’t going to be visible to the viewer (because they are covered up by other objects) to be skipped during the rendering process.
Now since the Radeon 256 doesn’t boast a tile based rendering architecture, it cannot render a scene in this manner; this is where HyperZ comes in. HyperZ enables various forms of compression of the data going to the Z-buffer and performs an early culling of polygons so that objects that aren’t visible to the viewer aren’t rendered.
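ATI hasn’t detailed how HyperZ works internally, so the following is only a generic sketch of the early culling idea: test a fragment’s depth before doing any expensive per-pixel work, and skip it if something closer is already there. All names here are hypothetical.

```python
def render(fragments, width, height):
    """Depth-test each fragment before 'shading' it.

    Fragments are (x, y, depth) tuples; a smaller depth is closer to the viewer.
    Returns (z_buffer, number of fragments actually shaded).
    """
    z_buffer = [[float("inf")] * width for _ in range(height)]
    shaded = 0
    for x, y, depth in fragments:
        if depth >= z_buffer[y][x]:
            continue  # hidden behind something already drawn: skip the work
        z_buffer[y][x] = depth
        shaded += 1   # stand-in for the texturing and blending work
    return z_buffer, shaded

# Three surfaces covering the same pixel, drawn in two different orders.
far_to_near = [(0, 0, 0.9), (0, 0, 0.5), (0, 0, 0.1)]
near_to_far = list(reversed(far_to_near))

_, shaded_far_first = render(far_to_near, 1, 1)
_, shaded_near_first = render(near_to_far, 1, 1)
```

Drawn far to near, all three surfaces get shaded; drawn near to far, only the first one does. The more overlapping surfaces a scene has, the more work an early rejection like this can save.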
According to ATI, when enabled, HyperZ technology can boost the effective memory bandwidth by 20% and increase the fill rate of the Radeon 256 to 1.5 gigatexels per second, up from the 1.2 gigatexels per second fill rate that the Radeon 256 would otherwise have.
As scenes become more complex (which ATI is counting on, hence their support for a hardware T&L engine – Charisma Engine), the benefits of the HyperZ technology’s improvements on reads/writes to the Z-buffer will become amplified.
Pixel Tapestry Architecture
The Radeon 256 is full of fancy names for features that you will see pop up on other products as well. One such feature set is ATI’s Pixel Tapestry Architecture, which is just ATI’s way of describing the 3D features their Radeon 256 architecture will offer.
If you recall, one of the big new features of NVIDIA’s GeForce was its support for Cubic Environment Mapping (CEM), which has made its way onto the Radeon 256. One of the criticisms of NVIDIA’s support for CEM was that the GeForce didn’t have the fill rate to truly take advantage of it.
In defense of NVIDIA, the GeForce didn’t need to have the fill rate to take advantage of CEM since, at the release of the product, there were no available titles that took advantage of it. The Radeon 256 should be available when the potential for games to take advantage of CEM is greater than when NVIDIA originally announced support for it last year.
The name of Matrox’s game was Bump Mapping when they announced and released their G400 series. The Expendable screenshot depicting how Environment Mapped Bump Mapping (EMBM) could be used to make water seem more like water and less like a floating texture quickly became one of the most impressive forms of eye candy ever produced by a company. Unfortunately, there was limited developer support for EMBM and in the cases where it was put to use, the performance hit was very noticeable.
ATI’s Pixel Tapestry Architecture includes support for all three forms of bump mapping: Dot Product 3 (used by the Permedia 3 Create!), EMBM (used by the G400), and the classic embossed bump mapping which most cards support. The difference between ATI’s EMBM solution and Matrox’s is that, instead of forcing developers to create a separate bump map texture in order to achieve the effect, the Radeon 256 applies the effect on a per pixel basis. The result is an easier time for developers and a more efficient way of performing EMBM than Matrox’s solution.
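Of the three techniques, Dot Product 3 is the easiest to show in a few lines. The sketch below assumes the usual convention of storing a surface normal in a texel’s RGB channels and taking its dot product with the light direction for each pixel; the function names are ours, and this is an illustration of the general technique rather than ATI’s implementation.

```python
def decode_normal(rgb):
    """Map an 8-bit RGB texel to a unit-length normal with components in [-1, 1]."""
    n = [c / 127.5 - 1.0 for c in rgb]
    length = sum(c * c for c in n) ** 0.5
    return [c / length for c in n]

def dot3_intensity(rgb, light_dir):
    """Per-pixel diffuse intensity: clamp(N dot L) with both vectors normalized."""
    normal = decode_normal(rgb)
    length = sum(c * c for c in light_dir) ** 0.5
    light = [c / length for c in light_dir]
    return max(0.0, sum(a * b for a, b in zip(normal, light)))

light = (0.0, 0.0, 1.0)                          # light shining straight at the surface
facing = dot3_intensity((128, 128, 255), light)  # texel whose normal points at the light
sideways = dot3_intensity((255, 128, 128), light)  # texel whose normal points sideways
```

The texel facing the light comes out almost fully lit and the sideways one almost dark, so a flat polygon picks up bumps and grooves purely from the normals stored in the texture.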
Another interesting feature of the Pixel Tapestry Architecture is its support for what is known as a Priority Buffer. Essentially, a priority buffer assigns each polygon or object a value depending on how far that particular object/polygon is from a light source. Using this value, the intensity of lighting effects on the polygons/objects in a scene can be adjusted so that things closer to a light source receive a harsher lighting effect while those further away receive a softer light.
This can be quite helpful when it comes to determining the resulting shadows from these light sources. What comes to mind when we talk about a priority buffer is 3dfx’s talk of soft shadows courtesy of their T-buffer.
One of the more revolutionary features included in the Pixel Tapestry Architecture is the support for 3D textures. Currently, textures are simply 2D entities used to “paint” surfaces with a certain look. For example, the walls in Quake III are simply flat surfaces covered by a texture to make the surface appear more realistic.
A 3D texture would actually be a chunk of that wall, and it would have depth to it. So theoretically, if you had a wall made from 3D textures, you could fire a rocket into the wall and blow a chunk of the wall out and have that piece actually feature some depth/texture to it. More realistic lighting is also something that 3D textures could be used for, and it’s something that id Software’s John Carmack has expressed interest in.
Applying 3D textures in a game is just like using 2D textures. The main problem in this situation is that they take up a considerable amount of memory because there is physically more information that needs to be stored.
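A quick calculation shows how fast the storage grows. The 256-texel dimensions and 32-bit texels are example values we chose, not figures from ATI.

```python
def texture_bytes(width, height, depth=1, bytes_per_texel=4):
    """Uncompressed storage for a texture; depth=1 gives an ordinary 2D texture."""
    return width * height * depth * bytes_per_texel

flat = texture_bytes(256, 256)         # 2D texture
volume = texture_bytes(256, 256, 256)  # 3D texture with the same footprint

print(f"2D: {flat / 2**10:.0f} KB, 3D: {volume / 2**20:.0f} MB")
```

A single uncompressed 256 x 256 x 256 texture comes to 64MB, or 256 times the size of its 2D counterpart.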
Realistically speaking, you won’t see many (if any) games take advantage of 3D textures until sometime next year. The feature is a part of DirectX 8, and by the time it is implemented in games, we will have a new set of graphics accelerators (more powerful ones) capable of handling the additional storage requirements set forth by 3D textures.
Luckily, you can still use texture compression to help lessen the burden of 3D textures, just as we currently do with 2D textures using DXTC, S3TC and FXT1 texture compression algorithms.
Finally, to top off the Pixel Tapestry feature set, the Radeon 256 will boast support for Full Scene Anti-Aliasing and the same Motion Blur/Depth of Field effects that 3dfx’s Voodoo5 currently supports courtesy of its T-buffer. According to ATI, they only tossed these features in because 3dfx was offering them, and it didn’t hurt to offer the same; however, ATI does not believe that they are extremely important features (especially FSAA). ATI’s primary concern is being able to display higher resolution images.
ATI has always been known for their superior video support, and in continuing that tradition, the Radeon 256 includes an on-die TMDS transmitter that allows for a DVI output to drive flat panels at resolutions up to 1600 x 1200. ATI believes that this year will be the year we start seeing the transition made to DVI flat panels, though we’ve heard that time and time again. Maybe this time we’ll actually see it happen.
The Radeon 256 also offers complete HDTV decoding, once again on-die, with full support for all ATSC resolutions.
The Radeon 256 also has a feature called adaptive de-interlacing, which attacks one of the major problems with video quality on a PC: the process of de-interlacing the video. There are two main algorithms used to de-interlace video, Bob de-interlacing and Add Field de-interlacing. Bob de-interlacing works best on pictures while Add Field is best for stationary text. Adaptive de-interlacing basically chooses the best of the two algorithms on a per pixel basis to achieve a sharper image.
[Pictures: adaptive de-interlacing comparison]
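ATI hasn’t said how the per pixel choice is made, so here is a simplified sketch of one plausible approach: fill each missing scanline by checking whether the neighbouring lines changed since the previous field of the same type. The threshold value and all names are hypothetical.

```python
def deinterlace_row(above, below, opposite, prev_above, prev_below, threshold=16):
    """Fill one missing scanline, choosing per pixel between Bob and Add Field.

    above/below: lines of the current field on either side of the missing line
    opposite:    the same line taken from the opposite field
    prev_above/prev_below: the neighbouring lines from the previous same-type field
    """
    out = []
    for a, b, opp, pa, pb in zip(above, below, opposite, prev_above, prev_below):
        moving = abs(a - pa) > threshold or abs(b - pb) > threshold
        if moving:
            out.append((a + b) // 2)  # Bob: interpolate within the current field
        else:
            out.append(opp)           # Add Field: keep the other field's detail
    return out

prev_above = [100, 100, 100, 100]
prev_below = [100, 100, 100, 100]
above = [100, 100, 200, 200]  # right half of the picture has changed
below = [100, 100, 200, 200]
opposite = [50, 60, 70, 80]

row = deinterlace_row(above, below, opposite, prev_above, prev_below)
```

The two static pixels keep the fine detail from the opposite field, while the two moving pixels are interpolated so that stale data from the other field doesn’t show up as combing.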
And as you would expect, the Radeon 256 is fully compatible with the Rage Theater chip, meaning that there is a strong possibility for something like an All in Wonder Radeon 256 product down the line.
The biggest downfall to the Radeon 256 will be its release time frame, which is scheduled for “this summer,” but more realistically, we can expect it in the late August timeframe. Considering that the 3dfx Voodoo5 and NVIDIA GeForce 2 GTS are both due out in the next few weeks, waiting until August may be painful for those users that are itching to upgrade now.
At the same time, we can’t help but remember ATI’s history with major product releases like this. The Rage 128 would have been an instant hit had it come out in the Christmas timeframe it was set to debut in, but because most of the shipments went to OEMs, the retail consumers didn’t see boards until late February the following year. Radeon 256 shipments to OEMs should occur during the summer, and if all goes according to plan, we’ll see retail availability of Radeon 256 based products in August…let’s hope that everything goes according to plan.
When the Radeon 256 does in fact debut (assuming that its debut occurs in August), it should be one of the fastest things available, but you also have to remember that less than a month later NVIDIA will be releasing their NV20, which may put the Radeon 256 to shame: it will be much more than a faster GeForce 2 GTS, since it will be NVIDIA’s “new” architecture.
Then we have the problem of driver support. The Radeon 256 should be able to use the Rage 128 drivers as a base, but ATI’s driver support for the Rage 128 series has been sub-par at best. If the Radeon is to succeed, ATI must get their act together and produce better drivers.
Regardless of what happens, ATI could have a major success on their hands, but it has to come out on time (even then, it may be too late) and it has to have solid driver support, two things that ATI has had trouble doing in the past. But with the amount of effort that ATI is putting into the development of this product, we may just see a strong competitor come August.