E3 2005 - Day 2: More Details Emerge on Console GPUs
by Anand Lal Shimpi on May 19, 2005 1:24 PM EST
More Detail on the Xbox 360 GPU
ATI has been working on the Xbox 360 GPU for approximately two years, and it has been developed independently of any PC GPU. So despite what you may have heard elsewhere, the Xbox 360 GPU is not based on ATI's R5xx architecture.
Unlike any of their current-gen desktop GPUs, the 360 GPU supports FP32 from start to finish (as opposed to the current FP24 spec that ATI has implemented). Full FP32 support puts this aspect of the 360 GPU on par with NVIDIA's RSX.
ATI was very light on details of their pipeline implementation on the 360's GPU, but we were able to get some more clarification on some items. Each of the 48 shader pipelines is able to process two shader operations per cycle (one scalar and one vector), offering a total of 96 shader ops per cycle across the entire array. Remember that because the GPU implements a Unified Shader Architecture, each of these pipelines features execution units that can operate on either pixel or vertex shader instructions.
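To make the arithmetic concrete, here is a minimal Python sketch of the per-cycle throughput figure. The pipeline count and the two ops per pipeline come from ATI's description above; the function names and the example vertex/pixel split are our own illustration, since ATI has not said how work is actually scheduled across the array.

```python
# Figures quoted by ATI for the Xbox 360 GPU.
SHADER_PIPELINES = 48
OPS_PER_PIPELINE = 2  # one scalar op + one vector op per cycle


def shader_ops_per_cycle(pipelines=SHADER_PIPELINES, ops=OPS_PER_PIPELINE):
    """Peak shader operations issued per clock across the whole array."""
    return pipelines * ops


def split_ops(vertex_pipelines, pipelines=SHADER_PIPELINES, ops=OPS_PER_PIPELINE):
    """Hypothetical split: pipelines not doing vertex work do pixel work.

    The split is an assumption for illustration only; in a unified design
    the total stays constant no matter how the pipelines are divided.
    """
    if not 0 <= vertex_pipelines <= pipelines:
        raise ValueError("vertex_pipelines must be between 0 and the pipeline count")
    return {
        "vertex": vertex_pipelines * ops,
        "pixel": (pipelines - vertex_pipelines) * ops,
    }


if __name__ == "__main__":
    print(shader_ops_per_cycle())  # 96
    print(split_ops(16))           # {'vertex': 32, 'pixel': 64}
```

Whatever the split, the two numbers always sum to 96, which is the practical point of a unified array: no pipeline sits idle just because a scene is vertex-heavy or pixel-heavy.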
The chips in both consoles are built on a 90nm process, and ATI's GPU is no exception: it is manufactured on a 90nm process at TSMC. ATI isn't talking transistor counts just yet, but given that the chip has a full 10MB of DRAM on it, we'd expect the chip to be fairly large.
One thing that ATI did shed some light on is that the Xbox 360 GPU is actually a multi-die design, which ATI describes as a parent-daughter die relationship. Because the GPU's die is so big, ATI had to split it into two separate dies on the same package - connected by a "very wide" bus operating at 2GHz.
The daughter die is where the 10MB of embedded DRAM resides, but there is also a great deal of logic on the daughter die alongside the memory. The daughter die features 192 floating point units that are responsible for a lot of the work in sampling for AA among other things.
Remember the 256GB/s bandwidth figure from earlier? It turns out that this is not the bandwidth between the parent and daughter die, but rather the bandwidth available to the array of 192 floating point units on the daughter die itself. Clever use of words, no?
Because of the extremely large amount of bandwidth available both between the parent and daughter die as well as between the embedded DRAM and its FPUs, multi-sample AA is essentially free at 720p and 1080p in the Xbox 360. If you're wondering why Microsoft is insisting that all games will have AA enabled, this is why.
ATI did clarify that although Microsoft isn't targeting 1080p (1920 x 1080) as a resolution for games, their GPU would be able to handle the resolution with 4X AA enabled at no performance penalty.
ATI has also implemented a number of intelligent algorithms on the daughter die to handle situations where you need more memory than the 10MB of DRAM on-die. The daughter die has the ability to split the frame into two sections if the frame itself can't fit into the embedded memory. A z-pass is done to determine the location of all of the pixels of the screen and the daughter die then fetches only what is going to be a part of the scene that is being drawn at that particular time.
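To get a feel for when the frame stops fitting in the embedded DRAM, here is a rough Python sketch that estimates frame buffer size and the number of sections needed. Only the 10MB capacity and the resolutions come from the article. The 8 bytes per sample (32-bit color plus 32-bit depth), the reading of 10MB as 10 x 1024 x 1024 bytes, and the function names are our assumptions, so the section counts it produces follow from those assumptions and are not figures ATI has given.

```python
import math

# 10MB of embedded DRAM per the article; assumed here to mean 10 * 1024 * 1024 bytes.
EDRAM_BYTES = 10 * 1024 * 1024

# Assumption: each AA sample stores 4 bytes of color and 4 bytes of depth.
ASSUMED_BYTES_PER_SAMPLE = 8


def framebuffer_bytes(width, height, aa_samples,
                      bytes_per_sample=ASSUMED_BYTES_PER_SAMPLE):
    """Estimated size of a multisampled color + depth buffer in bytes."""
    return width * height * aa_samples * bytes_per_sample


def sections_needed(width, height, aa_samples,
                    bytes_per_sample=ASSUMED_BYTES_PER_SAMPLE,
                    edram_bytes=EDRAM_BYTES):
    """How many sections the frame must be split into to fit in the eDRAM."""
    size = framebuffer_bytes(width, height, aa_samples, bytes_per_sample)
    return math.ceil(size / edram_bytes)


if __name__ == "__main__":
    for label, (w, h) in {"720p": (1280, 720), "1080p": (1920, 1080)}.items():
        for samples in (1, 2, 4):
            size_mib = framebuffer_bytes(w, h, samples) / (1024 * 1024)
            print(f"{label} {samples}X: {size_mib:.1f} MiB, "
                  f"{sections_needed(w, h, samples)} section(s)")
```

Under these assumptions a 720p frame without AA fits in one pass, while the same frame with AA enabled does not, which is presumably why the ability to split the frame matters so much to the "free AA" claim.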
On the physical side, unlike ATI's Flipper GPU in the GameCube, the 360 GPU does not use 1T-SRAM for its on-die memory. The memory on-die is actually DRAM. With regular DRAM on-die, latencies are higher than with SRAM or 1T-SRAM, but costs should be kept to a minimum because DRAM takes up less die area than either of those technologies.
Remember that in addition to functioning as a GPU, ATI's chip must also function as a memory controller for the 3-core PPC CPU in the Xbox 360. The memory controller services both the GPU's and the CPU's needs, and as we mentioned before, the controller is 256 bits wide and interfaces to 512MB of unified GDDR3 memory running at 700MHz. The memory controller resides on the parent die.