Inside the Xenos GPU

As previously mentioned, the 48 shaders will be able to run either vertex or pixel shader programs in any given clock cycle. To clarify, each block of 16 shader units is able to run a shader program thread. These shader units will function at a level slightly beyond DX9.0c, but in order to take advantage of the technology, ATI and Microsoft will have to customize the API.

In order to get data into the shader units, textures are read from main memory. The eDRAM of the system is unable to assist with texturing. There are 16 bilinear filtered texture samplers. These units are able to read up to 16 textures per clock cycle. The scheduler will need to take great care to organize threads so that optimal use is made of the texture units. Another consideration to take into account is anisotropic filtering. In order to perform filtering beyond bilinear levels, the texture will need to be run through the texture unit more than once (until the filtering is finished). If no filtering is required (i.e. if a shader program is simply reading stored data), the vertex fetch units can be used (either with a vertex or a pixel shader program).
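To make the cost of multi-pass filtering concrete, here is a rough sketch of how peak filtered-sample throughput would fall as each sample needs more trips through a bilinear texture unit. Only the count of 16 samplers comes from the text above; the pass counts per filtering mode are our own illustrative assumptions, not figures from ATI.

```python
# Sketch only: the pass counts below are assumptions for illustration,
# not numbers published by ATI or Microsoft.
BILINEAR_SAMPLERS = 16  # from the article: 16 bilinear filtered texture samplers

# Assumed number of bilinear passes through a texture unit per filtered sample.
ASSUMED_PASSES = {
    "bilinear": 1,
    "trilinear": 2,           # assumed: one bilinear pass per mip level
    "4x aniso trilinear": 8,  # assumed: 4 probes x 2 mip levels
}


def filtered_samples_per_clock(mode: str) -> float:
    """Peak filtered samples per clock if every sample needs N bilinear passes."""
    return BILINEAR_SAMPLERS / ASSUMED_PASSES[mode]


for mode in ASSUMED_PASSES:
    print(f"{mode:>20}: {filtered_samples_per_clock(mode):.0f} samples/clock")
```

Under these assumptions, anything beyond plain bilinear filtering cuts the peak sample rate by the number of passes required, which is why thread scheduling around the texture units matters so much.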

In the PC space, we are seeing shifts to more and more complex pixel shaders. Larger and larger textures are being used in order to supply data, and some predict that texture processing will eclipse color and z bandwidth in the not so distant future. We will have to see if the console and desktop space continue to diverge in this area.

One of the key aspects of performance for the Xbox 360 will be how well ATI manages threads on its GPU. With the shift to the unified shader architecture, it is even more imperative to make sure that everything is running at maximum efficiency. We don't have many details on ATI's ability to context switch between vertex and pixel shader programs in hardware, but suffice it to say that ATI cannot afford to have any difficulties in managing threads on any level. As making good use of current pixel shader technology already requires swapping out threads on shaders, we expect that ATI will do fairly well in this department. Thread management is likely one of the most difficult things ATI had to work out to make this hardware feasible.
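A toy model helps show why thread management matters so much on a unified design. The sketch below compares an idealized unified pool of 48 shader units against a hypothetical fixed split of 16 vertex and 32 pixel units. The fixed split, the workload numbers, and the assumption of perfect scheduling with no switching overhead are all ours; real hardware will fall short of the ideal to whatever degree the scheduler does.

```python
# Toy model only: the fixed 16/32 split, the workloads, and the assumption of
# zero-overhead scheduling are hypothetical, chosen purely for illustration.
TOTAL_UNITS = 48          # from the article: 48 unified shader units
FIXED_VERTEX_UNITS = 16   # hypothetical fixed-function split for comparison
FIXED_PIXEL_UNITS = 32


def fixed_time(vertex_work: float, pixel_work: float) -> float:
    """Clocks to finish when each kind of work is confined to its own units."""
    return max(vertex_work / FIXED_VERTEX_UNITS, pixel_work / FIXED_PIXEL_UNITS)


def unified_time(vertex_work: float, pixel_work: float) -> float:
    """Clocks to finish if any unit can take any work with perfect scheduling."""
    return (vertex_work + pixel_work) / TOTAL_UNITS


# Work is measured in abstract "unit-clocks".
for vertex_work, pixel_work in [(320, 640), (96, 864), (480, 480)]:
    print(
        f"vertex={vertex_work:4d} pixel={pixel_work:4d}  "
        f"fixed={fixed_time(vertex_work, pixel_work):6.2f}  "
        f"unified={unified_time(vertex_work, pixel_work):6.2f}"
    )
```

In this model the two designs tie only when the workload happens to match the fixed split; everywhere else the unified pool wins, but only if the scheduler keeps every unit busy.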

Those who paid close attention to the amount of eDRAM (10MB) will note that this is not enough memory to store the entire framebuffer for displays larger than standard television with 4xAA enabled. Apparently, ATI will store the front buffer in the UMA area, while the back buffer resides on the eDRAM. In order to manage large displays, the hardware will need to render the back buffer in parts. This indicates that they have implemented some sort of very large grained tiling system (with 2 to 4 tiles). Usually tile based renderers have many more tiles than this, but this is a special case.
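The arithmetic behind the tile count can be sketched quickly. The 10MB eDRAM figure is from the text above; the assumption that each AA sample stores 4 bytes of color and 4 bytes of depth/stencil is ours, as are the specific resolutions tried.

```python
import math

EDRAM_BYTES = 10 * 1024 * 1024  # 10MB of eDRAM, per the article


def back_buffer_bytes(width, height, aa_samples, bytes_color=4, bytes_depth=4):
    """Back buffer size, assuming color and depth are stored per AA sample.

    The 4-byte color and 4-byte depth/stencil defaults are assumptions.
    """
    return width * height * aa_samples * (bytes_color + bytes_depth)


def tiles_needed(width, height, aa_samples):
    """Minimum number of passes needed to fit the back buffer in eDRAM."""
    return math.ceil(back_buffer_bytes(width, height, aa_samples) / EDRAM_BYTES)


for label, width, height, aa in [
    ("640x480, 4xAA", 640, 480, 4),
    ("1280x720, 2xAA", 1280, 720, 2),
    ("1280x720, 4xAA", 1280, 720, 4),
]:
    size_mb = back_buffer_bytes(width, height, aa) / (1024 * 1024)
    print(f"{label:>15}: {size_mb:6.2f} MB -> {tiles_needed(width, height, aa)} tile(s)")
```

Under these assumptions a standard definition frame with 4xAA just fits in a single pass, while 720p needs two tiles at 2xAA and three at 4xAA, which lines up with the 2 to 4 tile range mentioned above.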

Performance of this hardware is a very difficult aspect to assess without testing the system. The potential is there for some nice gains over the current high end desktop part, but it is very difficult to know how easily software engineers will be able to functionally use the hardware before they fully understand it and have programmed for it for a while. Certainly, the learning curve won't be as steep as something like the PlayStation 2 was (DirectX is still the API), but knowing what works and what doesn't will take some time.

ATI's Modeling Engine

The adaptability of their hardware is something ATI is touting as well. Their Modeling Engine is really a name for a usage model ATI provides using their unified shaders. As each shader unit is more general purpose than current vertex and pixel shaders, ATI has built the hardware to easily allow the execution of general floating point math.

ATI's Modeling Engine concept is made practical through their vertex cache implementation. Data for general purpose floating point computations moves into the vertex cache in high volumes for processing. The implication here is that the vertex cache has enough storage space and bandwidth to accommodate all 48 shader units without starvation for an extended period of use. If the vertex cache were to be used solely for vertex data, it could be much less forgiving and still offer the same performance (considering common vertex processing loads in current and near term games). As we stated previously, pixel processing (for now) is going to be more resource intensive than vertex processing. The ability to fill up the shader units with data from the vertex cache (as opposed to the output of vertex shaders), together with the capability of the hardware to dump shader output to main memory, is what makes ATI's Modeling Engine possible.

But just pasting a name on general purpose floating point math execution doesn't make it useful. Programmers will have to take advantage of it, and ATI has offered a few ideas on different applications for which the Modeling Engine is suited. Global illumination is an intriguing suggestion, as is tone mapping. ATI also indicates that higher order surfaces could be operated on before tessellation, giving programmers the ability to more fluidly manipulate complex objects. It has even been suggested that physics processing could be done on this part. Of course, we can expect that Xbox 360 programmers will not implement physics engines on the Modeling Engine, but it could be interesting in future parts from ATI.

93 Comments


  • LanceVance - Friday, June 24, 2005 - link

    Excellent article. Definitely the most thorough, informative, well researched article on the PS3/Xbox360.

    And most importantly, unlike every other article on the subject, it's not strongly biased toward one camp while making comments of substance.
  • yacoub - Friday, June 24, 2005 - link

    I bet the PS3 debuts at a higher price.

    Also regarding statements made on the Conclusionary page:

    --"That being said, it won’t be impossible to get the same level of performance out of the PS3, it will just take more work. In fact, specialized hardware can be significantly faster than general purpose hardware at certain tasks, giving the PS3 the potential to outperform the Xbox 360 in CPU tasks. It has yet to be seen how much work is required to truly exploit that potential however, and it will definitely be a while before we can truly answer that question."--

    I find it funny that once again the PlayStation will be the harder system to code games for that take full advantage of its abilities. If trends mimic the past (as they often do) this will lead to a large amount of mediocre games by companies too small to afford the dev time necessary to take real advantage of the PS3's advantages or on deadlines too tight to spend the time doing more.
  • Furen - Friday, June 24, 2005 - link

    It does sound pretty low but (I'm guessing) it's more than enough, I dont think they would have separated the dies unless it didnt lead to a big performance penalty. also, I'm guessing that the 256MB/sec bandwidth between the eDRAM and its processing hardware is 256GB/sec? Microsoft was using that number to inflate their "system bandwidth" total.
  • Woodchuck2000 - Friday, June 24, 2005 - link

    And for that matter, 32Mb/s inter-die communications in the Xenos GPU seems low to me
    :p
    Good article though guys!
  • Furen - Friday, June 24, 2005 - link

    Is there any word on the media center extender capabilities on the xbox 360? I think Microsoft mentioned something about that but I'm not sure if that was oficial or not. Just hope they allow us to plug in some video capture device and use it as a dvr eventually.

    As much as I like sony's playstation, I find it quite boring on the technical side. It seems like they're just throwing everything they can into it but nothing is really that exciting, or useful. Come on, dual-HDMI. I dont see myself having two HDTVs in such close proximity to each other. Gigabit router? Seems like they're desperate to use the extra cpu muscle. I wonder how heavy ethernet traffic will affect cpu usage.
  • Woodchuck2000 - Friday, June 24, 2005 - link

    Surely porting between multi-core PC software and Xenon should be fairly trivial, not fairly Non-trivial as stated in the article...?
  • jotch - Friday, June 24, 2005 - link

    I stands for interlaced whilst the P stands for progressive scan. Check out the difference at http://en.wikipedia.org/wiki/720p

    or

    http://en.wikipedia.org/wiki/1080i

    This should resolve this issue.
  • AnnihilatorX - Friday, June 24, 2005 - link

    1080i = 720p doesn't it? 1080p is the one Xbox 360 doesn't support.

    These "i"s and "p"s are confusing me
  • sprockkets - Friday, June 24, 2005 - link

    How is 1080i on your tv's? On my 1 year old Mitsubishi native 1080i tv using dvi from the computer at 1080i is basically useless since the text is too small and the image looks like the refresh rate is below 60hz, whereas HDTV broadcasts look fine. Using the other mode of 720x480 looked great.

    Will HD output from a console be any better than a video card in a computer? Is it just my tv?

    Cmon, did you really think nVidia would release something far more advanced for a console than for a video card, or perhaps, more specifically, having it way outperform 6800 ultras in sli?

    If you need around a 400w power supply for even non sli setup, what kind of heat and power will these new consoles need anyhow???

    Of course I am more interested in how the PS3 will work with Linux more than games hahahahaha, since Sony officially mentioned it.
  • emmap - Sunday, December 4, 2005 - link

    And that's this article, Sony and M$ have missed:

    it's not the number of megapixels, shader pipelines, CPU / GPU bandwidth, multithreaded or single threaded code which do a great game. It's imagination put in the game, gameplay, artistic art quality, human feeling we get looking at the characters, fun and so on. It's not only mathematics and physics: we don't love a game because it has X millions polygons or run at Y fps, no it's totally different. Just see all the mame fans out there, you'll see that they don't care about the obsolete hardware the game they are playing on, they care about the most important thing about game: ENTERTAINMENT!
