Not Just A New Architecture, But New Features Too

So far we’ve talked about Graphics Core Next as a new architecture: how it works, and what it can do that Cayman and other VLIW architectures could not. But along with the new architecture, GCN will bring a number of new compute features to further flesh out AMD’s GPU computing capabilities and to cement the GPU’s position as the CPU’s partner rather than a subservient peripheral.

In terms of base features the biggest change will be that GCN will implement the underlying features necessary to support C++ and other advanced languages. As a result GCN will be adding support for pointers, virtual functions, exception handling, and even recursion. These underlying features mean that developers will not need to “step down” from higher-level languages to C to write code for the GPU, allowing them to more easily program for the GPU and CPU within the same application. For end-users the benefit won’t be immediate, but eventually it will allow more complex and useful programs to be GPU accelerated.
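
To put that in more concrete terms, the sketch below shows the sort of code these features make practical on the GPU: it leans on pointers, a virtual function call, and recursion. It’s written as plain, compilable C++ rather than against any particular GPU toolchain (the class hierarchy and tree traversal are invented purely for illustration), but it’s representative of what a GCN-class compiler would need the hardware to support.

    // A minimal sketch of the kind of C++ GCN is designed to run: pointers,
    // a virtual call, and recursion. Plain host C++ here; the types are
    // invented for illustration, not taken from any GPU toolkit.
    #include <cstdio>

    struct Shape {
        virtual float area() const = 0;    // virtual dispatch needs real
        virtual ~Shape() {}                // function pointers in the ISA
    };

    struct Circle : Shape {
        float r;
        explicit Circle(float radius) : r(radius) {}
        float area() const override { return 3.14159f * r * r; }
    };

    struct Node {            // pointer-chasing data structure
        Shape* shape;
        Node*  left;
        Node*  right;
    };

    // Recursion: every call needs a real stack frame, something earlier
    // VLIW parts could not offer to compute kernels.
    float totalArea(const Node* n) {
        if (!n) return 0.0f;
        return n->shape->area() + totalArea(n->left) + totalArea(n->right);
    }

    int main() {
        Circle a(1.0f), b(2.0f);
        Node leaf{&b, nullptr, nullptr};
        Node root{&a, &leaf, nullptr};
        std::printf("total area: %f\n", totalArea(&root));
        return 0;
    }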

Because the underlying feature set is evolving, the memory subsystem is also evolving to be able to service those features. The chief change here is that the hardware is being adapted to support an ISA that uses unified memory. This goes hand-in-hand with the earlier language features to allow programmers to write code to target both the CPU and the GPU, as programs (or rather compilers) can reference memory anywhere, without the need to explicitly copy memory from one device to the other before working on it. Now there’s still a significant performance impact when accessing off-GPU memory – particularly in the case of dGPUs where on-board memory is many times faster than system memory – so developers and compilers will still be copying data around to keep it close to the processor that’s going to use it, but this essentially becomes abstracted from developers.
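
As a rough illustration of the difference, consider the two models below. The helper functions (gpu_alloc, gpu_memcpy, launch_scale) are hypothetical stand-ins defined locally so the sketch compiles; they are not AMD’s (or anyone’s) API. The point is simply that in the unified-address model the “kernel” can dereference the host pointer directly, and the staging copies become the runtime’s problem rather than the programmer’s.

    // Conceptual sketch: explicit copies vs. a unified address space.
    // gpu_alloc/gpu_memcpy/launch_scale are hypothetical stand-ins defined
    // here so the example compiles; they are not a real GPU API.
    #include <cstddef>
    #include <cstdio>
    #include <cstring>
    #include <vector>

    static float gpu_buffer[1024];                 // pretend on-board memory

    void* gpu_alloc(std::size_t) { return gpu_buffer; }
    void  gpu_memcpy(void* dst, const void* src, std::size_t bytes) {
        std::memcpy(dst, src, bytes);
    }
    // Pretend kernel launch: scales n floats in place on the "GPU".
    void  launch_scale(float* data, std::size_t n, float s) {
        for (std::size_t i = 0; i < n; ++i) data[i] *= s;
    }

    int main() {
        std::vector<float> host(1024, 1.0f);
        const std::size_t bytes = host.size() * sizeof(float);

        // Pre-GCN model: stage data into device memory, run, copy back.
        float* dev = static_cast<float*>(gpu_alloc(bytes));
        gpu_memcpy(dev, host.data(), bytes);
        launch_scale(dev, host.size(), 2.0f);
        gpu_memcpy(host.data(), dev, bytes);

        // Unified-address model: the kernel dereferences the host pointer
        // directly; the runtime decides if/when pages actually migrate.
        launch_scale(host.data(), host.size(), 2.0f);

        std::printf("host[0] = %f\n", host[0]);    // 4.0 after both passes
        return 0;
    }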

Now what’s interesting is that the unified address space that will be used is the x86-64 address space. All memory addresses sent to a GCN GPU will be relative to the x86-64 address space, at which point the GPU will be responsible for translating them to local memory addresses. In fact GCN will even be incorporating an I/O Memory Management Unit (IOMMU) to provide this functionality; previously we’ve only seen IOMMUs used for sharing peripherals in a virtual machine environment. GCN will even be able to handle page faults half-way gracefully by properly stalling until the memory fetch completes. How this will work with the OS remains to be seen though, as the OS needs to be able to address the IOMMU; GCN may not be fully exploitable under Windows 7.
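
Conceptually, what the IOMMU has to do is the same kind of page-table walk a CPU’s MMU performs: take an x86-64 virtual address, look up the mapping, and either produce a local memory address or fault (and, on GCN, stall) until the page is available. The sketch below is a generic single-level illustration of that lookup, not AMD’s actual page-table format, and the mappings are made up.

    // Generic illustration of virtual-to-physical translation with a fault
    // path. A single-level table is used for brevity; a real x86-64 walk is
    // four levels deep, and this is not AMD's page-table format.
    #include <cstdint>
    #include <cstdio>
    #include <optional>
    #include <unordered_map>

    constexpr std::uint64_t kPageSize = 4096;

    struct Iommu {
        // virtual page number -> physical page number in local VRAM
        std::unordered_map<std::uint64_t, std::uint64_t> pageTable;

        std::optional<std::uint64_t> translate(std::uint64_t virtAddr) const {
            const std::uint64_t vpn    = virtAddr / kPageSize;
            const std::uint64_t offset = virtAddr % kPageSize;
            const auto it = pageTable.find(vpn);
            if (it == pageTable.end())
                return std::nullopt;   // page fault: stall the wavefront and
                                       // ask the OS/driver to map the page
            return it->second * kPageSize + offset;
        }
    };

    int main() {
        Iommu iommu;
        iommu.pageTable[0x12345] = 0x00042;        // one made-up mapping

        const std::uint64_t addrs[] = {0x12345010, 0xDEAD0000};
        for (std::uint64_t va : addrs) {
            if (auto pa = iommu.translate(va))
                std::printf("VA 0x%llx -> VRAM 0x%llx\n",
                            (unsigned long long)va, (unsigned long long)*pa);
            else
                std::printf("VA 0x%llx -> fault, stall until mapped\n",
                            (unsigned long long)va);
        }
        return 0;
    }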

Finally on the memory side, AMD is adding proper ECC support to supplement their existing EDC (Error Detection & Correction) functionality, which is used to ensure the integrity of transmissions across the GDDR5 memory bus. Both the on-chip SRAM and the off-chip VRAM can be ECC protected. For the SRAM this is a free operation, while for the VRAM there will be a performance overhead. We’re assuming that AMD will be using a virtual ECC scheme like NVIDIA’s, where ECC data is distributed across VRAM rather than stored on extra memory chips/controllers.
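
Assuming a conventional SECDED layout of 8 check bits per 64 data bits (AMD hasn’t detailed its scheme, so the numbers below are purely illustrative), carving the check bits out of ordinary VRAM costs roughly 11% of capacity and bandwidth before counting the extra memory transactions, which is where the performance overhead comes from. A quick back-of-the-envelope calculation:

    // Back-of-the-envelope cost of "virtual" ECC carved out of ordinary VRAM.
    // Assumes the common 8 check bits per 64 data bits (SECDED); AMD has not
    // published its layout, so treat these numbers as illustrative only.
    #include <cstdio>

    int main() {
        const double dataBits  = 64.0;
        const double checkBits = 8.0;
        const double overhead  = checkBits / (dataBits + checkBits); // ~11.1%

        const double vramGB      = 2.0;     // hypothetical 2GB card
        const double bandwidthGB = 160.0;   // hypothetical 160GB/s of GDDR5

        std::printf("usable capacity : %.2f GB of %.1f GB\n",
                    vramGB * (1.0 - overhead), vramGB);
        std::printf("effective b/w   : %.1f GB/s of %.1f GB/s (before the extra\n"
                    "                  transactions needed to fetch check bits)\n",
                    bandwidthGB * (1.0 - overhead), bandwidthGB);
        return 0;
    }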

Elsewhere we’ve already mentioned FP64 support. All GCN GPUs will support FP64 in some form, making FP64 support a standard feature across the entire lineup. The actual FP64 performance is configurable – the architecture supports 1/2 rate FP64, but 1/4 rate and 1/16 rate are also options. We expect AMD to take a page from NVIDIA here and configure lower-end consumer parts to use the slower rates since FP64 is not currently important for consumer uses.
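
For a sense of scale, peak FP64 throughput is just peak FP32 throughput multiplied by whichever rate a given part is configured for. The shader count and clock speed below are made-up placeholders (GCN products haven’t been announced), used only to show the arithmetic:

    // Peak-throughput arithmetic for the configurable FP64 rates. The shader
    // count and clock are hypothetical placeholders; only the scaling between
    // 1/2, 1/4 and 1/16 rate is the point.
    #include <cstdio>

    int main() {
        const double alus     = 2048;     // hypothetical stream processors
        const double clockGHz = 0.925;    // hypothetical engine clock
        const double fp32     = alus * 2 * clockGHz;  // 2 FLOPs/FMA -> GFLOPS

        std::printf("peak FP32          : %.0f GFLOPS\n", fp32);
        for (double r : {1.0 / 2.0, 1.0 / 4.0, 1.0 / 16.0})
            std::printf("peak FP64 at 1/%-2.0f : %.0f GFLOPS\n", 1.0 / r, fp32 * r);
        return 0;
    }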

Finally, for programmers some additional hardware changes have been made to improve debug support by allowing debugging tools to tap the GPU at additional points. The new ISA for GCN will already make debugging easier, but this will further that goal. As with other developer features this won’t directly impact end-users, but it should ultimately lead to better software sooner, as the features and tools available for debugging GPU programs have lagged well behind the established tools used for debugging CPU programs.

Comments

  • hammer256 - Friday, June 17, 2011 - link

    It's good to see AMD more committed to the GPGPU. I use GPGPU for neural network simulations, and currently the default choice has been Nvidia with CUDA. It would be nice to see some competition in this space.
    From the article it sounds like AMD knows to put a lot of emphasis on the software side of things for the developers. Hopefully they'll have a capable programming system that's as good as CUDA, maybe even better.
    Finally, given AMD's strategies in the past with medium-sized GPU chips and multi-GPU for the high end, hopefully they'll put sufficient emphasis on support for easier multi-GPU programming.

    Exciting times indeed.
  • krumme - Friday, June 17, 2011 - link

    What a pleasure to read articles like this. I would gladly pay for it, more directly, so to speak.

    Some animations or video, especially for us less tech-savvy, would be highly appreciated too.

    Competition for x86 is coming! :)
  • mczak - Friday, June 17, 2011 - link

    I wouldn't really call it radical; Cayman already had the same theoretical 1/2 rate for FP64 adds compared to FP32. Muls/FMAs though are now 1/2 too it seems (though it might not extend to all products), whereas it was 1/4 on Cayman. Still, a factor of two is not what I'd call a "radical" improvement.
  • ahmedz_1991 - Friday, June 17, 2011 - link

    I really appreciated the letters A M D. Since Athlon, one could feel that AMD is lagging behind Intel more and more, but now, with them being the first successful CPU/GPU combination (Llano is out there now), AMD can make their own way and APIs, even into OSes, just like what Intel and NVidia always do. This way I'm more than sure that we'll see titles (apps and games) with the unified AMD brand instead of those (meant to be played) or (smart solution) ones with some stupid stars for Core i3, 5 or 7.
  • frozentundra123456 - Wednesday, December 21, 2011 - link

    Well, technically Sandy Bridge is also a CPU/GPU combination, and I think I would call it successful. Granted, the graphics are not up to AMD levels, but their CPU performance is much better. And considering the debacle of Bulldozer and the architecture that was not optimized for current software, AMD will have to do a much better job of integrating their hardware with software than they have done so far.
  • haukionkannel - Friday, June 17, 2011 - link

    So maybe not big upgrades in graphics power, but an improvement in computing power. It's really good for GPGPU usage. It also makes it easier to run physics calculations on AMD GPUs.

    Hmm... It also means that more silicon space is needed for the same graphics power...

    Interesting to see how it all sums up.
  • Targon - Saturday, June 18, 2011 - link

    Right now, there has been a shortage of software that really pushes the graphics limits, mostly because you have the substandard Intel graphics out there that still have a significant market share. How many games out there really make you feel that a Radeon 6970 just isn't enough? The polygon count for objects (characters) in games has not been going up as much as more world detail has been going in.

    Now, when developers want to try aiming for 5 million polygon figures in games, THAT is where there will be a bigger demand for more graphics power, and with that level of detail, the CPU power needed to properly animate the objects needs to be higher. This is where all of this work with GPU compute comes in, to handle all the complexities of properly animating these super-high detailed objects.

    I will note that The Witcher 2 is one of the first games I have seen in a long time where CPU power needs to be higher than a Phenom II 945, and I am waiting for the AMD Bulldozer core CPUs (not APUs) to come out to see how big of an improvement it will make.
  • IlllI - Friday, June 17, 2011 - link

    can someone explain all this to me? lol this is all beyond my understanding
  • tipoo - Saturday, June 18, 2011 - link

    They are making GPU compute much more capable and possible, in a nutshell. This will greatly increase the processing speed of many tasks on computers.
  • khimera2000 - Sunday, June 19, 2011 - link

    AMD has a CPU and a GPU, but they're separate. They want this to change.

    They're combining the CPU and GPU so that they are more able to talk to each other, and do the tasks they're best at. This is done by remaking the way they build video cards.

    C++... great for the CPU, not so great for the GPU... they want to change this.

    Out of order operations suck on the GPU. They want to change this, so it can hammer through more work faster.

    They're also throwing in a bunch of tools to help tell developers where they're messing up in this regard.

    Fusion APUs will have a nice trick... they will be able to talk to each other without needing to send information back to memory. Imagine passing letters but having to use FedEx; this would be like a move to passing letters in class (no FedEx). It's quicker :) and your mail isn't delayed.

    The APU will talk over PCI-E... I'm wondering how that will work too 0.o
