AMD's Graphics Core Next Preview: AMD's New GPU, Architected For Compute

Name: AMD's Graphics Core Next Preview: AMD's New GPU, Architected For Compute
Item: AMD's Graphics Core Next Preview: AMD's New GPU, Architected For Compute
Author: Ryan Smith

by Ryan Smith on December 21, 2011 9:38 PM EST

Posted in
GPUs
AMD
Radeon
GCN

83 Comments | Add A Comment

83 Comments

Prelude: The History of VLIW & Graphics

Before we get into the nuts & bolts of Graphics Core Next, perhaps it’s best to start at the bottom, and then work our way up.

The fundamental unit of AMD’s previous designs has been the Streaming Processor, previously known as the SPU. In every modern AMD design other than Cayman (6900), this is a Very Long Instruction Word 5 (VLIW5) design; Cayman reduced this to VLIW4. As implied by the architectural name, each SP would in turn have 5 or 4 fundamental math units – what AMD now calls Radeon cores – which executed the individual instructions in parallel over as many clocks as necessary. Radeon cores were coupled with registers, a branch unit, and a special function (transcendental) unit as necessary to complete the SP.

VLIW designs are designed to excel at executing many operations from the same task in parallel by breaking it up into smaller groupings called wavefronts. In AMD’s case a wavefront is a group of 64 pixels/values and the list of instructions to be executed against them. Ideally, in a wavefront a group of 4 or 5 instructions will come down the pipe and be completely non-interdependent, allowing every Radeon core to be fed. When dependent instructions would come down however, fewer instructions could be scheduled at once, and in the worst case only a single instruction could be scheduled. VLIW designs will never achieve perfect efficiency in this regard, but the farther real world utilization is from ideal efficiency, the weaker the benefits of VLIW.

The use of VLIW can be traced back to the first AMD DX9 GPU, R300 (Radeon 9700 series). If you recall our Cayman launch article, we mentioned that AMD initially used a VLIW design in those early parts because it allowed them to process a 4 component dot product (e.g. w, x, y, z) and a scalar component (e.g. lighting) at the same time, which was by far the most common graphics operation. Even when moving to unified shaders in DX10 with R600 (Radeon HD 2900), AMD still kept the VLIW5 design because the gaming market was still DX9 and using those kinds of operations. But as new games and GPGPU programs have come out efficiency has dropped over time, and based on AMD’s own internal research at the time of the Cayman launch the average shader program was utilizing only 3.4 out of 5 Radeon cores. Shrinking from VLIW5 to VLIW4 fights this some, but utilization will always be a concern.

Finally, it’s worth noting what’s in charge of doing all of the scheduling. In the CPU world we throw things at the CPU and let it schedule actions as necessary – it can even go out-of-order (OoO) within a thread if it will be worth it. With VLIW, scheduling is the domain of the compiler. The compiler gets the advantage of knowing about the full program ahead of time and can intelligently schedule some things well in advance, but at the same time it’s blind to other conditions where the outcome is unknown until the program is run and data is provided. Because of this the schedule is said to be static – it’s set at the time of compilation and cannot be changed in-flight.

So why in an article about AMD Graphics Core Next are we going over the quick history of AMD’s previous designs? Without understanding the previous designs, we can’t understand what is new about what AMD is doing, or more importantly why they’re doing it.

AMD's Graphics Core Next Preview AMD Graphics Core Next: Out With VLIW, In With SIMD

PRINT THIS ARTICLE

Post Your Comment
Please log in or sign up to comment.

Comments Locked

83 Comments

View All Comments

Targon - Sunday, June 19, 2011 - link
AMD wants to put an end to the GPU in the chipset, but no one expects dedicated CPU and GPU to go away. Now, the code that would take advantage of the APU would probably work with a full AMD CPU/AMD GPU combination, so the software side of things would not need a lot of change to support both configurations.
khimera2000 - Sunday, June 19, 2011 - link
Agree, dedicated cards will not go away, however intergrated cards like the past will.

I think we see Eye to Eye on this. AMD wants to take full advantage of all its hardware, It looks like the way there trying to do it is by combining the CPU and Intergrated GPU into one package, after which they want to set it up so infromation that goes into that package dosent have to leave to be processed, like sending it out to ram from the CPU only to be read by the GPU.

Still want to see how this will work across PCI-E. I can already see future reviews and comparisons on how effetive GPU acceleration is on there intergrated aproach VS discreet cards. AND Buying those discreet cards :D

By the time these parts comes out my desktop will be right in the middle of its upgrade cycle :D
Targon - Monday, June 20, 2011 - link
AMD needs to push for the HTX slot again for discrete video, where there is a direct HyperTransport link between the CPU and whatever is plugged into that slot. PCI-Express is decent, but HTX would and should blow the doors off PCI-Express.
rnssr71 - Friday, June 17, 2011 - link
i wish this coming next year especially in Trinity but at lest they are heading in the right direction:) also, to those wondering about improvements in gaming ability, look what amd did with cayman vs cypress- the improved efficiency and noticeably improved performance on the same manufacturing. http://www.anandtech.com/bench/Product/294?vs=331
GCN this is going to improve efficiency even farther and they are cutting the transistor size roughly in half.
nlr_2000 - Saturday, June 18, 2011 - link
"Unfortunately, those of you expecting any additional graphics information will have to sight tight for the time being." sight = sit
EnerJi - Saturday, June 18, 2011 - link
I wonder if this architecture would be a particularly good fit for a next-generation Xbox (due around 2013)? Any thoughts on this?
GaMEChld - Saturday, June 18, 2011 - link
2013? I heard 2015, unless they recently changed dates to counter Nintendo. Anyways, I'm not so sure what benefits a console will realize from this, since full blown PC's barely get to utilize much of the technology we currently have access to. Multi-threading, 64-bit support, advanced cpu instructions are all available yet barely utilized features.

Also, consoles are designed to be cost effective and relatively cheap, so usually modified older generation architecture is used. For example, the new Wii uses Radeon 4700 class graphics, which sounds old but is roughly twice as powerful as the X360 (Radeon X1900) or PS3 (GF7000) graphics.
DanNeely - Saturday, June 18, 2011 - link
That's true of the Wii because Nintendo doesn't subsidize the console, but MS and Sony have gone after higher end GPUs for their last launches. The XBox 360 launched using a GPU similar to that of the ATI 1900, a bare month and a half after the card hit the market.The PS3 used a GF7800 derivative and launched roughly 1 year after the GF7800 did. The GF7900 was nVidias top of the line card at the time, but it was only a marginal improvement over the 7800.
swaaye - Saturday, June 18, 2011 - link
PS3 actually launched about when G80 came out, which obviously made RSX look awfully retro when you saw 7900GTX SLI being beaten in reviews by a single board. ;) But G80 surely was never an option for a console due to size and power.

Xenos has less than half of the pixel fillrate of X1900. X1900 also has 48 pixel shader units + 8 vertex shaders so it might have an advantage over Xenos 48 unified units, especially when clock speed and the access to a large RAM pool over a 256-bit bus are taken into account.
GaMEChld - Sunday, June 19, 2011 - link
But we must also bear in mind that X360 and PS3 may have chosen high on the scale because of the concurrent shift to 720p/1080p resolution instead of the old 480p standard. At this point in time, the 1080p resolution is standardized, so greatly escalating GPU horsepower will show diminishing gains, since people aren't really going to be gaming on higher resolutions than the new standard tv resolution.

What I mean is, if a Radeon 5000 Series could maximize all graphics quality at 1080p, why would a console manufacturer bother with more power?

For example, you wouldn't buy a GTX590 or Radeon 6990 just to game on a 1080p monitor, would you?

The only exception I can think of for this TV resolution argument is 3DTV gaming, in which case I am not well versed in the added GPU overhead required to render a 3D game.

AMD's Graphics Core Next Preview: AMD's New GPU, Architected For Compute

Post Your Comment

83 Comments

View All Comments

Targon - Sunday, June 19, 2011 - link

khimera2000 - Sunday, June 19, 2011 - link

Targon - Monday, June 20, 2011 - link

rnssr71 - Friday, June 17, 2011 - link

nlr_2000 - Saturday, June 18, 2011 - link

EnerJi - Saturday, June 18, 2011 - link

GaMEChld - Saturday, June 18, 2011 - link

DanNeely - Saturday, June 18, 2011 - link

swaaye - Saturday, June 18, 2011 - link

GaMEChld - Sunday, June 19, 2011 - link

Log in

Don't have an account? Sign up now