The Graphics "Pipeline"

The term "pipeline" is thrown around more in the graphics world than in any other business, yet there is quite a bit of misunderstanding (and misuse) involving the word. Graphics manufacturers used to indicate how robust their architectures were by counting the number of "rendering pipelines." The advent of the GPU brought about the use of the word "engine" to refer to anything that did any processing in the GPU - for example, the vertex engines would feed the pixel rendering pipelines. But to properly understand what's going on in a GPU, we have to take a step back and begin characterizing GPUs and their architecture not in marketing terms, but in actual microarchitectural terms. With that in mind, let's have a quick look at what a pipeline actually is.

Regardless of whether we're talking about a GPU or a CPU, the word pipeline means the same thing. The analogy that's often used is an assembly line in a car manufacturing plant: instead of having everyone in the plant work on putting a single car together at the same time, the process is split into multiple stages. The frame is welded together at the welding station and then sent down the assembly line to the next stage, where the doors may be put on, then on to another stage where the engine is dropped in, and so on. The benefit of the assembly line is that you don't have to wait for one car to be finished before the next can begin assembly; as soon as the first car leaves the first stage, the next car begins the building process. This concept of an assembly line is identical to the concept of a pipeline: instead of waiting until one operation is complete before beginning the next one, a pipelined processor splits its work into multiple stages in order to have multiple instructions "in flight" at the same time.
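To put rough numbers on the assembly-line idea, here is a minimal sketch in Python. It assumes an idealized pipeline in which every stage takes exactly one cycle and nothing ever stalls; the function names are made up for this illustration and do not come from any real tool.

```python
def unpipelined_cycles(items: int, stages: int) -> int:
    """Each item passes through every stage before the next item may start."""
    return items * stages


def pipelined_cycles(items: int, stages: int) -> int:
    """The first item needs `stages` cycles to travel the whole line; after
    that, one item finishes per cycle (idealized: no stalls, equal stages)."""
    if items <= 0:
        return 0
    return stages + (items - 1)


if __name__ == "__main__":
    stages = 5  # assumed number of stations on the line
    for items in (1, 5, 100):
        slow = unpipelined_cycles(items, stages)
        fast = pipelined_cycles(items, stages)
        print(f"{items:>3} items: {slow:>3} cycles unpipelined, "
              f"{fast:>3} cycles pipelined ({slow / fast:.2f}x)")
```

A single item gains nothing from pipelining, since it still has to visit every stage. The benefit only shows up once there is a steady stream of work to keep the stages busy.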

The most basic pipeline in the CPU world is the classic five-stage integer pipeline, which consists of the following stages:

Instruction Fetch - Grab instructions from the program to be executed
Instruction Decode - Figure out what the instruction wants the processor to do
Fetch Data/Operands - Grab any data needed for the instruction out of memory (e.g. A = B + C, find the values of B and C so we can add them)
Execution - Carry out the actual operation (add B and C together)
Write Back - Write the result back to a register or memory (store the final value of A somewhere)
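The five stages above can be sketched as a cycle-by-cycle occupancy table. This is a toy model rather than a description of any real processor: it assumes one cycle per stage and no stalls or dependencies between instructions, and the short stage labels (IF, ID, OF, EX, WB) are abbreviations chosen for this example.

```python
STAGES = ["IF", "ID", "OF", "EX", "WB"]  # labels assumed for this sketch


def pipeline_timeline(num_instructions, stages=STAGES):
    """Return one row per clock cycle. row[s] holds the index of the
    instruction occupying stage s in that cycle, or None if it is idle."""
    if num_instructions <= 0:
        return []
    depth = len(stages)
    total_cycles = depth + num_instructions - 1
    timeline = []
    for cycle in range(total_cycles):
        row = []
        for s in range(depth):
            instr = cycle - s  # instruction i occupies stage s in cycle i + s
            row.append(instr if 0 <= instr < num_instructions else None)
        timeline.append(row)
    return timeline


def cell_label(instr):
    """Label for one table cell: 'I0', 'I1', ... or '--' for an idle stage."""
    return "--" if instr is None else f"I{instr}"


def format_timeline(timeline, stages=STAGES):
    """Render the timeline as a plain-text table, one line per cycle."""
    lines = ["cycle  " + "  ".join(f"{name:>3}" for name in stages)]
    for cycle, row in enumerate(timeline, start=1):
        cells = "  ".join(f"{cell_label(i):>3}" for i in row)
        lines.append(f"{cycle:>5}  {cells}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_timeline(pipeline_timeline(3)))
```

With three instructions, the fifth cycle has instruction 0 in Write Back while instruction 1 is executing and instruction 2 is fetching its operands, which is exactly the "in flight" overlap described earlier.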

Obviously, today's microprocessors are significantly more complicated than the basic five-stage pipe we just described, but the foundation for all pipelined microprocessors is the same - including GPUs.

The graphics pipeline isn't much different from the simple five-stage integer pipe we just described; however, it contains an obscene number of stages. If you thought the Pentium 4's 20-stage integer pipeline was long, hearing that a graphics pipeline can consist of hundreds or even thousands of stages may seem a bit strange. Luckily for GPUs, they have enough memory bandwidth and a pipeline-friendly enough set of data to keep this excruciatingly long pipeline from being a performance-limiting characteristic.
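One way to see why a very deep pipeline need not hurt is to look at how much of the time it is doing useful work once it has filled. The sketch below is a back-of-the-envelope model with several assumptions: one cycle per stage, fully independent work items, no stalls, and no modeling of memory bandwidth at all. The depth of 1000 and the 1024x768 frame are illustrative figures picked for this example, not measurements of any particular GPU.

```python
def utilization(work_items: int, depth: int) -> float:
    """Fraction of cycles in which the pipeline completes a work item,
    given `work_items` independent items streamed through `depth` stages.
    Idealized: one cycle per stage, no stalls, no dependencies."""
    if work_items <= 0:
        return 0.0
    total_cycles = depth + work_items - 1  # fill the pipe once, then 1 per cycle
    return work_items / total_cycles


if __name__ == "__main__":
    pixels_per_frame = 1024 * 768  # assumed frame size for illustration
    cases = [
        ("20-stage pipe, 10 items", 10, 20),
        ("1000-stage pipe, 10 items", 10, 1000),
        ("1000-stage pipe, one frame of pixels", pixels_per_frame, 1000),
    ]
    for label, items, depth in cases:
        print(f"{label}: {utilization(items, depth):.2%} utilized")
```

Under these assumptions, the cost of filling the pipeline is paid once and then spread over every item that follows. A handful of items leaves a thousand-stage pipeline almost entirely idle, while a full frame of independent pixels keeps it nearly saturated.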
