Limitations to the GeForce
It has been about four months since the last major revolution came to the video card market, a revolution brought about by the release of NVIDIA's new video card processor: the GeForce. Referred to by NVIDIA as a GPU, or Graphics Processing Unit, the GeForce marked a significant improvement in 3D gaming and raw processing power. With a fill rate of 480 Million Pixels per Second and around 23 million transistors, the GeForce definitely offered more of both than its predecessor, the TNT2. In addition, advanced features never seen before on video cards, such as hardware transform and lighting (T&L), gave an even more optimistic view of the future of 3D gaming. However, for all the bang that came with NVIDIA's new powerhouse, there remain speed limitations to the GeForce product line.
In essence, two factors determine the speed of a video card: the fill rate and the memory bandwidth. The fill rate is the number of pixels the video processor can compute per second. The theoretical fill rate is directly proportional to the core speed of a given card. For example, the standard core speed of the GeForce is 120 MHz and the GeForce GPU can process 4 pixels per clock cycle, so the fill rate is 4 pixels per clock times 120 million clocks per second, or 480 Million Pixels per Second. Any increase in core clock speed produces a proportional increase in fill rate: at a 160 MHz core clock, the GeForce would be able to process 640 Million Pixels per Second.
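The arithmetic above is simple enough to sketch in a few lines of Python. The function name and its parameter names are made up for this illustration; only the 4 pixels per clock and the 120 MHz and 160 MHz core speeds come from the text.

```python
def theoretical_fill_rate(core_mhz, pixels_per_clock=4):
    """Return the theoretical fill rate in millions of pixels per second.

    core_mhz is the core clock in MHz; pixels_per_clock defaults to the
    4 pixels per clock cycle quoted for the GeForce.
    """
    return core_mhz * pixels_per_clock


# Stock core speed and the overclocked example from the text.
for core_mhz in (120, 160):
    rate = theoretical_fill_rate(core_mhz)
    print(f"{core_mhz} MHz core: {rate} million pixels per second")
```

Running it prints 480 for the 120 MHz core and 640 for the 160 MHz core, matching the figures quoted above.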
The second factor that affects how fast a video card can function is the memory bandwidth. While the GeForce GPU may be able to process 480 Million Pixels per Second, the video card needs a place to collect and store this data before it gets rendered on the screen. This temporary storage area is provided by the card's RAM. The problem is not keeping the data in the RAM but rather getting the data there. Data from the GeForce GPU can only get to the monitor by passing through the RAM first. Ideally, data transfer between the processor and the memory system would be instantaneous, leaving no potential bottleneck in the system. This, however, is not the case. The data from the processor must pass to the memory via the memory bus, whose bandwidth is governed by the memory clock speed. More often than not, this bandwidth is too small to keep up with the rate at which the core sends out data. This is especially the case when running in 32-bit color because, in this mode, twice as much data has to be passed from processor to RAM as in 16-bit color. The result is a real-world rate that falls short of the theoretical fill rate described above; this lower rate is the effective fill rate. The effective fill rate is limited by the peak available bandwidth of the memory bus.
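To make the bottleneck concrete, here is a rough sketch that caps the fill rate at whatever the memory bus can sustain. It is a deliberately simplified model, not a description of how the GeForce actually schedules memory traffic. The function names are made up for this illustration. The 150 MHz memory clock, 128-bit bus, and two transfers per clock are assumed figures for a DDR card and do not come from the test setup. The bytes of memory traffic per pixel (8 in 16-bit color, 16 in 32-bit color) are purely hypothetical numbers chosen to show the doubling described above.

```python
def peak_bandwidth_mb_s(memory_mhz, bus_width_bits, transfers_per_clock=1):
    """Peak memory bandwidth in MB/s (millions of bytes per second)."""
    return memory_mhz * transfers_per_clock * bus_width_bits / 8


def effective_fill_rate(core_mhz, pixels_per_clock, bandwidth_mb_s, bytes_per_pixel):
    """Fill rate in millions of pixels per second, capped by memory bandwidth."""
    theoretical = core_mhz * pixels_per_clock
    memory_limited = bandwidth_mb_s / bytes_per_pixel
    return min(theoretical, memory_limited)


# Assumed memory figures for illustration only (see the paragraph above).
bandwidth = peak_bandwidth_mb_s(memory_mhz=150, bus_width_bits=128, transfers_per_clock=2)

# Hypothetical memory traffic per pixel; 32-bit color doubles the 16-bit figure.
for label, bytes_per_pixel in (("16-bit", 8), ("32-bit", 16)):
    rate = effective_fill_rate(120, 4, bandwidth, bytes_per_pixel)
    print(f"{label} color: {rate:.0f} million pixels per second")
```

Under these assumptions the card reaches its full 480 million pixels per second in 16-bit color but only 300 million in 32-bit color, where the memory bus rather than the core sets the limit.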
In the case of the GeForce, which of these two factors plays the larger role in overall speed? To answer this question, we independently raised both the memory and core speeds of a typical DDR GeForce card. The Leadtek WinFast GeForce 256 DDR Rev B was used as the test card and results were recorded on an AMD Athlon system running at 750 MHz (see The Test section for more details). Quake III Arena was then run to determine which factor proved to be the bottleneck. The next section begins with the graphs and continues into an explanation of the trends. The results may surprise you.