The Kyro II still boasts what STMicroelectronics calls Internal True Color, which promises to make 16-bit color gameplay look better. Because texturing and z-buffering are performed on-chip, they can be done in full 32-bit color without the large performance penalty that traditional architectures incur. Further, the internal 32-bit rendering occurs regardless of the frame buffer's color depth. The penalty that most architectures pay for 32-bit rendering comes from memory bandwidth constraints, which in turn come from constant z-buffer accesses and unnecessary overdraw. In an ideal world, with infinite memory bandwidth, traditional 3D architectures would not slow down at all when rendering in 32-bit color.
So why not render in 32-bit mode all the time? If it were that simple, the KYRO probably would always operate in 32-bit mode. The fact remains that a 32-bit frame buffer and textures still take up twice as much memory as 16-bit ones. While the KYRO is able to render each tile on-chip, it is still necessary to put the completed tile in the frame buffer and also to read textures from memory, so the memory footprint (not bandwidth) requirements for 32-bit color are still double what they are for 16-bit, regardless of the rendering architecture in use.
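To put rough numbers on the footprint argument, here is a minimal sketch of the arithmetic. The 1024x768 resolution is purely an illustrative assumption, and the helper name is our own; only a single color buffer is counted, with no textures, z-buffer, or extra buffers.

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Bytes needed to hold one color buffer at the given depth."""
    return width * height * bits_per_pixel // 8

# Hypothetical resolution chosen only for illustration.
WIDTH, HEIGHT = 1024, 768

fb16 = framebuffer_bytes(WIDTH, HEIGHT, 16)
fb32 = framebuffer_bytes(WIDTH, HEIGHT, 32)

print(f"16-bit: {fb16 / 2**20:.1f} MB")
print(f"32-bit: {fb32 / 2**20:.1f} MB")
print(f"ratio:  {fb32 / fb16:.0f}x")
```

Whatever the resolution, the ratio stays at 2, which is why the footprint cost applies to a tile renderer just as it does to a traditional one.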
The obvious question, then, is why not use 16-bit frame buffers with 32-bit internal rendering all the time? As the screenshots below show, full 32-bit still looks better, since the 16-bit image is dithered down from the internal 32-bit result. This is especially apparent where fine color gradients appear on screen. Note that the Kyro's 16-bit image shows significantly less dithering than most cards' 16-bit rendering.
16-bit shown above, 32-bit is below
The images above are JPEG compressed and thus have some quality loss compared to the originals.
Click here to download a zip file (300 KB) with the original images in BMP format.
For the full effect, the images should be viewed full screen.
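Why do fine gradients suffer most in a 16-bit frame buffer? A small sketch can illustrate it, assuming the common 5-6-5 layout for 16-bit color (the article does not state which layout the Kyro II uses) and plain truncation with no dithering. The function name is our own.

```python
def to_rgb565(r, g, b):
    """Truncate 8-bit-per-channel color to 5/6/5 bits (no dithering)."""
    return (r >> 3, g >> 2, b >> 3)

# A smooth 256-step gray ramp, as it might exist after 32-bit internal rendering.
ramp = [(v, v, v) for v in range(256)]

# The same ramp after being written to an assumed RGB565 frame buffer.
quantized = {to_rgb565(r, g, b) for (r, g, b) in ramp}

distinct_in = len(set(ramp))
distinct_out = len(quantized)
print(distinct_in, "shades in,", distinct_out, "shades out")
```

Under these assumptions the 256 input shades collapse into 64 distinct values, which shows up as banding; dithering trades those bands for a fine noise pattern, which is the artifact visible in the 16-bit screenshots.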
The raw specs of the Kyro II processor are rather unimpressive. With a 175 MHz clock and the ability to process two pixels per clock, the Kyro II's raw fill rate is only 350 megapixels per second. That figure is actually much closer to that of the NVIDIA RIVA TNT2 Ultra than to that of any current generation graphics processor. As we have seen in the past, numbers can be very misleading.
Rather than use the raw fill rate number of 350 megapixels per second, one has to take into account the overdraw that we discussed before. There are two ways to arrive at an effective fill rate number that accounts for overdraw: either look at the number of pixels actually rendered on screen in a given amount of time, or look at the number of pixels that an immediate mode rendering engine would have rendered to produce the same scene.
Under the first method, a tile rendering architecture such as the one used in the Kyro II renders only the pixels that actually appear on screen, so its effective fill rate is the same as its theoretical fill rate. An immediate mode renderer, by contrast, often processes up to 4 times the information necessary to render a scene, so one can essentially divide the theoretical fill rate of these cards by the amount of overdraw to arrive at their effective fill rate. Throughout the rest of this article, this will be the "effective fill rate" we are referring to.
For marketing reasons, it's much easier for STMicroelectronics to push the second method of arriving at an effective fill rate number. This means taking the amount of overdraw (which is eliminated on the Kyro II's tile based rendering system) and multiplying it by the theoretical fill rate of the tile based rendering card in order to get an "effective" fill rate. This is what STMicroelectronics chose to do. The idea is to arrive at a fill rate number that can be directly compared to that of an immediate mode renderer. Assuming an overdraw of 4, which is considered a bit high by many in the industry, the Kyro II earns an effective fill rate of 1400 megapixels per second, far above the GeForce2 GTS's 800 megapixels per second rating. Assuming a more conservative overdraw estimate of around 3, the Kyro II boasts 1050 megapixels per second, a number that is still very impressive.
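The two ways of accounting for overdraw are just the same ratio viewed from opposite sides, as a quick sketch of the arithmetic shows. The raw fill rates and overdraw factors are the ones quoted in the text; the function names are our own, and real overdraw varies from scene to scene.

```python
KYRO2_RAW = 350        # megapixels/s, raw fill rate quoted for the Kyro II
GEFORCE2_GTS_RAW = 800 # megapixels/s, rating quoted for the GeForce2 GTS

def on_screen_rate_immediate(raw_mpixels, overdraw):
    """Method 1: visible pixels/s for an immediate mode renderer."""
    return raw_mpixels / overdraw

def equivalent_rate_tiler(raw_mpixels, overdraw):
    """Method 2: immediate-mode-equivalent pixels/s for a tile renderer."""
    return raw_mpixels * overdraw

for overdraw in (3, 4):
    gts_visible = on_screen_rate_immediate(GEFORCE2_GTS_RAW, overdraw)
    kyro_equiv = equivalent_rate_tiler(KYRO2_RAW, overdraw)
    print(f"overdraw {overdraw}: "
          f"method 1 -> Kyro II {KYRO2_RAW} vs GTS {gts_visible:.0f} | "
          f"method 2 -> Kyro II {kyro_equiv} vs GTS {GEFORCE2_GTS_RAW}")
```

Either way, the ratio between the two cards at a given overdraw is identical; the second method simply produces the larger headline number.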
The Kyro II also boasts 8-layer multitexturing in a single pass. Since texturing is performed on-chip, multitexturing becomes much more efficient in certain circumstances. Consider the GeForce2 GTS, which can apply 2 textures to each of 4 pixels in a single pass. If the number of textures for a single pixel exceeds 2, the GeForce2 GTS has to render the pixel in more than one pass, and each additional pass means that the geometry data must be sent again. The Kyro II, on the other hand, is capable of applying up to 8 textures to a pixel in a single pass, which is another way in which tile rendering reduces memory bandwidth requirements. Note that this does not mean that the Kyro II can apply 8 textures in a single clock; in fact, it can only apply one texture per pixel per clock.
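The pass count follows directly from the per-pass texture limit. The sketch below is a simplified model using the limits given in the text (2 textures per pass for the GeForce2 GTS, 8 for the Kyro II); the function name is our own, and real drivers and games may split passes differently.

```python
import math

def passes_needed(texture_layers, textures_per_pass):
    """Rendering passes required to apply all texture layers to a pixel."""
    return math.ceil(texture_layers / textures_per_pass)

GEFORCE2_GTS_PER_PASS = 2  # textures per pixel per pass, from the text
KYRO2_PER_PASS = 8         # textures per pixel per pass, from the text

for layers in (2, 3, 4, 8):
    gts = passes_needed(layers, GEFORCE2_GTS_PER_PASS)
    kyro = passes_needed(layers, KYRO2_PER_PASS)
    # Every pass after the first requires the geometry to be sent again.
    print(f"{layers} layers: GeForce2 GTS {gts} pass(es), "
          f"{gts - 1} geometry resend(s); Kyro II {kyro} pass(es)")
```

Keep in mind that a single pass is not a single clock: at one texture per pixel per clock, the Kyro II still spends more clocks on a pixel as layers are added, but it avoids resending the geometry.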
The Kyro II still contains the original's 300 MHz RAMDAC. Added this time around was S3TC texture compression support. This feature was left out of the initial Kyro release because licensing of the technology from S3 was not completed in time for the Kyro's introduction.