Virtual Memory - Bringing L2 Cache to the VPU

Another fairly major feature of the P10 is what 3DLabs calls the Virtual Memory System (VMS). VMS works by storing all textures in main memory and treating the memory on the graphics card itself as a very large cache. When a texture is requested, the entire texture isn't downloaded; instead, a 256x256 block of 32-bit pixels is pulled into local memory and accessed there. The perfect example of where this is useful is when you're walking around in a 3D environment and only a small part of a texture is visible on the screen. In traditional architectures the entire texture must be downloaded, but with the P10's VMS the texture remains in system memory and only the part being seen is transferred to video memory. This may sound a lot like AGP texturing, and it is; the difference is that VMS behaves like AGP texturing with a cache. The region of system memory used by the P10's VMS does not need to be contiguous, which is also useful.
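To make the idea concrete, here is a minimal sketch of video memory acting as a cache of texture blocks. Only the 256x256 block of 32-bit pixels comes from the description above; the `TileCache` class, its method names, and the least-recently-used replacement policy are our own assumptions for illustration, not a description of how the P10 hardware actually manages its memory.

```python
from collections import OrderedDict

TILE_SIZE = 256        # block edge in texels, as described above
BYTES_PER_TEXEL = 4    # 32-bit pixels
TILE_BYTES = TILE_SIZE * TILE_SIZE * BYTES_PER_TEXEL  # 262144 bytes (256 KB)


class TileCache:
    """Toy model: video memory as a cache of texture blocks.

    The LRU replacement policy is an assumption made for this sketch.
    """

    def __init__(self, capacity_tiles):
        self.capacity_tiles = capacity_tiles
        self.resident = OrderedDict()  # (texture_id, tile_x, tile_y) -> True
        self.hits = 0
        self.misses = 0

    def fetch_texel(self, texture_id, x, y):
        """Record an access to texel (x, y) and return the block it lives in."""
        key = (texture_id, x // TILE_SIZE, y // TILE_SIZE)
        if key in self.resident:
            # Block is already in video memory: no bus transfer needed.
            self.resident.move_to_end(key)
            self.hits += 1
            return key
        # Block is missing: evict the least recently used one if full,
        # then pull the new block in from system memory.
        self.misses += 1
        if len(self.resident) >= self.capacity_tiles:
            self.resident.popitem(last=False)
        self.resident[key] = True
        return key

    def bytes_transferred(self):
        """Total bytes moved from system memory to video memory so far."""
        return self.misses * TILE_BYTES


# Sample a 512x256 region of a hypothetical 2048x2048 texture.
cache = TileCache(capacity_tiles=4)
for x in range(0, 512, 64):
    for y in range(0, 256, 64):
        cache.fetch_texel(0, x, y)

full_texture_bytes = 2048 * 2048 * BYTES_PER_TEXEL
print(cache.bytes_transferred(), "bytes moved, versus", full_texture_bytes)
```

In this toy run only two blocks are ever transferred, while a traditional architecture would have had to download the whole texture before using any of it.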

From the standpoint of game developers, VMS is quite attractive as it enables the use of far more textures than is currently possible. Today, game developers are very cautious about using more textures than will fit in the available video memory, because swapping out to main memory results in a huge performance hit. With VMS you get the benefit of an extremely high-bandwidth cache and can deal with much larger textures than ever before.

The best way to understand the benefits of VMS is by looking at the CPU world. Let's say that the L2 cache on your CPU wasn't really a cache but rather a small amount of high-speed memory that didn't cache system memory at all. As long as the data the CPU needed was in its on-board memory, performance would be extremely fast. However, as soon as the application you were running exceeded the local memory size, the performance hit would be enormous. Would it make the most sense for application developers to write all of their software so that it fits within the 512KB of memory on the CPU itself? Or would it make more sense for CPU manufacturers to treat that small amount of memory as a cache and enable developers to use a much larger amount of memory? Obviously you'd want the latter. While things would be much faster running entirely out of high-speed local memory, that memory is not only expensive but also limits software developers tremendously.

To get the viewpoint of someone faced with the limitations of current memory architectures, we asked Tim Sweeney of Epic Games for his thoughts on the P10's VMS:

"This is something Carmack and I have been pushing 3D card makers to implement for a very long time. Basically it enables us to use far more textures than we currently can. You won't see immediate improvements with current games, because games always avoid using more textures than fit in video memory, otherwise you get into texture swapping and performance becomes totally unacceptable. Virtual texturing makes swapping performance acceptable, because only the blocks of texels that are actually rendered are transferred to video memory, on demand.

Then video memory starts to look like a cache, and you can get away with less of it - typically you only need enough to hold the frame buffer, back buffer, and the blocks of texels that are rendered in the current scene, as opposed to all the textures in memory. So this should let IHV's include less video RAM without losing performance, and therefore faster RAM at less cost.

This does for rendering what virtual memory did for operating systems: it eliminates the hardcoded limitation on RAM (from the application's point of view.)"
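Sweeney's point about needing less video memory can be put into rough numbers. The sketch below adds up the three items he lists: frame buffer, back buffer, and the blocks of texels rendered in the current scene. The function name, the 1024x768 resolution, and the figure of 100 resident blocks are hypothetical values we picked for illustration; only the 256x256 block of 32-bit pixels comes from the description of VMS above. Depth and other buffers are left out to keep the arithmetic simple.

```python
def working_set_bytes(width, height, bytes_per_pixel, resident_tiles,
                      tile_size=256, bytes_per_texel=4):
    """Rough video memory estimate: frame buffer + back buffer + resident blocks."""
    frame_buffer = width * height * bytes_per_pixel
    back_buffer = width * height * bytes_per_pixel
    tiles = resident_tiles * tile_size * tile_size * bytes_per_texel
    return frame_buffer + back_buffer + tiles


# Hypothetical scene: 1024x768 at 32 bits per pixel, 100 blocks visible.
total = working_set_bytes(1024, 768, 4, resident_tiles=100)
print(total / (1024 * 1024), "MB")  # 6 MB of buffers + 25 MB of blocks = 31.0 MB
```

Under these assumed numbers the card needs about 31 MB no matter how many textures the game keeps in system memory, which is the sense in which video memory "starts to look like a cache."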
