Partially Resident Textures: Not Your Father’s MegaTexture

John Carmack’s id Software may not be the engine licensing powerhouse it was back in the Quake 3 days, but that hasn’t changed the revolutionary nature of his engine designs. We bring this up because a great deal of GPU technology can be mapped directly to concepts Carmack first implemented. For id Tech 4 Carmack implemented shadow volume technology, which NVIDIA was then the first to implement in hardware as UltraShadow, and which has since been implemented in a number of GPUs. For id Tech 5 the trend has continued, now with AMD doing a hardware implementation of a Carmack-inspired technology.

Among the features added to Graphics Core Next that were explicitly for gaming, the final feature was Partially Resident Textures, which many of you are probably more familiar with in concept as Carmack’s MegaTexture technology. The concept behind PRT/MegaTexture is that because of their size, textures should be broken down into smaller tiles rather than treated as singular entities, with the tiles then used as necessary. If a complete texture isn’t needed, then rather than loading the entire texture only the relevant tiles are loaded, while the irrelevant tiles are skipped or loaded at a low quality. Ultimately this technology is designed to improve texture streaming by streaming tiles instead of whole textures, reducing the amount of unnecessary texture data that is streamed.
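To make the tiling idea concrete, here is a minimal sketch of how an engine might work out which tiles a visible region of a texture touches. The function names and the 128 x 128 texel tile size are assumptions chosen for illustration; this is not AMD's or id's actual implementation.

```python
def tiles_for_region(x0, y0, x1, y1, tile_w=128, tile_h=128):
    """Return (col, row) indices of every tile overlapping the
    half-open texel rectangle [x0, x1) x [y0, y1)."""
    if x1 <= x0 or y1 <= y0:
        return []  # empty region touches no tiles
    first_col, last_col = x0 // tile_w, (x1 - 1) // tile_w
    first_row, last_row = y0 // tile_h, (y1 - 1) // tile_h
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


def total_tiles(tex_w, tex_h, tile_w=128, tile_h=128):
    """Number of tiles needed to cover the whole texture (rounding up)."""
    cols = (tex_w + tile_w - 1) // tile_w
    rows = (tex_h + tile_h - 1) // tile_h
    return cols * rows


# A 200 x 100 texel window into a hypothetical 16384 x 16384 texture
needed = tiles_for_region(100, 100, 300, 200)
print(len(needed), "of", total_tiles(16384, 16384), "tiles")  # 6 of 16384 tiles
```

Only the tiles in `needed` would have to be streamed in at full quality; everything else could be skipped or served from a lower-quality copy.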

Currently MegaTexture does this entirely in software using existing OpenGL 3.2 APIs, but AMD believes that more next-generation game engines will use this type of texturing technology. That makes it worth targeting: if AMD can implement it faster in hardware and get developers to use it, then it will improve game performance on their cards. Again this is similar to shadow volumes, where hardware implementations sped up the process.

In order to implement this in hardware AMD has to handle two things: texture conversion and cache management. With texture conversion, textures need to be read and broken up into tiles; AMD is going with a texture-format-agnostic method here that can simply chunk textures as they stand, keeping the resulting tiles in the same format. For AMD’s technology each tile will be 64KB, which for an uncompressed 32-bit texture would be enough room for a 128 x 128 texel chunk.
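The 128 x 128 figure follows directly from the tile size: 64KB is 65,536 bytes, a 32-bit texel takes 4 bytes, so a tile holds 16,384 texels, which is 128 x 128. A short sketch of that arithmetic follows; the helper names are hypothetical, and the results for other texel sizes are plain arithmetic rather than tile shapes AMD has confirmed.

```python
import math

TILE_BYTES = 64 * 1024  # 64KB per tile, as stated for AMD's PRT


def texels_per_tile(bits_per_texel):
    """How many texels of the given size fit in one 64KB tile."""
    return TILE_BYTES * 8 // bits_per_texel


def square_tile_side(bits_per_texel):
    """Side length if the tile's texels form a perfect square, else None."""
    count = texels_per_tile(bits_per_texel)
    side = math.isqrt(count)
    return side if side * side == count else None


print(texels_per_tile(32))   # 16384
print(square_tile_side(32))  # 128
```

Formats whose texel count is not a perfect square (16 bits per texel gives 32,768 texels, for example) would need a rectangular tile, which is why `square_tile_side` returns `None` in that case.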

The second aspect of PRT is managing the tiles. In essence PRT reduces local video memory to a very large cache, where tiles are mapped/pinned as necessary and then evicted as per the cache rules, while the hardware handles page/tile translation should a tile not already be in the cache. Large tomes have been written on caching methods, and this aspect is of particular interest to AMD because what they learn about caching here they can apply to graphical workloads (i.e. professional) and not just gaming.
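As a rough illustration of treating video memory as a tile cache, the sketch below keeps a fixed number of tiles resident and evicts the least recently used one on a miss. The article does not say which eviction rules AMD uses, so LRU is an assumption here, and `TileCache` and its `load_tile` callback are hypothetical names, not part of any driver or API.

```python
from collections import OrderedDict


class TileCache:
    """Toy model of video memory as a cache of fixed-size texture tiles.

    Eviction policy is LRU, chosen purely for illustration.
    """

    def __init__(self, capacity_tiles, load_tile):
        if capacity_tiles < 1:
            raise ValueError("capacity_tiles must be at least 1")
        self.capacity = capacity_tiles
        self.load_tile = load_tile     # hypothetical callback: tile_id -> tile data
        self.resident = OrderedDict()  # tile_id -> data, oldest first
        self.misses = 0

    def get(self, tile_id):
        if tile_id in self.resident:
            # Hit: mark the tile as most recently used.
            self.resident.move_to_end(tile_id)
            return self.resident[tile_id]
        # Miss: fetch the tile, evicting the least recently used if full.
        self.misses += 1
        data = self.load_tile(tile_id)
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)
        self.resident[tile_id] = data
        return data


cache = TileCache(capacity_tiles=2, load_tile=lambda tile_id: f"tile {tile_id}")
for tile_id in [(0, 0), (1, 0), (0, 0), (2, 0)]:
    cache.get(tile_id)
print(list(cache.resident), cache.misses)  # [(0, 0), (2, 0)] 3
```

In real hardware the lookup step corresponds to the page/tile translation described above, and pinning would exempt selected tiles from eviction; both are left out to keep the sketch short.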

To that end AMD put together a technology demo for PRT based on Per-Face Texture Mapping (Ptex), a Disney-developed texture mapping technique that maps textures to polygons in a 1:1 ratio. Disney uses this technique for production rendering, as by constraining textures to a single polygon they don’t have to deal with the complexities that arise from mapping a texture over multiple polygons. AMD’s demo benefits for the same reasons Disney does, and also because when combined with tessellation Ptex trivializes vector displacement, making art for tessellated games much easier to create. Finally, PRT fits into all of this by improving the efficiency of accessing and storing the Ptex texture chunks.

Wrapping things up, while Southern Islands will bring hardware support for PRT, software support will remain limited for the time being. As D3D is not normally extensible, the feature is really only easily accessible from other APIs (e.g. OpenGL), which when it comes to games is going to greatly limit the adoption of the technology. AMD of course is working on the issue, but there are few ways around D3D’s tight restrictions on non-standard features.


292 Comments


  • B3an - Thursday, December 22, 2011 - link

    Anyone with half a brain should have worked out that, being as this was going to be AMD's Fermi, it would not have had a massive increase for gaming, simply because many of those extra transistors are there for computing purposes. NOT for gaming. Just as with Fermi.

    The performance of this card is pretty much exactly as i expected.
  • Peichen - Friday, December 23, 2011 - link

    AMD has been saying for ages that GPU computing is useless and CPU is the only way to go. I guess they just have a better PR department than Nvidia.

    BTW, before suggesting I have suffered brain trauma, remember that Nvidia delivered on Fermi 2 and GK100 will be twice as powerful as GF110
  • CeriseCogburn - Thursday, March 8, 2012 - link

    Well it was nice to see the amd fans with half a heart admit amd has accomplished something huge by abandoned gaming, as they couldn't get enough of screaming it against nvidia... even as the 580 smoked up the top line stretch so many times...
    It's so entertaining...
  • CeriseCogburn - Thursday, March 8, 2012 - link

    AMD is the dumb company. Their dumb gpu shaders. Their x86 copying of intel. Now after a few years they've done enough stealing and corporate espionage to "clone" Nvidia architecture and come out with this 7k compute.
    If they're lucky Nvidia will continue doing all software groundbreaking and carry the massive load by a factor of ten or forty to one working with game developers, porting open gl and open cl to workable programs and as amd fans have demanded giving them PhysX ported out to open source "for free", at which point it will suddenly be something no gamer should live without.
    "Years behind" is the real story that should be told about amd and it's graphics - and it's cpu's as well.
    Instead we are fed worthless half truths and lies... a "tesselator" in the HD2900 (while pathetic dx11 perf is still the amd norm)... the ddr5 "groundbreaker" ( never mentioned was the sorry bit width that made cheap 128 and 256 the reason for ddr5 needs)...
    Etc.
    When you don't see the promised improvement, the radeonites see a red rocket shooting to the outer depths of the galaxy and beyond...
    Just get ready to pay some more taxes for the amd bailout coming.
  • durinbug - Thursday, December 22, 2011 - link

    I was intrigued by the comment about driver command lists, somehow I missed all of that when it happened. I went searching and finally found this forum post from Ryan:
    http://forums.anandtech.com/showpost.php?p=3152067...

    It would be nice to link to that from the mention of DCL for those of us not familiar with it...
  • digitalzombie - Thursday, December 22, 2011 - link

    I know I'm a minority, but I use Linux to crunch data and GPU would help a lot...

    I was wondering if you guys can try to use these cards on Debian/Ubuntu or Fedora? And maybe report if 3d acceleration actually works? My current amd card have bad driver for Linux, shearing and glitches, which sucks when I try to number crunch and map stuff out graphically in 3d. Hell I try compiling the driver's source code and it doesn't work.

    Thank you!
  • WaltC - Thursday, December 22, 2011 - link

    Somebody pinch me and tell me I didn't just read a review of a brand-new, high-end ATi card that apparently *forgot* Eyefinity is a feature the stock nVidia 580--the card the author singles out for direct comparison with the 7970--doesn't offer in any form. Please tell me it's my eyesight that is failing, because I missed the benchmark bar charts detailing the performance of the Eyefinity 6-monitor support in the 7970 (but I do recall seeing esoteric bar-chart benchmarks for *PCIe 3.0* performance comparisons, however. I tend to think that multi-monitor support, or the lack of it, is far more an important distinction than PCIe 3.0 support benchmarks at present.)

    Oh, wait--nVidia's stock 580 doesn't do nVidia's "NV Surround triple display" and so there was no point in mentioning that "trivial fact" anywhere in the article? Why compare two cards so closely but fail to mention a major feature one of them supports that the other doesn't? Eh? Is it the author's opinion that multi-monitor gaming is not worth having on either gpu platform? If so, it would be nice to know that by way of the author's admission. Personally, I think that knowing whether a product will support multi monitors and *playable* resolutions up to 5760x1200 ROOB is *somewhat* important in a product review. (sarcasm/massive understatement)

    Aside from that glaring oversight, I thought this review was just fair, honestly--and if the author had been less interested in apologizing for nVidia--we might even have seen a better one. Reading his hastily written apologies was kind of funny and amusing, though. But leaving out Eyefinity performance comparisons by pretending the feature isn't relative to the 7970, or that it isn't a feature worth commenting on relative to nVidia's stock 580? Very odd. The author also states: "The purpose of MST hubs was so that users could use several monitors with a regular Radeon card, rather than needing an exotic all-DisplayPort “Eyefinity edition” card as they need now," as if this is an industry-standard component that only ATi customers are "asking for," when it sure seems like nVidia customers could benefit from MST even more at present.

    I seem to recall reading the following statement more than once in this review but please pardon me if it was only stated once: "... but it’s NVIDIA that makes all the money." Sorry but even a dunce can see that nVidia doesn't now and never has "made all the money." Heh...;) If nVidia "made all the money," and AMD hadn't made any money at all (which would have to be the case if nVidia "made all the money") then we wouldn't see a 7970 at all, would we? It's possible, and likely, that the author meant "nVidia made more money," which is an independent declaration I'm not inclined to check, either way. But it's for certain that in saying "nVidia made all the money" the author was--obviously--wrong.

    The 7970 is all the more impressive considering how much longer nVidia's had to shape up and polish its 580-ish driver sets. But I gather that simple observation was also too far fetched for the author to have seriously considered as pertinent. The 7970 is impressive, AFAIC, but this review is somewhat disappointing. Looks like it was thrown together in a big hurry.
  • Finally - Friday, December 23, 2011 - link

    On AT you have to compensate for their over-steering while reading.
  • Death666Angel - Thursday, December 22, 2011 - link

    "Intel implemented Quick Sync as a CPU company, but does that mean hardware H.264 encoders are a CPU feature?" << Why is that even a question. I cannot use the feature unless I am using the iGPU or use the dGPU with Lucid Virtu. As such, it is not a feature of the CPU in my book.
  • Roald - Thursday, December 22, 2011 - link

    I don't agree with the conclusion. I think it's much more of a perspective thing. Coming from the 6970 to the 7970 it's not a great win in the gaming department. However the same can be said of the change from 4870 to 5870 to 6970. The only real benefit the 5870 had over the 4870 was DX11 support, which didn't mean so much for the games at the time.

    Now there is a new architecture that not only manages to increase FPS in current games, it also has growing potential and manages to excel in the compute field as well at the same time.

    The conclusion made in the Crysis Warhead part of this review should therefore also have been highlighted as final words.

    Meanwhile it’s interesting to note just how much progress we’ve made since the DX10 generation though; at 1920 the 7970 is 130% faster than the GTX 285 and 170% faster than the Radeon HD 4870. Existing users who skip a generation are a huge market for AMD and NVIDIA, and with this kind of performance they’re in a good position to finally convince those users to make the jump to DX11.
