Tahiti: The First Direct3D 11.1 GPU

One of the many changes coming in Windows 8 next year will be the next iteration of Direct3D, Direct3D 11.1. More so than any other version of Direct3D so far, D3D11.1 is best summed up as a housekeeping release. There will be some new features, but compared to even past point releases such as 10.1 and 9.0c it’s a small release, one focused more on improving the API itself (particularly interoperability with SoC GPUs for Windows 8) than on introducing new features. This is largely a consequence of lengthening development cycles for hardware and software alike. By the time Windows 8 ships Direct3D 11 will be 3 years old, but these days that’s shorter than the development period for some AAA games. Direct3D 11/11.1 will continue to be the current Windows 3D API for quite some time to come.

With regard to backward compatibility in D3D11.1, there’s one new feature in particular that requires new hardware to support it: Target Independent Rasterization. As a result AMD’s existing D3D11 GPUs cannot fully support D3D11.1, thereby making Tahiti the first D3D 11.1 GPU to be released. In practice this means that the hardware is once again ahead of the API, even more so than what we saw with G80 + D3D10 or Cypress (5870) + D3D11, since D3D11.1 isn’t due to arrive for roughly another year. For the time being Tahiti’s hardware supports D3D11.1 but AMD won’t enable this functionality until later; the first driver with D3D11.1 support will be a beta driver for Windows 8, which we expect to see alongside the Windows 8 beta next year.

So what does D3D11.1 bring to the table? The biggest end user feature is going to be the formalization of Stereo 3D support into the D3D API. Currently S3D is achieved either by partially going around D3D to present a quad buffer to games and applications that directly support S3D, or, in the case of driver/middleware enhancement, by manipulating the rendering process itself to get the desired results. Formalizing S3D won’t remove the need for middleware to enable S3D on games that choose not to implement it, but for games that do choose to directly implement it, such as Deus Ex, it will now be possible to do this through Direct3D.

S3D related sales have never been particularly spectacular, and no doubt the fragmentation of the market is partially to blame, so this may be the push in the right direction that the S3D market needs, if the wider consumer base is ready to accept it. At a minimum this should remove the need for any fragmentation/customization when it comes to games that directly support S3D.

With S3D out of the way, the rest of the D3D11.1 feature set isn’t going to be nearly as visible. Interoperability between graphics, video, and compute is going to be greatly improved, allowing video via Media Foundation to be sent through pixel and compute shaders, among other things. Meanwhile target independent rasterization and some new buffer commands should give developers a few more tricks to work with, while double precision (FP64) support will be coming to pixel shaders on hardware that has FP64 support.

Finally, looking at things at a lower level, D3D11.1 will be released alongside DXGI 1.2 and WDDM 1.2, the full combination of which will continue Microsoft’s long-term goal of making the GPU more CPU-like. One of Microsoft’s goals has been to push GPU manufacturers to improve the granularity of GPU preemption, both for performance and reliability purposes. Things have gotten better since XP: Vista introduced GPU Timeout Detection and Recovery (TDR) to reset hung GPUs, and a finer level of granularity has been introduced to allow multiple games/applications to share a GPU without stomping all over each other. Even so, preemption and context switches are still expensive on a GPU compared to a CPU (there are a lot of registers to deal with), which impacts performance and reliability.

To that end preemption is being given a bit more attention, as WDDM 1.2 will be introducing some new API commands to help manage it while encouraging hardware developers to support finer grained preemption. Meanwhile to improve reliability TDR is getting a major addition by being able to do a finer grained reset of the GPU. Currently with Windows 7 a TDR triggers a complete GPU reset, but with Windows 8 and WDDM 1.2 the GPU will be compartmentalized into “engines” that can be individually reset. Only the games/applications using a reset engine will be impacted while everything else is left untouched, and while most games and applications can already gracefully handle a reset, this will further reduce the problems a reset creates by resetting fewer programs.
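
The difference between a whole-GPU reset and a per-engine reset can be illustrated with a small toy model. This is only a sketch of the idea described above, not of the WDDM driver interface: the `Gpu` class, its methods, and the engine names ("3d", "video_decode", "compute") are all hypothetical and chosen purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Gpu:
    """Toy model of a GPU split into engines (all names are hypothetical)."""

    # Maps an engine name to the set of applications with work queued on it.
    engines: dict[str, set[str]] = field(default_factory=dict)

    def submit(self, app: str, engine: str) -> None:
        """Record that an application is using the given engine."""
        self.engines.setdefault(engine, set()).add(app)

    def full_reset(self) -> set[str]:
        """Whole-GPU reset: every application on any engine is affected."""
        return set().union(*self.engines.values())

    def engine_reset(self, engine: str) -> set[str]:
        """Per-engine reset: only applications on the hung engine are affected."""
        return set(self.engines.get(engine, set()))


gpu = Gpu()
gpu.submit("game", "3d")
gpu.submit("video_player", "video_decode")
gpu.submit("compute_job", "compute")

# A hang on the compute engine under each reset model.
affected_by_full_reset = gpu.full_reset()
affected_by_engine_reset = gpu.engine_reset("compute")
```

In this model a hang caused by the compute job takes down all three programs under a whole-GPU reset, while a per-engine reset touches only the compute job. Real hardware decides how engines are partitioned, so the mapping of workloads to engines here should be read as an assumption.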

 


292 Comments


  • B3an - Thursday, December 22, 2011

    Anyone with half a brain should have worked out that, being as this was going to be AMD's Fermi, it would not have had a massive increase for gaming, simply because many of those extra transistors are there for computing purposes. NOT for gaming. Just as with Fermi.

    The performance of this card is pretty much exactly as i expected.
  • Peichen - Friday, December 23, 2011

    AMD has been saying for ages that GPU computing is useless and CPU is the only way to go. I guess they just have a better PR department than Nvidia.

    BTW, before suggesting I have suffered brain trauma, remember that Nvidia delivered on Fermi 2 and GK100 will be twice as powerful as GF110
  • CeriseCogburn - Thursday, March 8, 2012

    Well it was nice to see the amd fans with half a heart admit amd has accomplished something huge by abandoning gaming, as they couldn't get enough of screaming it against nvidia... even as the 580 smoked up the top line stretch so many times...
    It's so entertaining...
  • CeriseCogburn - Thursday, March 8, 2012

    AMD is the dumb company. Their dumb gpu shaders. Their x86 copying of intel. Now after a few years they've done enough stealing and corporate espionage to "clone" Nvidia architecture and come out with this 7k compute.
    If they're lucky Nvidia will continue doing all software groundbreaking and carry the massive load by a factor of ten or forty to one working with game developers, porting open gl and open cl to workable programs and as amd fans have demanded giving them PhysX ported out to open source "for free", at which point it will suddenly be something no gamer should live without.
    "Years behind" is the real story that should be told about amd and its graphics - and its cpus as well.
    Instead we are fed worthless half truths and lies... a "tesselator" in the HD2900 (while pathetic dx11 perf is still the amd norm)... the ddr5 "groundbreaker" ( never mentioned was the sorry bit width that made cheap 128 and 256 the reason for ddr5 needs)...
    Etc.
    When you don't see the promised improvement, the radeonites see a red rocket shooting to the outer depths of the galaxy and beyond...
    Just get ready to pay some more taxes for the amd bailout coming.
  • durinbug - Thursday, December 22, 2011

    I was intrigued by the comment about driver command lists, somehow I missed all of that when it happened. I went searching and finally found this forum post from Ryan:
    http://forums.anandtech.com/showpost.php?p=3152067...

    It would be nice to link to that from the mention of DCL for those of us not familiar with it...
  • digitalzombie - Thursday, December 22, 2011

    I know I'm a minority, but I use Linux to crunch data and GPU would help a lot...

    I was wondering if you guys can try to use these cards on Debian/Ubuntu or Fedora? And maybe report if 3D acceleration actually works? My current AMD card has a bad driver for Linux, with shearing and glitches, which sucks when I try to number crunch and map stuff out graphically in 3D. Hell, I tried compiling the driver's source code and it doesn't work.

    Thank you!
  • WaltC - Thursday, December 22, 2011

    Somebody pinch me and tell me I didn't just read a review of a brand-new, high-end ATi card that apparently *forgot* Eyefinity is a feature the stock nVidia 580--the card the author singles out for direct comparison with the 7970--doesn't offer in any form. Please tell me it's my eyesight that is failing, because I missed the benchmark bar charts detailing the performance of the Eyefinity 6-monitor support in the 7970 (but I do recall seeing esoteric bar-chart benchmarks for *PCIe 3.0* performance comparisons, however. I tend to think that multi-monitor support, or the lack of it, is far more an important distinction than PCIe 3.0 support benchmarks at present.)

    Oh, wait--nVidia's stock 580 doesn't do nVidia's "NV Surround triple display" and so there was no point in mentioning that "trivial fact" anywhere in the article? Why compare two cards so closely but fail to mention a major feature one of them supports that the other doesn't? Eh? Is it the author's opinion that multi-monitor gaming is not worth having on either gpu platform? If so, it would be nice to know that by way of the author's admission. Personally, I think that knowing whether a product will support multi monitors and *playable* resolutions up to 5760x1200 ROOB is *somewhat* important in a product review. (sarcasm/massive understatement)

    Aside from that glaring oversight, I thought this review was just fair, honestly--and if the author had been less interested in apologizing for nVidia--we might even have seen a better one. Reading his hastily written apologies was kind of funny and amusing, though. But leaving out Eyefinity performance comparisons by pretending the feature isn't relative to the 7970, or that it isn't a feature worth commenting on relative to nVidia's stock 580? Very odd. The author also states: "The purpose of MST hubs was so that users could use several monitors with a regular Radeon card, rather than needing an exotic all-DisplayPort “Eyefinity edition” card as they need now," as if this is an industry-standard component that only ATi customers are "asking for," when it sure seems like nVidia customers could benefit from MST even more at present.

    I seem to recall reading the following statement more than once in this review but please pardon me if it was only stated once: "... but it’s NVIDIA that makes all the money." Sorry but even a dunce can see that nVidia doesn't now and never has "made all the money." Heh...;) If nVidia "made all the money," and AMD hadn't made any money at all (which would have to be the case if nVidia "made all the money") then we wouldn't see a 7970 at all, would we? It's possible, and likely, that the author meant "nVidia made more money," which is an independent declaration I'm not inclined to check, either way. But it's for certain that in saying "nVidia made all the money" the author was--obviously--wrong.

    The 7970 is all the more impressive considering how much longer nVidia's had to shape up and polish its 580-ish driver sets. But I gather that simple observation was also too far fetched for the author to have seriously considered as pertinent. The 7970 is impressive, AFAIC, but this review is somewhat disappointing. Looks like it was thrown together in a big hurry.
  • Finally - Friday, December 23, 2011

    On AT you have to compensate for their over-steering while reading.
  • Death666Angel - Thursday, December 22, 2011

    "Intel implemented Quick Sync as a CPU company, but does that mean hardware H.264 encoders are a CPU feature?" << Why is that even a question. I cannot use the feature unless I am using the iGPU or use the dGPU with Lucid Virtu. As such, it is not a feature of the CPU in my book.
  • Roald - Thursday, December 22, 2011

    I don't agree with the conclusion. I think it's much more of a perspective thing. Coming from the 6970 to the 7970 it's not a great win in the gaming department. However the same can be said of the change from 4870 to 5870 to 6970. The only real benefit the 5870 had over the 4870 was DX11 support, which didn't mean so much for the games at the time.

    Now there is a new architecture that not only manages to increase FPS in current games, it also has growing potential and manages to excel in the compute field as well at the same time.

    The conclusion made in the Crysis Warhead part of this review should therefore also have been highlighted in the final words.

    Meanwhile it’s interesting to note just how much progress we’ve made since the DX10 generation though; at 1920 the 7970 is 130% faster than the GTX 285 and 170% faster than the Radeon HD 4870. Existing users who skip a generation are a huge market for AMD and NVIDIA, and with this kind of performance they’re in a good position to finally convince those users to make the jump to DX11.
