Video & Movies: The Video Codec Engine, UVD3, & Steady Video 2.0

When Intel introduced the Sandy Bridge architecture one of their big additions was Quick Sync, their name for their hardware H.264 encoder. By combining a specialized fixed function encoder with some GPU-based processing Intel was able to create a small, highly efficient H.264 encoder that had quality that was as good as or better than AMD and NVIDIA’s GPU based encoders that at the same time was 2x to 4x faster and consumed a fraction of the power. Quick Sync made real-time H.264 encoding practical on even low-power devices, and made GPU encoding redundant at the time. AMD of course isn’t one to sit idle, and they have been hard at work at their own implementation of that technology: the Video Codec Engine (VCE).

The introduction of VCE brings up a very interesting point for discussing the organization of AMD. As both a CPU and a GPU company the line between the two divisions and their technologies often blurs, and Fusion has practically made this mandatory. When AMD wants to implement a feature, is it a GPU feature, a CPU feature, or perhaps it’s both? Intel implemented Quick Sync as a CPU company, but does that mean hardware H.264 encoders are a CPU feature? AMD says no. Hardware H.264 encoders are a GPU feature.

As such VCE is being added to the mix from the GPU side, meaning it shows up first here on the Southern Islands series. Fundamentally VCE is very similar to Quick Sync – it’s based on what you can accomplish with the addition of a fixed function encoder – but AMD takes the concept much further to take full advantage of what the compute side of GCN can do. In “Full Mode” VCE behaves exactly like Quick Sync, in which virtually every step of the H.264 encoding process is handled by fixed function hardware. Just like Quick Sync Full Mode is fast and energy efficient. But it doesn’t make significant use of the rest of the GPU.

Hybrid Mode is where AMD takes things a step further, by throwing the compute resources of the GPU back into the mix. In Hybrid Mode only Entropy Encode is handled by fixed function hardware (this being a highly serial process that was ill suited to a GPU) with all the other steps being handled by the flexible hardware of the GPU. The end goal of Hybrid Mode is that as these other steps are well suited to being done on a GPU, Hybrid Mode will be much faster than even the highly optimized fixed function hardware of Full Mode. Full Mode is already faster than real time – Hybrid Mode should be faster yet.

With VCE AMD is also targeting Quick Sync’s weaknesses regardless of the mode used. Quick Sync has limited tuning capabilities which impacts the quality of the resulting encode. AMD is going to offer more tuning capabilities to allow for a wider range of compression quality.  We don’t expect that it will be up to the quality standards of X264 and other pure-software encoders that can generate archival quality encodes, but if AMD is right it should be closer to archival quality than Quick Sync was.

The catch right now is that VCE is so new that we can’t test it. The hardware is there and we’re told it works, but the software support for it is lacking as none of AMD’s partners have added support for it yet. On the positive side this means we’ll be able to test it in-depth once the software is ready as opposed to quickly testing it in time for this review, however the downside is that we cannot comment on the speed or quality at this time. Though with the 7970 not launching until next year, there’s time for software support to be worked out before the first Southern Islands card ever goes on sale.

Moving on, while encoding has been significantly overhauled decoding will remain largely the same. AMD doesn’t refer to the Universal Video Decoder on Tahiti as UVD3, but the specifications match UVD3 as we’ve seen on Cayman so we believe it to be the same decoder. The quality may have been slightly improved as AMD is telling us they’ve scored 200 on HQV 2.0 – the last time we scored them they were at 197 – but HQV is a partially subjective benchmark.

Finally, with Southern Islands AMD is introducing Steady Video 2.0, thesuccessor to Steady Video that was introduced with the Llano APU last year. Steady Video 2.0 adds support for interlaced and letter/pillar boxed content, along with a general increase in the effectiveness of the steadying effect. What makes this particularly interesting is that Steady Video implements a new GCN architecture instruction, Quad Sum of Absolute Differences (QSAD), which combines regular SAD operations with alignment operations into a single instruction. As a result AMD can now execute SADs at a much higher rate so long as they can be organized into QSADs, which is one of the principle reasons that AMD was able to improve Steady Video as it’s a SAD-heavy operation. QSAD extends to more than just Steady Video (AMD noted that it’s also good for other image analysis operations), but Steady Video is going to be the premiere use for it.

Display Tech, Cont: Fast HDMI PCI Express 3.0: More Bandwidth For Compute


View All Comments

  • B3an - Thursday, December 22, 2011 - link

    Anyone with half a brain should have worked out that being as this was going to be AMD's Fermi that it would not of had a massive increase for gaming, simply because many of those extra transistors are there for computing purposes. NOT for gaming. Just as with Fermi.

    The performance of this card is pretty much exactly as i expected.
  • Peichen - Friday, December 23, 2011 - link

    AMD has been saying for ages that GPU computing is useless and CPU is the only way to go. I guess they just have a better PR department than Nvidia.

    BTW, before suggesting I have suffered brain trauma, remember that Nvidia delivered on Fermi 2 and GK100 will be twice as powerful as GF110
  • CeriseCogburn - Thursday, March 08, 2012 - link

    Well it was nice to see the amd fans with half a heart admit amd has accomplished something huge by abandoned gaming, as they couldn't get enough of screaming it against nvidia... even as the 580 smoked up the top line stretch so many times...
    It's so entertaining...
  • CeriseCogburn - Thursday, March 08, 2012 - link

    AMD is the dumb company. Their dumb gpu shaders. Their x86 copying of intel. Now after a few years they've done enough stealing and corporate espionage to "clone" Nvidia architecture and come out with this 7k compute.
    If they're lucky Nvidia will continue doing all software groundbreaking and carry the massive load by a factor of ten or forty to one working with game developers, porting open gl and open cl to workable programs and as amd fans have demanded giving them PhysX ported out to open source "for free", at which point it will suddenly be something no gamer should live without.
    "Years behind" is the real story that should be told about amd and it's graphics - and it's cpu's as well.
    Instead we are fed worthless half truths and lies... a "tesselator" in the HD2900 (while pathetic dx11 perf is still the amd norm)... the ddr5 "groundbreaker" ( never mentioned was the sorry bit width that made cheap 128 and 256 the reason for ddr5 needs)...
    When you don't see the promised improvement, the radeonites see a red rocket shooting to the outer depths of the galaxy and beyond...
    Just get ready to pay some more taxes for the amd bailout coming.
  • durinbug - Thursday, December 22, 2011 - link

    I was intrigued by the comment about driver command lists, somehow I missed all of that when it happened. I went searching and finally found this forum post from Ryan:

    It would be nice to link to that from the mention of DCL for those of us not familiar with it...
  • digitalzombie - Thursday, December 22, 2011 - link

    I know I'm a minority, but I use Linux to crunch data and GPU would help a lot...

    I was wondering if you guys can try to use these cards on Debian/Ubuntu or Fedora? And maybe report if 3d acceleration actually works? My current amd card have bad driver for Linux, shearing and glitches, which sucks when I try to number crunch and map stuff out graphically in 3d. Hell I try compiling the driver's source code and it doesn't work.

    Thank you!
  • WaltC - Thursday, December 22, 2011 - link

    Somebody pinch me and tell me I didn't just read a review of a brand-new, high-end ATi card that apparently *forgot* Eyefinity is a feature the stock nVidia 580--the card the author singles out for direct comparison with the 7970--doesn't offer in any form. Please tell me it's my eyesight that is failing, because I missed the benchmark bar charts detailing the performance of the Eyefinity 6-monitor support in the 7970 (but I do recall seeing esoteric bar-chart benchmarks for *PCIe 3.0* performance comparisons, however. I tend to think that multi-monitor support, or the lack of it, is far more an important distinction than PCIe 3.0 support benchmarks at present.)

    Oh, wait--nVidia's stock 580 doesn't do nVidia's "NV Surround triple display" and so there was no point in mentioning that "trivial fact" anywhere in the article? Why compare two cards so closely but fail to mention a major feature one of them supports that the other doesn't? Eh? Is it the author's opinion that multi-monitor gaming is not worth having on either gpu platform? If so, it would be nice to know that by way of the author's admission. Personally, I think that knowing whether a product will support multi monitors and *playable* resolutions up to 5760x1200 ROOB is *somewhat* important in a product review. (sarcasm/massive understatement)

    Aside from that glaring oversight, I thought this review was just fair, honestly--and if the author had been less interested in apologizing for nVidia--we might even have seen a better one. Reading his hastily written apologies was kind of funny and amusing, though. But leaving out Eyefinity performance comparisons by pretending the feature isn't relative to the 7970, or that it isn't a feature worth commenting on relative to nVidia's stock 580? Very odd. The author also states: "The purpose of MST hubs was so that users could use several monitors with a regular Radeon card, rather than needing an exotic all-DisplayPort “Eyefinity edition” card as they need now," as if this is an industry-standard component that only ATi customers are "asking for," when it sure seems like nVidia customers could benefit from MST even more at present.

    I seem to recall reading the following statement more than once in this review but please pardon me if it was only stated once: "... but it’s NVIDIA that makes all the money." Sorry but even a dunce can see that nVidia doesn't now and never has "made all the money." Heh...;) If nVidia "made all the money," and AMD hadn't made any money at all (which would have to be the case if nVidia "made all the money") then we wouldn't see a 7970 at all, would we? It's possible, and likely, that the author meant "nVidia made more money," which is an independent declaration I'm not inclined to check, either way. But it's for certain that in saying "nVidia made all the money" the author was--obviously--wrong.

    The 7970 is all the more impressive considering how much longer nVidia's had to shape up and polish its 580-ish driver sets. But I gather that simple observation was also too far fetched for the author to have seriously considered as pertinent. The 7970 is impressive, AFAIC, but this review is somewhat disappointing. Looks like it was thrown together in a big hurry.
  • Finally - Friday, December 23, 2011 - link

    On AT you have to compensate for their over-steering while reading. Reply
  • Death666Angel - Thursday, December 22, 2011 - link

    "Intel implemented Quick Sync as a CPU company, but does that mean hardware H.264 encoders are a CPU feature?" << Why is that even a question. I cannot use the feature unless I am using the iGPU or use the dGPU with Lucid Virtu. As such, it is not a feature of the CPU in my book. Reply
  • Roald - Thursday, December 22, 2011 - link

    I don't agree with the conclusion. I think it's much more of a perspective thing. Comming from the 6970 to the 7970 it's not a great win in the gaming deparment. However the same can be said from the change from 4870 to 5870 to 6970. The only real benefit the 5870 had over the 4870 was DX11 support, which didn't mean so much for the games at the time.

    Now there is a new architechture that not only manages to increase FPS in current games, it also has growing potential and manages to excell in the compute field aswell at the same time.

    The conclusion made in the Crysis warhead part of this review should therefore also have been highlighted as finals words.

    Meanwhile it’s interesting to note just how much progress we’ve made since the DX10 generation though; at 1920 the 7970 is 130% faster than the GTX 285 and 170% faster than the Radeon HD 4870. Existing users who skip a generation are a huge market for AMD and NVIDIA, and with this kind of performance they’re in a good position to finally convince those users to make the jump to DX11.

Log in

Don't have an account? Sign up now