AMD's Radeon HD 6970 & Radeon HD 6950: Paving The Future For AMD

Author: Ryan Smith

by Ryan Smith on December 15, 2010 12:01 AM EST

Refresher: The 6800 Series’ New Features

Back in October AMD launched the first 6000 series cards, the Barts-based Radeon HD 6800 series. At their core they are a refreshed version of the Cypress GPU that we saw on the 5800 series, but AMD used the opportunity to make some enhancements over the standard Cypress. All of these enhancements apply throughout the 6000 series, so this includes the 6900 series. As such, for those of you who didn’t pay much attention to the 6800 series, we’re going to quickly recap what’s new in order to lay the groundwork for further comparisons of the 6900 series to the 5800 series.

We’ll start with the core architecture. Compared to Cypress, Barts is nearly identical save one difference: the tessellator. For Barts AMD implemented what they call their 7th generation tessellator, which focused on delivering improved tessellation performance at the lower tessellation factors that AMD felt were more important. Cayman takes this one step further and implements AMD’s 8th generation tessellator, which as the naming convention implies is the 7th generation tessellator with even further enhancements (particularly those necessary for load balancing).

The second change we saw with Barts and the 6800 series was AMD’s refined texture filtering engine. AMD’s texture filtering engine from the 5800 series set new standards by offering angle independent filtering, but it had an annoying quirk with highly regular/noisy textures where it didn’t do a good enough job blending together various mipmaps, resulting in visible transitions between them. For the 6800 series AMD fixed this, and it can now properly blend together noisy textures. At the same time in a controversial move AMD tweaked its default filtering optimizations for the 5800 series and entire 6000 series, leading to these cards producing images subtly different (and depending on who you ask, subtly worse) than they were on the 5800 series prior to the Catalyst 10.10 drivers.
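To make the mipmap blending idea concrete, here is a minimal sketch of the textbook linear blend between two adjacent mip levels. It is not AMD’s hardware implementation; the function name and the simplification of one already-filtered sample per mip level are assumptions made for illustration.

```python
import math


def blend_mip_levels(mip_samples, lod):
    """Linearly blend the two mip levels that bracket a fractional LOD.

    mip_samples: one already-filtered texel value per mip level
                 (index 0 is the full-resolution level). Hypothetical input.
    lod:         fractional level of detail chosen for this pixel.
    """
    # Clamp the LOD to the range of levels that actually exist.
    lod = max(0.0, min(lod, len(mip_samples) - 1))
    lower = int(math.floor(lod))
    upper = min(lower + 1, len(mip_samples) - 1)
    # The fractional part decides how much of the coarser level is mixed in.
    weight = lod - lower
    return (1.0 - weight) * mip_samples[lower] + weight * mip_samples[upper]


if __name__ == "__main__":
    # A bright fine level and a dark coarse level, sampled a quarter of the
    # way between them.
    print(blend_mip_levels([1.0, 0.0], 0.25))
```

When the blend weight changes smoothly with distance, the switch from one mip level to the next is gradual; a filter that blends too little shows the visible transitions described above.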

[Texture filtering comparison images: Radeon HD 5870, Radeon HD 6870, GeForce GTX 480]

The third change we saw was the introduction of a new anti-aliasing mode, initially launched on the 6800 series and backported to the 5800 series shortly thereafter. Morphological Anti-Aliasing (MLAA) is a post-processing filter that works on any (and all) images, looking for high contrast edges (jaggies) and blending them to reduce the contrast. Implemented as a compute shader, it works with all games. As it’s a post-processing filter the results can vary – the filter has no knowledge of depth, polygons, or other attributes of the rendered world beyond the final image – so it’s prone to blending everything that looks like aliasing. On the plus side it’s cheap to use as it was originally designed for consoles with their limited resources, so by not consuming large amounts of memory & memory bandwidth like SSAA/MSAA it usually has a low performance hit.
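As a rough illustration of the post-processing idea, the sketch below scans a finished image for high contrast neighbors and blends across them. It is a deliberately simplified stand-in, not AMD’s MLAA: the real filter classifies edge shapes and computes coverage-based blend weights, while this version uses a fixed 50/50 blend. The function name, the grayscale list-of-rows image format, and the threshold value are all assumptions.

```python
def blend_high_contrast_edges(image, threshold=0.25):
    """Blend pixels that sit next to a high-contrast neighbor.

    image:     list of rows of luminance values in [0, 1] (hypothetical format).
    threshold: minimum luminance difference treated as a "jaggy" edge.
    Returns a new image; the input is left untouched.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]

    for y in range(height):
        for x in range(width):
            center = image[y][x]
            # Collect the 4-connected neighbors that exist inside the image.
            neighbors = []
            if x > 0:
                neighbors.append(image[y][x - 1])
            if x < width - 1:
                neighbors.append(image[y][x + 1])
            if y > 0:
                neighbors.append(image[y - 1][x])
            if y < height - 1:
                neighbors.append(image[y + 1][x])

            # Only the final image is inspected: no depth or polygon data.
            contrasting = [n for n in neighbors if abs(n - center) > threshold]
            if contrasting:
                average = sum(contrasting) / len(contrasting)
                result[y][x] = 0.5 * center + 0.5 * average
    return result


if __name__ == "__main__":
    hard_edge = [[0.0, 0.0, 1.0, 1.0]]
    print(blend_high_contrast_edges(hard_edge))
```

Because the filter only sees pixel values, it softens a sharp texture detail or text glyph just as readily as a polygon edge, which is the trade-off noted above.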

Last but not least, AMD made a number of changes to their display hardware. The Universal Video Decoder (UVD) was upgraded to version 3, bringing full decode support for MPEG-2, MPEG-4 ASP, and H.264 MVC (packed frame video for 3D movies). For the 6900 series this is not of great importance as MPEG-2 and MPEG-4 ASP are low complexity codecs, but it does play an important role for AMD’s future APU products and low-end GPUs, where offloading these low complexity codecs is still going to be a big relief for the slower CPUs they’re paired with. And on that note the first public version of the DivX codec with support for UVD3 will be shipping today, letting 6800/6900 series owners finally take advantage of this functionality.

The second of the major display changes was the addition of support for the DisplayPort 1.2 standard. DP1.2 doubles DisplayPort’s bandwidth to 21.6Gbps, finally giving DisplayPort a significant bandwidth lead over dual-link DVI. With double the bandwidth it’s now possible to drive multiple monitors off of a single root DisplayPort, a technology called Multi Stream Transport (MST). AMD is heavily banking on this technology, as the additional bandwidth coupled with the fact that DisplayPort doesn’t require a clock source for each monitor/stream means AMD can drive up to 6 monitors off of a single card using only a pair of mini-DP ports. AMD is so cutting edge here that like the 6800 series the 6900 series is technically only DP1.2 ready – there won’t be any other devices available for compliance testing until 2011.
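A back-of-the-envelope calculation shows why the extra bandwidth matters for MST. The sketch below takes the 21.6Gbps raw figure from above and assumes the 8b/10b line coding DisplayPort uses, so only 80% of the raw rate carries payload. The 20% blanking overhead and the function names are illustrative assumptions, not figures from AMD or the DisplayPort specification.

```python
def stream_gbps(width, height, refresh_hz, bits_per_pixel=24,
                blanking_overhead=0.20):
    """Approximate bandwidth of one uncompressed video stream in Gbps.

    blanking_overhead is an assumed allowance for blanking intervals;
    real timings vary by monitor and timing standard.
    """
    active_bits_per_second = width * height * refresh_hz * bits_per_pixel
    return active_bits_per_second * (1.0 + blanking_overhead) / 1e9


def streams_per_link(raw_link_gbps, per_stream_gbps, encoding_efficiency=0.8):
    """How many identical streams fit on one link after 8b/10b coding."""
    payload_gbps = raw_link_gbps * encoding_efficiency
    return int(payload_gbps // per_stream_gbps)


if __name__ == "__main__":
    full_hd = stream_gbps(1920, 1080, 60)
    wqxga = stream_gbps(2560, 1600, 60)
    print("1080p60 streams on a 21.6Gbps link:", streams_per_link(21.6, full_hd))
    print("2560x1600 streams on a 21.6Gbps link:", streams_per_link(21.6, wqxga))
    print("1080p60 streams on a 10.8Gbps link:", streams_per_link(10.8, full_hd))
```

Under these assumptions a single DP1.2 link has room for several 1080p60 streams, which is consistent with driving three monitors from each of two mini-DP ports.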

Finally, the 6800 series also introduced support for HDMI 1.4a and support for color correction in linear space. HDMI 1.4a support is fairly straightforward: the 6000 series can drive 3D televisions in either the 1080p24 or 720p60 3D modes. Meanwhile support for color correction in linear space allows AMD to offer accurate color correction for wide gamut monitors; previously there was a loss of accuracy as color correction had to be applied in the gamma color space, which is only meant for display purposes. This is particularly important for integrating wide gamut monitors into traditional gamut workflows, as sRGB is misinterpreted on a wide gamut monitor without color correction.
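The difference between correcting in linear space and in gamma space can be shown with a small sketch. It uses the standard sRGB transfer functions; the 3x3 correction matrix is a made-up example rather than a real monitor profile, and the function names are assumptions. This is not a description of how AMD’s display hardware implements the feature.

```python
def srgb_to_linear(value):
    """Decode one sRGB-encoded channel value in [0, 1] to linear light."""
    if value <= 0.04045:
        return value / 12.92
    return ((value + 0.055) / 1.055) ** 2.4


def linear_to_srgb(value):
    """Encode one linear-light channel value in [0, 1] back to sRGB."""
    if value <= 0.0031308:
        return value * 12.92
    return 1.055 * value ** (1.0 / 2.4) - 0.055


def apply_matrix(matrix, rgb):
    """Multiply a 3x3 matrix (list of rows) by an RGB triple."""
    return tuple(sum(matrix[row][k] * rgb[k] for k in range(3))
                 for row in range(3))


def clamp01(value):
    return max(0.0, min(1.0, value))


def correct_in_linear_space(matrix, rgb):
    """Decode to linear light, apply the correction, then re-encode."""
    linear = tuple(srgb_to_linear(c) for c in rgb)
    corrected = apply_matrix(matrix, linear)
    return tuple(linear_to_srgb(clamp01(c)) for c in corrected)


def correct_in_gamma_space(matrix, rgb):
    """Apply the same matrix directly to gamma-encoded values."""
    return tuple(clamp01(c) for c in apply_matrix(matrix, rgb))


if __name__ == "__main__":
    # Hypothetical correction that desaturates the red primary slightly.
    example_matrix = [[0.8, 0.2, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]]
    pure_red = (1.0, 0.0, 0.0)
    print("linear space:", correct_in_linear_space(example_matrix, pure_red))
    print("gamma space: ", correct_in_gamma_space(example_matrix, pure_red))
```

The two paths give different red channel values for the same matrix. A gamut correction matrix describes a mix of light intensities, so it is only accurate when applied to linear values.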

While all of these features were introduced on the 6800 series, they’re fundamental parts of the entire 6000 series, meaning they’re part of the 6900 series too. This provides us with a baseline set of improvements over AMD’s 5800 series, on top of the additional improvements Cayman and AMD’s VLIW4 architecture bring.

168 Comments

B3an - Thursday, December 16, 2010 - link
Very stupid, uninformed and narrow-minded comment. People like you never look to the future, which anyone should do when buying a graphics card, and you completely lack any imagination. There are already tons of uses for GPU computing, many of which the average computer user can make use of, even if it's simply encoding a video faster. And it will be used a LOT more in the future.

Most people, especially ones that game, don't even have 17" monitors these days. The average size monitor for any new computer is at least 21" with 1680 res these days. Your whole comment is as if everyone has the exact same needs as YOU. You might be happy with your ridiculously small monitor, and playing games at low res on lower settings, and it might get the job done, but lots of people don't want this; they have standards and large monitors and need to make use of these new GPUs. I can't exactly see many people buying these cards with a 17" monitor!
CeepieGeepie - Thursday, December 16, 2010 - link
Hi Ryan,

First, thanks for the review. I really appreciate the detail and depth on the architecture and compute capabilities.

I wondered if you had considered using some of the GPU benchmarking suites from the academic community to give even more depth for compute capability comparisons. Both SHOC (http://ft.ornl.gov/doku/shoc/start) and Rodinia (https://www.cs.virginia.edu/~skadron/wiki/rodinia/... look like they might provide a very interesting set of benchmarks.
Ryan Smith - Thursday, December 16, 2010 - link
Hi Ceepie;

I've looked into SHOC before. Unfortunately it's *nix-only, which means we can't integrate it into our Windows-based testing environment. NVIDIA and AMD both work first and foremost on Windows drivers for their gaming card launches, so we rarely (if ever) have Linux drivers available for the launch.

As for Rodinia, this is the first time I've seen it. But it looks like their OpenCL codepath isn't done, which means it isn't suitable for cross-vendor comparisons right now.
IdBuRnS - Thursday, December 16, 2010 - link
"So with that in mind a $370 launch price is neither aggressive nor overpriced. Launching at $20 over the GTX 570 isn’t going to start a price war, but it’s also not so expensive to rule the card out. "

At NewEgg right now:

Cheapest GTX 570 - $509
Cheapest 6970 - $369

$30 difference? What are you smoking? Try $140 difference.
IdBuRnS - Thursday, December 16, 2010 - link
Oops, $20 difference. Even worse.
IdBuRnS - Thursday, December 16, 2010 - link
570...not 580...

/hangsheadinshame
epyon96 - Thursday, December 16, 2010 - link
This was a very interesting discussion to me in the article.

I'm curious if Anandtech might expand on this further in a future dedicated article comparing what NVIDIA is using to AMD.

Are they also more similar to VLIW4 or VLIW5?

Can someone else shed some light on it?
Ryan Smith - Thursday, December 16, 2010 - link
We wrote something almost exactly like what you're asking for in our Radeon HD 4870 review.

http://www.anandtech.com/show/2556

AMD and NVIDIA's compute architectures are still fundamentally the same, so just about everything in that article still holds true. The biggest break is VLIW4 for the 6900 series, which we covered in our article this week.

But to quickly answer your question, GF100/GF110 do not immediately compare to VLIW4 or VLIW5. NVIDIA is using a pure scalar architecture, which has a number of fundamental differences from any VLIW architecture.
dustcrusher - Thursday, December 16, 2010 - link
The cheap insults are nothing but a detriment to what is otherwise an interesting argument, even if I don't agree with you.

As far as the intellect of Anandtech readers goes, this is one of the few sites where almost all of the comments are worth reading; most sites are the opposite- one or two tiny bits of gold in a big pan of mud.

I'm not going to "vastly overestimate" OR underestimate your intellect though- instead I'm going to assume that you got caught up in the moment. This isn't Tom's or Dailytech, a little snark is plenty.
Arnulf - Thursday, December 16, 2010 - link
When you launch an application (say a game), it is likely to be the only active thread running on the system, or perhaps one of very few active threads. CPU with Turbo function will clock up as high as possible to run this main thread. When further threads are launched by the application, CPU will inevitably increase its power consumption and consequently clock down.

While CPU manufacturers don't advertise this functionality in this manner, it is really no different from PowerTune.

Would PowerTune technology make you feel any better if it was marketed the other way around, the way CPUs are ? (mentioning lowest frequencies and clock boost provided that thermal cap isn't met yet)
