Original Link: http://www.anandtech.com/show/4209/amds-radeon-hd-6990-the-new-single-card-king
AMD's Radeon HD 6990: The New Single Card King
by Ryan Smith on March 8, 2011 12:01 AM EST
The AMD Radeon HD 6990, otherwise known as Antilles, is a card we have been expecting for some time now. In what’s become a normal AMD fashion, when they first introduced the Radeon HD 6800 series back in October, they also provided a rough timeline for the rest of the high-end members of the family. Barts would be followed by Cayman (6950/6970), which would be followed by the dual-GPU Antilles (6990).
AMD’s original launch schedule at the time was to have the whole stack out the door by the end of 2010 – Antilles would be the last product, likely to catch Christmas before it was too late. What ended up happening however is that Cayman didn’t make it out until the middle of December, which put those original plans on ice. So we ended up closing the year with the 6800 series and the single-GPU members of the 6900 series, but AMD did not launch a replacement for their flagship dual-GPU card, leaving AMD’s product stack in an odd place where their top card was a 5000 series card compared to the 6000 series occupying everything else.
So while we’ve had to wait longer than we anticipated for Antilles/6990, the wait has finally come to an end. Today AMD is launching their new flagship card, retiring the now venerable 5970 and replacing it with a new dual-GPU monster powered by AMD’s recently introduced VLIW4 design. Manufactured on the same 40nm process as the GPUs in the 5970, AMD has had to go to some interesting lengths to improve performance here. And as we’ll see, it’s going to be a doozy in more ways than one.
|Specification||AMD Radeon HD 6990||AMD Radeon HD 6970||AMD Radeon HD 6950||AMD Radeon HD 5970|
|Memory Clock||1.25GHz (5.0GHz data rate) GDDR5||1.375GHz (5.5GHz data rate) GDDR5||1.25GHz (5.0GHz data rate) GDDR5||1GHz (4GHz data rate) GDDR5|
|Memory Bus Width||2x 256-bit||256-bit||256-bit||2x 256-bit|
|Transistor Count||2x 2.64B||2.64B||2.64B||2x 2.15B|
|Manufacturing Process||TSMC 40nm||TSMC 40nm||TSMC 40nm||TSMC 40nm|
For the Radeon HD 5970, AMD found themselves in an interesting position: with the 5000 series launching roughly 6 months ahead of NVIDIA’s 400 series of GPUs, they already had a lead in getting products out the door. Furthermore NVIDIA never completely responded to the 5970, forgoing dual-GPU entirely with the 400 series. The 5970 was the undisputed king of video cards; no single card was more powerful. Thus given a lack of direct competition, how AMD can follow up on the 5970 is a matter of great interest.
But before we get too far ahead of ourselves, let’s start with the basics. The Radeon HD 6990 is AMD’s new flagship card, based on a pair of Cayman (VLIW4) GPUs mounted on a single PCB. AMD has clocked the GPU at 830MHz and the GDDR5 memory at 1250MHz (5GHz data rate). The card comes with 4GB of RAM, which due to the internal CrossFire setup of the card reduces the effective RAM capacity to 2GB, the same as AMD’s existing 6900 cards.
Starting with the 5970, TDP limits and the laws of physics began limiting what AMD could do with a dual-GPU card; unlike the 4870X2, the 5970 wasn’t clocked quite high enough to match a pair of 5870s. The delta between the 5970 and the 5870 came down to the 5970 being 125MHz slower on the core and 200MHz (800MHz data rate) slower for its RAM. In practice this reduced 5970 performance to near-5850CF levels. For the 6990 this gap still exists, but it’s much smaller this time. At 830MHz the 6990 is only 50MHz (5.7%) slower than the 6970, while the 5GHz memory takes a bigger hit as it’s 500MHz (9%) slower than the 6970. As a result at stock settings the 6990 is closer to being a dual-GPU 6970 than the 5970 was a dual-GPU 5870; there is one exception we will see however. Meanwhile the 6990’s GPUs are fully enabled, so all 1536 SPs and 32 ROPs per GPU are available, making the only difference between the 6990 and 6970 the clockspeeds.
Compared to the 5970, the official idle TDP is down some thanks to Cayman’s better power management, leading to an idle TDP of 37W. Meanwhile under load we find our first doozy: the card’s TDP at default clocks is 375W (this is not a typo), and like the 5970 AMD has built it to take even more. Whereas the 5970 stayed within PCI-Express specifications at default clocks, the 6990 makes no attempt to do so, and as such at 375W is the most power hungry card to date.
AMD will be launching the 6990 at $699. Officially this is $100 more expensive than the 5970 at its launch, however the 5970 was virtually never available at this price until very late in the card’s lifetime. $700 does end up being much closer to both the 5970’s historical price and its price relative to AMD’s top single-GPU part (5870), which was $700 and approximately twice the cost respectively. With a more stable supply of GPUs and stronger pressure from NVIDIA we’d expect prices to stick closer to their MSRP this time around, but at the top there’s not a lot of pressure to keep prices from rising. Meanwhile AMD has not provided any hard numbers for availability, but $700 cards are not high volume products. We’d expect availability to be a non-issue.
With the launch of the 6990 AMD’s high-end product stack is fully fleshed out. At the top will be the 6990, followed by the 6970, the 6950 2GB, and the 6950 1GB. The astute among you will notice that the average price of the 6970 is less than half that of the 6990, and as a result a 6970 CrossFire setup is cheaper than the 6990. At the lowest price we’ve seen for the 6970, we could pick up 2 of them for $640, which will put the 6990 in an interesting predicament of being a bit more expensive and a bit slower than the 6970 in CrossFire.
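The price comparison above is simple arithmetic, and a short Python sketch makes it explicit. The prices are the ones quoted in the text; the constant and function names are our own and purely illustrative.

```python
# Prices quoted in the article, in US dollars. Names are illustrative only.
HD6990_MSRP = 700
HD6970_LOWEST_SEEN = 320  # $640 for two cards, as quoted above


def dual_card_premium(dual_gpu_price: int, single_gpu_price: int) -> int:
    """Return how much more one dual-GPU card costs than two single-GPU cards."""
    return dual_gpu_price - 2 * single_gpu_price


if __name__ == "__main__":
    pair_cost = 2 * HD6970_LOWEST_SEEN
    premium = dual_card_premium(HD6990_MSRP, HD6970_LOWEST_SEEN)
    print(f"Two 6970s: ${pair_cost}; 6990 premium: ${premium}")
```

At the lowest 6970 price we have seen, the 6990 carries a $60 premium over the CrossFire pair; at the upper end of the 6970 range ($340 each) the gap shrinks to $20.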
|March 2011 Video Card MSRPs|
|$700||Radeon HD 6990|
|$320-$340||Radeon HD 6970|
|$249-$269||Radeon HD 6950 2GB|
|$230-$250||Radeon HD 6950 1GB|
GeForce GTX 560 Ti
|$219||Radeon HD 6870|
|$160-$170||Radeon HD 6850|
Meet The 6990
If you recall our coverage on the 5970, we found a few areas where AMD was lacking. The cooling on the 5970 was sufficient to run the GPUs even at 5870 clocks and voltages, however the cooling on the VRMs was lacking, leading to real world programs triggering the VRM thermal protection mechanism; and while this was within safety guidelines, it’s not a comfortable place to be for long term operation. This ultimately led to us ruling out the 5970OC as a 100% reliable product, and recommending the 5970 solely at stock speeds.
The design of the 6990 in turn reads very much like a response to our findings in true engineering fashion. Furthermore for the 6990 AMD not only had to take a look at the 5970’s weaknesses, but also how to handle an even greater power load. The result is that the 6990 is distinctly different from the 5970 before it.
Compared to the 5970, the 6990 is ever so slightly shorter, thanks in large part to the fact that the 6000 series casing is more squared off compared to the 5000 series’ tapered design. As a result it comes in at 11.5” for the PCB (the same as the 5970), and with casing a full 12” long compared to the 5970’s 12.16”. This means that the 6990 has effectively the same space requirements as the 5970, cooling notwithstanding.
It’s the fan, however, that is going to catch the most attention, and this is where we dovetail in to cooling. The 5970’s traditional blower had its strengths and weaknesses, the strengths being that blowers are relatively forgiving about a case’s ability to exhaust hot air, and the weaknesses being that the GPU (and VRMs) closest to the fan received better cooling than the farther GPU. The VRMs proved to be particularly problematic, as they could overheat well before the GPUs did and AMD does not spin up their fans based on VRM temperatures.
Correcting for this and at the same time allowing for even greater heat dissipation, the rear blower design is out. Its replacement is a design that we’ve seen in 3rd party cards before such as the Asus ARES 5870X2, but not in a reference design: a center-mounted fan/blower with a GPU to each side. The difference is critical and indeed cannot be overstated: a rear blower channels most hot air outside of the case, while a center-mounted blower effectively splits the card in two, with one GPU + supporting chips being exhausted outside of the case, and the 2nd GPU + supporting chips being exhausted inside the case. The design is still enclosed, so everything goes out either the front or back of the card while fresh air is pulled in the center.
With the replacement of the blower, so has gone the heatsink. The 5970’s single large vapor chamber + heatsink design has been replaced in favor of a segmented heatsink, further driving home the concept that the 6990 is closer to 2 video cards sharing 1 PCB than it is 2 GPUs on one card. Each heatsink in turn is connected to the GPU via its own vapor chamber, resulting in the GPUs being fully isolated from each other as far as cooling is concerned.
Even the thermal paste connecting the GPUs to the vapor chambers has been changed for the 6990: AMD has replaced traditional paste with a phase change material. Phase change materials (pastes/pads of material that melt and solidify based on temperature) are nothing new, however they’re still exotic; material similar to what AMD is using is not as readily available as paste is. AMD even went so far as to suggest that reviewers not directly disassemble their 6990s as it would require a new application of phase change paste in order to achieve the same efficiency as the original material. The net result of all of this by AMD’s numbers is that the phase change material is 8% better than the regular paste they’ve been using.
Rounding out our focus on cooling is the VRMs, which have been relocated in order to correct for the 5970’s limited VRM cooling capabilities. The VRMs and controllers are now at the center of the board – now they’re cooled before the GPUs or RAM modules are. The profoundness of this is twofold: not only is it an improvement on the 5970, but with the 6990’s higher power consumption VRM cooling is even more important. As with the 6970, voltage regulation is supplied by Volterra MOSFETs and controllers.
All told, while the 5970 was designed to handle and dissipate 400W of heat, the 6990 is officially designed for 450W. In practice, at its limits in our test rig this is closer to 500W. To handle and dissipate that much heat in roughly 72 cubic inches of space is nothing short of amazing.
Meet The 6990, Cont
Moving on from cooling, let’s discuss the rest of the card. From a power perspective, the 6990 is fed by 2 PCIe 8pin sockets on top of the PCIe bus power. At 150W + 150W + 75W this adds up to the 375W limit of the card. As was the case on the 5970, any increase in power consumption will result in exceeding the specifications for PCIe external power, requiring a strong power supply to drive the card. 375W is in and of itself outside of the PCIe specification, and we’ll get to the importance of that in a bit. Meanwhile as was the case on the 5970, at default clocks the GPUs on the 6990 are undervolted to help meet the TDP. AMD is running the 6990 GPUs at 1.12v here, binning Cayman chips to get the best GPUs needed to run at 830MHz at this lower voltage.
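As a quick sanity check on the power budget described above, the connector ratings can be added up in a few lines of Python. The wattages come from the text; the dictionary and function names are ours, chosen only for illustration.

```python
# Rated power of each source feeding the 6990, in watts (figures from the text).
POWER_SOURCES_W = {
    "PCIe 8-pin #1": 150,
    "PCIe 8-pin #2": 150,
    "PCIe slot": 75,
}


def board_power_budget(sources: dict) -> int:
    """Sum the rated power of every source feeding the card."""
    return sum(sources.values())


if __name__ == "__main__":
    total = board_power_budget(POWER_SOURCES_W)
    for name, rating in POWER_SOURCES_W.items():
        # Show each source's share of the total budget.
        print(f"{name}: {rating}W ({rating / total:.0%})")
    print(f"Total: {total}W")
```

The slot contributes only a fifth of the total, so any extra draw beyond the rated 375W has to come from somewhere that is already at its rated limit.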
At the end of the day power has a great deal of impact on GPU performance, so in increasing the performance of the 6990 over the 5970 AMD has played with both the amount of power the card can draw at default settings (which is why we’re at 375W now) and they have been playing with power management. By playing with power management we’re of course referring to PowerTune, which was first introduced on the 6900 series back in December. By capping the power consumption of a card at a set value and throttling the card back if it exceeds it, AMD can increase GPU clocks without having to base their final clocks around the power consumption of outliers like FurMark. The hardest part of course is finding balance – set your clocks too high for a specific wattage and everything throttles which is counterproductive and leads to inconsistent performance, but if clocks are too low you’re losing out on potential performance.
|AMD Radeon HD 6990 PowerTune Throttling|
|Bad Company 2||No|
|Mass Effect 2||No|
|Distributed.net Client||Yes (770MHz)|
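To illustrate the general idea of capping power and throttling clocks, here is a deliberately simplified Python sketch. It is not AMD's PowerTune algorithm, whose internals are not described here. It assumes, purely for illustration, that power scales linearly with core clock at a fixed voltage, and the workload wattages in the example are hypothetical.

```python
def throttled_clock(base_clock_mhz: float, est_power_w: float, limit_w: float) -> float:
    """Toy power cap: return the clock a card could run at under `limit_w`.

    Assumption (ours, not AMD's): power is proportional to clock speed,
    so an over-budget workload is scaled back by limit / estimate.
    """
    if est_power_w <= limit_w:
        return base_clock_mhz  # under the cap: no throttling
    return base_clock_mhz * limit_w / est_power_w


if __name__ == "__main__":
    # Hypothetical workloads: (name, estimated power at 830MHz in watts).
    for name, power in [("typical game", 340.0), ("power virus", 500.0)]:
        clock = throttled_clock(830.0, power, 375.0)
        print(f"{name}: {clock:.1f}MHz")
```

The design point this captures is the trade-off described above: a cap lets the default clock be set for typical workloads, while outliers are pulled back instead of dictating the clock for everything else.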
It’s the increase in power consumption and the simultaneous addition of PowerTune that has allowed AMD to increase GPU clocks by as much as they have over the 5970. Cayman as an architecture is faster than Cypress in the first place, but having a 105MHz core clock advantage really seals the deal. At default settings PowerTune appears to be configured nearly identically on the 6990 as it is on the 6970: FurMark heavily throttles, while Metro and the newly updated Distributed.net client experience slight throttling. The usual PowerTune configuration range of +/- 20% is available, allowing a card in its default configuration to be set between 300W and 450W for its PowerTune limit.
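The 300W to 450W range follows directly from applying the +/- 20% adjustment to the 375W default. A minimal Python sketch of that calculation, with a function name of our own choosing:

```python
def powertune_range(default_limit_w: int, adjust_pct: int = 20) -> tuple:
    """Return the (minimum, maximum) PowerTune limit for a +/- adjust_pct slider."""
    low = default_limit_w * (100 - adjust_pct) / 100
    high = default_limit_w * (100 + adjust_pct) / 100
    return (low, high)


if __name__ == "__main__":
    low, high = powertune_range(375)
    print(f"Default 375W limit: adjustable from {low:.0f}W to {high:.0f}W")
```

Working in whole percentages keeps the arithmetic exact for these values, which is why the adjustment is passed as an integer instead of a fraction.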
While we’re on the subject of PowerTune, there is one thing we were hoping to see that we did not get: dynamic limits based on CrossFire usage. This isn’t a complaint per se as much as it is a pie-in-the-sky idea. Perhaps the biggest downside to a dual-GPU card for performance purposes is that they can’t match a single high-end card in terms of clocks when only a single GPU is in use, as clocks are kept low to keep total dual-GPU power consumption down. One thing we’d like to see in the future is for GPU1 to be allowed to hit standard GPU clocks (e.g. 880MHz) when GPU2 is not in use, with PowerTune arbitrating over matters to keep total power consumption in check. This would allow cards like the 6990 to be as fast as high-end single-GPU cards in tasks that don’t benefit from CrossFire, such as windowed mode games, emulators, GPGPU applications, and games that don’t have a CF profile. We’re just thinking out loud here, but the potential is obvious.
Moving on, as with the 5970 and 2GB 5870 the 6990 is outfitted with 16 RAM chips, 8 per GPU. Half are on the front of the PCB and the other half are on the rear. The card’s backplate provides protection and heat dissipation for the rear-mounted RAM. In one of the few differences from the 6970, the 6990 is using 5GHz GDDR5 instead of 6GHz GDDR5 – our specific sample is using 2Gb Hynix T2C modules. This means the 5GHz stock speed of the card already has the RAM running for as much as it’s spec’d for. Hynix’s datasheets note that 6GHz RAM is spec’d for 1.6v at 6GHz, versus 1.5v at 5GHz for 5GHz RAM. So the difference likely comes down to a few factors: keeping RAM power consumption down, keeping costs down, and any difficulties in running RAM above 5GHz on such a cramped design. In any case we don’t expect there to be much RAM overclocking headroom in this design.
Finally, display connectivity has once again changed compared to both the 5970 and 6970. As Cayman GPUs can only drive 1 dual-link DVI monitor, AMD has dropped the 2nd SL-DVI port and HDMI port in favor of additional mini-DisplayPorts. While all Cayman GPUs (and Cypress/5800 before it) can drive up to 6 monitors, the only way to do so with 1 slot’s worth of display connectors is either through 6 mini-DP ports (ala Eyefinity-6), or through using still-unavailable MST hubs to split DP ports. The 6990 is a compromise in this design – an E6 design requires an expensive DP to DL-DVI adaptor to drive even 1 DL-DVI monitor, while a 5970-like design of 2x DVI + 1 mini-DP doesn’t allow 6 monitors in all cases even with MST hubs. The end result is 1 DL-DVI port for 2560x1600/2560x1440 legacy monitors, and 4 more mini-DP ports for newer monitors. This allows the 6990 to drive 5 monitors today, and all 6 monitors in the future when MST hubs do hit the market.
As with the 5870E6 cards, AMD is going to be stipulating that partners include adapters in order to bridge the DisplayPort adoption. All 6990s will come with 1 passive SL-DVI adapter (taking advantage of the 3rd TMDS transmitter on Cayman), 1 active SL-DVI adapter, and 1 passive HDMI adapter. Between the card’s on-board connectivity options and adapters it’s possible to drive just about any combination short of multiple DL-DVI monitors, including the popular 3 monitor 1080P Eyefinity configuration.
Active SL-DVI Adapter
With all of this said, the change in cooling design and power consumption/heat dissipation does require an additional level of attention towards making a system work, beyond even card length and power supply considerations. We’ve dealt with a number of high-end cards before that don’t fully exhaust their hot air, but nothing we’ve reviewed is quite like the 6990. Specifically nothing we’ve reviewed was a 12” card that explicitly shot out 185W+ of heat directly out of the rear of the card; most of the designs we see are much more open and basically drive air out at all angles.
The critical point is that the 6990 is dumping a lot of hot air in to your case, and that it’s doing so a foot in front of the rear of the case. Whereas the 5970 was moderately forgiving about cooling if you had the space for it, the 6990 will not be. You will need a case with a lot of airflow, and particularly if you overclock the 6990 a case that doesn’t put anything of value directly behind the 6990.
To make a point, we quickly took the temperatures of a 500GB Seagate hard drive in our GPU test rig when placed in the drive cage directly behind the 6990 in PEG slot 1. As a result the 6990 is directly blowing on the hard drive. Note here that normally we have a low-speed 120mm fan directly behind PEG 1, which we have to remove to make room for the 5970 and 6990. All of these tests were run with Crysis in a loop – so they aren’t the highest possible values we could achieve.
|Seagate 500GB Hard Drive Temperatures|
|Radeon HD 6990||37C|
|Radeon HD 6990OC||40C|
|Radeon HD 5970||31C|
At default clocks for the 6990 our hard drive temperature is 37C, while overclocked this reaches 40C. Meanwhile if we replace the 6990 with the 5970, this drops to 31C. Replace that with a pair of 6970s in CrossFire and our 120mm fan, and that drops even more to 27C. So the penalty for having a dual-exhaust card like the 6990 as far as our setup is concerned is 6C compared to a long directed card like the 5970, and 10C compared to a pair of shorter 6970s and an additional case fan. The ultimate purpose of this exercise is to illustrate how placing a hard drive (or any other component) behind the 6990 is a poor choice. As many cases do have hard drive bays around this location, you’d be best served putting your drives as far away from a 6990 as possible.
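For readers who want to reproduce the comparison, the temperature penalties quoted above reduce to simple differences. The readings are the ones from our test; the dictionary keys and function name are just illustrative labels.

```python
# Hard drive temperatures from our test rig, in degrees Celsius.
HDD_TEMPS_C = {
    "Radeon HD 6990": 37,
    "Radeon HD 6990OC": 40,
    "Radeon HD 5970": 31,
    "Radeon HD 6970 CF + 120mm fan": 27,
}


def penalty_vs(baseline: str, card: str = "Radeon HD 6990") -> int:
    """Return how many degrees hotter the drive runs with `card` than with `baseline`."""
    return HDD_TEMPS_C[card] - HDD_TEMPS_C[baseline]


if __name__ == "__main__":
    for baseline in ("Radeon HD 5970", "Radeon HD 6970 CF + 120mm fan"):
        print(f"6990 vs {baseline}: +{penalty_vs(baseline)}C")
```

The same helper shows the overclocked card widening the gap further, for example `penalty_vs("Radeon HD 5970", "Radeon HD 6990OC")` gives 9C.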
And while we haven’t been able to test this, as far as air overclocking is concerned the best step may be to take this one step further and turn the closest air intake in to an exhaust. A number of cases keep an intake at the front of the case roughly in-line with PEG slot 1; turning this in to an exhaust would much more effectively dissipate the heat that the 6990 is throwing in to the case, and this may be what AMD was going for all along. Video cards that vent air out of the front and the rear of the case, anyone?
Ultimately the 6990 is a doozy, the likes of which we haven’t seen before. Between its greater power consumption and its dual-exhaust cooler, it requires a greater attention to cooling than any other dual-GPU card. Or to put this another way, it’s much more of a specialized card than the 5970 was.
Once Again The Card They Beg You To Overclock
One of the 5970’s unique attributes was that while at default clocks and voltages it was designed to meet a 300W TDP, it was designed for much more. AMD’s design called for it to be able to handle 400W, the amount of power needed to operate the card as if it were a true dual-GPU 5870. In practice this fell a bit short due to VRM temperatures, but for most games this was a workable solution.
In AMD’s case it has paid off well enough that with the 6990 they are returning with the same philosophy, differing only in implementation details. AMD’s engineers have gone and built a card that can run its GPUs at 6970-like clocks (880MHz); you just have to do some overclocking to get there. And while AMD’s legal department will tell you that no overclock is guaranteed and that doing so voids any warranty, the design and the binning of GPUs virtually ensures every card can hit 6970 core clocks.
AMD refers to the 6990 as a 450W card. At default clocks it has a rated TDP of 375W but the cooler itself is designed to take 450W, which is why AMD went with so many design changes such as the dual-exhaust system and the exotic thermal compound. The result is that the card can generally keep itself cool at 6970 speeds, and in fact does a better job of this than the 5970 did at 5870 speeds. The catch here is that you will need sufficient cooling to deal with the heat the card dumps in to the case, 225W+ to be precise. Thus while the 6990 is already a card with specialized cooling requirements, the 6990 when overclocked is even more so. With FurMark our numbers point to our card drawing more than 500W, so 6990 overclocking is not for the faint of heart.
With the 5970 AMD enabled overclocking by producing a quick & dirty utility to bump the card’s voltage up to 5870 voltages, which then could be used with Overdrive to achieve the desired clocks. This certainly worked but it wasn’t smooth and it wasn’t consistent - not every vendor used AMD’s utility (particularly if they had their own in-house overclocking utility), and if you did use AMD’s utility then you had to set the voltage and do overclocking on every boot. AMD is not about to include voltage controls in the Catalyst Overdrive controls, so they’ve gone for a better way.
The 5970's ATI Overvolt Tool
Do you recall the BIOS selection switch on the 6900 series cards? On those cards, it was to allow users to safely flash new BIOSes to their cards while having a fallback BIOS to work from. The 6990 takes this concept and repurposes it to fit the 6990’s unique overclocking needs. The switch is still there, but instead of identical BIOSes the switch controls which performance BIOS is used. Position 2, the default position, is a write-protected BIOS that runs the 6990 at its default core clock of 830MHz and default core voltage of 1.12v. Position 1 is a write-enabled BIOS that runs the 6990 at the same core speeds and voltages as the 6970: 880MHz core clock and 1.175v core voltage; meanwhile memory clocks remain unchanged at sub-6970 speeds of 5GHz. AMD calls it the AUSUM switch (Antilles Unlocking Switch for Uber Mode); ignore the name, focus on the fact that the switch is what controls the core voltage on the 6990.
6950/6970 BIOS Switch
From a usability standpoint, the benefit of using the BIOS switch for this is that it’s much more consistent across vendors and it doesn’t require any software interaction. Just flip the switch and you’re done. However we would still count on seeing some vendors taking things a step further and offering fine-tuned voltage control for the card.
Along with the increase in the core clock and the voltage, AMD’s documentation also lists the PowerTune limit as being increased for uber mode. AMD tells us that the limit here is 450W (540W with +20% PT), however in our testing we were unable to hit that limit. Every test up to and including FurMark ran unthrottled, and we peg power consumption there at over 500W. If indeed there isn’t a PowerTune limit this is good news for extreme overclockers, but it means if you use uber mode PowerTune won’t be there to save your bacon if you push too hard.
|Radeon HD 6990 BIOS Switch|
|Position||Core Clock/Voltage||PowerTune Limit||Write-Protected|
|2 (Default)||830MHz/1.12v||375W||Yes|
|1 (Uber)||880MHz/1.175v||450W||No|
As far as additional overclocking is concerned we did not push our sample beyond uber clockspeeds. In uber mode we were already hitting GPU temperatures of 94C in FurMark, which is as high as we’re willing to go. Better cooling of course would allow easier overclocking, and with an Overdrive limit of 1.2GHz in uber mode, the card should vanish in a puff of smoke well before Overdrive becomes a limit.
Radeon HD 6990 Overdrive Limits
Of course all of this talk of overclocking cannot be held without saying something about power consumption. With 2 8pin PCIe power sockets the 6990 is already drawing the full 150W per 8pin line the PCIe specification calls for; uber mode exceeds this, potentially by quite a bit. AMD has engineered the 6990 to pull most excess power from the PCIe power sockets and not the slot itself (since the slot is the weakest link), so a notably overbuilt power supply would be necessary. AMD hasn’t provided any official guidance here, but a well-built power supply offering 20A (240W) per 8pin line with an independent rail for each line would seem to be the minimum to get away with uber mode.
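The 20A and 240W figures are tied together by the 12V rail that PCIe power connectors run on (power = voltage x current). A short Python sketch of the conversion, with helper names that are ours:

```python
RAIL_VOLTAGE_V = 12.0  # PCIe auxiliary power connectors are fed from the 12V rail


def watts_from_amps(amps: float, volts: float = RAIL_VOLTAGE_V) -> float:
    """Power available from a line that can supply `amps` at `volts`."""
    return amps * volts


def amps_from_watts(watts: float, volts: float = RAIL_VOLTAGE_V) -> float:
    """Current needed to deliver `watts` at `volts`."""
    return watts / volts


if __name__ == "__main__":
    print(f"In-spec 8pin line (150W): {amps_from_watts(150):.1f}A")
    print(f"Suggested 20A line: {watts_from_amps(20):.0f}W")
```

In other words, an 8pin line running at its rated 150W only needs 12.5A, so a 20A line leaves a sizable margin for uber mode.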
Ultimately however, as we’ll see the 6990OC doesn’t have nearly as large a performance bump to it as the 5970OC did. Thanks to the much higher default clocks, the 6990OC’s core clock is only 6% faster and the memory clock is the same, versus 17% faster on the core clock and 20% faster on the memory clock for the 5970. As a result you get much better performance out of the box, but unlike the 5970 flipping the magic switch doesn’t significantly increase the card’s performance this time around. So unlike the 5970 if you want to significantly improve performance over stock, you’ll have to do some equally significant custom overclocking on the 6990.
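The percentages above can be checked from the clocks themselves. In the Python sketch below the 6990 figures are the ones given in this article; the 5970 stock core clock of 725MHz is derived from the 125MHz gap to the 5870's 850MHz, and the memory clocks are the 1GHz and 1.2GHz base clocks of the 5970 and 5870. The function and variable names are our own.

```python
def percent_gain(stock_mhz: float, overclocked_mhz: float) -> float:
    """Return the overclock's gain over stock as a percentage."""
    return (overclocked_mhz - stock_mhz) / stock_mhz * 100


# (stock, overclocked) clocks in MHz.
CLOCKS_MHZ = {
    "6990 core": (830, 880),
    "6990 memory": (1250, 1250),
    "5970 core": (725, 850),
    "5970 memory": (1000, 1200),
}

if __name__ == "__main__":
    for name, (stock, oc) in CLOCKS_MHZ.items():
        print(f"{name}: +{percent_gain(stock, oc):.0f}%")
```

This reproduces the 6% core gain for the 6990OC against 17% core and 20% memory gains for the 5970OC.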
Finally, in a close examination of a minor detail, unlike on the 6950/6970 it’s clear that AMD doesn’t intend for this switch to be easily accessible. The switch on the 6990 is slightly recessed, not by enough to make it hard to hit but enough that you’ll never accidentally hit it. Flipping the switch would need to be a conscious action, which makes sense given the fact that doing so would void the card’s warranty.
Update: After publication of this article there's been some slight confusion on the matter of the AUSUM switch and the warranty. AMD's official guidance is that overclocking the card voids the warranty, which means that AUSUM/uber mode is warranty breaking. Technically speaking just flipping the switch doesn't break the warranty - it's operating the card that does - but retail cards will come with a sticker over the switch warning users of the potential danger of overclocking and that it violates the warranty. So breaking the sticker to flip the switch will for all practical purposes violate the warranty. Specific policies may differ by partner, however.
PCI-Express Compliance: Does It Even Matter?
For a while now we’ve been under the impression that video card size and power consumption was ultimately capped by the PCI-Express specification. At present time the specification and its addendums specify normal (75W), 150W, 225W, and 300W PCIe card operation. In the case of 300W cards in particular this is achieved through 75W from the PCIe slot, 75W from a 6pin PCIe power connector, and 150W from an 8pin PCIe power connector. As the name implies, the PCIe specification also defines what the 6pin and 8pin power connectors are supposed to be capable of, which is where 75W and 150W come from respectively.
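One way to think about the tiers listed above is as a lookup: given a card's power draw, find the smallest tier that covers it. The Python sketch below does that using only the wattages from the text; the function name and the idea of treating this as a lookup are ours, not anything defined by the specification.

```python
# Card power tiers named in the PCIe specification and its addendums, in watts.
PCIE_POWER_TIERS_W = (75, 150, 225, 300)


def smallest_covering_tier(board_power_w: float):
    """Return the lowest tier that covers `board_power_w`, or None if none does."""
    for tier in PCIE_POWER_TIERS_W:
        if board_power_w <= tier:
            return tier
    return None  # the card exceeds every defined tier


if __name__ == "__main__":
    for card, power in [("300W card", 300), ("Radeon HD 6990", 375)]:
        tier = smallest_covering_tier(power)
        label = f"{tier}W tier" if tier is not None else "outside the specification"
        print(f"{card}: {label}")
```

A 300W card built from the slot plus one 6pin and one 8pin connector lands exactly on the top tier, while the 6990 at 375W falls off the end of the table.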
Altogether the biggest, most powerful card configuration in the PCIe specification allows for a 12.283” long, triple-wide card that consumes 300W. To date we’ve never seen a card exceed the physical specifications, but we’ve seen several cards exceed the electrical specifications. This includes cards such as the 5970 and some overclocking-oriented 5870s that were designed to handle more than 300W when overclocked, and even more exotic cards such as the Asus ARES 5870X2 that simply drew more than 300W from the get-go. We have yet to see a reference design from AMD/NVIDIA however that exceeds any part of the PCIe specification by default.
So it has been clear for some time now that cards can exceed the PCIe specifications without incurring the immediate wrath of an army of lawyers, but at the same time this doesn’t establish what the benefits or losses are of being or not being PCIe compliant. To have a reference design exceed the PCIe specifications is certainly a new mark for the GPU industry, so we decided to get right to the bottom of the matter and get an answer to the following question: does PCI-Express compliance matter?
To answer this question we went to two parties. The first of which was of course AMD, whose product is in question. AMD’s answer basically amounts to a polite deflection: it’s an ultra-enthusiast card that at default settings does not exceed the power available by the combination of the PCIe slot and PCIe power connectors. Furthermore, as they correctly note, the 6990 is not the first card to ship at over 300W, as the ARES and other cards were drawing more than 300W a year ago. It’s a polite answer that glosses over the fact that no, the 6990 isn’t technically PCIe compliant.
To get a second opinion on the matter we went straight to the source: The Peripheral Component Interconnect Special Interest Group (PCI-SIG), which is the industry group that defines the PCIe standard and runs the workshops that test for product compliance. The PCI-SIG’s member list is virtually everyone in the computing industry, including AMD, NVIDIA, and Intel, so everyone has some level of representation with the group.
So what does the PCI-SIG think about cards such as the 6990 which exceed the PCIe specification? In a nutshell, they don’t directly care. The group’s working philosophy is closer to approving cards that work than it is about strictly enforcing standards, so their direct interest in the matter is limited. The holy grail of the PCI-SIG is the PCI Express Integrators List, which lists all the motherboards and add-on cards that have passed compliance testing. The principal purpose of the list is to help OEMs and system integrators choose hardware, relying on the list and by extension PCI-SIG testing to confirm that the product meets the PCIe standards, so that they can be sure it will work in their systems.
The Integrators List is more or less exclusively OEM focused, which means it has little significance for niche products such as the 6990 which is split between end-user installation and highly customized OEM builds. The 6990 does not need to be on the list to be sold to its target market. Similarly the 5970 was never submitted/approved for listing, and we wouldn’t expect the 6990 to be submitted either.
It is worth noting however that while the PCI-SIG does have power specifications, they’re not a principal concern of the group and they want to avoid doing anything that would limit product innovation. While the 300W specification was laid out under the belief that a further specification would not be necessary, the PCI-SIG does not even test for power specification compliance under their current compliance testing procedures. Conceivably the 6990 could be submitted and could pass the test, leading to it being labeled PCIe compliant. Of course it’s equally conceivable that the PCI-SIG could start doing power compliance testing if it became an issue…
At the end of the day as the PCI-SIG is a pro-compliance organization as opposed to being a standard-enforcement organization, there’s little to lose for AMD or their partners by not being compliant with the PCIe power specifications. By not having passed compliance testing the only “penalty” for AMD is that they cannot claim the 6990 is PCIe compliant; funnily enough they can even use the PCIe logo (we’ve already seen a Sapphire 6990 box with it). So does PCIe compliance matter? For mainstream products PCIe compliance matters for the purposes of getting OEM sales; for everything else including niche products like the 6990, PCIe compliance does not matter.
The launch drivers for the 6990 will be a preview version of Catalyst 11.4, which has been made available today, with the final version launching sometime in April. Compared to the earlier drivers we’ve been using, performance in most of our games is up by at least a few percent, particularly in CrossFire. For launching a dual-GPU card like the 6990, the timing couldn’t be better.
Along with these performance improvements AMD is also throwing a few new features in to the Catalyst Control Center, making it the first time they’ve touched it since the introduction of the new design in January. Chief among these features – and also timed to launch with the 6990 today - is 5x1 portrait Eyefinity mode. Previously AMD has supported 3x1 and 3x2, but never anything wider than 3 monitors (even on the Eyefinity 6 series).
The 6990 is of course perfectly suited for the task as it's able to drive 4 + 1 monitors without any miniDP MST hubs, and indeed the rendering capabilities of this card are wasted a good deal of the time only driving one monitor. Other cards will also support 5x1P, but only E6 cards can work without a MST hub at the moment. Notably, in spite of requiring one fewer monitor than 3x2 Eyefinity this is easily the most expensive option for Eyefinity yet, as portrait modes require monitors with wide vertical viewing angles to avoid color washout – you’d be hard pressed to build a suitable setup with cheap TN monitors like you can with the landscape modes.
The other big change for power users is that AMD is adding a software update feature to the Catalyst Control Center, which will allow users to check for driver updates from within the CCC. It will also have an automatic update feature, which will check for driver updates every 2 weeks. At this point there seems to be some confusion over at AMD over whether this will be enabled by default or not – our drivers have it enabled by default, while we were initially told it would be disabled. From AMD’s perspective having the auto update feature enabled improves the user experience by helping to get users on newer drivers that resolve bugs in similarly new games, but at the same time I could easily see this backfiring with users by being one more piece of software nagging for an update every month.
Finally, AMD is undergoing a rebranding (again), this time for the Catalyst Control Center. If you use an AMD CPU + AMD consumer GPU, the Catalyst Control Center is now the AMD VISION Engine Control Center. If you use an Intel CPU + AMD consumer GPU it’s still the Catalyst Control Center. If you use a professional GPU (regardless of CPU), it’s the Catalyst Pro Control Center.
Due to the timing of this launch we haven’t had an opportunity to do in-depth testing of Eyefinity configurations. We will be updating this article with Eyefinity performance data in the next day. In the meantime we have our usual collection of single monitor tests.
|CPU:||Intel Core i7-920 @ 3.33GHz|
|Motherboard:||Asus Rampage II Extreme|
|Chipset Drivers:||Intel 188.8.131.525 (Intel)|
|Hard Disk:||OCZ Summit (120GB)|
|Memory:||Patriot Viper DDR3-1333 3 x 2GB (7-7-7-20)|
|Video Cards:||AMD Radeon HD 6990, AMD Radeon HD 6970, AMD Radeon HD 6950 2GB, AMD Radeon HD 6870, AMD Radeon HD 6850, AMD Radeon HD 5970, AMD Radeon HD 5870, AMD Radeon HD 5850, AMD Radeon HD 5770, AMD Radeon HD 4870X2, AMD Radeon HD 4870, NVIDIA GeForce GTX 580, NVIDIA GeForce GTX 570, NVIDIA GeForce GTX 560 Ti, NVIDIA GeForce GTX 480, NVIDIA GeForce GTX 470, NVIDIA GeForce GTX 460 1GB, NVIDIA GeForce GTX 460 768MB, NVIDIA GeForce GTS 450, NVIDIA GeForce GTX 295, NVIDIA GeForce GTX 285, NVIDIA GeForce GTX 260 Core 216|
|Video Drivers:||NVIDIA ForceWare 262.99, NVIDIA ForceWare 266.56 Beta, NVIDIA ForceWare 266.58, AMD Catalyst 10.10e, AMD Catalyst 11.1a Hotfix, AMD Catalyst 11.4 Preview|
|OS:||Windows 7 Ultimate 64-bit|
Kicking things off as always is Crysis: Warhead, still one of the toughest games in our benchmark suite. Even 3 years since the release of the original Crysis, “but can it run Crysis?” is still an important question, and for 3 years the answer was “no.” However as we’ll see the 6990 changes that: full Enthusiast settings at a playable framerate is finally in the grasp of a single card.
It should come as no surprise that with the 6990, AMD has hit a few different important marks on Crysis for a single card thanks to the card’s near-6970CF performance. As far as our traditional 2560 benchmark goes, the 6990 cracks 60fps, meaning we can finally play Crysis at a perfectly smooth framerate at 2560 with our tweaked settings on what is more or less a single video card. Perhaps more importantly however, performance is to the point where Crysis in full enthusiast mode is now a practical benchmark. Thanks in big part to the extra VRAM here, the 6990 tops the 5970 by nearly 30%, coming in at 42.8fps. This is still a bit low for a completely smooth framerate, but it is in fact playable, which is more than we can say for the 5970.
Overall Crysis does a good job setting the stage here for most of our benchmark suite: the performance of the card is consistently between the 6950CF and 6970CF, hovering much closer to the former. Compared to NVIDIA’s offerings the 6990 is solidly between the GTX 580 and GTX 580SLI, owing to the fact that NVIDIA doesn’t have a comparable card. The GTX 580SLI is faster, but the 580 is also still the fastest single-GPU card on the market, meaning it commands a significant price premium.
Overclocking to uber mode, however, shows only minimal gains: the theoretical maximum gain is only 6% and the real world benefit is less, so uber mode alone will never have a big payoff.
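The 6% ceiling follows directly from the ratio of the two core clocks. As a minimal sketch, assuming the 830MHz default core clock from the specification table and the 880MHz uber mode clock (the function name is purely illustrative):

```python
def max_clock_gain(base_mhz: float, oc_mhz: float) -> float:
    """Upper bound on the speedup from a core clock bump, as a fraction.

    This assumes performance scales perfectly with core clock, which is
    the best case; memory clocks are unchanged in uber mode.
    """
    return oc_mhz / base_mhz - 1.0


# Assumed clocks: 830MHz default, 880MHz uber mode.
gain = max_clock_gain(830.0, 880.0)
print(f"Theoretical maximum gain: {gain:.1%}")
```

Since memory clocks stay put, any game that leans on memory bandwidth will land below this bound.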
As far as minimum framerates are concerned the story is similar. For some reason the 6990 underperforms the 6950CF here by a frame or two per second, which given the 6990’s mostly superior specs leads us to believe that it’s a limitation of PCIe bus bandwidth. Meanwhile we can clearly see the benefits of more than 1GB of VRAM per GPU here: the 6990 walks all over the 5970.
Up next is BattleForge, Electronic Arts’ free to play online RTS. As far as RTSes go this game can be quite demanding, and this is without the game’s DX11 features.
With BattleForge we see the 6990 fall in to a similar hole as the rest of the 6900 series when it comes to performance relative to NVIDIA’s 500 series: sometimes they do well, and sometimes NVIDIA has an advantage in the game; this is the latter.
In the meantime this is one of our better examples of why memory bandwidth matters, as not only does the 6990OC gain little on the 6990, but even the 6990OC trails the 6970CF by 8%. Clearly performance relative to the 6970CF is going to depend on how limited a game is by memory bandwidth. In the worst case, we’re looking at 6950CF-like memory bandwidth, and as a result 6950CF-like performance. The 6990’s advantage over the 5970 also shrinks here as a result, dropping to 17%.
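The bandwidth gap is easy to work out from the memory specifications. The following is a rough sketch using the data rates and bus widths from the spec table; the helper name is our own, and the result is peak theoretical bandwidth per GPU rather than a measured figure:

```python
def peak_bandwidth_gbs(data_rate_ghz: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s for a single GPU.

    data_rate_ghz is the effective GDDR5 data rate per pin (e.g. 5.0),
    bus_width_bits is the width of that GPU's memory bus.
    """
    return data_rate_ghz * bus_width_bits / 8


# Per-GPU figures: each GPU in a 6990 or a CF pair has its own 256-bit bus.
bw_6990 = peak_bandwidth_gbs(5.0, 256)  # same memory clock as the 6950
bw_6970 = peak_bandwidth_gbs(5.5, 256)

advantage = bw_6970 / bw_6990 - 1.0
print(f"6990 per GPU: {bw_6990:.0f} GB/s")
print(f"6970 per GPU: {bw_6970:.0f} GB/s")
print(f"6970 advantage: {advantage:.0%}")
```

Uber mode raises the core clock but not the memory clock, so the 6990OC inherits the same per-GPU bandwidth as the stock card.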
The next game on our list is 4A Games’ Metro 2033, their tunnel shooter released last year. In September the game finally received a major patch resolving some outstanding image quality issues with the game, finally making it suitable for use in our benchmark suite. At the same time a dedicated benchmark mode was added to the game, giving us the ability to reliably benchmark much more stressful situations than we could with FRAPS. If Crysis is a tropical GPU killer, then Metro would be its underground counterpart.
Being a particularly shader heavy game, Metro is one of the better games for both the 6990 and the Radeon 6900 series in general. At 2560 it’s within 5% of the GTX 580 SLI, and compared to the 5970 the 6990 has a 24% performance advantage.
Ubisoft’s 2008 aerial action game, HAWX, is one of the less demanding games in our benchmark suite, particularly for the latest generation of cards. However it’s fairly unique in that it’s one of the few flying games of any kind that comes with a proper benchmark.
In opposition to Metro 2033, HAWX is not shader-heavy by any stretch of the imagination. Here it’s texturing and ROP performance that carry the day, making architectural differences the deciding factor. Compared to NVIDIA’s lineup this means the 6990 is behind even the GTX 570 SLI, while an apparent lack of memory bandwidth once again has it hugging the 6950CF. Surprisingly the lead over the 5970 is still strong at 21%.
The other new game in our benchmark suite is Civilization 5, the latest incarnation in Firaxis Games’ series of turn-based strategy games. Civ 5 gives us an interesting look at things that not even RTSes can match, with a much weaker focus on shading in the game world, and a much greater focus on creating the geometry needed to bring such a world to life. In doing so it uses a slew of DirectX 11 technologies, including tessellation for said geometry and compute shaders for on-the-fly texture decompression.
In January we saw NVIDIA’s performance significantly improve in Civilization V. Since then AMD seems to have found their footing, albeit not as well as NVIDIA had. AMD’s primary gain here seems to be in CrossFire versus a boost in base performance, which works out well enough for the 6990’s launch. Interestingly Civ 5 is still so shader bound here on AMD’s cards that the 6990OC’s performance boost almost perfectly matches the increase in the core clockspeed. Still, at the end of the day the 6990 and the rest of the Radeons are still well outgunned by NVIDIA.
Battlefield: Bad Company 2
The latest game in the Battlefield series – Bad Company 2 – remains one of the cornerstone DX11 games in our benchmark suite. As BC2 doesn’t have a built-in benchmark or recording mode, here we take a FRAPS run of the jeep chase in the first act, which as an on-rails portion of the game provides very consistent results and a spectacle of explosions, trees, and more.
Bad Company 2 ends up being another game heavily reliant on memory bandwidth, so once again the 6970CF takes a solid lead over the 6990, effectively matching the 10% memory bandwidth difference with 10% more performance. Interestingly the 6990 still has an edge on the 6950CF, showing that there’s more than 1 bottleneck with Bad Company 2. Meanwhile compared to the 5970 the 6990 once again takes a respectable lead of 18%. AMD is fortunate here though that the Radeons have an advantage in this game, as giving up 10% in performance almost anywhere else would be enough to fall below the GTX 570 SLI.
Our waterfall minimum framerate benchmark ends up being quite similar in ordering to our general chase benchmark. The 6990 is noticeably behind the 6970CF, and in fact it’s just a bit worse this time as the 6950CF edges out even the 6990OC. Memory bandwidth and PCIe bandwidth, take your pick, both play a part here.
STALKER: Call of Pripyat
The third game in the STALKER series continues to build on GSC Game World’s X-Ray Engine by adding DX11 support, tessellation, and more. This also makes it another one of the highly demanding games in our benchmark suite.
STALKER is another game that AMD does well in, leading to some impressive results for the 6990. Memory bandwidth limitations are enough to keep the 6990 below the 6970CF, but it’s not so severe as to allow the 6950CF to catch up. Meanwhile at 2560 the 5970 is absolutely obliterated here, thanks in large part to the combination of shader performance and VRAM size; the 6990 is no less than 57% faster than the 5970! Compared to NVIDIA’s lineup the 6990 is also quite impressive, beating even the GTX 580 SLI by 16%.
Codemasters’ 2009 off-road racing game continues its reign as the token racer in our benchmark suite. As the first DX11 racer, DiRT 2 makes pretty thorough use of DX11’s tessellation abilities, not to mention still being the best looking racer we have ever seen.
DIRT 2 is another one of those games where the 6990 just can’t quite hang on. Without overclocking it can’t keep parity with the 6950CF, let alone the 6970CF. It’s also one of the smaller returns versus the 5970, coming in at only 13% faster. And the less said about the performance relative to NVIDIA’s cards, the better.
Mass Effect 2
Electronic Arts’ space-faring RPG is our Unreal Engine 3 game. While it doesn’t have a built-in benchmark, it does let us force anti-aliasing through driver control panels, giving us a better idea of UE3’s performance at higher quality settings. Since we can’t use a recording/benchmark in ME2, we use FRAPS to record a short run.
Mass Effect 2’s results end up mirroring DIRT 2 here, which isn’t a good thing for AMD. Once again the 6990 has more in common with the 6950CF than it does the 6970CF, and overclocking will not solve the problem. Meanwhile NVIDIA easily takes the game with even the GTX 560 SLI. The one bright spot here is that the 6990’s advantage over the 5970 has recovered, pushing out to 21%.
Finally among our benchmark suite we have Wolfenstein, the most recent game to be released using the id Software Tech 4 engine. All things considered it’s not a very graphically intensive game, but at this point it’s the most recent OpenGL title available. It’s more than likely the entire OpenGL landscape will be thrown upside-down once id releases Rage later this year.
Even at 2560 Wolfenstein is very close to being CPU limited when we’re working with SLI/CF. There’s just enough room for the 6990 to once again fall behind the 6950CF, however even the all-powerful 6970CF only ekes out a few more frames per second. In these conditions the test is less about the hardware and more about the software.
Moving on from our look at gaming performance, we have our customary look at compute performance. With AMD’s architectural changes from the 5000 series to the 6000 series, focusing particularly on compute performance, this can help define the 6990 compared to the 5970. However at the same time, neither benchmark here benefits from the dual-GPU design of the 6990 very much.
Our first compute benchmark comes from Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes.
New as of Catalyst 11.4, AMD’s performance in our Civilization V DirectCompute benchmark now scales with CrossFire at least marginally. This leads to the 6990 leaping ahead of the 6970, however the Cayman architecture/compiler still looks to be sub-optimal for this test. The 5970 has a 10% lead even with its core clock disadvantage. This also lets NVIDIA and their Fermi architecture establish a solid lead over the 6990, even without the benefit of SLI scaling.
Our second GPU compute benchmark is SmallLuxGPU, the GPU ray tracing branch of the open source LuxRender renderer. While it’s still in beta, SmallLuxGPU recently hit a milestone by implementing a complete ray tracing engine in OpenCL, allowing them to fully offload the process to the GPU. It’s this ray tracing engine we’re testing.
There’s no CrossFire scaling to speak of in SmallLuxGPU, so this test is all about the performance of GPU1, and its shader/compute performance at that. At default clocks this leads to the 6990 slightly trailing the 6970, while overclocked this leads to perfect parity with it. Unfortunately for AMD this is a test where NVIDIA’s focus on compute performance has really paid off; coupled with the lack of CF scaling, even a $240 GTX 560 Ti can edge out the $700 6990.
Ultimately the take-away from this is that for most desktop GPU computing workloads, the benefit of multiple GPU cores is still unrealized. As a result the 6990 shines as a gaming card, but is out of its element as a GPU computing card unless you have an embarrassingly parallel task to feed it.
Power, Temperature, and Noise: How Loud Can One Card Get?
Last but not least as always is our look at the power consumption, temperatures, and acoustics of the Radeon HD 6990 series. This is an area where AMD has traditionally had an advantage, as their small die strategy leads to less power hungry and cooler products compared to their direct NVIDIA counterparts. Dual-GPU cards like the 6990 tend to increase the benefits of lower power consumption, but heat and noise are always a wildcard.
AMD continues to use a single reference voltage for their cards, so the voltages we see here represent what we’ll see for all reference 6900 series cards. In this case voltage also plays a big part, as PowerTune’s TDP profile is calibrated around a specific voltage.
|Radeon HD 6900 Series Voltage|
|6900 Series Idle||6970 Load||6990 Load|
|0.9v||1.175v||1.12v|
The 6990 idles at the same 0.9v as the rest of the 6900 series. At load under default clocks it runs at 1.12v thanks to AMD’s chip binning, which is a big part of why the card uses as little power as it does for its performance. Overclocked to 880MHz, however, the core voltage goes to 1.175v, the same as the 6970. Power consumption and heat generation will shoot up accordingly, exacerbated by the fact that PowerTune is not in use here.
The 6990’s idle power is consistent with the rest of the 6900 series. At 171W it’s at parity with the 6970CF, while the 6990’s lower idle TDP shows up as a 9W advantage over the 5970.
With the 6990, load power under Crysis gives us our first indication that TDP alone can’t be used to predict total power consumption. With a 375W TDP the 6990 should consume less power than 2x200W 6950CF, but in practice the 6950CF setup consumes 21W less. Part of this comes down to the greater CPU load the 6990 can create by allowing for higher framerates, but this doesn’t completely explain the disparity. Compared to the 5970 the 6990 is also much higher than the TDP alone would indicate; the gap of 113W exceeds the 75W TDP difference. Clearly the 6990 truly is a more power hungry card than the 5970.
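To make the mismatch concrete, here is a small sketch that sets the TDP-predicted gaps against the gaps we measured at the wall. The figures are the ones quoted above; the variable names are our own, and since wall power includes the CPU and the rest of the system this is only a rough comparison:

```python
# Rated board power (TDP) in watts, as quoted in the text.
tdp_6990 = 375
tdp_6950cf = 2 * 200

# 6990 vs. 6950CF: what TDP predicts vs. what we measured under Crysis.
predicted_vs_6950cf = tdp_6990 - tdp_6950cf  # negative: 6990 should draw less
measured_vs_6950cf = 21                      # 6990 actually draws 21W more

# 6990 vs. 5970: the quoted TDP difference vs. the measured gap.
predicted_vs_5970 = 75
measured_vs_5970 = 113

print(f"vs 6950CF: predicted {predicted_vs_6950cf:+d}W, "
      f"measured {measured_vs_6950cf:+d}W, "
      f"off by {measured_vs_6950cf - predicted_vs_6950cf}W")
print(f"vs 5970:   predicted {predicted_vs_5970:+d}W, "
      f"measured {measured_vs_5970:+d}W, "
      f"off by {measured_vs_5970 - predicted_vs_5970}W")
```

In both comparisons the 6990 lands roughly 40W above where its TDP alone would put it.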
Meanwhile overclocking does send the power consumption further up, this time to 544W. This is better than the 6970CF at the cost of some performance. Do keep in mind though that at this point we’re dissipating 400W+ off of a single card, which will have repercussions.
Under FurMark PowerTune limits become the defining factor for the 6900 series. Even with PT triggering on all three 6900 cards, the numbers have the 375W 6990 drawing more than the 2x200W 6950CF, this time by 41W, with the 6970CF in turn drawing 51W more. All things considered the 6990’s power consumption is in line with its performance relative to the other 6900 series cards.
As for our 6990OC, overclocked and without PowerTune we see what the 6990 is really capable of in terms of power consumption and heat. 684W is well above the 6970CF (which has PT intact), and is approaching the 570/580 in SLI. We don’t have the ability to measure the power consumption of solely the video card, but based on our data we’re confident the 6990 is pulling at least 500W – and this is one card with one fan dissipating all of that heat. Front and rear case ventilation starts looking really good at this point.
Along with the 6900 series’ improved idle TDP, AMD’s dual-exhaust cooler makes its mark on idle temperatures versus the 5970. At 46C the 6990 is warmer than our average card but not excessively so, and in the meantime it’s 7C cooler than the 5970 which has to contend with GPU2 being cooled with already heated air. A pair of 6900 cards in CF though is still going to beat the dual-exhaust cooler.
When the 5970 came out it was warmer than the 5870CF; the 6990 reverses this trend. At stock clocks the 6990 is a small but measurable 2C cooler than the 6970CF, which as a reminder we run in a “bad” CF configuration by having the cards directly next to each other. There is a noise tradeoff to discuss, but as far as temperatures are concerned these are perfectly reasonable. Even the 6990OC is only 2C warmer.
At stock clocks FurMark does not significantly change the picture. If anything it slightly improves things as PowerTune helps to keep the 6990 in the middle of the pack. Overclock however and the story changes. Without PowerTune to keep power consumption in check that 681W power consumption catches up to us in the form of 94C core temperatures. It’s only a 5C difference, but it’s as hot as we’re willing to let the 6990 get. Further overclocking on our test bed is out of the question.
Finally there’s the matter of noise to contend with. At idle nothing is particularly surprising; the 6990 is an iota louder than the average card, presumably due to the dual-exhaust cooler.
And here’s where it all catches up to us. The Radeon HD 5970 was a loud card, the GTX 580 SLI was even louder, but nothing tops the 6990. The laws of physics are a cruel master, and at some point all the smart engineering in the world won’t completely compensate for the fact that you need a lot of airflow to dissipate 375W of heat. There’s no way around the fact that the 6990 is an extremely loud card; and while games aren’t as bad as FurMark here, it’s still noticeably louder than everything else on a relative basis. Ideally the 6990 requires good airflow and good noise isolation, but the former makes the latter difficult to achieve. Water cooled 6990s will be worth their weight in gold.
Wrapping things up, there’s little we need to say that wasn’t already evident in our graphs. The 6990 is a halo card and succeeds at such – by packing two Cayman GPUs on a single card, it is without question the fastest video card on the market today. At the same time there is and always will be a distinction between single-GPU cards and dual-GPU cards; the former is a threat to the latter, but the latter is rarely a threat to the former.
When we reviewed the Radeon HD 5970 back in 2009, the principal question we ran in to was whether it would be better to have a 5970 or two 5850s in CrossFire, given that the two were nearly identical in performance. The answer was that CrossFire was superior so long as you had a power supply with four readily available PCIe power plugs. With the Radeon HD 6990, we find ourselves asking the same question and arriving at an even more direct answer. With but a trio of exceptions, the 6990 doesn’t make sense compared to a pair of cards in CrossFire.
The reasons for this are numerous. The 6990 is so close to the 6950CF in performance that on average at 2560 the two are identical. It’s only in Bad Company 2 and Stalker that we see the 6990 take an advantage, which is then negated by anything from Civilization V to DIRT 2. Meanwhile the 6950CF is cooler, significantly quieter, and less power hungry than the 6990. And finally the 6950CF is cheaper: we can snag a pair of cards for $520, versus $700 for the 6990. Likewise, for $640 you can have a pair of 6970s and enjoy performance at 2560 roughly 8% ahead of the 6990, and that setup is still quieter than the 6990.
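The value argument can be put in numbers. This is a rough sketch, assuming the prices quoted above and treating average 2560 performance as 1.00 for both the 6990 and the 6950CF and 1.08 for the 6970CF; the dictionary layout and names are just for illustration:

```python
# (price in USD, average 2560 performance relative to the 6990)
setups = {
    "6990": (700, 1.00),
    "6950CF": (520, 1.00),  # identical to the 6990 on average
    "6970CF": (640, 1.08),  # roughly 8% ahead of the 6990
}

# Dollars paid per unit of 6990-equivalent performance; lower is better.
cost_per_perf = {name: price / perf for name, (price, perf) in setups.items()}

for name, cost in sorted(cost_per_perf.items(), key=lambda item: item[1]):
    print(f"{name:>7}: ${cost:.0f} per unit of performance")
```

On these assumptions both CrossFire setups come out ahead of the 6990 on cost per unit of performance, before noise and power are even considered.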
This leads us to our exceptions, and why we believe the 6990 is truly a niche product.
- Quad-CrossFire; this is going to be the highest performing AMD solution at this time, power and noise be damned. This requires a motherboard with PEG slots three slots apart (lest you choke the first 6990), but it’s achievable.
- 5x1P Eyefinity. At five-panel resolutions you’re going to need a pair of powerful GPUs, but given AMD’s CrossFire Eyefinity limitations at the time only 2 cards can directly drive five monitors: the 5870 Eyefinity 6, and the 6990. Ultimately MST hubs will allow the 6970CF to do this, but for the time being the 6970CF is limited by the number of displays a single card can drive without a hub.
- If you absolutely cannot fit two cards in your computer. This is often the traditional domain of the dual-GPU card, but the 6990’s cooling and power requirements put this in jeopardy. Most micro-ATX cases would simply not be suitable due to cooling needs, while motherboards with two or more PEG slots are increasingly common. There are very few computers with a single PEG slot that could power and cool the 6990 without a complete overhaul in the first place.
Dual-GPU cards have always been a niche product, but the 6990 really takes this and runs with it. There’s no significant power/noise savings to be found by consolidating two GPUs on to a single card, and as we said earlier with the dual-exhaust cooler the 6990 is effectively two video cards on one PCB. This isn’t a bad thing – the 6990 is the world’s fastest video card after all – but it drives the card in to some very specific niches. If you fall in to these niches, then the 6990 is certainly the card for you. At 22% faster than the 5970 it isn’t a massive performance boost, but it certainly has earned its place.
But if you don't fall into these niches, then there’s nothing the Radeon HD 6990 offers you today that the 6950/6970 didn’t offer in CrossFire mode yesterday. In this case while AMD’s king card is an engineering marvel for its ability to handle so much power in a confined space, as a product on the market it won’t be quite as significant as the title implies.