Radeon 5970 Overclocking: The VRM Temperature Bottleneck
Date: November 25th, 2009
Author: Ryan Smith
 
 

In our Radeon HD 5970 review, we ran into some issues when trying to overclock the card to the 5870’s speeds of 850MHz/1200MHz. At the time we attributed this to the VRMs, while AMD suggested that it was cooling related and that we should manually increase the fan speed.

As it turns out, we were both right; we just didn’t have the tools at the time to properly identify and isolate the issue. Late last week we got our hands on a beta version of Everest Ultimate, which added preliminary support for the 5970. With that, we could read and log the voltages and temperatures of the various components of the 5970 and pin down the cause.

From that, we’ve discovered a few interesting things about the 5970. Let’s start things off with the cooler removed from the 5970.

We’ve gone ahead and circled the VRMs in red. There are 9 altogether; 6 on the right side, and 3 near the left side of the card. We aren’t able to track down what each specific VRM is connected to, but we believe that each GPU is attached to 3, each GPU’s RAM is attached to 1, and finally the PLX PCIe bridge is attached to 1. Regardless, pay attention to the location of these VRMs for later discussion.

As we previously noted in our 5970 review, when overclocked the card was throttling down in two cases. One was when running OCCT/FurMark, members of AMD’s “power virus” list by virtue of the fact that they put a card under a greater load than AMD believes to be realistically possible. Our 5800 series cards never throttled under these applications, so to see the 5970 throttle here was a bit surprising but not wholly unexpected.

The second case was using Distributed.net’s pre-release GPU client for use with AMD’s GPUs. Since this is a real program, this was absolutely unexpected, and is what instigated our look into the matter.

In both cases, the key was the overall load on the GPU cores, and consequently the amount of power required to drive the GPUs. When a bank of VRMs reached roughly 120C (this being averaged among all the VRMs in that bank), overcurrent protection kicked in and throttling began. In the case of FurMark this was very quick and even at 100% fan speed the cooler could still not keep the VRMs cool enough to allow full-time 850MHz operation. The Dnet client on the other hand was much slower to ramp up, and we ultimately found that 70% fan speed was enough to keep our hottest bank of VRMs below the threshold, stabilizing at 116C.
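The throttling condition described above can be sketched in a few lines. This is only an illustration of the behavior we observed, not AMD's actual firmware logic: the function names are ours, the per-VRM readings are hypothetical, and the only figures taken from our testing are the roughly 120C threshold and the fact that it appears to apply to the average of a bank.

```python
from statistics import mean

# Approximate threshold observed in our logs; the exact trip point
# used by the card is an assumption.
VRM_THROTTLE_C = 120.0


def bank_should_throttle(vrm_temps_c, threshold_c=VRM_THROTTLE_C):
    """Return True when the average temperature of one VRM bank
    reaches the throttle threshold."""
    return mean(vrm_temps_c) >= threshold_c


def any_bank_throttles(banks, threshold_c=VRM_THROTTLE_C):
    """Return True if at least one bank in the mapping would throttle."""
    return any(bank_should_throttle(t, threshold_c) for t in banks.values())


# Hypothetical per-VRM readings; only the bank averages (120C and 94C)
# correspond to numbers from our testing.
banks = {
    "gpu1_bank": [119.0, 120.0, 121.0],
    "gpu2_bank": [93.0, 94.0, 95.0],
}

if __name__ == "__main__":
    for name, temps in banks.items():
        state = "throttle" if bank_should_throttle(temps) else "ok"
        print(f"{name}: avg {mean(temps):.1f}C -> {state}")
```

Note that a single hot bank is enough to pull the whole card down, which is why the cooler bank's temperatures do not help here.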

Notably, during this whole period the GPU cores themselves stayed at or under 94C, which is still a few degrees below their own throttle point. AMD’s fan quickly ramped up, and in our testing it only needed to go to 59%. So if the cores did get hotter there was still plenty of room to go with the fan.

This brings us to our first point of concern for the 5970, which is the fan speed. Clearly it’s adequate for the GPU cores themselves, but we cannot find any proof that the fan speed is adjusted based on the temperature of the RAM or the VRMs. If the fan speed were to ramp up in the case of near-critical temperatures in the VRMs, then the Dnet client likely would have run without an issue the first time, as this would have pushed the fan to 70%.
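To make the distinction concrete, here is a rough sketch of the two fan policies being contrasted: one driven only by GPU core temperature, and one that also considers the VRMs. Everything here is hypothetical, including the linear fan curve and its breakpoints; we do not know how the 5970's fan controller is actually programmed.

```python
def fan_percent(temp_c, idle_c, max_c, min_pct=25.0, max_pct=100.0):
    """Map a temperature onto a fan duty cycle with a simple linear ramp.
    The curve shape and all breakpoints are illustrative assumptions."""
    if temp_c <= idle_c:
        return min_pct
    if temp_c >= max_c:
        return max_pct
    frac = (temp_c - idle_c) / (max_c - idle_c)
    return min_pct + frac * (max_pct - min_pct)


def core_only_policy(core_temps_c, vrm_temps_c):
    """Fan speed follows the hottest GPU core; VRM readings are ignored."""
    return fan_percent(max(core_temps_c), idle_c=50.0, max_c=110.0)


def core_and_vrm_policy(core_temps_c, vrm_temps_c):
    """Fan speed follows whichever component asks for more airflow."""
    core_pct = fan_percent(max(core_temps_c), idle_c=50.0, max_c=110.0)
    vrm_pct = fan_percent(max(vrm_temps_c), idle_c=70.0, max_c=125.0)
    return max(core_pct, vrm_pct)


if __name__ == "__main__":
    cores = [93.0, 94.0]   # core temperatures in the range we logged
    vrms = [120.0, 94.0]   # hottest bank sitting at the throttle point
    print("core only :", round(core_only_policy(cores, vrms), 1), "%")
    print("core + VRM:", round(core_and_vrm_policy(cores, vrms), 1), "%")
```

Under the second policy, a bank of VRMs approaching its limit would raise the fan speed on its own, even while the cores still have headroom.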

We asked AMD about whether the fan speed is affected by VRM temperatures at all, but we didn’t receive a response. This isn’t particularly surprising since post-launch periods are a good time to take a vacation and there’s a holiday this week for their American employees, but it means we couldn’t get a confirmation of our assumption. So for the time being, we’re working on the assumption that only GPU core temperatures drive fan speed.

It also bears mentioning that the 5970 gets quite a bit louder when the fan goes up to 70%. We went ahead and captured the noise data for it at 70% and 100%, which is in the chart below. At the 70% fan speeds needed to run the Dnet client at 5870 speeds, you’re looking at 70dB, which is quite a bit louder than the fan noise at stock speeds. It is in fact uncomfortably loud by this point.

Our second point of concern goes beyond just the fan, and is the overall cooling of the VRMs. When we looked at our Everest logs after running the Dnet client, we noticed something interesting with respect to which VRMs were overheating. The VRM bank attached to GPU 1 was some 25C hotter under load, but it wasn’t GPU 1 that was the hottest. GPU 2 was consistently a couple of degrees warmer. We don’t believe this to be in error, so to understand why this is, we refer back to our disassembled 5970.

As the fan is on the right, the right side of the heatsink that the vapor chamber dumps its heat into is going to be cooler than the left side, because the left side is effectively being cooled by air that the right side has already heated. The heatsink and vapor chamber mitigate this somewhat, but the right side of the card – and consequently the right GPU – should be cooler than the left side. This leads us to believe that GPU 1 is the right GPU, and GPU 2 is the left GPU.

This is important since if we look at the VRMs, the VRMs feeding GPU 2 sit under the vapor chamber, while the VRMs feeding GPU 1 (along with the RAM and PCIe bridge) do not. We haven’t been able to fully dissect the cooler, but the VRMs on the right side sit right underneath the fan, and we don’t believe there to be a significant heatsink in the metal bar that sits above them. So while the VRMs feeding GPU 2 are being cooled by the vapor chamber, the VRMs feeding GPU 1 are only being cooled by the heat dissipation properties of a metal bar.

From this, we can conclude that the VRM banks are receiving wildly different amounts of cooling. The VRMs on the right side are not cooled nearly as well as those on the left and as a result the card is being held back by the VRMs on that right side. To that extent, we believe that if all the VRMs received the same level of cooling as the VRMs on the left side, then the card would have no problem maintaining 5870 speeds while running the Dnet client, and likely even FurMark. It’s also worth noting that all the 5800 series cards share the design of placing the VRMs under a metal bar under the fan, but the 5970 seems to suffer more for it compared to the 5800 series.

Finally, there’s the matter of whether this is even going to matter for most users. After catching the VRMs hitting 120C under the Dnet client, we went looking at other applications and games to see where else the card was throttling. The result of that inquiry was that we couldn’t find anything else that could match the Dnet client in total load. The Dnet client is a bit of a special case here, since crunching encryption keys makes exceedingly good use of the 5-wide SIMD design in the 2000-5000 series cards. When we took a look at something similar to the Dnet client, in this case the Folding@Home GPU client, we couldn’t break 100C. The significance of that result remains to be seen though, since the Folding@Home GPU client hasn’t been optimized for the 5800/5900 series yet like the Dnet client has. Our ultimate concern is that this card is going to repeatedly fall flat on its face at 5870 speeds with more GPGPU applications as OpenCL and DirectCompute take off and the number of such applications grows.

Radeon HD 5970 Temperatures
Test                 GPU 1 Temp   GPU 1 VRM Temp   GPU 2 Temp   GPU 2 VRM Temp
FurMark              89C          110C             91C          83C
Dnet Client          87C          101C             88C          77C
FurMark OC           91C          120C             94C          100C
Dnet Client OC       93C          120C             94C          94C
Crysis Warhead OC    87C          96C              89C          74C
STALKER OC           85C          96C              88C          72C
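For readers who want to work with these numbers directly, the short sketch below puts the table into a dictionary and computes two derived figures: how much hotter the GPU 1 VRM bank runs than the GPU 2 bank, and how much headroom the hottest bank has before the roughly 120C throttle point. The data is from the table; the helper names are ours.

```python
THROTTLE_C = 120  # approximate VRM throttle point observed in our testing

# (GPU 1 temp, GPU 1 VRM temp, GPU 2 temp, GPU 2 VRM temp), all in Celsius
TEMPS = {
    "FurMark": (89, 110, 91, 83),
    "Dnet Client": (87, 101, 88, 77),
    "FurMark OC": (91, 120, 94, 100),
    "Dnet Client OC": (93, 120, 94, 94),
    "Crysis Warhead OC": (87, 96, 89, 74),
    "STALKER OC": (85, 96, 88, 72),
}


def vrm_delta(test):
    """How much hotter the GPU 1 VRM bank is than the GPU 2 VRM bank."""
    _, vrm1, _, vrm2 = TEMPS[test]
    return vrm1 - vrm2


def vrm_headroom(test):
    """Degrees remaining before the hottest VRM bank hits the throttle point."""
    _, vrm1, _, vrm2 = TEMPS[test]
    return THROTTLE_C - max(vrm1, vrm2)


if __name__ == "__main__":
    for test in TEMPS:
        print(f"{test:18s} VRM delta {vrm_delta(test):3d}C  "
              f"headroom {vrm_headroom(test):3d}C")
```

The two overclocked stress cases are the only rows with no headroom left, while both games leave a margin of more than 20C.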

Meanwhile in games it was a similar story. Crysis and the STALKER benchmark are two of the most demanding games we’ve tested on the 5970, and in both cases the VRMs again peaked at near 100C. As games aren’t going to hammer the SIMDs like GPGPU applications do, the power load from games should be lower than for GPGPU applications.

As far as our opinion on the 5970 is concerned though, this doesn’t change anything. While we’ll buy AMD’s “power virus” rationale for FurMark and OCCT, the Dnet client is not a power virus. It’s a real application, one that AMD even used in their 5800 presentation back in September. Thus as far as we’re concerned, our 5970 is only good for 775MHz, the lowest clock speed where the VRMs stayed under 120C. Granted, AMD will never officially promise that the 5970 can reach 5870 speeds, but based on how the 5970 was promoted and presented the fact of the matter is that the card can’t meet its advertised capabilities – this card is clearly meant for 5870 clockspeeds.

With that in mind, we’ll end on two thoughts. The first of which is that in spite of our experience, for pure gaming scenarios we don’t have any data to call into doubt the idea that the card can run at 5870 speeds without throttling. So long as you only intend to play games, those speeds should be fine.

Our second thought is that cards from vendors with custom overclocking utilities will be better able to maintain 5870 speeds at all times. These are cherry-picked chips, so there’s no reason why they absolutely need a 1.1625V core voltage to run at 850MHz; we suspect that they could do with less. Since voltage is our main enemy here, even a small drop in voltage should have a noticeable impact on VRM temperatures. But you’re going to need a utility with a full suite of voltage options to take advantage of that.
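As a rough illustration of why a small voltage drop matters, the sketch below uses the standard first-order approximation that dynamic power scales with the square of voltage times frequency. It ignores leakage and VRM efficiency, and the 1.10V figure is purely a hypothetical example, not a voltage we have tested.

```python
def relative_dynamic_power(v_new, v_old, f_new=1.0, f_old=1.0):
    """Ratio of dynamic power after a voltage/frequency change, using the
    first-order approximation P ~ C * V^2 * f (leakage is ignored)."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


if __name__ == "__main__":
    v_old = 1.1625  # core voltage used for 850MHz in our overclocking
    v_new = 1.10    # hypothetical lower voltage, same clock speed
    ratio = relative_dynamic_power(v_new, v_old)
    print(f"{v_old}V -> {v_new}V at the same clock: "
          f"about {(1 - ratio) * 100:.1f}% less dynamic power")
```

Under these assumptions, a drop of about 5% in voltage cuts dynamic power by roughly 10%, and that power is what the VRMs have to deliver.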


44 Comments
Noticed one small typo by cgramer, 76 days ago
From the article:

"It’s also worth noting that all the 5800 series cards share the design of placing the VRMs under a metal bar under the fan, but the 5970 seems to suffer more for it compared to the 5970."

I assume you meant to compare it to another card, rather than to itself?

Good article otherwise... Interesting to keep an eye on the latest generation of cards as I get ready to build my next system. Thanks!

RE: Noticed one small typo by cgramer, 76 days ago
Never mind -- corrected before I even managed to post my comment. :-)

Could they have designed the card better? by Pneumothorax, 76 days ago
Looks like the design of placing the GPU1 VRM's on the right side of the board would require the fan to be placed further back (and elongating a card that barely fits) to adequately cool them.

Attempts to cool and retest by The0ne, 76 days ago
Are there plans to attempt to cool the VRMs to actually see what the outcome will be?

RE: Attempts to cool and retest by Ryan Smith, 76 days ago
Not at this time I'm afraid. I don't have a practical alternative method to cool the VRMs on the right side.

RE: Attempts to cool and retest by mindless1, 75 days ago
Assuming they used vias under the chips (surely they would or else they might overheat... oh, wait...), you might try pointing a fan at the back of the card though these days fancy mobo NB 'sinks often get in the way.

RE: Attempts to cool and retest by titanmiller, 74 days ago
How about adding a twist to the fan's spokes so that it directs air downward like a traditional fan?

so 2 5850's? by Gutcheck2009, 76 days ago
So your better off getting 2 5870's or 2 5850's. I have 3 5850's and they all do 900/1200 on air. I have EK blocks on them and they never get above 55C for the cores. The VRM's are also cooled by the block. EK says they do get hot, so I am glad I am on water.

VRAM temps and Borderlands by Pjmcnally, 76 days ago
In your article you suggest that VRAM temps wont effect most normal users of this card. That may be true but there are several threads over on the Borderlands PC users forum complaining about overheating based shut downs on ATI 4XXX and 5XXX series cards. At least one has specifically linked the problem to the VRAM. If you have a copy of the game lying around maybe you can check it out.

RE: VRAM temps and Borderlands by Ryan Smith, 76 days ago
Just to be clear, I'm talking about the Voltage Regulator Module (VRM), not the Video RAM (VRAM).

RE: VRAM temps and Borderlands by Pjmcnally, 76 days ago
Yeah...Sorry about that. Reading comprehension fail on my part. Next time I will pay more attention.

Please label the parts being discussed by sciwizam, 76 days ago
I think it would be easier to understand, if you can label the parts being discussed (GPU1 or 2, and which set of VRMs are feeding which GPU) in the 2nd set of pics.

RE: Please label the parts being discussed by sciwizam, 76 days ago
strikeout "feeding", "associated" with

Bad contact? by Mr Perfect, 76 days ago
It looks like one of the VRMs on the right didn't even touch the heatsink, it has no thermal goop on it and there is n oindentation on the TIM either. Maybe better contact would help? Is Anand's card similar or wose for contact? Could be a quality control thing.

RE: Bad contact? by greywood, 76 days ago
Second that - - from the photo, it looks like at least two VRM's at the bottom right side and all three at the top center are making little if any contact with the TIM. Might be interesting to clean off the generic goop, re-apply some AS5 (or such) then really "cinch-down" the HSF and try re-testing?

RE: Bad contact? by Rajinder Gill, 76 days ago
I think Ryan scraped that off to read off the FET part numbers. I asked him the FET model numbers because I wanted to find out if ATI had used 45amp slaves.



RE: Bad contact? by Ryan Smith, 76 days ago
Bingo. The clean VRM is the one I scraped clean to get the model number.

RE: Bad contact? by Rajinder Gill, 75 days ago
Each VGPU FET is spec'd at 40 amps. So 120 amps tops per GPU.

regards
Raja




RE: Bad contact? by mindless1, 75 days ago
Spec'd at 40 amps if they had adequate copper under them, ++ heatsinking on top

Arnother VRM chip? by Zok, 76 days ago
Maybe my eyes are deceiving me, but it looks like there is a 10th VRM chip at the top of the board, where it makes contact with similar white thermal goop used on the other 9. Could you comment on this Ryan?

RE: Arnother VRM chip? by Ryan Smith, 76 days ago
Your eyes deceive you. There are only 9 VRMs. I think I see what you mean, that's just another small chip.

RE: Arnother VRM chip? by largon, 75 days ago
Fact is, there are not just 9, and not even 10 but 13 pieces of VRMs of various models of Volterra slave chipsets and other non-Volterra parts on the card.

Here's a complete list:
- 3× Volterra VT1157SFs silkscreened as "U71, U72 and U73" located above the PLX, coupled with the lone horizontal CPLA-3-50 choke "L22/L23". These feed the GPU silkscreened as "U1". Controlled by the VT1165MF marked as "U70".
- 3× Volterra VT1157SFs silkscreened as "U87, U88 and U89" coupled with the vertical CPLA-3-50 choke "L26/L27". located to the right of GPU on the right. These feed the GPU silkscreened as "U2". Controlled by the VT1165MF marked as "U86".
- 2× Volterra VT1157SFs silkscreened as "U76 and U77" coupled with the vertical CPLA-2-50 choke "L21". These feed RV870 GPU "uncore" I/O (GDDR5 ctrl?), 2 phases shared for both GPUs. Controlled by the VT1165MF marked as "U75".
- 1× Volterra VT232WF silkscreened as "U60" coupled with a 1005R1 choke "L14", located above GPU "U2". This feeds the GDDR5 chips their VDD or VDDQ (no way to tell which just by looking).
- 1× Volterra VT232WF silkscreened as "U60" coupled with a 1005R1 choke "L15", located below the vertical CPLA-3-50. This feeds the GDDR5 chips their VDD or VDDQ (no way to tell which just by looking).
- 2× Infineon n-channel MOSFETs (042N03LS + 119N03S) silkscreened as "Q1" and "Q2", respectively, coupled with a 1R5 choke "L1" located on the bottom edge of the PCB, below and left of the vertical CPLA-3-50. These are for the PLX bridge chip.
- 1× AOSMD (AO)Z1024DI low-power integrated buck regulator coupled with a 4R7 choke "L33" located above the vertical CPLA-2-50 choke "L21". I don't know what purpose it serves.

RE: Arnother VRM chip? by largon, 75 days ago
Correcting a few typing errors:

- 1× Volterra VT232WF silkscreened as "U61" coupled with a 1005R1 choke "L15", located below the vertical CPLA-3-50. This feeds the GDDR5 chips their VDD or VDDQ (no way to tell which just by looking).

- 2× Infineon n-channel MOSFETs (042N03LS + 119N03S) silkscreened as "Q2" and "Q1", respectively, coupled with a 1R5 choke "L1" located on the bottom edge of the PCB, below and left of the vertical CPLA-3-50. These are for the PLX bridge chip.

RE: Arnother VRM chip? by mindless1, 75 days ago
Thank you!

Bad Part? by Sahrin, 76 days ago
So far as I have read, AT is the only website that has had any issues getting the 5970 to 850/1200 on both GPUs. Has anyone considered the possibility that AT just got a bad sample? I know AMD 'built these to overclock' - but stock is stock is stock. I'd be interested to see what your testing environment is, and if there's any impact from that on temperatures. From the articles I have read, the clock increases have been pretty painless.

Is AT's part a review sample or a retail card?

RE: Bad Part? by Ryan Smith, 76 days ago
It's a review sample, but it's identical to a real card.

And yes, it's always possible we got a bad sample. But bear in mind that throttling probably isn't going to show up in a game. So unless the other guys ran FurMark/OCCT/Dnet and were specifically looking for it, they would have never noticed. I'd be surprised if their cards' VRMs didn't get similarly hot.

Yep by PorscheRacer, 75 days ago
I had this problem on my R600 reference design, though instead of the throttling I would get lockups, hangs or bluescreens when severely overclocked. After removing the backplate and front cover/heatsink, I removed the old thermal pads and applied liquid metal and AS5 for the VRMs and AS5 for the die. Everest showed a significant drop in temperatures and less ramping up of the fan. VRMs are unknown as Everest doesn't report this... I think ATITool did, but I don't recall that anymore.

Very interesting though, and thanks for the investigative journalism. It's one thing to say, oh well in this benchamrk for some reason the 5970 did poorly, and another to explain why it did.

Video benchmarks by xpclient, 74 days ago
Please I want video encoding, decoding/playback benchmarks. I read in their forums that ATI doesn't use DXVA HD (introduced in Windows 7) on their GPUs but their own API. Intel and Nvidia use DXVA HD.

bad feeling by wh3resmycar, 74 days ago
i have a feeling ati-fanboys all over will perfectly find this article "offending".



Copper bar connected to.. by Hauk, 74 days ago
The copper bar for those VRMs needs to be connected to something. The low grade metal of the cooler body isn't wicking/dissipating heat away from the copper bar fast enough. Slap the best TIM available under the copper bar, you still have a traffic jam ahead..

re by overclocking101, 74 days ago
have you tried a preassure mod at all? i mean just some small plastic washers? this brought my 4870 vrms down by almost 10c!! and they were under a bar just like that im thinking amd is not putting enough preassure for the cooler to make the best contact to be safe not to "cruch or ruin" any of the component but just some small plastic washers help a lot at least in my case they did. just a thought

hmm by jigglywiggly, 73 days ago
I am going to get one(probably), I am wondering, how big is the gap between the vrm and the heatsink? Is it enough to put a goop of thermal paste? I was thinking of using mx-2, it would probably help, I don't want to use their thermal tape.

RE: hmm by Ryan Smith, 73 days ago
I didn't bother measuring it, but it's big enough that you can't use paste. Something sizable needs to be there to connect the metal bar with the VRMs. The VRMs and the memory chips sit much lower than the GPUs themselves do.

interesting discovery by fausto412, 73 days ago
i think ATI will have to address their cooling design and the location of VRM's in future cards. this is a major discovery and i wouldn't buy this card based on those super high VRM temps. i know i probably would not run into issue but components that run that much hotten than usual ones can't last as long. i look forward to an update on this issue and what if anything ATI will do to mitigae the problem.

RE: interesting discovery by Proxicon, 73 days ago
VRM temps were really only an issue when overclocking, otherwise they were running within there temp boundarys. I think thats what the article said maybe I misunderstood.

RE: interesting discovery by Targon, 72 days ago
This is something that most people don't seem to understand about ATI/AMD and NVIDIA. Both companies release reference designs, but NVIDIA does NOT make retail graphics cards, and it is hit and miss if ATI/AMD will release a retail card or not for a given GPU these days.

So, you have those companies that will just copy the reference design, and may reproduce a design flaw in the reference design, and you have those that come up with their own cooling solution which DOES tend to be better than the reference design. Those, such as Sapphire Tech that have their own cooling solutions will hopefully solve this problem and provide a much better experience.


Watercool it by Proxicon, 73 days ago
I think I would rather get two 5870's than this, except currently this is a cheaper alternative by close to $150.00 dollars with the pricing inflation ala demand.

Also, I don't think it's a good idea to be using AS-5 or anything with silver that is conductive on this 600 dollar + card or its VRM's or anywhere on that PCB if I was you, unless your sure that you could keep it in the right spots putting that massive heatsink back on.

Maybe AS-5 on GPU and Ceramique for all the lil VRM's and stuff that might overflow off and onto the board.

They will release a waterblock for this thing that will cool the GPU's and VRM's if you want to overclock. I think you could expect more dramatic results with that.

I think this card is a solid deal.

GO ATI!!

Hilarious That ATI Still Hasn't Figured This Out by LedHed, 70 days ago
ATI has been designing their X2 cards cooler like this for what? 3 years? From the beginning people have been telling them that the design results in hot air of one GPU being blown into the other; resulting in one GPU being hotter.

I find it hilarious that ATI has done this AGAIN with their "top dog" card. I own a GTX 295 and it idles @ 41C and Full loads @75C, if NVIDIA figured out how to cool dual GPU cards so well over a year ago why is ATI still scratching their heads?

Once again ATI has produced an overly HOT card that can't be used to it's full potential with the stock cooler.

This is a old problem with Ati by TurboMecca, 68 days ago
I've got a Powercolor 4870PCS+ (1GB, slight factory-overclock) and it displayed the same problem when running Furmark and
other demanding graphic-intense programs.
Powercolor later decreased the factory overclocking of this card to just a mild/small overclock (without telling it's customers, same old O/C figures at their site). They also increased min fan speed to 63% which made the card sound like a old GeForce 5800:
http://www.youtube.com/watch?v=PFZ39nQ_k90

Even later Ati made their graphicsdriver recognize when Furmark was being run and underclocked the card to cope with Furmark (at a lower FPS of course).
This is ridiculous and embarassing both for Ati/AMD and all the hardware-testing sites (Anandtech hereby excluded, after this article :-))
Stupid manufacturers that cannot construct a graphics-card that manage to run stable at even base clocks.

RE: This is a old problem with Ati by TurboMecca, 68 days ago
I might as well add that the "cooling" solution (heatsink) on my Powercolor 4870 PCS+ doesn't even seem to make contact with the VRM:s.
So instead of help cooling the VRM:s the stupid designers at Powercolor manage to cave in/embed the VRM:s, thus increasing the temperature even further.


RE: This is a old problem with Ati by TurboMecca, 68 days ago
I had some links to articles at EXPreview that verified my statement above, but they doesn't seem to be accessible at this time (maybe later):
http://en.expreview.com/?p=680
http://en.expreview.com/?p=700

Perpetuating misnomers by nightstar, 68 days ago
I enjoyed reading this artical. I found it to be well written and researched, with one exception. You quote AMD as calling Furmark and OCCT "power virus".

I expect the authors and editors at such a prestigious website as Anandtech would understand what a virus is in the context of computer software, however some clarification seems to be required. A virus is malware that self replicates after infecting a system, spreading to other systems.

To the best of my knowlege neither of the aforementioned stress testing software tools fit the criteria of a "power virus" or any sort of computer virus for that matter. While I'm not surprised that a hardware manufacturer would try to spin a design deficiency by redefining worlds I hold journalists to a higher standard than Corporate PR reps.

How very Orwellian of you.

War is peace, Freedom is slavery. Certain third party stress-testing programs are power viruses...?

RE: Perpetuating misnomers by AuDioFreaK39, 23 days ago
I find that explanation to be a little extreme.

Copyright © 1997-2010 AnandTech, Inc. All rights reserved. Terms, Conditions and Privacy Information.