Lower Idle Power & Better Overcurrent Protection

One aspect AMD was specifically looking to improve in Cypress over RV770 was idle power usage. RV770's load power usage was fine at 160W for the HD 4870, but it wasn't dropping by a great deal at idle, falling by less than half to 90W. Later BIOS revisions managed to knock a few more watts off of this, but not significantly, and even later designs like RV790 were still limited in their idling abilities, only able to get down to 60W.

As a consequence, AMD went about designing Cypress with a much, much lower target in mind. Their goal was to get idle power down to 30W, 1/3rd that of RV770. What they got was even better: they beat that target by 10%, hitting a final idle power of 27W. As a result Cypress can idle at 30% of the power of RV770, or, compared to Cypress's own load power of 188W, some 14% of its load power.
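The arithmetic here is simple enough to sketch; every figure below is one quoted in this article:

```python
# Idle/load power figures quoted in the article (watts).
rv770_idle_w = 90.0      # HD 4870 idle
cypress_idle_w = 27.0    # Cypress (HD 5870) idle
cypress_load_w = 188.0   # Cypress (HD 5870) load

# Cypress idles at ~30% of RV770's idle power...
idle_ratio = cypress_idle_w / rv770_idle_w
# ...and at ~14% of its own load power.
load_ratio = cypress_idle_w / cypress_load_w

print(f"{idle_ratio:.0%} of RV770 idle, {load_ratio:.0%} of Cypress load")
```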

Accomplishing this kind of dramatic reduction in idle power usage required several changes. Key among them were the installation of additional power regulating circuitry on the board, and additional die space on Cypress assigned to power regulation. Notably, all of these changes were accomplished without the use of power-gating to physically cut power to unused portions of the chip, something that's common on CPUs. Instead they were achieved through more exhaustive clock-gating (disabling the clocks to idle portions of the chip) along with much more aggressive reduction of clock speeds, something GPUs have been doing for some time now.

This aggressive clock management is quickly evident in the idle/2D clock speeds of the 5870: 150MHz for the core and 300MHz for the memory. These idle clocks are significantly lower than the 4870's (550MHz/900MHz), which in the case of the core is the source of its power savings as compared to the 4870. As tweakers who have tried to manually reduce the idle clocks on RV770-based cards for further power savings have noticed, RV770 actually loses stability in most situations if its core clock drops too low. Cypress rectifies this, enabling it to hit these lower core speeds.
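As a rough illustration of why such low idle clocks matter, dynamic CMOS power scales with frequency and the square of voltage (P ≈ C·V²·f). A minimal sketch, with an entirely hypothetical capacitance and hypothetical idle/load voltages (AMD doesn't publish the 5870's voltage levels):

```python
def dynamic_power(c_eff, voltage, freq_hz):
    """Classic first-order CMOS dynamic power model: P = C_eff * V^2 * f.
    Leakage and I/O power are ignored; this is only an approximation."""
    return c_eff * voltage ** 2 * freq_hz

C_EFF = 1.0e-9  # hypothetical effective switched capacitance (farads)

# Hypothetical voltages for illustration only.
load_p = dynamic_power(C_EFF, 1.15, 850e6)   # 3D clocks: 850MHz core
idle_p = dynamic_power(C_EFF, 0.95, 150e6)   # 2D clocks: 150MHz core

print(f"idle dynamic power is ~{idle_p / load_p:.0%} of load")
```

Even with a modest voltage drop, cutting the clock from 850MHz to 150MHz slashes dynamic power to roughly an eighth of its load value in this simple model.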

Even bigger, however, are the enhancements to Cypress's memory controller, which allow it to utilize a number of power-saving tricks with GDDR5 RAM, along with other features that we'll get to in a bit. RV770's memory controller was not capable of taking advantage of many of GDDR5's advanced features beyond its higher bandwidth. Lacking this full bag of tricks, RV770 and its derivatives were unable to reduce the memory clock speed, which is why the 4870 and other products ran such high memory clocks even at idle. This in turn limited how much power could be saved by idling the GDDR5 modules.

With Cypress AMD has implemented nearly the entire suite of GDDR5's power saving features, allowing them to reduce the power usage of both the memory controller and the GDDR5 modules themselves. As with the improvements to the core clock, key among the memory improvements is the ability to drop to much lower memory clock speeds, using fast GDDR5 link re-training to quickly switch the memory clock speed and voltage without inducing glitches. AMD is also now using GDDR5's low power strobe mode, which in turn allows the memory controller to save power by turning off the clock data recovery mechanism. When discussing the matter with AMD, they compared these changes to putting the memory modules and memory controller into a GDDR3-like mode, which is a fair description of how GDDR5 behaves when its high-speed features are not enabled.
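The clock-switching sequence described above can be sketched as a toy state machine. This is purely an illustration of the pause/re-train/resume idea, not AMD's actual hardware; the clock values and class are invented:

```python
class Gddr5Controller:
    """Illustrative sketch (not AMD's implementation) of glitch-free
    memory clock switching via fast GDDR5 link re-training."""

    def __init__(self):
        self.clock_mhz = 1200        # hypothetical full-speed memory clock
        self.cdr_enabled = True      # clock data recovery, needed at high speeds
        self.log = []

    def set_memory_state(self, clock_mhz, low_power):
        self.log.append("pause")     # hold off new memory requests
        self.clock_mhz = clock_mhz
        # At idle clocks the controller can use GDDR5's low power strobe
        # mode and turn off clock data recovery, as described above.
        self.cdr_enabled = not low_power
        self.log.append("retrain")   # fast link re-training at the new speed
        self.log.append("resume")    # requests resume with no visible glitch

ctl = Gddr5Controller()
ctl.set_memory_state(clock_mhz=300, low_power=True)  # idle: 300MHz, strobe mode
```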

Finally, AMD was able to find yet more power savings for Crossfire configurations, and as a result the slave card(s) in a Crossfire configuration can use even less power. The value given to us for an idling slave card is 20W, a product of the fact that the slave cards go completely unused when the system is idling. In this state slave cards are still capable of instantaneously ramping up for full-load use, although conceivably AMD could go lower still by powering down the slave cards entirely, at the cost of this ability.

The flip side of achieving such low idle power usage is the need to manage load power usage, which was also overhauled for Cypress. As a reminder, TDP is not an absolute maximum; rather it's a maximum based on what's believed to be the highest reasonable load the card will ever experience. As a result it's possible in extreme circumstances for the card to need more power than its TDP rating, which is a problem.

That problem reared its head a lot for RV770 in particular, with the rise in popularity of stress testing programs like FurMark and OCCT. Although stress testers on the CPU side are nothing new, FurMark and OCCT heralded a new generation of GPU stress testers that were extremely effective at generating a maximum load. Unfortunately for RV770, the maximum possible load and the TDP are pretty far apart, which becomes a problem since the VRMs used in a card only need to be spec'd to meet the TDP of the card plus some safety margin. They don't need to be able to handle whatever the true maximum load of the card might be, as that load should never occur in practice.

Why is this? AMD believes that the instruction streams generated by OCCT and FurMark are entirely unrealistic. They try to hit everything at once, and this is something that they don’t believe a game or even a GPGPU application would ever do. For this reason these programs are held in low regard by AMD, and in our discussions with them they referred to them as “power viruses”, a term that’s normally associated with malware. We don’t agree with the terminology, but in our testing we can’t disagree with AMD about the realism of their load – we can’t find anything that generates the same kind of loads as OCCT and FurMark.

Regardless of what AMD wants to call these stress testers, there was a real problem when they were run on RV770. The overcurrent situation they created was too much for the VRMs on many cards, and as a failsafe these cards would shut down to protect the VRMs. At a user level, shutting down like this isn't a very helpful failsafe mode. At a hardware level, shutting down like this isn't enough to protect the VRMs in all situations. Ultimately these programs were capable of permanently damaging RV770 cards, and AMD needed to do something about it. For RV770 they could use the drivers to throttle these programs; until Catalyst 9.8 they detected the programs by name, and since 9.8 they detect the ratio of texture to ALU instructions (Ed: We're told NVIDIA throttles similarly, but we don't have a good control for testing this statement). This keeps RV770 safe, but it wasn't good enough. It's a hardware problem, so the solution needs to be in hardware, particularly in case anyone were to write a power virus in the future that the drivers couldn't stop, in an attempt to break cards on a wide scale.
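As a sketch of how a texture-to-ALU ratio heuristic like the one in Catalyst 9.8 might work: the threshold, function name, and direction of the test below are all our assumptions for illustration; AMD doesn't publish the details.

```python
def looks_like_power_virus(texture_ops, alu_ops, ratio_threshold=0.02):
    """Hypothetical sketch of a Catalyst 9.8-style heuristic: flag
    workloads whose texture:ALU instruction ratio is abnormally low,
    i.e. nearly pure ALU work hammering every unit at once. The
    threshold here is invented for illustration."""
    if alu_ops == 0:
        return False
    return texture_ops / alu_ops < ratio_threshold

# A FurMark-like stream: almost no texturing, enormous ALU load.
print(looks_like_power_virus(texture_ops=10, alu_ops=10_000))
# A game-like stream: a healthy mix of texture and ALU work.
print(looks_like_power_virus(texture_ops=3_000, alu_ops=10_000))
```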

This brings us to Cypress. For Cypress, AMD has implemented a hardware solution to the VRM problem, by dedicating a very small portion of Cypress’s die to a monitoring chip. In this case the job of the monitor is to continually monitor the VRMs for dangerous conditions. Should the VRMs end up in a critical state, the monitor will immediately throttle back the card by one PowerPlay level. The card will continue operating at this level until the VRMs are back to safe levels, at which point the monitor will allow the card to go back to the requested performance level. In the case of a stressful program, this can continue to go back and forth as the VRMs permit.
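The throttling behavior described above amounts to a simple control loop. A hypothetical sketch, with invented PowerPlay clock values and function names:

```python
# Hypothetical PowerPlay core clocks (MHz), ordered from lowest to highest.
POWERPLAY_LEVELS = [157, 600, 850]

def monitor_step(current_level, requested_level, vrm_critical):
    """One iteration of the monitor: returns the next PowerPlay level
    index. On a critical VRM condition, throttle back by exactly one
    level; once the VRMs are safe again, step back toward the level
    the driver requested."""
    if vrm_critical and current_level > 0:
        return current_level - 1
    if not vrm_critical and current_level < requested_level:
        return current_level + 1
    return current_level

level = 2                                                           # full 3D clocks
level = monitor_step(level, requested_level=2, vrm_critical=True)   # throttle to 1
level = monitor_step(level, requested_level=2, vrm_critical=False)  # restore to 2
```

Under a sustained stress load the loop simply oscillates between levels, which matches the back-and-forth behavior the article describes.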

By implementing this at the hardware level, Cypress cards are fully protected against all possible overcurrent situations, so that it’s not possible for any program (OCCT, FurMark, or otherwise) to damage the hardware by generating too high of a load. This also means that the protections at the driver level are not needed, and we’ve confirmed with AMD that the 5870 is allowed to run to the point where it maxes out or where overcurrent protection kicks in.

On that note, because card manufacturers can use different VRMs, it's very likely that we're going to see some separation in FurMark and OCCT performance based on the quality of the VRMs. The cheapest cards with the cheapest VRMs will need to throttle the most, while luxury cards with better VRMs should need to throttle little, if at all. This should make little difference in stock performance in real games and applications (since, as we covered earlier, we can't find anything that pushes a card to excess), but it will likely make itself apparent in overclocking. Overclocked cards, particularly those with voltage modifications, may hit throttle situations in normal applications, which means the VRMs will make a difference here. It also means that overclockers need to keep an eye on clock speeds, as the card shutting down is no longer a tell-tale sign that you're pushing it too hard.

Finally, while we're discussing the monitoring chip, we may as well talk about the rest of its features. Along with monitoring the GPU, it also serves as a PWM fan controller. This means that the PWM controller is no longer a separate part that card builders add themselves, and as such we won't be seeing any cards using a 2-pin fixed-speed fan to save money on the PWM controller. All Cypress cards (and presumably, all derivatives) will have the ability to use a 4-pin fan built in.
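For illustration, a 4-pin PWM fan is typically driven by mapping temperature to a duty cycle. The curve and breakpoints below are invented, not AMD's; real cards use vendor-tuned curves:

```python
def fan_duty_cycle(temp_c, low=(40, 20), high=(90, 100)):
    """Hypothetical 4-pin PWM fan curve: map GPU temperature (C) to a
    duty cycle (%), held at a floor below the low breakpoint, at a
    ceiling above the high one, and interpolated linearly in between."""
    (t_lo, d_lo), (t_hi, d_hi) = low, high
    if temp_c <= t_lo:
        return d_lo
    if temp_c >= t_hi:
        return d_hi
    # Linear interpolation between the two breakpoints.
    return d_lo + (temp_c - t_lo) * (d_hi - d_lo) / (t_hi - t_lo)

print(fan_duty_cycle(30))   # idle: minimum duty cycle
print(fan_duty_cycle(65))   # partway up the curve
print(fan_duty_cycle(95))   # pegged under heavy load
```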

327 Comments

  • erple2 - Tuesday, September 29, 2009 - link

    What the heck are you talking about? Are you saying that electricity consumed by a device divided by the "volume" of the device is the only way to measure the heat output of the device? Every single Engineering class I took tells me that's wrong, and I'm right. I think you need to take some basic courses in Electrical Engineering and/or Thermodynamics.

    (simplified)
    power consumed = work + waste

    You're looking for the waste heat generated by the device. If something could completely convert every watt of electricity that passes through it into some type of work (light a light bulb, turn a motor, make some calculation on a GPU, etc.), then it's not going to heat up. As a result, you HAVE to take into consideration how inefficient the particular device is before you can make any claim about how much the device heats up.

    I'll bet that if you put a Liquid Nitrogen cooler on every ATI card, and used the standard air coolers on every NVidia card, that the ATI cards are going to run crazy cooler than the NVidia cards.

    Ultimately the temperature of the GPUs depends a significant amount on the efficiency of the cooler, and how much heat the GPU is generating as waste. My point is that we don't have enough data to determine whether the ATI die runs hot because the coolers are less than ideal, Nvidia ones are closer to ideal, the die is smaller, or whatever you have. You have to look at a combination of the efficiency of the die (how well it converts input power to "work done"), the efficiency of the cooler (how well it removes heat from its heat source), and the combination of the two.

    I'd posit that the ATI card is more efficient than the NVidia card (at least in WoW, the only thing we have actual numbers of the "work done" and "input power consumed").

    Now, if you look at the measured temperature of the core as a means of comparing the worthiness of one GPU over another, I think you're making just as meaningful a comparison as comparing the worthiness of the GPU based on the size of the retail box that it comes in.
  • SiliconDoc - Friday, September 25, 2009 - link

    You simply repeated my claim about watts, and replaced core size, with fps, and created a framerate per watt chart, that has near nothing to do with actual heat inside the die, since the SIZE of the die, vs the power traversing through it is the determining factor, affected by fan quality (ram size as well).
    Your argument is "framerate power efficiency", as in watts per framerate, and has nothing to do with core temperature (modified by fan cooling of course to some degree), that the article indeed posts except for the two failed ati cards.
    The problem with your flawed "science" that turns it into hokum, is that no matter what outputs on the screen, the HEAT generated by the power consumption of the card itself, remains in the card, and is not "pumped through the videoport to the screen".
    If you'd like to claim "wattage vs framerate" efficiency for 5870, fine I've got no problem, but claiming that proves core temps are not dependent on power consumption vs die size ( modified by the rest of the card *mem size/power usage/ and the fan heatsink* ) is RIDICULOUS.
    ---
    The cards are generally equivalent manufacturing and component additions, so you take the wattage consumed (by the core) and divide by core size, for heat density.
    Hence, ATI cards, smaller cores and similar power consumption, wind up hotter.
    That's what the charts show, that's what should be stated, that is the rule, and that's the way it plays in the real world, too.
    ---
    The only modification to that is heatsink fan efficiency, and I don't find you fellas claiming stock NVIDIA fans and heatsinks are way better than the ATI versions, hence 66C for NVIDIA, 75C, 85C, etc, and only higher for ATI, in all their cards listed.
    Would you like to try that one on for size ? Should I just make it known that NVIDIA fans and heatsinks are superior to ATI ?
    What is true is that a larger surface area (die side squared) dissipates the same amount of heat more easily, and that of course is what is going on.
    ATI dies are smaller ( by a marked surface area as has so often been pointed out), and have similar power consumption, and a higher DENSITY of heat generation, and therefore run hotter.
  • erple2 - Friday, September 25, 2009 - link

    Oops, "milliwatt" should be "kilowatt". I got the decimal place mixed up - I used kilowatt since I thought it was easier to see than 0.247, 0.140, 0.137, 0.181...
  • SiliconDoc - Wednesday, September 23, 2009 - link

    Let's take that LOAD TEMP chart and the article's comments. Right above it, it is stated a good cooler includes the 4850 that IDLE TEMPs in at around 40C (it's actually 42C, the highest of those mentioned).
    "The floor for a good cooler looks to be about 40C, with the GTS 250(39C), 3870(41C), and 4850 all turning in temperatures around here"
    OK, so the 4850 has a good cooler, as well as the 3870... then right below is the LOAD TEMP.. and the 4850 is @ 90C -OBVIOUSLY that good cooler isn't up to keeping that tiny hammered core cool...

    3870 is at 89C, 4870 is at 88C, 5870 is at 89C ALL ati....
    but then, nvidia...
    250, 216, 285, 275 all come in much lower at 66C to 85C.... but "temps are all over the place".
    NOT only that crap, BUT the 4890 and 4870x2 are LISTED but with no temps - and take the "coolest position" on the chart!
    Well we KNOW they are in the 90C range or higher...
    So, you NEVER MENTION why 4870x2 and 4890 are "no load temp shown in the chart" - you give them the WINNING SPOTS anyway, you fail to mention the 260's 65C lowest LOAD WIN and instead mention GTX275 at 75C...LOL

    The bias is SO THICK it's difficult to imagine how anyone came up with that CRAP, frankly.
    So the superhot 4890 and 4870x2 are given #1 and #2 spots respectively, a free ride, the other Nvidia cards KICK BUTT in lower load temps EXCEPT the 295, but it makes sure to mention the 8800GT while leaving the 4890 and 4870x2 LOAD TEMP spots blank ?
    roflmao
    ---
    What were you saying about "why" ? If why the 8800GT was included is TRUE, then comment on the gigantic LOAD TEMP bias... tell me WHY.
  • SiliconDoc - Wednesday, September 23, 2009 - link

    AND, you don't take temps from WOW to use for those two, which no doubt even though it is NOT gpu stressing much, will yield the 90C for those two cards 4870x2 and 4890, anyway.
    So they FAIL the OCCT, but you have NOTHING on them, which would if listed put EVERY SINGLE ATI CARD @ near 90C LOAD, PERIOD...
    ---
    And we just CANNOT have that stark FACT revealed, can we ? I mean I've seen this for well over a year here now.
    LET's FINALLY SAY IT.
    ---
    LOAD TEMPS ON THE ATI CARDS ARE ALL, EVERY SINGLE ONE NEAR 90c, much higher than almost ALL of the Nvidia cards.
  • pksta - Thursday, September 24, 2009 - link

    I just want to know...With this much zeal about videocards and more specifically the bias that you see, doesn't it make you sound biased too? Can you say that you have owned the cards you are bashing and seen the differences firsthand? I can say I did. I had an 8800 GT and it was running in the upper 80s under load. I switched to my 4850 with the worst cooler I think I've ever seen mind you, and it stays in the mid to upper 60s under load. The cooler on the 8800 gt was the dual-slot design that was the original reference design. The 4850 had the most pathetic fan I've ever seen. It was similar to the fan and heatsink Intel used on the first Core2 stuff. It was the really cheap aluminum with a tiny copper circle that made contact with the die itself. Now, don't get me wrong I love ATI...But I also love nVidia...Anything that keeps games getting better and prices getting better. I honestly don't think, though, that the article is too biased. I think maybe a little for ATI but nothing to rage on and on about. Besides...Calm down. You know nVidia will have a response for this.
  • SiliconDoc - Sunday, September 27, 2009 - link

    1. Who cares what you think about how you perceive me ? Unless you have a fact to refute, who cares ? What is biased ? There has been quite a DISSSS on PhysX for quite some time here, but the haters have no equal alternative - NOTHING that even comes close. Just ASK THEM. DEAD SILENCE. So, strangely, the very best there is, is BAD.

    Now ask yourself again who is biased, won't you? Ask yourself who is spewing out the endless strings... Do yourself a favor and figure it out. Most of them have NEVER tried PhysX ! They slip up and let it be known, when they are slamming away. Then out comes their PC hate the greedy green rage, and more, because they have to, to fit in the web PC code, instead of thinking for themselves.

    2. Yes, I own more cards currently than you will in your entire life. I started retail computer well over a decade ago.

    3. And now, the standard red rooster tale. It sounds like you were running in 2d clocks 100% of the time, probably on a brand board like a DELL. Happens a lot with red cards. Users have no idea.
    4850 with The worst fan in the World ! ( quick call Keith Olbermann) and it was ice cold, a degree colder than anything else in the review. ROFLMAO
    Once again, the red shorts pinnocchio tale. Forgive me while I laugh, again !
    ROFLMAO
    Did you ever put your finger on the HS under load ? You should have. Did you check your 3D mhz..
    http://forums.anandtech.com/messageview.aspx?catid...">http://forums.anandtech.com/messageview.aspx?catid...
    Not like 90C is offbase, not like I made up that forum thread.

    4. I could care less if nvidia has a response or not. Point is, STOP LYING. Or don't. I certainly have noticed many of the lies I've complained about over a year or so have gone dead silent, they won't pull it anymore, and in at least one case, used in reverse for more red bias, unfortunately, before it became the accepted idea.

    So, I do a service, at the very least people are going to think, and be helped, even if they hate me.
  • SiliconDoc - Wednesday, September 23, 2009 - link

    Well of course that's the excuse, but I'll keep my conclusion considering how the last 15 reviews on the top videocards were done, along with the TEXT that is pathetically biased for ati, that I pointed out. (Even though Derek was often the author).
    --
    You want to tell me how it is that ONLY the GTX295 is near or at 90C, but ALL the ati cards ARE, and we're told "temperatures are all over the place" ?
    Can you really explain that, sir ?
  • 529th - Wednesday, September 23, 2009 - link

    holy shit, a full review is up already!
  • bill3 - Wednesday, September 23, 2009 - link

    Does the article keep referring to Cypress as "too big"? If Cypress is too big, what the hell is GT200 at 480mm^2 or whatever it was? Are you guys serious with that crap?

    I've heard that the "sweet spot" talk from AMD was a bit of a misdirection from the start anyway. IMO if AMD is going to compete for the performance crown or come reasonably close (and frankly, performance is all video card buyers really care about, as we see with all the forum posts only mentioning that GT300 will supposedly be faster than 58XX and not anything else about it) then they're going to need slightly bigger dies. So Cypress being bigger is a great thing. If anything it's too small. Imagine the performance a 480mm^2 Cypress would have! Yes, Cypress is far too small, period.

    Personally it's wonderful to see AMD engineer two chips this time, a bigger performance one and smaller lower end one. This works out far better all around.

    The price is also great. People expecting 299 are on crack.
