Redefining TDP With PowerTune

One of our fundamental benchmarks is FurMark, oZone3D’s handy GPU load testing tool. The furry donut can generate a workload in excess of anything any game or GPGPU application can do, giving us an excellent way to establish a worst case scenario for power usage, GPU temperatures, and cooler noise. The fact that it was worse than any game/application has ruffled both AMD and NVIDIA’s feathers however, as it’s been known to kill older cards and otherwise make their lives more difficult, leading to the two companies labeling the program a “power virus”.

FurMark is just one symptom of a larger issue however, and that’s TDP. Compared to their CPU counterparts at only 140W, video cards are power monsters. The ATX specification allows for PCIe cards to draw up to 300W, and we quite regularly surpass that when FurMark is in use. Things get even dicier on laptops and all-in-one computers, where compact spaces and small batteries limit how much power a GPU can draw and how much heat can effectively be dissipated. For these reasons products need to be designed to meet a certain TDP; in the case of desktop cards we saw products such as the Radeon HD 5970 where it had sub-5870 clocks to meet the 300W TDP (with easy overvolting controls to make up for it), and in laptop parts we routinely see products with many disabled functional units and low clocks to meet those particularly low TDP requirements.

Although we see both AMD and NVIDIA surpass their official TDP on FurMark, it’s never by very much. After all TDP defines the thermal limits of a system, so if you regularly surpass those limits it can lead to overwhelming the cooling and ultimately risking system damage. It’s because of FurMark and other scenarios that AMD claims that they have to set their products’ performance lower than they’d like. Call of Duty, Crysis, The Sims 3, and other games aren’t necessarily causing video cards to draw power in excess of their TDP, but the need to cover the edge cases like FurMark does. As a result AMD has to plan around applications and games that cause a high level of power draw, setting their performance levels low enough that these edge cases don’t lead to the GPU regularly surpassing its TDP.

This ultimately leads to a concept similar to dynamic range, defined by Wikipedia as: “the ratio between the largest and smallest possible values of a changeable quantity.” We typically use dynamic range when talking about audio and video, referring to the range between quiet and loud sounds, and dark and light imagery respectively. However power draw is quite similar in concept, with a variety of games and applications leading to a variety of loads on the GPU. Furthermore while dynamic range is generally a good thing for audio and video, it’s generally a bad thing for desktop GPU usage – low power utilization on a GPU-bound game means that there’s plenty of headroom for bumping up clocks and voltages to improve the performance of that game. Going back to our earlier example however, a GPU can’t be set this high under normal conditions, otherwise FurMark and similar applications will push the GPU well past TDP.

The answer to the dynamic power range problem is to have variable clockspeeds; set the clocks low to keep power usage down on power-demanding games, and set the clocks high on power-light games. In fact we already have this in the CPU world, where Intel and AMD use their turbo modes to achieve this. If there’s enough thermal and power headroom, these processors can increase their clockspeeds by upwards of several steps. This allows AMD and Intel to not only offer processors that are overall faster on average, but it lets them specifically focus on improving single-threaded performance by pushing 1 core well above its normal clockspeeds when it’s the only core in use.

It was only a matter of time until this kind of scheme came to the GPU world, and that time is here. Earlier this year we saw NVIDIA lay the groundwork with the GTX 500 series, where they implemented external power monitoring hardware for the purpose of identifying and slowing down FurMark and OCCT; however that’s as far as they went, capping only FurMark and OCCT. With Cayman and the 6900 series AMD is going to take this to the next step with a technology called PowerTune.

PowerTune is a power containment technology, designed to allow AMD to contain the power consumption of their GPUs to a pre-determined value. In essence it’s Turbo in reverse: instead of having a low base clockspeed and higher turbo multipliers, AMD is setting a high base clockspeed and letting PowerTune cap GPU performance when it exceeds AMD’s TDP. The net result is that AMD can reduce the dynamic power range of their GPUs by setting high clockspeeds at high voltages to maximize performance, and then letting PowerTune cap GPU performance for the edge cases that cause GPU power consumption to exceed AMD’s preset value.

Advancing Primitives: Dual Graphics Engines & New ROPs PowerTune, Cont
POST A COMMENT

167 Comments

View All Comments

  • mac2j - Wednesday, December 15, 2010 - link

    Um - if you have the money for a 580 ... pick up another $80-100 and get 2 x 6950 - you'll get nearly the best possible performance on the market at a similar cost.

    Also I agree that Nvidia will push the 580 price down as much as possible... the problem is that if you believe all of the admittedly "unofficial" breakdowns ... it costs Nvidia 1.5-2x as much to make a 580 as it costs AMD to make a 6970.

    So its hard to be sure how far Nvidia can push down the price on the 580 before it ceases to become profitable - my guess is they'll focus on making a 565 type card which has almost 570 performance but for a manufacturing cost closer to what a 460 runs them.
    Reply
  • fausto412 - Wednesday, December 15, 2010 - link

    yeah. AMD let us down on this here product. We see what gtx580 is and what 6970 is...i would say if you planning to spend 500...the gtx580 is worth it. Reply
  • truepurple - Wednesday, December 15, 2010 - link

    "support for color correction in linear space"

    What does that mean?
    Reply
  • Ryan Smith - Wednesday, December 15, 2010 - link

    There are two common ways to represent color, linear and gamma.

    Linear: Used for rendering an image. More generally linear has a simple, fixed relationship between X and Y, such that if you drew the relationship it would be a straight line. A linear system is easy to work with because of the simple relationship.

    Gamma: Used for final display purposes. It's a non-linear colorspace that was originally used because CRTs are inherently non-linear devices. If you drew out the relationship, it would be a curved line. The 5000 series is unable to apply color correction in linear space and has to apply it in gamma space, which for the purposes of color correction is not as accurate.
    Reply
  • IceDread - Wednesday, December 15, 2010 - link

    Yet again we do not get to see hd 5970 in crossfire despite it being a single card! Is this an nvidia site?

    Anyway, for those of you who do want to see those results, here is a link to a professional Swedish site!

    http://www.sweclockers.com/recension/13175-amd-rad...

    Maybe there is some google translation available or so if you want to understand more than the charts shows.
    Reply
  • medi01 - Wednesday, December 15, 2010 - link

    Wow, 5970 in crossfire consumes less than 580 in SLI.
    http://www.sweclockers.com/recension/13175-amd-rad...
    Reply
  • ggathagan - Wednesday, December 15, 2010 - link

    Absolutely!!!
    There's no way on God's green earth that Anandtech doesn't currently have a pair of 5970's on hand, so that MUST be the reason.
    I'll go talk to Anand and Ryan right now!!!!
    Oh, wait, they're on a conference call with Huang Jen-Hsun.....

    I'd like to note that I do not believe Anadtech ever did a test of two 5970's, so it's somewhat difficult to supply non-existent into any review.
    Ryan did a single card test in November 2009.That is the only review I've found of any 5970's on the site.
    Reply
  • vectorm12 - Wednesday, December 15, 2010 - link

    I was not aware of the fact that the 32nm process had been canned completely and was still expecting the 6970 to blow the 580 out of the water.

    Although we can't possibly know and are unlikely to ever find out what cayman at 32nm would have performed like I suspect AMD had to give up a good chunk of performance to fit it on the 389mm^2 40nm die.

    This really makes my choice easy as I'll pickup another cheap 5870 and run my system in CF.
    I think I'll be able to live with the performance until the refreshed cayman/next gen GPUs are ready for prime time.

    Ryan: I'd really like to see what ighashgpu can do with the new 6970 cards though. Although you produce a few GPGPU charts I feel like none of them really represent the real "number-crunching" performance of the 6970/6950.

    Ivan has already posted his analysis in his blog and it seems like the change from LWIV5 to LWIV4 made a negligible impact at the most. However I'd really love to see ighashgpu included in future GPU tests to test new GPUs and architectures.

    Thanks for the site and keep up the work guys!
    Reply
  • slagar - Wednesday, December 15, 2010 - link

    Gaming seems to be in the process of bursting its own bubble. Graphics of games isn't keeping up with the hardware (unless you cound gaming on 6 monitors) because most developers are still targeting consoles with much older technology.
    Consoles won't upgrade for a few more years, and even then, I'm wondering how far we are from "the final console generation". Visual improvements in graphics are becoming quite incremental, so it's harder to "wow" consumers into buying your product, and the costs for developers is increasing, so it's becoming harder for developers to meet these standards. Tools will always improve and make things easier and more streamlined over time I suppose, but still... it's going to be an interesting decade ahead of us :)
    Reply
  • darckhart - Wednesday, December 15, 2010 - link

    that's not entirely true. the hardware now allows not only insanely high resolutions, but it also lets those of us with more stringent IQ requirements (large custom texture mods, SSAA modes, etc) to run at acceptable framerates at high res in intense action spots. Reply

Log in

Don't have an account? Sign up now