PowerTune, Cont

PowerTune’s functionality is accomplished in a two-step process. The first step is defining the desired TDP of a product. Notably (and unlike NVIDIA) AMD is not using power monitoring hardware here, citing the costs of such chips and the additional design complexities they create. Instead AMD is profiling the performance of their GPUs to determine what the power consumption behavior is for each functional block. This behavior is used to assign a weighted score to each functional block, which in turn is used to establish a rough equation to find the power consumption of the GPU based on each block’s usage.

AMD doesn’t provide the precise equations used, but you can envision it looking something like this:

Power Consumption =( (shaderUsage * shaderWeight) + (ropUsage * ropWeight) + (memoryUsage * memoryWeight) ) * clockspeed

In the case of the Radeon HD 6970, the TDP is 250W, while the default clockspeed is 880MHz.

With a power equation established, AMD can then adjust GPU performance on the fly to keep power consumption under the TDP. This is accomplished by dynamically adjusting just the core clock based on GPU usage a few times a second. So long as power consumption stays under 250W the 6970 stays at 880MHz, and if power consumption exceeds 250W then the core clock will be brought down to keep power usage in check.

It’s worth noting that in practice the core clock and power usage do not have a linear relationship, so PowerTune may have to drop the core clock by quite a bit in order to maintain its power target. The memory clock and even the core voltage remain unchanged (these are only set with PowerPlay states), so PowerTune only has the core clock to work with.

Ultimately PowerTune is going to fundamentally change how we measure and classify AMD’s GPUs. With PowerTune the TDP really is the TDP; as a completely game/application agonistic way of measuring and containing power consumption, it’s simply not possible to exceed the TDP. The power consumption of the average game is still below the TDP – sometimes well below – so there’s still an average case and a worst case scenario to discuss, but the range between them just got much smaller.

Furthermore as a result, real world performance is going to differ from theoretical performance that much more. Just as is the case with CPUs where the performance you get is the performance you get; teraFLOPs, cache bandwidth, and clocks alone won’t tell you everything about the performance of a product. The TDP and whether the card regularly crosses it will factor in to performance, just as how cooling factors in to CPU performance by allowing/prohibiting higher turbo modes. At least for AMD’s GPUs, we’re now going to be talking about how much performance you can get for any given TDP instead of specific clockspeeds, bringing performance per watt to the forefront of importance.

So by now you’re no doubt wondering what the impact of PowerTune is, and the short answer is that there’s virtually no impact. We’ve gone ahead and compiled a list of all the games and applications in our test suite, and whether they triggered PowerTune throttling. Of the dozen tests, only two triggered PowerTune: FurMark as expected, and Metro 2033. Furthermore as you can see there was a significant difference between the average clockspeed of our 6970 in these two situations.

AMD Radeon HD 6970 PowerTune Throttling
Game/Application Throttled?
Crysis: Warhead No
BattleForge No
Metro Yes (850Mhz)
HAWX No
Civilization V No
Bad Company 2 No
STALKER No
DiRT 2 No
Mass Effect 2 No
Wolfenstein No
3DMark Vantage Yes
MediaEspresso 6 No
Unigine Heaven No
FurMark Yes (600MHz)
Distributed.net Client No

In the case of Metro the average clockspeed was 850MHz; Metro spent 95% of the time running at 880MHz, and only at a couple of points did the core clock drop to around 700MHz. Conversely FurMark, a known outlier, drove the average core clock down to 600MHz for a 30% reduction in the core clock. So while PowerTune definitely had an impact on FurMark performance it did almost nothing to Metro, never mind any other game/application. To illustrate the point, here are our Metro numbers with and without PowerTune.

Radeon HD 6970: Metro 2033 Performance
PowerTune 250W PowerTune 300W
2560x1600 25.5 26
1920x1200 39 39.5
1680x1050 64.5 65

The difference is no more than .5fps on average, which may as well be within our experimental error range for this benchmark. For everything we’ve tested on the 6970 and the 6950, the default PowerTune settings do not have a meaningful performance impact on any game or application we test. Thus at this point we’re confident that there are no immediate drawbacks to PowerTune for desktop use.

Ultimately this is a negative feedback mechanism, unlike Turbo which is a positive feedback mechanism. Without overclocking the best a 6970 will run at is 880MHz, whereas Turbo would increase clockspeeds when conditions allow. Neither one is absolutely the right way to do things, but there’s a very different perception when performance is taken away, versus when performance is “added” for free. I absolutely like where this is going – both as a hardware reviewer and as a gamer – but I’d be surprised if this didn’t generate at least some level of controversy.

Finally, while we’ve looked at PowerTune in the scope of desktop usage, we’ve largely ignored other cases so far. AMD will be the first to tell you that PowerTune is more important for mobile use than it is desktop use, and mobile use is all the more important as the balance between desktops and laptops sold continues to slide towards laptops. In the mobile space not only does PowerTune mean that AMD will absolutely hit their TDPs, but it should allow them to produce mobile GPUs that come with higher stock core clocks, comfortable in the knowledge that PowerTune will keep power usage in check for the heaviest games and applications. The real story for PowerTune doesn’t even begin until 2011 – as far as the 6900 series is concerned, this may as well be a sneak peak.

Even then there’s one possible exception we’re waiting to see: 6990 (Antilles). The Radeon HD 5970 put us in an interesting spot: it was and still is the fastest card around, but unless you can take advantage of CrossFire it’s slower than a single 5870, a byproduct of the fact that AMD had to use lower core and memory clocks to make their 300W TDP. This is in stark comparison to the 4870X2, which really was 2 4870s glued together with the same single GPU performance. With PowerTune AMD doesn’t necessarily need to repeat the 5970’s castrated clocks; they could make a 6970X2, and let PowerTune clip performance as necessary to keep it under 300W. If something is being used without CrossFire for example, then there’s no reason not to run the 1 GPU at full speed. It would be the best of both worlds.

In the meantime we’re not done with PowerTune quite yet. PowerTune isn’t just something AMD can set – it’s adjustable in the Overdrive control panel too.

Redefining TDP With PowerTune Tweaking PowerTune
POST A COMMENT

167 Comments

View All Comments

  • mac2j - Wednesday, December 15, 2010 - link

    Um - if you have the money for a 580 ... pick up another $80-100 and get 2 x 6950 - you'll get nearly the best possible performance on the market at a similar cost.

    Also I agree that Nvidia will push the 580 price down as much as possible... the problem is that if you believe all of the admittedly "unofficial" breakdowns ... it costs Nvidia 1.5-2x as much to make a 580 as it costs AMD to make a 6970.

    So its hard to be sure how far Nvidia can push down the price on the 580 before it ceases to become profitable - my guess is they'll focus on making a 565 type card which has almost 570 performance but for a manufacturing cost closer to what a 460 runs them.
    Reply
  • fausto412 - Wednesday, December 15, 2010 - link

    yeah. AMD let us down on this here product. We see what gtx580 is and what 6970 is...i would say if you planning to spend 500...the gtx580 is worth it. Reply
  • truepurple - Wednesday, December 15, 2010 - link

    "support for color correction in linear space"

    What does that mean?
    Reply
  • Ryan Smith - Wednesday, December 15, 2010 - link

    There are two common ways to represent color, linear and gamma.

    Linear: Used for rendering an image. More generally linear has a simple, fixed relationship between X and Y, such that if you drew the relationship it would be a straight line. A linear system is easy to work with because of the simple relationship.

    Gamma: Used for final display purposes. It's a non-linear colorspace that was originally used because CRTs are inherently non-linear devices. If you drew out the relationship, it would be a curved line. The 5000 series is unable to apply color correction in linear space and has to apply it in gamma space, which for the purposes of color correction is not as accurate.
    Reply
  • IceDread - Wednesday, December 15, 2010 - link

    Yet again we do not get to see hd 5970 in crossfire despite it being a single card! Is this an nvidia site?

    Anyway, for those of you who do want to see those results, here is a link to a professional Swedish site!

    http://www.sweclockers.com/recension/13175-amd-rad...

    Maybe there is some google translation available or so if you want to understand more than the charts shows.
    Reply
  • medi01 - Wednesday, December 15, 2010 - link

    Wow, 5970 in crossfire consumes less than 580 in SLI.
    http://www.sweclockers.com/recension/13175-amd-rad...
    Reply
  • ggathagan - Wednesday, December 15, 2010 - link

    Absolutely!!!
    There's no way on God's green earth that Anandtech doesn't currently have a pair of 5970's on hand, so that MUST be the reason.
    I'll go talk to Anand and Ryan right now!!!!
    Oh, wait, they're on a conference call with Huang Jen-Hsun.....

    I'd like to note that I do not believe Anadtech ever did a test of two 5970's, so it's somewhat difficult to supply non-existent into any review.
    Ryan did a single card test in November 2009.That is the only review I've found of any 5970's on the site.
    Reply
  • vectorm12 - Wednesday, December 15, 2010 - link

    I was not aware of the fact that the 32nm process had been canned completely and was still expecting the 6970 to blow the 580 out of the water.

    Although we can't possibly know and are unlikely to ever find out what cayman at 32nm would have performed like I suspect AMD had to give up a good chunk of performance to fit it on the 389mm^2 40nm die.

    This really makes my choice easy as I'll pickup another cheap 5870 and run my system in CF.
    I think I'll be able to live with the performance until the refreshed cayman/next gen GPUs are ready for prime time.

    Ryan: I'd really like to see what ighashgpu can do with the new 6970 cards though. Although you produce a few GPGPU charts I feel like none of them really represent the real "number-crunching" performance of the 6970/6950.

    Ivan has already posted his analysis in his blog and it seems like the change from LWIV5 to LWIV4 made a negligible impact at the most. However I'd really love to see ighashgpu included in future GPU tests to test new GPUs and architectures.

    Thanks for the site and keep up the work guys!
    Reply
  • slagar - Wednesday, December 15, 2010 - link

    Gaming seems to be in the process of bursting its own bubble. Graphics of games isn't keeping up with the hardware (unless you cound gaming on 6 monitors) because most developers are still targeting consoles with much older technology.
    Consoles won't upgrade for a few more years, and even then, I'm wondering how far we are from "the final console generation". Visual improvements in graphics are becoming quite incremental, so it's harder to "wow" consumers into buying your product, and the costs for developers is increasing, so it's becoming harder for developers to meet these standards. Tools will always improve and make things easier and more streamlined over time I suppose, but still... it's going to be an interesting decade ahead of us :)
    Reply
  • darckhart - Wednesday, December 15, 2010 - link

    that's not entirely true. the hardware now allows not only insanely high resolutions, but it also lets those of us with more stringent IQ requirements (large custom texture mods, SSAA modes, etc) to run at acceptable framerates at high res in intense action spots. Reply

Log in

Don't have an account? Sign up now