Bulldozer's Power Management

AMD confirmed that the Bulldozer core's power management is an improved version of the power management introduced with the “Llano” CPU. Just like Llano, Bulldozer has a Digital APM Module. The APM module samples a number of performance counter signals, and these samples are used to estimate dynamic power with 98% accuracy. Now combine this power estimate with Bulldozer's power gating at the module level and vastly improved clock gating, and you can start to understand what is possible.
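To make the idea of estimating power from activity counters concrete, here is a minimal sketch of a weighted-sum model. AMD has not disclosed how the APM module actually computes its estimate, so the signal names and the per-event energy weights below are purely hypothetical; the sketch only shows the general technique of multiplying sampled event counts by an energy cost and dividing by the sampling interval.

```python
# Hypothetical per-event energy weights in nanojoules; these are NOT AMD figures.
EVENT_ENERGY_NJ = {
    "int_ops": 0.5,
    "fp_ops": 1.2,
    "l2_accesses": 2.0,
}


def estimate_dynamic_power_watts(event_counts, interval_s):
    """Estimate dynamic power as sum(events * energy per event) / interval."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    energy_nj = sum(
        EVENT_ENERGY_NJ[name] * count for name, count in event_counts.items()
    )
    # Convert nanojoules to joules, then divide by seconds to get watts.
    return energy_nj * 1e-9 / interval_s


# Made-up counter sample collected over a 1 ms window.
sample = {"int_ops": 2_000_000, "fp_ops": 500_000, "l2_accesses": 100_000}
power = estimate_dynamic_power_watts(sample, interval_s=0.001)
```

A real implementation would calibrate the weights against measured power, which is presumably where an accuracy figure such as the quoted 98% comes from.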


Bulldozer reduces the number of active and power consuming circuits by vastly improved clock gating

If your application runs only one or a few threads on your 8-module, 16-core Interlagos CPU, several of those modules might be power gated. Or, if you run integer-only threads, the fact that quite a few unused parts of the module (e.g. the FPU) will be clock gated might be enough to stay under the configured TDP. In those cases, it won't be necessary to limit the clock speed. And that is really great, especially in the real world.
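The reasoning above can be sketched as a simple power budget check. All wattages here are invented for illustration and do not describe real Interlagos silicon; the model also ignores leakage in clock gated units and treats a power gated module as drawing nothing.

```python
from dataclasses import dataclass


@dataclass
class Module:
    power_gated: bool      # whole module switched off
    fpu_clock_gated: bool  # FPU idle, so its clock is stopped


# Hypothetical power figures in watts; illustration only.
INTEGER_CORES_W = 9.0   # both integer cores of one active module
FPU_W = 4.0             # shared FPU of one module when it is clocked
UNCORE_W = 20.0         # caches, memory controller, links


def estimated_power(modules):
    """Sum the power of every unit that is neither power gated nor clock gated."""
    total = UNCORE_W
    for m in modules:
        if m.power_gated:
            continue
        total += INTEGER_CORES_W
        if not m.fpu_clock_gated:
            total += FPU_W
    return total


def must_throttle(modules, tdp_cap_w):
    """True if the estimate exceeds the cap, so the clock speed would have to drop."""
    return estimated_power(modules) > tdp_cap_w


all_busy = [Module(power_gated=False, fpu_clock_gated=False) for _ in range(8)]
integer_only = [Module(power_gated=False, fpu_clock_gated=True) for _ in range(8)]
two_threads = [Module(False, False)] * 2 + [Module(True, True)] * 6
```

With these made-up numbers and a 100 W cap, only the fully loaded case (`all_busy`) would need a lower clock; the integer-only and lightly threaded cases fit under the cap at full speed.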

In the real world, only a few HPC applications behave like the SPEC CPU rate benchmarks, which spawn threads across all cores. Most server applications do not fully utilize all available cores all the time. Sometimes, only one thread will be really critical and the perceived application performance will depend on it. A little bit later, several threads might demand CPU power (but not all cores will be busy). Only a certain percentage of the time are all the cores used. That is exactly the reason why the cheaper Magny-Cours makes so much sense for HPC applications, yet struggles to keep up with the higher clocked, higher IPC Xeon Westmere cores when running OLTP and ERP applications. Putting a power cap on a Magny-Cours means even lower frequencies, and as a result even higher response times (as we have measured here).

By adding power consumption measurements to the CPU, Bulldozer will run most server applications at full speed unless you lower the TDP too far. (Obviously, if the TDP is lowered enough, the CPU will not be able to operate at higher frequencies, thus degrading response times too.) The maximum throughput will be a little bit lower, but most server applications almost never run at maximum throughput. In fact, maximum throughput only matters for HPC applications and benchmarks. For real human users, response times are the only thing that matters.

The beauty of this new power cap system is that in normal circumstances (e.g. the server is running at 40-70% load), the response times will hardly be any longer. At the same time, the administrator can make sure that the server cluster does not exceed the capacity of the cooling equipment and the power lines.

This TDP Power Cap technology could be very interesting to small and medium businesses too, and not only to owners of large server clusters. TDP Power Cap could be a way to make sure that your colocated servers never exceed the maximum number of amps allocated to you, and as a result you will not have to pay unexpectedly high electricity bills. However, whether or not this ideal world of low response times and low electricity bills will become a reality for Bulldozer server owners will also depend on the availability of a good and decently priced management software tool that allows the administrator to configure the TDP on all servers simultaneously.

On a standard server, you will get a section in the BIOS that allows you to tweak the TDP in 1W increments (or a maximum of 64 power settings), a good step forward compared to the current p-state setting. But to control a server cluster in an efficient way, good management software is needed. Currently, you have to buy all your servers from the same vendor (HP, for example) and then pay for management software such as HP's Insight Control. To really unlock this technology, AMD or one of their partners needs to make sure this kind of software is widely available--some open source code perhaps?
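As a rough illustration of what such a management tool would have to calculate, the sketch below turns a circuit allocation in amps into a per-socket TDP cap in whole watts. Every parameter name and number is an assumption made for the example, including the 80% headroom factor and the fixed non-CPU power per server; it also ignores power supply efficiency and does not talk to any real BIOS or vendor interface.

```python
import math


def per_socket_tdp_cap_w(circuit_amps, volts, servers, sockets_per_server,
                         non_cpu_watts_per_server, headroom=0.8):
    """Split a circuit's power budget evenly and return a cap per CPU socket.

    All inputs are hypothetical planning figures supplied by the administrator.
    """
    budget_w = circuit_amps * volts * headroom        # usable watts on the circuit
    per_server_w = budget_w / servers                 # even split across servers
    cpu_w = per_server_w - non_cpu_watts_per_server   # what is left for the CPUs
    if cpu_w <= 0:
        raise ValueError("no power left for the CPUs with these inputs")
    # Round down to whole watts, matching a 1W-granular TDP setting.
    return math.floor(cpu_w / sockets_per_server)


# Made-up example: a 16 A, 230 V circuit feeding eight dual-socket servers.
cap = per_socket_tdp_cap_w(
    circuit_amps=16, volts=230, servers=8,
    sockets_per_server=2, non_cpu_watts_per_server=155,
)
```

Rounding down rather than to the nearest watt is a deliberate choice: the point of the cap is to never exceed the allocation.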


59 Comments


  • SanX - Friday, July 15, 2011 - link

    Nice job with the tests. They show exactly what i say AMD FP is twice faster then Intel.

    I used Lahey 32bit code, and as you can see our results are completely consistent - mine with E8400 at 3.8GHz and yours QX6700 at 3.2GHz

    And they are consistent in 64bits: with gfortran_64 i have a bit faster execution on my Intel then on 32bit Lahey and the result is around the same as yours on your i7 3.6GHz
    1 4.01s
    2 2.04s

    Will add here AMD 64bit result as soon as kick kids from the games but as we see we can not expect much different conclusion: on stock clocks AMD FP is twice faster then Intel.

    45nm AMD Phenom is by the way is easily overclockable by 25-45% or 3.5-4.1 GHz. When overclocked to the same clocks as Intel, AMD is even more then twice faster.
  • BSMonitor - Friday, July 15, 2011 - link

    Completely talking out of one's arse.........
  • BSMonitor - Friday, July 15, 2011 - link

    Idk, after years of AMD cpu domination, Intel was more than happy to let everyone talk about Conroe, benchmark it to the public. So much so that it drove up prices of the things when they finally were released.

    The reverse is true now, and I just don't see the same enthusiasm from AMD on Bulldozer. Maybe these 8-cores will be on par with 2600K ?? But Intel is still holding onto 6-core Sandy Bridge.

    Me thinks AMD has another Phenom on its hands. Big, low clock speeds, weaker than expected performance. Eventually, AMD is going to have to improve the performance of it's cores, not just keep adding more crappy ones.
  • saneblane - Friday, July 15, 2011 - link

    i think it's quite clear to everyone that amd went on a whole new level with this design, soo much so that it is even hard to understand how much core the processor actually has. like JF amd said people buy processors not "cores". so if bulldozer die size is smaller than a sandy bridge and use less or equal transistors then amd made the better processor. what we have to look at now is not cores anymore amd could have split 1 large core into 3 instead of 2 and we would be hearing the same arguments, about 3 core vs a single core. what we need to watch is how both companies used the real estate of the die, and who used less and accomplish more made the better cpu.
    and i fully expect an intel processor to copy bullldozer in the near future. cross licensing sucks
  • erikejw - Friday, July 15, 2011 - link

    Cliff notes:
    1. Don't run the Intel 320 SSD in any machines that needs perfect reliability or any kind of mission critical software.
    2. Back up all data on current drives immediately.

    I post it here so maybe some Anandtech guy can address the issue since they seem to be unaware of this for some months reported issue.

    Concerning the reliability of the Intel SSD 320 (and perhaps the 510 too).

    Huge number of complete data losses for users.
    Intel finally admits the problem exists.

    Power failure, instant shut downs causes the issue.

    No reliable information about if it is a firmware issue, design problem(bad design), hardware problem(controller etc, at least running this spec).
    A simple firmware update is most likely to solve the issue eventually

    Erik

    -------------------
    -------------------

    "“Be wary of the new Intel SSD 320 series. Currently, there's a bug in the controller that can cause the device to revert to 8MB during a power failure. AFAIK they have not yet publicly announced it, and won't have a firmware fix ready for release until the end of July.”"

    ---------------------

    http://forums.macrumors.com/showthread.php?t=11858...

    --------------------

    http://www.fudzilla.com/memory/item/23447-intel-co...
    etc
  • BSMonitor - Friday, July 15, 2011 - link

    What does this have to do with AMD Bulldozer?
  • Toadster - Friday, July 15, 2011 - link

    search for "Intel Intelligent Power Node Manager"
  • MilwaukeeMike - Friday, July 15, 2011 - link

    It's what TDP stands for, and I don't think it's in the article. (It's the amount of heat, measured in watts, that must be dissipated by the heatsink to keep the CPU operating safely.) I had to stop reading on page two and leave AT.com to go find the answer. Please explain your acronyms... it's really annoying to read about something and feel too dumb for the article, and it's never a good idea to give readers a reason to leave your website. :)
  • GaMEChld - Saturday, July 16, 2011 - link

    The joy of tabbed browsing.
  • ajlueke - Friday, July 15, 2011 - link

    The sad part about the reality we currently face is there really hasn't been a large increase in CPU performance since the Nehalem launch nearly three years ago.

    AMD's release of the Phenom II line kept them in it, as they were able to offer lesser performance, but at far less cost. SandyBridge changed all that. While again, it doesn't really perform that much better than the high end Nehalems launched three years ago, or that much worse than the 990x, it is far cheaper than those $999 price tags. SandyBridge, by performing as well as the old high end chips and being priced much lower, really eroded any reason at all to buy/build an AMD based system at the enthusiast level.

    Bulldozer, with a street price reported to be around $300 needs to be faster than SandyBridge and needs to launch sooner in Q3, rather than in Q4 (October). If it is only on parity, then the reality would be that AMD was finally able to develop a chip that matches the performance Intel had three years ago. With Ivy Bridge, the successor to the high end throne, set to ascend in Q1 2012 would it then take AMD another three years to match that performance? Seems as though they are falling further and further behind. But, this is all speculation. I suppose we'll see what tomorrow brings.
