Bulldozer's Power Management

AMD confirmed that the Bulldozer core's power management builds on the power management improvements introduced with the “Llano” CPU. Just like Llano, Bulldozer has a Digital APM Module. The APM module samples a number of performance counter signals, and these samples are used to estimate dynamic power with 98% accuracy. Combine this power estimate with Bulldozer's power gating at the module level and its vastly improved clock gating, and you can start to understand what is possible.
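To make the idea of estimating power from activity signals a bit more concrete, here is a minimal sketch. AMD has not published the model the APM module uses, so the weighted-sum approach, the block names and the per-block wattages below are all assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivitySample:
    """Fraction of cycles (0.0 to 1.0) each block was busy in one sampling interval."""
    integer: float
    fpu: float
    load_store: float
    l2: float


# Hypothetical watts drawn by each block at 100% activity. These are NOT AMD figures.
WEIGHTS_W = {"integer": 6.0, "fpu": 5.0, "load_store": 3.0, "l2": 2.0}


def estimate_dynamic_power(sample: ActivitySample) -> float:
    """Estimate dynamic power as a weighted sum of per-block activity."""
    return sum(weight * getattr(sample, name) for name, weight in WEIGHTS_W.items())


# An integer-only workload leaves the FPU idle, so it contributes nothing here.
integer_only = ActivitySample(integer=1.0, fpu=0.0, load_store=0.5, l2=0.5)
print(estimate_dynamic_power(integer_only))
```

The point of the sketch is only that the estimate follows what the code is actually doing, rather than assuming every block is busy all the time.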


Bulldozer reduces the number of active and power consuming circuits by vastly improved clock gating

If your application runs only one or a few threads on your 8-module, 16-core Interlagos CPU, several of those modules might be power gated. Or, if you run integer-only threads, the fact that quite a few unused parts of the module (i.e. the FPU) will be clock gated might be enough to stay under the configured TDP. In those cases, it won't be necessary to limit the clock speed, and that matters a great deal in the real world.
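The decision described above can be sketched in a few lines. This is not AMD's algorithm: the function name, the assumption that power scales linearly with frequency, and the wattages are all hypothetical, and power-gated modules are simply treated as drawing nothing.

```python
def pick_frequency_mhz(active_modules, watts_per_module_at_max, cap_watts, freqs_mhz):
    """Return the highest frequency whose estimated power fits under the cap.

    Assumes power scales linearly with frequency and that gated modules draw
    no power. Falls back to the lowest frequency if nothing fits.
    """
    f_max = max(freqs_mhz)
    for f in sorted(freqs_mhz, reverse=True):
        estimated = active_modules * watts_per_module_at_max * f / f_max
        if estimated <= cap_watts:
            return f
    return min(freqs_mhz)


FREQS_MHZ = [1800, 2000, 2200, 2400]  # hypothetical p-states

# Two busy modules out of eight: the estimate is far below the cap, so no throttling.
print(pick_frequency_mhz(2, 12.0, 85.0, FREQS_MHZ))

# All eight modules busy: the clock has to come down to stay under the same cap.
print(pick_frequency_mhz(8, 12.0, 85.0, FREQS_MHZ))
```

With only a couple of modules active, the cap never comes into play, which is exactly the lightly threaded case the paragraph above describes.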

In the real world, only a few HPC applications behave like the SPEC CPU rate benchmarks, which spawn threads across all cores. Most server applications do not fully utilize all available cores all the time. Sometimes, only one thread will be really critical and the perceived application performance will depend on it. A little bit later, several threads might demand CPU power (but not all cores will be busy). Only a certain percentage of the time are all the cores used. That is exactly the reason why the cheaper Magny-Cours makes so much sense for HPC applications, yet struggles to keep up with the higher clocked, higher IPC Xeon Westmere cores when running OLTP and ERP applications. Putting a power cap on a Magny-Cours means even lower frequencies, and as a result even higher response times (as we have measured here).

Because power consumption is now measured inside the CPU, Bulldozer will run most server applications at full speed unless you lower the TDP too far. (Obviously, if the TDP is lowered enough, the CPU will not be able to operate at higher frequencies, which degrades response times too.) The maximum throughput will be a little bit lower, but most server applications almost never run at maximum throughput. In fact, maximum throughput only matters for HPC applications and benchmarks. For real human users, response times are the only thing that matters.

The beauty of this new power cap system is that in normal circumstances (e.g. the server is running at 40-70% load), the response times will hardly be any longer. At the same time, the administrator can make sure that the server cluster does not exceed the capacity of the cooling equipment and the power lines.

This TDP Power Cap technology could be very interesting to small and medium businesses too, and not only to owners of large server clusters. TDP Power Cap could be a way to make sure that your colocated servers never exceed the maximum amount of amps allocated to you, and as a result you will not have to pay unexpectedly high electricity bills. However, whether or not this ideal world of low response times and low electricity bills will become a reality for Bulldozer server owners will also depend on the availability of a good and decently priced management software tool that allows the administrator to configure the TDP on all servers simultaneously.
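As a rough illustration of how an amp allocation could translate into a per-socket TDP setting, consider the arithmetic below. Every number is hypothetical, the function name is made up for this example, and the calculation ignores power supply losses and assumes each server's non-CPU draw is a fixed figure.

```python
import math


def per_socket_cap_watts(amps, volts, servers, sockets_per_server, non_cpu_watts_per_server):
    """Split a circuit's power budget evenly into a whole-watt cap per CPU socket."""
    budget_per_server = amps * volts / servers
    cpu_budget = budget_per_server - non_cpu_watts_per_server
    if cpu_budget <= 0:
        raise ValueError("allocation does not even cover the non-CPU components")
    # Round down so the configured caps can never add up to more than the budget.
    return math.floor(cpu_budget / sockets_per_server)


# 16 A at 230 V shared by 8 dual-socket servers that each need 200 W for everything but the CPUs.
print(per_socket_cap_watts(amps=16, volts=230, servers=8,
                           sockets_per_server=2, non_cpu_watts_per_server=200))
```

In this made-up case, 3680 W across eight servers leaves 260 W per server for the CPUs, or 130 W per socket.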

On a standard server, you will get a section in the BIOS that allows you to tweak the TDP in 1W increments (or a maximum of 64 power settings), a good step forward compared to the current p-state setting. But to control a server cluster in an efficient way, good management software is needed. Currently, you have to buy all your servers from the same vendor (HP for example) and then pay for management software such as HP's Insight Control. To really unlock this technology, AMD or one of its partners needs to make sure this kind of software is widely available. Some open source code perhaps?

59 Comments

  • stmok - Friday, July 15, 2011

    Komodo is a CPU that replaces Zambezi. It does not have DX11 IGP. So that should be a "No" in that category. It is not an APU.

    In the "Socket" category, both Trinity and Komodo will use some form of Socket FM infrastructure. AMD currently refers to them as Socket FMx (where x = 1, 2, 3, etc). It doesn't mean that both will use the same socket.

    See a thread I've created at Overclockers.com.au forums regarding AMD's 2012 lines.
    => http://forums.overclockers.com.au/showthread.php?t...
    (I've collected a good number of official and leaked presentation slides.)
  • jjj - Friday, July 15, 2011

    Also about Komodo, it has 10 cores not 8.
  • Kristian Vättö - Friday, July 15, 2011

    All leaked slides suggest 8 cores. If you have something to prove the 10 cores, then please share it with us.
  • TimCh - Friday, July 15, 2011

    Here you go

    http://blogs.amd.com/fusion/2010/11/09/simply-put-...

    No need for leaks.
  • stmok - Friday, July 15, 2011

    The slide you point to is November 9th 2010 for AMD Financial Analyst Day.

    The slide in the thread I've created is dated January 2011 for CES 2011.

    Your info regarding 10 cores is out of date. It's 8 cores for Komodo.
  • inf64 - Friday, July 15, 2011

    No, look closely at the slide. There is a correction at the bottom.
    It says:
    Correction, March 8, 2011

    So they corrected the IGP error in Komodo and corrected the core count number. Now it is 6-10 enhanced/NG Bulldozer cores.
    So yes, Komodo will feature up to 10 Bulldozer+ cores.
  • mino - Friday, July 15, 2011

    In other words, Komodo is the C2012 part for desktops.

    However there is one issue - Komodo will go for AM3+ OR FM1, it is VERY unlikely AMD would go for another socket in 2012.

    And since there is no PCIe in AM3+ while also no IGP on FM1 chip ... it is more likely they go for FM1 with Komodo actually having Display controller but not having a GPU - the same as some embedded Brazos parts today.

    Last (sensible) option is for AMD to go FM1 with the same setup as Lynnfield.
  • Topweasel - Thursday, July 21, 2011

    Nothing you said in this makes sense. AM3+ has PCIe. There isn't an IGP on the chipset for FM1 because no CPU in that socket would be missing it, and Brazos has a barely capable IGP (40SP unit?), but it isn't just some 2D display controller.
  • Kristian Vättö - Saturday, July 16, 2011

    I have updated the article to be up-to-date with the slide you provided.
  • Kristian Vättö - Friday, July 15, 2011

    Even the link you provided suggests that Komodo will feature a DX11 capable IGP. Note that it says GPU for Komodo but nothing for Zambezi.
