Power Management in Windows Server 2008 SP2

Enabling the C-states in ESX 5i might bring the Opteron 6276 an improved performance/watt ratio. The question is whether the low power consumption at light loads will negate the performance impact. Although power consumption is lowered by using the "C-state enable" tweak, it is not spectacular: 10% lower energy consumption in idle will probably not give the Opteron 6276 an amazing performance/watt ration in ESXi. The impact of this tweak will make a difference in our EWL testing, not in the "full speed ahead" benchmarks. Also, our vApus FOS EWL testing showed that the Xeon consumed 25% less energy, so it will remain ahead.

As the virtualization benchmarks require more time to run, we will have to delay investigating them for a later article. But what about Windows 2008 R2? The idle power of the Opteron 6276 was excellent there. So which power policy should be chosen in Windows 2008? We compared Opteron performance in "High performance" to the Opteron 6276 performance when the power management policy was set to "Balanced.

  Opteron 6276
"High Performance"
Opteron 6276
"High Performance"
+ C6 enable.
Xeon X5670
"High performance" vs.
Xeon X5670 "Balanced"
Cinebench Single-threaded +16% +18% +1%
Cinebench Multi-threaded +5% +5% +1%
Blender +4% +13% +1%
Encryption/Decryption AES +43% / +42% +43% / +44% +28% / +28%
Encryption/Decryption Twofish/Serpent +8% / +8% +8 / +8% +0 / +0%
Compression/decompression +9% / +4% +9 / +4% +0 / +2%

If we combine the our idle power consumption measurements with these numbers, things get a lot clearer. The "balanced" power policy disables turbo. Therefore, the maximum performance boost from enabling "high performance" should be 13%. The TrueCrypt benchmarks show much larger increases (see (*)), which we honestly don't understand. The performance boost (40%) is only possible if the CPU boosts to 3.2GHz, but that is not supposed to happen. First, the TrueCrypt software is well threaded and uses all clusters (32 threads). Second, we disabled C6, so normally the CPU is not able to boost to 3.2GHz. Third, our monitoring clearly indicated a 2.6GHz clock as expected.

We also did a quick x264 4.0 benchmark (1st pass) which is lightly threaded and showed the same performance (46%!) increase by simply switching from "Balanced" to "High performance" (turbo limited to 2.6GHz, no C6). The Xeon only got a 13% increase in performance..

Closer monitoring reveals that "Balanced" frequently reduces the cores to 1.4GHz. So we have a similar situation as the one where we found power management problems on the AMD "Istanbul" Opteron when the power policy was set to "Balanced".

Basically "Balanced" brings the clock speed down to a low P-state even when a thread is demanding the maximum processing power. Or in other words, the power manager is too eager to bring the clock speed down instead of looking ahead: the polling is blind for the very near future. The result is that quite often the workload gets processed at 1.4GHz (for a short time).

In contrast, the high performance setting does not make use of frequency scaling besides Turbo. So the CPU runs at 2.3GHz at the very minimum and frequently reaches 2.6GHz. So if you buy an Opteron 6200 server, it is strongly advised to chose the "High Performance" setting. Under light load, the balanced power manager saves a few percentage of power running idle, but in our opinion, it is not worth the large performance degradation. Notice also that the Xeon hardly suffers from the same problem with the exception of the AES-NI enabled TrueCrypt bench, and even then the performance impact is significantly lower.

In a nutshell: the power policy "Balanced" strongly favors the Xeon as the performance impact is non-existent or much lower. Let us see some raw performance numbers.

Measuring Real-World Power Consumption, Part Two Rendering Performance: Cinebench
Comments Locked

106 Comments

View All Comments

  • Kevin G - Tuesday, November 15, 2011 - link

    I'm curious if CPU-Z polls the hardware for this information or if it queries a database to fetch this information. If it is getting the core and thread count from hardware, it maybe configurable. So while the chip itself does not use Hyperthreading, it maybe reporting to the OS that does it by default. This would have an impact in performance scaling as well as power consumption as load increases.
  • MrSpadge - Tuesday, November 15, 2011 - link

    They are integer cores, which share few ressources besides the FPU. On the Intel side there are two threads running concurrently (always, @Stuka87) which share a few less ressources.

    Arguing which one deserves the name "core" and which one doesn't is almost a moot point. However, both designs are nto that different regarding integer workloads. They're just using a different amount of shared ressources.

    People should also keep in mind that a core does not neccessaril equal a core. Each Bulldozer core (or half module) is actually weaker than in Athlon 64 designs. It got some improvements but lost in some other areas. On the other hand Intels current integer cores are quite strong and fat - and it's much easier to share ressources (between 2 hyperthreaded treads) if you've got a lot of them.

    MrS
  • leexgx - Wednesday, November 16, 2011 - link

    but on Intel side there are only 4 real cores with HT off or on (on an i7 920 seems to give an benefit, but on results for the second gen 2600k HT seems less important)

    where as on amd there are 4 cores with each core having 2 FP in them (desktop cpu) issue is the FPs are 10-30% slower then an Phenom cpu clocked at the same speed
  • anglesmith - Tuesday, November 15, 2011 - link

    which version of windows 2008 R2 SP1 x64 was used enterprise/datacenter/standard?
  • Lord 666 - Tuesday, November 15, 2011 - link

    People who are purchasing SB-E will be doing similar stuff on workstations. Where are those numbers?
  • Kevin G - Tuesday, November 15, 2011 - link

    Probably waiting in the pipeline for SB-E base Xeons. Socket LGA-2011 based Xeon's are still several months away.
  • Sabresiberian - Tuesday, November 15, 2011 - link

    I'm not so sure I'd fault AMD too much because 95% of the people that their product users, in this case, won't go through the effort of upgrading their software to get a significant performance increase, at least at first. Sometimes, you have to "force" people to get out of their rut and use something that's actually better for them.

    I freely admit that I don't know much about running business apps; I build gaming computers for personal use. I can't help but think of my Father though, complaining about Vista and Win 7 and how they won't run his old, freeware apps properly. Hey, Dad, get the people that wrote those apps to upgrade them, won't you? It's not Microsoft's fault that they won't bring them up to date.

    Backwards compatibility can be a stone around the neck of progress.

    I've tended to be disappointed in AMD's recent CPU releases as well, but maybe they really do have an eye focused on the future that will bring better things for us all. If that's the case, though, they need to prove it now, and stop releasing biased press reports that don't hold up when these things are benched outside of their labs.

    ;)
  • JohanAnandtech - Tuesday, November 15, 2011 - link

    The problem is that a lot of server folks buy new servers to run the current or older software faster. It is a matter of TCO: they have invested a lot of work into getting webapplication x.xx to work optimally with interface y.yy and database zz.z. The vendor wants to offer a service, not a the latest technology. Only if the service gets added value from the newest technology they might consider upgrading.

    And you should tell your dad to run his old software in virtual box :-).
  • Sabresiberian - Wednesday, November 16, 2011 - link

    Ah I hadn't thought of it in terms of services, which is obvious now that you say it. Thanks for educating me!

    ;)
  • IlllI - Tuesday, November 15, 2011 - link

    amd was shooting to capture 25% of the market? (this was like when the first amd64 chips came out)

Log in

Don't have an account? Sign up now