Real World Power

In the real world you do not run your virtualized servers at their maximum just to measure the potential performance. Neither do they run idle. The user base will create a certain workload and expect this workload to be performed with the lowest response times. The service provider (that is you!) wants the server to finish the job with the least amount of energy consumed. So the general idea behind this new benchmark scenario is that each server runs exactly the same workload and that we then measure the amount of energy consumed. It is similar to our previous article about server power consumption, but the methodology has been enhanced.

We made a new benchmark scenario. In this scenario, we changed three things compared to the vApus Mark II scenario:

  1. The number of users or concurrency per VM was lowered significantly to throttle the load
  2. The OLTP VMs are omitted
  3. We ran with two tiles

vApus Mark II loads the server with up to 800 users per second on the OLAP test, up to 50 users per second on the website, and the OLTP test is performing transactions as fast as it can. The idea is to give the server so much work that it is constantly running at 95-99% CPU load, allowing us to measure throughput performance quite well. vApus Mark II is designed as a CPU/memory benchmark.

To create a real world “equal load” scenario, we throttle the number of users to a point where you typically get somewhere between 30% and 60% CPU load on modern servers. As far as we know, we cannot throttle our OLTP VM (Swingbench), so we discarded the OLTP VM in this test. If we let the OLTP test run at maximum speed, the OLTP VM would completely dominate the measurements.

We run two tiles with 14 vCPUs each (eight vCPUs for OLAP, three webservers with two vCPUs per tile), so in total 28 virtual CPUs are active. There are some minor tasks in the background: a very lightly loaded Oracle database that feeds the three websites (one per tile), the VMware console (which idles most of the time), and of course the ESX hypervisor kernel. So all in all, you have a load on about 30-31 vCPUs. That means that some of the cores of the server system will be idling, just like in the real world. On the HP DL380 G7, this “equal workload” benchmark gives the following CPU load graph:

On the Y-axis is CPU load, and on the X-axis is the periodic CPU usage. ESXtop was set up to measure CPU load every five seconds. Each test was performed three times: two times to measure performance and energy consumption, and the third time we did the same thing but with extensive ESXtop monitoring. To avoid having the CPU load in the third run much higher than the first two, we measured every five seconds. We measure the energy consumption over 15 minutes.
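To make the unit of the energy results concrete, here is a minimal sketch of how fixed-interval power readings turn into an energy figure in Wh. It is not the measurement setup used for this article: it assumes, hypothetically, a power meter that delivers one reading in watts per interval, and the function name `energy_wh` and the sample values are invented for illustration.

```python
def energy_wh(power_samples_w, interval_s):
    """Integrate fixed-interval power readings (W) into energy (Wh).

    Assumption: each reading is treated as constant over its interval
    (rectangle rule); a real meter and logging setup may differ.
    """
    joules = sum(power_samples_w) * interval_s  # W * s = J
    return joules / 3600.0                      # 3600 J = 1 Wh


# Hypothetical run: a constant 400 W draw sampled every 5 s for 15 minutes.
samples = [400.0] * (15 * 60 // 5)
total_wh = energy_wh(samples, interval_s=5)
average_w = total_wh / (15 / 60)  # Wh divided by hours gives average watts
```

Read the other way, any energy figure measured over 15 minutes can be multiplied by four to get the average power draw in watts during the run.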

  vApus Mark   

Again, the dual Opteron numbers are somewhat high as we are running them in a quad socket machine. A Dell R715 is probably going to consume about 5% less. If we get the chance, we'll verify this. But even if the dual Opterons are not ideal measurements in comparison to the dual Xeon, they do give us interesting info.

Two Opteron CPUs consume 26.5 Wh (96.7 - 70.2). If we extrapolate, this means roughly 55% (53 Wh out of 97 Wh) of the total energy in our quad Opteron server is consumed by the four processors. Notice also that despite the small power handicap of the Opteron (a dual socket server will consume less), it was able to stay close to the Xeon X5670 based server when comparing maximum power (360W vs 330W). But once we introduce a 30-50% load, the gap between the dual Opteron setup and dual Xeon setup widens. In other words, the Opteron and Xeon are comparable at high loads, but the Xeon is able to save more power at lower loads. So there is still quite a bit of room for improvement: power gating will help the “Bulldozer” Opteron drive power consumption down at lower load.
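The extrapolation above can be written out as a short calculation. This is only a sketch of the arithmetic: it assumes that the entire difference between the quad and dual configuration is due to the two extra CPUs, and the function name `cpu_energy_share` is hypothetical.

```python
def cpu_energy_share(quad_wh, dual_wh):
    """Estimate the CPU share of total energy in the quad socket setup.

    Assumption: the energy difference between the quad and dual
    configuration comes entirely from the two extra CPUs.
    """
    pair_wh = quad_wh - dual_wh   # energy used by two CPUs
    four_wh = 2 * pair_wh         # extrapolated to all four CPUs
    share = four_wh / quad_wh     # fraction of the quad server total
    return pair_wh, four_wh, share


# Energy figures from the text: 96.7 Wh (quad) and 70.2 Wh (dual).
pair_wh, four_wh, share = cpu_energy_share(96.7, 70.2)
print(f"{pair_wh:.1f} Wh per CPU pair, {four_wh:.1f} Wh for four CPUs, {share:.0%} of total")
```

The remaining 45% or so covers everything else in the chassis, such as memory, fans, disks, and power supply losses.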

Ok, enough interesting tidbits, who has the best performance per watt ratio?

51 Comments

  • JohanAnandtech - Friday, September 10, 2010 - link

    Thanks, I appreciate that you took the time to let us know. We went through 5 weeks of intensive testing and my eyes still hurt from looking at the countless Excel sheets, with endless power and response time readings. ;-)
  • FourthLiver - Thursday, September 9, 2010 - link

    at the end of page 12, you allude to a performance per watt analysis. looks like you forgot to put it up. i'm chomping at the bit to see those numbers!

    please disregard me if i failed to rtfa correctly. Anandtech is the best; your (all of you collectively) articles are brilliant and correct down to the smallest details. This is another article that was an absolute joy to read. :]
  • JohanAnandtech - Thursday, September 9, 2010 - link

    Well, you can't really calculate it, as it depends on the situation. At low loads, the system that consumes the least is the winner, on the condition that the response times stay low. But of course, if your systems are running at low load all the time, there might be something wrong: you should have bought more RAM and consolidated more VMs per system.

    At higher loads, the power consumption at high load divided by the throughput (vApusmark) is close to the truth. But it is definitely not the performance/watt number for everyone.

    It depends on your workloads. The more critical processing power (think response time SLA) is, the more the last mentioned calculation makes sense. The more we are talking about lots of lightly loaded VMs (like authentication servers, fileservers etc.), the more simply looking at the energy consumed on page 12 makes sense.
  • mino - Thursday, September 9, 2010 - link

    First, congratulations on a great article!

    Now to the small amount of mess in there:
    "the CPUs consume more than the ACP ratings that AMD mentions everywhere"

    1) Average CPU Power (ACP) is NOT supposed/marketed to represent 100% load power use
    Wikipedia: "The average CPU power (ACP), is a scheme to characterize power consumption of new central processing units under "average" daily usage..."

    2) 122W at the wall and 110W at the CPU??? Are you telling us the PSUs are at 95% along with VRM/power/fans at 95% efficiency? (0.95*0.95*1.22=1.10)
    Sorry to spoil the party but that is NOT the case. 122W at the wall means 100W at the CPU at the most, realistically 95W.

    Otherwise great work. Keep it up!
  • JohanAnandtech - Friday, September 10, 2010 - link

    "1) Average CPU Power (ACP) is NOT supposed/marketed to represent 100% load power use
    Wikipedia: "The average CPU power (ACP), is a scheme to characterize power consumption of new central processing units under "average" daily usage...""

    You are right. But what value does it have? As an admin I want to know what the maximum could be realistically (TDP is the absolute maximum for non-micro periods) and if you read between the lines that is more or less what AMD communicated (see their white paper). If it is purely "average", it has no meaning, because average power can be quite a bit lower as some servers will run at 30% on average, others at 60%.

    These PSUs are supposed to be 92-94% efficient and AFAIK the VRMs are at least 90% efficient. So 122 x 0.92 x 0.90 = 101 W.
  • mino - Saturday, September 11, 2010 - link

    Well, I was a bit unslept when writing it, but anyway. So I got a bit harsher than I should have.

    In my experience the ACP values pretty well represent your average loaded server (<= 80% load). But that is not the point.

    AMD created ACP in response to the fact that their TDP numbers are conservative while Intel's are optimistic. That was the main cause, very well known to you as well.

    Call me an ass but I certainly do not remember AT bitching about Intel TDPs not being representative (during the last 6 years at least).
    And we all know too well that those NEVER represented the real power use of their boxen nor did they EVER represent what the "TDP" moniker stands for.

    Currently the situation is as such that identical 2P AMD box with 80W ACP has ~ the same power requirements as 2P Intel box with 80W TDP. You have just proven that.

    Therefore I believe it would be fair to stop bitching about AMD (or Intel) cheating in marketing (both do) and just say whether the numbers are comparable or not.
    Arguing about spin wattage is not really needed.
  • JohanAnandtech - Monday, September 13, 2010 - link

    "Arguing about spin wattage is not really needed. "

    I have to disagree. The usual slogan is "don't look at TDP, look at measurements". What measurements? The totally unrealistic SPECpower numbers?

    It is impossible for review sites to test all CPUs. So it is up to vendors to give us a number that does not have to be accurate to within a few percent, but that lets us select CPUs quickly.

    Customers should have one number that allows them to calculate worst case numbers which are realistic (a heavily loaded webserver for example, not a thermal virus). So all CPU vendors should agree on a standard. That is not bitching, but a real need of the sysadmins out there.
  • mino - Thursday, September 9, 2010 - link

    One thing I would love to see is having the lowest end HP server put through its paces.
    So far it seems to us the best option for vCenter hosting in small environments (with FT VMs hosting vCenter).

    Maybe even run 1-tile vAPUS (v1? perhaps) on it ?
  • m3rdpwr - Thursday, September 9, 2010 - link

    I would have preferred to have had the DL385 G7 compared.
    They can be had with 8 and 12 core CPUs.

    We have close to 200 HP servers of all models, rack and blades.
    Many are running VMs in our Data Center.

    -Mario
  • duploxxx - Friday, September 10, 2010 - link

    same here, we also moved to the 385 G7 with the new 8-12 core CPUs. Nice servers with a huge core count, since we never run more vCPUs than pCPUs in a system. The Dell 815 looks like a good solution also; it was mentioned in the review that the BL685 and DL585 are way more expensive.
