Will the Real QX9650 Power Consumption Please Stand Up?

The recent attention highlighting the conflicting manners in which Intel and AMD quote processor power consumption figures has generated a fair amount of discussion regarding which methodology more closely represents the truth. In reality, we can see the merits of both systems, and which one is more valuable to the consumer is largely a matter of personal perception.

To this day Intel still provides processor TDP (thermal design power) specifications with a single purpose in mind - ensuring thermal engineers have the information they need to do their jobs. Intel knows companies providing platforms based on their products must be able to design and manufacture cooling solutions capable of dissipating the installed processor's maximum heat production - all without violating temperature limits. Designing around these worst-case values ensures the system remains stable under all computing conditions.

AMD's focus is very different, as they have decided to use this data as a marketing tool. By drawing attention to the processor APC (average power consumption), they hope to appeal to businesses looking to lower costs associated with their data infrastructure's operation. Although most end users find little reason to concern themselves with the relatively small monthly savings that can come from a single system's lowered power consumption, this can be a significant sum of money for corporations supporting hundreds or even thousands of workstations. Then again, any business supporting thousands of computers usually has other concerns. Do they worry more about the $5,000-$10,000 per month (or more) spent on the employee using a workstation, or the $10-$30 spent on the power for the workstation? The greater concern is often whether or not a given location has the capacity to power the workstations, not how much the power will cost.

With that said, we would argue that AMD is pandering to the public while Intel has chosen to support the industry developers. We can all certainly agree though: no matter which method you believe is more correct, quoted numbers are useless without a consistent means for calculation.

Surprisingly, Intel has chosen to leave the QX9650's TDP unchanged from the 65nm QX6850's given value of 130W. This is a little confusing, as Intel's reports on 45nm process technology improvements claim switching energy efficiency increases of up to 30% and a reduction in transistor gate leakage of 10x or more. Could basic core performance improvements such as the incorporation of the new SSE4 instruction set have made such a difference in overall CPU efficiency? We think not. Obviously, there is something more going on here; either the QX6850 TDP is vastly under-rated or the QX9650 is much better than suggested. Perhaps over-rating the QX9650's TDP gives Intel the option of claiming further thermal improvements later. We will share the method we used in calculating a more representative TDP rating for our 45nm quad-core in a moment, but we found the value to be closer to 100W - some 30W less than the figure provided by the blue team, and right in line with the 30% energy efficiency increase.

Determining just how much power a particular processor consumes at a given operating frequency is more difficult than one might originally think. A few different methods exist for estimating these values - one requires nothing more than the processor's specified TDP and then scales this value based on a given overclocked core frequency and voltage. This isn't going to cut it for us, especially considering this technique insists we place full faith in Intel's quoted QX9650 TDP, a number we know to be sometimes unrepresentative of actual operating conditions. Instead, we decided to get creative and develop our own method. The concept is rather simple: we will measure the power draw from the wall, subtract off power consumption for "static" components, correct for PSU efficiency, and then devise a means for determining what portion of the remaining power the CPU consumes, attributing the rest to "everything else." The biggest challenge will be devising a means for accurately determining where to place this final split.
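For reference, the TDP-scaling shortcut mentioned above is commonly written as rated power scaled linearly with frequency and with the square of core voltage. The sketch below assumes that form; the function name, the 1.25V stock voltage, and the 4.0GHz/1.40V overclock are illustrative values, not figures from our testing.

```python
def scaled_power(tdp_w, stock_ghz, stock_v, oc_ghz, oc_v):
    """Estimate power at an overclock from rated TDP.

    Assumes power scales linearly with frequency and with the square
    of core voltage (the usual rule of thumb, not a measured result).
    """
    return tdp_w * (oc_ghz / stock_ghz) * (oc_v / stock_v) ** 2


# Hypothetical example: 130W TDP at 3.0GHz, assumed 1.25V stock voltage,
# pushed to an assumed 4.0GHz at 1.40V.
estimate = scaled_power(130.0, 3.0, 1.25, 4.0, 1.40)
print(f"Estimated power: {estimate:.1f} W")
```

Note that the estimate can only ever be as good as the TDP fed into it, which is exactly why we do not rely on it.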

Choosing the right PSU is vital to the success of our mission, and employing the services of a Corsair unit means we can take advantage of the published efficiency curves. Monitoring and recording total system power draw is easy thanks to our Kill-A-Watt from P3 International. Finally, calculating just how much power is used by static loads (fans, pumps, HDDs, etc.) won't be too hard at all - simply noting the power draw before and after disconnecting the load is quick and effective. Indeed, developing a method for subtracting off power consumed by the motherboard, memory, and graphics card is going to require the most thought.

Our approach is rather unique: we vary CPU frequency and voltage and then plot the base system load delta versus frequency. System load delta is the difference between wall power minus static loads at CPU full load and wall power minus static loads at idle - both corrected for PSU efficiency. Graphing the delta means that the plotted power varies solely as a function of CPU speed. Our immediate results generate the following graph.



Base System Power Delta (W) versus Processor Frequency (GHz)
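As a rough illustration of how each plotted point is derived, the sketch below computes one system load delta from wall readings. The function names and all numbers are hypothetical, and the PSU efficiency is assumed to be a simple fraction read off the published curve at each operating point.

```python
def base_system_power(wall_w, static_w, efficiency):
    """Wall power minus static loads, corrected for PSU efficiency."""
    return (wall_w - static_w) * efficiency


def system_load_delta(wall_load_w, wall_idle_w, static_w, eff_load, eff_idle):
    """Difference between corrected full-load and corrected idle power."""
    load = base_system_power(wall_load_w, static_w, eff_load)
    idle = base_system_power(wall_idle_w, static_w, eff_idle)
    return load - idle


# Hypothetical readings for a single frequency/voltage setting.
delta = system_load_delta(
    wall_load_w=300.0,  # wall draw at CPU full load
    wall_idle_w=200.0,  # wall draw at idle
    static_w=20.0,      # fans, pumps, HDDs measured by disconnecting them
    eff_load=0.80,      # assumed PSU efficiency at the load operating point
    eff_idle=0.80,      # assumed PSU efficiency at the idle operating point
)
print(f"System load delta: {delta:.1f} W")
```

Repeating this at each tested frequency yields the points for the plot.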

Extending the curve downwards allows us to extrapolate our plot's predicted x-intercept. Mathematically, this point represents the frequency at which CPU full load power minus idle power is zero (CPU idle power equals CPU full power). Logic suggests that as the CPU full load power approaches zero so will idle power, especially considering idle power should always be less than the full load power. Without much argument, we find ourselves accepting this as our CPU minimum energy frequency (about 240MHz).
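The extrapolation can be sketched as a curve fit followed by a root solve. The example below assumes a straight-line least-squares fit purely for illustration (the real curve need not be linear), and the data points are hypothetical rather than our measured values.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def zero_delta_frequency(freqs_ghz, deltas_w):
    """Frequency at which the fitted load delta reaches zero."""
    slope, intercept = fit_line(freqs_ghz, deltas_w)
    return -intercept / slope


# Hypothetical (frequency, load delta) points.
freqs = [2.0, 3.0, 4.0]
deltas = [35.0, 55.0, 75.0]
f_min = zero_delta_frequency(freqs, deltas)
print(f"Minimum energy frequency: {f_min * 1000:.0f} MHz")
```

With a higher-order fit the same idea applies; only the root-finding step changes.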

We then plot the base system idle power (wall power minus static loads at idle, again corrected for PSU efficiency) versus the same set of frequencies. Creating a best-fit equation and solving for power using the minimum energy frequency found in the previous graph gives us the base system power consumption with the CPU at its minimum energy (near zero). This is as close as we are going to come to quantitatively determining the idle power consumption of the motherboard, memory, and graphics card combined. We calculated a value of 110W using an ASUS P5E3, 4GB of OCZ DDR3, and a single 8800 Ultra resting comfortably at 500MHz on a 2D desktop. More likely than not the power-hungry GPU and overclocked X38 chipset make up the lion's share of this value.
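A minimal sketch of this step follows: fit the idle readings against frequency, then evaluate the fit at the minimum energy frequency. A linear best fit is assumed for simplicity, and the idle readings are hypothetical values chosen only so the result lands near the 110W figure quoted above.

```python
def platform_idle_power(freqs_ghz, idle_w, min_energy_ghz):
    """Evaluate a least-squares line through the idle points at the
    minimum energy frequency, where CPU power is taken to be near zero."""
    n = len(freqs_ghz)
    mean_f = sum(freqs_ghz) / n
    mean_p = sum(idle_w) / n
    sxx = sum((f - mean_f) ** 2 for f in freqs_ghz)
    sxy = sum((f - mean_f) * (p - mean_p) for f, p in zip(freqs_ghz, idle_w))
    slope = sxy / sxx
    intercept = mean_p - slope * mean_f
    return slope * min_energy_ghz + intercept


# Hypothetical corrected idle readings at three frequencies.
freqs = [2.0, 3.0, 4.0]
idle = [127.5, 137.5, 147.5]
baseline = platform_idle_power(freqs, idle, min_energy_ghz=0.24)
print(f"Motherboard + memory + GPU idle power: {baseline:.1f} W")
```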

With this number our goal is in sight - subtracting this value from the corrected base system idle values used in the creation of our second graph (not shown) gives us the CPU idle power points. Adding to these the base system load deltas plotted in the figure above provides us with the final unknowns - actual CPU full load power values. Joining these two sets of data points gives us the data below.


Core 2 Extreme QX9650 Power Consumption
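The final split reduces to two lines of arithmetic per data point, sketched below. The function name and sample values are hypothetical; the approach assumes, as our method does, that the whole load delta is attributable to the CPU.

```python
def cpu_power_points(base_idle_w, load_delta_w, platform_idle_w):
    """Return (cpu_idle, cpu_full_load) pairs for each tested frequency.

    cpu_idle      = corrected base system idle power - platform idle power
    cpu_full_load = cpu_idle + system load delta
    """
    points = []
    for idle, delta in zip(base_idle_w, load_delta_w):
        cpu_idle = idle - platform_idle_w
        points.append((cpu_idle, cpu_idle + delta))
    return points


# Hypothetical inputs for two frequency settings.
points = cpu_power_points(
    base_idle_w=[127.5, 137.5],  # corrected base system idle power
    load_delta_w=[35.0, 55.0],   # system load deltas at the same settings
    platform_idle_w=110.0,       # motherboard + memory + GPU baseline
)
for cpu_idle, cpu_load in points:
    print(f"CPU idle {cpu_idle:.1f} W, CPU full load {cpu_load:.1f} W")
```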

This graph rewards our hard work by providing us with a wealth of knowledge. Although the values above are highly precise only when referring to our particular QX9650 (every CPU is a little different), the accuracy of our method should be very good. The first thing we notice is how full load power consumption quickly increases to almost uncontrollable levels when pushing above ~4.25GHz. In fact, we found achieving true stability any higher nearly impossible with water-cooling alone. Users who are familiar with overclocking will recognize this trend: a CPU usually scales extremely well up to a certain point, beyond which any further overclocking requires an excessive amount of additional voltage. This is normal and exactly where this point lies is a function of cooling and silicon limits (i.e. process technology) and in reality has very little to do with luck.



An example of the findings that can be developed by measuring system response

We made this last graph using our CPU full load power consumption/frequency pair data points from the figure above. Based on our examinations, just about every Core 2 Extreme QX9650 owner should have little trouble accomplishing at least a 3.9GHz overclock. Upgrading the cooling used will at best provide modest improvements as these quad-cores begin to generate a lot of heat when pushed much further.


56 Comments


  • Lifted - Wednesday, December 19, 2007

    Very impressive. Seems more like a thesis paper than a typical tech site article. While the content on AT is of a higher quality than the rest of the sites out there, I think the other authors, founder included, could learn a thing or two from an article like this. Less commentary/controversy and more quality is the way to go.
  • AssBall - Wednesday, December 19, 2007

    Shouldn't page 3's title be "Exploring the limits of 45nm Hafnium"? :D

    http://www.webelements.com/webelements/elements/te...
  • lifeguard1999 - Wednesday, December 19, 2007

    "Do they worry more about the $5000-$10000 per month (or more) spent on the employee using a workstation, or the $10-$30 spent on the power for the workstation? The greater concern is often whether or not a given location has the capacity to power the workstations, not how much the power will cost."

    For High Performance Computers (HPC a.k.a. supercomputers) every little bit helps. We are not only concerned about the power from the CPU, but also the power from the little 5 Watt Ethernet port that goes unused, but consumes power. When you are talking about HPC systems, they now scale into the tens-of-thousands of CPUs. That 5 Watt Ethernet port is now a 50 KWatt problem just from the additional power required. That Problem now has to be cooled as well. More cooling requires more power. Now can your infrastructure handle the power and cooling load, or does it need to be upgraded?

    This is somewhat of a straw-man argument since most (but not all) HPC vendors know about the problem. Most HPC vendors do not include items on their systems that are not used. They know that if they want to stay in the race with their competitors that they have to meet or exceed performance benchmarks. Those performance benchmarks not only include how fast it can execute software, but also how much power and cooling and (can you guess it?) noise.

    In 2005, we started looking at what it would take to house our 2009 HPC system. In 2007, we started upgrades to be able to handle the power and cooling needed. The local power company loves us, even though they have to increase their power substation.

    Thought for the day:
    How many car batteries does it take to make a UPS for a HPC system with tens-of-thousands of CPUs?
  • CobraT1 - Wednesday, December 19, 2007

    "Thought for the day:
    How many car batteries does it take to make a UPS for a HPC system with tens-of-thousands of CPUs?"

    0.

    Car batteries are used in neither static nor rotary UPSs.
  • tronicson - Wednesday, December 19, 2007

    this is a great article - very technical, will have to read it step by step to get it all ;-)

    but i have one question that remains for me.. what about electromigration with the very delicate 45nm structures? we have here new materials like the hafnium based high-k dielectric, guess this may improve the resistance against em... but how far may we really push this cpu until we risk very short life and destruction? intel gives a headroom until max 1.3625V .. well what can i risk to give with a good waterchill? how far can i go?

    i mean feeding a 45nm core p.ex. 1,5V is the same as giving a 65nm 1,6375! would you do that to your Q6600?
  • eilersr - Wednesday, December 19, 2007

    Electromigration is an effect usually seen in the interconnect, not in the gate stack. It occurs when a wire (or material) has a high enough current density that the atoms actually move, leading to an open circuit, or in some cases, a short.

    To address your questions:
    1. The high-k dielectric in the gate stack has no effect on the resistance of the interconnect
    2. The finer features of wires on a 45nm process do have a lower threshold to electromigration effects, ie smaller wires have a lower current density they can tolerate before breaking.
    3. The effects of electromigration are fairly well understood at this point, there are all kinds of automated checks built in to the design tools before tapeout as well as very robust reliability tests performed on the chips prior to volume production to catch these types of reliability issues.
    4. The voltage a chip can tolerate is limited by a number of factors. Ignoring breakdown voltages and other effects limited by the physics of transistor operation, heat is where most OC'ers are concerned. As power dissipation is most crudely thought of in terms of CV^2f (capacitance times voltage-squared times frequency), the reduced capacitance in the gate due to the high-k dielectric does dramatically lower power dissipation, and is well cited. The other main component in modern CPUs is the leakage, which again is helped by the high-k dielectric. So you should expect to be able to hit a bit higher voltage before hitting a thermal envelope limitation. However, the actual voltage it can tolerate is going to depend on the CPU and what corner of the process it came from. In all, there's no general guideline for what is "safe". Of course, anything over the recommended isn't "safe", but the only way you'll find out, unfortunately, is trial and error.
  • eilersr - Wednesday, December 19, 2007

    Doh! Just noticed my own mistake:
    high-k dielectric does not reduce capacitance! Quite the contrary, a high-k dielectric will have higher capacitance if the thickness is kept constant. Don't know what I was thinking.

    Regardless, the capacitance of the gate stack is a factor, as the article mentioned. I don't know how the cap of Intel's 45nm gate compares with that of their 65nm gate, but I would venture it is lower:

    1. The area of the FET's is smaller, so less W*L parallel plate cap.
    2. The thickness of the dielectric was increased. Usually this decreases cap, but the addition of high-k counter acts that. Hard to say what balance was actually achieved.

    This is just a guess, only the process engineers know for sure :)
  • kjboughton - Wednesday, December 19, 2007

    Asking how much voltage can be safely applied to a (45nm) CPU is a lot like asking which story of a building you can jump from without the risk of breaking both legs on the landing. There's inherent risk in exceeding the manufacturer's specification at all and if you asked Intel what they thought I know exactly what they would say -- 1.3625V (or whatever the maximum rated VID value is). The fact of the matter is that choices like these can only be made by you. Personally, I feel exceeding about 1.4V with a quad 45nm CPU is a lot like beating your head against a wall, especially if your main concern is stability. My recommendation is that you stay below this value, assuming you have adequate cooling and can keep your core temperatures in check.
  • renard01 - Wednesday, December 19, 2007

    I just wanted to tell you that I am impressed by your article! Deep and practical at the same time.

    Go on like this.

    This is an impressive CPU!!

    regards,
    Alexander
  • defter - Wednesday, December 19, 2007

    People stop posting silly comments like: "Intel's TDP is below real power consumption, it isn't comparable to AMD's TDP".

    Here we have a 130W TDP CPU consuming 54W under load.