More Performance Please!

Now let's push the processor cores to their best performance.

Fritzmark integer processing: max performance

When the going gets tough, the tough get going. The L3426 has a very tight TDP limit with no headroom for any extra Turbo Boost action. The X3470 has a TDP limit with a large margin, and as a result it's still capable of boosting the clock speed a bit higher (3.066GHz) when running eight threads.

At the same time, this graph shows how superior the integer engine of the "Nehalem" based cores is over AMD. A 1.9GHz quad-core offers about the same performance as a 2.9GHz quad-core. It also shows that for - well multi-threaded - integer applications, AMD's six-core was a decent countermeasure.

integer processing: full load power

Turbo Boost delivers better performance, but comes with a power price. The influence is not really shocking at the system level - 9% higher power - but it is significant if you only look at the processor power level. We are still working on our methodology to measure power at the component level, but looking at the idle power and the spec sheets we can estimate CPU power rather well. Looking at the CPU level, Turbo Boost probably needs from 15% to 17% more power to the CPU VRMs.

How Much Power? Overview
Comments Locked

35 Comments

View All Comments

  • n0nsense - Monday, January 18, 2010 - link

    Here is what system sees ...
    only one is 2.5, other three are 2.0 :)

    nons ~ # cat /proc/cpuinfo
    processor : 0
    vendor_id : GenuineIntel
    cpu family : 6
    model : 23
    model name : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
    stepping : 7
    cpu MHz : 2497.000
    cache size : 3072 KB
    physical id : 0
    siblings : 4
    core id : 0
    cpu cores : 4
    apicid : 0
    initial apicid : 0
    fpu : yes
    fpu_exception : yes
    cpuid level : 10
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm tpr_shadow vnmi flexpriority
    bogomips : 5009.38
    clflush size : 64
    cache_alignment : 64
    address sizes : 36 bits physical, 48 bits virtual
    power management:

    processor : 1
    vendor_id : GenuineIntel
    cpu family : 6
    model : 23
    model name : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
    stepping : 7
    cpu MHz : 1998.000
    cache size : 3072 KB
    physical id : 0
    siblings : 4
    core id : 1
    cpu cores : 4
    apicid : 1
    initial apicid : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 10
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm tpr_shadow vnmi flexpriority
    bogomips : 7012.69
    clflush size : 64
    cache_alignment : 64
    address sizes : 36 bits physical, 48 bits virtual
    power management:

    processor : 2
    vendor_id : GenuineIntel
    cpu family : 6
    model : 23
    model name : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
    stepping : 7
    cpu MHz : 1998.000
    cache size : 3072 KB
    physical id : 0
    siblings : 4
    core id : 2
    cpu cores : 4
    apicid : 2
    initial apicid : 2
    fpu : yes
    fpu_exception : yes
    cpuid level : 10
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm tpr_shadow vnmi flexpriority
    bogomips : 5009.08
    clflush size : 64
    cache_alignment : 64
    address sizes : 36 bits physical, 48 bits virtual
    power management:

    processor : 3
    vendor_id : GenuineIntel
    cpu family : 6
    model : 23
    model name : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
    stepping : 7
    cpu MHz : 1998.000
    cache size : 3072 KB
    physical id : 0
    siblings : 4
    core id : 3
    cpu cores : 4
    apicid : 3
    initial apicid : 3
    fpu : yes
    fpu_exception : yes
    cpuid level : 10
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm tpr_shadow vnmi flexpriority
    bogomips : 5009.09
    clflush size : 64
    cache_alignment : 64
    address sizes : 36 bits physical, 48 bits virtual
    power management:
  • VJ - Tuesday, January 19, 2010 - link

    These are mobile CPUs, however:

    With Linux on a Latitude (Intel T7200 or T7500), CPU Frequency Scaling Monitor allows one to scale the frequency of one core to its max while leaving the other core at its minimum.

    With an AMD TL62, this is not possible. The induced scaling of one core causes the frequency of the other core to follow.

    With an AMD ZM84 this is possible. Just like with the Latitude, one can have one core at its max with the other core at its minimum.

    Maybe what's shown is not what's taking place.

    Additionally;

    http://www.intel.com/technology/itj/2006/volume10i...">http://www.intel.com/technology/itj/200...al_Manag...

    "For example, in a Dual-Processor system, when the OS decides to reduce the frequency of a single core, the other core can still run at full speed. In the Intel Core Duo system, however, lowering the frequency to one core slows down the other core as well."


  • VJ - Tuesday, January 19, 2010 - link

    Additionally; AMD's ZM84 allows each core to operate at different frequencies. The lowest frequency is 575Mhz while the highest is 2300Mhz.

    I can set one core to 1150Mhz with the other set at 2300Mhz. This is different from the Intel (Mobile) CPUs I've come across where a difference in frequency between cores is only possible when one core is (seemingly) operating at its lowest frequency (in a dual core system).

    What is also interesting from aforementioned cpuinfo output is that only core is running at its max frequency while all (3) other cores are (seemingly) at their minimum frequency. Considering my previous conjecture on C2 and C0 states, it would be surprising if one can show cpuinfo output where 2 cores are running at max frequency while the other 2 cores are running at any frequency other than max frequency. That shouldn't be possible at all.

  • valnar - Thursday, May 6, 2010 - link

    Does anyone know if this kind of power management for Lynnfield processors is available in Windows 2003?
  • hshen1 - Sunday, June 23, 2013 - link

    This is really a good article for power management researchers like me!!

Log in

Don't have an account? Sign up now