Homework: How Turbo Mode Works

AMD and Intel both figured out the practical maximum power consumption of a desktop CPU. Intel actually discovered it first, through trial and error, in the Prescott days. At the high end that's around 130W; for the upper mainstream market it's 95W. That's why all high end CPUs ship with 120 - 140W TDPs.

Regardless of whether you have one, two, four, six or eight cores - the entire chip has to fit within that power envelope. A single core 95W chip gets to have one core eating up all of that power budget. This is where we get very high clock speed single core CPUs from. A 95W dual core processor means that each core has to use less power than the core in the single core 95W processor, so tradeoffs are made: each core runs at a lower clock speed. A 95W quad core processor requires that each core use less power than a core in either the single or dual core 95W processor, resulting in more tradeoffs: each core runs at a lower clock speed than in the 95W dual core processor.
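
To make the arithmetic concrete, here is a minimal sketch that splits a fixed 95W budget across cores. The even split and the function name per_core_budget_watts are illustrative assumptions, not Intel's or AMD's method; real chips also spend part of the budget on caches, memory controllers and other shared logic.

```python
def per_core_budget_watts(tdp_watts: float, core_count: int) -> float:
    """Evenly divide a chip's TDP across its cores.

    Assumption for illustration only: every core gets an equal share and
    shared (uncore) logic consumes nothing.
    """
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    return tdp_watts / core_count


if __name__ == "__main__":
    # Same 95W envelope, different core counts.
    for cores in (1, 2, 4):
        budget = per_core_budget_watts(95, cores)
        print(f"{cores} core(s): {budget:.2f} W per core")
```

The more cores that share the same envelope, the smaller each core's share, which is why the per-core clock speed has to come down.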

The diagram below helps illustrate this:

[Diagram: "TDP" and "Tradeoff" rows compared across Single Core, Dual Core, Quad Core and Hex Core processors]

The TDP is constant: you can't ramp power indefinitely, because you eventually run into cooling and thermal density issues. The variables are core count and clock speed (at least today); if you increase one, you have to decrease the other.

Here's the problem: what happens if you're not using all four cores of the 95W quad core processor? You're only consuming a fraction of the 95W TDP because parts of the chip are idle, but your chip ends up being slower than a 95W dual core processor since it's clocked lower. Consumers thus have to choose between a faster dual core and a slower quad core processor.

A smart processor would realize that its cores aren't frequency limited, just TDP limited. Furthermore, if half the chip is idle then the active cores could theoretically run faster.

That smart processor is Lynnfield.

Intel made a very important announcement when Nehalem launched last year. Everyone focused on cache sizes, performance or memory latency, but the most important part of Nehalem was far more subtle: the Power Gate Transistor.

Transistors are supposed to act as light switches - allowing current to flow when they're on, and stopping the flow when they're off. One side effect of constantly reducing transistor feature size and increasing performance is that current continues to flow even when the transistor is switched off. It's called leakage current, and when you've got a few hundred million transistors that are supposed to be off but are still using current, power efficiency suffers. You can reduce leakage current, but you also impact performance when doing so; the processes with the lowest leakage can't scale as high in clock speed.

Using some clever materials engineering, Intel developed a very low resistance, low leakage transistor that can effectively drop any circuits behind it to near-zero power consumption; a true off switch. This is the Power Gate Transistor.

On a quad-core Phenom II, if two cores are idle, blocks of transistors are placed in the off-state but they still consume power thanks to leakage current. On any Nehalem processor, if two cores are idle, the Power Gate transistors that feed the cores their supply current are turned off and thus the two cores are almost completely turned off - with extremely low leakage current. This is why nothing can touch Nehalem's idle power.
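
The difference between the two approaches can be sketched with a toy power model. Every number here is hypothetical and chosen only to show the shape of the comparison; the function idle_aware_power and its parameters are illustrative names, not measured values for either chip.

```python
def idle_aware_power(
    active_cores: int,
    total_cores: int,
    active_watts: float,
    leakage_watts: float,
    power_gated: bool,
    gated_residual_watts: float = 0.0,
) -> float:
    """Estimate chip power when some cores are idle.

    Toy model (assumption): an idle core without power gating still burns
    `leakage_watts`; a power-gated idle core burns only a small residual.
    """
    if not 0 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 0 and total_cores")
    idle_cores = total_cores - active_cores
    idle_cost = gated_residual_watts if power_gated else leakage_watts
    return active_cores * active_watts + idle_cores * idle_cost


if __name__ == "__main__":
    # Hypothetical figures: 20 W per busy core, 5 W of leakage per idle core.
    without_gate = idle_aware_power(2, 4, 20.0, 5.0, power_gated=False)
    with_gate = idle_aware_power(2, 4, 20.0, 5.0, power_gated=True)
    print(f"two cores idle, no power gate: {without_gate} W")
    print(f"two cores idle, power gated:   {with_gate} W")
```

Whatever the real figures are, the gap between the two results is the power that gating hands back to the rest of the chip.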

Since Nehalem can effectively turn off idle cores, it can free up some of that precious TDP we were talking about above. The next step then makes perfect sense. After turning off idle cores, let's boost the speed of active cores until we hit our TDP limit.
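
The idea of boosting active cores until the TDP limit is reached can be sketched as a simple loop. This is a sketch under stated assumptions, not a description of Intel's actual logic: power is assumed to grow linearly with the clock multiplier, idle cores are assumed to draw nothing, and the names turbo_multiplier, watts_per_core_per_bin and max_extra_bins are hypothetical.

```python
def turbo_multiplier(
    base_multiplier: int,
    active_cores: int,
    tdp_watts: float,
    watts_per_core_per_bin: float,
    max_extra_bins: int,
) -> int:
    """Raise the clock multiplier one step at a time while staying within TDP.

    Assumptions (illustrative only): each active core draws
    multiplier * watts_per_core_per_bin, and power-gated idle cores draw zero.
    """
    multiplier = base_multiplier
    for _ in range(max_extra_bins):
        # Estimate power at the next step up before committing to it.
        next_power = active_cores * (multiplier + 1) * watts_per_core_per_bin
        if next_power > tdp_watts:
            break  # the next step would exceed the power budget
        multiplier += 1
    return multiplier


if __name__ == "__main__":
    # Hypothetical chip: base multiplier 20, 95 W TDP, 1 W per core per step.
    for cores in (4, 2, 1):
        result = turbo_multiplier(20, cores, 95.0, 1.0, max_extra_bins=5)
        print(f"{cores} active core(s): multiplier {result}")
```

With fewer active cores the loop runs out of allowed steps before it runs out of power budget, which matches the intuition that lightly threaded workloads see the largest clock increase.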

On every single Nehalem (Lynnfield included) lie around 1 million transistors (about the complexity of a 486) whose sole task is managing power. This unit turns cores off, underclocks them and is generally charged with making sure that power usage is kept to a minimum. Lynnfield's PCU (Power Control Unit) is largely the same as what was in Bloomfield. The architecture remains the same, although it has a higher sampling rate for monitoring the state of all of the cores and the demands on them.

The PCU is responsible for turbo mode.


343 Comments


  • Genx87 - Tuesday, September 8, 2009

    But after looking at the gaming benchmarks, I am wondering if the i5 is worth the cost to upgrade from an E8400? The best I could come up with from the graphs was the Q9560@3GHz or the E8600. In most of the games they were within a few % points. I'll have to see how the i5 does with the new round of cards from AMD/Nvidia before deciding whether I am going to build a new machine or just upgrade the GPU this winter.
  • Kaleid - Tuesday, September 8, 2009

    Do like I do. Buy a better GPU. I'll stick to my e8400 at least until the 32nm CPUs arrive.

    And according to the guru3d review, overclocking dramatically increases power consumption during load:
    "Once we overclock to 4.1 GHz... the power consumption all of a sudden is 295 Watts (!), so an additional 1200 MHz of power is costing us an additional 133 Watts."
    http://www.guru3d.com/article/core-i5-750-core-i7-...
  • papapapapapapapababy - Tuesday, September 8, 2009

    "the lowest Lynnfield is a faster gaming CPU than Intel's fastest dual-core: the E8600"

    bullshit. the E8600 has higher minimum frame rates, umm, you know, "when it matters the most"

    http://images.anandtech.com/reviews/cpu/intel/lynn...
  • scooterlibby - Tuesday, September 8, 2009

    Nice review. Lynnfield seems like a great deal too for people building a new system, but from a gaming standpoint, I don't see enough performance difference to upgrade my overclocked e8400 setup. Guess it'll be Sandy Bridge for me!
  • rbbot - Tuesday, September 8, 2009

    What is the maximum memory you can fit onto a P55 chipset? I notice the Gigabyte board has 6 DIMMs but their website still says Max 16GB?

    Is there a 16GB chipset limit? Would it increase once those new high-capacity DIMMs from Samsung make an appearance?
  • the machinist - Tuesday, September 8, 2009

    I really don't know what to make of all this. I am about to buy an i7 920 and overclock it to 3.6GHz and then sometime next year upgrade the CPU to the 6-core i9 on LGA 1366. SLI does not interest me... cores/threads and clock speed are my main concern for 3D rendering.

    Is there any reason for someone like me to get this new platform instead? Please advise me.
  • rsher - Tuesday, September 8, 2009

    I wish I had an answer for you. I am in the same situation. If you do get a good reply please post it so I can figure out what to buy. BTW, what is the i9 CPU?
    I have some time before I need to upgrade. Have you considered using the Xeon processors... I use MAX 2010..
    rSher

  • the machinist - Tuesday, September 8, 2009

    rSher, Xeons are overkill these days considering the price premium. Single socket CPUs are so powerful these days that I just don't see the bang for the buck when it comes to Xeons. An overclocked i7 920 matches some of the mid-level Xeons anyway. If I was minting it and rendering only, then I would get a pair of high end Xeons.

    Regarding your other question....
    i9 will be the 6-core version that will come out next year, and you can use it on LGA1366 mobos. I think an 8-core version will come out too. They will be expensive, but by the time I decide to upgrade they should be less expensive.
  • PassingBy - Tuesday, September 8, 2009

    Can get single socket Xeon machines as well. The reason that professional users often prefer them is for ECC support. Up to you whether that matters for your applications. Naturally, for servers, ECC is the norm and that is also the situation for most professional workstations. Xeons can overclock as well, perhaps sometimes even better than the desktop equivalents, but professional users rarely overclock.
  • Ann3x - Tuesday, September 8, 2009

    In some respects a great article. However, the assertion that anything below the top end 1366 CPUs is pointless is pretty absurd.

    As others have stated, the headroom and potential overclock of ANY D0 920 easily beats these new processors.

    As it is, i7s are aimed at enthusiasts. FOR AN ENTHUSIAST *ie someone willing to tweak and OC* the 920 is still by far the best bang for buck choice.

    The new platform is only better if no tweaking is carried out (ie if you're not a technical user).

    Therefore we're left with a column aimed at technical users saying something that is only relevant to non-technical users. At best it's a gross simplification. At worst it's misleading.

    Yes, the new platform is good for the mass market; yes, it's exciting. However, keep some perspective on your audience: the i7 920 is still BY FAR the best performance value for money CPU if you have the knowledge required to get the most out of it (as the majority of people buying X58 do).
