Today Intel started talking about its ISSCC plans, and the conference call included some details on Westmere that I didn't previously know. Most of it has to do with power savings, but there was also some talk about 32nm quad-core Westmere derivatives!

Westmere is Intel’s 32nm Nehalem derivative. Take Nehalem with all of its inherent goodness, add AES instructions, build it using 32nm transistors and you’ve got Westmere.

Westmere's Secret: Power Gated Un-Core

We just recently met the first incarnation of Westmere - Clarkdale, the dual-core processor that’s been branded the Core i3 and Core i5. Later this quarter we’ll meet Gulftown, a six-core Westmere that’ll be sold under the Core i7 label. All of that is old news. Now for the new stuff.

With Nehalem, Intel started power gating parts of the chip. Stick a power gate transistor in front of the supply voltage to each core and you can effectively shut off power (including leakage power) to the core when it’s not in use. This was a huge step in increasing power efficiency, something that’s evident when you look at Nehalem idle power numbers.

When you shut off a core you need to save the core’s state so that when it wakes back up it knows what to do next. Remember that powering down these cores can happen dozens of times in the course of a second. The cores can’t wake up in a reboot state; they need to simply shut off when they’re not needed and wake back up to continue work when they are needed.

In Nehalem the core’s state (what instruction it’s going to work on next, data in its registers, etc...) is saved in the last level cache - L3. Unfortunately this means that the L3 cache can’t be powered down when the cores are idle, because that’s where they store their state information. Take this one step further and it also means that Nehalem’s L3 cache wasn’t power-gated.

In Westmere, Intel has added a dedicated SRAM to store core state data. Each core dumps its state information into the dedicated SRAM and then shuts off. With the state data kept out of the L3 cache, Westmere takes the next logical step and power gates the L3.
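To make the difference concrete, here's a toy Python model of the save-then-gate sequence. None of this is Intel's implementation: the `Core` and `Package` classes, the `dedicated_sram` flag and the `next_ip` state entry are all made up for illustration. The only rule it encodes is the one described above: the L3 can be gated once every core is idle, but only if it isn't holding core state.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    core_id: int
    state: dict          # stand-in for registers and the next instruction pointer
    powered: bool = True


@dataclass
class Package:
    cores: list
    dedicated_sram: bool                # hypothetical flag: True models Westmere, False models Nehalem
    l3_powered: bool = True
    saved: dict = field(default_factory=dict)   # saved state, held in SRAM or in L3

    def sleep_core(self, core):
        self.saved[core.core_id] = dict(core.state)   # save state before cutting power
        core.state = {}                               # a gated core retains nothing
        core.powered = False
        all_idle = not any(c.powered for c in self.cores)
        # L3 may be gated only when every core is idle AND it holds no core state.
        if all_idle and self.dedicated_sram:
            self.l3_powered = False

    def wake_core(self, core):
        self.l3_powered = True                        # a running core needs its cache
        core.state = self.saved.pop(core.core_id)     # resume where it left off
        core.powered = True


def idle_package(dedicated_sram, n_cores=2):
    """Build a package and put every core to sleep."""
    cores = [Core(i, {"next_ip": 0x1000 + i}) for i in range(n_cores)]
    pkg = Package(cores, dedicated_sram)
    for core in cores:
        pkg.sleep_core(core)
    return pkg


nehalem_like = idle_package(dedicated_sram=False)
westmere_like = idle_package(dedicated_sram=True)
print(nehalem_like.l3_powered, westmere_like.l3_powered)   # True False
```

With every core asleep, the Nehalem-style package still has to keep its L3 powered, while the Westmere-style one can shut it off. Waking a core restores its saved state in both cases.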

Intel lists this dedicated SRAM as a Westmere-mobile feature, so there’s a chance it’s not present on the desktop chips. But it makes sense. Without a way of powering down the L3 cache, Westmere would be a very power hungry mobile CPU. The dedicated SRAM appears to be what makes it mobile-friendly.

Hex and Quad Core Westmere in 2010?

The last bits of information Intel revealed have to do with its high end desktop/workstation/server intentions with Westmere. The 6-core Westmere is a 240mm² chip made up of 1.17B transistors.

That’s six cores on a single die, but with 12MB of L3 cache. Remember that Nehalem/Lynnfield have 8MB and Clarkdale has 4MB. Nehalem’s chief architect, Ronak Singhal, told me that he wanted to maintain at least 2MB of L3 per core on the die. A 6-core Westmere adheres to that policy.

The chip works in existing LGA-1366 sockets, so you still have three DDR3 memory channels. 6C Westmere supports both regular DDR3 (1.5V) and low voltage DDR3 (1.35V). This is particularly useful in servers where you’ve got a lot of memory present; power consumption should be noticeably lower.
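For a rough sense of scale, here's a back-of-the-envelope sketch. It assumes the switching (dynamic) portion of memory power scales with the square of supply voltage at a fixed frequency, which is the usual first-order CMOS approximation. It's not a measurement of any DIMM, and the function name is just illustrative.

```python
def relative_dynamic_power(v_new, v_ref):
    """Dynamic power at v_new relative to v_ref, assuming P scales with V^2."""
    return (v_new / v_ref) ** 2


ratio = relative_dynamic_power(1.35, 1.5)
print(f"{ratio:.2f}")   # 0.81
```

Under that assumption the switching power drops by roughly 19%. Real modules also burn static and I/O power that doesn't follow the same rule, so the saving at the wall will differ.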

The other big news is that Intel will be releasing 4-core variants of Westmere as well. While I originally assumed this would mean desktop and server, Intel hasn't committed to anything other than a quad-core Westmere. These parts could end up as server only or server and desktop.

The table below shows you the beauty of 32nm. Smaller die, more transistors:

CPU | Codename | Manufacturing Process | Cores | Transistor Count | Die Size
Westmere 6C | Gulftown | 32nm | 6 | 1.17B | 240mm²
Nehalem 4C | Bloomfield | 45nm | 4 | 731M | 263mm²
Nehalem 4C | Lynnfield | 45nm | 4 | 774M | 296mm²
Westmere 2C | Clarkdale | 32nm | 2 | 384M | 81mm²


It also shows that there's a definite need for Intel to build a quad-core 32nm chip. Die sizes nearing 300mm² aren't very desirable. The question is whether we'll see quad-core 32nm in 2010 desktops or if we'll have to wait for Sandy Bridge in 2011 for that.
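The "smaller die, more transistors" point is easy to quantify from the table above. Here's a quick sketch that divides transistor count by die area. The figures are the ones in the table; the `density` helper is just an illustrative name.

```python
# (transistor count, die area in mm^2), taken from the table above
chips = {
    "Gulftown": (1.17e9, 240),
    "Bloomfield": (731e6, 263),
    "Lynnfield": (774e6, 296),
    "Clarkdale": (384e6, 81),
}


def density(transistors, area_mm2):
    """Millions of transistors per square millimeter."""
    return transistors / area_mm2 / 1e6


densities = {name: density(*spec) for name, spec in chips.items()}
for name, d in densities.items():
    print(f"{name:<11}{d:.2f} M/mm^2")

gain = densities["Gulftown"] / densities["Bloomfield"]
print(f"Gulftown vs Bloomfield: {gain:.2f}x")
```

The two 32nm parts land at a little under 5 million transistors per mm², against under 3 million for the 45nm chips, so Gulftown packs roughly 1.75x the density of Bloomfield. Keep in mind that whole-die averages blend cache and logic, which don't pack equally tightly.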

We’ll find out soon enough.


  • ClagMaster - Tuesday, March 23, 2010 - link

    I would like to see a 32nm Quad Core Westmere for Socket 1156 while retaining the on-die memory controller and more capable PCIe controller. I use a discrete graphics card and do not mind the P55 solution. These Quad Cores would be mainstream products but consume less power at the same clock, or operate at greater clock for the same power, and/or be more overclockable.

    A socket 1156 Westmere Quad Core priced between $200 to $300 is useful to this mainstream user while a socket 1366 Hex Core or socket 1156 Dual Core is not useful to me.
  • glenster - Saturday, February 6, 2010 - link

    World's Fastest Graphene Transistor:
    100 billion cycles/second (100 GigaHertz)
  • tygrus - Tuesday, February 9, 2010 - link

    It takes several transistors to make logic gates. It takes several logic gates to make a pipeline stage. Each pipeline stage must be in sync with the core clock. The core clock is buffered and propagated across the core. If the longest path in a pipeline stage takes 15 transistors then you need transistors to be at least 45GHz for a 3GHz clock target (15x3=45). Some stages such as L2 cache access or complex division take several cycles to complete but are still in sync.
  • TheGreek - Thursday, February 4, 2010 - link

    But folders tend to run their machine 100% loaded 24/7/365, so the power savings isn't much of an issue.

    But the sooner Intel lets us know that the chipset is also 32nm the better.
  • georgekn3mp - Thursday, February 4, 2010 - link

    Gulftown (Non-Extreme Edition) I7-970 6-core...should be released by 3Qtr 2010 for $564 bulk. That's 6 cores, clocked at around 3.4GHz at 130 Watts, by fall 2010.

    As opposed to i7-980Xe "Extreme" Gulftown at $999...
  • Robear - Friday, February 5, 2010 - link

    Wasn't it said somewhere that the 6-core extreme would be 2.66GHz? I mean unless they really pulled off something spectacular with westmere it's hard to see 6 cores fitting in a 130W TDP over 3GHz.
  • Mike1111 - Thursday, February 4, 2010 - link

    I think we'll have to wait for 32nm quad core desktop CPUs (Lynnfield successor) until Sandy Bridge in Q1/2011. Everything else doesn't make much sense because Intel has just released or will soon release updates in the other segments (32nm mobile CPUs just released, 32nm high-end desktop and server coming in March). What else could be the first Sandy Bridge chip?
  • Voldenuit - Thursday, February 4, 2010 - link

    Whoa. Most of your systems are idling at/above 100W? @_@ Something is seriously rotten in the state of Denmark here.

    Even the Athlon II X4 955 BE system at SPCR drew only 56W (DC).

    And their intel Core i5 661 system sips a miserly 18W (!) at idle.

    Even factoring in PSU conversion losses, the high idle power draw of your test systems is worrying. Is that due to component choices or turning off power saving features? If the former, it seems rather pointless to be comparing CPU power efficiency when the rest of the system is drawing inordinate amounts of power, and if the latter, why?
  • kmmatney - Thursday, February 4, 2010 - link

    They aren't measuring the same thing, though. The SPCR review is measuring the load on the ATX 12V connector, which is the 4-pin connector just used to provide (or supplement) CPU power. The Anandtech measurements are measured at the wall, and include the PSU, motherboard, and other system devices.
  • Voldenuit - Thursday, February 4, 2010 - link

    SPCR is measuring total system power (DC output) as well as just the CPU+VRM (the ATX12V figure). They report both figures. I fail to see where the confusion lies?
