Multiple Clock Domains

Functionally there are some basic differences between Nehalem and previous Intel architectures. The Front Side Bus is gone, replaced by Intel's QuickPath Interconnect, similar to AMD's HyperTransport. The QPI implementation on the first Nehalem is a 25.6GB/s interface, which matches up perfectly with Nehalem's 25.6GB/s of memory bandwidth.
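The match is easy to verify with a little arithmetic. The sketch below assumes the launch specs: a 6.4GT/s QPI link carrying 2 bytes per transfer in each direction, and three channels of DDR3-1066 at 8 bytes per channel:

```python
# QPI on the first Nehalem parts: 6.4 GT/s, 16-bit data payload per direction.
qpi_gt_per_s = 6.4
qpi_bytes_per_transfer = 2
qpi_bw = qpi_gt_per_s * qpi_bytes_per_transfer * 2  # both directions -> GB/s

# Memory: three channels of DDR3-1066, 8 bytes per channel per transfer.
mem_mt_per_s = 1066.66
channels = 3
mem_bw = mem_mt_per_s * 8 * channels / 1000         # GB/s

print(qpi_bw, round(mem_bw, 1))  # 25.6 25.6
```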

The CPU operates at a multiple of the base clock (BCLK), which in this case is 133MHz. The top-bin Nehalem runs at 3.2GHz, or 133MHz x 24. The L3 cache and memory controller operate on a separate clock called the un-core clock; this frequency is currently 20x the BCLK, or 2.66GHz.
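In other words, every domain is just a fixed multiplier applied to the same base clock. A quick sketch using the multipliers from the text (the actual BCLK is 133.33MHz, which is why the products come out slightly off round numbers):

```python
BCLK_MHZ = 133.33  # Nehalem base clock

def domain_clock(multiplier: int) -> float:
    """Clock for a domain running at multiplier x BCLK, in GHz."""
    return BCLK_MHZ * multiplier / 1000

core_ghz = domain_clock(24)    # top-bin core clock
uncore_ghz = domain_clock(20)  # L3 cache + memory controller

print(round(core_ghz, 2), round(uncore_ghz, 2))  # 3.2 2.67
```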

This is all very similar to AMD's Phenom, but where the two differ is in how they handle power management. While AMD allows individual cores to request different clock speeds, Nehalem attempts to run all of its cores at the same frequency; if one core is idle, it's simply power gated and effectively turned off. I explain this in greater detail here, but the end result is that we don't see the strange performance issues that sometimes appear with AMD's Cool'n'Quiet enabled. While we have to turn off CnQ to get repeatable results in some of our benchmarks (in some cases we'll see a 50% performance hit with CnQ enabled), Intel's EIST behaves itself when left on and doesn't concern us.
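The difference between the two policies can be sketched as follows. This is purely illustrative; the function names and frequency tiers are mine, not Intel's or AMD's:

```python
def phenom_style(core_loads):
    """Per-core DVFS: each core requests its own frequency (illustrative tiers)."""
    return [2.6 if load > 0.5 else 1.1 for load in core_loads]

def nehalem_style(core_loads, freq=3.2):
    """One shared frequency for active cores; idle cores are power gated (0.0)."""
    return [freq if load > 0.0 else 0.0 for load in core_loads]

loads = [0.9, 0.6, 0.2, 0.0]
print(phenom_style(loads))   # [2.6, 2.6, 1.1, 1.1]
print(nehalem_style(loads))  # [3.2, 3.2, 3.2, 0.0]
```

The CnQ benchmark penalties the article mentions come from the first policy: a thread bouncing onto a core that has clocked itself down runs slowly until that core ramps back up.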

My Concern

Looking at Nehalem's microarchitecture, one thing becomes very clear: this is a CPU designed to address Intel's shortcomings in the server space. There's nothing inherently wrong with that, but it's a different approach than the one Intel took with Conroe. With Conroe, Intel took a mobile architecture and bet that what was good for mobile, in terms of power efficiency and performance per watt, would also be good for the desktop; that philosophy produced its current microarchitecture.

This was in stark contrast to how microprocessor development used to work: chips would be designed for the server/workstation/high-end desktop market and trickle down to mainstream users and the mobile space. Conroe changed all of that, and it's a good part of why Intel's Core 2 architecture makes such a great desktop and mobile processor.

Power obviously matters in servers too, but not to the same extent as in notebooks. Conroe did well in the server market, but it lacked some key features, which allowed AMD to hang onto market share.

Nehalem started out as an architecture that addressed these enterprise shortcomings head on. The on-die memory controller, Hyper-Threading, larger TLBs, improved virtualization performance, the restructured cache hierarchy, the new second-level branch predictor: all of these features will be very important to making Intel more competitive in the enterprise space, but at what cost to desktop power consumption and performance?


Intel promises better energy efficiency for the desktop, we'll be the judge of that...

I'm stating this concern up front because it's what I had in mind when I approached today's Nehalem review. Everyone has high expectations for Nehalem, but it hasn't been that long since Intel dropped Prescott on us. What I want to find out is whether Intel has stayed true to its mission of keeping power in check, or whether we've simply regressed with Nehalem.

The one hope I had for Nehalem was that it's the first high-performance desktop core to implement Intel's new 2:1 performance:power ratio rule. Also used by the Atom design team, the rule states that every feature that made its way into Nehalem had to increase performance by 2% for every 1% increase in power consumption; otherwise it wasn't allowed in the design. In the past Intel used a general 1:1 ratio between performance and power, but with Nehalem the standard was much higher. We'll find out in a moment whether Intel was all talk, but first let's take a look at Nehalem's biggest weakness.
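The 2:1 rule reduces to a simple gate applied to every proposed feature. The sketch below is illustrative only; the candidate features and their numbers are hypothetical, not from Intel:

```python
def passes_nehalem_rule(perf_gain_pct: float, power_cost_pct: float) -> bool:
    """Nehalem's reported bar: at least 2% performance per 1% added power."""
    return perf_gain_pct >= 2.0 * power_cost_pct

# Hypothetical candidates: (name, % performance gain, % power cost)
candidates = [
    ("wider OoO window", 3.0, 1.0),  # 3:1 -> allowed
    ("bigger L1 cache",  1.5, 1.0),  # 1.5:1 -> rejected under 2:1, fine under the old 1:1
]
for name, perf, power in candidates:
    print(name, passes_nehalem_rule(perf, power))
```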

Comments

  • Clauzii - Thursday, November 6, 2008 - link

    I still use PS/2. None of the USB keyboards I've borrowed or tried out would work at boot. Also, I think a PS/2 keyboard/mouse doesn't lag as much, maybe because it has its own non-shared interrupt line.

    But I can see a problem with PS/2 in the future, with keyboards like the Art Lebedev ones. When that technology gets more pocket-friendly, I'd gladly see upgraded but still dedicated keyboard/mouse connectors.
  • The0ne - Monday, November 3, 2008 - link

    Yes. I have the PS2 keyboard on-hand in case my USB keyboard can't get in :)
  • Strid - Monday, November 3, 2008 - link

    Ahh, makes sense. Thanks for clarifying!
  • Genx87 - Monday, November 3, 2008 - link

    After living through the hell that was ATI drivers back in 2003-2004 on a 9600 Pro AIW, I didn't learn, and I plopped money down on a 4850 and have had terrible driver quality since. More BSODs from the ATI driver than I have had in Windows in the past 5 years combined from anything. Back to NVIDIA for me when I get a chance.

    That said, this review is pretty much what I expected after reading the preview article in August. They are really trying to recapture market share in the 4-socket space, a place where AMD has been able to do well. This chip is designed for server work. I'll pick one up after my E8400 runs out of steam.
  • Griswold - Tuesday, November 4, 2008 - link

    You're just not clever enough to set up your system properly. I have two identical systems sitting here side by side, the only difference being the video card (HD3870 in one and an 8800GT in the other), and the box with the NVIDIA card gives me an order of magnitude more headaches due to crashing drivers. While that also happens on the 3870 machine now and then, it's nowhere near as often. But the best part: neither of them produces a BSOD. That is why I know you're most likely the culprit (the alternative is faulty hardware or a pathetic overclock).
  • Lord 666 - Monday, November 3, 2008 - link

    The stock speed of a Q9550 is 2.83GHz, not 2.66GHz.

    Why the handicap?
  • Anand Lal Shimpi - Monday, November 3, 2008 - link

    My mistake, it was a Q9450 that was used. The Q9550 label was from an earlier version of the spreadsheet that got canned due to time constraints. I wanted a clock-for-clock comparison with the i7-920 which runs at 2.66GHz.

    Take care,
    Anand
  • faxon - Monday, November 3, 2008 - link

    Tom's Hardware published an article detailing that there would be a cap on how high you are allowed to clock your part before it would downclock itself back to stock. Since this is an integrated part of the core, you can only turn it off/up/down if they unlock it. The limit was supposedly a 130-watt thermal dissipation mark. What effect did this have in your tests on overclocking the 920?
  • Gary Key - Monday, November 3, 2008 - link

    We have not had any problems clocking our 920 to the 3.6GHz~3.8GHz level with proper cooling. The 920, 940, and 965 will all clock down as core temps increase above the 80C level. We noticed half-step decreases above 80C or so and watched our core multipliers throttle down to as low as 5.5 when core temps exceeded 90C, then increase back to normal as temperatures were lowered.

    This occurred with stock voltages or with VCore set to 1.5V; in our tests it was dependent on thermals, not voltages or clock speeds. That said, I am still running a battery of tests on the 920 right now, but I have not seen an artificial cap yet. That does not mean it doesn't exist, just that we have not triggered it yet.

    I will try the 920 on the Intel board that Toms used this morning to see if it operates any differently than the ASUS and MSI boards.
  • Th3Eagle - Monday, November 3, 2008 - link

    I wonder how close you came to those temperatures while overclocking these processors.

    The 920 to 3.6/3.8 is a nice overclock but I wonder what you mean by proper cooling and how close you came to crossing the 80C "boundary"?
