Final Words

The more I think about it, the more confident I am that the Core i7 continues to fuel Intel's beacon of performance, although admittedly the biggest gains are in well-threaded workloads (I will be working on a Hyper-Threading/multitasking set of tests next). It's not worth the upgrade for most existing Core 2 Quad owners unless you do a lot of video encoding, video editing or 3D rendering, but going forward it looks very likely to extend Intel's performance lead even as AMD brings up its 45nm Phenom processors.

Take power efficiency into account, however, and Nehalem gets more interesting to more people. Right now we're only talking about 130W TDP parts, which means that the power efficiency argument really only applies to someone looking to replace a QX9770. Going forward, when Intel can deliver 95W, 65W or even lower TDP parts based on Nehalem, there may be a compelling power efficiency story. A 10 - 20% decrease in power consumption, at the same manufacturing process, is nothing to scoff at. Then a year from now we get the same architecture built on 32nm, which will hopefully give us an even further reduction in power consumption. It's weird to say, but Nehalem may end up being an incredibly good architecture for notebooks. Keep that in mind before buying those new MacBooks, guys.

The power efficiency story gets even more exciting when you realize that these gains come with no change in manufacturing process. Pardon the pun, but the next tick is going to be a cool one.

The overclocking story with Core i7 isn't as complex as it sounded at first; fundamentally you can still clock this thing the way you did the Core 2s before it. Turbo mode and the TDP/current limitations do add some complexity, but with the flip of a BIOS switch they go away if you don't wish to bother with them. Change can be scary, but in this case there's no reason to be worried.

The Core i7 appears to be just as smooth an overclocker as the Core 2s before it. Increase the BCLK and off you go: free performance from Intel and its wonderful fabs.
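To make the BCLK point concrete: the core clock is simply the base clock (BCLK) multiplied by the CPU multiplier, so raising BCLK raises the core clock even when the multiplier is locked. The sketch below is only an illustration. It assumes a stock BCLK of 133 MHz (taken as 133.33 MHz) and a 20x multiplier, which corresponds to a 2.66GHz part; the helper name and the 160 MHz overclocked BCLK are hypothetical values chosen for the example, not results from this review.

```python
# Illustrative only: core clock = BCLK x multiplier.
# STOCK_BCLK_MHZ and the 20x multiplier are assumed stock values;
# the 160 MHz BCLK is a hypothetical overclock, not a measured result.
STOCK_BCLK_MHZ = 133.33


def core_clock_mhz(bclk_mhz: float, multiplier: int) -> float:
    """Return the core frequency in MHz for a given BCLK and multiplier."""
    return bclk_mhz * multiplier


stock = core_clock_mhz(STOCK_BCLK_MHZ, 20)
overclocked = core_clock_mhz(160.0, 20)
gain_pct = (overclocked / stock - 1) * 100

print(f"stock: {stock:.0f} MHz")
print(f"overclocked: {overclocked:.0f} MHz (+{gain_pct:.0f}%)")
```

Other clocks on the chip are derived from BCLK as well, so in practice their multipliers may need adjusting when BCLK goes up.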

The split between the core and the uncore in terms of clock speed and overclocking potential doesn't appear to be that big of a deal either. The uncore runs slower on the lower end chips, but increasing its clock speed doesn't really do all that much for performance. There's a reason Intel kept the uncore running slower than the core and it doesn't look like there's much real world benefit in pushing it much higher.

With Nehalem, Intel implemented a lot of changes simultaneously. We got Hyper-Threading, a completely static CMOS design, new power gate transistors, QPI, an integrated memory controller and some other lower level architectural tweaks. It's a lot to digest, but we're getting there. To Intel: deliver us some 95W and 65W TDP Nehalem parts and you'll win the hearts of the current Q6600/Q9300/Q9450 owners.

And I can't wait to see one of these things in a notebook; mobile Nehalem could be the most exciting Centrino launch since Merom...

23 Comments

  • lemonadesoda - Wednesday, November 19, 2008 - link

    Anand. Fantastic article, but:

    1./ You didn't mention whether your tests were on 32-bit or 64-bit. We know that Core 2 is more efficient in 32-bit mode due to macro-op fusion, which doesn't work in 64-bit mode. On i7, macro-op fusion works in 64-bit mode too.

    2./ I think you should execute a CPU HALT to observe deep-down idle. This figure, say 110W, should then be SUBTRACTED from all other results. Why? Because this is essentially the mainboard/HDD/system power draw excluding the CPU. I see from your figures that the power used (as a delta from idle) on i7 is actually HIGHER than on the QX9770. So I actually have a very different view than you. I think X58 is much more efficient, and that the integrated memory controller draws less power than the older northbridge. But when the i7 is crunching, it is using more power AT THE CPU than the QX9770.
  • prodystopian - Monday, November 10, 2008 - link

    While this limit is a non-issue for anyone getting a X58 motherboard, what about those looking for the e2xxx of this generation? When looking for a cheap CPU to heavily OC to get an extreme Price/performance, it would be best to pair with a cheap motherboard such as the next P series (not X). I'm assuming we don't know whether this BIOS switch will be on the P series motherboards, but if it is not, that is where the real problem occurs.
  • Live - Sunday, November 9, 2008 - link

    I don't know if this has been answered yet but what are the advantage of the i7-965 higher QPI? Can you overclock the QPI and if so dose it make a difference?
  • Live - Sunday, November 9, 2008 - link

    Live I think you meant to write:

    I don't know if this has been answered yet, but what is the advantage of the i7-965 higher QPI? Can you overclock the QPI and if so does it make a difference?
  • CEO Ballmer - Saturday, November 8, 2008 - link

    Made for Vista!

    http://fakesteveballmer.blogspot.com
  • Rev1 - Saturday, November 8, 2008 - link

    Maybe I'm missing something, but given that the multiplier was not unlocked, how did he get it that high?
  • frazz - Saturday, November 8, 2008 - link

    Surely CPU power at a fixed voltage is proportional to the square of the voltage, not the cube? I thought the formula was this:

    Power dissipation = C.V^2.f where C is the capacitance being switched per clock cycle
  • frazz - Saturday, November 8, 2008 - link

    Sorry I meant CPU power at a fixed FREQUENCY is proportional to the square of the voltage. D'oh.
  • HolyFire - Saturday, November 8, 2008 - link

    I agree. This surely was a misinterpretation of Intel's slide, which actually meant: If the frequency is increased proportionally to the voltage, the power will go like voltage cubed. But for a fixed frequency, power goes like voltage squared.

    In either case, I find that slide a little suspicious, as I have not yet seen any theoretical or experimental result suggesting that frequency should be linearly proportional to voltage.
  • ltcommanderdata - Friday, November 7, 2008 - link

    Great article. It's nice to see someone do a more in depth analysis of Nehalem's characteristics rather than just printing a bunch of benchmarks.

    In regards to your Hyper-Threading tests, it might be interesting to isolate the causes of HT performance increases in Nehalem. HT quite often was a hindrance for Netburst and it would be interesting to see whether the cause was primarily HT's implementation in Netburst or just due to the maturity of HT compatible software at the time. It's an odd coincidence that the last processor to carry HT, besides Atom, was the Pentium Extreme Edition 965 while the first desktop processor to reintroduce HT is again numbered 965 as part of the Core i7 family.

    For instance, you could try to compare the speedup that 965EE receives going from 2 to 4 threads against the i7-965 doing the same. It would also be interesting to see if HT's performance delta improves going from Windows XP to Windows Vista, which would imply that Vista's scheduler is smarter about dispatching tasks to logical cores that don't share resources.

    And in regards to mobile Nehalem, I agree that the power consumption improvements could really benefit notebooks, but it's kind of curious that Nehalem won't come to notebooks until Q3 2009. I believe previous Core 2 rollouts for Merom and Penryn were pretty fast, like a quarter spread between the desktop, notebook, and UP/DP server markets, but this looks to be a 3 quarter spread. I wonder what the delay is? With a Q3 2009 mobile Nehalem launch, they might as well just wait a quarter and do a strong roll out of Westmere on mobile first.
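The voltage-squared versus voltage-cubed question raised by frazz and HolyFire above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the dynamic-power approximation P = C.V^2.f quoted in the comment and ignoring leakage; the function name and the 10% scaling figure are hypothetical, picked only to show the two cases.

```python
# Illustrative only: dynamic power modelled as P = C * V^2 * f, leakage ignored.
# relative_power() is a hypothetical helper; the 10% scaling is an example value.


def relative_power(v_scale: float, f_scale: float = 1.0) -> float:
    """Power relative to baseline when voltage and frequency are scaled.

    The capacitance C cancels out, leaving v_scale^2 * f_scale.
    """
    return v_scale ** 2 * f_scale


# Case 1: frequency fixed, voltage raised 10% -> power grows with V squared.
fixed_freq = relative_power(1.10)

# Case 2: frequency raised in proportion to voltage -> power grows with V cubed.
scaled_freq = relative_power(1.10, f_scale=1.10)

print(f"fixed frequency:  {fixed_freq:.3f}x baseline power")
print(f"scaled frequency: {scaled_freq:.3f}x baseline power")
```

The cubic figure only applies if frequency really does rise linearly with voltage, which is the assumption HolyFire questions.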
