Nehalem: The Unwritten Chapters
Date: November 7th, 2008
Topic: CPU & Chipset
Manufacturer: Intel
Author: Anand Lal Shimpi

Despite being extremely well prepared in having Nehalem, motherboards, coolers and memory well before launch, the run-up to the NDA lift of Intel's Core i7 processors was stressful. There was so much to test: multi-GPU compatibility with X58, memory controller performance, general application performance, overclocking, Hyper Threading and more.

We're all still hard at work sorting out the details. Gary is working on an X58 motherboard roundup and has been testing 12GB memory configurations for the past several days (as well as working with board vendors to improve performance and compatibility with 12GB, but I'll let him tell you about that). Derek is working on multi-GPU performance, and Kris has been working on an overclocking guide. What have I been up to? Well, I've been trying to answer a few lingering questions about Nehalem.

What I've got today are the first results of the questions I've been asking. I've spent the past week looking at power efficiency and memory latency, and talking to some of Intel's finest on the phone about Nehalem. Now I'm back to report, so gather 'round for Nehalem: The Unwritten Chapters.

The Uncore

I got a little more detail from Intel on the uncore clock. Just like Phenom, Intel’s Core i7 is divided into an area called the “core” and an area called the “uncore”. The core contains the individual processor cores and their L1/L2 caches, while the uncore houses the memory controller and the shared L3 cache. In our review I mentioned that the uncore runs at 2.66GHz, which is true, but only for the Core i7-965. The Core i7-940 and 920 both run the uncore at 2.13GHz.

The uncore clock is defined by Intel just like the core clock is - Intel sets it based on yield and performance targets. As I mentioned in the launch review, the uncore clock runs at a simple multiplier of the bclk (133MHz): 20x for the i7-965 and 16x for the i7-940/920. The uncore also runs at its own voltage (1.20V) and that voltage doesn't scale up/down.
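
As a quick sanity check on the multiplier math, here is a minimal Python sketch. It assumes the nominal bclk is 133.33MHz (the text above rounds it to 133MHz); the function and variable names are made up for illustration and do not come from Intel.

```python
# Assumption: nominal bclk is 133.33MHz (400/3); the article rounds it to 133MHz.
BCLK_MHZ = 400 / 3

# Uncore multipliers quoted in the article.
UNCORE_MULTIPLIER = {"i7-965": 20, "i7-940": 16, "i7-920": 16}


def uncore_clock_ghz(multiplier, bclk_mhz=BCLK_MHZ):
    """Return the uncore clock in GHz for a given bclk multiplier."""
    return multiplier * bclk_mhz / 1000


for model, mult in UNCORE_MULTIPLIER.items():
    clock = uncore_clock_ghz(mult)
    print(f"{model}: {mult} x {BCLK_MHZ:.2f}MHz = {clock:.2f}GHz")
```

With these inputs the 20x case rounds to 2.67GHz and the 16x case to 2.13GHz; the 2.66GHz figure quoted for the i7-965 is the same clock truncated rather than rounded.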

On Intel’s own X58 board the uncore clock is configured on the memory settings page and is simply called UCLK.

I took the i7-965, ran it at 2.66GHz to simulate an i7-920, and varied the uncore clock to measure the impact on L3 cache and memory latency:

Core Clock   Uncore Clock   L3 Latency   Main Memory Latency   x264 HD Benchmark   Cinebench XCPU Benchmark
2.66GHz      2.93GHz        34 cycles    143 cycles            72.8 fps            13456
2.66GHz      2.66GHz        36 cycles    148 cycles            73.0 fps            13429
2.66GHz      2.13GHz        41 cycles    159 cycles            72.7 fps            13182
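
To put the cycle counts in more familiar units, the sketch below converts them to nanoseconds. It rests on an assumption the table does not state: that the latencies are counted in core clock cycles at 2.66GHz. If the measuring tool counted something else, the absolute numbers would differ. The helper name is hypothetical.

```python
# Assumption: latency cycles are core clock cycles at the 2.66GHz core clock.
CORE_CLOCK_GHZ = 2.66

# (uncore clock in GHz, L3 latency in cycles, main memory latency in cycles),
# copied from the table above.
RESULTS = [
    (2.93, 34, 143),
    (2.66, 36, 148),
    (2.13, 41, 159),
]


def cycles_to_ns(cycles, clock_ghz=CORE_CLOCK_GHZ):
    """Convert a cycle count to nanoseconds at the given clock (GHz)."""
    return cycles / clock_ghz


for uncore_ghz, l3_cycles, mem_cycles in RESULTS:
    print(
        f"uncore {uncore_ghz:.2f}GHz: "
        f"L3 {cycles_to_ns(l3_cycles):.1f}ns, "
        f"memory {cycles_to_ns(mem_cycles):.1f}ns"
    )
```

Under that assumption, the spread between the slowest and fastest uncore settings is only a few nanoseconds in either direction.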

 

At a 2.66GHz uncore clock things seem to hit a sweet spot, although the translation to real-world performance just isn't there. Perhaps in a very memory intensive test we'd see something more pronounced, but even the x264 HD encoding test showed no performance difference between the three uncore clock speeds.
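
The gap between latency gains and benchmark gains is easier to see as percentages. The following sketch compares the 2.93GHz uncore run against the 2.13GHz run using only the measured values above; the dictionary keys and function name are invented for this example.

```python
def pct_change(new, old):
    """Percentage change from old to new (negative means a reduction)."""
    return (new - old) / old * 100


# Measured values at 2.13GHz uncore (baseline) and 2.93GHz uncore (fastest).
baseline = {"l3_cycles": 41, "mem_cycles": 159, "x264_fps": 72.7, "cinebench": 13182}
fastest = {"l3_cycles": 34, "mem_cycles": 143, "x264_fps": 72.8, "cinebench": 13456}

for metric in baseline:
    delta = pct_change(fastest[metric], baseline[metric])
    print(f"{metric}: {delta:+.1f}%")
```

L3 latency falls by roughly 17% and memory latency by roughly 10%, yet Cinebench improves by only about 2% and x264 by about 0.1%, which is well within what run-to-run variation could explain.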

Surprisingly enough, I couldn’t get the i7-965’s uncore to hit 3.2GHz: Vista would bluescreen before I could even get to the desktop (note that the Intel X58 board I was using did not support adjusting the uncore voltage, so it remained at stock). As the table above shows, increases in uncore frequency aren't nearly as useful as increasing the CPU frequency. Intel recognized this performance relationship as well and chose to optimize the uncore for power consumption, not clock speed, which means that the uncore won't be able to clock as high as the core itself. You could always raise the uncore voltage substantially to try to boost uncore speed, but right now the tradeoff doesn't look worth it, since power consumption would increase quite a bit.

23 Comments - Last by lemonadesoda, 366 days ago
Top Side Contact pads by Mclendo06, 378 days ago
Does anyone know what the contact pads on the top edges of the processor are for? I've wondered this for a while but a quick google search only yielded questions. Also, Anand, thanks for the great coverage.

RE: Top Side Contact pads by Clauzii, 377 days ago
Good question. Probably used for final testing/burn in.

Nice by npp, 378 days ago
"Intel has done nothing to limit overclocking with the Core i7" :)
There was such a huge anti-campaign going on everywhere towards Core i7 overclocking that it seems almost funny to hear that now. I just couldn't imagine how on earth Intel would ditch one of the sweetest things in the geek world just for fun... They weren't so stupid, fortunately.

I would be very interested in some idle/full load temps, particularly for the junior model, at stock speeds and overclocked to some reasonable 24/7 level. It's interesting to see how much they differ from temps we're used to seeing right now with the good old Core 2 Duos/Quads.

QPI power hungry by cpugeek, 378 days ago
I think AnandTech failed to mention QPI vs FSB. QPI is super power hungry and offsets a lot of the power reduction done by Intel. That's why Lynnfield/Clarksfield will be much more power efficient, since they don't use the QPI physical layer to talk to the chipset (Tylersburg).

multi-tasking tests by tynopik, 378 days ago
> (I will be working on a Hyper Threading/multi-tasking set of tests next).

looking forward to it!

(and then the VM tests ;)

Further questions by Denithor, 378 days ago
Great article. Very impressive results here, congrats to the i7 design team. Of course, we all said the same thing when C2D was launched, with a much bigger differential in performance/watt versus the "Netbust" architecture.

Have you guys tried F@H SMP client on these i7 chips yet? I'm curious how they stack up against the Q9xx0 series in raw performance. Do the multithreading improvements help put CPU folding any closer to GPU folding or will GPU continue to reign supreme?

Does Intel intend to launch dual-core versions of these processors or will this generation be quad only?

Finally, for myself, I have an e8400 and an e3110 which are more than adequate for my current needs. I doubt I will even bother with one of these new setups, I'll just wait until Westmere and the 32nm improvements (higher clocks, lower power, heat and probably price).

RE: Further questions by Strid, 378 days ago
Yeah, I agree. While they offer solid quad-core performance, and possibly decent energy efficiency too, they're not much use for a guy like me who doesn't use much of that multi-core jazz.
They might not chew up more watts than the QX9770, but the QX9770 is still a lot more power hungry than even the quickest current 45 nm dual core (E8600). Any news on a dual-core version of Nehalem yet? I'll stick to my Xeon E3110 until then.

Hyperthreading Speedup by ltcommanderdata, 378 days ago
Great article. It's nice to see someone do a more in depth analysis of Nehalem's characteristics rather than just printing a bunch of benchmarks.

In regards to your Hyperthreading tests, it might be interesting to isolate the causes of HT performance increases in Nehalem. HT quite often was a hindrance for Netburst and it would be interesting to see whether the cause was primarily HT's implementation in Netburst or just due to the maturity of HT compatible software at the time. It's an odd coincidence that the last processor to carry HT, besides Atom, was the Pentium Extreme Edition 965 while the first desktop processor to reintroduce HT is again numbered 965 as part of the Core i7 family.

For instance, you could try to compare the speedup that 965EE receives going from 2 to 4 threads against the i7-965 doing the same. It would also be interesting to see if HT's performance delta improves going from Windows XP to Windows Vista, which would imply that Vista's scheduler is smarter about dispatching tasks to logical cores that don't share resources.

And in regards to mobile Nehalem, I agree that the power consumption improvements could really benefit notebooks, but it's kind of curious that Nehalem won't come to notebooks until Q3 2009. I believe previous Core 2 rollouts for Merom and Penryn were pretty fast, like a quarter spread between the desktop, notebook, and UP/DP server markets, but this looks to be a 3 quarter spread. I wonder what the delay is? With a Q3 2009 mobile Nehalem launch, they might as well just wait a quarter and do a strong roll out of Westmere on mobile first.

RE: Hyperthreading Speedup by Denithor, 378 days ago
HT works well on i7 because of two things: software is much more multithreaded today and there have been drastic throughput & memory controller improvements in the generations from Netbust to Nehalem.

Multithreaded applications can be accelerated hugely by pulling resources from multiple cores to work on one application (whether physical or virtual cores doesn't matter).

HT on Netbust was like fitting a garden hose onto a fire hydrant. The data just backed up and couldn't feed through the pipe smoothly. On i7 the bandwidth and memory controller have been optimized to improve flow so the cores don't sit idle (HT basically levels the flow of work across the cores so they all stay busy).

RE: Hyperthreading Speedup by SiXiam, 378 days ago
"The Q9450 can operate at voltages down to 0.85V and as high as 1.3625V, while the Core i7-920 currently appears to be limited to a minimum of around 1.137V."

- I just wanted to let everyone know that benchmarkreviews.com got the i7 920 at stock speeds with 1.125volts.

2.66 GHz @ 1.125v 133mhz x20
http://benchmarkreviews.com/index.php?o...Itemid=63&limit=1&limitstart=5


Copyright © 1997-2009 AnandTech, Inc. All rights reserved. Terms, Conditions and Privacy Information.