What Nehalem is really about
Date: August 19th, 2008
Author: Johan De Gelas
 
 

IDF has started, and the first Nehalem benchmarks are about to start popping up. It is without a doubt an impressive architecture with a much better platform to run on, but this CPU is not about giving you more frames per second in your favorite game than the Penryn family does. Let me make that clearer: even when the GPU is not the bottleneck, most games will probably not be significantly faster than on Penryn. We, the people behind it.anandtech.com, will probably have the most fun with it, more than your favorite review crew at AnandTech.com :-). And no, I had not seen any test results when I typed this. Nehalem is about improving HPC, database, and virtualization performance, and much less about gaming performance. Maybe this will change once games get some heavy physics threads, but not right away.

Why? Most games are about fast caches and strong integer performance. After all, most of the floating-point action already happens on the GPU. The Core 2 CPUs were a huge step forward in integer performance (not least because of memory disambiguation) compared to the CPUs of that time (P4 and K8). Nehalem is only a small step forward in integer performance, and the gains from that small step are mostly negated by the new cache system. In a previous post I told you that most games really like the huge L2 of the Core family. With Nehalem they get a 32KB L1 with a 4-cycle latency, then a very small (compared to the older Intel CPUs) 256KB L2 cache with a 12-cycle latency, and after that a pretty slow 40-cycle 8MB L3. On Penryn, they used to get a 3-cycle L1 and a 14-cycle 6144KB L2. The Penryn L2 is 24 times larger than Nehalem's!
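
To put rough numbers on that trade-off, here is a back-of-the-envelope average memory access time (AMAT) sketch. The cache latencies are the ones quoted above; the hit rates and the main memory latencies are purely illustrative assumptions (a cache-friendly, game-like workload), not measurements of either CPU.

```python
# Average memory access time (AMAT) sketch, in CPU cycles.
# Cache latencies come from the text above. Hit rates and DRAM latencies
# are ASSUMED values for illustration only, not measured data.

def amat(levels, memory_latency):
    """Expected cycles per access.

    levels: list of (latency_cycles, hit_rate) ordered from L1 outward,
            where latency is the total cost of an access that hits there.
    memory_latency: cost of an access that misses every cache level.
    """
    expected = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for latency, hit_rate in levels:
        expected += reach * hit_rate * latency
        reach *= 1.0 - hit_rate
    return expected + reach * memory_latency

# (latency from the article, assumed hit rate)
PENRYN = [(3, 0.95), (14, 0.98)]               # 3-cycle L1, 14-cycle 6MB L2
NEHALEM = [(4, 0.95), (12, 0.80), (40, 0.90)]  # 4-cycle L1, 12-cycle L2, 40-cycle L3

PENRYN_DRAM = 200   # assumed: memory reached over the front-side bus
NEHALEM_DRAM = 150  # assumed: integrated memory controller is quicker

print(f"Penryn  AMAT: {amat(PENRYN, PENRYN_DRAM):.2f} cycles")
print(f"Nehalem AMAT: {amat(NEHALEM, NEHALEM_DRAM):.2f} cycles")
```

With these assumed hit rates, Penryn comes out at about 3.7 cycles per access and Nehalem at about 4.8, even though Nehalem's memory is assumed to be faster. Lower the Penryn L2 hit rate (a workload that does not fit in 6MB) and the picture changes, which is the point of the server workloads discussed below.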

The percentage of L2 cache misses for most games running on a Penryn CPU is extremely low. Now that is going to change. The integrated memory controller of Nehalem will help some, but the fact remains that the L3 is slow and the L2 is small. However, that doesn't mean Intel made a bad choice. Intel made a superbly good choice by improving performance where Core (Merom/Penryn) was only mediocre to good. Penryn was already a magnificent gaming CPU, but it could not beat the AMD competition in HPC benchmarks, and AMD put up a good fight in database performance benchmarks. Now Intel is ready to fix these shortcomings.

Most database code cannot use the wide architecture of Penryn very well. The number of instructions per cycle can be lower than 0.5, and waiting for memory is the most probable cause. SMT or Hyper-Threading can do wonders here: while one thread waits on a memory stall, the other thread continues working, and vice versa.
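
A toy model shows why SMT helps such stall-heavy code. It assumes each thread alternates between a burst of useful work and a memory stall, and that the stalls of one thread overlap perfectly with the work of the other. That is an idealized upper bound: real SMT threads also compete for caches and execution units. The cycle counts below are made-up illustrative values.

```python
# Toy model of SMT hiding memory stalls. Idealized: assumes perfect overlap
# between one thread's stalls and the other thread's compute phases.

def core_utilization(compute_cycles, stall_cycles, threads):
    """Fraction of cycles in which the core does useful work."""
    per_thread = compute_cycles / (compute_cycles + stall_cycles)
    # The core cannot be more than 100% busy, however many threads it runs.
    return min(1.0, threads * per_thread)

# ASSUMED workload: 50 cycles of work followed by a 150-cycle memory stall.
single = core_utilization(50, 150, threads=1)
smt = core_utilization(50, 150, threads=2)
print(f"1 thread: {single:.0%} busy, 2 threads: {smt:.0%} busy")
```

In this sketch the core is busy 25% of the time with one thread and 50% with two. For code that rarely stalls (say 150 cycles of work per 50-cycle stall), the second thread adds far less, because the core is already busy most of the time.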

Secondly, quad (and eight) socket performance is going to improve a lot as four Nehalems only have to keep four L3 caches in sync, while a similar Tigerton system has to keep eight L2 caches in sync. That is why the cache system is perfect for server performance, but a little less interesting for gaming performance.
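
As a crude illustration of why halving the number of caches matters, the sketch below counts the unordered pairs of last-level caches that may have to exchange coherence traffic. The pair count is only a rough proxy chosen for this example, not a model of the actual snoop protocol of either platform; the cache counts are the ones given above.

```python
from math import comb

def coherence_pairs(sockets, caches_per_socket):
    """Return (cache count, number of cache pairs that may need syncing)."""
    total = sockets * caches_per_socket
    return total, comb(total, 2)

# Four-socket systems, cache counts as stated in the text:
tigerton = coherence_pairs(sockets=4, caches_per_socket=2)  # eight L2 caches
nehalem = coherence_pairs(sockets=4, caches_per_socket=1)   # four L3 caches

print("Tigerton (caches, pairs):", tigerton)
print("Nehalem  (caches, pairs):", nehalem)
```

Eight caches give 28 possible pairs, four caches only 6, so by this rough measure the amount of potential cross-cache chatter shrinks much faster than the cache count itself.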

The massive bandwidth that the integrated tri-channel memory controller delivers should also do wonders for HPC code, and the new TLB architecture with EPT will make Nehalem shine compared to its older Core brothers.

No, Nehalem wasn't made for the gaming enthusiasts. Rather, it was made to please the IT and HPC people. So we say bring it to it.anandtech.com; it's just not that interesting for you gamers! ;-)


47 Comments
Well, there is still one thing for gamers by AlexWade, 539 days ago
The price of the old Core 2 products will come down. So even if Core i7 (or whatever crazy name it is called) doesn't help in games, it will help my wallet when I am upgrading my computer later this year.

Reply
RE: Well, there is still one thing for gamers by Mithan, 539 days ago
I've got an E8400 OC'ed to 3.6GHz, so I am good :)

Worst case, I swap out my 8800GTS512 next year for whatever is "latest and greatest" and boom, double or triple frame rates in most games.

Core i7 can wait.

Reply
RE: Well, there is still one thing for gamers by del, 494 days ago
well, I will just complete my upgrade path, get a good Core 2 Quad with a 1,333-MHz FSB, and I guess, a better video card when the time comes. I still like Core i7 better though.

Reply
RE: Well, there is still one thing for gamers by Mr Roboto, 539 days ago
Go easy with that 600MHz overclock.

Reply
RE: Well, there is still one thing for gamers by Jedi2155, 539 days ago
Still running with my nearly 2 year old E6600 OC'ed to 3.6 GHz :). Of course it's not as fast as a Penryn clock for clock, but I was thinking about moving to a Nehalem. I'm happy my CPU hasn't burnt out yet, with nearly 2 years of 1.55 volts being pushed through it along with hot SoCal weather.

Still, I wonder how Nehalem helps with encoding performance.

Reply
Johan by IntelUser2000, 538 days ago
I also love how the latency comparisons between Yorkfield and Nehalem are being skewed in favor of the article.

According to ANANDTECH'S benchmarks, Nehalem's L2 latency was 11 cycles and Yorkfield was 15 cycles, not 12 and 14.

Multiple websites have argued that Core 2's performance was not achievable (4-issue cannot be fully utilized, blah blah). While Nehalem won't be a miracle in single thread, I think it'll be better than what this article implies.

Reply
RE: Johan by JohanAnandtech, 536 days ago
Do you think one cycle of L2 cache latency will matter? Depending on the tool you use, quad-core Intels report a 14 to 15 cycle L2. For Nehalem, I just used the numbers I had at that point. It seems that 10-12 cycles is more or less accurate.

And no, I didn't argue that 4-issue cannot be fully utilized (that is BS; the 4-way decoding is for peak moments, to get the average higher). I argued that well-implemented SMT can help typical low-IPC loads such as database workloads achieve much higher performance.

Reply
Agreed by Nehemoth, 539 days ago
I can't agree more with you guys, because I hope this architecture flies, and of course I hope for a fight between AMD and Intel on servers; it is really not as easy as the desktop environment.

I really just hope for FB-DIMM improvements (not so hot, please), and of course I really hope for good-to-excellent competition between Shanghai and Nehalem, but the real improvement from AMD should come when they switch to DDR3.

Reply
RE: Agreed by Turas, 539 days ago
Aren't FB-DIMMs being dropped? I was under the impression that even the dual-socket 1366 (or whatever it is) will be using regular DDR3 now.

Reply
RE: Agreed by MonkeyPaw, 539 days ago
Yeah, FB-DIMMs are not the future. They add heat, complexity, and cost. Of course, most of that doesn't affect desktop users, unless you are a Mac Pro/Skulltrail user. The IMC and HyperTransport have effectively killed complex memory configurations, since each CPU finally gets to manage its own memory.

Really, Nehalem is just taking Opteron's strong points and adding them to the Core 2 architecture. Intel should win a lot of server benches now. And to think they spent billions on Itanium for so many years. IA64's market keeps dwindling, and this will only chip away more. Just goes to show that Intel is not indestructible.

Reply
RE: Agreed by Nehemoth, 539 days ago
Really nice to hear that.

That's what we need: a standard, as DDR3 becomes the new one.

Also, the last Itanium investment number I heard was 10 billion.

But for some cases (we have some in our company) Itanium is the right choice.

Reply
actually it's quite interesting for gamers by gaiden2k5, 539 days ago
Knowing that I won't be able to afford a Nehalem CPU, I've become more interested in seeing the price drop for the Penryns and hoping for the Q9450 to fall below $200 :)

Reply
So AMD was TOO forward looking? by wingless, 539 days ago
This sort of validates AMD's approach with Barcelona in a way. Their actions were deliberate but they got berated because K10's architecture was not 100% focused on gaming. Now Nehalem is going to take an AMD-like approach but throw in that AMAZING Hyper-Threading which will take server applications and multi-threaded gaming to a new level past AMD's K10. On Xtremesystems.org I stated in a post that Penryn owners probably won't get much out of Nehalem that they can't already do with their 45nm Quads as far as gaming is concerned. My ego has just been given a boost now that my thoughts have been confirmed by the pros at Anandtech!

Reply
RE: So AMD was TOO forward looking? by DigitalFreak, 539 days ago
LOL

It's been posted in a number of "previews" of Core i7 that it won't do much for gaming or single socket desktops, period. Perhaps you should get your ego under control and stop taking credit for information you ripped off from somewhere else.

Reply
RE: So AMD was TOO forward looking? by Berger, 539 days ago
Seems all you ever do is go around nitpicking, DF. Seeking personal glory or something?

Reply
RE: So AMD was TOO forward looking? by MonkeyPaw, 539 days ago
From what I gather, the third channel for the memory doesn't do much either. I can't imagine that dual channel DDR3 is at all handicapped for bandwidth. Perhaps the third channel is put in place for Larrabee descendants?

Reply
RE: So AMD was TOO forward looking? by MDme, 538 days ago
In a VERY big way yes.

1. "NEW" cache architecture: small L2; big shared L3 = Barcelona
2. Point to point serial links: HyperTransport, er, QuickPath = Barcelona
3. On board memory controller: oh that's Barcelona too (and K8)
4. EPT: sounds a lot like NPT or RVI to me

On all these points Intel did incorporate AMD tech into the CPU. So i7 is kinda like the P4, K10, and Core 2 combined.

Reply
RE: So AMD was TOO forward looking? by JohanAnandtech, 538 days ago
P4 is not really in there, except for the "trace cache alike" Loop Stream Detector. If you are talking about SMT, SMT in the P4 is only a shadow of what SMT should be. SMT is a lot better implemented in Nehalem.

But for the rest you are right, EPT = RVI/NPT. But I wouldn't call p2p serial links and IMC AMD tech. There are a lot of companies who have done this before AMD.

Reply
RE: So AMD was TOO forward looking? by formulav8, 538 days ago
But if AMD didn't have it, would Intel have adopted it? I say no, for the most part.

In a way AMD did a lot of things before Intel that Intel later followed/copied. IMC, HTT, x64, True/Native Dual and Quad Core, Shared L3 Cache, High Performance Discrete GPUs, High Performance IGPs, 4x4, DDR based memory, and so on.


Jason

Reply
Interesting by IceBreakerG, 539 days ago
This is interesting information. Right now I'm trying to decide if I should just go ahead and build a Q9550 system or try to wait for Nehalem. My current system, a single-core Athlon 64 3800+, has been showing its age for quite a while now, and I need an upgrade. Since gaming performance isn't expected to go up much, but other areas are, I'm still on the fence.

I have my Xbox 360 for games, so gaming isn't too important (as long as I can play Roller Coaster Tycoon 3 lol). However, I do a lot of video encoding and music production. Too many decisions. I think either way, whatever I decide, the new system will be significantly faster than what I have now so either will be a nice upgrade for me.

Reply
RE: Interesting by theplaidfad, 539 days ago
The correct decision is to hold onto that athlon a little longer, and THEN hit the q9550 when the i7 comes out and get in on that oh so sssschuuweeeeet price drop that should happen :)

Reply
RE: Interesting by Calin, 539 days ago
There is always something better in the future: a new architecture, faster (same architecture) processors, and price cuts.
As such, if you can wait until the next generation comes, you could hope for a price cut. And maybe better performance/overclockability due to new microprocessor stepping, and a better mainboard than right now (though this might not be so).

Reply
RE: Interesting by gabo, 539 days ago
I'm waiting for this new generation of processors. I'm currently using a P4 1.6 GHz with 1.5 GB of RAM. I don't do too much work with 3D or gaming (for that I also use my Xbox 360). I mainly work coding in C, ASP, PHP, Java, and some Flash, do some database-related work like admin, backup, and such, and sometimes work with Excel, Word, PowerPoint, etc., and obviously web surfing and email.

Of course any upgrade would be tremendous in my situation, but what would any of you recommend most? a very low cost Penryn, when they drop in price, or a more expensive Nehalem? Which is the most bang for the buck for my particular needs?

Thanks in advance

Reply
RE: Interesting by gochichi, 537 days ago
I think waiting is silly, particularly as DDR2 memory is so cheap and so good.

Get a Quad-core, get 8GB of RAM, and never look back... don't wait another minute.

Reply
Interesting by Genx87, 539 days ago
I didn't know it had such a small L2 cache with a big L3. I'm guessing this is a process issue? With subsequent process shrinks, will we start to see a larger L2 per core?

I would agree that such a small cache will cripple it in gaming, possibly making any benefits over Core 2 Duo minimal.

Should be interesting to see how it works out. Thought we had some benchmarks from last Spring that showed it 20-30% faster per clock in gaming? Either way I am on an E8400 and happy.

Reply
RE: Interesting by JohanAnandtech, 539 days ago
I wouldn't call it a process issue. 4 x 256 KB L2 + 8 MB L3 is a lot of cache. :-)

It is a trade-off as always. A small L2 cache for every core keeps two cores running at full throttle from causing extra latencies and getting in each other's way. The L3 makes sure only one cache has to be kept coherent between the different sockets.

It is a completely inclusive L3, so it has to be a lot bigger than the L2, otherwise it is not effective. And it is easier to keep the power cost down if you have a large L3 that is clocked lower and sits on a different power plane.

Reply
RE: Interesting by chizow, 539 days ago
Great points about the smaller L2 and slower L3, I'll have to keep an eye on that when looking at any i7 gaming performance benchies. My main buying point will be overclockability however. If Nehalem allows for higher clockspeeds, that should provide the benefit in gaming with all else counter-balancing against Penryn. If it clocks about the same as Penryn I'll just wait for the 32nm refresh before upgrading from my P45/Q6600 and let DDR3 prices drop a bit more.

Reply
RE: Interesting by JohanAnandtech, 538 days ago
I have my doubts about higher overclockability, especially if you compare with one of the youngest steppings of Penryn. After all, a tri-channel memory controller and 2 QPI links do not come for free. Nehalem is a bit more power hungry running at full speed (but it can also shut down its cores, so it will consume less running light tasks). I would expect Nehalem to stay a bit behind Penryn in clock speeds unless Intel artificially keeps Penryn low with newer steppings.

Reply
Gaming by rgallant, 539 days ago
Looks like my E8600 @ 4.2 is good for a while.

Reply
RE: Gaming by ICE1966, 531 days ago
What in the hell do you need a 4.2GHz CPU for? Oh, wait, it's just bragging rights, LOL

Reply
Stepping stone to Larrabee? by DXRick, 539 days ago
With Intel fantasizing about doing away with graphics cards and making ray tracing a reality for games, and Nehalem offering a 30% increase in CPU power, I would guess it is a stepping stone to Larrabee for gamers then. I can't see current game developers doing anything differently yet, as they will continue to use shaders (DirectX or OpenGL).

The one exception is Microsoft's FlightSimX, which benefits more from CPU power than GPU.

How will Nehalem impact developers looking at CUDA and PhysX?




Reply
Shanghai by Saen11, 539 days ago
If Core i7 does not improve gaming performance, and AMD's new 45nm chips do, could this then give AMD a chance to steal the gaming performance crown?

Reply
gaming? by melgross, 539 days ago
What I didn't get out of this article is whether gaming performance will be slightly better than Penryn, about the same as Penryn, or slightly worse.

I don't imagine that there will be a big difference.

Reply
RE: gaming? by ryedizzel, 539 days ago
lrn2read

Reply
I'll wait for the "tock" by cokbun, 538 days ago
I'll wait till next year when the mature Nehalem comes out, 32 nm sounds good.

Reply
How About Editing for Grammar? by CanamAldrin, 538 days ago
I don't think I'm nitpicking to say that this article is so full of grammatical errors I find it hard to read it fluidly. If English is a second language for the engineer/journalist, please have someone edit his writing before it gets posted. I don't think that is too much to expect from a site of AnandTech's stature.

Reply
Great for Vista! by steveyballme, 538 days ago
This will greatly increase speed of whirling, fading and spinning things on the desktop!

http://fakesteveballmer.blogspot.com

Reply
Disappointed, just another tick? by silversound, 538 days ago
Shouldn't Nehalem be much faster, like Intel claimed last year? Up to 3 times faster than Penryn?
Only up to 30% improvement in apps and no improvement in games does not deserve a tock...

Core 2 was still a way better leap from the sucky P4

Reply
Wait for what???? by gochichi, 537 days ago
It is ridiculous to wait for desktop parts, particularly for "gamers".

ATI just baffled us with a $175.00 very high end graphics card. Memory prices for DDR2 are basically what I would call "free of charge". Seriously, you can get 4GB of good RAM for $75.00...

8GB of RAM for a mere $150.00... that is going to take FOREVER to get to that price-point with DDR3... which means that 8GB of RAM will not go obsolete for a long time.

You can ALWAYS wait, but you have to balance out time and money because they are interchangeable. Computers are consumables, and you gotta eat. You can get a Lenovo X300 for $3k or wait ten years and spend $300.00... but what of your ten years of wait, and the lost utility?

I have a stock-clocked Q6600 (picked up a Dell refurbished desktop for $350.00), 4GB RAM (2 open slots), and a Radeon 4850, and I seriously don't see a game worth playing that doesn't run beyond awesome. I hope something much better comes out that makes my system obsolete, that would be so much fun... but in the meantime I'm waiting in comfort and style and y'all should too...

Running a single core Pentium is a waste of electricity. What are you using government money or something? You don't have to wait to spend $3k on a desktop and then wait 5 years until it's completely outdated junk... just spend $600 every two years and you'll have a great machine that runs the best software of its time.

In terms of laptops... I AM being a total hypocrite and waiting for LED displays to enter lower price points(one-two weeks now)... but that's because they are truly different than what's out there. I want to be able to see my laptop screen better... so I wait. But massive desktop performance is available right here and right now... waiting while you use outdated and slow junk is just weird.





Reply
Megahertz by perzy, 537 days ago
This heat wall is really a solid brick wall...
I think I will go out and search the bottom of the bargain bins at the shop for the height of development: an old P4 Prescott 3.8 GHz.
Let's face it, 99% of the software I use is single-threaded, and like cubic inches, there is no substitute for pure hertz!

Reply
RE: Megahertz by del, 494 days ago
blah...

A 3.0 GHz Core 2 Duo still runs single-threaded apps faster than a 3.8 GHz Pentium 4.

Reply
Core i7 vs Penryn by Poloasis, 537 days ago
I currently run an E8500 @ 4.3 W/C, and since the Core i7 is not geared towards gaming, I would not expect Intel to drop the Penryn that much unless we all start running HPC and databases @ home.

Reply
Cache by mobilecomputing, 365 days ago
That much cache is going to make a colossal system. I'll stick with my netbooks for now. http://www.mobile-computing-news.co.uk/...rend-in-mobile-computing-netbooks.html

Reply
Copyright © 1997-2010 AnandTech, Inc. All rights reserved. Terms, Conditions and Privacy Information.