Nehalem's Weakness: Cache

Intel opted for a very Opteron-like cache hierarchy with Nehalem, each core gets a small L2 cache and they all sit behind one large, shared L3 cache. This sort of a setup benefits large codebase applications that are also well threaded, for example the type of things you'd encounter in a database server. The problem is that the CPU launching today, the Core i7, is designed to be used in a desktop.

Let's look at a quick comparison between Nehalem and Penryn's cache setups:

  Intel Nehalem Intel Penryn
L1 Size / L1 Latency 64KB / 4 cycles 64KB / 3 cycles
L2 Size / L2 Latency 256KB / 11 cycles 6MB* / 15 cycles
L3 Size / L3 Latency 8MB / 39 cycles N/A
Main Memory Latency (DDR3-1600 CAS7) 107 cycles (33.4 ns) 160 cycles (50.3 ns)

*Note 6MB per 2 cores

Nehalem's L2 cache does get a bit faster, but the speed doesn't make up for the lack of size. I suspect that Intel will address the L2 size issue with the 32nm shrink, but until then most applications will have to deal with a significantly reduced L2 cache size per core. The performance impact is mitigated by two things: 1) the fast L3 cache, and 2) the very fast on die memory controller. Fortunately for Nehalem, most applications can't fit entirely within cache and thus even the large 6MB and 12MB L2 caches of its predecessors can't completely contain everything, thus giving Nehalem's L3 cache and memory controller time to level the playing field.

The end result, as you'll soon see, is that in some cases Nehalem's architecture manages to take two steps forward, and two steps back, resulting a zero net improvement over Penryn. The perfect example is 3D gaming as you can see below:

  Intel Nehalem (3.2GHz) Intel Penryn (3.2GHz)
Age of Conan 123 fps 107.9 fps
Race Driver GRID 102.9 fps 103 fps
Crysis 40.5 fps 41.7 fps
Farcry 2 115.1 fps 102.6 fps
Fallout 3 83.2 fps 77.2 fps

 

Age of Conan and Fallout 3 show significant improvements in performance when not GPU bound, while Crysis and Race Driver GRID offer absolutely no benefit to Nehalem. It's almost Prescott-like in that Intel put in a lot of architectural innovation into a design that can, at times, offer no performance improvement over its predecessor. Where Nehalem fails to be like Prescott is in that it can offer tremendous performance increases and it's on the very opposite end of the power efficiency spectrum, but we'll get to that in a moment.

The Chips Understanding Nehalem's Memory Architecture
Comments Locked

73 Comments

View All Comments

  • Spectator - Monday, November 3, 2008 - link

    that sht is totally logical.

    And Im proper impressed. I would do that.

    you can re-process your entire stock at whim to satisfy the current market. that sht deserves some praise, even more so when die shrinks happen. Its an apparently seemless transition. Unless world works it out and learns how to mod existing chips?

    Chukkle. but hey im drunk; and I dont care. I just thought that would be a logical step. Im still waiting for cheap SSD's :P

    Spectator.
  • tential - Monday, November 3, 2008 - link

    We already knew nehalem wasn't going to be that much of a game changer. The blog posts you guys had up weeks ago said that because of the cache sizes and stuff not to expect huge gains in performance of games if any. However because of hyperthreading I think there also needs to be some tests to see how multi tasking goes. No doubt those gains will be huge. Virus scanning while playing games and other things should have extremely nice benefits you would think. Those tests would be most interesting although when I buy my PC nehalem will be mainstream.
  • npp - Monday, November 3, 2008 - link

    I'm very curious to see some scientific results from the new CPUs, MATLAB and Mathematica benchmarks, and maybe some more. It's interesting to see if Core i7 can deliver something on these fronts, too.
  • pervisanathema - Monday, November 3, 2008 - link

    I was afraid Nehalem was going to be a game changer. My wallet is grateful that its overall performance gains do not even come close to justifying dumping my entire platform. My x3350 @ 3.6GHz will be just fine for quite some time yet. :)

    Additionally, its relatively high price means that AMD can still be competitive in the budget to low mid range market which is good for my wallet as well. Intel needs competition.
  • iwodo - Monday, November 3, 2008 - link

    Since there are virtually no performance lost when using Dual Channel. Hopefully we will see some high performance DDR3 with low Latency next year?
    And which means apart from having half the core, Desktop version doesn't look so bad.

    And since you state the Socket 1366 will be able to sit a Eight Core inside, i expect the 11xx socket will be able to suit a Quad Core as well?

    So why we dont just have 13xx Socket to fit it all? Is the cost really that high?
  • QChronoD - Monday, November 3, 2008 - link

    How long are they going to utilize this new socket??
    $284 for the i7-920 isn't bad, but will it be worth the extra to buy a top end board that will appreciate a CPU upgrade 1-2 years later? Or is this going to be useless once Intel Ticks in '10?
  • Strid - Monday, November 3, 2008 - link

    Great article. I enjoyed reading it. One thing I stumbled upon though.

    "The PS/2 keyboard port is a nod to the overclocking crowd as is the clear CMOS switch."

    What makes a PS/2 port good for overclockers? I see the use for the clear CMOS switch, but ...
  • 3DoubleD - Monday, November 3, 2008 - link

    In my experience USB keyboards do not consistently allow input during the POST screen. If you are overclocking and want to enter the BIOS or cancel an overclock you need a keyboard that works immediately once the POST screen appears. I've been caught with only a USB keyboard and I got stuck with a bad overclock and had to reset the CMOS to gain control back because I couldn't cancel the overclock.
  • Clauzii - Monday, November 3, 2008 - link

    I thought the "USB Legacy support" mode was for exactly that? So legacy mode is for when the PC are booted in DOS, but not during pre?
  • sprockkets - Monday, November 3, 2008 - link

    No, USB legacy support is for support during boot up and for the time you need input before an OS takes control of the system. However, as already mentioned, sometimes USB keyboards just don't work in a BIOS at startup for one reason or another, and in my opinion, this means they should NEVER get rid of the old PS/2 port.

    I ran into this problem with a Shuttle XPC with the G33 chipset, which had no ps/2 ports on it. There was a 50/50 chance it would not work.

Log in

Don't have an account? Sign up now