Intel's Best x86 Server CPU

The launch of the Nehalem-EX a year ago was pretty spectacular. For the first time in Intel's history, the high-end Xeon did not have any real weakness. Before the Nehalem-EX, the best Xeons trailed behind the best RISC chips in either RAS, memory bandwidth, or raw processing power. The Nehalem-EX chip was well received in the market. In 2010, Intel's datacenter group reportedly brought in $8.57 billion, an increase of 35% over 2009.

The RISC server vendors have lost a lot of ground to the x86 world. According to IDC's Server Tracker (Q4 2010), the RISC/mainframe market share has halved since 2002, while Intel x86 chips now command almost 60% of the market. Interestingly, AMD grew from a negligible 0.7% to a decent 5.5%.

Only one year later, Intel is upgrading the top Xeon by introducing Westmere-EX. Shrinking Intel's largest Xeon to 32nm allows it to be clocked slightly higher, gain two extra cores, and add 6MB of L3 cache. At the same time the chip is quite a bit smaller, which makes it cheaper to produce. Unfortunately, the customer does not really benefit from that fact, as the top Xeon became more expensive. Anyway, the Nehalem-EX was a popular chip, so it is no surprise that the improved version has persuaded 19 vendors to produce 60 different designs, ranging from two up to 256 sockets.

Of course, this isn't surprising, as even mediocre chips like the Intel Xeon 7100 series got a lot of system vendor support, a result of Intel's dominant position in the server market. With their latest chip, Intel promises up to 40% better performance at slightly lower power consumption. Considering that the Westmere-EX is the most expensive x86 CPU, it needs to deliver on these promises, on top of providing rich RAS features.

We were able to test Intel's newest QSSC-S4R server, with both "normal" and new "low power" Samsung DIMMs.

Some impressive numbers

The new Xeon can boast some impressive numbers. Thanks to its massive 30MB L3 cache it has even more transistors than the Intel "Tukwila" Itanium: 2.6 billion versus 2 billion transistors. Not that such figures really matter without the performance and architecture to back them up, but the numbers ably demonstrate the complexity of these server CPUs.

Processor Size and Technology Comparison

CPU                  Transistor count (million)   Process   Die size (mm²)   Cores
Intel Westmere-EX    2600                         32 nm     513              10
Intel Nehalem-EX     2300                         45 nm     684              8
Intel Dunnington     1900                         45 nm     503              6
Intel Nehalem        731                          45 nm     265              4
IBM Power 7          1200                         45 nm     567              8
AMD Magny-Cours      1808 (2x 904)                45 nm     692 (2x 346)     12
AMD Shanghai         705                          45 nm     263              4
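
One way to put the table's raw figures in perspective is to divide transistor count by die area. The short Python sketch below does only that, using the numbers listed in the table; the names `chips`, `density`, and `density_table` are made up for this illustration, and the result is a rough indicator rather than a rigorous process comparison, since cache-heavy dies tend to pack more transistors per mm² than logic-heavy ones.

```python
# Figures copied from the table: (transistors in millions, die size in mm²).
# Magny-Cours is entered as the two-die package total (2x 904, 2x 346).
chips = {
    "Intel Westmere-EX": (2600, 513),
    "Intel Nehalem-EX": (2300, 684),
    "Intel Dunnington": (1900, 503),
    "Intel Nehalem": (731, 265),
    "IBM Power 7": (1200, 567),
    "AMD Magny-Cours": (1808, 692),
    "AMD Shanghai": (705, 263),
}


def density(transistors_millions, die_mm2):
    """Return million transistors per mm² of die area."""
    return transistors_millions / die_mm2


def density_table(chip_data):
    """Map each chip name to its transistor density."""
    return {name: density(t, area) for name, (t, area) in chip_data.items()}


if __name__ == "__main__":
    # Print the chips from densest to least dense.
    ranked = sorted(density_table(chips).items(), key=lambda kv: kv[1], reverse=True)
    for name, d in ranked:
        print(f"{name:<20} {d:5.2f} M transistors/mm²")
```

By this measure the Westmere-EX lands at roughly 5.1 million transistors per mm² against about 3.4 for the Nehalem-EX, a gain of around 1.5x. An ideal area scaling from 45nm to 32nm would be (45/32)², or just under 2x, so the shrink delivers a good part of the theoretical benefit but not all of it.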


62 Comments

  • john@cepros.com - Thursday, May 19, 2011 - link

    I did not see anything in the article about RAS, or at least my understanding of the acronym as it's used in IT. Are you using it to mean "Reliability, Availability, and Serviceability"? If so, where was that addressed in the article? If not, what was RAS supposed to mean?

    http://en.wikipedia.org/wiki/Reliability,_Availabi...
  • haplo602 - Thursday, May 19, 2011 - link

    I second this comment. You mention that the new Xeons have excellent RAS features but do not describe a single one.

    How about an article on that topic ? And comparing to Opteron and Itanium while you are at it ? I have no clue about IBM or Sparc chips (Itanium is my daily bread), so I'd be very much interested in such a comparison.

    The last thing I saw from a Nehalem Xeon was that it threw an MCA and rebooted the box. The only benefit was that it enabled some diagnostic. An Itanium system would deconfigure the CPU and boot stable with 1 less socket. The Xeon system just kept rebooting at the same point over and over again.
  • Casper42 - Thursday, May 19, 2011 - link

    Go back and read the reviews on the Nehalem EX from 9 months ago.
    There are no major new RAS features in Westmere EX that I am aware of, as it's a die shrink and not a major feature change.

    One of the things I remember was the ability to identify and disable a bad DIMM or even a bad memory chip within a DIMM in such a way that (if the OS supports it) the machine wouldn't crash and could keep running.
    Also supports memory sparing so you can even load some extra memory in there to take over for the bad DIMM.

    But I'm no expert, go back and read the older articles.
  • haplo602 - Friday, May 20, 2011 - link

    I know, that's what I remember. In my world, that's not RAS, and as I witnessed first hand, it does not always work as expected.
  • L. - Thursday, May 19, 2011 - link

    Well .. if that's all the Intel 32nm process has to offer, I believe I can say there's blood in the water.

    The "crappy" old phenom-2 based Opterons are in fact keeping up in perf/watt WITH ONE LESS DIE SHRINK.

    This is just huge ... it means that unless AMD manages to fuck up the bulldozer extremely bad (as in making it worse than the phenom 2), just the die shrink will give them a clear perf/watt advantage.

    Add in the speed gained through the new process and the Xeons will look like power-hungry overpriced pieces of junk ... and that's still not considering that the bulldozer architecture is any better than the ph2.
  • L. - Thursday, May 19, 2011 - link

    Also, if there ever was any time to buy amd stock . now it is. (like I said for nVidia back in July 2010, double within 6 months)
  • Casper42 - Thursday, May 19, 2011 - link

    While it looks that way on paper, the reality is the opposite.

    Intel CPUs, especially with Nehalem/Westmere families, just outright sell themselves. For whatever reason, and I can't explain it myself, the AMDs just don't sell as well.

    Personally I love the new AMD line for servers.
    They use the same CPUs for high end 2P and all 4P servers.
    All the CPUs have the same memory speeds and loading rules
    Quad channel memory even on 2P
    They give you Cores-o-plenty (this can be a downside in the world of Oracle)

    Then they have a much cheaper 1P/2P option with half the cores and Dual Channel memory
    Each CPU family only has like 5/6 CPUs as well.
    It's such a simple lineup that it's easy for an enterprise customer to standardize a large cross section of the DC.

    Now look at Intel.
    1P is the 3000 family
    2P is the 5000 family
    4P is both the 6000 and 7000 family
    8P is usually the 7000 family.
    1/2 and 4/8 have different memory designs including Tri vs Quad channel
    on 1/2 you get different memory speeds depending on what model CPU you buy.
    Which is really fun because they have like a dozen or more CPU models on each of 1P and 2P.

    So even though AMD seems like the better choice, Intel is still dominating the market.
    Sandy Bridge 2P Servers will be out before the end of the year. Right now it looks like Bulldozer might beat them to market by a matter of a few months. If AMD slips that date, Intel will still have quite a competitive product and BD had better basically be FLAWLESS.

    So for the next gen servers, I think the purchasing habits of most companies will not change unless AMD pulls a major rabbit out of their hat.
  • haplo602 - Friday, May 20, 2011 - link

    on top of that, AMD gives you the same CPU virtualisation support in each model (does not matter if 1P, 2P, 4P+) while Intel differs between models.
  • L. - Friday, May 20, 2011 - link

    I have trouble understanding you : sandy bridge 2p servers will be out before the end of the year ?

    Aren't they out yet ?

    And even if they're there, they will NOT compete with the AMD chips, as I said above, a 45nm Ph2-based Opteron is as power efficient as a 32nm sb-based xeon - lolwut ?

    The only thing that will somehow be bad for bulldozer is Ivy Bridge 22nm IF it comes out as Intel planned it - and even then, it's only a repeat of the same core arch.

    If Bulldozer is no more efficient than the phenom, you will have AMD win in perf/watt/dollar until ib is out, and then the only advantage will be the 3d gate, which Intel said would amount to a dozen % improvement over standard 22nm.

    As a summary, if the Bulldozer Architecture is 12% more efficient than the Phenom 2, then the Bulldozer will destroy the Westmere-EX at the same process, and face the ivy bridge as an equal.

    Considering the design options picked by AMD on bulldozer, I'm quite confident it'll be at least 12% more efficient through architecture.

    And even if Intel is good at marketing, AMD has been gaining share and will gain more in the future.

    Intel said this? "With their latest chip, Intel promises up to 40% better performance at slightly lower power consumption."

    Well that means that shrinking from 45nm to 32nm yields 30% (pinch of salt ;) ) improvement.

    Make no mistake, Bulldozer will totally kill the Sandy Bridge based offerings, by at least a 30% margin on perf/watt/dollar and I would expect this to be in the 40-50% range with the architecture changes.
  • alent1234 - Monday, May 23, 2011 - link

    nobody ever got fired for buying IBM. or these days Intel and Microsoft.

    by the time you price out an HP Proliant with AMD CPUs it's the same price or more than an Intel based server. maybe just a little cheaper. and the AMD CPUs do a lot worse on benchmarks that test more real world performance like database OLTP and other more common server tasks.
