Challenging. That is the least you can say about the economic climate for the launch of Intel's newest "Nehalem EP Xeon" platform. However, challenges must be met, and they certainly make things more interesting. Server vendors won't convince many people to buy a new Intel Nehalem (or AMD Shanghai) based server just because "performance is higher". That argument only works in the processing-hungry HPC and rendering worlds, where less time per task translates into time and cost savings. Hence, the challenge for AMD and Intel is to convince the rest of the market - that is 95% or so - that the new platforms provide a compelling ROI (Return On Investment).
 
The most productive or intensively used servers in general get replaced every 3 to 5 years. Based on Intel's own inquiries, Intel estimates that the current installed base consists of 40% dual-core CPU servers and 40% servers with single-core CPUs.
 

That means that Intel's Nehalem platform (and AMD's Shanghai/Opteron 23xx platform) has to convince people to replace their dual-core Opteron, dual-core Xeon 50xx ("Dempsey"), and Xeon "Irwindale" servers. There are two great ways to turn a much more powerful server into a moneymaking and cost-saving machine. One is to use fewer servers in a cluster, which is not applicable to all companies. The other, more popular approach is to consolidate more servers on the same physical machine by using virtualization. The most important arguments for upgrading your servers are therefore performance/watt and support for virtualization.

Intel's newest platform promises better virtualization support by adding EPT (Extended Page Tables) and lowering world switch times. However, probably the largest bottleneck in the past was the amount of available memory bandwidth. Bandwidth is frequently an overrated performance factor, as few applications outside the HPC world get a boost from, for example, using three instead of two memory channels. That changes dramatically when you are running tens of virtual machines on top of a physical machine: many applications with medium bandwidth demands morph into one big bandwidth-hogging monster. The challenge is thus to provide faster access to memory, lower energy consumption, and better support for virtualization. On paper, the Nehalem architecture definitely can play all those trump cards. Anand has provided a detailed description of the Nehalem architecture. The most important improvements for business applications are:

  • The integrated memory controller talks to its own local memory or remote memory (NUMA). Memory access takes between 27 and 54 ns (80 to 161 cycles). Compare this to the Xeon 5450 at the same clock speed where memory access via the MC in the chipset can take up to 123 ns! The closest competitor (Opteron "Shanghai") needs between 32 and 71 ns.
  • A native quad-core design with a fast 33-cycle L3 cache makes it easy for the L2 caches to exchange cache coherency information.
  • Fast CPU interconnects make sure that the rest of the snoops happen very fast and do not interfere with other traffic.
  • The memory controller has up to three channels. A dual CPU configuration has access to 35GB/s of memory bandwidth (measured with Stream) if you use DDR3-1333. The latest dual Opteron achieves 19.4GB/s with DDR2-800.
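
To put the latency and bandwidth figures above in perspective, here is a rough back-of-the-envelope sketch. It derives the clock speed implied by the quoted cycle counts and compares the measured Stream bandwidth with the theoretical peak of the memory channels. The function names are purely illustrative, and two inputs are assumptions not stated above: each DDR channel is taken to be 64 bits (8 bytes) wide, and the dual Opteron is taken to have two DDR2 channels per socket.

```python
# Back-of-the-envelope check of the latency and bandwidth figures quoted above.
# Assumptions (not from the article): 64-bit (8-byte) DDR channels, and two
# DDR2 channels per Opteron socket. All function names are illustrative.

BYTES_PER_TRANSFER = 8  # one 64-bit DDR channel moves 8 bytes per transfer


def implied_clock_ghz(cycles: float, latency_ns: float) -> float:
    """Clock speed implied by a latency quoted in both cycles and ns."""
    return cycles / latency_ns


def peak_bandwidth_gbs(mega_transfers: float, channels_per_socket: int,
                       sockets: int = 2) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_second = (mega_transfers * 1e6 * BYTES_PER_TRANSFER
                        * channels_per_socket * sockets)
    return bytes_per_second / 1e9


def efficiency(measured_gbs: float, peak_gbs: float) -> float:
    """Fraction of the theoretical peak that was actually measured."""
    return measured_gbs / peak_gbs


if __name__ == "__main__":
    # Latency figures quoted for Nehalem: 27-54 ns, 80-161 cycles.
    print(f"Implied clock: {implied_clock_ghz(80, 27):.2f} GHz "
          f"to {implied_clock_ghz(161, 54):.2f} GHz")

    # Dual Nehalem: DDR3-1333, three channels per socket, 35 GB/s measured.
    nehalem_peak = peak_bandwidth_gbs(1333, channels_per_socket=3)
    # Dual Opteron: DDR2-800, two channels per socket (assumed), 19.4 GB/s measured.
    opteron_peak = peak_bandwidth_gbs(800, channels_per_socket=2)

    print(f"Nehalem: {nehalem_peak:.1f} GB/s peak, "
          f"{efficiency(35.0, nehalem_peak):.0%} achieved")
    print(f"Opteron: {opteron_peak:.1f} GB/s peak, "
          f"{efficiency(19.4, opteron_peak):.0%} achieved")
```

Under these assumptions, the dual Nehalem has a theoretical peak of roughly 64GB/s against 25.6GB/s for the dual Opteron. The measured Stream numbers therefore work out to a bit over half of peak for Nehalem and about three quarters for the Opteron, so the Nehalem advantage comes from the much higher ceiling rather than from better channel utilization.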

Basically, Nehalem is Intel's version of the improvements found in the AMD Barcelona platform, only better (or at least that's the goal). Let's see what it can do in reality.


44 Comments


  • gwolfman - Tuesday, March 31, 2009 - link

    Why was this article pulled yesterday after it first posted? Reply
  • JohanAnandtech - Tuesday, March 31, 2009 - link

    Because the NDA date was noon in the pacific zone and not CET. We were slightly too early... Reply
  • yasbane - Tuesday, March 31, 2009 - link

    Hi Johan,

    Any chance of some more comprehensive Linux benchmarks? Haven't seen any on IT Anandtech for a while.

    cheers
    Reply
  • JohanAnandtech - Tuesday, March 31, 2009 - link

    Yes, we are working on that. Our first Oracle testing is finished on the AMD's platform, but still working on the rest.

    Mind you, all our articles so far have included Linux benchmarking. All mysql testing for example, Stream, Specjbb and Linpack.
    Reply
  • Exar3342 - Monday, March 30, 2009 - link

    Thanks for the extremely informative and interesting review Johan. I am definitely looking forward to more server reviews; are the 4-way CPUs out later this year? That will be interesting as well. Reply
  • Exar3342 - Monday, March 30, 2009 - link

    Forgot to mention that I was surprised HT had as much of an impact as it did in some of the benches. It made some huge differences in certain applications, and slightly hindered it in others. Overall, I can see why Intel wanted to bring back SMT for the Nehalem architecture. Reply
  • duploxxx - Monday, March 30, 2009 - link

    awesome performance, but would like to see how the intel 5510-20-30 fare against the amd 2378-80-82 after all that is the same price range.

    It was the same with woodcrest and conroe launch, everybody saw huge performance lead but then only bought the very slow versions.... then the question is what is still the best value performance/price/power.

    Istanbul better come faster for amd, how it looks now with decent 45nm power consumption it will be able to bring some battle to high-end 55xx versions.
    Reply
  • eryco - Tuesday, April 14, 2009 - link

    Very informative article... I would also be interested in seeing how any of the midrange 5520/30 Xeons compare to the 2382/84 Opterons. Especially now that some vendors are giving discounts on the AMD-based servers, the premium for a server with X5550/60/70s is even bigger. It would be interesting to see how the performance scales for the Nehalem Xeons, and how it compares to Shanghai Opterons in the same price range. We're looking to acquire some new servers and we can afford 2P systems with 2384s, but on the Intel side we can only go as far as E5530s. Unfortunately there's no performance data for Xeons in the midrange anywhere online so we can make a comparison. Reply
  • haplo602 - Monday, March 30, 2009 - link

    I only skimmed the graphs, but how about some consistency ? some of the graphs feature only dual core opterons, some have a mix of dual and quad core ... pricing chart also features only dual core opterons ...

    looking just at the graphs, I cannot make any conclusion ...
    Reply
  • TA152H - Monday, March 30, 2009 - link

    Part of the problem with the 54xx CPUs is not the CPUs themselves, but the FB-DIMMS. Part of the big improvement for the Nehalem in the server world is because Intel sodomized their 54xx platform, for reasons that escape most people, with the FB-DIMMs. But, it's really not mentioned except with regards to power. If the IMC (which is not an AMD innovation by the way, it's been done many times before they did it, even on the x86 by NexGen, a company they later bought) is so important, then surely the FB-DIMMs are. They both are related to the same issue - memory latency.

    It's not really important though, since that's what you'd get if you bought the Intel 54xx; it's more of an academic complaint. But, I'd like to see the Nehalem tested with dual channel memory, which is a real issue. The reason being, it has lower latency while only using two channels, and for some benchmarks, certainly not all or even the majority, you might see better performance by using two (or maybe it never happens). If you're running a specific application that runs better using dual channel, it would be good to know.

    Overall, though, a very good article. The first thing I mention is a nitpick, the second may not even matter if three channel performance is always better.
    Reply
