The past several months have seen both Intel and AMD introducing interesting updates to their CPU lines. Intel started with the E-stepping of the Xeon. Even at 3GHz, the four cores of the Xeon 5450 need 80W at the most, and if speed is all you care about a 120W 5470 is available at 3.33GHz. The big news came of course from AMD. The "only native x86 quad-core" is finally shining bright thanks to a very successful transition to 45nm immersion lithography as you can read here. The result is a faster and larger 6MB L3 cache, higher clock speeds, and lower memory latency. AMD's quad-core is finally ready to be a Xeon killer.

So it was time for a new server CPU shoot out as server buyers are confronted with quickly growing server CPU pricelists. Talking about pricelists, is someone at marketing taking revenge on a strict math teacher that made him/her suffer a few years ago? How else can you explain that the Xeon 5470 is faster than the 5472, and that the Xeon 5472 and 5450 are running at the same clock speed? The deranged Intel (and in a lesser degree AMD) numbering system now forces you to read through spec sheets the size of a phone book just to get an idea of what you are getting. Or you could use a full-blown search engine to understand what exactly you can or will buy. The marketing departments are happy though: besides the technical white papers you need to read to build a server, reading white papers to simply buy a CPU is now necessary too. Market segmentation and creative numbering…a slightly insane combination.

Anyway, if you are an investor trying to understand how the different offerings compare, or you are out to buy a new server and are asking yourself what CPU should be in there, this article will help guide you through the newest offerings of Intel and AMD. In addition, as the Xeon 55xx - based on the Nehalem architecture - is not far off, we will also try it to see what this CPU will bring to the table. This article is different from the previous ones, as we have changed the collection of benchmarks we use to evaluate server CPUs. Read on, and find out why we feel this is a better and more realistic approach.

Breaking out of the benchmark prison

When I first started working on this article, I immediately started to run several of our "standard" benchmarks: CINEBENCH, Fritz Chess, etc. As I started to think about our "normal" benchmark suite, I quickly realized that this article would become imprisoned by its own benchmarks. It is nice to have a mix of exotic and easy to run benchmarks, but is it wise to make an article with analysis around such an approach? How well does this reflect the real world? If you are actually buying a server or are you are trying to understand how competitive AMD products are with Intel's, such a benchmark mix is probably only confusing the people who like to understand what decisions to make. For example, it is very tempting to run a lot of rendering and rarely used benchmarks as they are either easy to run or easy to find, but it gives a completely distorted view on how the different products compare. Of course, running more benchmarks is always better, but if we want to give you a good insight in how these server CPUs compare, there are two ways to do it: the "micro architecture" approach and the "buyer's market" approach.

With the micro architecture approach, you try to understand how well a CPU deals with branch/SSE/Integer/Floating Point/Memory intensive code. Once you have analyzed this, you can deduce how a particular piece of software will probably behave. It is the approach we have taken in AMD's 3rd generation Opteron versus Intel's 45nm Xeon: a closer look. It is a lot of fun to write these types of articles, but it only allows those who have profiled their code to understand how well the CPU will do with their own code.

The second approach is the "buyer's market" approach. Before we dive into new Xeons and Opterons, we should ask ourselves "why are people buying these server CPUs"? Luckily, IDC reports[1] answer these questions. Even though you have to take the results below with a grain of salt, they give us a rough idea of what these CPUs are used for.

The reason why people buy a 2 socket server

The reason why people buy a 4 socket server

IT infrastructure servers like firewalls, domain controllers, and e-mail/file/print servers are the most common reasons why servers are bought. However, file and print servers, domain controllers, and firewalls are rarely limited by CPU power. So we have the luxury of ignoring them: the CPU decision is a lot less important in these kind of servers. The same is true for software development servers: most of them are for testing purposes and are underutilized. Mail servers (probably 10% out of the 32-37%) are more interesting, but currently we have no really good benchmark comparisons available, since Microsoft's Exchange benchmark was unfortunately retired. We are currently investigating which e-mail benchmark should be added to our benchmarking suite. However, it seems that most mail server benchmarking boils down to storage testing. This subject is to be continued, and suggestions are welcome.

Collaborative servers really deserve more attention too as they comprise 14 to 18% of the market. We hope to show you some benchmarks on them later. Developing new server benchmarks takes time unfortunately.

ERP and heavy OLTP databases are good for up to 17% of the shipments and this market is even more important if you look at the revenue. That is why we discuss the SAP benchmarks published elsewhere, even though they are not run by us. We'll add Oracle Swingbench in this article to make sure this category of software is well represented. You can also check Jason's and Ross' AMD Shanghai review for detailed MS SQL Server benchmarking. With Oracle, MS SQL Server and SAP, which together dominate this part of the server market, we have covered this part well.

Reporting and OLAP databases, also called decision support databases, will be represented by our MySQL benchmark. Last but not least, we'll add the MCS eFMS web server test -- an ultra real world test -- to our benchmark suite to make sure the "heavy web" applications are covered too. It is not perfect, but this way we cover the actual market a lot better than before.

Secondly, we have to look at virtualization. According to IDC reports, 35% of the servers bought in 2007 were bought to be virtualized. IDC expect this number to climb up to 52% in 2008 [2]. Unfortunately, as soon as we upgraded the BIOS of our quad socket platform to support the latest Opteron, it would not allow us to install ESX nor let us enable power management. That is why we had to postpone our server review for a few weeks and that is why we split it into two parts. For now, we will look at the VMmark submissions to get an idea how the CPUs compare when it comes to virtualization.

In a nutshell, we're moving towards a new way of comparing server CPUs. We combine the more reliable industry standard benchmarks (SAP, VMmark) with our own benchmarks and try to give you a benchmark mix that comes closer to what the servers are actually bought for. That should allow you to get an overview that is as fair as possible. Performance/watt is still missing in this first part, but a first look is already available in the Shanghai review.

Benchmark Configuration
POST A COMMENT

29 Comments

View All Comments

  • zpdixon42 - Wednesday, December 24, 2008 - link

    DDR2-1067: oh, you are right. I was thinking of Deneb.

    Yes performance/dollar depends on the application you are running, so what I am suggesting more precisely is that you compute some perf/$ metric for every benchmark you run. And even if the CPU price is less negligible compared to the rest of the server components, it is always interesting to look both at absolute perf and perf/$ rather than just absolute perf.

    Reply
  • denka - Wednesday, December 24, 2008 - link

    32-bit? 1.5Gb SGA? This is really ridiculous. Your tests should be bottlenecked by IO Reply
  • JohanAnandtech - Wednesday, December 24, 2008 - link

    I forgot to mention that the database created is slightly larger than 1 GB. And we wouldn't be able to get >80% CPU load if we were bottlenecked by I/O Reply
  • denka - Wednesday, December 24, 2008 - link

    You are right, this is a smallish database. By the way, when you report CPU utilization, would you take IOWait separate from CPU used? If taken together (which was not clear) it is possible to get 100% CPU utilization out of which 90% will be IOWait :) Reply
  • denka - Wednesday, December 24, 2008 - link

    Not to be negative: excellent article, by the way Reply
  • mkruer - Tuesday, December 23, 2008 - link

    If/When AMD does release the Istanbul (k10.5 6-core), The Nehalem will again be relegated to second place for most HPC. Reply
  • Exar3342 - Wednesday, December 24, 2008 - link

    Yeah, by that time we will have 8-core Sandy Bridge 32nm chips from Intel... Reply
  • Amiga500 - Tuesday, December 23, 2008 - link

    I guess the key battleground will be Shanghai versus Nehalem in the virtualised server space...

    AMD need their optimisations to shine through.


    Its entirely understandable that you could not conduct virtualisation tests on the Nehalem platform, but unfortunate from the point of view that it may decide whether Shanghai is a success or failure over its life as a whole. As always, time is the great enemy! :-)
    Reply
  • JohanAnandtech - Tuesday, December 23, 2008 - link

    "you could not conduct virtualisation tests on the Nehalem platform"

    Yes. At the moment we have only 3 GB of DDR-3 1066. So that would make pretty poor Virtualization benches indeed.

    "unfortunate from the point of view that it may decide whether Shanghai is a success or failure"

    Personally, I think this might still be one of Shanghai strong points. Virtualization is about memory bandwidth, cache size and TLBs. Shanghai can't beat Nehalem's BW, but when it comes to TLB size it can make up a bit.
    Reply
  • VooDooAddict - Tuesday, December 23, 2008 - link

    With the VMWare benchmark, it is really just a measure of the CPU / Memory. Unless you are running applications with very small datasets where everything fits into RAM, the primary bottlenck I've run into is the storage system. I find it much better to focus your hardware funds on the storage system and use the company standard hardware for server platform.

    This isn't to say the bench isn't useful. Just wanted to let people know not to base your VMWare buildout soley on those numbers.
    Reply

Log in

Don't have an account? Sign up now