Conclusions

Our conclusion about the Xeon E5-2690 2.9 GHz is short and simple: it is the fastest server CPU you can get in a reasonably priced server, and it blows the competition and the previous Xeon generation away. If performance is your first and foremost priority, this is the CPU to get. It consumes a lot of power if you push it to its limits, but make no mistake: this beast sips little energy when running at low and medium loads. The price tag is the only real disadvantage, and in many cases it will be dwarfed by other IT costs. It is simply a top-notch processor, no doubt about it.

For those who prioritize performance/watt or performance/dollar, we've summarized our findings in a comparison table with three columns:

  • In the first column, we compare Intel's newest generation with the previous one, using the CPUs with a midrange TDP (95 W).
  • In the second column, we compare Intel's and AMD's midrange offerings.
  • In the third column, we compare CPUs at a similar price point, as we believe a six-core (6C) Xeon E5-2660 will come very close to the performance of the 2.3 GHz Xeon E5-2630.

We also group our benchmarks into software categories and indicate the importance of each category in the server market (we explained our reasoning for this earlier).

| Software: importance in the market | Xeon E5-2660 vs Xeon X5650 | Xeon E5-2660 vs Opteron 6276 | Xeon E5-2660 6C vs Opteron 6276 |
|---|---|---|---|
| Virtualisation: 20-50% | | | |
| ESXi + Linux | +40% | +40% | +7% |
| OLAP Databases: 10-15% | | | |
| MS SQL Server 2008 R2 | +30% | +34% | +8% |
| HPC: 5-7% | | | |
| LS-DYNA | +77% | +26% | +15% |
| Rendering software: 2-3% | | | |
| Cinebench | +50% | +37% | +9% |
| 3DS Max 2012 (iRay) | +2% | +12% | +18% |
| Blender | +9% | +32% | +26% |
| Other: N/A | | | |
| Encryption/Decryption AES | +42% / +41% | +38% / +32% | +8% / +4% |
| Encryption/Decryption Twofish/Serpent | +37% / +49% | +5% / +2% | -19% / -19% |
| Compression/Decompression | +35% / +37% | +105% / +13% | +66% / -11% |
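
As a quick aside for readers who want a rough feel for AES-NI throughput on their own hardware, the sketch below is a minimal, single-threaded check and is not the benchmark used in this review. It assumes the third-party Python cryptography package (backed by OpenSSL, which uses AES-NI where available) and measures only raw AES-CBC encryption of an in-memory buffer.

```python
# Minimal, hypothetical AES throughput check (NOT the benchmark used in this
# review). Assumes the third-party 'cryptography' package, which is backed by
# OpenSSL and uses AES-NI when the CPU supports it. Single-threaded only.
import os
import time

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes


def aes_cbc_throughput(buf_mb: int = 256, key_bits: int = 256) -> float:
    """Encrypt buf_mb MB of random data with AES-CBC and return MB/s."""
    key = os.urandom(key_bits // 8)
    iv = os.urandom(16)                      # CBC uses a 16-byte IV
    data = os.urandom(buf_mb * 1024 * 1024)  # in-memory buffer, no disk I/O

    encryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
    start = time.perf_counter()
    encryptor.update(data)
    encryptor.finalize()
    elapsed = time.perf_counter() - start
    return buf_mb / elapsed


if __name__ == "__main__":
    mb_per_s = aes_cbc_throughput()
    print(f"AES-256-CBC encrypt: {mb_per_s:.0f} MB/s (single thread)")
```

A single encryption thread is obviously not comparable to a full multi-threaded benchmark run, so treat the number as a sanity check rather than as a reproduction of the table above.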

It is pretty amazing that, with the exception of two rendering applications with relatively mediocre scaling, the new Xeon is able to outperform the previous Xeons by a large margin (from 30% up to 77%) in a wide range of applications. All that performance comes with lower energy consumption and a very fast I/O interface. Whether you want high performance per dollar or performance per watt, the Xeon E5-2660 is simply a home run. End of story.

For those who are more price sensitive, the Xeon E5-2630 costs less than the Opteron 6276 and, judging by our six-core E5-2660 results, will very likely perform better in every real-world situation we tested.

And what about the Opteron? Unless the actual Xeon E5 servers turn out to be much more expensive than expected, it looks like it will be hard to recommend the current Opteron 6200. However, if Xeon E5 servers end up being quite a bit more expensive than similar Xeon 5600 servers, the Opteron 6200 might still have a chance as a low-end virtualization server. After all, quite a few virtualization servers are bottlenecked by memory capacity rather than by raw processing power, and the Opteron can then leverage the fact that it offers the same memory capacity at a lower price point.

The Opteron might also have a role in the low-end, price-sensitive HPC market, where it still performs very well. It won't have much of a chance in the high-end clustered HPC market, as Intel has the faster and more power-efficient PCIe interface.

Ultimately, our hope for stiffer competition lies with the upcoming Opteron "Abu Dhabi", which is based on the "Piledriver" core. The new Opteron was, after all, designed to operate at 3 GHz and higher clock speeds, as opposed to the meager 2.3/2.6 GHz we have seen so far. Apparently AMD will not only be able to boost IPC a bit (by 10% or more), but may also significantly boost the clock speed, as we learned from this ISSCC paper: "AMD’s 4+ GHz x86-64 core code-named “Piledriver” employs resonant clocking to reduce clock distribution power up to 24% while maintaining a low clock-skew target."

This should allow AMD to get higher clockspeeds within the same power envelope. Until then, it is the Xeon E5-2600 that rules the server world.

Comments

  • JohanAnandtech - Wednesday, March 7, 2012 - link

    Argh. You are absolutely right. I reversed all divisions. I am fixing this as we type. Luckily this does not alter the conclusion: LS-DYNA does not scale with clockspeed very well.
  • alpha754293 - Wednesday, March 7, 2012 - link

    I think that I might have an answer for you as to why it might not scale well with clock speed.

    When you start a multiprocessor LS-DYNA run, it goes through a stage where it decomposes the problem (through a process called recursive coordinate bisection (RCB)).

    This decomposition phase is done every time you start the run, and it only runs on a single processor/core. So, suppose that you have a dual-socket server where the processors are hitting, say, 4 GHz. That can potentially be faster than a four-socket server where each of the processors runs at only 2.4 GHz.

    In the first case, you have a small number of really fast cores (and so it will decompose the domain very quickly), whereas in the latter, you have a large number of much slower cores, so the decomposition will happen slowly, but it MIGHT be able to solve the rest of it slightly faster (to make up for the difference) just because you're throwing more hardware at it.

    Here's where you can do a little more experimenting if you like.

    Using the pfile (command line option/flag 'p=file'), not only can you control the decomposition method, but you can also tell it to write the decomposition to a file.

    So had you had more time, what I would have probably done is written out the decompositions for all of the various permutations you're going to be running. (n-cores, m-number of files.)

    When you start the run, instead of having it decompose the problem over and over again each time, you just reuse the decomposition it has already done (once). That way, you would be testing PURELY the solving part of the run, rather than from beginning to end. (That isn't to say that the results you've got are bad - it's good data), but it should help take more variables out of the equation when it comes to why it doesn't scale well with clock speed. (It should.)
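
The decomposition step alpha754293 describes is easy to picture with a toy example. The sketch below is a hypothetical, purely illustrative Python version of recursive coordinate bisection (RCB), not LS-DYNA's actual code: it recursively splits a cloud of node coordinates along the widest axis until there is one partition per core. Because this bookkeeping runs on a single core before the parallel solve starts, it rewards clock speed rather than core count, and writing the result out once (via the pfile, as suggested above) takes it out of the timed run entirely.

```python
# Hypothetical toy sketch of recursive coordinate bisection (RCB) -- purely
# illustrative, NOT LS-DYNA's actual decomposition code. It recursively splits
# a set of 3D node coordinates along the widest spatial axis until there is
# one partition per core. This phase runs serially, before the parallel solve.
from typing import List, Tuple

Point = Tuple[float, float, float]


def rcb(points: List[Point], n_parts: int) -> List[List[Point]]:
    """Split 'points' into n_parts partitions of roughly equal size."""
    if n_parts <= 1 or len(points) <= 1:
        return [points]

    # Pick the axis with the largest spatial extent and bisect along it.
    extents = [max(p[ax] for p in points) - min(p[ax] for p in points)
               for ax in range(3)]
    axis = extents.index(max(extents))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2

    # Recurse: half of the requested parts on each side of the cut.
    left = rcb(ordered[:mid], n_parts // 2)
    right = rcb(ordered[mid:], n_parts - n_parts // 2)
    return left + right


if __name__ == "__main__":
    import random
    nodes = [(random.random(), random.random(), random.random())
             for _ in range(100_000)]
    parts = rcb(nodes, 16)  # e.g. one partition per core of a 16-core box
    print([len(p) for p in parts])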
  • IntelUser2000 - Tuesday, March 6, 2012 - link

    Please refrain from creating flamebait in your posts. Your post is almost like spam, almost no useful information is there. If you are going to love one side, don't hate the other.
  • Alexko - Tuesday, March 6, 2012 - link

    It's not "like spam", it's just plain spam at this point. A little ban + mass delete combo seems to be in order, just to clean up this thread—and probably others.
  • ultimav - Wednesday, March 7, 2012 - link

    My troll meter is reading off the charts with this guy. Reading between the lines, he's actually a hardcore AMD fan trying to come across as the Intel version of Sharikou to paint Intel fans in a bad light. Pretty obvious actually.
  • JohanAnandtech - Wednesday, March 7, 2012 - link

    We had to mass delete his posts as they indeed did not contain any useful info and were full of insults. The signal-to-noise ratio has been good over the last few years, so we must keep it that way.

    Inteluser2000, Alexko, Ultimav, tipoo: thx for helping to keep the tone civil here. Appreciate it.

    - Johan.
  • tipoo - Wednesday, March 7, 2012 - link

    And thank you for removing that stuff.
  • tipoo - Tuesday, March 6, 2012 - link

    We get it. Don't spam the whole place with the same post.
  • tipoo - Tuesday, March 6, 2012 - link

    No, he's just a rational person. I don't care which company you like; if you say the same thing 10 times in one article, someone's sure to get annoyed, and with justification.
  • MySchizoBuddy - Tuesday, March 6, 2012 - link

    I'm again requesting that when you do the benchmarks, please include a performance-per-watt metric along with stress testing by running folding@home for 48 hours straight.
