LINPACK: Nehalem vs Shanghai part 2
Date: December 1st, 2008
Author: Johan De Gelas
 
 

The last post generated some very interesting comments and questions, which I wanted to address. Unfortunately, some people misinterpreted the post as a "the best scores Nehalem and Shanghai can get in Linpack" review.
 
So let me make this very clear: this and the previous blogpost are not meant to be a "buyer's guide". The Nehalem desktop system and AMD "Shanghai" server are completely different machines, targeted at totally different markets. Normally, we should wait for the Xeon 5500 to run this kind of benchmark, but consider this a preview out of curiosity.
 
Secondly, we were not trying to get the highest possible LINPACK scores on both architectures. We wanted to use one binary which has good optimizations for both AMD and Intel CPUs. Fully optimized binaries won't even run on the other CPU. Our only goal is to get an idea of how the Nehalem and Shanghai architectures compare when running a LINPACK-like binary which is optimized to run on all machines.
 
Thirdly, this is of course not our review. This is a blogpost which talks about some of the tests we are doing for the review.
 
MKL on AMD?
Using the Intel Math Kernel Libraries on an AMD CPU is of course a good way to start some heavy debates. As I pointed out in the last blogpost however, in some cases, the slightly older MKL versions still do a very good job on AMD CPUs when you benchmark with low matrix sizes. You don't have to take my word for it of course.
 
Compare the Intel Linpack 9.0 (available mid 2007) with the binary that AMD produced at the end of 2007. AMD made a K10-only version using ACML version 4.0.0, compiling Linpack with the PGI 7.0.7 compiler (with the following flags: pgcc -O3 -fast -tp=barcelona-64).
 
All the benchmarks below are done on one CPU with 4 GB (AMD, Intel Xeon) or 3 GB (Intel Core i7). Speedstep, Powernow! and Turbo mode were disabled. 
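
To put those memory sizes in perspective: LINPACK solves a dense n x n system in double precision, so the matrix alone takes 8 x n x n bytes. The sketch below estimates the largest matrix order that fits. It is only an illustration: the function name, the 80% "usable RAM" fraction and the reading of "GB" as GiB are assumptions, not settings used in these tests.

```python
import math

BYTES_PER_DOUBLE = 8  # LINPACK works in double precision


def max_matrix_order(ram_gib, usable_fraction=0.8):
    """Largest n for which an n x n matrix of doubles fits in the usable RAM.

    usable_fraction is an assumed share of RAM left over for the benchmark
    after the OS and other processes; it is not a measured value.
    """
    usable_bytes = int(ram_gib * 1024**3 * usable_fraction)
    return math.isqrt(usable_bytes // BYTES_PER_DOUBLE)


if __name__ == "__main__":
    for ram in (3, 4):  # the two memory configurations mentioned above
        print(f"{ram} GiB: matrix order up to about {max_matrix_order(ram)}")
```

Since the runs discussed here use low matrix sizes, the 3 GB versus 4 GB difference should matter far less than it would for a run that fills memory.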
 
LINPACK version 2007
 
As predicted, the ACML binary, which was compiled with a 2007 compiler, is slower than the MKL "2007" version, also compiled in 2007. The MKL version runs on any CPU that has support for (S)SSE-3, so it continues to be a very interesting one for us to test. As the Xeon 5472 (3 GHz) score shows, it is not fully optimized for the latest 45 nm Intel CPUs with SSE-4. It is a good "not too optimized" version which can be used on both Intel and AMD CPUs: the 3 GHz Xeon 5472 is behind the AMD Opteron 8384. If this Intel binary were giving the AMD CPUs a badly optimized code path, this would not be possible.
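
The idea of a binary whose best code path only needs an older instruction set can be sketched as a simple dispatch on CPU feature flags. This is a hypothetical illustration, not how MKL or ACML actually select their code paths: the path names and the CODE_PATHS table are invented, and only the flag names ("pni" for SSE3, "ssse3", "sse4_1") follow what Linux reports in /proc/cpuinfo.

```python
# Hypothetical code paths, ordered from most to least specialised.
# Each entry lists the CPU flags that path would require.
CODE_PATHS = [
    ("sse4.1 path", {"sse4_1"}),
    ("ssse3 path", {"ssse3"}),
    ("sse3 path", {"pni"}),  # Linux reports SSE3 as "pni"
    ("generic path", set()),
]


def pick_code_path(cpu_flags, paths=CODE_PATHS):
    """Return the first (most specialised) path whose required flags are all present."""
    flags = set(cpu_flags)
    for name, required in paths:
        if required <= flags:
            return name
    return "generic path"


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the feature flags of the first CPU listed (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return line.split(":", 1)[1].split()
    return []


if __name__ == "__main__":
    # A binary built without the sse4.1 path treats a newer CPU like an older one.
    older_binary = [p for p in CODE_PATHS if p[0] != "sse4.1 path"]
    newer_cpu = ["sse2", "pni", "ssse3", "sse4_1"]
    print(pick_code_path(newer_cpu))                # uses every path in the table
    print(pick_code_path(newer_cpu, older_binary))  # falls back to the ssse3 path
```

In this picture, an older binary runs fine on a newer CPU but leaves its newest instructions unused, which is the sense in which such a binary is "not too optimized".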
 
As we move forward to 2008, we have to create a new binary, as both AMD's and Intel's fully optimized Linpack versions will not run on the competitor's CPU. Intel released the Linpack benchmark version 10.1, which is not fully optimized for the "Nehalem" architecture, but for the 45 nm "Harpertown" family.
 
AMD has created a new Linpack binary using ACML 4.2 and the PGI 7.2-4 compiler. Below you see how the two CPUs compare.
 
LINPACK version late 2008
 
The bottom line is that these LINPACK benchmarks are moving targets, like the SPEC CPU benchmarks, as the compilers and libraries used are just as important as the CPUs. When the Xeon 5500 materializes, LINPACK performance will probably be higher, as the current binary is built for the "Penryn/Harpertown" family.
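
For readers who want to see where a LINPACK score comes from: the benchmark times the solution of a dense n x n linear system and divides a nominal operation count by the elapsed time. The sketch below uses the conventional LINPACK count of 2/3 n^3 + 2 n^2 floating point operations. The function names and the sample figures in the usage section are made up for illustration and are not measurements from these tests; the flops-per-cycle value is an input you have to supply for the CPU in question.

```python
def linpack_flops(n):
    """Nominal operation count for solving a dense n x n system (LU plus solve)."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def gflops(n, seconds):
    """LINPACK-style score: nominal operations divided by elapsed time, in GFLOPS."""
    return linpack_flops(n) / seconds / 1e9


def efficiency(score_gflops, cores, ghz, flops_per_cycle):
    """Fraction of theoretical peak reached; flops_per_cycle depends on the CPU."""
    peak = cores * ghz * flops_per_cycle
    return score_gflops / peak


if __name__ == "__main__":
    # Hypothetical run: matrix order 10000 solved in 20 seconds.
    score = gflops(10_000, 20.0)
    print(f"score: {score:.1f} GFLOPS")
    # Hypothetical quad-core at 3 GHz, assuming 4 double precision flops per cycle per core.
    print(f"efficiency: {efficiency(score, 4, 3.0, 4):.0%}")
```

Because the operation count is fixed by n, any gain from a better library or compiler shows up purely as a shorter run time, which is why the same CPU can post very different scores with different binaries.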
 
While it is useful for the HPC people to see which CPU + compiler can offer the best performance, it is also interesting to understand what kind of performance you get when you compile binaries that have to run on all current CPUs. It is pretty hard to compare CPU architectures if you are using totally different binaries.
 
In the next post we'll delve a bit deeper into what is happening with Hyperthreading, Linpack and the new architectures.

35 Comments
testing... by JohanAnandtech, 354 days ago
123

Reply
RE: testing... by Spoelie, 353 days ago
reading through those responses, don't you miss the level of conversation held at aceshardware articles?

ah well different readership ;)

Reply
RE: testing... by JohanAnandtech, 352 days ago
Yes, of course :-).

We'll do everything in the coming months to make sure that the it.anandtech.com community gets high quality debates with a good signal/noise ratio. In a few months the comment box and board should integrate, for example. It is still the goal to bring deeply technical benchmarking to the IT world, where good reviews are still scarce IMHO. It will take time, but it.anandtech will get there. :-)

Again, I welcome all constructive and even sharp comments. But preferably well founded with some good reasoning.

Reply
What I want to know by Zorblack1, 354 days ago
Where are all the hateful people from the last blog entry. Crow tastes good huh? You should at least be man enough to apologize.

Reply
RE: What I want to know by PERUVIANMADAFAKA, 354 days ago
LOL.

DAMMNIT I NEVER GET IN TO FORUMS N ALL, BUT THIS IS GETING HOT.

APOLOGIZE FOR WHAT?

Reply
RE: What I want to know by Zorblack1, 354 days ago
Look at all these hateful peeps...



Reply
RE: What I want to know by Zorblack1, 354 days ago
RE: What I want to know by befair, 348 days ago
apologize for what!? for showing what Anandtech is worth!?? It is just a bunch of guys who had a good thing going then decided to get into bed with Intel and present their biased and useless information?

All you mindless guys can continue to enjoy to read worthless articles.

Reply
RE: What I want to know by befair, 348 days ago
And I doubt Johan can even do a decent run of HPL without saying "Intel wins" before he even starts the run

Reply
wow - so Nehalem is rockin the house? by Toadster, 354 days ago
very interesting... when can we buy some? :)

Reply
RE: wow - so Nehalem is rockin the house? by ZootyGray, 354 days ago
AT THE i7 TLB error DISCOUNT STORE

that'll rock your house.

Reply
RE: wow - so Nehalem is rockin the house? by BlueBlazer, 353 days ago
Go back to your UAEZone.

So called bug FUD was debunked.
http://www.techreport.com/discussions.x/15979



Reply
Architecture or software comparison? by erikejw, 354 days ago
If you really wanted to compare architectures you would not use the Intel binary even if it is faster for AMD than other binaries.

You would choose a binary that was not heavily optimized at all and would not benefit any architecture.

The scores would be lower but you would compare ARCHITECTURES.

Then it is up to software developers to create the most efficient binary for the platform of choice. That is a completely different matter.

Do you really believe that the Intel binary would not be better suited for Intel processors?

If they were not better the Intel developers would be worthless and incompetent, I think they are not.

Reply
RE: Architecture or software comparison? by JohanAnandtech, 348 days ago
Remember that AMD always takes into account that a lot of code out there is optimized for Intel architectures. If Intel's engineers go all the way to produce a carefully optimized SSSE-3 binary, it is very possible that it performs very well on the K10. And the evidence shows that the one I have been using is very good, as Shanghai at 2.7 GHz outperforms the Xeon 5472 at 3 GHz.



Reply
You don't get it yet. by ZootyGray, 354 days ago
Nobody believes you anymore.

Imagine a site that goes to great lengths to present unbiased testing. without the cheap trix you present.

2nd the call for apology - then again, it goes deeper than that - and you know it. Never mind - I am done with you.



Reply
RE: You don't get it yet. by MamiyaOtaru, 354 days ago
yay

Reply
RE: You don't get it yet. by strikeback03, 353 days ago
good, the more fanbois that disappear the better

Reply
RE: You don't get it yet. by kmmatney, 353 days ago
Bye - Don't let the door hit your ass.

Reply
brand promotion or analysis by uf, 353 days ago
If you claim to be neutral, AnandTech should publish info about its financial interests and its connections with all brands mentioned in the review: Intel, AMD, etc., otherwise you look very much like an Intel promotion site.

Reply
RE: brand promotion or analysis by JohanAnandtech, 352 days ago
Understand that Liz and myself are working here in Belgium, Jason and Ross are in Canada. We have no clue what happens in the financial part of Anandtech, and rightfully so. I have no interest whatsoever in getting involved in that.

Read the past articles at it.anandtech.com and judge from that whether our work is "neutral" or not.


Reply
RE: brand promotion or analysis by uf, 352 days ago
If you are really not biased, and if you are really interested in comparing two hardware (CPU, chipset, mem) systems, WHY didn't you use neutral software, as someone pointed out? Of course, we would not reach top speed, but that was not our goal.

Reply
RE: brand promotion or analysis by befair, 348 days ago
Yes agreed .. he doesn't know about any financial matters. The guys who bring in the money just go "Just make Intel look good" and he follows like a good sheep.

I can't believe how blatantly over and over they make all these statements " .. it is optimized for Intel but it runs fast on AMD" come on people!

Reply
Sparse Matrix Performance by Mclendo06, 353 days ago
Linpack is great and all, but I was wondering if you had any benchmarks for sparse operations that you could run as part of the review, for instance running Pardiso on a 250k equation system (if RAM permits - 3GB will probably limit you to about ~100k-ish depending on matrix sparsity). I may be wrong here, but I think I've heard somewhere that memory is a significant bottleneck for sparse matrix computation, and so it would be interesting to see what sorts of gains Intel has made here with the new memory controller.

Reply
Anandtech should be called Inteltech by superflex, 353 days ago
Your bias is pathetic. Look at the front page where Intel has been hosting the Intel Resource Center link for over a year. Intel is one of Anand's biggest advertizers and Steve Ballmer enjoys the rim jobs Anand and his douchbag bloogers provide him.
Thanks again for confirming my beliefs that AMD (and ATi) will never get a fair shake on this sorry site.

Reply
RE: Anandtech should be called Inteltech by Zorblack1, 353 days ago
Wow you're a moron! Steve Ballmer works for Microsoft not Intel. Microsoft != Intel
And as for your AMD fanboyness take a hike. You blame the hard working folks at anandtech for AMD's failures.

Reply
RE: Anandtech should be called Inteltech by JohanAnandtech, 352 days ago
As most of our regular readers know, our track record speaks for itself:

http://it.anandtech.com/weblog/showpost.aspx?i=443
Comment:
"This is really interesting. It looks like AMD is very competitive in the HPC / Server market. I am glad to hear it. I am currently using a Core 2 Duo system for my desktop machine, and it is very fast. But, I certainly don't want the competition between AMD and Intel to come to an end anytime soon. Any good news for AMD at this point is good news for consumers."

Still, not convinced?
http://it.anandtech.com/weblog/default.aspx

"The very first independent Nested Paging Virtualization tests"
"AMD's K10: a "dead" product or not?"
"AMD back in the quad socket race"
"The revenge of AMD Barcelona's TLB?"

Take the time to research, and judge for yourself.


Reply
RE: Anandtech should be called Inteltech by 7upMan, 352 days ago
Thank you, Johan. The last article is a reminder why so many Opteron machines are in the Top 10 of the Top 500 list of supercomputers. And this, folks, you should always remember when arguing Pro or Con AMD.

Barcelona is and was first and foremost a server chip (and performing exceptionally well in that role), with Deneb hopefully changing that bias toward gaming. At least I hope so, because I'd love to change my Athlon X2 6400+ to a faster CPU with less power hunger.

Besides, I can imagine why Anand is slightly biased toward Intel (and yes, it really is). After all, it was AMD who wanted to have positive reviews on the first Phenoms by inviting the testers to exotic locations (see Anand's review of Phenom 1). If someone tries to cheat you, you are less inclined to believe every bit of hype he tells you.

Reply
How about we get some other benchies by shin0bi272, 353 days ago
Just to shut up the AMD people. I haven't bought an AMD since the K6-III series and don't plan to go back now. But still I'd like to see more benchmarks with games and such in them so that we can see if/how much the Intel beats the AMD (again).

Reply
A real review by Shmak, 353 days ago
Those looking for a real review of the Shanghai can find it tucked away in the IT section here: http://www.anandtech.com/showdoc.aspx?i=3456&p=1

Reply
SMT by bonesdmz, 353 days ago
Does Nehalem still perform worse with SMT enabled when using newer Linpack binaries?

Reply
AMD vs Intel by octop, 352 days ago
The raw performance of a processor does not rely on architecture alone. The manufacturing process technology plays an important part, and Intel wins in this area. AMD, as well as the readers here, knows that too. In my view, in order for AMD to survive, it has to beat Intel in a more creative way, which shows in its CPU design. Look, Nehalem takes a lot of concepts similar to AMD Agena. I think Intel is copying ideas from the AMD K9 processor in terms of bus technology, the importance of embedding the memory controller into the processor, etc.

I never think that AMD loses, either in design or, sometimes, in performance. Please note that software also makes the equation more complex. Some benchmarks only calculate raw speeds which don't really apply in real life, AND the API used by the benchmark software also depends on the OS. Is the OS not biased? If I were Microsoft, how could I create a kernel that takes the best from both worlds with fixed resources, AT the time my R&D team is working on the kernel? I'd tune my compiler for the most popular CPU with its latest optimised instruction set AT that time. All in all, it would be better to base your benchmark on the intended application rather than raw performance or architecture. You'll never find a fair answer.

So, there's really no point in arguing over the results. But arguing over the details will benefit all of us.

By the way, how much did you use to pay for an AMD CPU with similar/equivalent performance compared to Intel? And how much would an Intel CPU cost us without AMD's pressure? I think cost should also be factored into the benchmark for a general high-level comparison. Intel is selling Penryn at an affordable price because it has achieved economies of scale in manufacturing, and also because of competition from AMD. All the bucks you pay are mostly for manufacturing technology; the silicon doesn't cost much.

That's just my 2 cents. Hope you guys don't dispute each other anymore.

Reply
RE: AMD vs Intel by jmurbank, 348 days ago
I agree AMD still has not lost the processor game. Gamers think AMD did lose because benchmarks show that Phenom and Athlon64 do not equal or overcome the performance of Core 2 Duo. What gamers do not understand is that AMD is very strong in the server market. From what I have seen of past benchmarks comparing Core 2 Duo and AMD processors, they have not done a very heavy load test. I am seeing a little glimpse, with heavy loads, that AMD Shanghai has a higher performance per watt ratio compared to Intel Harpertown processors.

I disagree that a smaller fabrication process improves performance. Sure, it can provide higher clocks and more area to include more transistors. There are other ways to make a processor efficient, such as turning a RISC processor to act like a VLIW processor, which Intel has done with its Core 2 Duo processor. A silicon disk does cost a lot of money. Making pure silicon is not easy and the manufacturing process is very expensive.

From the results that I am seeing with LINPACK for both the AMD Shanghai processor and the Intel Nehalem processor, they are both equal in performance. I cannot tell if the Intel processor is using non-ECC memory and if the AMD processor is using ECC memory. If one is using non-ECC and the other is using ECC memory, no wonder there is a 5 percent performance boost. Also both setups should hold about 6 GiB (6 of 1 GiB memory modules) of RAM to do a good comparison. Intel fans celebrating over a 5 percent performance gain should look at it closer.

AMD has a better cost advantage. Their on-board motherboard chipsets are better and are cheaper compared to Intel motherboard chipsets. Also AMD systems have some flexibility to use 3rd-party motherboard chipsets without having any problems. Using 3rd-party motherboard chipsets for Intel systems does have problems, and history has shown this.

I call myself an AMD fan, but again this LINPACK review just shows that AMD Shanghai has equal performance to Intel Nehalem because the controls are vague and there are too many variables.

Reply
Couple Questions Johan on how you got the results.. by twilkens, 347 days ago
Johan,
Long time since we talked. I spoke with Joshua in TX and he informed me that he sent you some data on how to run and a binary with the latest ACML. You used that in your update, right? ACML 4.2 is quite a bit faster than 4.1 and uses a new algorithm which will be pushed out to all of ACML in the next year.

Another question I had, did you run HPL? There's a hpl.dat file there which needs to be tuned. Like you said in the review.. the chip's important, but so is the software (aka library but also that hpl.dat file). In the hpl.dat file there's an NB parameter. For ACML 4.2 that parameter is best set to 168. For MKL, ATLAS and GOTO they have their favorites so fyi if you switch libraries you also need to tune that parameter. Don't know if you knew about this but .. there it is.

Stay in touch.. I'm sure you have the email address.. and best regards Johan.

Tim Wilkens

Reply
RE: Couple Questions Johan on how you got the results.. by IdaGno, 346 days ago
sheeshe! picky, picky, picky!

bottom line: OPTIMIZED CODE IS CRITICAL

this should surprise no one

lighten up, people

Reply
Virtualization by piesquared, 342 days ago
Johan, an honest question seeking an honest answer. I know you've said in the past that virtualization articles are complicated and time consuming. It's been almost a month now. As you know, virtualization is THE feature the industry is begging for going forward, and is one of, if not the strongest feature of Shanghai. I'm just wondering why you didn't make this a top priority in your testing, and if we shouldn't expect to see any articles or reviews in this regard, until Intel has something that competes. It is really starting to look that way, but I hope I'm wrong as I've had a lot of respect for you in the past. I doubt Core i7 will be all that competitive for VM, but even so, I'll be extremely disappointed if it miraculously makes an appearance. Especially given the time to market between the two server parts.

Anyway, i'll continue to wait.................................

Reply
Copyright © 1997-2009 AnandTech, Inc. All rights reserved.