Words of Thanks

A lot of people gave us assistance with this project, and we would of course like to thank them.
Damon Muzny, AMD US
Brett Jacobs, AMD US
(www.amd.com)

Randy Chang, ASUS
(www.asus.com)

Kelly Sasso, Crucial Technology

Matty Bakkeren, Intel Netherlands
(www.intel.com)

Benchmark configuration

Here is the list of the different configurations. All servers have been flashed to the latest BIOS, and unless noted otherwise, the BIOS settings were left at their defaults.

Opteron 2350 Server: ASUS KFSN4-DRE
Dual Opteron 2350 2GHz
ASUS KFSN4-DRE BIOS version 1001.02 (8/28/2007) - NVIDIA nForce Pro 2200 chipset
8GB (4x2GB) Crucial Registered DDR2-667 CL5 ECC
NIC: Broadcom BCM5721

Opteron Socket F 1207 Server: Tyan Transport TA26 - 2932
Dual Opteron 2222 3GHz / 2224SE 3.2GHz
Tyan Thunder n3600m (S2932) - NVIDIA nForce Pro 3600 chipset
8GB (4x2GB) Crucial Registered DDR2-667 CL5 ECC
NIC: nForce Pro 3600 integrated MAC with Marvell 88E1121 Gigabit Ethernet PHY

Xeon Server: Intel "Bensley platform" server
2x Xeon 5160 3GHz or 2x Xeon E5345 at 2.33GHz
Intel Server Board S5000PSL - Intel 5000P Chipset
8GB (4x2GB) Crucial Registered FB-DIMM DDR2-667 CL5 ECC
NIC: Dual Intel PRO/1000 Server NIC
BIOS note: Hardware prefetching disabled

Client Configuration: Dual Opteron 850
MSI K8T Master1-FAR
4x512MB Infineon PC2700 Registered, ECC
NIC: Broadcom 5705

Software
SUSE Linux SLES 10 SP1 (Linux 2.6.16.46-smp)
MySQL 5.0.26 as shipped with SUSE SLES 10 SP1
SPECjbb2005
Sun Hotspot Java JVM 1.5.0_08
3DSMax 9
Cinebench 9.5
WinRAR 3.62

ASUS KFSN4-DRE: Split Power Enabled

ASUS was the first to send us a board that supports AMD's split power plane. Split power plane, or Dual Dynamic Power Management, means that the memory controller and the cores are fed from different power rails. According to AMD, this allows the memory controller to run at 1.8GHz instead of 1.6GHz and provides a 7% boost in raw memory performance. The new quad-core Opteron processors should also work in older Socket F motherboards after a BIOS update. On those unified-plane motherboards, however, they will be slightly slower (because of the lower-clocked memory controller) and consume a bit more power.
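To put AMD's numbers in perspective, here is a minimal Python sketch that compares the memory controller clock increase with the claimed performance gain. The three figures come from the paragraph above; the variable and function names are ours, purely for illustration.

```python
# Figures quoted from AMD in the text above.
SPLIT_PLANE_GHZ = 1.8        # memory controller clock with split power planes
UNIFIED_PLANE_GHZ = 1.6      # memory controller clock on a unified-plane board
CLAIMED_MEMORY_GAIN = 0.07   # AMD's claimed boost in raw memory performance


def relative_gain(new: float, old: float) -> float:
    """Return the fractional increase of `new` over `old`."""
    return (new - old) / old


clock_gain = relative_gain(SPLIT_PLANE_GHZ, UNIFIED_PLANE_GHZ)
realised_share = CLAIMED_MEMORY_GAIN / clock_gain

print(f"Memory controller clock gain: {clock_gain:.1%}")
print(f"Claimed raw memory performance gain: {CLAIMED_MEMORY_GAIN:.0%}")
print(f"Share of the clock gain that shows up as performance: {realised_share:.0%}")
```

In other words, a 12.5% higher memory controller clock translates, by AMD's own figure, into roughly half that gain in raw memory performance.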

The ASUS KFSN4-DRE supports no fewer than 16 DIMMs in total, good for a maximum of 64GB of RAM (but at speeds lower than 667 MHz). To improve performance, ASUS has implemented a "dual link": the two processor sockets are connected by a pair of 16-bit coherent HyperTransport buses instead of a single one. According to ASUS, dual link and split power planes offer not only lower power but also up to 14% higher performance. Unfortunately, ASUS has not disclosed in which application this 14% was measured.
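The capacity and interconnect figures above can be checked with a few lines of Python. This is a rough sketch: the DIMM count, maximum RAM, socket count, and 16-bit link width come from the text, while the HyperTransport transfer rate of 2000 MT/s (a 1GHz double-pumped link) is our assumption and is not stated in this article.

```python
# Figures taken from the text above.
DIMM_SLOTS = 16
MAX_RAM_GB = 64
SOCKETS = 2
HT_LINK_WIDTH_BITS = 16

# ASSUMPTION (not stated in the article): each HyperTransport link runs at
# 1GHz double data rate, i.e. 2000 million transfers per second.
ASSUMED_HT_MEGATRANSFERS_PER_S = 2000

gb_per_dimm = MAX_RAM_GB / DIMM_SLOTS
dimms_per_socket = DIMM_SLOTS // SOCKETS


def ht_bandwidth_gb_s(links: int) -> float:
    """Peak bandwidth per direction, in GB/s, for `links` parallel HT links."""
    bytes_per_transfer = HT_LINK_WIDTH_BITS / 8
    return links * bytes_per_transfer * ASSUMED_HT_MEGATRANSFERS_PER_S / 1000


print(f"DIMM size needed for {MAX_RAM_GB}GB: {gb_per_dimm:.0f}GB")
print(f"DIMM slots per socket: {dimms_per_socket}")
print(f"Single link: {ht_bandwidth_gb_s(1):.1f} GB/s per direction")
print(f"Dual link:   {ht_bandwidth_gb_s(2):.1f} GB/s per direction")
```

Under that assumption, the dual link doubles the peak socket-to-socket bandwidth, which would mostly matter when one CPU frequently accesses memory attached to the other.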


The SSI EEB board is clearly made for a compact 1U server. The single x16 PCI-E slot can take a riser card that provides two x8 PCI-E slots. Using the NVIDIA nForce Pro 2200, the board provides four SATA ports that support RAID levels 0, 1, 10, and 5. Luckily for those of you thinking of running VMware ESX on this board, a version with the LSI 1064 SAS controller and four SAS connectors will also be available. ASUS has made an excellent choice by using the Broadcom BCM5721 PCI-E for the Gigabit LAN interfaces.

ASUS left some space for an optional IPMI module (ASMB3) for out-of-band and remote server management. With two 667 MHz DIMMs in each node, the ASUS KFSN4-DRE ran all our benchmarks for many hours. The LINPACK benchmark in particular takes up to eight hours, and it shows that the board behaved very well in this configuration. We will investigate other DIMM configurations later.

46 Comments

  • tshen83 - Monday, October 01, 2007

    According to the MySQL site, starting with 5.0.37 the mutex contention bug and the InnoDB bug have been improved a lot, which helps 8-core systems.

    Since 5.0.45 is available on MySQL's website, why isn't the latest MySQL being benchmarked? 5.0.26 still has that bug, and you can see it in the benchmark, where an 8-core system is slower than a 4-core system, which is slower than a 2-core system.

    Now that we are benchmarking 8-16 core systems, the newest versions of software should be used to reflect the improved multithreading.
  • swindelljd - Wednesday, September 12, 2007

    I currently have a 4-way 2.4GHz Opteron as a production DB server that I am considering upgrading. I'm trying to use the Anandtech benchmarks to help project how much performance gain we'll see in a new machine.

    We're running Oracle but are considering moving to MySQL. So I am trying to compare the stats in two Anandtech reviews to see how the new Barcelona cores compare to the Intel Woodcrest and Clovertown.

    In this article from June 2006 ( http://www.anandtech.com/IT/showdoc.aspx?i=2772&am... ), 2x 3GHz Woodcrests (4 cores, right?) run the MySQL test at about 950 QPS (queries per second) for 25, 50 and 100 concurrent sessions.

    However, this recent article from September 2007 ( http://www.anandtech.com/IT/showdoc.aspx?i=3091&am... ) appears to show the same 2x 3GHz Woodcrests running 700, 750 and 850 QPS for 25, 50 and 100 connections respectively. That represents a 20% or so DECREASE in performance of the same chip in the last 12 months.

    What am I missing?

    Ultimately I want to compare the Opteron 2350 vs Xeon 5345 and then the Opteron 8350 vs Xeon E7330, but I'm starting with the benchmarks that exist so I can make sure I understand what I am reading.

    Can someone please help set me straight?

    Thanks,
    John
  • JohanAnandtech - Monday, September 17, 2007

    The article from June 2006 uses 5.0.21, and there might also be a small change in tuning. The article from September 2007 uses the standard 5.0.26 MySQL version that you get with SLES 10 SP1.

    The best numbers are here:
    http://www.anandtech.com/cpuchipsets/intel/showdoc...

    The newest version, 5.0.45, will give you performance like the above article: MySQL has incorporated the patches we talked about (that Peter Z. wrote) in this new version.
  • Jjoshua2 - Tuesday, September 11, 2007

    I like this benchmark a lot as I am a fan of computer chess. "Higher" is misspelled on the graph on that page: it reads "Hiher is better".
  • Schugy - Tuesday, September 11, 2007

    Maybe it's too early for gcc optimizations, but how about testing programs like oggenc, ffmpeg, blender, kernel compilation, apache with openssl, doom III and so on?
  • erikejw - Monday, September 10, 2007

    I read another review and they got these scores on the slightly lower-speed 1.9 GHz Barcelona.

    Barcelona 2347 (1.9GHz)
    37.5 Gflop/s

    Intel Xeon 5150 (2.6GHz)
    35.3 Gflop/s

    It seems your Barcelona scores are way off for some reason.
    The Xeon's score is more or less identical.
    This seems really weird. Normally the higher score is the correct one, with the lower one due to some bad optimizations. The rest of the article is great though.
  • kalyanakrishna - Monday, September 10, 2007

    This article seems to be very biased.
    1) They chose faster Intel processors against a 2 GHz Opteron. There are 2 GHz processors available across all the processors used in this analysis.
    2) No mention of what compiler was used. Intel compilers earlier had a trick, which was not documented - any code optimized for Intel processors, if used on non-Intel processors (uhm! AMD), would disable all optimizations. Who knows what else they are doing now. And this gentleman used Intel-optimized code on AMD to test performance. Who in their right mind measuring performance would do that?
    3) Intel MKL was used for BLAS. Shouldn't they use ACML for AMD code? Again, who would do that when looking for performance?
    4) Memory subsystem - knowing that the frequencies are different, why were all the results not normalized?
    5) On the first page they managed to comment that Tulsa and the Opteron 2000 series have half the performance of Core or Barcelona and hence should not be considered. But on the Linpack page, it is mentioned that Intel chips ate AMD ones for breakfast. Of course they did - the peak of the Xeon 5100 series is twice that of the Opteron 2000 series. You don't need LINPACK to tell you that. Gives a very biased impression.
    6) The LINPACK results graph could not be any more wrong. The peak performance of each CPU considered is different ... obviously their sustained performance is going to be different. The author should have at least made the effort to normalize the graph to show the real comparison.
    7) Since when is Linpack "Intel friendly"?

    The author says they didn't have time to optimize code for AMD Opteron ... why would you do a performance study in the first place if you didn't have the methodology right?

    I didn't even read beyond LINPACK .. I would be careful reading articles from this author next time, and maybe the whole site ... It's sad to see such an immature article. What's worse is that the majority of people would just see the "fact" that Intel is still faster than AMD.

    Overall, a very immature article with false information cleverly hidden behind numbers. Or could it be that this article was intended to be biased .... who knows.
  • JohanAnandtech - Monday, September 10, 2007

    quote:

    why would you do a performance study in the first place if you didnt have the methodology right.
    quote:

    Memory Subsystem - knowing that the frequencies are different, why were all the results not normalized?


    What about the bytes/cycle in each table?

    quote:

    The author should have at least made the effort to normalize the graph to show the real comparison.


    Why is that the "real comparison"? If Intel has a clockspeed advantage, nobody is going to downclock their CPUs to be fair to AMD.

    quote:

    7) Since when is Linpack "Intel friendly"


    First you claim we are biased. As we disclose that the binary that we run was compiled with Intel compilers targeting the Core architecture, it is clear that the binary is somewhat Intel friendly.

    quote:

    why would you do a performance study in the first place if you didnt have the methodology right.


    It is not wrong. It is incomplete, and we admit that more than once. But considering AMD gave us only a few days before the NDA was over, it was impossible to cover all angles.
  • erikejw - Tuesday, September 11, 2007

    quote:

    Why is that the "real comparison"? If Intel has a clockspeed advantage, nobody is going to downclock their CPUs to be fair to AMD.


    That is true in the desktop scene, but I am sure you know that servers are about performance/price and performance/W. Prices will decline and we don't know what the price will be tomorrow. It is OK to compare against a similarly priced CPU, but a comparison against a same-frequency CPU is very interesting too.

    Your LINPACK score just seems obscure. Somewhat Intel friendly compiler? LOL. If the compiler is so great, why is the gcc score I read in another review 30% higher with the Barcelona (with a 1.9 GHz CPU)? That is just ridiculous. I thought this review was about architecture and what it can perform, not about which compiler we use. If it is true that optimizations are turned off in the Intel compiler when it runs on an AMD CPU, then the score is worthless and the comparison is severely biased.
  • JohanAnandtech - Tuesday, September 11, 2007

    quote:

    Your LINPACK score just seems obscure. Somewhat Intel friendly compiler? LOL. If the compiler is so great, why is the gcc score I read in another review 30% higher with the Barcelona (with a 1.9 GHz CPU)? That is just ridiculous. I thought this review was about architecture and what it can perform, not about which compiler we use. If it is true that optimizations are turned off in the Intel compiler when it runs on an AMD CPU, then the score is worthless and the comparison is severely biased.


    Which review? Did they fully disclose the compiler settings?

    If the Intel compiler did fool us and turned off optimisations, we will update the numbers.
