Words of thanks

A lot of people gave us assistance with this project, and we would like to thank them:

David Van Dromme, Iwill Benelux Helpdesk (http://www.iwill-benelux.com)
Ilona van Poppel, MSI Netherlands (http://www.msi-computer.nl)

Frank Balzer, IBM DB2/SUSE Linux Expert
Jasmin Ul-Haque, Novell Corporate Communications - SUSE LINUX

Matty Bakkeren, Intel Netherlands
Trevor E. Lawless, Intel US
Larry D. Gray, Intel US
Markus Weingartner, Intel Germany

Nick Leman, MySQL expert
Bert Van Petegem, DB2 Expert
Ruben Demuynck, Vtune and OS X expert
Yves Van Steen, developer Dbconn

Damon Muzny, AMD US

I would also like to thank Lode De Geyter, Manager of the PIH, for letting us use the infrastructure of the Technical University of Kortrijk in which to test the database servers.

Benchmark configuration

To ensure that our databases were stable and reliable, we followed the guidelines of SUSE and IBM. For example, DB2 is only certified to run on the SLES versions of SUSE Linux; in theory, you cannot run it on just any Linux distribution. We also used the MySQL version (4.0.18) that came with the SUSE SLES 9 CDs, which was certified to work on our OS.

Network performance wasn't an issue. We used a direct Gigabit Ethernet link between client and server. On average, the server received 4 Mbit/s and sent 19 Mbit/s of data, with a peak of 140 Mbit/s, well below the limits of Gigabit Ethernet. The disk system wasn't overly challenged either: up to 600 KB/s of reads and at most 23 KB/s of writes. You can read more about our MySQL and DB2 test methods.
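To put those network figures in perspective, here is a minimal Python sketch that expresses each quoted rate as a share of the link. The 1000 Mbit/s nominal capacity and the function name `link_utilization` are assumptions made for illustration; real usable throughput is somewhat lower because of protocol overhead.

```python
# Nominal Gigabit Ethernet capacity in Mbit/s (assumption: protocol overhead ignored)
GIGABIT_LINK_MBIT = 1000.0

def link_utilization(rate_mbit, capacity_mbit=GIGABIT_LINK_MBIT):
    """Return the fraction of link capacity used by a given traffic rate."""
    return rate_mbit / capacity_mbit

# Traffic rates quoted in the text, in Mbit/s
traffic = {"average received": 4.0, "average sent": 19.0, "peak": 140.0}

for label, rate in traffic.items():
    print(f"{label}: {rate:g} Mbit/s = {link_utilization(rate):.1%} of the link")
```

Even the 140 Mbit/s peak works out to about 14% of the nominal link capacity, which supports the claim that the network was not a limiting factor.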

Software:

IBM DB2 Enterprise Server Edition 8.2 (DB2ESE), 32 bit and 64 bit
MySQL 4.0.18, 32 and 64 bit, MyISAM and InnoDB engine
SUSE SLES 9 (SUSE Enterprise Edition), Linux kernel 2.6.5, 64 bit

Hardware

We'll discuss the different servers that we tested in more detail below. Here is the list of the different configurations:

Intel Server 1:
Dual Intel Nocona 3.6 GHz 1 MB L2-cache, 800 MHz FSB - Lindenhurst Chipset
Dual Intel Irwindale 3.6 GHz 2 MB L2-cache, 800 MHz FSB - Lindenhurst Chipset
Intel® Server Board SE7520AF2
8 GB (8x1024 MB) Micron Registered DDR-II PC2-3200R, 400 MHz CAS 3, ECC enabled
NIC: Dual Intel® PRO/1000 Server NIC (Intel® 82546GB controller)

Intel Server 2:
Dual Xeon DP 3.06 GHz 1 MB L3-cache, Dual Xeon 3.2 GHz 2MB L3-cache
Dual Xeon 3.2 GHz
Intel SE7505VB2 board - Dual DDR266
2 GB (4x512 MB) Crucial PC2100R - 250033R, 266 MHz CAS 2.5  (2.5-3-3-6)
NIC: 1 Gb Intel RC82540EM - Intel E1000 driver.

Intel Server 3:
Pentium-D EE 840
Intel SE7505VB2 board - Dual DDR266
2 GB (4x512 MB) Crucial PC2100R - 250033R, 266 MHz CAS 2.5  (2.5-3-3-6)
NIC: 1 Gb Intel RC82540EM - Intel E1000 driver.

Opteron Server 1: Dual Core Opteron 875 (2.2 GHz), Dual/ Single Opteron 850, Dual/Single Opteron 848
Iwill DK8ES Bios version 1.20
4 GB: 4x1GB MB Transcend (Hynix 503A) DDR400 - (3-3-3-6)
NIC: Broadcom BCM5721 (PCI-E)

Quad Opteron Server 2: Iwill H4103: Quad Opteron 844, 848
Iwill H4103
4-8 GB: 4-8x1GB MB Transcend (Hynix 503A) DDR400 - (3-3-3-6)
NIC: Intel 82546EB (PCI-E)

Opteron Server 3: Dual Core Opteron 875 (2.2 GHz), Dual/ Single Opteron 848
MSI K8N Master2-FAR
4 GB: 4x1GB MB Transcend (Hynix 503A) DDR400 - (3-3-3-6)
NIC: Broadcom BCM5721 (PCI-E)

Opteron Server 4: AMD Quartet: Dual Opteron 848, Quad 848
Quartet motherboard, Zildjian personality board, Tobias backplane board and Rivera power distribution board.
Quad configurations: 4 GB: 8x512 MB Infineon PC2700 Registered, ECC enabled
Dual configurations: 2 GB: 4x512 MB Infineon PC2700 Registered, ECC enabled
NIC: Broadcom NetXtreme Gigabit

Client Configuration: Dual Opteron 850
MSI K8T Master1-FAR
4x512 MB Infineon PC2700 Registered, ECC enabled
NIC: Broadcom 5705

Shared Components
1 Seagate Cheetah 36 GB - 15000 RPM - 320 MB/s
Maxtor 120 GB DiamondMax Plus 9 (7200 RPM, ATA-100/133, 8 MB cache)

Software

Vtune for Windows version 7.2, Vtune for Linux remote agent 3.0
Code Analyst for Linux 3.4.8
Code Analyst for Windows 2.3.4

More about the servers in this test

Although our main focus is the database server performance of the different AMD and Intel platforms, allow me to introduce some of the motherboards and server barebones that we used in this test.

Iwill H4103

The amount of power that the Iwill H4103 can pack in a 1U rack mounted case is nothing short of amazing. As you can see, there are no less than 4 Opterons in this pizza box, which also allows you to put up to 32 GB of DDR RAM in there. Even more impressive is the inclusion of two redundant 700 Watt power supplies.

The preferred habitat of such a beast is, of course, an HPC (High Performance Computing) environment, but it can also be used as a database server. The only problem is that the sole disk interface available is the old and (especially for server applications) slow P-ATA interface. It is possible to cram two disks and a slim CD-ROM drive in there, but P-ATA disks are not a decent solution for a server. So, why should this quad CPU monster be in a database server review?

First of all, the H4103 is equipped with 4 Gigabit Ethernet ports, courtesy of an Intel 82546EB dual-channel GbE LAN controller, connected to the PCI-X bus. This allows you to connect to NAS devices at Gigabit speeds. The integration of the Intel Gigabit Ethernet chip is a very good move: Intel's Gigabit chips are capable of reaching up to 900 Mbit/s at CPU loads of less than 20% (measured with an Opteron 248).

The second option is to use the PCI-X 64-bit 133/100/66 MHz expansion slot with a riser card (which is what we did) and connect Direct Attached Storage (DAS) externally, such as an array of SCSI disks. It is pretty clear that when it comes to saving rack space, the Iwill H4103 is a very interesting option.

What about cooling? Well, this is a server, of course, and a whole battery of 10,000 RPM fans keeps the copper heat sinks cool. In our poorly cooled lab, temperatures easily rose to 30°C and higher, but the Iwill H4103 heat sinks hardly became warm under full load. According to Iwill, you should be able to use dual core Opterons in this pizza box, but we haven't been able to verify this as the necessary BIOS version had yet to arrive.

The H4103 left a very stable and highly performing impression upon us. The only thing left that would make this ultra compact quad Opteron machine complete is the integration of a SCSI controller or at least a very good SATA controller.

Iwill DK8ES

The Iwill DK8ES is a server board based on NVIDIA's nForce 4 2200 Professional chipset. The board integrates an ATI RageXL VGA controller, 2 PCI-Express x16 expansion slots (one in PCI-Express x2 mode), 3 PCI-X 64-bit 133/100/66 MHz expansion slots and 4 SATA ports. Two Broadcom BCM5721 PCI-E Gigabit Ethernet controllers are connected to a PCI-E port.

The interesting thing about the Iwill board is the high quality components that have been used on the board, such as tantalum capacitors and high end Digital VRMs.

The digital VRMs allow a very precise voltage regulation, increasing the stability of the server.

The board has proven to be fully stable during 6 weeks of heavy database server testing. The only problem was that Linux didn't like running dual core Opterons on this board. A BIOS update (to 1.20) made the Opteron 275 and 875 run stably on this board, but Linux (kernel 2.6.12) still didn't use both cores: both were reported, but only one was used. We suspect that Iwill may have to work out a BIOS issue, or that some of the NVIDIA Linux drivers still need some tuning.

MSI's K8Master-FAR2

MSI sent us a completely different, relatively cheap workstation board that should enable very compact quad core (two dual cores) Opteron machines. The MSI K8Master-FAR2 is based on the NVIDIA nForce 4 Pro chipset.

In order to get two Opteron sockets, one 32-bit/33 MHz PCI slot, one PCI Express x4 slot and two PCI Express x16 slots (SLI mode supported) on a standard ATX board, MSI didn't give the second CPU local memory. Despite this limitation, the MSI K8Master-FAR2 proved to be an excellent performer in our database server tests.

Six memory slots allow up to 12 GB of RAM, not bad for such a compact board.

We weren't very happy with the Gigabit Marvell 88E1111 PHY interface, which easily consumes more than 60% of the CPU (Opteron 248) and delivers only 500 Mbit/s. Luckily for our tests, we could use the second Gigabit Ethernet chip, the Broadcom BCM5788 (800 Mbit/s at 30% CPU load). We would also like to see the fan on the NVIDIA chipset replaced by a decent heat sink. Four SATA-II (300 MB/s) connections are available.

But at the end of the tests, the MSI K8Master-FAR2 (BIOS version 1.0) proved to be a very capable board. Supported by our Vantec 470W power supply, it had no trouble at all with two 875 Opterons running heavy database server tests for more than 3 weeks.


45 Comments


  • JohanAnandtech - Saturday, June 18, 2005 - link

    Mino, thanks for pointing that out. Query cache enabling has nothing to do with "stressful". It has to do with accelerating a few queries that are run over and over again. Which is very interesting for reducing the response time of a website serving up the last article, but which is not limited by CPU power at all.



    Reply
  • JohanAnandtech - Saturday, June 18, 2005 - link

    To the people who make a fuss about disabling the query cache: this has nothing to do with the Opteron not performing well in that situation. Single Xeon: 980 queries/s. Dual Xeon: 985 queries/s. Opteron 250: 1020 queries/s. Get it now why I say "other bottlenecks started to kick in"?

    It is impossible that a dual Xeon can't outperform a single one in these tests. We tried to find the bottleneck and even used a quad Opteron 850 as client. The client was not the problem. My bet is on the network latency, but I have no knowledge of tools to profile the complete machine. The disk was not the problem, we tested that. Network bandwidth neither. My bet is on the network latency, or even the OS, as the bottleneck kicked in a lot sooner with kernel 2.4
    Reply
  • mino - Friday, June 17, 2005 - link

    #32 try to think for a moment
    "Because the Opteron can't perform that well in stressful situations you won't post the scores?"

    If the CPU is not the bottleneck in the query cache scenario then why test the effect of CPU at all !!!

    You remind me of a friend of mine who "tested" the effect the "FSB" has on an A64 system NOT having an FSB at all !!! ;-)
    Funny guy indeed.

    And about an intel compiler not being used.
    Like it or not, it IS a fact that it is not widely adopted, especially among the target audience of this site and article.

    BTW given the past experience intel compiler would produce better code even on AMD systems so don't be so sure! Best code for K7 is made by intelcc set to PIII config. Albeit it does not use 3DNow! functionality at all.
    Reply
  • ElMoIsEviL - Friday, June 17, 2005 - link

    I think I have to agree with #20, as much as I am un-biased I feel this test was doctored by AMD... it resembles the tests we see released by Apple often...

    "We didn't use the Intel compiler version as we have reason to believe that this version is not used a lot in the real world. We might try it out in a future article."

    Translation, "with the intel compiler AMD lost so being a marketing force for AMD we opted not to post those scores".


    and also as was mentioned before...
    ""The " query cache" was off, as we wanted to test worst case performance. In some cases, the query cache was able to push a single Xeon to 1000 queries per second, and the CPU was still capable of doing more, as the CPU load was at 50% - 70%."

    Why not?
    Because the Opteron can't perform that well in stressful situations you won't post the scores?

    Seriously.. this test is the biggest load of BS I have ever read... and I'm a current AMD adopter.
    Reply
  • JohanAnandtech - Friday, June 17, 2005 - link

    Viditor, it is possible that the IOMMU might have something to do with it.

    The IOMMU is a memory mapping unit sitting between the I/O bus and physical memory.

    Memory mapping is AFAIK only necessary if a certain device (PCI devices come to mind) can not do a 64 bit DMA. Now it seems that almost everything inside the newest Intel southbridges can do 64 bit DMA.

    So the IOMMU can only play a role when the driver is 32 bit only, and the memory mapping has to happen. Now I would think that Intel would have an advantage here with their ultra modern southbridges. There might be a device that I am overlooking of course. Maybe our SCSI controller... But I don't think so.
    Reply
  • Viditor - Friday, June 17, 2005 - link

    Johan, if you're still reading (great article BTW)...
    A question I have had for quite awhile now is what effect the IOMMU has on these tests.
    The reasons I'm asking are
    1. I noticed that there was quite a disparity between the AMD and Intel 64bit performance (which you mentioned).
    2. I know that one difference between the 2 platforms is that AMD has a hardware IOMMU (of sorts) and Intel (at present) does not.
    3. I saw a thread last year with Linus T mentioning this quite a bit. He seemed to think that this would impair the EM64T substantially...

    Your thoughts?
    Reply
  • JohanAnandtech - Friday, June 17, 2005 - link

    If your database is running many "identical databases".... I meant "queries"

    Reply
  • JohanAnandtech - Friday, June 17, 2005 - link

    Juhl: It was 2.6.12rc5.

    Viditor: thanks for the helpful comment. Indeed, if you turn on the query cache, your CPU is doing very little.
    Everybody else: note the "identical" word in viditor's quote. If your database is running many identical databases, then you are not going to spend time reading this kind of article: you simply buy the cheapest decent server. Any CPU today can run 1000s of queries if everything comes out of the query cache.

    Running benchmarks with the query cache on is simply not interesting. The query cache is all about accelerating the IDENTICAL queries that are run from time to time. You might reserve a bit of RAM to make sure that the most common queries (getting the latest article of a website for example) are run faster.

    But those numbers don't tell you anything about the load that your server is going to be able to take. You want worst case performance numbers!
    Reply
  • Viditor - Friday, June 17, 2005 - link

    Questar - the reason the query cache was turned off (guessing here) is to more reasonably simulate a real-world test. Obviously in this test, the same queries are repeated quite often. But that is not usually the case in the real world...
    For those who don't know what the heck a "query cache" is:

    "the query cache stores the text of a SELECT query together with the corresponding result that was sent to the client. If the identical query is received later, the server retrieves the results from the query cache rather than parsing and executing the query again"
    Reply
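The behaviour described in the quote above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not MySQL's implementation: the class `QueryCache` and the stand-in function `slow_execute` are hypothetical names, and invalidating cached results when the underlying tables change is left out entirely.

```python
class QueryCache:
    """Toy result cache keyed on the exact text of a SELECT statement."""

    def __init__(self, execute):
        self._execute = execute  # hypothetical callable that really runs the query
        self._results = {}       # query text -> result sent to the client
        self.hits = 0
        self.misses = 0

    def query(self, sql):
        # Only a byte-for-byte identical query text counts as a hit
        if sql in self._results:
            self.hits += 1
            return self._results[sql]
        self.misses += 1
        result = self._execute(sql)
        self._results[sql] = result
        return result


def slow_execute(sql):
    # Stand-in for parsing and executing the query (the CPU-heavy part)
    return f"rows for: {sql}"


cache = QueryCache(slow_execute)
latest = "SELECT title FROM articles ORDER BY posted DESC LIMIT 1"
cache.query(latest)                                      # executed, then stored
cache.query(latest)                                      # identical text: served from the cache
cache.query("SELECT title FROM articles WHERE id = 7")   # different text: executed
print(cache.hits, cache.misses)
```

The repeated query never reaches `slow_execute` a second time, which is why a benchmark that keeps replaying identical queries with the cache enabled says little about how much work the CPU can actually handle.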
  • Questar - Friday, June 17, 2005 - link

    #23,

    We don't know, it specifically says Xeon. We don't have any idea what happens on an Opteron.
    Reply
