Introduction

Enterprise versions of Linux based on kernel 2.6 and 64 bit database servers are now very mature. Dual core 64 bit Opterons and 64 bit Xeons with 2 MB L2 caches are available. It was definitely time to update our previous Linux Database Server CPU comparison.

In this article, you will find a comparison of the latest Xeon (Irwindale), the previous Xeon (Nocona), the old Xeon (Gallatin), the Dual core Opteron, and the "normal" Opteron, of course. We also included the Pentium-D to get an idea of what a Dual core Xeon could do, although the comparison is not completely fair: the memory subsystem of a Dual core Xeon will have higher latency and slightly lower bandwidth as it will use ECC buffered DIMMs instead of non-buffered DIMMs.

In our previous article, we used SUSE SLES 8 (kernel 2.4.21) and the Xeon 3.6 GHz "Nocona" matched the performance of the Opteron 250 in 32 bit DB2, but failed to impress in MySQL. Intel's Xeon was not recognized as a 64 bit capable CPU by SLES 8 with kernel 2.4 however, and the Opteron gained 12% (DB2) and 30% (MySQL) when running in 64 bit.

On SLES 9, we can unleash the full 64 bit potential of both the Intel Xeon and Opteron. Kernel 2.6 includes improved support for NUMA, 64 bit, large memory pages and threading, and fully recognizes EM64T CPUs as 64 bit capable. How do the Xeon and Opteron compare when they both run 64 bit applications on a 64 bit enterprise version of Linux? Should you invest in Dual core CPUs, or are these expensive CPUs beaten by two single CPUs? Should you wait for Dempsey, the dual core Xeon?

These are a few of the questions that we will answer. While we continue to improve the quality of our benchmarks, we decided to report our first impressions.

The scope and focus of this test

Our last Database server comparison generated quite a bit of very useful and interesting feedback. Living up to the excellent AnandTech tradition, we have read it carefully and taken many suggestions to heart.

In a nutshell, the foci of this article are as follows:
  • Only CPU and CPU-chipset-memory database performance tests
  • Mostly Database reads
  • DB2 and MySQL on SUSE SLES 9 - Kernel 2.6.5
  • Database use of small and medium-sized enterprises
  • Single and dual processor systems

Our benchmark quality assurance methods include:
  • Checking the disk activity with iostat and vmstat
  • Constant monitoring of the Client's CPU load, network load and memory usage
  • Tests were repeated at least 3 times
  • All tests were performed with two different clients: a Dual Opteron 850 2.4 GHz and a Quad Opteron 848 2.2 GHz
  • Improved and optimised Client program
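
The disk activity check in the list above can be sketched in a few lines of Python. This is only an illustration, not the script used for this article: the helper name `disk_active_samples`, the `max_blocks` threshold and the `SAMPLE` text are all assumptions, and the sample numbers are made up to show the format rather than measured. The idea is to read the `bi` and `bo` (blocks in, blocks out) columns of `vmstat` output and flag any interval in which the disks were busy during a supposedly in-memory run.

```python
# Illustrative vmstat output. The numbers are invented for this sketch,
# they are NOT measurements from the benchmark runs.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0      0 512000  20000 900000    0    0     0     0 1200  3400 95  4  1  0
 3  0      0 511000  20000 900500    0    0     0    12 1250  3500 94  5  1  0
 1  1      0 400000  20000 950000    0    0  2048    16 1500  2000 40  5 20 35
"""


def disk_active_samples(vmstat_text, max_blocks=64):
    """Return (sample_number, bi, bo) for samples whose bi or bo exceeds max_blocks.

    max_blocks is a hypothetical tolerance; pick one that suits your setup.
    """
    rows = [line.split() for line in vmstat_text.strip().splitlines()]
    # Locate the column header by name so extra columns do not break parsing.
    header_idx = next(i for i, cols in enumerate(rows) if "bi" in cols and "bo" in cols)
    header = rows[header_idx]
    bi_col, bo_col = header.index("bi"), header.index("bo")

    flagged = []
    for number, cols in enumerate(rows[header_idx + 1:], start=1):
        if len(cols) != len(header) or not cols[0].isdigit():
            continue  # skip repeated headers or malformed lines
        bi, bo = int(cols[bi_col]), int(cols[bo_col])
        if bi > max_blocks or bo > max_blocks:
            flagged.append((number, bi, bo))
    return flagged


if __name__ == "__main__":
    for number, bi, bo in disk_active_samples(SAMPLE):
        print(f"sample {number}: bi={bi} bo={bo} (disk activity during the run)")
```

A run whose output produces an empty list can reasonably be treated as served from memory; any flagged sample is a hint that the result was influenced by disk I/O and should be repeated.
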
Real world databases are in many cases disk limited. Jason and Ross have been running 8 x 36GB 15,000RPM Ultra320 SCSI drives in RAID-0 to keep their Enterprise Class Performance tests from being limited by disk I/O performance.

However, the Lab of the Technical University of Kortrijk where we performed our tests did not have such an impressive disk array at its disposal, and we were determined to focus on the database performance of the different CPUs and CPU-chipset-memory combinations. All tests were done (99% of the time) with in-memory queries. Investigating the performance of different disk storage systems is a time-consuming and completely different project.

We again tested with our 1 GB database, imported into MySQL (MyISAM and InnoDB) and IBM's DB2 8.2.

Some of you might still be convinced that in-memory tests are not really relevant. Consider that the availability of cheap 64 bit systems makes it possible to use much more RAM than before. Flat 64 bit addressing of more than 4 GB of RAM used to be a privilege of very expensive servers (Power4, etc.), but this is no longer the case with the introduction of Intel's EM64T Xeons and AMD's AMD64 Opterons.
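
To make the 4 GB boundary concrete, here is a minimal Python sketch of the arithmetic. The function names are hypothetical and exist only for this illustration. It computes how much memory a flat address space of a given pointer width can reach, and checks whether a given amount of RAM fits inside it.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def flat_address_space_gib(pointer_bits):
    """Size in GiB of the flat address space reachable with pointers of this width."""
    return (2 ** pointer_bits) / GIB


def fits_in_flat_space(ram_gib, pointer_bits):
    """True if ram_gib of memory can be addressed directly with pointer_bits-wide pointers."""
    return ram_gib <= flat_address_space_gib(pointer_bits)


if __name__ == "__main__":
    print(flat_address_space_gib(32))   # size of a flat 32 bit address space in GiB
    print(fits_in_flat_space(8, 32))    # does 8 GB fit behind 32 bit pointers?
    print(fits_in_flat_space(8, 64))    # does 8 GB fit behind 64 bit pointers?
```

A 32 bit pointer reaches exactly 4 GiB, so an 8 GB machine cannot be addressed flat by a 32 bit process. The 64 bit figure is an upper bound, since real processors implement fewer address bits than the full pointer width.
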

With the current prices of 1 GB DDR(-II) sticks, it is very easy and inexpensive to build a database server with 8 GB of RAM. Even 16 GB (16x1 GB) is not that expensive, considering the price of a quad Opteron server. As a seasoned sys-admin told me, "the performance of database servers can be brought back to life with some extra RAM." In many cases, a large amount of RAM can do more than very expensive 15,000RPM SCSI disks.

Again, this article is not about the typical huge central databases of banks that need to handle a large number of transactions, with write operations being very frequent.

We test on SUSE SLES 9 (SUSE Enterprise Edition) SP1, Linux kernel 2.6.5-151smp. Yes, this is not the latest kernel version, which is 2.6.12 at the time of this article. We used 2.6.5 because it is the latest kernel available for our enterprise version of SUSE. The very nature of this project also forces us to check our numbers with at least 5 consecutive tests, and a lot of time is spent in checking parameters and so on, so we need to "freeze" the kernel version for a few weeks. We did perform a few tests on Gentoo, however, with kernel version 2.6.12.
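
Checking numbers over consecutive runs can be sketched as follows. This is an assumption about how such a check might look, not the procedure used for this article: the function name `summarize_runs`, the `max_rel_spread` tolerance and the example values are all hypothetical, and the queries per second figures are invented for the illustration. `min_runs` defaults to 5 to mirror the "at least 5 consecutive tests" mentioned above.

```python
from statistics import mean


def summarize_runs(qps_runs, min_runs=5, max_rel_spread=0.05):
    """Summarize repeated benchmark runs (queries per second).

    Returns the mean, the relative spread (max - min) / mean, and whether
    the spread stays within the hypothetical max_rel_spread tolerance.
    """
    if len(qps_runs) < min_runs:
        raise ValueError(f"need at least {min_runs} runs, got {len(qps_runs)}")
    average = mean(qps_runs)
    rel_spread = (max(qps_runs) - min(qps_runs)) / average
    return {
        "mean": average,
        "rel_spread": rel_spread,
        "stable": rel_spread <= max_rel_spread,
    }


if __name__ == "__main__":
    # Invented example values, not results from this article.
    runs = [1000, 1010, 990, 1005, 995]
    print(summarize_runs(runs))
```

A result that is not marked stable suggests that something other than the CPU (a background process, disk activity, the client) interfered with at least one run, so the series is worth repeating before the number is reported.
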

The current market situation

45 Comments


  • JohanAnandtech - Saturday, June 18, 2005

    Mino, thanks for pointing that out. Query cache enabling has nothing to do with "stressful". It has to do with accelerating a few queries that are run over and over again. Which is very interesting for reducing the response time of a website serving up the last article, but which is not limited by CPU power at all.
  • JohanAnandtech - Saturday, June 18, 2005

    To the people who make a fuss about disabling the query cache: this has nothing to do with the Opteron not performing well in that situation. Single Xeon: 980 queries/s. Dual Xeon: 985 queries/s. Opteron 250: 1020 queries/s. Get it now why I say "other bottlenecks started to kick in"?

    It is impossible that a dual Xeon can't outperform a single one in these tests. We tried to find the bottleneck and even used a quad Opteron 850 as client. The client was not the problem. My bet is on the network latency, but I have no knowledge of tools to profile the complete machine. The disk was not the problem, we tested that. Network bandwidth neither. My bet is on the network latency, or even the OS, as the bottleneck kicked in a lot sooner with kernel 2.4.
  • mino - Friday, June 17, 2005

    #32 try to think for a moment
    "Because the Opteron can't perform that well in stressful situations you won't post the scores?"

    If the CPU is not the bottleneck in the query cache scenario then why test the effect of CPU at all !!!

    You remind me of a friend of mine who "tested" the effect the "FSB" has on an A64 system NOT having an FSB at all !!! ;-)
    Funny guy indeed.

    And about an intel compiler not being used.
    Like it or not, It IS a fact that it is not widely adopted, especially among the target audience of this site and article.

    BTW given the past experience intel compiler would produce better code even on AMD systems so don't be so sure! Best code for K7 is made by intelcc set to PIII config. Albeit it does not use 3DNow! functionality at all.
  • ElMoIsEviL - Friday, June 17, 2005

    I think I have to agree with #20, as much as I am un-biased I feel this test was doctored by AMD... it resembles the tests we see released by Apple often...

    "We didn't use the Intel compiler version as we have reason to believe that this version is not used a lot in the real world. We might try it out in a future article."

    Translation, "with the intel compiler AMD lost so being a marketing force for AMD we opted not to post those scores".

    and also as was mentioned before...
    "The "query cache" was off, as we wanted to test worst case performance. In some cases, the query cache was able to push a single Xeon to 1000 queries per second, and the CPU was still capable of doing more, as the CPU load was at 50% - 70%."

    Why not?
    Because the Opteron can't perform that well in stressful situations you won't post the scores?

    Seriously.. this test is the biggest load of BS I have ever read... and I'm a current AMD adopter.
  • JohanAnandtech - Friday, June 17, 2005

    Viditor, it is possible that the IOMMU might have something to do with it.

    The IOMMU is a memory mapping unit sitting between the I/O bus and physical memory.

    Memory mapping is AFAIK only necessary if a certain device (PCI devices come to mind) can not do a 64 bit DMA. Now it seems that almost everything inside the newest Intel southbridges can do 64 bit DMA.

    So the IOMMU can only play a role when the driver is 32 bit only, and the memory mapping has to happen. Now I would think that Intel would have an advantage here with their ultra modern southbridges. There might be a device that I am overlooking of course. Maybe our SCSI controller... But I don't think so.
  • Viditor - Friday, June 17, 2005

    Johan, if you're still reading (great article BTW)...
    A question I have had for quite a while now is what effect the IOMMU has on these tests.
    The reasons I'm asking are
    1. I noticed that there was quite a disparity between the AMD and Intel 64bit performance (which you mentioned).
    2. I know that one difference between the 2 platforms is that AMD has a hardware IOMMU (of sorts) and Intel (at present) does not.
    3. I saw a thread last year with Linus T mentioning this quite a bit. He seemed to think that this would impair the EM64T substantially...

    Your thoughts?
  • JohanAnandtech - Friday, June 17, 2005

    If your database is running many "identical databases".... I meant "queries"
  • JohanAnandtech - Friday, June 17, 2005

    Juhl: It was 2.6.12rc5.

    Viditor: thanks for the helpful comment. Indeed, if you turn on the query cache, your CPU is doing very little.
    Everybody else: note the "identical" word in viditor's quote. If your database is running many identical databases, then you are not going to spend time reading this kind of article: you simply buy the cheapest decent server. Any CPU today can run 1000s of queries if everything comes out of the query cache.

    Running benchmarks with the query cache on is simply not interesting. The query cache is all about accelerating the IDENTICAL queries that are run from time to time. You might reserve a bit of RAM to make sure that the most common queries (getting the latest article of a website for example) are run faster.

    But those numbers don't tell you anything about the load that your server is going to be able to take. You want worst case performance numbers!
  • Viditor - Friday, June 17, 2005

    Questar - the reason the query cache was turned off (guessing here) is to more reasonably simulate a real-world test. Obviously in this test, the same queries are repeated quite often. But that is not usually the case in the real world...
    For those who don't know what the heck a "query cache" is:

    "the query cache stores the text of a SELECT query together with the corresponding result that was sent to the client. If the identical query is received later, the server retrieves the results from the query cache rather than parsing and executing the query again"
  • Questar - Friday, June 17, 2005

    #23,

    We don't know, it specifically says Xeon. We don't have any idea what happens on an Opteron.
