Benchmarks IBM DB2 8.1.3: Intel versus AMD



The first question that most people will ask is, of course, how the best AMD Opteron compares to the newest Intel Xeon "Nocona" CPU. Below is a quick table to refresh your memory and to enable you to compare price/performance:

Intel Xeon CPUs | Core | L2 cache | L3 cache | x86-64 | In Test | Price
3.60 GHz w/ 1M cache 800 MHz FSB (90nm) | Nocona = "Prescott server" | 1 MB | No | Yes | Yes | $851
3.40 GHz w/ 1M cache 800 MHz FSB (90nm) | Nocona = "Prescott server" | 1 MB | No | Yes | No | $690
3.20D GHz w/ 1M cache 800 MHz FSB (90nm) | Nocona = "Prescott server" | 1 MB | No | Yes | No | $455
3 GHz w/ 1M cache 800 MHz FSB (90nm) | Nocona = "Prescott server" | 1 MB | No | Yes | No | $316
3.20C GHz w/ 2M cache 533 MHz FSB (.13) | Gallatin = "P4 EE Server" | 0.5 MB | 2 MB | No | Yes | $1,043
3.20 GHz w/ 1M cache 533 MHz FSB (.13) | Gallatin = "P4 EE Server" | 0.5 MB | 1 MB | No | No | $690
3.06A GHz w/ 1M cache 533 MHz FSB (.13) | Gallatin = "P4 EE Server" | 0.5 MB | 1 MB | No | Yes | $455
3.06 GHz w/ 512k cache 533 MHz FSB (.13) | Prestonia = "Northwood Server" | 0.5 MB | No | No | Yes | $316

AMD Opteron CPUs | Core | L2 cache | L3 cache | x86-64 | In Test | Price
Model 250 (2.4 GHz) | Sledgehammer | 1 MB | No | Yes | Yes | $851
Model 248 (2.2 GHz) | Sledgehammer | 1 MB | No | Yes | Yes | $690
Model 246 (2.0 GHz) | Sledgehammer | 1 MB | No | Yes | No | $455
Model 244 (1.8 GHz) | Sledgehammer | 1 MB | No | Yes | No | $316

We were also very curious about the Xeon Nocona, as it brings higher clock speeds, a bigger L2-cache, no L3-cache and a pipeline 11 stages longer than the previous Xeon "Prestonia" and Xeon "Gallatin", which maxed out at 3.2 GHz. The first two features should boost performance considerably, while the last two are disadvantages.

We should emphasize that, because we tested with SUSE SLES 8 (kernel 2.4.21), the Xeon Nocona was at a disadvantage: we could not test it in 64-bit mode. We assure you that we will update this report with results on a 2.6 kernel. For now, we decided to give you a full report on SLES 8 and kernel 2.4. (All numbers are expressed in queries per second.)

Concurrency | Xeon 3.6 GHz | Dual Xeon 3.2 L3 (2MB) | Dual Xeon 3.2 | Dual Xeon 3.06 L3 (1MB) | Dual Xeon 3.06 | Opteron 250 DDR400 32 bit | Dual Opteron 250 DDR400 64 bit | Dual Opteron 248 DDR400 64 bit
1 | 55 | 46 | 44 | 43 | 42 | 57 | 61 | 57
2 | 87 | 74 | 61 | 72 | 61 | 105 | 118 | 107
5 | 128 | 104 | 100 | 98 | 98 | 123 | 137 | 129
10 | 136 | 112 | 107 | 105 | 102 | 129 | 145 | 132
20 | 136 | 113 | 106 | 106 | 104 | 131 | 147 | 132
35 | 138 | 113 | 106 | 104 | 99 | 133 | 150 | 129
50 | 138 | 110 | 106 | 102 | 100 | 130 | 145 | 128

The concurrency tests below 5 are not reliable enough to support any firm conclusion, especially for the Xeon. The margin of error is somewhat higher, but that is not all.

As the Dual Xeon with Hyperthreading exposes 4 logical CPUs, it is possible that, with a concurrency of 2, only one physical CPU is doing all the work. Looking at the numbers and the Linux tool top, we feel pretty sure that this is exactly what happens most of the time. Compare row "5" with "2", and "2" with "1", to see what we mean. Note that the results of rows 10 to 50 do not vary a lot, so we base our conclusions on these numbers. In the table below, you can see an overview of how the different CPUs compare in percentages.

Concurrency | 3.6 vs 3.2 | 2 MB L3-cache vs none | 1 MB L3-cache vs none | Xeon 3.2 vs 3.06 | Xeon 3.2 vs 3.06 (both with L3) | Xeon 3.6 vs Opteron 250 | Opteron 64 bit vs 32 bit
1 | 20% | 3% | 1% | 7% | 7% | -4% | 6%
2 | 17% | 22% | 18% | 3% | 3% | -17% | 12%
5 | 24% | 4% | 1% | 5% | 5% | 5% | 12%
10 | 21% | 5% | 3% | 6% | 6% | 6% | 13%
20 | 21% | 6% | 2% | 6% | 6% | 3% | 12%
35 | 22% | 7% | 5% | 8% | 8% | 3% | 12%
50 | 26% | 4% | 2% | 8% | 8% | 7% | 12%
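
The percentages above can be approximated from the throughput table. The following is only a minimal sketch, not the method used for this report; the dictionary keys and the helper name `pct_gain` are made up for illustration. Because it starts from the rounded queries-per-second figures, its output can differ by a point or so from the published percentages, which were presumably derived from unrounded measurements.

```python
# Queries per second at each concurrency level, copied from the results table.
CONCURRENCY = [1, 2, 5, 10, 20, 35, 50]
QPS = {
    "xeon36":        [55, 87, 128, 136, 136, 138, 138],
    "xeon32_l3":     [46, 74, 104, 112, 113, 113, 110],
    "xeon32":        [44, 61, 100, 107, 106, 106, 106],
    "xeon306_l3":    [43, 72, 98, 105, 106, 104, 102],
    "xeon306":       [42, 61, 98, 102, 104, 99, 100],
    "opteron250_32": [57, 105, 123, 129, 131, 133, 130],
    "opteron250_64": [61, 118, 137, 145, 147, 150, 145],
    "opteron248_64": [57, 107, 129, 132, 132, 129, 128],
}


def pct_gain(a, b):
    """Whole-percent advantage of system a over system b, per concurrency level."""
    return [round((x / y - 1) * 100) for x, y in zip(QPS[a], QPS[b])]


nocona_vs_gallatin = pct_gain("xeon36", "xeon32_l3")
opteron_64_vs_32 = pct_gain("opteron250_64", "opteron250_32")

for c, n, o in zip(CONCURRENCY, nocona_vs_gallatin, opteron_64_vs_32):
    print(f"concurrency {c:>2}: Xeon 3.6 vs 3.2 L3 {n:+d}%, Opteron 64 vs 32 bit {o:+d}%")
```

Swapping in other key pairs, for example `pct_gain("xeon32_l3", "xeon32")`, gives a rough version of the remaining columns.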

If we had published a similar report back in August, the Opteron would have enjoyed a landslide victory. Luckily for Intel, Nocona is very competitive and is about 5% faster than the Opteron 250 running in 32-bit mode.
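
For a rough price/performance view, the list prices from the CPU table can be combined with the throughput at a concurrency of 20. This is a back-of-the-envelope sketch under stated assumptions: it counts two CPUs per machine, ignores every other system cost (board, memory, disks), and uses the 64-bit results for the Opterons. The function name `qps_per_1000_usd` is hypothetical.

```python
# (list price per CPU in USD, queries/s at concurrency 20), taken from the two tables.
SYSTEMS = {
    "Dual Xeon 3.6 (Nocona)":   (851, 136),
    "Dual Xeon 3.2, 2 MB L3":   (1043, 113),
    "Dual Opteron 250, 64 bit": (851, 147),
    "Dual Opteron 248, 64 bit": (690, 132),
}
CPUS_PER_SYSTEM = 2  # assumption: every machine is counted as a dual-CPU box


def qps_per_1000_usd(price_per_cpu, qps):
    """Queries per second per 1000 USD of CPU list price (CPU cost only)."""
    return qps / (price_per_cpu * CPUS_PER_SYSTEM) * 1000


# Sort the systems from best to worst value.
ranking = sorted(SYSTEMS, key=lambda name: qps_per_1000_usd(*SYSTEMS[name]), reverse=True)
for name in ranking:
    print(f"{name:<26} {qps_per_1000_usd(*SYSTEMS[name]):6.1f} q/s per $1000")
```

With these inputs, the cheaper Opteron 248 comes out ahead on value and the 3.2 GHz Xeon with 2 MB L3-cache comes last; a full system price would of course narrow the gaps.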

The gigantic - for x86 - L3-cache cannot help the Xeon much. We measured only a 2% to 5% performance boost from the 1 MB L3-cache (at 3.06 GHz), and a 4% to 7% performance boost from the 2 MB L3-cache (at 3.2 GHz). The L3-cache seems to boost performance about as much as a 5% to 6% clock speed increase - nothing to write home about. So a Xeon "Gallatin" 3.2 GHz with 2 MB L3-cache performs more or less like a Xeon "Gallatin" 3.4 GHz, if such a beast existed.
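
The "3.4 GHz" equivalence can be checked with simple arithmetic. The sketch below assumes that performance scales linearly with clock speed, which is an idealization; the names are illustrative only.

```python
BASE_CLOCK_GHZ = 3.2          # Xeon "Gallatin" with 2 MB L3-cache
L3_GAIN_RANGE = (0.04, 0.07)  # 4% to 7% measured boost from the 2 MB L3-cache


def equivalent_clock(base_ghz, gain):
    """Clock a CPU without L3-cache would need to match the gain, assuming linear scaling."""
    return base_ghz * (1 + gain)


low, high = (equivalent_clock(BASE_CLOCK_GHZ, g) for g in L3_GAIN_RANGE)
print(f"Equivalent clock without L3-cache: {low:.2f} to {high:.2f} GHz")
```

The resulting range brackets 3.4 GHz, which is where the comparison in the text comes from.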

A comparison between the 3.2 GHz and 3.06 GHz Xeons shows that CPU clock scaling - given equal cache sizes - is almost perfect, a testimony to how CPU intensive this benchmark is. Clearly, the generalisation that "databases are all about I/O" is not accurate for a number of database applications. Read-heavy databases seem to be "all about the CPU".

Using a 64 bit database (DB2 8.1.3) on a 64 bit operating system delivers about 12% to 13% better performance. Since we didn't use more than 2 GB of memory, the most likely explanation is that the software can make use of 16 general-purpose registers instead of 8. We also tested with a database twice as large and 4 GB of RAM, and the results were very similar.

The performance of the Nocona Xeon compared to the older Xeons is also remarkable. The database doesn't mind the longer pipeline and the absence of the L3-cache. On the contrary, Nocona performs better than its clock speed indicates, leaving the older 3.2 GHz Xeon (with 2 MB L3 cache!) behind by 21% to 22%, while it has only a 13% clock speed advantage over the latter. To be honest, we expected Nocona, with the huge branch misprediction penalty that results from its extremely long pipeline, to scale much worse.
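
To make "better than its clock speed indicates" concrete, the clock gain can be set against the throughput gain. This is a small sketch using the concurrency 10 figures from the results table; the helper name `scaling_ratio` is hypothetical, and a ratio of 1.0 would mean perfectly linear scaling.

```python
def scaling_ratio(clock_new, clock_old, qps_new, qps_old):
    """Return (clock gain, performance gain, performance gain / clock gain)."""
    clock_gain = clock_new / clock_old - 1
    perf_gain = qps_new / qps_old - 1
    return clock_gain, perf_gain, perf_gain / clock_gain


# Nocona 3.6 GHz vs 3.2 GHz Xeon with 2 MB L3-cache, queries/s at concurrency 10.
clock_gain, perf_gain, ratio = scaling_ratio(3.6, 3.2, 136, 112)
print(f"clock +{clock_gain:.1%}, performance +{perf_gain:.1%}, ratio {ratio:.2f}")
```

A ratio above 1 means the gain cannot come from clock speed alone; the larger L2-cache and the faster front side bus are plausible contributors, although this benchmark does not isolate them.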

46 Comments

  • blackbrrd - Friday, December 3, 2004 - link

    I think that it is Quad-channel, as the board is Numa aware..
  • Olaf van der Spek - Friday, December 3, 2004 - link

    > The result is that the Lindenhurst board can offer 4 DIMMs per channel while the other Xeon servers with DDR-I were limited to 4 DIMMs in total, or one per memory channel.

    Is that chipset quad-channel?
  • Olaf van der Spek - Friday, December 3, 2004 - link

    > It is especially impressive if you consider the fact that the load on the address lines of DDR makes it very hard to use more than 4 DIMMs per memory channel. Most Xeon and Opteron systems with DDR-I are limited to 4 DIMMs per memory channel

    Isn't the Opteron limited to 3 or 4 DIMMs per channel too?
    After all, it's 6 to 8 DIMMs per CPU and each CPU is dual-channel.
  • prd00 - Thursday, December 2, 2004 - link

    I am waiting for 64 bit Nocona vs 64 bit Opteron. Also, I think SLES9 would be interesting.
  • mczak - Thursday, December 2, 2004 - link

    #16 ok didn't know 2.4.21 already supported NUMA. SuSE lists it as a new feature in SLES9.
    I agree it probably really makes not much of a difference with a 2-cpu box, but I think there should be quite an advantage with a 4-cpu box. The HT links are speedy, but I would guess you would end up using basically only one ram channel for all ram accesses way too often, bumping into bandwidth limitations.
  • JohanAnandtech - Thursday, December 2, 2004 - link

    Lindy, you are probably right, I probably got carried away a little too much. However, you seem to swing the other way a little too far. For example, a PeopleSoft server is essentially a database server (or are you talking about the application server, working in 3 tiers?)

    A webserver is in many cases a database server too. I would even doubt an Exchange server is not related, but I never worked with that hard-to-configure, stubborn application. Many of those turnkey and homegrown apps are probably apps on top of a database server too...

    And I think it is clear we are not talking about fileservers. I agree fully that fileservers are all about I/O but I don't agree about database servers.

    To sum it up: yes, you are right, it is not the lion's share in quantities. However, it is probably still the biggest part when we look at costs, because I can probably buy 5 fileservers for one database server. Why even use fileservers when you have NAS?

  • dragonballgtz - Thursday, December 2, 2004 - link

    cliff notes :P
  • lindy - Thursday, December 2, 2004 - link

    This statement……

    Up to $46 billion is spent in the Servers (hardware) market, and while a small portion of those servers is used for other things than running relational databases (about 20% for HPC applications), the lion's share of those servers are bought to ensure that a DB2, Oracle, MS SQL server or MySQL database can perform its SQL duties well.

    ……..Is so far off base, it's almost funny.

    I would reverse that statement, as in a small portion of servers are database servers in most companies. I manage an IT department that takes care of about 160 servers for a company. A good mix of mostly 2/3 Windows servers and 1/3 UNIX/Linux. System administration/engineering is my trade.

    When I look at our servers I see DNS, DHCP, WINS, Domain Controllers, Exchange, SMTP, Blackberry, Proxy, File, Print, WEB, Backup, turnkey application, and Database servers. Maybe 20 of the approximately 160 servers are database servers. Of that 2, (8 CPU Sun 1280’s clustered running Sybase) are the busiest, containing our customer database of over 200,000 customers. Even at that, those servers are rarely over 50% CPU utilization.

    The other 18 database servers run a variety of databases (none DB2): Oracle, MySQL, and Microsoft SQL. The databases serve up data for all kinds of applications, like Microsoft SMS2003, Crystal Reports, ID badge security application, PeopleSoft, Remedy, all kinds of turnkey applications based around our industry, homegrown apps and the list goes on. There are times when some of these servers are really busy CPU wise, about 5% of the time, and usually at night doing data uploads or re-indexes.

    My point is most servers waste CPU power. Sure you can find applications and uses for servers that eat CPU all day long…..but that is the minority of the 46 billion spent on servers…..tiny minority. For most servers network I/O and especially disk I/O are way more critical. Database servers setup with the wrong disk configurations have their CPU’s sitting around doing not much. Servers like File, print, DHCP, DNS, SMTP, some in every company…..can get away with single CPU’s. Heck our print servers are running on Dell 1650’s with 1.4ghz P3-CPU’s that are coasting, but the disks are spinning all the time, and the network cards are busy, busy.

    When you realize these things, Xeon CPUs vs Opteron does not really matter 99% of the time, cost does. When you have a company like Dell that has sold its soul to Intel for low prices, that they turn around and offer to people like me……I don’t even consider what CPU is in the box most of the time.
  • JohanAnandtech - Thursday, December 2, 2004 - link

    about MySQL:
    I don't think you can find a way to make the Xeon go faster than the Opteron.

    But I do agree that performance depends on the kind of application, the size of the database etc.

    "A database that fits entirely inside of RAM isn't very interesting"

    Well, I can understand that. But

    1) do realize that for really performance critical (read applications) applications you are doomed if information has to come from your harddisks, no matter how fast RAID 50 is. Caching is the key to a speedy database application

    2) The information that is being requested 99% of the time (in most applications) is relatively small compared to the total amount of data. So a test with a 1 GB database can be representative of a database that is in total 30 GB or something. Just look at Anandtech: how many of you are browsing the forum of 3 months ago? How interesting is it for AT to optimise for those few that do?

    3) I think we made it very clear that our focus was not on the huge OLTP databases but the ones behind other applications


  • Slack3r78 - Thursday, December 2, 2004 - link

    I'd agree that using SuSE 8 was a poor choice. I like the "not using the latest and greatest" theme for servers as that's a reality in the field, but SuSE 8 was released essentially alongside the first Opterons. The move to a 2.6 kernel and the time for developers to really play with the new architecture could mean even bigger performance numbers.

    Given that Nocona, or public knowledge of an Intel x86-64 chip at all, didn't exist when SuSE 8 was released, I'm not surprised that it wouldn't run in 64 bit mode. EM64T has proven to be rather quirky and less than perfect, from the reports I've read, anyway. See here:
    http://www.theinquirer.net/?article=16879

    Another test running a distribution that was more recently released would definitely be interesting, if possible.
