The Xserve Server Platform

The most surprising results of the previous article were, of course, the MySQL and Apache server benchmarks. A powerful Windows XP based client (see above: "Client Configuration: Dual Opteron 250") fires off a large number of read-intensive queries (selects, grouping and ordering) and simulates 1 to 50 concurrent clients. All that query data is sent over a direct Gigabit Ethernet link to the tested server; in this case, a PowerMac Dual G5 2.5 GHz running OS X Server (Tiger). In Part I, we discovered that the performance of the Apple machine completely collapsed once there were more than 2 concurrent clients.
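As a rough illustration of what "simulating 1 to 50 concurrent clients" means, here is a minimal Python sketch. It is not the actual benchmark client: `run_query` is a hypothetical stand-in that merely sleeps instead of sending a query, and the concurrency levels and per-client query count are assumptions chosen for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query():
    """Hypothetical stand-in for one read-intensive query (just sleeps 1 ms)."""
    time.sleep(0.001)


def client(n_queries):
    """One simulated client: issue n_queries back to back, return the count."""
    for _ in range(n_queries):
        run_query()
    return n_queries


def queries_per_second(concurrency, queries_per_client=50):
    """Run `concurrency` simulated clients in parallel and report throughput."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        completed = sum(pool.map(client, [queries_per_client] * concurrency))
    elapsed = time.perf_counter() - start
    return completed / elapsed


if __name__ == "__main__":
    # Assumed concurrency levels within the 1 to 50 range mentioned in the text.
    for level in (1, 2, 5, 10, 20, 50):
        print(f"{level:2d} clients: {queries_per_second(level):.0f} queries/s")
```

With a real database behind `run_query`, a healthy server would show throughput rising with the number of clients until the CPUs are saturated; a collapse after 2 clients is what made the Part I results stand out.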

The solution? Install a Linux distribution to verify our suspicion that the OS is to blame. We chose Yellow Dog Linux (YDL). Terra Soft, the company behind Yellow Dog, is an Apple Authorized OEM Value Added Reseller, so you could say that Apple has no objection to installing YDL on your Apple machines. There is more: Terra Soft specializes in optimizing for the G5 processor. The version that we used, Yellow Dog Linux 4.0.1, is based on Linux kernel version 2.6.10-1.ydl.1g5-smp.

Let us see how the Dual 2.5 GHz G5 performed in MySQL when running Yellow Dog Linux. Please note: YDL 4.0 wouldn't run on the 2.7 GHz Apple machine, so we do not have results for that platform.

The difference between the PowerMac running Linux and Mac OS X Server is striking. Mac OS X Server shows better performance going from one to a second connection (and thus thread) because the second CPU steps in and helps carry the load. After that, however, performance completely collapses and stabilizes at around 50 queries per second.

While the G5 is not the best integer processing unit out there, it is not the one to blame for the poor performance that we experienced in our first tests. Running Yellow Dog Linux, the Dual G5 was capable of performing similarly to a 3 GHz Xeon. Notice that more concurrent connections give better performance from 1 to 20. At 5 concurrent simulated users, YDL simply wipes the floor with Mac OS X: 411 versus 113 queries per second. It gets worse at 10 concurrent users: 443 queries per second on Linux versus 62 on Mac OS X. Beyond about 20 connections, performance declines only very slowly, just as it does on all the x86/Linux machines.
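To put those numbers in perspective, the short Python sketch below computes how many times faster YDL is than Mac OS X at the two concurrency levels quoted above. Only the four queries-per-second figures come from our results; the dictionary layout and the `speedup` helper are just illustrative names.

```python
# Queries per second quoted in the text (Dual G5 2.5 GHz, MySQL).
results = {
    5: {"ydl": 411, "osx": 113},
    10: {"ydl": 443, "osx": 62},
}


def speedup(ydl_qps, osx_qps):
    """Return how many times faster YDL is than Mac OS X."""
    return ydl_qps / osx_qps


for clients, qps in results.items():
    ratio = speedup(qps["ydl"], qps["osx"])
    print(f"{clients} concurrent clients: YDL is {ratio:.1f}x as fast as OS X")
```

In other words, the gap grows from roughly 3.6x at 5 clients to roughly 7.1x at 10 clients.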

With the MySQL performance woes now clearly caused by OS X, let us see if Apache tells us the same story. We tested with Apachebench, with "n" being the total number of requests and "c" the number of concurrent connections:
ab -n 100000 -c 100 http://localhost/
Some people suggested that we should test with both Apache 1.3 and 2.0, so we gave Apache 2.0 a test run.

Unit: Requests per second

           | PowerMac Dual G5 2.5 GHz, OS X | PowerMac Dual G5 2.5 GHz, YDL | Dual Xeon 3.6 GHz
Apache 1.3 | 250                            | 709                           | 1291
Apache 2.0 | 266                            | 2165                          | 3410

On OS X, we noticed that Activity Monitor was telling us that the CPUs were not working very hard and were underutilized. This seems to indicate that the problem with Apache is somewhat different from MySQL, as MySQL showed a CPU load between 165% and 190%. (200% is the maximum, as there are 2 CPUs in the system.)

Apple told us that the problem lies in Apachebench (the client side), which stalls from time to time and thus generates too low a load on the (Apache) server. The weird thing is that this does not happen with a smaller number of connections (up to 10,000). When we repeated the test, Apachebench on Mac OS X got into trouble again. Version 2.0 is slightly faster on OS X, but it still trails by a significant margin. On the other hand, YDL and the Xeon platform are roughly 2.6 to 3 times as fast with version 2.0.
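The gain from Apache 1.3 to 2.0 on each platform can be checked with a few lines of Python. The requests-per-second figures are taken from the table above; the platform labels and the `gain` helper are only illustrative.

```python
# Requests per second from the Apache table (ab -n 100000 -c 100).
apache = {
    "PowerMac Dual G5 2.5 GHz, OS X": {"1.3": 250, "2.0": 266},
    "PowerMac Dual G5 2.5 GHz, YDL": {"1.3": 709, "2.0": 2165},
    "Dual Xeon 3.6 GHz": {"1.3": 1291, "2.0": 3410},
}


def gain(rps):
    """Return the Apache 2.0 throughput as a multiple of the Apache 1.3 throughput."""
    return rps["2.0"] / rps["1.3"]


for platform, rps in apache.items():
    print(f"{platform}: Apache 2.0 is {gain(rps):.2f}x Apache 1.3")
```

OS X gains only about 6% from the newer version, while YDL and the Xeon gain roughly 3.05x and 2.64x respectively.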

According to Apple, this is a bug in Apachebench. Now, we can accept that explanation, as it is clear that the server is not loaded and can still accept a lot more web requests. However, the Apachebench problem is still interesting. Why exactly does the client stall? Is it really a bug, or is it running out of some resource? We didn't delve deeper, as we are developing a less synthetic benchmark, closer to the real world, to test web servers.

Even if we ignore the Apache results, our MySQL tests - and the queries used in these tests - are based on a real world usage pattern of a real world database. The G5 is partially crippled by a chipset that takes a long time to access the memory, and it's not the fastest integer CPU; still, it performs like a 3 GHz Xeon on Linux. The problem clearly lies in Mac OS X, and is worth further investigation.

47 Comments


  • JohanAnandtech - Friday, September 2, 2005 - link

    Sorry, couldn't resist :-). (For the rest of the world, pannekoek is Dutch for pancake.)

    Desktop performance is ok, as desktop apps are similar to the workstation apps we tested in the first article. Those apps spend from 5-20% in the OS, while server apps spend up to 80% of their time in the OS!

    However, I should point out that we tested Mac OS X SERVER, so it is a problem for the Xserves.
  • Pannenkoek - Friday, September 2, 2005 - link

    I stand corrected then. However, my reasoning still applies, it's just that Apple relies even more on its brand than on technology to sell server systems apparently. Who runs Mac OS servers anyway, it's an oxymoron. ;-)

    P.S. Do not mock my nick, it served well in beating godlike UT bots, and should be honoured as much as Loque.
  • Tanclearas - Thursday, September 1, 2005 - link

    "Apple told us that the problem lies in the Apachebench (the client side), which stalls from time to time and thus, generates too low of a load on the (Apache) server."

    How does this explanation make any sense? Linux obviously doesn't have a problem with these "stalls".
  • JohanAnandtech - Friday, September 2, 2005 - link

    What follows is not what Apple said, but my interpretation...

    They are probably pointing out that the version for Mac OS X has a Mac OS X specific bug. Of course, who is to blame? I am sceptical like you.
  • mariush - Thursday, September 1, 2005 - link

    Page 4 :

    We used the following on the Opteron based PCs:

    Gcc -O2 -mcpu=G5 flops.c -o flops

    And, on the G5 machines, we used:

    Gcc -O2 -march=k8 flops.c -o flops

    I think it's the other way around.
  • Houdani - Thursday, September 1, 2005 - link

    Aye, was gonna point that out also.

    In addition, on page 3 should you list the Yellow Dog Linux along with OSX in the Software section for the Apple PowerMac G5?
  • Shinei - Thursday, September 1, 2005 - link

    My question is, would the memory latencies be so high for the 970FX if high-end RAM was used for the Linux tests (like, say, some TCCD or BH-5 at 2-2-2-5), instead of the standard 3-3-3-8 SPD that ships with the G5 system? Or is there some limitation to the G5 motherboard that prevents posting with performance RAM as a way for Apple to ensure that only certain, accepted DIMMs are used with their computers?
    Anyway, these results are very telling about what the OSX86 Macs are going to perform like--that is to say, ~25% slower than the equivalent Windows/Linux boxes running the same hardware...
  • IntelUser2000 - Sunday, September 4, 2005 - link

    quote:

    My question is, would the memory latencies be so high for the 970FX if high-end RAM was used for the Linux tests (like, say, some TCCD or BH-5 at 2-2-2-5), instead of the standard 3-3-3-8 SPD that ships with the G5 system? Or is there some limitation to the G5 motherboard that prevents posting with performance RAM as a way for Apple to ensure that only certain, accepted DIMMs are used with their computers?


    That doesn't matter since they are testing workstations; Irwindale and Opteron are also using CAS3 RAM. No workstations/servers use 2-2-2-5 RAM.


    The poor scores of OS X compared to Linux make sense. The G5 was rumored to be fast in SPEC CPU benchmarks but came out slower. It must be that the rumored systems were benched with Linux and the production systems were benched with OS X.

    I am impressed with OS X's features though.
  • Jedi2155 - Thursday, September 1, 2005 - link

    The G5 motherboard has these limitations as Apple's way to ensure you only buy certified RAM. The SPD settings must be perfect.
  • ceefka - Thursday, September 1, 2005 - link

    I am humbled by the sheer expertise of Johan. Amazing work, Johan!

    This makes me even more curious about Intel's contribution to the next generation of Macs. How will they compare to the best G5s?
