The G5 a.k.a. Power 970FX

You might not have noticed it, but there is in fact a lot of good news in this article for owners of current Apple systems. Gcc 4.0 promises a lot better (FP) performance in open source software. The improvement from gcc 4.0 over gcc 3.3.3 and 3.3 is amazing on the PowerFX family: almost a 70% improved FP performance!

Now that the open source community finally has a decent compiler for the Apple platform, Apple management decides to step over to another architecture. Ironically, right now, the Intel architecture needs a super-optimized compiler (Intel's own) to reach the FP performance that the G5 now reaches with a very popular but far less aggressive compiler (gcc).

Combined with the data from our first article, we can safely say that the G5 2.7 GHz FP performance is at least as good as the best x86 CPUs. Integer performance seems to be between 70% and 80% of the fastest x86 CPUs, while FP/SIMD performance can actually surpass x86 in certain situations.

With the dual-core Power 970MP available and IBM's current outstanding track record when it comes to multi-core CPUs, big question marks can be placed on whether or not the switch to Intel CPUs will - from a technical point of view - be such a big step forward as Steve Jobs claims. There is more: each core has 1 MB cache instead of the current 512 KB, which will improve integer performance quite a bit as it lessens the impact of the biggest problem of the G5 - the high latency access to the memory system.


Xserve, silently cooled. Below the G5 with the cover, you can see the heatsink.

It is again ironic that the Power 970MP is far more advanced than the current Intel Dual-cores when it comes to power management. Each core can be placed independently in a power-saving state called doze, while the other core continues operation.

A low power Power 970FX is also available and consumes about 16 Watts at 1.6 GHz; so it seems that IBM, although slightly late, could have provided everything that Apple needs. The G5 with its 58 million transistors and 66 mm² die size is not really a hot CPU. The Xserve (2 x 2.3 GHz G5) was by far the quietest 1U air-cooled server that ever entered our lab in Kortrijk.

The Usual Suspects

The Mac OS X kernel environment includes the Mach kernel, BSD, the I/O Kit, file systems, and networking components. Some of these components slow down MySQL significantly. While our rough profiling has not identified the true culprits, we think that we can narrow the possible suspects to:
  1. Relatively high TCP Latency that we measured
  2. The implementation of the threading system. Does the pthread to Mach threads mapping involve some overhead, or is this the result of the traditional performance problem of the micro kernel, namely the high latency of such a kernel on system calls? While Mac Os X is not a micro-kernel, the problem might still exist as the Mach core is deep inside that kernel. Is there IPC overhead? Lmbench signaling benchmarks seem to suggest that there is.
  3. The finer grained locking in the current version of Tiger does not appear to be working for some reason and we still have the "two lock" system of Panther.
Will this performance problem only be visible in MySQL? At this point, we can only speculate, but we have a strong suspicion that this is not the case. Server workloads spend, contrary to other workloads such as workstation apps, a substantial portion of their execution in the kernel and TCP stack. Porting such applications to Mac OS X is more complicated than just recompiling code. We didn't have to search long before we found examples of companies that increased their number of servers or upgraded in order to run MySQL faster.

We look forward to testing other database and server apps on the Mac OS X platform. Critical reports that point out weaknesses can only help the Apple community move forward and keep the Apple people on their toes.

References

[1] Threading on OS X
http://developer.apple.com/technotes/tn/tn2028.html

[2] Lmbench: Portable Tools for Performance Analysis
Larry McVoy, Silicon Graphics
Carl Staelin, Hewlett-Packard Laboratories

[3] Performance Characterization of a Quad Pentium Pro SMP Using OLTP
Kimberly Keeton*, David A. Patterson*, Yong Qiang He+, Roger C. Raphael+, and Walter E. Baker
Computer Science Division
University of California at Berkeley

Mac OS X Achilles Heel
Comments Locked

47 Comments

View All Comments

  • stmok - Thursday, September 1, 2005 - link

    LOL...As everyday passes, it seems more "interesting things" are revealed from Apple solutions.
  • ViRGE - Thursday, September 1, 2005 - link

    Granted, some of this was over my head(more than I'd like to admit to), but your results are none the less very interesting Johan. Now that we have the Linux/G5 numbers, there's no arguing that there's a weakness in MacOSX somewhere, which is a bit depressing as a Mac user, but still a very useful insight as to how there's obviously something very broken in some design aspect of the OS(it simply shouldn't be getting crushed like it is). My only question now is how Apple and its devs will respond to this - it is pretty damning after all.

    Thanks for finally getting some Linux/G5 numbers out to settle this.
  • sdf - Friday, September 2, 2005 - link

    By changing hardware platforms.

    No, seriously.

    A transition from PowerPC to Intel would be the perfect time to correct ABI flaws like this. It isn't that the G5 causes the slow down, it's that the slow down (maybe) can't really be fixed without breaking binary compatibility. A CPU transition is clearly going to do that anyway, so maybe they'll just wait...
  • toelovell - Thursday, September 1, 2005 - link

    I am kind of curious to see how Darwin would work on an x86 based system for these same tests. There are x86 binaries for Darwin 8. So it should be possible to run these tests and compare Darwin with Linux on an x86 platform. This would help to see if the OS really is the limitation. Just a thought.
  • JohanAnandtech - Thursday, September 1, 2005 - link

    If linux is capable of pushing the G5 8 times higer than with Mac OS X, there is little doubt on my mind that the OS is the problem. Or did I understand you wrong?

    Anyway, I have no experience whatsoever with Darwin. My first impression is that installing Darwin on x86 is probably a very masochistic experience, due to lack of proper drivers. We might get it working but can it really run MySQL and other apps? THere are probably libraries missing... Will the results be representative of anything as it is probably tuned for just getting it running instead of performance? Anyone with Darwin x86 experience?
  • wjcott - Thursday, September 1, 2005 - link

    The only interest I have in a mac OS is if they are going to sell it without a computer. I would love to have OS X, but I must build the machine.
  • Quanticles - Thursday, September 1, 2005 - link

    Every component must be fine tuned to the upmost degree... Every BIOS Setting... Every Hidden Register... *crazy eyes* =)

Log in

Don't have an account? Sign up now