Mac OS X versus Linux

Lmbench 2.04 provides a suite of microbenchmarks that measure bottlenecks at the Unix operating system and CPU level. This makes it well suited to testing the theory that Mac OS X might be the culprit behind the terrible server performance of the Apple platform.

Signals allow processes (and thus threads) to interrupt other processes. In a database system such as MySQL 4.x where so many processes/threads (60 in our MySQL screenshot) and many accesses to the kernel must be managed, signal handling is a critical performance factor.

Larry McVoy (SGI) and Carl Staelin (HP):
"Lmbench measures both signal installation and signal dispatching in two separate loops, within the context of one process. It measures signal handling by installing a signal handler and then repeatedly sending itself the signal."
Host            OS          MHz   null call  null I/O  stat  open clos  slct TCP  sig inst  sig hndl
Xeon 3.06 GHz   Linux 2.4   3056  0.42       0.63      4.47  5.58       18.2      0.68      2.33
G5 2.7 GHz      Darwin 8.1  2700  1.13       1.91      4.64  8.60       21.9      1.67      6.20
Xeon 3.6 GHz    Linux 2.6   3585  0.19       0.25      2.30  2.88       9.00      0.28      2.70
Opteron 850     Linux 2.6   2404  0.08       0.17      2.11  2.69       12.4      0.17      1.14

All numbers are expressed in microseconds, so lower is better. First of all, you can see that kernel 2.6 is in most cases a lot more efficient than kernel 2.4. Secondly, although this is not the most accurate benchmark, the message is clear: Darwin, the foundation of Mac OS X Server, handles signals the slowest. In some cases, Darwin is even several times slower.
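The two loops that the lmbench authors describe can be sketched roughly as follows. This is not lmbench's own code (lmbench is written in C); it is a minimal Python illustration, and the function name `measure_signal_costs`, the choice of `SIGUSR1`, and the iteration count are assumptions made for the example. Interpreter overhead dominates the timings here, so the absolute numbers are not comparable to the table above.

```python
import os
import signal
import time


def measure_signal_costs(iterations=10_000):
    """Return (install_us, dispatch_us): average cost per operation in microseconds.

    Hypothetical helper for illustration; it mirrors the two-loop structure
    described in the lmbench quote, not lmbench's actual implementation.
    """
    def handler(signum, frame):
        # Do nothing: we only want the cost of getting here.
        pass

    # Remember whatever handler was installed before so we can restore it.
    previous = signal.signal(signal.SIGUSR1, handler)
    try:
        # Loop 1: signal installation. Re-install the same handler repeatedly.
        start = time.perf_counter()
        for _ in range(iterations):
            signal.signal(signal.SIGUSR1, handler)
        install_us = (time.perf_counter() - start) / iterations * 1e6

        # Loop 2: signal dispatching. The process sends the signal to itself.
        pid = os.getpid()
        start = time.perf_counter()
        for _ in range(iterations):
            os.kill(pid, signal.SIGUSR1)
        dispatch_us = (time.perf_counter() - start) / iterations * 1e6
    finally:
        if previous is not None:
            signal.signal(signal.SIGUSR1, previous)

    return install_us, dispatch_us


if __name__ == "__main__":
    install_us, dispatch_us = measure_signal_costs()
    print(f"sig inst: {install_us:.2f} us, sig hndl: {dispatch_us:.2f} us")
```

Both loops run inside a single process, which is why the benchmark isolates kernel signal overhead rather than scheduling or inter-process effects.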

As we increase the level of concurrency in our database test, many threads must be created. Unix process/thread creation is called "forking" because a copy of the calling process is made.

lmbench "fork" measures simple process creation by creating a process and immediately exiting the child process. The parent process waits for the child process to exit. The benchmark is intended to measure the overhead for creating a new thread of control, so it includes the fork and the exit time.
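The fork measurement described above can be sketched as a loop in which the child exits immediately and the parent waits for it. This is an illustrative Python sketch under stated assumptions, not lmbench's C implementation; the name `measure_fork_us` and the default iteration count are hypothetical. It needs a Unix-like system, since `os.fork` is not available elsewhere.

```python
import os
import time


def measure_fork_us(iterations=200):
    """Average time in microseconds to fork a child, have it exit, and reap it.

    Hypothetical helper for illustration only.
    """
    start = time.perf_counter()
    for _ in range(iterations):
        pid = os.fork()
        if pid == 0:
            # Child: exit immediately, skipping Python cleanup handlers.
            os._exit(0)
        # Parent: wait for the child, so exit time is part of the measurement.
        os.waitpid(pid, 0)
    return (time.perf_counter() - start) / iterations * 1e6


if __name__ == "__main__":
    print(f"fork+exit: {measure_fork_us():.0f} us per process")
```

Because the parent blocks in `waitpid`, each iteration covers both the fork and the child's exit, matching the description of what the benchmark includes.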

lmbench "exec" measures the time to create a completely new process, while "sh" measures the time to start a new process and run a little program via /bin/sh (complicated new process creation).

Host            OS      MHz   fork proc  exec proc  sh proc
Xeon 3.06 GHz   Linux   3056  163        544        3021
G5 2.7 GHz      Darwin  2700  659        2308       4960
Xeon 3.6 GHz    Linux   3585  158        467        2688
Opteron 850     Linux   2404  125        471        2393

Mac OS X is incredibly slow at creating new threads, between 2 and 5(!) times slower, as it doesn't use kernel threads and has to go through extra layers (wrappers). No need to continue our search: the G5 might not be the fastest integer CPU on earth, but its database performance is completely crippled by an asthmatic operating system that needs up to 5 times more time to handle and create threads.
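As a quick check on the "2 to 5 times" figure, the slowdown of the G5/Darwin system relative to each Linux system can be computed directly from the process-creation numbers in the table. The sketch below is only arithmetic on those published values; the dictionary layout and the function name `darwin_slowdowns` are choices made for this example.

```python
# Process-creation times in microseconds, copied from the table:
# (fork, exec, sh) for each system.
PROC_TIMES_US = {
    "Xeon 3.06 GHz / Linux": (163, 544, 3021),
    "G5 2.7 GHz / Darwin": (659, 2308, 4960),
    "Xeon 3.6 GHz / Linux": (158, 467, 2688),
    "Opteron 850 / Linux": (125, 471, 2393),
}
BENCHMARKS = ("fork", "exec", "sh")


def darwin_slowdowns(times=PROC_TIMES_US, darwin="G5 2.7 GHz / Darwin"):
    """Return {(benchmark, other_host): darwin_time / other_time}."""
    ratios = {}
    for host, values in times.items():
        if host == darwin:
            continue
        for name, darwin_us, other_us in zip(BENCHMARKS, times[darwin], values):
            ratios[(name, host)] = darwin_us / other_us
    return ratios


if __name__ == "__main__":
    for (name, host), ratio in sorted(darwin_slowdowns().items()):
        print(f"{name:4s} vs {host:22s}: {ratio:.2f}x slower")
```

The ratios range from roughly 1.6x (sh against the Xeon 3.06 GHz) to roughly 5.3x (fork against the Opteron 850), with fork and exec all above 4x.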

Mac OS X: beautiful but… Workstation, yes; Server, no.

112 Comments

  • Viditor - Friday, June 03, 2005 - link

    IntelUser2000 - "about the AMD TDP number, they never state that its max power, they say its maximum power achievable under most circumstances, its not absolute max power"

    Not true at all...AMD's datasheet clearly states that it's not only max power, but max theoretical power.
    http://www.amd.com/us-en/assets/content_type/Downl...
  • trooper11 - Friday, June 03, 2005 - link

    I think it's hard enough comparing a G5 to PC systems. I don't believe there will ever be a 'fair' comparison that satisfies everyone on both sides. There are too few general programs to compare, and people will always complain about using or not using optimized apps for either platform. Many of the variables are subjective, and the benchmarks to be compared are so heavily debated without a clear answer.

    I think this was a good attempt, but I gave up trying to 'fairly' compare the two a long time ago. Anything that sheds a bit of light is a good thing, but I never expect an end to the controversy; too many questions that can't be answered.

    I would though love to see the addition of dual core AMD chips since they are out there and would be serious competition, of course it would fly in server applications. Hopefully the numbers for that could be added in a later article.
  • psychodad - Friday, June 03, 2005 - link

    Fascinating. You run these tests using a compiler that Apple does not use (unless it is Yellow Dog) against software generally optimized for x86 architectures and you make conclusions. This makes your data tainted (actually biased) and your conclusions faulty. I would suggest that in fairness you make your tests more "real world" by using the software compiled by compilers that the rest of us nontechnical people use on a daily basis.
  • smitty3268 - Friday, June 03, 2005 - link

    Rosyna:
    Oh, I assumed he was using the Apple version of gcc. If not, then I see what you mean.
  • crimsonson - Friday, June 03, 2005 - link

    This article may be moot by Monday

    http://tinyurl.com/7ex4v
  • Garyclaus16 - Friday, June 03, 2005 - link

    " Oh and the graph on page 5 doesnt display correctly in firefox. "

    AND you are using firefox for what reason?...you deserve to view pages incorrectly
  • Rosyna - Friday, June 03, 2005 - link

    smitty3268, that's part of the problem. Almost no one uses GCC 3.3.3 (stock, from the main gcc branch) for Mac OS X development because it really sucks at optimizing for the PPC. On the other hand, OS X was compiled with the Apple shipped GCC 3.3/GCC 4.0.
  • smitty3268 - Friday, June 03, 2005 - link

    I think its fair to use the compilers most people are going to be using. That would be gcc on both platforms. As far as autovectorization in 4.0, don't expect very much from it. Obviously it will be better than 3.3, but the real work is being added now in 4.1.

    I'll join the other 50 posters who would have liked to see at least 1 page showing the G5's performance under linux compared to OSX. That and maybe a few more real world benchmarks. But your article was very informative and answered a lot of questions. It was frustrating that there really wasn't anything done like this before.
  • Rosyna - Friday, June 03, 2005 - link

    Actually, for better or worse the GCC Apple includes is being used for most Mac OS X software. OS X itself was compiled with it.
  • elvisizer - Friday, June 03, 2005 - link

    rosyna's right.
    i'm just not sure if there IS anyway to do the kind of comparison you seem to've been shooting for (pure competition between the chips with as little else affecting the outcome as possible). you could use the 'special' compilers on each platform, but those aren't used for compiling most of the binaries you buy at compusa.
