Mac OS X: beautiful but...

The Mac OS X (Server) operating system can't be described easily. Apple:

"Mac OS X Server starts with Darwin, the same open source foundation used in Mac OS X, Apple's operating system for desktop and mobile computers. Darwin is built around the Mach 3.0 microkernel, which provides features critical to server operations, such as fine-grained multi-threading, symmetric multiprocessing (SMP), protected memory, a unified buffer cache (UBC), 64-bit kernel services and system notifications. Darwin also includes the latest innovations from the open source BSD community, particularly the FreeBSD development community."

While there are many very good ideas in Mac OS X, it reminds me a lot of fusion cooking, where you make a hotch-potch of very different ingredients. Let me explain.


Hexley the platypus, the Darwin mascot

Darwin is indeed the open Source project around the Mach kernel 3.0. This operating system is based around the idea of a microkernel, a kernel that only contains the essence of the operating system, such as protected memory, fine-grained multithreading and symmetric multiprocessing support. This in contrast to "monolithic" operating systems, which have all of the code in a single large kernel.

Everything else is located in smaller programs, servers, which communicate with each other via ports and an IPC (Inter Process Communication) system. Explaining this in detail is beyond the scope of this article (read more here). But in a nutshell, a Mach microkernel should be more elegant, easier to debug and better at keeping different processes from writing in eachother's protected memory areas than our typical "monolithic" operating systems such as Linux and Windows NT/XP/2000. The Mach microkernel was believed to be the future of all operating systems.

However, you must know that applications (in the userspace) need, of course, access to the services of the kernel. In Unix, this is done with a Syscall, and it results in two context switches (the CPU has to swap out one process for another): from the application to the kernel and back.

The relatively complicated memory management (especially if the server process runs in user mode instead of kernel) and IPC messaging makes a call to the Mach kernel a lot slower, up to 6 times slower than the monolithic ones!

It also must be remarked that, for example, Linux is not completely a monolithic OS. You can choose whether you like to incorporate a driver in the kernel (faster, but more complex) or in userspace (slower, but the kernel remains slimmer).

Now, while Mac OS X is based on Mach 3, it is still a monolithic OS. The Mach microkernel is fused into a traditional FreeBSD "system call" interface. In fact, Darwin is a complete FreeBSD 4.4 alike UNIX and thus monolithic kernel, derived from the original 4.4BSD-Lite2 Open Source distribution.

The current Mac OS X has evolved a bit and consists of a FreeBSD 5.0 kernel (with a Mach 3 multithreaded microkernel inside) with a proprietary, but superb graphical user interface (GUI) called Aqua.

Performance problems

As the mach kernel is hidden away deep in the FreeBSD kernel, Mach (kernel) threads are only available for kernel level programs, not applications such as MySQL. Applications can make use of a POSIX thread (a " pthread"), a wrapper around a Mach thread.


Mac OS X thread layering hierarchy (Courtesy: Apple)

This means that applications use slower user-level threads like in FreeBSD and not fast kernel threads like in Linux. It seems that FreeBSD 5.x has somewhat solved the performance problems that were typical for user-level threads, but we are not sure if Mac OS X has been able to take advantage of this.

In order to maintain binary compatibility, Apple might not have been able to implement some of the performance improvements found in the newer BSD kernels.

Another problem is the way threads could/can get access to the kernel. In the early versions of Mac OS X, only one thread could lock onto the kernel at once. This doesn't mean only one thread can run, but that only one thread could access the kernel at a given time. So, a rendering calculation (no kernel interaction) together with a network access (kernel access) could run well. But many threads demanding access to the memory or network subsystem would result in one thread getting access, and all others waiting.

This "kernel locked bottleneck" situation has improved in Tiger, but kernel locking is still very coarse. So, while there is a very fine grained multi-threading system (The Mach kernel) inside that monolithic kernel, it is not available to the outside world.

So, is Mac OS X the real reason why MySQL and Apache run so slow on the Mac Platform? Let us find out... with benchmarks, of course!

The G5 as Server CPU Mac OS X versus Linux
Comments Locked

116 Comments

View All Comments

  • michaelok - Saturday, June 4, 2005 - link

    "with one benchmark showing that the PowerMac is just a mediocre PC while another shows it off as a supercomputer, the unchallenged king of the personal computer world."

    Well, things are a little different when you connect, say, 32768 processors together, i.e. you go from running MySQL to Teradata, so yes, the Power architecture seems to dominate, and the Virginia Tech supercomputer is still up there, at 7th.

    http://www.top500.org/lists/plists.php?Y=2004&...

    " The RISC ISA, which is quite complex and can hardly be called "Reduced" (The R of RISC), provides 32 architectural registers"

    'Reduced Instruction Set' is misleading, it actually refers to a design philosophy of using *smaller, simpler* instructions, instead of a single complex instruction. This is to be compared with the Itanium for example, which Intel calls 'EPIC' (Explicit Parallel Instruction Computing), but it is essentially derived from VLIW (Very Long Instruction Word).

    Anyway, nice article, certainly much more to discuss here, such as SMT (Simultaneous Multithreading), (when that is available for the Apple :), vs. Intel's Hyperthreading. We'll still be comparing Apples to Oranges but isn't that why everybody buys the Motor Trend articles, i.e. '68 Mustang vs. '68 GT?


  • psychodad - Saturday, June 4, 2005 - link

    I agree. Recently I read a review which pitted macs against pcs using software blatantly optimized for macs. If you have ever used unoptimized software, you will know it. It is slow, often unstable and not at all usable, especially if you're after productivity.
  • Viditor - Friday, June 3, 2005 - link

    IntelUser2000 - "about the AMD TDP number, they never state that its max power, they say its maximum power achievable under most circumstances, its not absolute max power"

    Not true at all...AMD's datasheet clearly states that it's not only max power, but max theoretical power.
    http://www.amd.com/us-en/assets/content_type/Downl...
  • trooper11 - Friday, June 3, 2005 - link

    I think its hard enough comparing a G5 to PC systems. I dont belive there will ever be a 'fair' comparison that satisfies everyone on both sides. There are too few general programs to compare and people will always complain about using or not using optimized apps for either platform. many of the varibles are subjective and the benchmarks to be compared are so heavily debated without a clear answer.

    I think this was a good attempt, but I gave up trying to 'fairly' compare the two a long time ago. Anyhting that sheds a bit of light is a good thing, but i never expect an end to the contreversy, too many questions that cant be answered.

    I would though love to see the addition of dual core amd chips since they are out there and would be serious competition, of course it would fly in server applications. hopefully the numbers for that could be added in a later article.
  • psychodad - Friday, June 3, 2005 - link

    Fascinating. You run these tests using a compiler that Apple does not use (unless it is Yellow Dog) against software generally optimized for x86 architectures and you make conclusions. This makes your data tainted (actually biased) and your conclusions faulty. I would suggest that in fairness you make your tests more "real world" by using the software compiled by compilers that the rest of us nontechnical people use on a daily basis.
  • smitty3268 - Friday, June 3, 2005 - link

    Rosyna:
    Oh, I assumed he was using the Apple version of gcc. If not, then I see what you mean.
  • crimsonson - Friday, June 3, 2005 - link

    This article may be moot by Monday

    http://tinyurl.com/7ex4v
  • Garyclaus16 - Friday, June 3, 2005 - link

    " Oh and the graph on page 5 doesnt display correctly in firefox. "

    AND you are using firefox for what reason?...you deserve to view pages incorrectly
  • Rosyna - Friday, June 3, 2005 - link

    smitty3268, that's part of the problem. Almost no one uses GCC 3.3.3 (stock, from the main gcc branch) for Mac OS X development because it really sucks at optimizing for the PPC. On the other hand, OS X was compiled with the Apple shipped GCC 3.3/GCC 4.0.
  • smitty3268 - Friday, June 3, 2005 - link

    I think its fair to use the compilers most people are going to be using. That would be gcc on both platforms. As far as autovectorization in 4.0, don't expect very much from it. Obviously it will be better than 3.3, but the real work is being added now in 4.1.

    I'll join the other 50 posters who would have liked to see at least 1 page showing the G5's performance under linux compared to OSX. That and maybe a few more real world benchmarks. But your article was very informative and answered a lot of questions. It was frustrating that there really wasn't anything done like this before.

Log in

Don't have an account? Sign up now