Mac OS X Achilles Heel

It is clear that profiling MySQL on the kernel is the only way that we are going to be able to pin-point why exactly MySQL is so slow on Mac OS X. So, why did I state that I believe the threading engine in Mac OS X to be rather slow? Well, I admit that I should have made it more clear in the article that I didn't have rock-solid evidence. However, my suspicion is based on more than speculation.

First of all, notice that the Mac OS X performance is decent with a concurrency of one, or one simulated user. It still performs well when a second user is simulated, as the second CPU can kick in and push performance higher. Let us check the scaling, by putting the numbers of our MySQL graphic into a table.

Concurrency Dual G5 2,5 GHz Tiger Scaling (Concurrency one=100%) Dual G5 2,5 GHz Linux 2.6 Scaling (Concurrency one=100%) Dual Opteron 2.4Ghz Scaling (Concurrency one=100%)
1 192 100% 228 100% 290 100%
2 274 143% 349 153% 438 151%
5 113 59% 411 180% 543 187%
10 62 32% 443 194% 629 217%
20 50 26% 439 193% 670 231%
35 50 26% 422 185% 650 224%
50 47 25% 414 182% 669 230%

The performance at concurrency 1 and 2 is mediocre, but not really bad. Notice that the scaling of Mac OS X from one to two is not fantastic, but is almost as good as the Linux machines. Once we worked with 5 concurrent users, however, performance collapses on Mac OS X: we get only 60% of the performance at concurrency one. With Linux, both CPUs are not stressed at a concurrency of two, and increasing the load makes the CPUs work harder.

The G5 (Linux) achieves its peak quicker as it is a bit slower in this integer intensive task than the Opteron. However, it is important to remark that while performance begins to decline very slowly as we increase the number of users, there is no collapse! At a concurrency of 50, we still have 80% more performance than at a concurrency of one, showing that Linux handles the extra load of the extra threads very well. On Mac OS X, performance has plummeted to one quarter of our initial performance, showing that the threads are creating an additional overhead somehow.

Secondly, it is a fact that our benchmark is not disk limited. In that case, it is well documented that MySQL performance depends on the threading performance of the OS. A few examples:
MySQL Reference Manual for version 5.0.3-alpha:

"MySQL is very dependent on the thread package used. So when choosing a good platform for MySQL, the thread package is very important"
More:
"The capability of the kernel and the thread library to run many threads that acquire and release a mutex over a short critical region frequently without excessive context switches. If the implementation of pthread_mutex_lock() is too anxious to yield CPU time, this will hurt MySQL tremendously. If this issue is not taken care of, adding extra CPUs will actually make MySQL slower"
Darwin (6.x and older) used to be quite a bit slower when it came to context switches, but our own LMBench testing shows that the latest Darwin 8.0 performs context switches just as/nearly as fast as Linux kernel 2.6. So, a possible explanation might be that more context switches happen, but we still have to find a method to measure this. Suggestions are welcome....

From the MySQL site:
"As a multithreaded server, MySQL is most efficient on an operating system that has a well implemented threading system"
Thirdly, we have the Lmbench benchmarks, which are not conclusive, but point in the same direction. Even the high latency for the TCP measurements (see above) on Mac OS X might indicate relatively poor threading performance. MySQL has a TCP/IP connection thread, which handles all connection requests. This thread creates a new dedicated thread to handle the authentication and SQL query processing for each connection.

The split funnel suspect

The last suspect is the locking system. In Panther, only two threads could lock into the kernel to execute code of the kernel. One thread could lock into the networking part, while the other into the rest of the kernel services.

In Tiger, the locking is finer. Although Apple's documents indicate that it is still rather coarse grained, it is clear that more than two locks into the kernel can exist at the same time. In the case of MySQL, this should be a very important improvement, but we didn't see any improvement at all when performing the tests on both Panther and Tiger. This is speculation, but based on our data, we are tempted to hypothesize that the new locking system isn't really working right now, and that Tiger continues to behave like Panther.

Does it affect you?

What does this all mean? Whether or not you skipped the technical part, you probably want to know how it affects your Apple (server) experience.

It is clear that if you plan to run MySQL on Apple hardware, it is better to install YDL Linux than to use OS X. If you need excellent read performance, the maximum performance of your server will be up to 8 times better. If your server is only going to serve a limited number of users, YDL Linux will allow you to run with a less expensive system.

If the usage pattern of your server is more OLTP, Transaction processing oriented, we give you the same advice. Our quick tests with InnoDB show the same kind of behavior and we have noticed very slow file system performance. At this point, we do not have enough data to be conclusive. We noticed, for example, that importing data in our database (via the ">" command) took up to 8 times longer.

Low level benchmarking on Mac OS X and Linux The G5 a.k.a. Power 970FX
Comments Locked

47 Comments

View All Comments

  • stmok - Thursday, September 1, 2005 - link

    LOL...As everyday passes, it seems more "interesting things" are revealed from Apple solutions.
  • ViRGE - Thursday, September 1, 2005 - link

    Granted, some of this was over my head(more than I'd like to admit to), but your results are none the less very interesting Johan. Now that we have the Linux/G5 numbers, there's no arguing that there's a weakness in MacOSX somewhere, which is a bit depressing as a Mac user, but still a very useful insight as to how there's obviously something very broken in some design aspect of the OS(it simply shouldn't be getting crushed like it is). My only question now is how Apple and its devs will respond to this - it is pretty damning after all.

    Thanks for finally getting some Linux/G5 numbers out to settle this.
  • sdf - Friday, September 2, 2005 - link

    By changing hardware platforms.

    No, seriously.

    A transition from PowerPC to Intel would be the perfect time to correct ABI flaws like this. It isn't that the G5 causes the slow down, it's that the slow down (maybe) can't really be fixed without breaking binary compatibility. A CPU transition is clearly going to do that anyway, so maybe they'll just wait...
  • toelovell - Thursday, September 1, 2005 - link

    I am kind of curious to see how Darwin would work on an x86 based system for these same tests. There are x86 binaries for Darwin 8. So it should be possible to run these tests and compare Darwin with Linux on an x86 platform. This would help to see if the OS really is the limitation. Just a thought.
  • JohanAnandtech - Thursday, September 1, 2005 - link

    If linux is capable of pushing the G5 8 times higer than with Mac OS X, there is little doubt on my mind that the OS is the problem. Or did I understand you wrong?

    Anyway, I have no experience whatsoever with Darwin. My first impression is that installing Darwin on x86 is probably a very masochistic experience, due to lack of proper drivers. We might get it working but can it really run MySQL and other apps? THere are probably libraries missing... Will the results be representative of anything as it is probably tuned for just getting it running instead of performance? Anyone with Darwin x86 experience?
  • wjcott - Thursday, September 1, 2005 - link

    The only interest I have in a mac OS is if they are going to sell it without a computer. I would love to have OS X, but I must build the machine.
  • Quanticles - Thursday, September 1, 2005 - link

    Every component must be fine tuned to the upmost degree... Every BIOS Setting... Every Hidden Register... *crazy eyes* =)

Log in

Don't have an account? Sign up now