64-bit MySQL (Linux 64-bit)

MySQL has released version 5.1.22, which supposedly can scale up to eight CPU cores. That would be a huge improvement, considering that all versions earlier than 5.0.37 could only make good use of two CPU cores. In our experience, this new binary scales well up to four cores, but eight-core systems are easily 20% slower than four-core systems. Thus, we tested with a maximum of four cores.


[Chart: MySQL 5.1.22]

Please note that these results cannot be compared with our earlier MySQL results. MySQL v5.1.22 is a completely different binary from the v5.0.2x builds we tested previously. Although it still doesn't scale beyond four cores, it is up to 70% (!!) faster than the v5.0.26 that came standard with our SLES 10 SP1. For smaller servers with four cores, MySQL is once again an ultra fast database.

Moreover, the third generation of Opterons absolutely loves this new MySQL version. At 2.5GHz, it is just as fast (the margin of error is up to 4%) as the mighty Xeon 5472 at 3GHz. As we failed to profile MySQL in depth (CodeAnalyst for Linux still has some quirks), we cannot pinpoint the exact reason why the Opteron 23xx is so good at this. MySQL is mostly limited by lock synchronization, so we suspect that the slightly faster cache coherency syncing on the Opteron 23xx might be one of the reasons AMD's latest performs so well.
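To put the scaling figures above in perspective, here is a minimal sketch of the arithmetic. It assumes that "20% slower" and "70% faster" describe throughput (and not completion time), which the text does not state explicitly. The function names are illustrative only.

```python
# Figures quoted in the text; everything else is derived arithmetic.
# Assumption: both percentages describe throughput, not elapsed time.
SLOWDOWN_8_VS_4 = 0.20     # eight cores are roughly 20% slower than four
SPEEDUP_NEW_VS_OLD = 0.70  # v5.1.22 is up to 70% faster than v5.0.26


def relative_throughput(fractional_change: float) -> float:
    """Convert a fractional change (+0.70, -0.20) into a throughput ratio."""
    return 1.0 + fractional_change


def per_core_efficiency(ratio: float, cores: int, baseline_cores: int) -> float:
    """Throughput per core, relative to the per-core throughput of the baseline."""
    return ratio * baseline_cores / cores


eight_vs_four = relative_throughput(-SLOWDOWN_8_VS_4)
print(f"8-core throughput vs 4-core: {eight_vs_four:.2f}x")
print(f"per-core efficiency at 8 cores: {per_core_efficiency(eight_vs_four, 8, 4):.2f}")
print(f"v5.1.22 vs v5.0.26 (best case): {relative_throughput(SPEEDUP_NEW_VS_OLD):.2f}x")
```

Under that assumption, each core in the eight-core configuration delivers well under half of what a core delivers in the four-core configuration, which is why we capped the tests at four cores.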

WinRAR 3.62 (Windows 32-bit)

WinRAR 3.62 is a completely different kind of workload.

WinRAR 3.62 Profiling

Profile                                      Total
Average IPC (on AMD 2350)                    0.36

Instruction mix
  Floating Point                             0%
  SSE                                        0%
  Branches                                   9%
  L1 datacache ratio                         1.13
  L1 I cache ratio                           0.35

Performance indicators (on Opteron 2350)
  Branch misprediction                       7%
  L1 datacache miss                          4%
  L1 Instruction cache miss                  0%
  L2 cache miss                              3%

Notice that, contrary to the other workloads we have profiled so far, WinRAR does not run perfectly in the L1 or L2 cache. Also notice the huge number of loads that happen: more than one per retired instruction.
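As a rough illustration of what these profile numbers imply, the sketch below converts them into events per 1000 retired instructions. It assumes that the L1 data cache miss percentage is measured per access and the branch misprediction percentage per branch. The profile does not spell this out, so treat the results as estimates.

```python
# Values copied from the WinRAR 3.62 profile (Opteron 2350).
IPC = 0.36
L1D_ACCESSES_PER_INSTR = 1.13   # "L1 datacache ratio"
L1D_MISS_RATE = 0.04            # assumption: misses per L1 data cache access
BRANCH_FRACTION = 0.09          # branches as a share of all instructions
BRANCH_MISPREDICT_RATE = 0.07   # assumption: mispredictions per branch


def per_kilo_instructions(events_per_instr: float, rate: float) -> float:
    """Events per 1000 retired instructions."""
    return events_per_instr * rate * 1000


cpi = 1 / IPC  # cycles per instruction
l1d_mpki = per_kilo_instructions(L1D_ACCESSES_PER_INSTR, L1D_MISS_RATE)
branch_mpki = per_kilo_instructions(BRANCH_FRACTION, BRANCH_MISPREDICT_RATE)

print(f"CPI: {cpi:.2f}")
print(f"L1 data cache misses per 1000 instructions: {l1d_mpki:.1f}")
print(f"Branch mispredictions per 1000 instructions: {branch_mpki:.1f}")
```

Under these assumptions, L1 data cache misses are several times more frequent than branch mispredictions, which fits the picture of a workload that leans heavily on the memory subsystem.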


[Chart: WinRAR 3.62]

The massive bandwidth that Barcelona can offer multi-threaded software pays off here. You can also see that the Seaburg chipset improves the score of the 3GHz quad-core Xeon by 7%.


43 Comments


  • Regs - Tuesday, November 27, 2007

    I would not expect any from vendors and wholesalers until early next year.

    Matter of fact I wouldn't want one until then anyhow. I would at least wait until B3 stepping.
  • TA152H - Tuesday, November 27, 2007

    Johan,

    From my understanding, x87 is now obsolete and not even supported in x86-64. Can you verify this? I know I had read it, but in your article you state that Intel improved it, so I'm not as sure. I had assumed one of AMD's handicaps was the disproportionate, and nearly useless, x87 processing power their processors carried, but now I am not as sure. Is x87 supported in x86-64, and if not, why would Intel increase their x87 capabilities when it's clearly a deprecated technology?
  • JohanAnandtech - Tuesday, November 27, 2007

    The x87 instructions can be used in both legacy mode and long mode, but it is true that AMD and Intel prefer scalar SSE instructions.

    x87 performance is still important, as many 32-bit programs still rely on it (look at 32-bit 3ds Max).

    If Intel's newest Core architecture had not improved the x87 FP, it would probably have looked silly, as so many 32-bit programs still use it intensively. Secondly, as you can see, things like the Radix-16 circuitry are used by both the SIMD and the x87 units.
  • Gholam - Tuesday, November 27, 2007

    Do you have any plans to benchmark Opteron vs Xeon in an ESX Server environment?
  • DeepThought86 - Tuesday, November 27, 2007

    This is exactly what I was thinking of too. I want to change my mode of working to run several separate VMs (one for programming, one for Office, etc.) and really want to know how Phenom compares to the Q6600 for those uses. Granted, this article looks at the server versions of those chips, but for VMware the performance might be more comparable than, say, SuperPi 1M benchmarks!
  • DeepThought86 - Tuesday, November 27, 2007

    I forgot to add: since Phenom would presumably have the same nested page table support as Barcelona, how much performance improvement would this yield? I'd love to know.
  • sht - Tuesday, November 27, 2007

    I was about to ask the same question after reading the concluding remarks:

    "You may feel for example that using four instances in our SPECjbb test favors AMD too much, but there is no denying that using more virtual machines on fewer physical servers is what is happening in the real world."

    Since the CPUs have features that should accelerate virtualization, it would really be interesting to see how they compete there. My only addition to your request would be to add KVM as a host as well (and Xen and whatnot too if you care, though I really think only KVM is of interest).
  • JohanAnandtech - Tuesday, November 27, 2007

    Indeed, we are working on that. The software that we described here (http://www.anandtech.com/IT/showdoc.aspx?i=2997&am...) is being adapted to testing virtualized applications. We are also looking into the parameters that can really influence the results of a benchmark on a virtualized server.
  • AssBall - Tuesday, November 27, 2007

    Thanks, Johan.

    This has been one of the clearer and better proofread articles I have read here lately. It was interesting, unbiased, and insightful. I am excited to see what you get into for your next project.
