Virtualization Performance: Linux VMs on ESXi

We introduced our new vApus FOS (For Open Source) server workloads in our review of the Facebook "Open Compute" servers. In a nutshell, it is a mix of four VMs with open source workloads: two PhpBB websites (Apache2, MySQL), one OLAP MySQL "Community server 5.1.37" database, and one VM with VMware's open source groupware Zimbra 7.1.0. Zimbra is quite a complex application, as it contains the following components:

  • Jetty, the web application server
  • Postfix, an open source mail transfer agent
  • OpenLDAP, for user authentication
  • MySQL, the database
  • Lucene, a full-featured text and search engine
  • ClamAV, an anti-virus scanner
  • SpamAssassin, a mail filter
  • James/Sieve, for mail filtering

All VMs are based on a minimal CentOS 5.6 setup with VMware Tools installed. All our current virtualization testing runs on the hypervisor we know best: ESXi (5.0). CentOS 5.6 is not ideal for the Interlagos Opteron, but we designed the benchmark a few months ago, and it took us weeks to get it working and repeatable (the latter is especially hard). For example, it was not easy to get Zimbra fully configured and properly benchmarked due to its complex usage patterns and high I/O usage. Besides, the reality is that VMs often contain older operating systems. We hope to show some benchmarks based on Linux kernel version 3.0 or later in our next article.

We tested with five tiles (one tile = four VMs). Each tile needs seven vCPUs, so the test requires 35 vCPUs.
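The tile arithmetic is simple enough to sketch in a few lines. The snippet below is only an illustration: the per-tile figures (four VMs, seven vCPUs) come from the text, while the function names and the hardware thread count used in the example run are our own assumptions and do not describe the tested servers.

```python
# Tile layout taken from the text: one tile = four VMs and seven vCPUs.
VMS_PER_TILE = 4
VCPUS_PER_TILE = 7


def total_vms(tiles: int) -> int:
    """Number of VMs running when `tiles` tiles are active."""
    return tiles * VMS_PER_TILE


def total_vcpus(tiles: int) -> int:
    """Number of vCPUs the hypervisor has to schedule for `tiles` tiles."""
    return tiles * VCPUS_PER_TILE


def vcpus_per_hardware_thread(tiles: int, hardware_threads: int) -> float:
    """Ratio of scheduled vCPUs to hardware threads (above 1.0 = oversubscribed)."""
    if hardware_threads <= 0:
        raise ValueError("hardware_threads must be positive")
    return total_vcpus(tiles) / hardware_threads


if __name__ == "__main__":
    tiles = 5  # the tile count used in this test
    # Hypothetical host size, chosen only to illustrate the ratio.
    hypothetical_threads = 32
    print(f"{tiles} tiles -> {total_vms(tiles)} VMs, {total_vcpus(tiles)} vCPUs")
    ratio = vcpus_per_hardware_thread(tiles, hypothetical_threads)
    print(f"vCPUs per hardware thread on a {hypothetical_threads}-thread host: {ratio:.2f}")
```

A ratio above 1.0 means the hypervisor has to time-slice vCPUs across hardware threads, which is the situation a consolidation benchmark like this one is meant to stress.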

vApus FOS

The Opteron 6276 stays close to the more expensive Xeons. That makes the Opteron server the one with the best performance per dollar. Still, we feel a bit underwhelmed as the Opteron 6276 fails to outperform the previous Opteron by a tangible margin.

The benchmark above measures throughput. Response times are even more important. Let us take a look at the table below, which gives you the average response time per VM:

vApus FOS Average Response Times (ms), lower is better

CPU                 PhpBB1   PhpBB2   MySQL OLAP   Zimbra
AMD Opteron 6276    737      587      170          567
AMD Opteron 6174    707      574      118          630
Intel Xeon X5670    645      550      63           593
Intel Xeon X5650    678      566      102          655

The Xeon X5670 wins a landslide victory in MySQL. MySQL has always scaled better with clock speed than with core count, so we expect that clock speed played a major role here. The same is true for our first VM: this VM gets only one vCPU and as a result runs quicker on the Xeon. In the other applications, the Opteron's higher (integer) core count starts to show. However, AMD cannot really be satisfied with the fact that the old Opteron 6174 delivers much better MySQL performance. We suspect that the high latency L2 cache and the higher branch misprediction penalty (20 vs 12 cycles) are to blame. MySQL performance is characterized by a relatively high number of branches and a lot of accesses to the L2. The Bulldozer server does manage to get the best response time on our Zimbra VM, however, so it's not a complete loss.
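To make the gaps in the response time table easier to compare, here is a minimal sketch that finds the fastest CPU per VM and expresses each result as a percentage above the best one. The numbers are copied from the table; the function names and the "percent slower than best" metric are our own choices for illustration, not part of vApus.

```python
# Average response times in ms, copied from the table (lower is better).
RESPONSE_MS = {
    "AMD Opteron 6276": {"PhpBB1": 737, "PhpBB2": 587, "MySQL OLAP": 170, "Zimbra": 567},
    "AMD Opteron 6174": {"PhpBB1": 707, "PhpBB2": 574, "MySQL OLAP": 118, "Zimbra": 630},
    "Intel Xeon X5670": {"PhpBB1": 645, "PhpBB2": 550, "MySQL OLAP": 63, "Zimbra": 593},
    "Intel Xeon X5650": {"PhpBB1": 678, "PhpBB2": 566, "MySQL OLAP": 102, "Zimbra": 655},
}


def fastest_cpu_per_vm(table):
    """Return {vm: cpu} where cpu has the lowest response time for that VM."""
    vms = next(iter(table.values())).keys()
    return {vm: min(table, key=lambda cpu: table[cpu][vm]) for vm in vms}


def percent_slower_than_best(table, cpu):
    """Return {vm: percent} showing how far `cpu` trails the best result per VM."""
    result = {}
    for vm in table[cpu]:
        best = min(row[vm] for row in table.values())
        result[vm] = (table[cpu][vm] / best - 1.0) * 100.0
    return result


if __name__ == "__main__":
    for vm, cpu in fastest_cpu_per_vm(RESPONSE_MS).items():
        print(f"{vm}: fastest is {cpu}")
    for vm, pct in percent_slower_than_best(RESPONSE_MS, "AMD Opteron 6276").items():
        print(f"Opteron 6276, {vm}: {pct:.1f}% slower than the best result")
```

Expressed this way, the MySQL OLAP gap stands out clearly from the much smaller differences in the PhpBB and Zimbra VMs.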

Performance per watt remains the most important metric for a large part of the server market. So let us check out the power consumption that we measured while we ran vApus FOS.

vApus FOS Power Consumption

The power consumption numbers are surprising, to say the least. The Opteron 6174 needs quite a bit less energy than the two other contenders. That is bad news for the newest Opteron. We found out later that some tinkering could improve the situation, as we will see later on.


106 Comments


  • mino - Wednesday, November 16, 2011

    It had most likely to do with you running it on NetBurst (judging by no VT-x moniker).

    As much to do with VT-x as with a crappy CPU ... with bus architecture. Ah, thank god they are dead.
  • JustTheFacts - Wednesday, November 16, 2011

    Please explain why there is no comparison between the latest AMD processors and Intel's flagship two-way server processors: the Intel Westmere-EX E7-28xx processor family?

    Lest you forgot about them, you can find your own benchmarks of this flagship Intel processor here: http://www.anandtech.com/show/4285/westmereex-inte...

    Take the gloves off and compare flagship against flagship please, and then scale the results to reflect the price difference if you have to, but there's no good reason not to compare them that I can see. Thanks.
  • duploxxx - Thursday, November 17, 2011

    Westmere-EX 2-socket is dead; it will be killed by Intel's own platform, called Romley, which will have 2P and 4P.

    It was a stupid platform from the start and overrated by sales/consultants with their so-called huge memory support.
  • aka_Warlock - Wednesday, November 16, 2011

    I think you should have done a more thorough VM test than you did. 64GB RAM?
    We all know single-threaded performance is weak, but I still feel the servers are underutilized in your test.

    These CPUs are screaming for heavy multi-threaded workloads. Many VMs. Many vCPUs.

    What would the performance be if you had, say, at least 192GB of RAM and 50 (maybe more) VMs on it?

    And of course, storage should not be a bottleneck.

    I think this is where this 8-module/16-thread CPU would shine.
    A dual socket rack/blade. 16 modules/32 threads.
    Loads of RAM and a bunch of VMs.
  • iwod - Wednesday, November 16, 2011

    It is power hungry, isn't any better than Intel, and it is only slightly cheaper, at the cost of a higher electricity bill.

    So unless some software optimization magically shows that AMD is good at something, I think they are pretty much doomed.

    It is like the Pentium 4, except Intel can afford to make one or two mistakes, and AMD cannot.
  • mino - Wednesday, November 16, 2011

    Then the article served its purpose well.
  • SunLord - Wednesday, November 16, 2011

    So is the AMD system running 8GB DDR3-1600 DIMMs or 4GB DDR3-1333? Because you list the same DDR3-1333 model for both systems, and if the server supports 16 DIMMs, well, 16*4 is 64GB.
  • JohanAnandtech - Thursday, November 17, 2011

    Copy and paste error, fixed. We used DDR3-1600 (Samsung).
  • Johnmcl7 - Wednesday, November 16, 2011

    I have wondered about this: with more cores per socket and virtualisation (organising a new set of servers and buying far less hardware for the same functionality), I'd have thought less server hardware is being purchased in total. Clearly that isn't the case though. Is the money made back from more expensive servers?

    John
  • bruce24 - Wednesday, November 16, 2011

    Sure, with each new generation of servers you need much less hardware to do the same amount of work; however, worldwide, people are looking for servers to do much more work. Each year companies like Google, Facebook, Amazon, Microsoft and Apple add much more computing power than they could get by refreshing their current servers.
