vApus Mark I: Performance-Critical Applications Virtualized

Our vApus Mark I benchmark is not a VMmark replacement. It is meant to be complementary: while VMmark runs 60 to 120 light loads, vApus Mark I runs 8 heavy VMs on 24 virtual CPUs (vCPUs). Our vApus Stressclient is being improved to scale to a much higher number of vCPUs, but for now we limit the benchmark to 24 vCPUs.

A vApus Mark I tile combines one OLTP database, one OLAP database and two heavy websites. These are the kind of demanding applications that still got their own dedicated, natively running machine a year ago. vApus Mark I shows what happens if you virtualize them. If you want to fully understand our benchmark methodology: vApus Mark I has been described in great detail here. We have changed only one thing compared to our original benchmarking: we used large pages, as that is generally considered a best practice (with RVI, EPT).

The current vApus Mark I uses two tiles. Each tile thus has 4 VMs with 4 server applications:

  • A SQL Server 2008 x64 database running on Windows 2008 64-bit, stress tested by our in-house developed vApus test (4 vCPUs).
  • Two heavy-duty MCS eFMS portals running PHP and IIS on Windows 2003 R2, stress tested by our in-house developed vApus test (2 vCPUs each).
  • One OLTP database, based on the Oracle 10g "Calling Circle" benchmark by Dominic Giles (4 vCPUs).
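
The vCPU counts in the list above add up to the 24 vCPUs and 8 VMs quoted earlier. A minimal sketch of that arithmetic follows; the VM labels and the helper name `total_vcpus` are informal names chosen for this illustration, not part of vApus:

```python
# vCPU budget of one vApus Mark I tile, using the counts from the list above.
# The dictionary keys are informal labels made up for this sketch.
TILE_VMS = {
    "sql_server_2008_db": 4,  # SQL Server 2008 x64 database
    "mcs_efms_portal_1": 2,   # first PHP/IIS portal
    "mcs_efms_portal_2": 2,   # second PHP/IIS portal
    "oracle_oltp_db": 4,      # Oracle "Calling Circle" OLTP database
}
TILES = 2  # the current vApus Mark I uses two tiles


def total_vcpus(tile_vms, tiles):
    """Total vCPUs when the same tile is instantiated `tiles` times."""
    return sum(tile_vms.values()) * tiles


vcpus_per_tile = sum(TILE_VMS.values())
total_vms = len(TILE_VMS) * TILES

print(vcpus_per_tile)                 # 12 vCPUs per tile
print(total_vms)                      # 8 VMs in total
print(total_vcpus(TILE_VMS, TILES))   # 24 vCPUs in total
```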

The beauty is that vApus (stress testing software developed by the Sizing Servers Lab) stress tests the VMs with actions made by real people (as can be seen in logs), not with some benchmarking algorithm.

vApus Mark I 2-tile test - 24 vCPUs - ESX 4.0

As always, vApus Mark I paints a totally different picture than VMmark. In this case, “only” 8 Opteron cores are needed to keep up with the six Xeon cores. While the Xeon X5670 is currently ahead of the six-core Opteron by a significant margin (34%), an octal-core Opteron might be competitive, on the condition that AMD prices it right.

We are proud to present our first vApus Mark I results on Hyper-V. One of the great advantages of our virtualization benchmark is that it runs on all popular hypervisors. Below we tested with Hyper-V R2 6.1.7600.16385 (21st of July 2009).

vApus Mark I 2-tile test - 24 vCPUs - Hyper-V

Hyper-V R2 performs well, very well. The scheduler prefers to work with a number of physical CPUs that can be easily divided among the virtual CPUs. Contrary to ESX, where the 16 logical cores of the Xeon X5570 prevail, Hyper-V prefers the twelve cores of the Opteron 2435, much to our surprise. It is interesting to see that ESX seems to prefer the Nehalem-based architectures much more than Hyper-V does. With ESX, the gap between the six-core Opteron and the six-core Xeon is 34%. With Hyper-V, this shrinks to 15%.
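
The 34% and 15% figures are relative gaps between two benchmark scores. A minimal sketch of that calculation, assuming the gap is expressed as a percentage of the trailing (Opteron) score; the helper name `relative_gap_percent` is hypothetical and the scores passed in are placeholders chosen only to reproduce the quoted percentages, not our measured results:

```python
def relative_gap_percent(leader_score, trailer_score):
    """How far the leader is ahead, as a percentage of the trailer's score."""
    if trailer_score <= 0:
        raise ValueError("trailer_score must be positive")
    return (leader_score / trailer_score - 1.0) * 100.0


# Placeholder scores (NOT measured data): the trailer is normalized to 100.
esx_gap = relative_gap_percent(134.0, 100.0)
hyperv_gap = relative_gap_percent(115.0, 100.0)

print(round(esx_gap))     # 34
print(round(hyperv_gap))  # 15
```

Note that the same lead looks smaller when expressed as a percentage of the leading score, so it matters which platform is taken as the baseline.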

Take our results with a grain of salt though, as this is the very first time we have run vApus Mark I on Hyper-V on different architectures. We need more time for analysis to understand what is going on. My first bet is that ESX is very well optimized for the Nehalem architecture. This includes the excellent Hyper-Threading optimizations and probably some optimizations to work around some of the few Nehalem architecture limitations: the small “prefetch” (16 bytes on Nehalem, 32 bytes on Istanbul) and especially the relatively small TLB. That is pure speculation though; we will need more time to investigate this.

39 Comments

  • Wireloop - Saturday, March 27, 2010 - link

    After watching vApus' result for both Intel and AMD gear, the natural conclusion drawn is that Hyper-V is more optimized for the Opteron architecture than ESX since the latter achieves a lower Geometric Mean VM rate (on that platform).

    I guess it has something to do with maneuver of data into the L3 cache which is a critical condition for high multithreaded performance on the AMD platform. If so, my kudos to Microsoft.
  • mgbell - Friday, March 19, 2010 - link

    Hey Anand,
    I think you should do set up a test pitting the Xeon line against their perspective i7 counterparts and run some workstation type tests. I would be very interested in any testing that had to do with video encoding/rendering. I am a video editor and would love to see a side by side comparison with a xeon sytem of the same speed against a core i7 system. Also just for fun turn off the second processor or turn it on so we can see what kinds of rendering benefits a second processor with 4/6 cores (8/12 threads) would gain.

    Thanks
    MB
  • lemonadesoda - Sunday, March 21, 2010 - link

    I very much agree. It would be interesting to run a typical "enthusiast" or "workstation" application/benchmark just to see how it compares.

    I would like to see a Cinebench R10 comparison, a Everest PhotoWorxx, and a Fritz Chess Benchmark. Possibly a video encoding benchmark too.

    A lot of enthusiasts run dual Xeons as workstations... you cant predict what software they will be running, but the above 3 tests are good general comparatives.

    There are also servers providing other services like OCR or PDF generation. These Oracle database benchmarks are useful, but represent only one type of server/workstation use.
  • damianrobertjones - Thursday, March 18, 2010 - link

    I'm sitting here at the end of and ADSL line with a fresh WIndows XP machine, all updates, new Kaspersky install.

    While waiting for an app to install I've visited this page....

    Bang. Kaspersky popped up with a warning

    Trojan downloader.java.agent.aw from www.googleadsenstats.ru/useralexey/files/gsb50.jar/Appletx.class

    Do you have something against ie8 as this doesn't happen with Opera?

    PLEASE MAKE YOUR SITE SAFE!
  • itsmeagain - Wednesday, March 17, 2010 - link

    Any chance you could throw a couple of these in a mac pro and give us a preview?
  • Shadowmaster625 - Wednesday, March 17, 2010 - link

    The E5503 looks like the most reasonable and appealing server processor for those of us that live in the real world. Yet there are no benchmarks...
  • Lukas - Thursday, March 18, 2010 - link

    The 550x CPUs are crap. They don't have HyperThreading or TurboBoost. The only reason they exist is for a cheap entry price tag. If you don't need a lot of CPU (e.G. unvirtualized LOB software), better go with a 34xx series Xeon. A lot cheaper than the 55xx series.
  • majortom1981 - Tuesday, March 23, 2010 - link

    they also exist for government and public service contracts . We got a z600 with 4 gig ram ,1 5504 xeon, and an 80 gig 10k rpm enterprise sata drive (also nvida gpu) for $700. For just $239 i can add another 5504 .
  • pvdw - Wednesday, March 17, 2010 - link

    How come only Windows servers are being used. What about RHEL with a Tomcat or JBOSS bench (surely such exists).
  • Lukas - Thursday, March 18, 2010 - link

    Probably because the benchmarkers are not familiar with those platforms? Doing benchmarks on a platform about which you don't know enough will not give you any usable results.
