vApus Mark I: Performance-Critical applications virtualized

Our vApus Mark I benchmark is not a VMmark replacement. It is meant to be complementary: while VMmark runs 60 to 120 light loads, vApus Mark I runs 8 heavy VMs on 24 virtual CPUs (vCPUs). Our vApus Stressclient is being improved to scale to a much higher number of vCPUs, but for now we limit the benchmark to 24 vCPUs.

A vApus Mark I tile combines one OLTP database, one OLAP database and two heavy websites. These are the kind of demanding applications that still got their own dedicated, natively running machine a year ago. vApus Mark I shows what happens if you virtualize them. If you want to fully understand our benchmark methodology: vApus Mark I has been described in great detail here. We have changed only one thing compared to our original benchmarking: we used large pages, as this is generally considered a best practice (with RVI, EPT).

The current vApus Mark I uses two tiles. Each tile thus has 4 VMs running 4 server applications:

  • A SQL Server 2008 x64 database running on Windows 2008 64-bit, stress tested by our in-house developed vApus test (4 vCPUs).
  • Two heavy-duty MCS eFMS portals running PHP and IIS on Windows 2003 R2, stress tested by our in-house developed vApus test (2 vCPUs each).
  • One OLTP database, based on the Oracle 10g Calling Circle benchmark by Dominic Giles (4 vCPUs).
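
The tile arithmetic can be checked with a short sketch. It only uses the VM counts and vCPU sizes from the list above; the names `TILE`, `TILES`, `vms_per_tile` and `vcpus_per_tile` are made up for this illustration and are not part of vApus.

```python
# Per-tile VM layout as listed above: (description, number of VMs, vCPUs per VM).
# The descriptions are labels only; the numbers come from the list in the text.
TILE = [
    ("SQL Server 2008 x64 database", 1, 4),
    ("MCS eFMS web portal", 2, 2),
    ("Oracle 10g OLTP database", 1, 4),
]
TILES = 2  # the current vApus Mark I uses two tiles


def vms_per_tile(tile):
    """Count the VMs in one tile."""
    return sum(count for _, count, _ in tile)


def vcpus_per_tile(tile):
    """Sum the vCPUs assigned to all VMs in one tile."""
    return sum(count * vcpus for _, count, vcpus in tile)


if __name__ == "__main__":
    total_vms = TILES * vms_per_tile(TILE)
    total_vcpus = TILES * vcpus_per_tile(TILE)
    print(f"{total_vms} VMs on {total_vcpus} vCPUs")
```

Two tiles of 4 VMs and 12 vCPUs each give the 8 VMs and 24 vCPUs quoted earlier.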

The beauty is that vApus (stress testing software developed by the Sizing Servers Lab) uses actions made by real people (as can be seen in logs) to stress test the VMs, not some benchmarking algorithm.

vApus Mark I 2-tile test - 24 vCPUs - ESX 4.0

As always, vApus Mark I paints a totally different picture than VMmark. In this case, “only” 8 Opteron cores are needed to keep up with six Xeon cores. While the Xeon X5670 is currently ahead of the six-core Opteron by a significant margin (34%), an octal-core Opteron might be competitive, on the condition that AMD prices it right.

We are proud to present our first vApus Mark I results on Hyper-V. One of the great advantages of our virtualization benchmark is that it runs on all popular hypervisors. Below we tested with Hyper-V R2 6.1.7600.16385 (21st of July 2009).

vApus Mark I 2-tile test - 24 vCPUs - Hyper-V

Hyper-V R2 performs well, very well. The scheduler prefers to work with a number of physical CPUs that can be easily divided among the virtual CPUs. Contrary to ESX, where the 16 logical cores of the Xeon X5570 prevail, Hyper-V prefers the twelve cores of the Opteron 2435, much to our surprise. It is interesting to see that ESX seems to prefer the Nehalem-based architectures much more than Hyper-V does. With ESX, the gap between the six-core Opteron and the six-core Xeon is 34%. With Hyper-V, this shrinks to 15%.
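
To make the "easily divided" remark concrete, the sketch below computes how many of the 24 vCPUs fall on each logical core for the two hosts named above. It is plain arithmetic on figures from this article (24 vCPUs, 16 logical cores, 12 cores). The helper names are hypothetical, and the sketch does not model how the Hyper-V scheduler actually places vCPUs.

```python
from fractions import Fraction

VCPUS = 24  # total vCPUs in the two-tile test

# Logical core counts as stated in the text for the two dual-socket hosts.
HOSTS = {
    "Xeon X5570 (16 logical cores)": 16,
    "Opteron 2435 (12 cores)": 12,
}


def vcpus_per_core(vcpus, cores):
    """Exact ratio of virtual CPUs to logical cores."""
    return Fraction(vcpus, cores)


def divides_evenly(vcpus, cores):
    """True when every logical core gets the same whole number of vCPUs."""
    return vcpus % cores == 0


if __name__ == "__main__":
    for name, cores in HOSTS.items():
        ratio = vcpus_per_core(VCPUS, cores)
        note = "even split" if divides_evenly(VCPUS, cores) else "uneven split"
        print(f"{name}: {ratio} vCPUs per core ({note})")
```

On the Opteron 2435 each core carries exactly 2 vCPUs, while the Xeon X5570 ends up at 3/2 vCPUs per logical core. This matches the article's observation, but it is only one possible reading of why Hyper-V favors the Opteron here.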

Take our results with a grain of salt though, as this is the very first time we have run vApus Mark I on Hyper-V on different architectures. We need more time to analyze and understand what is going on. My first bet is that ESX is very well optimized for the Nehalem architecture. This includes the excellent Hyper-Threading optimizations and probably some optimizations to avoid a few of Nehalem's architectural limitations: the small “prefetch” (16 bytes on Nehalem, 32 bytes on Istanbul) and especially the relatively small TLB. That is pure speculation though; we will need more time to investigate this.


40 Comments


  • behrouz - Wednesday, March 17, 2010 - link

    why did you not test magny core ?
  • JohanAnandtech - Wednesday, March 17, 2010 - link

    For the same reason that there are no Magny-Cours benchmarks on AMD's site yet :-).
  • drewintheav - Tuesday, March 16, 2010 - link

    The INTEL i7 980X has dual QPI's and will run in a dual socket mainboard!!! Such as the EVGA W555 /Classified SR-2

  • Lukas - Thursday, March 18, 2010 - link

    No, the i7 980X has only a single QPI link. But I'm pretty sure there's a corresponding W56xx CPU, with two QPI links and twice the price tag.
  • thunng8 - Tuesday, March 16, 2010 - link

    "server CPU architecture which already has the fastest cores on the market and you’ll get very impressive results"

    This is not entirely correct. If you limit yourself to the x64 architecture, it is correct, but the recently released IBM POWER7 8-core chip blows away the Nehalem architecture in the benchmarks released so far.

    For example, a 4 chip, 32 core 3.55Ghz POWER7 server does 85,220 SAPS in the SAP SD 2 tier benchmark and that isn't even the top bin POWER7. (top bin is 3.86Ghz with double the memory bandwidth / core) There are even larger margins in other benchmarks like specIntRate etc.
  • Photubias - Thursday, March 18, 2010 - link

    Just curious: what software (OS/applications) run on that 8Core POWER7 chip?
  • Lukas - Thursday, March 18, 2010 - link

    Linux, AIX, IBM i, z/OS

    That's pretty much it. Lots of traditional OLTP workloads run on those platforms. Several flight booking systems run on z/OS.
  • Penti - Thursday, March 18, 2010 - link

    z/OS runs on System Z systems with z10 CICS processors. Eg Mainframes.

    IBM System i servers are just high-end POWER servers. Running mainly Java and database loads, directly on IBM i/OS (previously i5/OS and before that AS/400) or AIX, or Linux. IBM DB2 is integrated directly into IBM i/OS.
  • Torment - Thursday, March 18, 2010 - link

    And what does that setup cost?
  • vitchilo - Tuesday, March 16, 2010 - link

    What would be great is ONE game test... like Crysis or something...

    And ONE X264 encode test...

    Thanks a lot.
