vApus Mark II

vApus Mark II uses the same applications as vApus Mark I, but they have been updated to newer versions. vApus Mark I uses five VMs with three server applications:

  • One VM with the Nieuws.be OLAP database, based on SQL Server 2008 x64 running on Windows 2008 64-bit R2, stress tested by our in-house developed vApus test.
  • Three MCS eFMS portals running PHP, IIS on Windows 2003 R2, stress tested by our in house developed vApus test.
  • One OLTP database, based on the Swing bench 2.2 “Calling Circle benchmark” of Dominic Giles. We updated the Oracle database to version 11G R2 running on Windows 2008 R2.

All VMs are tested with several sequential user concurrencies. All VMs are “warmed up” with lower user counts. We measure only at the higher concurrencies, later in the test. At that point, results are repetitive as the databases are using their caches and buffers optimally.

The OLAP VM is based on the Microsoft SQL Server database of the Dutch Nieuws.be site, one of the newest web 2.0 websites launched in 2008. We updated to SQL Server 2008 R2. This VM gets now eight virtual CPUs (vCPUs), a feature that is supported by the newest hypervisors such as VMware ESX 4.0 and Xen 4.0. This kind of high vCPU count is one of the conditions that needs to be met before administrators will virtualize these kind of “heavy duty” applications. The application hardly touches the disk, as the vast majority of activity is in memory during the test cycle. About 135GB of disk space is necessary, but the most used data is cached in about 4GB of RAM.

The MCS eFMS portal, a real-world facility management web application, has been discussed in detail here. It is a complex IIS, PHP, and FastCGI site running on top of Windows 2003 R2 32-bit. Note that these two VMs run in a 32-bit guest OS, which impacts the VM monitor mode. We left this application running on Windows 2003, as virtualization allows you to minimize costs by avoiding unnecessary upgrades. We use three MCS VMs, as web servers are more numerous than database servers in most setups. Each VM gets two vCPUs and 2GB of RAM space.

Since OLTP testing with our own vApus stress testing software is still in beta, our fourth VM uses a freely available test: "Calling Circle" of the Oracle Swingbench Suite. Swingbench is a free load generator designed by Dominic Giles to stress test an Oracle database. We tested the same way as we have tested before, with one difference: we use an OLTP database that is only 2.7GB (instead of 9.5GB). The OLTP test runs on the Oracle 11g R2 64-bit on top of Windows 2008 Enterprise R2 (64-bit). Data is placed on an Intel X25-E SLC SSD, with logs on a separate SSD. This is done for each Calling Circle VM to avoid storage bottlenecks. The OLTP VM gets four vCPUs.

Notice that our total vCPU count is 18 (8 + 3 x 2 + 4). The advantage of using 18 vCPUs per tile is it will not be straightforward to schedule virtual CPUs on almost every CPU configuration. You might remember from our previous testing that if the number of virtual CPUs is a multiple of the number of physical cores, the server gets a performance advantage over other systems.

Careful monitoring (ESXtop) showed us that four tiles of vApus Mark II (72 vCPUs) were enough to keep the fastest system at an average of 96.5% CPU utilization during performance measurements.

Stress Testing the High End The Virtualization Landscape So Far
Comments Locked

51 Comments

View All Comments

  • Ratman6161 - Wednesday, August 11, 2010 - link

    Many products license on a per CPU basis. For Microsoft anyway, what they actually count is the number of sockets. For example SQL Server Enterprise retails for $25K per CPU. So an old 4 socket system with single cores would be 4 x $25K = $100K. A quad socket system with quad core CPUs would be a total of 16 cores but the pricing would still be 4 sockets x $25K = 100K. It used to be that Oracle had a complex formula for figuring this but I think they have now also gone to the simpler method of just counting sockets (though their enterprise edition is $47.5K).

    If you are using VMWare, they also charge per socket (last I knew) so two dual socket systems would cost the same as a single 4 socket system. Thing is though you need to have at least two boxes in order to enable the high availability (i.e. automatic failover) functionality.
  • Stuka87 - Wednesday, August 11, 2010 - link

    For VMWare they have a few pricing structures. You can be charged per physical socket, or you can get an unlimited socket license (which is what we have, running one seven R910's). You just need to figure out if you really need the top tier license.
  • semo - Tuesday, August 10, 2010 - link

    "Did I mention that there is more than 72GHz of computing power in there?"

    Is this ebay?
  • Devo2007 - Tuesday, August 10, 2010 - link

    I was going to comment on the same thing.

    1) A dual core 2GHz CPU does not equal "4GHz of computing power" - unless somehow you were achieving an exact doubling of performance (which is extremely rare if it exists at all).

    2) Even if there was a workload that did show a full doubling of performance, performance isn't measured in MHz & GHz. A dual-core 2GHz Intel processor does not perform the same as a 2GHz AMD CPU.

    More proof that the quality of content on AT is dropping. :(
  • mino - Wednesday, August 11, 2010 - link

    You seem to know very little about the (40yrs old!) virtualization market.
    It flourishes from *comoditising* processing power.

    Why clearly meant a joke, that statement of Johan, is much closer to the truth than most market "research" reports on x86.
  • JohanAnandtech - Wednesday, August 11, 2010 - link

    Exactly. ESX resource management let you reserve CPU power in GHz. So for ESX, two 2.26 GHz cores are indeed a 4.5 GHz resource.
  • duploxxx - Thursday, August 12, 2010 - link

    sure you can count resources together as much as you want... virtually. But in the end a single process is still only able to handle the max ghz a single cpu can offer but can finish the request faster. That is exactly the thing why those Nehalem and gulf still hold against the huge core count of Magny cours.
  • maeveth - Tuesday, August 10, 2010 - link

    So I have nothing at all against AnandTech's recent articles on Virtualization however so far all of them have only looked at Virtualization from a compute density point of view.

    I currently am the administrator of a VMware environment used for development work and I run into I/O bottle necks FAR before I ever run into a compute bottleneck. In fact computational power is pretty much the LAST bottleneck I run into. My environment currently holds just short of 300 VMs, OS varies. We peak at approximately 10-12K IOPS.

    From my experience you always have to look at potential performance in a virtual environment at a much larger perspective. Every bottleneck effects others in subtle ways. For example if you have a memory bottleneck, either host or guest based you will further impact your I/O subsystem, though you should aim to not have to swap. In my opinion your storage backend is the single most important factor when determining large-scale-out performance in a virtualized environment.

    My environment has never once run into a CPU bottleneck. I use IBM x3650/x3650M2 with Dual Quad Xeons. The M2s use X5570s specifically.

    While I agree having impressive magnitudes of "GHz" in your environment is kinda fun it hardly says anything about how that environment will preform in a real world environment. Granted it is all highly subject to work load patterns.

    I also want to make it clear that I understand that testing on a such a scale is extremely cost prohibitive. As such I am sure AnandTech, Johan speficially, is doing the best he can with what resources he is given. I just wanted to throw my knowledge out there.

    @ELC
    Yes, software licensing is a huge factor when purchasing ESX servers. ESX is licensed per socket. It's a balancing act that depends on your work load however. A top end ESX license costs about $5500/year per socket.
  • mino - Wednesday, August 11, 2010 - link

    However, IMO storage performance analysis is pretty much beyond AT's budget ballpark by an order of magnitude (or two).

    There is a reason this space is so happily "virtualized" by storage vendors AND customers to a "simple" IOPS number.
    It is a science on its own. Often closer to black (empiric) magic than deterministic rules ...

    Johan,
    on the other hand, nothing prevents you form mentioning this sad fact:

    Except edge cases, a good virtualization solution is build from the ground up with
    1. SLA's
    2. storage solution
    3. licensing considerations
    4. everything else (like processing architecture) dictated by the previous
  • JohanAnandtech - Wednesday, August 11, 2010 - link

    I can only agree of course: in most cases the storage solution is the main bottleneck. However, this is aloso a result of the fact that most storage solutions out there are not exactly speed demons. Many storage solutions out there consist of overengineered (and overpriced) software running on outdated hardware. But things are changing quickly now. HP for example seems to recognize that a storage solution is very similar to a server running specialized software. There is more, with a bit of luck, Hitachi and Intel will bring some real competition to the table. (currently STEC has almost a monopoly on the enterprise SSD disks). So your number 2 is going to tumble down :-).

Log in

Don't have an account? Sign up now