Real-world virtualization benchmarking: the best server CPUs compared
by Johan De Gelas on May 21, 2009 3:00 AM EST
The Virtualization Benchmarking Chaos
There are an incredible number of pitfalls in the world of server application benchmarking, and virtualization just makes the whole situation much worse. In this report, we want to measure how well the CPUs are coping with virtualization. That means we need to choose our applications carefully. If we use a benchmark that spends very little time in the hypervisor, we are mostly testing the integer processing power and not how the CPU copes with virtualization overhead. As we have pointed out before, a benchmark like SPECjbb tells you little about virtualization performance, as it spends less than one percent of its time in the hypervisor.
How is virtualization different? A CPU A that beats CPU B when running natively can still lose to CPU B in virtualized scenarios. There are various reasons why this can happen, for example CPU A…
- Takes much more time for switching from the VM to hypervisor and vice versa.
- Does not support hardware assisted paging: memory management will cause a lot more hypervisor interventions.
- Has smaller TLBs; Hardware Assisted Paging (EPT, NPT/RVI) places much more pressure on the TLBs.
- Has less bandwidth; an application that needs only 20% of the maximum bandwidth will be bottlenecked if you run six VMs of the same application.
- Has smaller caches; the more VMs, the more pressure there will be on the caches.
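The bandwidth point in the list above is simple arithmetic, and a minimal sketch makes it concrete. The function names are hypothetical, and the model assumes that demand scales linearly with the number of VMs and ignores any contention effects, which real systems will not respect exactly.

```python
def aggregate_bandwidth_demand(per_vm_fraction: float, vm_count: int) -> float:
    """Total demand as a fraction of the host's peak memory bandwidth.

    Assumes every VM runs the same application and that demand adds up
    linearly (an illustrative simplification).
    """
    return per_vm_fraction * vm_count


def is_bandwidth_bound(per_vm_fraction: float, vm_count: int) -> bool:
    """True when the VMs together ask for more than the host can deliver."""
    return aggregate_bandwidth_demand(per_vm_fraction, vm_count) > 1.0


# The example from the text: one instance needs 20% of peak bandwidth.
demand = aggregate_bandwidth_demand(0.20, 6)
print(f"Six VMs request about {demand:.0%} of peak bandwidth")
print("Bottlenecked:", is_bandwidth_bound(0.20, 6))
```

Under these assumptions, six VMs ask for roughly 120% of what the memory subsystem can deliver, so an application that was nowhere near bandwidth-limited natively becomes bandwidth-limited once consolidated.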
To fully understand this, it helps a lot if you read our Hardware Virtualization: the nuts and bolts article. Indeed, some applications run with negligible performance impact inside a virtual machine while others are tangibly slower in a virtualized environment. To get a rough idea of which group your application belongs to, a relatively easy rule of thumb can be used: how much time does your application spend in user mode, and how much time does it need help from the kernel? The kernel performs three kinds of tasks for user applications:
- System calls (File system, process creation, etc.)
- Interrupts (Accessing the disks, NICs, etc.)
- Memory management (e.g. allocating memory for buffers)
The more work your kernel has to perform for your application, the higher the chance that the hypervisor will need to work hard as well. If your application writes a small log after spending hours crunching numbers, it should be clear it's a typical (almost) "user mode only" application. The prime example of a "kernel intensive" application is an intensively used transactional database server that gets lots of requests from the network (interrupts, system calls), has to access the disks often (interrupts, system calls), and has buffers that grow over time (memory management).
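As a rough illustration of this rule of thumb, the sketch below uses Python's standard `os.times()` to split a workload's CPU time into user and system (kernel) time. The helper `kernel_time_share` and the two sample workloads are hypothetical, and the measurement only shows time spent in the guest or native kernel; it does not measure hypervisor time directly.

```python
import os


def kernel_time_share(workload) -> float:
    """Run workload() and return the fraction of its CPU time spent in the kernel."""
    before = os.times()
    workload()
    after = os.times()
    user = after.user - before.user
    system = after.system - before.system
    total = user + system
    # Very short workloads can fall below the clock resolution.
    return system / total if total > 0 else 0.0


def crunch_numbers():
    """Almost pure user-mode work: arithmetic with no I/O."""
    sum(i * i for i in range(2_000_000))


def hammer_the_kernel():
    """System-call-heavy work: repeatedly ask the kernel for file metadata."""
    for _ in range(200_000):
        os.stat(".")


for name, workload in [("crunch_numbers", crunch_numbers),
                       ("hammer_the_kernel", hammer_the_kernel)]:
    print(f"{name}: {kernel_time_share(workload):.0%} of CPU time in the kernel")
```

A workload with a high kernel share is a candidate for the "kernel intensive" group described above; a workload with a share near zero is closer to the "user mode only" case. On a live server, tools such as `top` or `vmstat` report the same user/system split for the whole machine.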
However, a "user mode only" application can still lose a lot of performance in a virtualized environment in some situations:
- Oversubscribing: you assign more virtual CPUs to the virtual machines than there are physical cores available. (This is a very normal and common way to get more out of your virtualized server.)
- Cache Contention: your application demands a lot of cache and the other virtualized applications do as well.
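To put a number on oversubscription, one can compare the virtual CPUs handed out with the physical cores in the host. The following is a minimal sketch with a hypothetical helper name and made-up VM sizes; it says nothing about how a particular hypervisor schedules those virtual CPUs.

```python
def oversubscription_ratio(vcpus_per_vm, physical_cores: int) -> float:
    """Ratio of assigned virtual CPUs to physical cores; above 1.0 means oversubscribed."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


# Illustrative only: four VMs with four virtual CPUs each on an eight-core host.
ratio = oversubscription_ratio([4, 4, 4, 4], physical_cores=8)
print(f"Oversubscription ratio: {ratio:.1f}")
```

A ratio of 2.0 here means that, when all VMs are busy at once, each physical core has to be time-sliced between two virtual CPUs, so even a pure user-mode application can slow down.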
These kinds of performance losses are relatively easy to minimize. You could buy CPUs with larger caches, and dedicate some of the physical cores to certain cache- or CPU-hungry applications by setting CPU affinity. The other, less intensive applications would then share the remaining CPU cores. In this article, we will focus on the more sensitive workloads out there: those that do quite a bit of I/O (and thus cause interrupts), need large memory buffers, and therefore talk to the kernel a lot. This way we can really test the virtualization capabilities of the servers.