Sharing Cache and Memory Resources

In a virtualized environment, the hosted VMs are sharing both the CPU caches and the overall DRAM memory bandwidth. One cache-hungry application can quickly hog most of the shared L3 caches, and a bandwidth intensive one can do the same with the available and shared memory bandwidth. These VMs create the "noisy neighbor" problem. That is bad news for anyone consolidating a lot of VMs on top of a Xeon server, but it is complete show stopper for telco and other scenarios where service providers want to guarantee "Quality-of-Service" (QoS) and thus predictable latency. For Intel this is a notable scenario to address, as the telco market is one of the few markets where the Xeons still have some room to grow. Many telco applications still run on proprietary boxes, which makes virtualization a tantalizing option if Intel can deliver the necessary latency. 

Haswell had already some features to monitor cache usage, which in turn allowed you to identify the noisy neighbors. However the "Resource Director Technology" (RDT) of Broadwell can do a lot more. 

RDT can not only monitor L3 cache usage and memory bandwidth, but it can also allocate L3-cache space on a per thread/process/virtual machine basis. Threads are assigned a Resource Monitoring ID. Eight of these RMID are available per core/cache slice. Sixteen different classes of service can be assigned to an RMID: higher priority threads/applications can get a higher class, and thus a larger portion of the L3-cache. 

Intel has already demonstrated an application that made use of these new MSRs to read out memory bandwidth and L3 cache consumption on different levels. 

Broadwell Architecture Improvements TSX and Faster Virtualization
Comments Locked

112 Comments

View All Comments

  • petar_b - Saturday, August 27, 2016 - link

    Thanks Phil_Oracle, Brutalizer and Anand for this discussion. I have learned a lot from reading different opinions. I am working with IBM and Oracle software products, and from my small experience, Xeons are pathetic when compared to POWER or SPARC. To do same operation at home Xeon it takes 10x more time than what it takes the corporate server to do. I have double memory than corporate server and yet no help from it.
  • someonesomewherelse - Thursday, September 1, 2016 - link

    Btw how locked down are these Xeons and their motherboards in regards to overclocking? Assuming you could provide enough power and cooling could you reach a decent overclock? Obviously nobody is going to do that for mission crittical servers/workstations, but if I had too much money could I get a quad or octa core system with as much cores possible and at least try to overclock them?

Log in

Don't have an account? Sign up now