The Final Piece of the Puzzle: SR-IOV

The final step is to add a few buffers and Rx/Tx descriptors to each queue of your multi-queued device, and a single NIC can pretend to be a collection of tens of “small” NICs. That is exactly what the PCI SIG did, and they call each of those small NICs a virtual function (VF). According to the PCI SIG SR-IOV specification, you can have up to 256 (!) virtual functions per NIC. (Note that the SR-IOV specification is not limited to NICs; other I/O devices can be SR-IOV capable too.)
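
As a concrete illustration of how those virtual functions surface on the host, here is a minimal Python sketch that enumerates (and optionally enables) VFs through the Linux sysfs attributes sriov_totalvfs and sriov_numvfs. Those attributes are a newer kernel interface than the driver module parameters that were common when this article was written, and the PCI address below is just a placeholder.

    from pathlib import Path

    # Placeholder PCI address of an SR-IOV capable NIC; adjust for your system.
    PF_ADDR = "0000:01:00.0"
    PF_DIR = Path("/sys/bus/pci/devices") / PF_ADDR

    def sriov_info(pf_dir):
        """Return (total_vfs, enabled_vfs) for a physical function."""
        total = int((pf_dir / "sriov_totalvfs").read_text())
        enabled = int((pf_dir / "sriov_numvfs").read_text())
        return total, enabled

    def enable_vfs(pf_dir, count):
        """Ask the PF driver to create `count` virtual functions (needs root)."""
        (pf_dir / "sriov_numvfs").write_text(str(count))

    if __name__ == "__main__":
        total, enabled = sriov_info(PF_DIR)
        print(f"{PF_ADDR}: {enabled}/{total} virtual functions enabled")
        # Every enabled VF appears as its own PCI device behind a virtfn* symlink.
        for vf in sorted(PF_DIR.glob("virtfn*")):
            print(f"  {vf.name} -> {vf.resolve().name}")

Each VF that shows up this way is a real PCI device in its own right, which is what allows it to be handed to a VM directly.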


Image courtesy of the excellent YouTube video "Intel SR-IOV"

Make sure there is a chipset with IOMMU/VT-d inside the system. The end result: each of those virtual functions can DMA packets in and out without any help from the hypervisor. The CPU no longer has to copy packets from the memory space of the NIC to the memory space of the VM. The VT-d/IOMMU-capable chipset ensures that the DMA transfers of the different virtual functions do not interfere with each other. The beauty is that the VMs connect to these virtual functions through a standard paravirtualized driver (such as VMXNET in VMware), and as a result you should be able to migrate VMs without any trouble.
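
Whether the platform really provides that isolation is easy to verify on a Linux host: an active IOMMU exposes its isolation domains under /sys/kernel/iommu_groups. The small Python sketch below simply lists them; it is purely illustrative and not tied to the test setup in this article.

    from pathlib import Path

    IOMMU_ROOT = Path("/sys/kernel/iommu_groups")

    def list_iommu_groups():
        """Map each IOMMU group to the PCI devices it isolates from the rest."""
        groups = {}
        if not IOMMU_ROOT.is_dir():
            return groups
        for group_dir in sorted(IOMMU_ROOT.iterdir(), key=lambda p: int(p.name)):
            devices = sorted(dev.name for dev in (group_dir / "devices").iterdir())
            groups[group_dir.name] = devices
        return groups

    if __name__ == "__main__":
        groups = list_iommu_groups()
        if not groups:
            print("No IOMMU groups found; VT-d/AMD-Vi is probably disabled.")
        for number, devices in groups.items():
            print(f"group {number}: {', '.join(devices)}")

If the listing comes back empty, DMA remapping is not active and assigning a virtual function directly to a VM will not work.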

There you have it: all the puzzle pieces are in place. Multiple queues, virtual-to-physical address translation for DMA transfers, and a multi-headed NIC offer you higher throughput, lower latency, and lower CPU overhead than emulated hardware. At the same time, they retain the two advantages that made emulated hardware so popular in virtualization: the ability to share one hardware device across several VMs and the ability to decouple the virtual machine from the underlying hardware.

SR-IOV Support

Of course, this is all theory until all software and hardware layers work together to support it. You need a VT-d or IOMMU-capable chipset, the motherboard's BIOS has to be adapted to recognize all those virtual functions, and each virtual function must get memory-mapped I/O space like other PCI devices. A hypervisor that supports SR-IOV is also necessary. Last but not least, the NIC vendor has to provide you with an SR-IOV capable driver for the operating system and hypervisor of your choice.
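
One quick sanity check on the hardware side is to look for the SR-IOV extended capability (ID 0x0010) in the NIC's PCI configuration space. The Python sketch below is one way to do that: it walks the PCIe extended capability list through sysfs. Note that reading past the first 256 bytes of config space usually requires root, and the PCI address is again a placeholder.

    import struct
    from pathlib import Path

    SRIOV_CAP_ID = 0x0010   # PCI Express extended capability ID for SR-IOV
    EXT_CAP_START = 0x100   # extended capabilities begin at offset 0x100

    def has_sriov_capability(pci_addr):
        """Walk the PCIe extended capability list looking for SR-IOV (0x0010)."""
        cfg = (Path("/sys/bus/pci/devices") / pci_addr / "config").read_bytes()
        offset = EXT_CAP_START
        while 0 < offset <= len(cfg) - 4:
            (header,) = struct.unpack_from("<I", cfg, offset)
            cap_id = header & 0xFFFF           # bits 0-15: capability ID
            offset = (header >> 20) & 0xFFF    # bits 20-31: next capability
            if cap_id == SRIOV_CAP_ID:
                return True
        return False

    if __name__ == "__main__":
        addr = "0000:01:00.0"  # substitute your NIC's PCI address
        print(f"{addr} SR-IOV capable: {has_sriov_capability(addr)}")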

With some help from mighty Intel, the open-source hypervisors (Xen, KVM) and their commercial derivatives (Red Hat, Citrix) were first to market with SR-IOV. By the end of 2009, both Xen and KVM had support for SR-IOV, more specifically for Intel's 82599 10G Ethernet controller, which can offer up to 64 VFs. Citrix announced support for SR-IOV in XenServer 5.6, so the only ones missing in action are VMware's ESX and Microsoft's Hyper-V.

Comments

  • fr500 - Wednesday, November 24, 2010 - link

    I guess there is LACP or PAgP and some proprietary solution.

    A quick google told me it's called cross-module trunking.
  • mlambert - Wednesday, November 24, 2010 - link

    FCoE, iSCSI (not that you would, but you could), FC, and IP all across the same link. Cisco offers VCP LACP with CNA as well. 2 links per server, 2 links per storage controller; that's not many cables.
  • mlambert - Wednesday, November 24, 2010 - link

    I meant VPC and Cisco is the only one that offers it today. I'm sure Brocade will in the near future.
  • Zok - Friday, November 26, 2010 - link

    Brocade's been doing this for a while with the Brocade 8000 (similar to the Nexus 5000), but their new VDX series takes it a step further for FCoE.
  • Havor - Wednesday, November 24, 2010 - link

    These network adapters are really nice for servers, but I don't need a managed NIC; I just really want affordable 10Gbit over UTP or STP.

    Even if it's only 30~40 m / 100 ft, because just like with 100Mbit networks in the old days, my HDs are more than a little bit outperforming my network.

    Wondering when 10Gbit will become common on mobos.
  • Krobar - Thursday, November 25, 2010 - link

    Hi Johan,

    Wanted to say nice article first of all, you pretty much make the IT/Pro section what it is.

    In the descriptions of the cards and the conclusion you didn't mention Solarflare's "Legacy" Xen netfront support. This only works for paravirtualized Linux VMs and requires a couple of extra options at kernel compile time, but it runs like a train and requires no special hardware support from the motherboard at all. None of the other brands support this.
  • marraco - Thursday, November 25, 2010 - link

    I once made a summary of the total cost of the network in the building where I work.

    The total cost of the network cables was far larger than the cost of the equipment (at least at my country's prices). Also, solving any cable-related problem was a complete hell. There were hundreds of cables, all entangled above the false ceiling.

    I would happily replace all that with two or three cables with cheap switches at the end. Selling the cables would pay for the new equipment and even turn a profit.

    Each computer has its own cable to the central switch. A crazy design.
  • mino - Thursday, November 25, 2010 - link

    If you go 10G for cable consolidation, you'd better forget about cheap switches.

    The real savings are in the manpower, not the cables themselves.
  • myxiplx - Thursday, November 25, 2010 - link

    If you're using a Supermicro Twin2, why don't you use the option for the on board Mellanox ConnectX-2? Supermicro have informed me that with a firmware update these will act as 10G Ethernet cards, and Mellanox's 10G Ethernet range has full support for SR-IOV:

    Main product page:
    http://www.mellanox.com/content/pages.php?pg=produ...

    Native support in XenServer 5:
    http://www.mellanox.com/content/pages.php?pg=produ...
  • AeroWB - Thursday, November 25, 2010 - link

    Nice Article,

    It is great to see more tests around virtual environments. What surprises me a little bit is that at the start of the article you say that ESXi and Hyper-V do not support SR-IOV yet, so I was kind of expecting a test with Citrix XenServer to show the advantages of that. Unfortunately it's not there; I hope you can do that in the near future.
    I work with both VMware ESX and Citrix XenServer; we have a live setup of both. We started with ESX and later added a XenServer system, but as XenServer is getting more mature and gains more and more features, we will probably replace the ESX setup with XenServer (as it is much, much cheaper) when maintenance runs out in about one year, so I'm really interested in tests on that platform.
