Benchmark Configuration

We used a point-to-point configuration to eliminate the need for a switch: one machine serves as the other "end of the network," and on the second machine we measure throughput and CPU load. We used Ixia IxChariot to test network performance.
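
IxChariot is a commercial test suite; purely as a rough illustration of the kind of point-to-point throughput measurement involved, here is a minimal Python sketch. The port number, buffer size and duration are arbitrary example values (not IxChariot settings), and it only measures raw TCP throughput, not the CPU load side. You would run receiver() on the first server and sender() with the first server's address on the second.

```python
# Minimal point-to-point TCP throughput check: run receiver() on the machine at
# the "other end of the network" and sender() on the machine under test.
# Port, chunk size and duration are arbitrary example values.
import socket
import time

PORT = 5001
CHUNK = 1 << 20      # 1 MiB per send
DURATION = 10        # seconds of sending

def receiver() -> None:
    """Accept one connection, count received bytes, report throughput."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("0.0.0.0", PORT))
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            total = 0
            start = time.time()
            while True:
                data = conn.recv(CHUNK)
                if not data:
                    break
                total += len(data)
            elapsed = time.time() - start
            print(f"{total * 8 / elapsed / 1e9:.2f} Gbit/s over {elapsed:.1f} s")

def sender(host: str) -> None:
    """Push zero-filled buffers at the receiver for DURATION seconds."""
    payload = b"\0" * CHUNK
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, PORT))
        deadline = time.time() + DURATION
        while time.time() < deadline:
            s.sendall(payload)
```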

Server One ("the other end of the network"):
Supermicro SC846TQ-R900B chassis
Dual Intel Xeon 5160 “Woodcrest” at 3GHz
Supermicro X7DBN Rev1.00 Motherboard
Intel 5000P (Blackford) Chipset
4x4GB Kingston DDR2-667 ValueRAM CAS 5
BIOS version 03/20/08

Server Two (for measurements):
Supermicro A+ 2021M-UR+B chassis
Dual AMD Opteron 8389 “Shanghai” at 2.9GHz
Supermicro H8DMU+ Motherboard
NVIDIA MCP55 Pro Chipset
8x2GB Kingston DDR2-667 ValueRAM CAS 5
BIOS version 080014 (12/23/2009)

NICs

Both servers were equipped with the following NICs:

  • Two dual-port Intel PRO/1000 PT Server adapters (82571EB), four ports in total
  • One Supermicro AOC-STG-I2 dual-port 10Gbit/s Intel 82598EB
  • One Neterion Xframe-E 10Gbit/s

We tested the NICs under CentOS 5.4 x64 (kernel 2.6.18) and VMware ESX 4 Update 1.

Important note: the NICs used are not the latest and greatest. For example, Neterion already has a more powerful 10Gbit NIC out, the Xframe 3100. We tested with what we had available in our labs.

Drivers CentOS 5.4
Neterion Xframe-E: 2.0.25.1
Supermicro AOC-STG-I2 dual-port: 2.0.8-k3, 2.6.18-164.el5
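
As a side note, one quick way to confirm which driver and version an interface actually loaded under CentOS is to parse the output of `ethtool -i`; the small Python 3 wrapper below is just that, and the interface name "eth2" is only a placeholder.

```python
# Report the driver, driver version and firmware of a network interface by
# parsing `ethtool -i <iface>` (requires ethtool to be installed).
import subprocess

def driver_info(iface: str) -> dict:
    out = subprocess.run(["ethtool", "-i", iface],
                         capture_output=True, text=True, check=True).stdout
    return dict(line.split(": ", 1) for line in out.splitlines() if ": " in line)

if __name__ == "__main__":
    info = driver_info("eth2")   # example interface name; adjust to your setup
    print(info.get("driver"), info.get("version"), info.get("firmware-version"))
```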

Drivers ESX 4 Update 1 b208167
Neterion Xframe-E: vmware-esx-drivers-net-s2io-400.2.2.15.19752-1.0.4.00000
Supermicro AOC-STG-I2 dual-port: vmware-esx-drivers-net-ixgbe-400.2.0.38.2.3-1.0.4.164009
Intel PRO/1000 PT Server adapter: vmware-esx-drivers-net-e1000e-400.0.4.1.7-2vmw.1.9.208167

Comments

  • RequiemsAllure - Tuesday, March 9, 2010 - link

    So, basically, what these cards are doing (figuratively speaking) is taking in ("multiplexing") 8 or 16 requests (or however many virtual queues there are) on a single NIC and sorting (demultiplexing) them to the respective VM; the VM then takes care of the request and sends it on its way.

    Can anyone tell me if I got this right?
  • has407 - Wednesday, March 10, 2010 - link

    Yes, I think you've got it... that's pretty much how it works. At the risk of oversimplifying: these cards are like a multi-port switch with 10GbE uplinks.

    Consider a physical analog (depending on the card, and not exact but close enough): 8/16x 1GbE ports on the server connected to a switch with 8/16x 1GbE ports and 1/2x 10GbE uplinks to the backbone.

    Now replace that with a card on the server and 1/2x 10GbE backbone ports. Port/switch/cable consolidation ratios of 8:1 or 16:1 can save serious $$$ (and with better/dynamic bandwidth allocation). A rough sketch of the sorting step follows at the end of this comment.

    The typical sticking point is that 10GbE switches/routers are still quite expensive, and unless you've got a critical mass of 10GbE, the infrastructure cost can be a tough hump to get over.
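
    As that rough sketch of the sorting step, here is a minimal Python illustration of demultiplexing incoming frames into per-VM receive queues by destination MAC; the queue names and MAC addresses are invented examples, and a real VMDq-capable NIC does this in hardware rather than in software.

    ```python
    # Conceptual sketch: look at the destination MAC of each incoming frame and
    # drop it into the receive queue assigned to that VM, so the hypervisor no
    # longer has to sort frames in software. All addresses/queues are made up.
    from collections import defaultdict

    QUEUE_FOR_MAC = {            # one receive queue per VM's virtual NIC
        "00:50:56:00:00:01": "vm1_rx",
        "00:50:56:00:00:02": "vm2_rx",
        "00:50:56:00:00:03": "vm3_rx",
    }
    queues = defaultdict(list)

    def classify(frame: dict) -> None:
        """Place the frame in the queue of the VM that owns its destination MAC."""
        queues[QUEUE_FOR_MAC.get(frame["dst_mac"], "default_rx")].append(frame)

    # Example: three frames arriving on the 10GbE port get spread across queues.
    for f in [{"dst_mac": "00:50:56:00:00:02", "payload": b"a"},
              {"dst_mac": "00:50:56:00:00:01", "payload": b"b"},
              {"dst_mac": "00:50:56:00:00:99", "payload": b"c"}]:
        classify(f)
    print({name: len(q) for name, q in queues.items()})
    ```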
  • LuxZg - Tuesday, March 9, 2010 - link

    I've got to admit that I've skimmed through the article (and the first page and a half of comments).. But it seems from your testing & numbers that you haven't used a dedicated NIC for every card in the 4x 1Gbit example (4 VMs test), otherwise you'd get lower CPU numbers simply because you skip the load scheduling that's done on the CPU.

    Any "VM expert" will tell you that you have 3 basic bottlenecks in any VM server:
    - RAM (the more the better, mostly not a problem)
    - disks (again, more is better, and the absolute minimum is at least one drive per VM)
    - NICs

    For NICs the basic rule would be: if a VM is loaded with a network-heavy application, then that VM should have a dedicated NIC. CPU utilization drops heavily, and NIC utilization is higher.

    Having one 10Gbit NIC shared among 8 VMs which are all bottlenecked by NICs means you get your 35% CPU load. With one NIC dedicated to each VM you'd have CPU load near zero for file-copy loads (the NIC has a hardware scheduler, and the disk controller has the same for HDDs).

    Like I've said, maybe I've overlooked something in the article, but it seems to me your tests are based on wrong assumptions. Besides, if you've got 8 file servers as VMs, you've got unnecessary overhead as well; it's one application (file serving), so there's no need to virtualize it into 8 VMs on the same hardware.

    As a conclusion, VMs are all about planning, so I believe your test had a wrong approach.
  • JohanAnandtech - Tuesday, March 9, 2010 - link

    "a dedicated NIC for every VM"

    That might be the right approach when you have a few VMs on the server, but it does not seem reasonable when you have tens of VMs running. What do you mean by dedicating? Pass-through? Port grouping? Only pass-through has near-zero CPU load AFAIK, and I don't see many scenarios where pass-through is handy.

    Also, if you use dedicated NICs for network-intensive apps, that means you cannot use that bandwidth for the occasional spike in another "non-NIC-privileged" VM.

    It might not be feasible at all if you use DRS or Live migration.

    The whole point of VMDq is to offer the bandwidth necessary to the VM that needs it (for example, give one VM 5 Gbit/s, one VM 1 Gbit/s and the others only 1 Mbit/s) and to keep the layer 2 routing overhead mostly on the NIC. It seems to me that the planning you promote is very inflexible, and I can see several scenarios where dedicated NICs will perform worse than one big pipe which can be load balanced across the different VMs.
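
    To make that contrast concrete, here is a toy Python sketch of demand-based sharing of a single 10 Gbit/s link; the VM names and demand figures are invented, and a real scheduler (VMDq plus the hypervisor's traffic management) is far more sophisticated than this proportional scale-down.

    ```python
    # Toy illustration of sharing one 10 Gbit/s pipe by demand instead of nailing
    # each VM to a fixed physical port. All demands below are invented examples.
    LINK_GBPS = 10.0

    def allocate(demand_gbps: dict) -> dict:
        """Give each VM what it asks for; if total demand exceeds the link rate,
        scale every VM back proportionally."""
        total = sum(demand_gbps.values())
        scale = min(1.0, LINK_GBPS / total) if total > 0 else 0.0
        return {vm: round(gbps * scale, 3) for vm, gbps in demand_gbps.items()}

    # Quiet period: one VM bursts to 5 Gbit/s while the others barely use the link.
    print(allocate({"vm1": 5.0, "vm2": 1.0, "vm3": 0.001, "vm4": 0.001}))
    # Busy period: total demand is 18 Gbit/s, so everyone is scaled back to fit.
    print(allocate({"vm1": 8.0, "vm2": 6.0, "vm3": 2.0, "vm4": 2.0}))
    ```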
  • LuxZg - Wednesday, March 10, 2010 - link

    Yes, I meant "dedicated" as "pass-through".

    Yes, there are several scenarios where "one big pipe" is better than several small ones, but think about whether 35% CPU load (and that's 35% of a very expensive CPU) is worth the sacrifice just to have a reserve for a few occasional spikes.

    I do agree that putting several VMs on one NIC is OK, but that's for applications that aren't loaded with heavy network transfers. VM load balancing should be done, for example, like this (just a stupid example, don't hold onto it too hard):
    - you have file server as one VM
    - you have mail server on second VM
    - you have some CPU-heavy app on separate VM

    File server is heavy on networking and the disk subsystem, but puts almost no load on RAM/CPU. Mail server depends on several variables (antispam, antivirus, number of mailboxes & incoming mail, etc.), so it can be a light-to-heavy load on all subsystems; for this example let's say it's a lighter kind of load. Let's say this hardware machine has 2 NICs. You've got a few CPUs with multiple cores, and plenty of disk/RAM. So what's the right thing to do? Add a CPU-intensive VM, so that the CPU isn't idle too much. You dedicate one NIC to the file server, and you let the mail server share a NIC with the CPU-intensive VM. That way the file server has enough bandwidth without taxing the CPU to 35% because of stupid virtual routing of huge amounts of network packets, the CPU is left mostly free for the CPU-intensive VM, and the mail server happily lives in between the two, as it will be satisfied with the leftover CPU and networking.

    Now scale that to 20-30 VMs, and all you need is 10 NICs. VMs that aren't network dependent go on "shared NICs", and for network-intensive apps you give those VMs a dedicated NIC (a toy sketch of this dedicate-or-share rule follows at the end of this comment).

    Just remember: 35% of a multi-socket & multi-core server is a huge expense when you can do the same job on a dedicated NIC. A NIC is, was, and will be much more cost effective at network packet scheduling than a CPU. Why pay several thousand $$$ for a CPU if all you need is another NIC?
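
    As that toy sketch of the dedicate-or-share rule, here is a short Python illustration; the threshold, NIC count and VM names are invented examples, not recommendations.

    ```python
    # Toy planner: VMs above a network-load threshold get a dedicated NIC,
    # everything else is spread round-robin across a pool of shared NICs.
    # All numbers and names are made-up examples.
    HEAVY_GBPS = 0.8     # above this, the VM gets its own NIC
    SHARED_NICS = 2      # NICs left over for the light VMs

    def plan(vm_network_gbps: dict) -> dict:
        """Return a {vm: nic} assignment using the dedicate-or-share rule."""
        assignment, shared_idx = {}, 0
        for vm, gbps in sorted(vm_network_gbps.items(), key=lambda kv: -kv[1]):
            if gbps >= HEAVY_GBPS:
                assignment[vm] = f"dedicated-nic-{vm}"
            else:
                assignment[vm] = f"shared-nic-{shared_idx % SHARED_NICS}"
                shared_idx += 1
        return assignment

    print(plan({"fileserver": 0.9, "mailserver": 0.2, "cpu-heavy-app": 0.05}))
    ```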
  • LuxZg - Tuesday, March 9, 2010 - link

    I hate my own typos.. 2nd sentence.. "dedicated NIC for every VM", not "for every card".. there's probably more nonsense.. I'm in a hurry, sorry ppl!
  • anakha32 - Tuesday, March 9, 2010 - link

    All the new 10G kit appears to be coming with SFP+ connectors. They can be used either with a transceiver for optical, or with a pre-terminated copper cable (known as 'SFP+ Direct Attach').
    CX4 seems to be deprecated, as the cables are quite big and cumbersome.
  • zypresse - Tuesday, March 9, 2010 - link

    I've seen some mini-clusters (3-10 machines) lately with Ethernet interconnects. Although I doubt that this is the best solution, it would be nice to know how 10G Ethernet actually performs in that area.
  • Calin - Tuesday, March 9, 2010 - link

    I don't find a power use of <10W for a 10Gb link such a bad trade-off compared to 0.5W per 1Gb Ethernet link (assuming you can use that 10Gb link at close to maximum capacity): eight 1Gb ports at 0.5W each draw roughly 4W for 8Gbit/s of aggregate bandwidth, versus under 10W for 10Gbit/s. If nothing else, you're trading two 4-port 1Gb network cards for one 10Gb card.
  • MGSsancho - Tuesday, March 9, 2010 - link

    Sun's 40Gb/s adapters are not terribly expensive (starting at $1,500); apparently they support 8 virtual lanes? So Mellanox provides Sun their silicon. I went to their site and they do have other silicon/cards that explicitly state they support Virtual Protocol Interconnect; I'm curious if this is the same thing. I know you stated that the need really isn't there, but it would be interesting to see if you can ask for testing samples or look into the viability of InfiniBand. Looking at their partners page, they provide the silicon for Xsigo, as a previous poster stated. Again, it would be nice to see whether 40Gb InfiniBand, with and without VPI, is superior to 10Gb Ethernet with the acceleration you showed us today. For SANs, anything that lowers latency for iSCSI is desired. Perhaps spending a little for reduced latency on the network layer makes it worth the extra price for faster transactions? So many possibilities! Thank you for all the insightful research you have provided us!
