Network Performance in ESX 4.0 Update 1

We used up to eight VMs, and each was assigned an “endpoint” in the Ixia IxChariot network test. This way we could measure the total network throughput achievable with one, four or eight VMs. With ESX NetQueue, the cards should be able to leverage their separate queues and the hardware Layer 2 “switch”.

First, we test with NetQueue disabled. Each card then behaves like a NIC with only a single Rx/Tx queue. To make the comparison more interesting, we added two dual-port gigabit NICs to the benchmark mix. Teamed NICs are currently by far the most cost-effective way to increase network bandwidth.

NIC performance on ESX 4.0, no NetQueue

The 10G cards show their potential. With four VMs, they are able to achieve 5 to 6Gbit/s. There is clearly a queue bottleneck: both 10G cards perform worse with eight VMs. Notice also that 4x1Gbit does very well. This combination has more queues and can cope well with the different network streams. Out of a maximum line speed of 4Gbit/s, we achieve almost 3.8Gbit/s with four and eight VMs. Now let's look at CPU load.

CPU load on ESX 4.0, no NetQueue

Once you need more than 1Gbit/s, you should pay attention to the CPU load. Four gigabit ports or one 10G port require 25-35% utilization of eight 2.9GHz Opteron cores. That means you would need two to three cores dedicated just to keeping the network pipes filled.
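
To make that arithmetic explicit, here is a quick back-of-the-envelope check (a small Python sketch; the 25-35% figures are read from the graph above):

    # How many of the eight 2.9GHz Opteron cores are busy just moving
    # network traffic? Utilization figures are taken from the graph.
    total_cores = 8

    for utilization in (0.25, 0.35):
        cores_busy = utilization * total_cores
        print(f"{utilization:.0%} of {total_cores} cores = {cores_busy:.1f} cores")

    # 25% of 8 cores = 2.0 cores
    # 35% of 8 cores = 2.8 cores

Let us see if NetQueue can do some magic.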

NIC performance on ESX 4.0, NetQueue enabled

The performance of the Neterion card improves a bit, but it's not really impressive (+8% in the best case). The Intel 82598EB chip on the Supermicro 10G NIC now achieves 9.5Gbit/s with eight VMs, very close to the theoretical maximum. The 4x1Gbit/s NIC numbers were repeated in this graph for reference (no NetQueue was available).

So how much CPU power did these huge network streams absorb?

CPU load on ESX 4.0, NetQueue enabled

The Neterion driver does not seem to be optimized for ESX 4: using NetQueue should lower CPU load, not increase it. The Supermicro/Intel 10G combination shows the way. It delivers twice as much bandwidth at half the CPU load compared to the two dual-port gigabit NICs.
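
Put another way, the efficiency gap can be expressed as throughput per unit of CPU load. A tiny sketch using only the relative factors stated above (the absolute numbers are in the graphs):

    # "Twice the bandwidth at half the CPU load" works out to roughly
    # four times as much data moved per percent of CPU.
    bandwidth_factor = 2.0   # 10G NIC vs. two dual-port gigabit NICs
    cpu_load_factor = 0.5

    print(f"Throughput per unit of CPU load: {bandwidth_factor / cpu_load_factor:.0f}x")  # 4x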

Comments

  • krazyderek - Monday, March 8, 2010 - link

    Furthermore, where do the upgrades stop? Dual NICs are common on workstations, but you can also get triple and quad NICs built in, or as add-in cards. Where do you stop?

    Maybe I'm looking for an answer to a question that doesn't have a clear-cut answer; it's just a balancing act, and you have to balance performance against how much you have to spend.

    If you upgrade the server to remove it as a bottleneck, then your clients become the bottleneck. If you team up enough client NICs, then your server becomes the bottleneck again. If you upgrade the server with a PCIe solid state drive like the Fusion-io and several 10Gb connections, then your clients and your switch start to become the bottleneck, and on and on...
  • Kjella - Tuesday, March 9, 2010 - link

    If you use "IT", "upgrades" and "end" in the same post, well... it doesn't end. It ends the day megacorporations can run off a handful of servers, which is never, because the requirements keep going up. Take your HDD bottleneck, for example: install an SSD array that can run tens (hundreds?) of thousands of IOPS at several Gbit/s, and something else becomes the bottleneck. It's been this way for decades.
  • has407 - Tuesday, March 9, 2010 - link

    You stop when you have enough performance to meet your needs. How much is that? Depends on your needs. Where's the bottleneck? A bit of investigation will identify it.

    If you have a server serving a bunch of clients, and the server network performance is unacceptable, then increasing the number of 1GbE ports on the server is likely your best choice if you have expansion capability; if not, then port/slot consolidation using 10GbE may be appropriate. However, if server performance is limited by other factors (e.g., CPU/disk), then that's where you should focus.

    If you have clients hitting a server, and the client network performance is unacceptable (and the server performance is OK), then (in general) aggregating ports on the client won't get you much (if anything). In that case 10GbE on the client may be appropriate. However, if client performance is limited by other factors (e.g., CPU/disk), then that's where you should focus.

    Link aggregation works best when traffic is going to different sources/destinations, and is generally most useful on a server serving multiple clients (or between switches with a variety of endpoint IPs).

    4x 1GbE links != 1x 4GbE link. Link aggregation and load balancing across multiple links is typically based on source/destination IP. If those are the same, the packets follow the same link/path, and link aggregation won't buy you much: all packets from the same source/destination take the same path, which means they go over the same link, which means the speed is limited to that of the single fastest link. (Some implementations can also load balance based on source/destination port as well as IP, which may help in some situations.)

    That means that no matter how many 1GbE links you have aggregated on client or server, the speed for any given source/destination IP pair (and possibly port) will never exceed that of a single 1GbE link. (While there are ways to get around that, it's generally more trouble than it's worth and can seriously hurt performance.)
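
    To make that concrete, here is a toy sketch in Python of an 802.3ad-style hash policy (purely illustrative; the addresses are made up and no real switch uses exactly this hash):

        # Toy link selection: a flow is pinned to one physical link by
        # hashing its source/destination addresses (and optionally ports).
        NUM_LINKS = 4

        def pick_link(src_ip, dst_ip, src_port=None, dst_port=None):
            """Return the index of the physical link this flow is pinned to."""
            return hash((src_ip, dst_ip, src_port, dst_port)) % NUM_LINKS

        # One client talking to one server: every call returns the same
        # index, so that flow tops out at 1Gb/s no matter how many links
        # are teamed.
        print(pick_link("10.0.0.10", "10.0.0.1"))
        print(pick_link("10.0.0.10", "10.0.0.1"))  # same link as above

        # Many clients talking to the same server: the flows spread across
        # the links, which is why aggregation helps a server far more than
        # it helps any single client.
        for i in range(2, 10):
            client = f"10.0.0.{i}"
            print(client, "-> link", pick_link(client, "10.0.0.1"))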
  • krazyderek - Tuesday, March 9, 2010 - link

    I thought this is why you had to use link aggregation and NIC teaming in combination, giving the client and server one IP each across multiple Ethernet cables. That way, when a client with 2x NICs is doing, say, a sequential transfer from a server with 2x, 3x or 4x, you could get 240MB/s throughput if the storage systems can handle it on either end; but when a 2x client connects to another 1x client, you're limited by the slower of the two connections and thus only capable of 120MB/s max, which would open the door to still have another 120MB/s coming from another client at the same time.

    Maybe it's all this SSD talk as of late, but I just want to see some of those options and bottlenecks tackled in real life, and I just don't happen to have 5 or 10 SSDs kicking around to try it myself.
  • has407 - Wednesday, March 10, 2010 - link

    Link aggregation (LACP) == NIC teaming (cf. 802.3ad/802.1AX). Assigning different IPs will not get you anything unless higher layers are aware and capable (e.g., multipath, which can improve performance, but in my experience not by a lot--and it comes with overhead).

    Reordering Ethernet frames or IP packets can carry a heavy penalty--more than it's worth in many (most?) cases, which is why packets sent between any endpoint pair will (sans higher-order intervention) follow the same path. Endpoint pairs are typically based on IP, although some switches also use the port numbers (i.e., path == hash of IPs in the simple case, path == hash of IPs+ports in the more sophisticated case). Which is why you generally won't see performance exceed the fastest *single physical link* between endpoints (regardless of how many links you've teamed/aggregated), and which is why a single fast link can be better than teamed/aggregated links.

    E.g., team 4x 1GbE links on both client and server. You generally won't see more than 1Gb from the client to the server for any given xfer or protocol. If you run multiple xfers using different protocols (i.e., different ports) and you have a smart switch, you may see > 1Gb.

    In short, if you have a client with 4x 1Gb teamed/aggregated NICs, you won't see >1Gb for any IP/port pair, and probably not for any IP pair (depending on the switch/NIC smarts and how you've done your port aggregation/teaming) on the client, switch and server. Which again is why a single faster link is generally better than teaming/aggregation.

    There's a simple way for you to test it... fire up an xfer from a client with teamed NICs to a server with plenty of bandwidth. In most cases it will max out at the rate of the fastest single physical link on the client (or server). Fire up another xfer using the same protocol/port. In most cases the aggregate xfer will remain about the same (all packets are following the same path). If you see an increase, congratulations, your teaming is using both IP and port hashing to distribute traffic.
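
    A rough way to run that experiment is sketched below in Python (the server address and port are placeholders, and you need something on the server that accepts the connections and discards the data; IxChariot or iperf are obviously the more serious tools):

        # Open N parallel TCP streams to a server and report the aggregate
        # send rate. If the client's teamed NICs hash on IP only, the total
        # won't grow with N; if they also hash on port, it may.
        import socket
        import threading
        import time

        SERVER = "192.168.1.10"   # placeholder server address
        PORT = 5001               # placeholder port with a TCP sink behind it
        STREAMS = 2
        DURATION = 10             # seconds
        CHUNK = b"\0" * 65536

        sent = [0] * STREAMS

        def blast(idx):
            with socket.create_connection((SERVER, PORT)) as s:
                deadline = time.time() + DURATION
                while time.time() < deadline:
                    s.sendall(CHUNK)
                    sent[idx] += len(CHUNK)

        threads = [threading.Thread(target=blast, args=(i,)) for i in range(STREAMS)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        gbit = sum(sent) * 8 / DURATION / 1e9
        print(f"Aggregate: {gbit:.2f} Gbit/s over {STREAMS} stream(s)")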
  • Lord 666 - Monday, March 8, 2010 - link

    In the market for a new SAN for a server, in preparation for a consolidation/virtualization/headquarters move, one of my RFP requirements was 10GbE capability now. Some peers of mine have questioned this requirement, stating there is enough bandwidth with four EtherChanneled gigabit NICs and that FC would be the better option if that's not enough.

    Thank you for doing this write-up, as it confirms that my hypothesis is correct and 10GbE is/will be a valid requirement for the new gear from a very forward-looking view.

    It would be nice to see the same kit used in combination with SANs. With the constant churn of new gear, this would be very helpful.
  • has407 - Monday, March 8, 2010 - link

    Agree. Anyone who suggests FC is the answer today is either running on inertia or trying to justify a legacy FC investment/infrastructure.

    Build on 10Gbe if at all possible; if you need FC in places, look at FCoE.
  • mino - Tuesday, March 9, 2010 - link

    FC is a good solution where iSCSI over gigabit is not enough but 10Gbps, along with all its teething troubles, is just not worth it.
    FC is reliable, 4Gb FC is not overpriced, and it has none of the issues of iSCSI.
    It just works.

    Granted, for heavily loaded situations, especially on blades, 10G is the way to go.
    But for many medium loads, FC is often the simpler/cheaper option.
  • JohanAnandtech - Tuesday, March 9, 2010 - link

    Which issues of iSCSI exactly? And FC is still $600 per port if I am not mistaken?

    Not to mention that you need to bring a whole new kind of knowledge into your organisation, while iSCSI works with the UTP/Ethernet tech that every decent ITer knows.
  • mino - Tuesday, March 16, 2010 - link

    The issues with latency, reliability, multipathing etc. etc.

    Basically, the strongest points of iSCSI are the low up-front price and the single-infrastructure mantra.
    Optimal for small or scale-out deployments; not so much for mid-size projects.

    "... while iSCSI works with the UTP/Ethernet tech that every decent ITer knows ..."
    Sorry to break the news, but any serious IT shop has in-house FC storage experience going back a decade or more.

    1Gbps iSCSI is really not in competition with FC. It is an order of magnitude worse in latency and protocol overhead (read: low IOPS).

    Where the fun starts is 10G vs. FC, and serious 10G infrastructure is actually more expensive per port than FC.

    Where iSCSI over 10G shines is in the port consolidation opportunity.
    Not in bandwidth, not in latency, not in price per port.
