Technology behind the Killer NIC

We will not spend several pages and numerous charts explaining in absolute detail how the networking architecture and technology operate. Instead, we will provide a high-level technology overview, which should supply the basic information needed to show why there are advantages in offloading data packet processing from the CPU to a dedicated processing unit. Other technologies such as RDMA and Onloading are available, but in the interest of space and keeping our readers awake we will not detail these options.

The basic technology the Killer NIC utilizes has been in the corporate server market for a few years. One of the most prevalent technologies utilized, and the one our Killer NIC is based upon, is the TCP/IP Offload Engine (TOE). TOE technology (okay, that phrase deserves a laugh) is designed to offload all tasks associated with protocol processing from the main system processor and move them to TOE network interface cards (TNICs). TOE technology also includes software extensions to the existing TCP/IP stack within the operating system that enable the use of these dedicated hardware data planes for packet processing.

The process required to place data inside TCP/IP packets can consume a significant number of CPU cycles, depending upon the size of the packets and the amount of traffic. These dedicated cards have proven very effective in relieving the CPU of TCP/IP packet processing, resulting in greater system performance from the server. Offloading allows the system's CPU to recover lost cycles so that CPU-bound applications are no longer affected by TCP/IP processing. This technology is very beneficial in a corporate server or datacenter environment where there is a heavy volume of traffic that usually consists of large blocks of data being transferred, but does it really belong on your desktop, where the actual CPU overhead is generally minimal? Before we address this question we need to take a further look at how the typical NIC operates.

The standard NIC available today usually relies on software to process TCP/IP operations, which can create substantial system overhead depending upon the network traffic on the host machine. Typically the areas that create increased system overhead are data copies along with protocol and interrupt processing. When a NIC receives a typical data packet, a series of interactions with the CPU begins in order to handle the data and route it to the appropriate application. The CPU is first notified that a data packet is waiting, and generally the processor will read the packet header and determine the contents of the data payload. It then requests the data payload and, after verifying it, delivers it to the waiting application.
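
The sequence above (notify, read header, fetch and verify payload, deliver) can be sketched as a toy model in Python. This is only an illustration under our own assumptions: the 4-byte header layout, the `HostStack` class, and the port number are hypothetical and are not how any real TCP/IP stack or driver is written.

```python
import struct
from collections import defaultdict

# Hypothetical 4-byte header: destination port and payload length.
# This is NOT a real TCP header, just enough to show the steps.
HEADER = struct.Struct("!HH")


class HostStack:
    """Toy model of software packet processing on the host CPU."""

    def __init__(self):
        self.interrupts = 0                   # one per packet handed over by the NIC
        self.bytes_copied = 0                 # payload bytes the CPU had to copy
        self.app_queues = defaultdict(list)   # port -> payloads delivered to the app

    def on_packet(self, frame: bytes) -> bool:
        self.interrupts += 1                       # 1. the CPU is notified
        port, length = HEADER.unpack_from(frame)   # 2. the CPU reads the header
        payload = frame[HEADER.size:]              # 3. the CPU fetches the payload (a copy)
        if len(payload) != length:                 #    and verifies it
            return False
        self.bytes_copied += length
        self.app_queues[port].append(payload)      # 4. deliver to the waiting application
        return True


def make_frame(port: int, payload: bytes) -> bytes:
    """Build a frame using the hypothetical header above."""
    return HEADER.pack(port, len(payload)) + payload


host = HostStack()
for chunk in (b"hello", b"world"):
    host.on_packet(make_frame(27015, chunk))

print(host.interrupts, host.bytes_copied)
```

Every packet costs the host one notification plus a copy of its payload, which is the overhead the rest of this page is concerned with.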

These data packets are buffered or queued on the host system. Depending upon the size and volume of the packets this constant fetching of information can create additional delays due to memory latencies and/or poor buffer management. The majority of standard desktop NICs also incorporate hardware checksum support and additional software enhancements to help eliminate transmit-data copies. This is advantageous when combined with packet prioritization techniques to control and enhance outbound traffic with intelligent queuing algorithms.
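
To give a sense of the per-byte work that hardware checksum support takes off the CPU, here is a minimal Python sketch of the standard Internet checksum described in RFC 1071, which is the checksum used in IP, TCP, and UDP headers. A NIC performs this in silicon; the function name and the software loop are ours, purely for illustration.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        # Sum the data as big-endian 16-bit words.
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample bytes taken from the numerical example in RFC 1071.
sample = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(sample)))
```

Without offload, the host has to touch every byte of every packet in a loop like this one, on both transmit and receive.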

However, these same NICs cannot eliminate the receive-data copy routines that consume the majority of processor cycles in this process. A TNIC performs protocol processing on its dedicated processor before placing the data on the host system. TNICs will generally use zero-copy algorithms to place the packet data directly into the application buffers or memory. This routine bypasses the normal handshakes between the processor, NIC, memory, and application, greatly reducing system overhead, with the savings depending upon the packet size.
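
The difference between copying received data and placing it directly into an application buffer can be illustrated with a loose user-space analogy in Python. This is not how a TNIC works internally (a TNIC places data through its own hardware, and the operating system still copies data in both functions below); the two helper functions are our own names, and the sketch only shows the idea of removing an intermediate buffer.

```python
import socket


def receive_with_copy(sock, app_buffer: bytearray, offset: int, nbytes: int) -> int:
    """Conventional path: data lands in a temporary object, then is copied again."""
    chunk = sock.recv(nbytes)                        # intermediate buffer
    app_buffer[offset:offset + len(chunk)] = chunk   # extra copy into the app buffer
    return len(chunk)


def receive_in_place(sock, app_buffer: bytearray, offset: int, nbytes: int) -> int:
    """Direct placement: data is written straight into the application buffer."""
    view = memoryview(app_buffer)[offset:offset + nbytes]  # a window, not a copy
    return sock.recv_into(view)


# A connected socket pair stands in for the network in this sketch.
sender, receiver = socket.socketpair()
sender.sendall(b"payload-1payload-2")

buf = bytearray(18)
n1 = receive_with_copy(receiver, buf, 0, 9)
n2 = receive_in_place(receiver, buf, n1, 9)

sender.close()
receiver.close()
print(bytes(buf))
```

Both calls fill the same buffer, but the second never creates the temporary `chunk` object, which is the kind of saving a zero-copy receive path is after.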

Most corporate or data center networks deal with large data payloads that are typically 8 KB up to 64 KB in size (though we fully understand this can vary greatly). Our example will involve a 32 KB application packet receipt that usually results in thirty or more interrupt-generating events between the host system and a typical NIC. Each of these events is required to buffer the information, generate the data into Ethernet packets, process the incoming acknowledgements, and send the data to the waiting application. This process basically reverses itself if a reply is generated by the application and returned to the sender. The entire process can create significant protocol-processing overhead, memory latencies, and interrupt delays on the host system. We need to reiterate that our comments about "significant" system overhead are geared towards a corporate server or datacenter environment and not the typical desktop.

Depending upon the application and network traffic, a TNIC can greatly reduce the network transaction load on the host system by changing the transaction process from one event per Ethernet packet to one event per application network I/O. The 32 KB application packet process now becomes a single data-path offload process that moves all data packet processing to the TNIC. This eliminates the thirty or so interrupts along with the majority of system overhead required to process this single packet. In a data center or corporate server environment with large content delivery requirements to multiple users, the savings in system overhead due to network transactions can have a significant impact. In some instances replacing a standard NIC in the server with a TNIC has almost the same effect as adding another CPU. That's an impressive savings in cost and power requirements, but once again, is this technology needed on the desktop?
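
The gap between one event per Ethernet packet and one event per application I/O can be put into rough numbers. The Python sketch below assumes the example payload is 32 KB (32,768 bytes), a 1460-byte TCP payload per standard 1500-byte Ethernet frame, and one acknowledgement for every two segments. All three are our assumptions, and real counts will vary with driver and interrupt-moderation settings.

```python
import math

# Assumed TCP payload per 1500-byte Ethernet frame
# (1500 minus 20 bytes of IP header and 20 bytes of TCP header).
MSS = 1460


def events_per_packet(payload_bytes: int, mss: int = MSS, ack_every: int = 2) -> int:
    """Rough host event count when every segment and every ACK raises an event."""
    segments = math.ceil(payload_bytes / mss)
    acks = math.ceil(segments / ack_every)  # assumes one ACK per two segments
    return segments + acks


def events_per_io(payload_bytes: int) -> int:
    """With full offload the host sees one event per application I/O."""
    return 1


payload = 32 * 1024
print(events_per_packet(payload), events_per_io(payload))
```

Under these assumptions the estimate lands in the same range as the thirty or more events described above, against a single event for the offloaded case.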

Bigfoot Networks believes it is, and we will see what they have to say about it and their technology next.

87 Comments

  • DaveatBigfoot - Thursday, November 30, 2006

    Dave from Bigfoot Networks here. We wanted to reach out to comments and forums around the Internet, address some of the issues being discussed, and be available for any questions you may have.

    I worked with Gary while he was writing this review. We have a tremendous amount of respect for him and Anandtech.com. I'd be a liar if I didn't admit that we were disappointed with the performance and experience that the Anandtech review reflects. We welcomed the "Pepsi Challenge", and appreciated the real-world approach taken.

    While the performance numbers reported were lower than what our customers report, and what we see internally, we thought one of the best testimonials for the Killer was the blind test where the Killer was added to a gamer's PC without his knowledge, and he thought there was a new video card or more RAM in the system. Truly, that is what the Killer is all about...smoother, faster gaming...less lag, better performance.

    Back when this review was written, we did have some issues with our drivers. I believe each and every issue manifested itself during Anandtech's testing. It was very unfortunate and not anticipated. Bypassing the Windows network stack and putting a Linux computer on a PCI slot is a bit tricky. We aren't using that as an excuse, just stating it as a fact. Our latest software suite addresses all the issues that are referenced in this review.

    We have also recently released IPtables firewall for the Killer NIC. Many more FNApps are on the way, and with time the Killer's value will increase. A rarity in the hardware world.

    We sincerely hope, at some point, Anandtech will give the Killer another shot. We firmly stand by our product and believe it holds tremendous value for online gamers.

    I am also happy to answer any questions you may have about the Killer, so fire away!
  • lwright84 - Thursday, November 09, 2006

    http://hardware.gotfrag.com/portal/story/34683/

    explains some of the features and shows some better results with this card.
  • goinginstyle - Wednesday, November 29, 2006

    They only tested two games and both were optimized for the KillerNIC. They give it an editors award for improving FEAR by 6.7%, come on.
  • trajik78 - Sunday, November 05, 2006

    did i mention $300 is f'in crazy for a NIC?
  • cotak - Sunday, November 05, 2006

    This is as useful as something that makes guys quicker during sex.

    As for people talking about this being enterprise storage technology. They use fiber for that with expensive fiber switches not Ethernet and not something you'd be able to afford at home.

    What's the point of reviewing something like this. In the first part of the review they say "the internet is variable". That's your key right there. There's no point in speeding up your connection to your cable/dsl modem when everything else from here to whatever is unknown. 300 bucks on a card like this and connecting it to your typical linksys router with the new VxWorks firmware with limited number of NAT connections it's about as dumb putting huge spoilers on a shitty car.
  • trajik78 - Sunday, November 05, 2006

    yup, pretty much every review has confirmed that this product is more than not-worthy of the $300 that could be better used for say a couple kegs of beer, or towards college tuition.

    when it comes down to it, your built in MB ethernet interface is more than worthy of your use for any circumstance, even it be HUGE FRAGFEST AT YOUR FRIENDS LAN PARTY!!
  • aswinp - Wednesday, November 01, 2006

    Check out this site for more info on TNICS:

    In my (small) experience in enterprise storage solutions, I believe one of the main reason for using TOE NICS is for iSCSI (SCSI over IP) SAN applications, instead of using Fiber Channel or other SAN solutions. So you basically have a SAN whose fabric is not based on expensive Fiber Channel hardware but on regular Ethernet.

    Top 10 Reasons to upgrade to a TNIC:
    http://www.alacritech.com/html/toe_top_ten.shtml

    Benchmark Reports:
    http://www.alacritech.com/html/benchmark_reports.s...
  • mlau - Thursday, November 02, 2006

    I strongly suggest you read this mail and the paper it links to:
    http://www.cs.helsinki.fi/linux/linux-kernel/2003-...

    TOE is another marketing fad, nothing more.
  • aswinp - Wednesday, November 01, 2006

    I guess Killer NIC saw this technology starting to rise in popularity in the enterprise storage market and thought... "Hey, what happens if we apply this thing to gaming?". And so you get the Killer NIC.

    Although I admit the FNA feature is very interesting, if ever any software ever gets written to take advantage of it.

    What I'd really like to see is what happens when the Killer NIC is put in comparison to true TOE NICS in IP SAN applications. Coz its less expensive than these guys.
  • soydeedo - Wednesday, November 01, 2006

    hey guys. there have been scores of complaints regarding lag and such when running the new titan mode in battlefield 2142. if the titan [a very large airship] is moving while many players are aboard it things can get a bit hairy. i've experienced this myself although not very often, but it's pretty aggravating and severely impacts playability. i'm requesting that you play a couple rounds with a moving titan [it's imperative that it's moving] and report back your results with this killernic. i've made a post about this on firingsquad and totalbf2142 to no avail so if you guys would test this out i [and potentially many others if it offers any benefits] would appreciate it. thanks. =)
