Packet Generation Options - A Quantitative Comparison

Determining the packet processing speed of a firewall or router largely obviates the need to look at the transport protocol (TCP or UDP). For this reason, packet generators are commonly used to measure the performance of routers, switches, and firewalls. Traditional bandwidth measurement at higher levels of the network stack makes more sense for client devices running end-user applications. Many commercial packet-generating hardware appliances and applications are used in the industry, from vendors such as Ixia and Spirent. For software developers and homelab enthusiasts, and even for many hardware developers, PC software such as TRex and Ostinato fits the bill. While these software tools have a bit of a learning curve, there are also simple command-line applications that can deliver quick performance measurement results.

FreeBSD supports a framework for fast packet I/O called netmap. It allows applications to access interface devices without going through the host stack (provided the device driver supports it). Packet generators taking advantage of this framework can generate packets at line rate even for reasonably small packet sizes. The netmap source also includes pkt-gen, a sample packet generator application built on the framework. The open-source community has created a number of applications on top of netmap and pkt-gen, allowing for easier interactive testing as well as easy automation of common scenarios. One such application is ipgen, which also includes a built-in option to benchmark packet generation. iPerf is a popular network performance measurement tool. It outputs easy-to-understand bandwidth numbers that are particularly relevant to end users of client devices. iPerf3 includes a length parameter that controls the UDP datagram size, allowing it to approximate the packet generation behavior of pkt-gen and ipgen.

In the rest of this section, we benchmark each of these options on various machines in our testbed under different conditions. The testbed includes the diminutive Compulab fitlet-XA10-LAN with four gigabit LAN ports. It is an attractive x86-64 system for embedded networking applications requiring multiple network ports. While it is not in the same class as the server systems being tested in this section, it does provide context to folks adopting these types of systems for packet generation / testbed applications.

iPerf3

The iPerf3 benchmarking tool is used to get a quick idea of the networking capabilities of end-user devices. In its most common usage, options such as the TCP window size and the UDP datagram length are left at their defaults. The ability to alter the latter provides an avenue to explore the packet generation capabilities of iPerf3. Though iPerf3 allows the length parameter to be set to very high values for the UDP datagram size (up to the theoretical maximum of around 64K), datagrams that no longer fit within the MTU once protocol headers are added get fragmented.

`iperf3 -u -c ${ServerIP} -t ${RunDuration} -O 5 -f m -b 10G --length ${pktsize} 2>&1`

As part of our testing, the source was configured to send UDP datagrams of various lengths ranging from 16 bytes to 1500 bytes across the DUT in router mode, as shown in the testing script extract above.
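A sweep of this kind could be driven by a small wrapper that rebuilds the quoted command for each datagram length. The sketch below only constructs and prints the command lines and does not run them. The server address (taken from the documentation range), the run duration, and the list of lengths are placeholders chosen for illustration, not the values used in our script.

```python
import shlex


def iperf3_udp_command(server_ip, duration_s, datagram_len):
    """Build the iperf3 client invocation quoted above for one datagram length."""
    return [
        "iperf3", "-u",
        "-c", server_ip,
        "-t", str(duration_s),
        "-O", "5",            # omit the first 5 seconds from the statistics
        "-f", "m",            # report in Mbits/sec
        "-b", "10G",          # target bitrate well above what the link can carry
        "--length", str(datagram_len),
    ]


# Hypothetical sweep points between 16 and 1500 bytes (assumed, not from the script).
SWEEP_LENGTHS = [16, 64, 128, 256, 512, 1024, 1472, 1500]

if __name__ == "__main__":
    for length in SWEEP_LENGTHS:
        # 192.0.2.10 and the 60-second duration are placeholder values.
        print(shlex.join(iperf3_udp_command("192.0.2.10", 60, length)))
```

Keeping the command as an argument list avoids quoting problems if it is later handed to a process launcher instead of being printed.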

The bandwidth drop when going from 1472 to 1500 bytes for the datagram length is explained by fragmentation. Protocol overheads add header bytes on top of the length parameter passed to iPerf3 (8 bytes for UDP and 20 bytes for IPv4 without options), and the resulting packet exceeds the smallest configured MTU in the network path. Packet generators are expected to saturate the link bandwidth for all but the smallest packet sizes. The results above suggest that usage of iPerf3 for this purpose is not advisable.
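The arithmetic behind that drop can be sketched in a few lines of Python. This is a minimal illustration, assuming IPv4 without options (20-byte header), an 8-byte UDP header, and a 1500-byte MTU. The function name and defaults are ours and are not part of iPerf3.

```python
def ipv4_fragment_count(udp_payload, mtu=1500, ip_header=20, udp_header=8):
    """Return how many IPv4 packets are needed to carry one UDP datagram.

    Assumes IPv4 without options and that fragmentation is permitted.
    """
    ip_payload = udp_payload + udp_header      # the UDP header is part of the IP payload
    if ip_payload + ip_header <= mtu:
        return 1                               # fits in a single packet
    # Every fragment except the last must carry a multiple of 8 bytes of data.
    per_fragment = (mtu - ip_header) // 8 * 8
    return -(-ip_payload // per_fragment)      # ceiling division


for length in (1472, 1473, 1500):
    print(length, ipv4_fragment_count(length))
```

Under these assumptions a 1472-byte datagram exactly fills a 1500-byte packet, while anything larger is split in two, with the second packet carrying only a few bytes. The per-datagram packet count doubles while almost no extra payload is delivered.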

ipgen

The ipgen tool is considered next because it has a built-in benchmark mode. This mode doesn't actually place the generated packets on the network interface - rather it is a pure test of the CPU and the memory subsystem's capability to generate raw packets of different sizes. Multiple instances of the packet generator running simultaneously need to be bound to different cores in order to obtain the best performance.

`timeout 10s cpuset -l $cpuset ipgen -X -s $pktsize 2>&1`

The ipgen benchmark involves generating packets of various sizes for 10 seconds each. The first set generates a single stream, the second generates two simultaneous streams, and so on up to four simultaneous streams. Each process is bound to a distinct physical core on systems where the physical core count differs from the logical core count. The average packet generation rate across all enabled streams (measured in million packets per second, Mpps) is presented in the graph below.

The generator must be able to output 1.488 Mpps on a 1G interface and 14.88 Mpps on a 10G interface in order to maintain wire speeds when minimum-sized packets are considered. Considering the network interfaces on the machines in the above graphs, the CPUs are suitably equipped for the presented best-case scenario where no attempt is made to dump out the generated packet contents or drive them on to a network interface. Enabling such activities is bound to introduce some performance penalties.
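The 1.488 Mpps and 14.88 Mpps figures follow from Ethernet framing overhead. A rough sketch of the calculation, assuming a 64-byte minimum frame (including the FCS) plus 8 bytes of preamble and start-of-frame delimiter and a 12-byte inter-frame gap; the function name is ours:

```python
def line_rate_pps(link_bps, frame_bytes=64, preamble=8, inter_frame_gap=12):
    """Maximum frames per second on an Ethernet link for a given frame size.

    frame_bytes counts the Ethernet frame including the 4-byte FCS.
    """
    bits_on_wire = (frame_bytes + preamble + inter_frame_gap) * 8
    return link_bps / bits_on_wire


for name, bps in (("1G", 1_000_000_000), ("10G", 10_000_000_000)):
    print(f"{name}: {line_rate_pps(bps) / 1e6:.3f} Mpps at 64-byte frames")
```

The same function shows why large frames tend to be limited by the link and not the CPU: with 1518-byte frames, a 1G link carries only about 81 thousand frames per second.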

pkt-gen

The pkt-gen benchmark described here adds a practical layer to the benchmark mode seen in the previous sub-section. The generated packets are driven on the network interface to the external device (in this case, the E302-9D pfSense firewall) which is configured to drop them. The line-rate often acts as the limiting factor for large frame sizes.

`timeout ${RunDuration}s /usr/obj/usr/src/amd64.amd64/tools/tools/netmap/pkt-gen -i ${IntfName} -l ${pktsize} -s ${SrcIP} -d ${DestIP} -D ${DestMAC} -f tx -N -B 2>&1`

With the network interface as the limiting factor, benchmark numbers are presented only for a single stream. As expected, the CPU speed and cache organization play a major role in this task, with the 5019D-4C-FN8TP (equipped with an actively cooled 2.2 GHz Intel Xeon D-2123IT) being able to generate packets at line rate even for minimum-sized packets.

Based on the above results, it is clear why the pkt-gen tool is adopted widely as a reliable packet generator for performance verification. It may not offer the flexibility and additional features needed for other purposes (fulfilled by offerings such as TRex and Ostinato), but it suffices for a majority of the testing we set out to do. Tools such as ipgen and iPerf3 are still used in a few sections, but, as we shall see further down, pkt-gen is able to stress the DUT the best without being bottlenecked by the stimulus generators.
