pfSense Configuration for Benchmarking

A review of the FreeBSD firewall performance evaluation guidelines and the accompanying infrastructure helped us narrow down the scope of testing. As described in the testing methodology section, the DUT was configured in various states, and both the regular iPerf3 TCP benchmark and a pkt-gen sweep across different packet sizes were run for traffic passing through the firewall. The L3 forwarding capabilities of the DUT were also tested with the ipgen benchmark, keeping in mind that its results are limited by the capabilities of the stimulus-generating machine.
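
For reference, the sketch below shows representative invocations of the two through-traffic benchmarks from the source machine. The peer address, interface name, stream count, and packet size are illustrative placeholders, not the exact parameters of our runs:

iperf3 -c 172.16.1.2 -t 60 -P 4     # bulk TCP throughput to the sink through the DUT
pkt-gen -f tx -i ixl0 -l 64 -w 5    # netmap pkt-gen transmit at one sweep point (64-byte frames)
pkt-gen -f rx -i ixl0               # matching receive side, run on the sink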

Supermicro E302-9D as pfSense Firewall - Benchmarked Modes

Mode                   DUT Commands / Rules
Router                 sysctl net.inet.ip.forwarding=1
                       pfctl -d
PF (No Filters)        sysctl net.inet.ip.forwarding=1
                       pfctl -e
                       pfctl -F all
PF (Default Ruleset)   sysctl net.inet.ip.forwarding=1
                       pfctl -e
                       (additional firewall rules specified at the end of this sub-section)
PF (NAT Mode)          sysctl net.inet.ip.forwarding=1
                       pfctl -e
                       pfctl -F all -f /home/username/nat.pf
PF (IPsec)             sysctl net.inet.ip.forwarding=1
                       pfctl -e
                       (IPsec setkey configuration specified at the end of this sub-section)

The table above summarizes the different states of evaluation and the shell commands used to place the DUT in each mode.
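
As a concrete example, placing the DUT in the PF (No Filters) mode from a root shell involves the three commands from the table:

sysctl net.inet.ip.forwarding=1    # enable IPv4 forwarding between the ports
pfctl -e                           # enable the pf packet filter
pfctl -F all                       # flush all rules, states, and tables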

The additional firewall rules for the PF (Default Ruleset) case (added using easyrule / the firewall log view) are as follows, with a short verification sketch after the listing:
pass in quick on ixl2 inet from 172.16.0.0/24 to 172.16.1.0/24 flags S/SA keep state label "USER_RULE"
pass in quick on ixl2 inet from 172.16.0.0/24 to 172.16.10.0/24 flags S/SA keep state label "USER_RULE"
pass in quick on ixl3 inet from 172.16.1.0/24 to 172.16.0.0/24 flags S/SA keep state label "USER_RULE"
pass in quick on ixl3 inet from 172.16.1.0/24 to 172.16.11.0/24 flags S/SA keep state label "USER_RULE"
pass in quick on ixl0 inet from 172.16.10.0/24 to 172.16.0.0/24 flags S/SA keep state label "USER_RULE"
pass in quick on ixl0 inet from 172.16.10.0/24 to 172.16.11.0/24 flags S/SA keep state label "USER_RULE"
pass in quick on ixl1 inet from 172.16.11.0/24 to 172.16.1.0/24 flags S/SA keep state label "USER_RULE"
pass in quick on ixl1 inet from 172.16.11.0/24 to 172.16.10.0/24 flags S/SA keep state label "USER_RULE"
pass in quick on igb3 inet from 172.16.20.0/24 to 172.16.21.0/24 flags S/SA keep state label "USER_RULE"
pass in quick on igb2 inet from 172.16.21.0/24 to 172.16.20.0/24 flags S/SA keep state label "USER_RULE"
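
Once these rules are loaded, they can be checked from the DUT shell; a minimal sketch using standard pfctl queries:

pfctl -s rules | grep USER_RULE    # confirm the pass rules are active, by label
pfctl -s states                    # inspect the entries created by 'keep state'
pfctl -s info                      # overall state-table and packet counters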

The contents of the /home/username/nat.pf file referenced in the PF (NAT Mode) row of the table are as follows:
set limit states 100000000
nat on ixl0 from 172.16.0.0/16 to any -> ixl0
nat on ixl1 from 172.16.0.0/16 to any -> ixl1
nat on igb2 from 172.16.0.0/16 to any -> igb2
pass in quick all keep state
pass out quick all keep state
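
A minimal sketch of loading this file on the DUT and sanity-checking the translation setup (the path is the one from the table):

pfctl -F all -f /home/username/nat.pf    # flush everything, then load the NAT ruleset
pfctl -s nat                             # confirm the three translation rules are active
pfctl -s info                            # watch the state-table entry count grow during a run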

The IPsec evaluation does not follow the steps outlined for the other modes. Instead of using both the source and the sink, with iPerf3 and pkt-gen running on either side, only the source and the DUT are used. A baseline iPerf3 run between the source and the DUT (with no IPsec in the path) serves as the reference for comparison. IPsec for the communication between the two sets of ports is configured using the script template below, supplied as the file argument to setkey -f (the encryption and authentication algorithms and keys that follow the -E and -A flags are omitted from the listing). The flush; and spdflush; directives at the top clear out any previously installed security associations and policies.
flush;
spdflush;
# Host to host ESP
# Security Associations
add 172.16.0.2 172.16.0.1 esp 0x10001 -E -A ;
add 172.16.0.1 172.16.0.2 esp 0x10002 -E -A ;
add 172.16.1.2 172.16.1.1 esp 0x10003 -E -A ;
add 172.16.1.1 172.16.1.2 esp 0x10004 -E -A ;
# Security Policies
spdadd 172.16.0.2 172.16.0.1 any -P in ipsec esp/tunnel/172.16.0.2-172.16.0.1/require;
spdadd 172.16.0.1 172.16.0.2 any -P out ipsec esp/tunnel/172.16.0.1-172.16.0.2/require;
spdadd 172.16.1.2 172.16.1.1 any -P in ipsec esp/tunnel/172.16.1.2-172.16.1.1/require;
spdadd 172.16.1.1 172.16.1.2 any -P out ipsec esp/tunnel/172.16.1.1-172.16.1.2/require;

The template above is for the DUT side; the one on the source side is similar, with the in and out directions reversed in the security policies section.
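
A minimal sketch of applying the template, assuming it is saved as /home/username/ipsec.conf (a hypothetical path) on each side:

setkey -f /home/username/ipsec.conf    # install the SAs and policies; the script flushes the old ones first
setkey -D                              # dump the installed security associations for verification
setkey -DP                             # dump the installed security policies for verification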

The next section provides additional benchmark processing details along with the results for both iPerf3 and ipgen tests. That is followed by a discussion of pkt-gen benchmark results.

Comments

  • Jorgp2 - Thursday, July 30, 2020

    Maybe you should learn the difference between a switch and a router first.
  • newyork10023 - Thursday, July 30, 2020

    Why do you people have to troll everywhere you go?
  • Gonemad - Wednesday, July 29, 2020

    Oh boy. My mother once got Wi-Fi "AC" 5GHz, 5Gbps, and 5G mobile networks mixed up. It took a while to explain those to her.

    Don't use 10G to mean 10 Gbps, please! HAHAHA.
  • timecop1818 - Wednesday, July 29, 2020

    Fortunately, when Ethernet says 10Gbps, that's what it means.
  • imaheadcase - Wednesday, July 29, 2020

    Put the name Supermicro on it and you know it's not for consumers.
  • newyork10023 - Wednesday, July 29, 2020

    The Supermicro manual states that an installed PCIe card is limited to networking (and will require a fan). An HBA card can't be installed?
  • abufrejoval - Wednesday, July 29, 2020

    Since I use both pfSense as a firewall and a D-1541 Xeon machine (but not for the firewall) and I share the dream of systems that are practically silent, I feel compelled to add some thoughts:

    I started using pfSense on a passive J1900 Atom board which had dual Gbit on-board and cost less than €100. That worked pretty well until my broadband exceeded 200Mbit/s, mostly because it wasn’t just a firewall, but also added Suricata traffic inspection (tried Snort, too, very similar results).

    And that's what's wrong with this article: 10Gbit Xeon-Ds are great when all you do is push packets, but don't look at them. They are even greater when you terminate SSL connections on them with the QuickAssist variants. They are great when they work together with their bigger CPU brothers, who will then crunch on the logic of the data.

    In the home-appliance context that you allude to, you won't have ten types of machines to distribute that work optimally. QuickAssist won't deliver benefits, while the CPU will run out of steam far before even a Gbit connection is saturated when you use it just for the front end of the DMZ (firewall/SSL termination/VPN/deep inspection/load-balancing-failover).

    Put proxies, caches, or even application servers on them as well, and even a single 10Gbit interface may be a total waste.

    I had to resort to an i7-7700T, which seems a bit quicker than the D-2123IT at only 35 Watts TDP (and much cheaper), to sustain 500Mbit/s of download bandwidth with the best gratis Suricata rule set. Judging by CPU load observations, it will just about manage the Gbit loads its ports can handle; I am pretty sure that 2.5/5/10 Gbit would just throttle on inspection load, like the J1900 did at 200Mbit/s.

    I use a D-1541 as an additional compute node in an oVirt 3 node HCI gluster with 3x 2.5Gbit J5005 storage nodes. I can probably go to 6x 2.5Gbit before its 10Gbit NIC becomes a bottleneck.

    The D-1541’s benefit there is lots of RAM and cores, while it’s practically silent with 45 Watts TDP and none of the applications on it require vast amounts of CPU power.

    I am waiting for an 8-core AMD 4000 Pro 35 Watt TDP APU to arrive on a Mini-ITX board capable of handling 64 or 128GB of ECC RAM, to replace the Xeon D-1541 and bring the price of such a mini server below that of a laptop with the same ingredients.
  • newyork10023 - Wednesday, July 29, 2020

    With an HBA (were it possible, hence my question), the 10Gbps serves a possible use (storage). Pushing and inspecting packets exceeds x86 limits now. See TNSR for real x86 limits (without inspection).
  • abufrejoval - Wednesday, July 29, 2020

    That would seem to apply to the chassis, not to the mainboard or SoC.
    There is nothing to prevent it from working per se.

    I am pretty sure you can add a 16-port SAS HBA or even an NVMe-oF card and plenty of external storage, if thermals and power fit. A Mellanox 100Gbit card should be fine electrically, logically, etc., even if there is nothing behind it to sustain that throughput.

    I've had an Nvidia GTX1070 GPU in the Supermicro Mini-ITX D-1541 for a while, with no problems at all functionally, even if games still seem to prefer Hertz over cores. Actually, GPU-accelerated machine learning inference was the original use case of that box.
  • newyork10023 - Wednesday, July 29, 2020

    As pointed out, the D-2123IT has no QAT, so a QAT accelerator would take up an available PCIe slot. It could push 10G packets then, but not save them or think (AI) on them.
