Basic Netfilter Function Block Diagram

Both NFTables and IPTables use the Netfilter framework provided by the Linux kernel. NFTables was implemented to supersede IPTables but, given how widely IPTables is deployed, the transition will probably take a long time.

The following is a basic block diagram of the Netfilter Filter and NAT (Network Address Translation) functions, which are the basic requirements for a router.

        Incoming
        Packets
           |
    +---------------+
    |  Prerouting   |
    |  Rules (NAT)  |
    +---------------+
           |
     /-------------\
     |  Routing    |--------------------+
     |  Decision   |                    |
     \-------------/                    |
           |                            |
    +---------------+           +---------------+
    |  Input        |           |  Forward      |
    |  Rules        |           |  Rules        |
    |  (Filter)     |           |  (Filter)     |
    +---------------+           +---------------+
           |                            |
 +---------------------+                |
 |  Network Processes  |                |
 |  within Router      |                |
 +---------------------+                |
           |                            |
    +---------------+                   |
    |  Output       |                   |
    |  Rules        |                   |
    |  (Filter)     |                   |
    +---------------+                   |
           |                            |
           +----------------------------+
           |
    +---------------+
    |  Postrouting  |
    |  Rules (NAT)  |
    +---------------+
           |
        Outgoing
        Packets
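
As a rough illustration of the diagram, the following is a minimal sketch of the NAT and Filter rules for such a router using iptables. The interface names eth0 (WAN side) and eth1 (LAN side) are assumptions and need adapting to the actual system:

  # Enable IPv4 forwarding so the router will pass packets between interfaces.
  sysctl -w net.ipv4.ip_forward=1

  # NAT: rewrite the source address of outgoing packets in the POSTROUTING chain.
  iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

  # Filter: in the FORWARD chain, allow LAN-to-WAN traffic and established
  # return traffic, and drop everything else.
  iptables -A FORWARD -i eth1 -o eth0 -j ACCEPT
  iptables -A FORWARD -i eth0 -o eth1 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
  iptables -A FORWARD -j DROP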

Relationships Between Chains and Tables

If three tables have PREROUTING chains, in which order are they evaluated?

The following table indicates the chains that are available within each iptables table when read from left-to-right. For instance, we can tell that the raw table has both PREROUTING and OUTPUT chains. When read from top-to-bottom, it also displays the order in which each chain is called when the associated netfilter hook is triggered.

A few things should be noted. In the representation below, the nat table has been split between DNAT operations (those that alter the destination address of a packet) and SNAT operations (those that alter the source address) in order to display their ordering more clearly. We have also included rows that represent points where routing decisions are made and where connection tracking is enabled in order to give a more holistic view of the processes taking place:

Tables/Chains                  PREROUTING  INPUT  FORWARD  OUTPUT  POSTROUTING
(routing decision)                                          ✓
raw                            ✓                            ✓
(connection tracking enabled)  ✓                            ✓
mangle                         ✓           ✓      ✓         ✓       ✓
nat (DNAT)                     ✓                            ✓
(routing decision)             ✓                            ✓
filter                                     ✓      ✓         ✓
security                                   ✓      ✓         ✓
nat (SNAT)                                 ✓                         ✓
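
The chains that each table actually provides can be checked on a running system; for example (the security table may not exist on all systems, and the output depends on the rules that are loaded):

  # List the chains and rules registered in each table; the chain headers
  # printed for each table correspond to the columns in the table above.
  iptables -t raw -L -n
  iptables -t mangle -L -n
  iptables -t nat -L -n
  iptables -t filter -L -n
  iptables -t security -L -n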

As a packet triggers a netfilter hook, the associated chains will be processed as they are listed in the table above from top-to-bottom. The hooks (columns) that a packet will trigger depend on whether it is an incoming or outgoing packet, the routing decisions that are made, and whether the packet passes filtering criteria.

Certain events will cause a table’s chain to be skipped during processing. For instance, only the first packet in a connection is evaluated against the NAT rules; any NAT decision made for that first packet is applied to all subsequent packets in the connection without additional evaluation. Responses to NAT’ed connections automatically have the reverse NAT rules applied so that they route correctly.

Chain Traversal Order

Assuming that the server knows how to route a packet and that the firewall rules permit its transmission, the following flows represent the paths that will be traversed in different situations:

  • Incoming packets destined for the local system: PREROUTING → INPUT
  • Incoming packets destined for another host: PREROUTING → FORWARD → POSTROUTING
  • Locally generated packets: OUTPUT → POSTROUTING

If we combine the above information with the ordering laid out in the previous table, we can see that an incoming packet destined for the local system will first be evaluated against the PREROUTING chains of the raw, mangle, and nat tables. It will then traverse the INPUT chains of the mangle, filter, security, and nat tables before finally being delivered to the local socket.
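
The connection tracking entries mentioned above, including the NAT mappings applied to a connection’s first packet, can be inspected directly. A small sketch, assuming the conntrack-tools package is installed:

  # List the connection tracking table; each entry shows the original and the
  # reply tuple, which is how reverse NAT is applied to response packets.
  conntrack -L
  # On most systems the same information is also available through proc:
  cat /proc/net/nf_conntrack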

Some references

PPPoE MTU Requirements

A PPPoE connection has additional overhead compared to a standard Ethernet data field. The maximum length (MTU) of the data field of a standard Ethernet frame is limited to 1500 bytes.

A standard PPPoE connection has an additional overhead of 8 bytes, which limits the MTU to 1492 bytes. However, some ISPs (internet service providers) may add further overhead. To determine the largest usable MTU, use the ping command. A ping packet has 28 bytes of overhead (20-byte IP header + 8-byte ICMP header), so the MTU is the largest payload that can be pinged without a fragmentation error, plus 28 bytes. For a normal PPPoE connection this would be a payload of 1492 - 28 = 1464 bytes. (Note that a problem with this method is that it probably goes through an existing modem router that sets the MTU, and that setting may act as the limiter.) Some command examples, with a scripted version after the list:

  • ping -M do -s 1464 -c1 google.com
  • tracepath vorash.stgraber.org
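
The probe can also be scripted to search downwards from the Ethernet maximum; this is a sketch only, and google.com is simply a convenient reachable host. The -M do option sets the Don't Fragment bit so that an oversized packet produces an error instead of being silently fragmented:

  #!/bin/sh
  # Try payload sizes from largest to smallest; the first success gives the
  # path MTU as payload + 28 bytes (20-byte IP header + 8-byte ICMP header).
  for size in 1472 1464 1452 1400; do
      if ping -M do -s "$size" -c 1 -W 2 google.com > /dev/null 2>&1; then
          echo "payload $size OK -> path MTU is at least $((size + 28)) bytes"
          break
      fi
  done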

See references:

  • How to Optimize your Internet Connection using MTU and RWIN
  • MTU and TCP MSS when using PPPoE
  • TCP Headers and UDP Headers Explained
  • Path MTU Discovery and Filtering ICMP
  • Cisco: Resolve IP Fragmentation, MTU, MSS, and PMTUD Issues with GRE and IPSEC
  • Understanding MTU for ADSL
  • Wikipedia: IPv4, Ethertype, IEEE 802.1Q, Maximum transmission unit, Point-to-point protocol over Ethernet, IPv6 packet, Internet Control Message Protocol version 6, and Path MTU Discovery

The MSS (maximum segment size) is normally just 40 bytes less than the MTU. It is used to avoid IP fragmentation at the endpoints of TCP connections. The MSS is just the TCP data size and excludes the IP and TCP headers, which are normally 20 bytes each. So a normal MSS would be 1492 - 40 = 1452 bytes.
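
On a PPPoE router the usual way to enforce this is MSS clamping on forwarded TCP connections; a minimal sketch using the mangle table (other placements are possible):

  # Clamp the MSS of TCP SYN packets passing through the router to the path MTU,
  # so endpoints never negotiate segments larger than the PPPoE link allows.
  iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
  # Alternatively, set an explicit value (1492 - 40 = 1452 for plain PPPoE):
  # iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1452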

Some Ethernet data field overheads to consider:

  • PPPoE header = 8 bytes
  • IP header = 20 bytes, but can grow up to 60 bytes with options that are rarely used.
  • ICMP header = 8 bytes
  • TCP header = 20 bytes, but like IP can grow to 60 bytes long

The Ethernet data field (MTU) is limited to 1500 bytes, which gives a maximum standard frame size of 1518 bytes (1522 bytes with a VLAN tag), excluding the preamble. The following overheads in the Ethernet frame, beyond the MTU, are given for information:

  • Preamble = 8 bytes
  • Destination MAC = 6 bytes
  • Source MAC = 6 bytes
  • VLAN header (optional) = 4 bytes
  • EtherType/Size = 2 bytes
  • Payload = maximum 1500 bytes (MTU)
  • CRC/FCS = 4 bytes
As can be seen above, the Ethernet frame overhead is normally a minimum of 26 bytes, or 30 bytes with VLAN (IEEE 802.1Q) tagging.

To set the PPPoE connection MTU, edit /etc/ppp/ip-up (e.g. sudo vim /etc/ppp/ip-up) and append the following line to the end of the file: /sbin/ifconfig ppp0 mtu 1492.
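
A slightly more defensive sketch of that snippet, assuming the PPPoE interface comes up as ppp0 (pppd passes the interface name to ip-up as its first argument):

  # Appended to /etc/ppp/ip-up; only touch the PPPoE interface.
  if [ "$1" = "ppp0" ]; then
      /sbin/ifconfig ppp0 mtu 1492
  fi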

ICMP Filtering

There is a lot of conflicting information on filtering ICMP. ICMP is a fundamental component of the IP protocol suite, and blocking it entirely is poor practice; in fact, IPv6 will not function correctly without ICMPv6. Judicious filtering and rate limiting seems the correct solution.
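
As a sketch of such judicious filtering, ICMP echo requests can be rate limited rather than dropped outright, while ICMPv6 is left alone so that IPv6 neighbour discovery and path MTU discovery keep working:

  # Accept a modest rate of ICMP echo requests and drop the excess.
  iptables -A INPUT -p icmp --icmp-type echo-request -m limit --limit 5/second --limit-burst 10 -j ACCEPT
  iptables -A INPUT -p icmp --icmp-type echo-request -j DROP
  # ICMPv6 is required for IPv6 to function correctly, so do not block it wholesale.
  ip6tables -A INPUT -p ipv6-icmp -j ACCEPT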
