CipeX - PMTU problems with IP-over-IP tunneling

Symptoms:

Hanging connections, stalled data transfers, partially established connections.

Cause:

Connections over interfaces with an MTU (Maximum Transmission Unit) smaller than 1500 bytes, such as PPPoE, (PPP) dial-up routers, MS Windows acting as a router, CIPE or OpenVPN tunnels, non-Ethernet-II frames, etc., require somewhat smaller packets than usual. If a packet is too large for an interface, it is fragmented and sent as several smaller packets. The fragments are reassembled at the target computer, which is a fully transparent process.
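
As a rough illustration of the fragmentation described above, the following Python sketch computes the sizes of the fragments a router would produce. The function name and the 1400-byte tunnel MTU are assumptions chosen for the example; the sketch assumes a 20-byte IPv4 header without options and relies on the fact that every fragment except the last must carry a multiple of 8 payload bytes.

```python
IPV4_HEADER = 20  # bytes, assuming an IPv4 header without options


def fragment_sizes(packet_len, mtu, header_len=IPV4_HEADER):
    """Return the total length (header + payload) of each IPv4 fragment."""
    if packet_len <= mtu:
        return [packet_len]  # fits as-is, no fragmentation needed
    payload = packet_len - header_len
    # The fragment offset field counts in 8-byte units, so every fragment
    # except the last must carry a multiple of 8 payload bytes.
    max_chunk = (mtu - header_len) // 8 * 8
    if max_chunk <= 0:
        raise ValueError("MTU too small to carry any payload")
    sizes = []
    while payload > 0:
        chunk = min(max_chunk, payload)
        sizes.append(chunk + header_len)  # each fragment gets its own header
        payload -= chunk
    return sizes


# A full-size Ethernet packet crossing a hypothetical 1400-byte tunnel:
print(fragment_sizes(1500, 1400))  # [1396, 124]
```

Note that the two fragments together are 20 bytes longer than the original packet, because the IP header is repeated in every fragment.
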
Fragmentation is a CPU-intensive task, too expensive for backbone routers, and is often avoided by setting the "Don't Fragment" (DF) bit. When the DF bit is set to '1' in an IP packet, a router will not fragment the packet if it is too large, but drops it and sends an ICMP packet (type 3, code 4: destination unreachable, fragmentation needed) back to the sender, requesting smaller packets and (if implemented by the router) reporting the MTU to use in a 16-bit data field. This process is called Path MTU (PMTU) discovery and is described in RFC 1191.
When a router or firewall doesn't allow ICMP packets and drops them, the ICMP notification never reaches the sender and PMTU discovery fails: the sender keeps retransmitting packets that are too large, they are silently lost, and the connection hangs or the file transfer stalls.
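
The discovery process and its failure mode can be sketched as a small simulation. This is only a toy model under stated assumptions: the function name, the `icmp_blocked` flag, and the hop MTU values are hypothetical, every router is assumed to report a Next-Hop MTU, and real stacks add timers and retransmission logic that are left out here.

```python
def discover_pmtu(link_mtus, start_mtu=1500, icmp_blocked=False, max_tries=10):
    """Simulate RFC 1191-style discovery along a path of links.

    link_mtus: MTU of each hop, in path order (hypothetical values).
    Returns the discovered path MTU, or None if the sender never learns it.
    """
    size = start_mtu
    for _ in range(max_tries):
        # First hop that cannot forward a DF packet of the current size.
        bottleneck = next((m for m in link_mtus if m < size), None)
        if bottleneck is None:
            return size  # the packet reached the destination
        if icmp_blocked:
            # The ICMP reply is dropped on the way back, so the sender
            # never learns that it has to send smaller packets.
            return None
        size = bottleneck  # adopt the Next-Hop MTU from the ICMP reply
    return None


path = [1500, 1492, 1400, 1500]  # e.g. Ethernet, PPPoE, a tunnel, Ethernet
print(discover_pmtu(path))                     # 1400
print(discover_pmtu(path, icmp_blocked=True))  # None
```

With ICMP blocked, the result is None only when some hop is actually smaller than the sender's packet size, which matches the observation that small packets get through while large transfers stall.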

Solutions:

One solution is to allow fragmentation by preventing (where possible) the DF bit from being set. Another is to prevent transmission of packets larger than the smallest MTU along the path to the failing host, by specifying a smaller MTU for the interface or a smaller MSS (Maximum Segment Size) for the route to that host.
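
To pick a value for the smaller MSS, a common rule of thumb is to subtract the IP and TCP header sizes from the smallest MTU on the path. The Python sketch below shows that arithmetic; the function name and the hop MTUs are assumptions for illustration, and it assumes IPv4 and TCP headers of 20 bytes each, without options.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_path(link_mtus, ip_header=IPV4_HEADER, tcp_header=TCP_HEADER):
    """Return the largest TCP payload that fits the smallest MTU on the path."""
    return min(link_mtus) - ip_header - tcp_header


# Hypothetical path: Ethernet, PPPoE, then a tunnel with a 1400-byte MTU.
print(mss_for_path([1500, 1492, 1400]))  # 1360
```

If TCP options such as timestamps are in use, each segment carries less data than this value, since the options take up part of the MSS.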

Notes:



Ref: This is a summary of information available at the following locations:
  • http://sdb.suse.de/en/sdb/html/cg_pmtu.html
  • http://www.netheaven.com/pmtu.html
  • http://blue-labs.org/howto/mtu-mss.php
  • RFC 1191: Path MTU Discovery, 1990 (obsoletes RFC 1063)

    Below is the layout of the ICMP Type 3 ("Destination Unreachable"), Code 4 ("Fragmentation needed and DF set") message as described in RFC 1191.
    
           0                   1                   2                   3
           0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |   Type = 3    |   Code = 4    |           Checksum            |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |           unused = 0          |         Next-Hop MTU          |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |      Internet Header + 64 bits of Original Datagram Data      |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
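
The first 8 bytes of the layout above can be decoded with a few lines of Python. This is a minimal sketch, not a complete ICMP parser: the function name is an assumption, the checksum is not verified, and the sample message is hand-built with a hypothetical Next-Hop MTU of 1400.

```python
import struct


def parse_frag_needed(icmp_bytes):
    """Return the Next-Hop MTU of a 'fragmentation needed' message.

    Returns None if the message is not Type 3 / Code 4. A value of 0 means
    the router did not fill in the Next-Hop MTU field.
    """
    if len(icmp_bytes) < 8:
        raise ValueError("need at least the 8-byte ICMP header")
    # Network byte order: type (1 byte), code (1), checksum (2),
    # unused (2), Next-Hop MTU (2).
    icmp_type, code, _checksum, _unused, next_hop_mtu = struct.unpack(
        "!BBHHH", icmp_bytes[:8]
    )
    if icmp_type != 3 or code != 4:
        return None
    return next_hop_mtu


# Hand-built sample header; the checksum is left at 0 and not checked.
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1400)
print(parse_frag_needed(sample))  # 1400
```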