TL;DR

  • In-network tunneling turns the entire underlay into a Path MTU Discovery (PMTUD) blackhole for the overlay
  • The lack of a clear solution probably stems from ambiguity about the role of tunnel interfaces (host or router?)
  • Existing workarounds boil down to either
    • “Throw bigger MTU at the underlay”, or
    • “Claw back MTU from the overlay”

Verbose

In-network tunneling / IP-overlay Breaks PMTUD
  • There is no working mechanism for underlay PMTU information to reach the overlay
  • The entire underlay becomes a PMTUD blackhole for the overlay

How Can This Be?

  • Hosts and routers have different PMTUD obligations
  • Routers only have to generate Packet Too Big (PTB) messages
    • They do not have to maintain path MTU state
  • Hosts have to implement/maintain path MTU state
  • Devices doing IP-overlay encapsulation are forwarding datagrams (like a router) and generating datagrams (like a host)
    • And (usually) fulfilling the PMTUD obligations of neither role

Overlay/Underlay/Tunnel Interfaces

  • What happens when Eth-b receives a PTB to Tu-a?
    • Does it send the PTB to Tu-a?
    • Does Tu-a translate the PTB into the overlay?
    • Does Tu-a maintain path MTU state?
  • Status quo: basically nothing happens
    • Tu-a isn’t maintaining its own path-MTU state
    • Tu-a isn’t translating the PTB into the overlay
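
For contrast with the status quo, here is a minimal sketch of a tunnel interface that took on both roles: remembering underlay PTBs (like a host) and answering oversized overlay packets with a PTB of its own (like a router). The class and method names are hypothetical, the 24-byte overhead assumes plain GRE over IPv4, and the sketch assumes the Don't Fragment bit is set so nothing gets fragmented. It is an illustration, not a description of any shipping implementation.

```python
from dataclasses import dataclass, field


@dataclass
class TunnelInterface:
    """Hypothetical tunnel endpoint (think Tu-a) that keeps path-MTU state."""
    link_mtu: int   # MTU of the underlay-facing interface
    overhead: int   # encapsulation bytes added to every overlay packet
    path_mtu: dict = field(default_factory=dict)  # remote endpoint -> learned underlay PMTU

    def on_underlay_ptb(self, remote: str, reported_mtu: int) -> None:
        # Host-like duty: remember the smallest MTU the underlay has reported.
        current = self.path_mtu.get(remote, self.link_mtu)
        self.path_mtu[remote] = min(current, reported_mtu)

    def overlay_mtu(self, remote: str) -> int:
        # Largest overlay packet that still fits once encapsulated.
        return self.path_mtu.get(remote, self.link_mtu) - self.overhead

    def forward(self, remote: str, inner_len: int) -> tuple:
        # Router-like duty: refuse oversized packets with a PTB into the overlay.
        limit = self.overlay_mtu(remote)
        if inner_len > limit:
            return ("PTB", limit)
        return ("ENCAP", inner_len + self.overhead)


tu_a = TunnelInterface(link_mtu=1500, overhead=24)
before = tu_a.forward("192.0.2.1", 1476)   # fits: 1476 + 24 = 1500
tu_a.on_underlay_ptb("192.0.2.1", 1400)    # underlay reports a 1400-byte hop
after = tu_a.forward("192.0.2.1", 1476)    # no longer fits
print(before, after)
```

The point of the sketch is that the overlay sender only learns about the 1400-byte hop because the tunnel interface stored it and re-announced it, minus its own overhead.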

Why nothing?

  • If Tu-a receives a PTB from the underlay, the PTB itself doesn’t contain enough information to successfully translate it into the overlay
  • Tu-a typically doesn’t implement path-MTU because … (?)
    • Everybody wants to buy “wire-speed” overlay capabilities
      • And maintaining path-MTU state may not be feasible to implement in-ASIC(?)
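
One way to see why the PTB may not carry enough information: RFC 792 only requires an ICMPv4 error to quote the offending datagram's IP header plus its first 8 bytes of payload. The sketch below counts how many bytes of the inner IPv4 header survive that minimum quote. It assumes IPv4 headers without options, and the function names and the list of encapsulations are illustrative choices, not taken from the notes above.

```python
# Sizes in bytes. Header sizes are the standard minimums; the list is illustrative.
IPV4_HEADER = 20         # IPv4 header without options
MIN_QUOTED_PAYLOAD = 8   # RFC 792: quote the IP header + 64 bits of its payload

# Bytes sitting between the outer IPv4 header and the inner IP header.
ENCAP_HEADERS = {
    "IP-in-IP": 0,
    "GRE": 4,              # basic GRE header, no optional fields
    "VXLAN": 8 + 8 + 14,   # UDP + VXLAN + inner Ethernet
}


def inner_ip_bytes_quoted(encap_len: int, quoted_payload: int = MIN_QUOTED_PAYLOAD) -> int:
    """Bytes of the inner IPv4 header that survive in the PTB's quoted data."""
    return min(IPV4_HEADER, max(0, quoted_payload - encap_len))


def can_translate(encap_len: int, quoted_payload: int = MIN_QUOTED_PAYLOAD) -> bool:
    """Translation needs the full inner header to identify the original sender."""
    return inner_ip_bytes_quoted(encap_len, quoted_payload) == IPV4_HEADER


for name, encap_len in ENCAP_HEADERS.items():
    seen = inner_ip_bytes_quoted(encap_len)
    print(f"{name}: {seen}/{IPV4_HEADER} inner header bytes quoted, "
          f"translatable={can_translate(encap_len)}")
```

Routers that quote more of the original datagram change the picture (for example, `can_translate(4, 548)` is true for GRE), but a tunnel endpoint can't count on every underlay hop doing so.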

Current Workarounds

  • Overlay providers can declare a much smaller MTU on overlay network services than on the underlay services, avoiding the problem altogether
  • Overlay consumers can decrease their own interface MTUs for the same result
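
The arithmetic behind both workarounds is just the encapsulation overhead, applied in one direction or the other. A rough sketch, assuming IPv4 underlay headers without options; the two encapsulations and the helper names are illustrative choices:

```python
# Per-packet encapsulation overhead in bytes (IPv4 underlay, no IP options).
ENCAP_OVERHEAD = {
    "GRE over IPv4": 20 + 4,             # outer IPv4 + basic GRE header
    "VXLAN over IPv4": 20 + 8 + 8 + 14,  # outer IPv4 + UDP + VXLAN + inner Ethernet
}


def overlay_mtu(underlay_mtu: int, overhead: int) -> int:
    """'Claw back MTU from the overlay': what is left for overlay packets."""
    return underlay_mtu - overhead


def underlay_mtu_needed(overlay_mtu_target: int, overhead: int) -> int:
    """'Throw bigger MTU at the underlay': what the underlay must carry."""
    return overlay_mtu_target + overhead


for name, overhead in ENCAP_OVERHEAD.items():
    print(f"{name}: overlay MTU {overlay_mtu(1500, overhead)} on a 1500-byte underlay, "
          f"or a {underlay_mtu_needed(1500, overhead)}-byte underlay for a 1500-byte overlay")
```

Either direction only helps if every underlay hop really supports the assumed MTU, since nothing reports a smaller hop back into the overlay.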