IPv6 and PPPoE with MSSFIX

Wed Aug 23 19:55:23 UTC 2023

> > I could dynamically add firewall rules to clamp MSS per authorized_ips
> > but, theoretically, the kernel has all the info to do that
> > automatically. I wonder if MSSFIX could detect the best MTU for a
> > specific address through the wireguard. It should consider the
> > peer-to-peer PMTU, the IP protocol wireguard is using and the normal
> > wireguard headers.
>
> Interesting idea Luiz, so if I understand correctly you have a wg device
> with multiple peers where only some of them need the reduced MTU and you'd
> like to use the maximum possible MTU for all peers.
>
> As things are this won't "just work" with MSSFIX because the wg device
> won't generate ICMP packet-too-big errors for packets sent to it for
> encapsulation regardless of the underlying PMTU, rather the wg device will
> always fragment when the resulting encapsulated packet doesn't fit as
> you've observed.
>
> AFAIK MSSFIX will only look at the actual outgoing route MTU and calculate
> the MSS from that. Since wg never causes (dynamic) PMTU entries to be
> created that won't work.
>
> However we can also just create "static" PMTU entries. As we've seen above
> linux uses the "mtu" route attribute to determine the actual PMTU behind a
> route, as opposed to the netdev MTU, which you should think of as the upper
> limit of what a link can support.
>
> So you can try adding a route specific for the peer that's behind PPPoE
> with the reduced PMTU. Assuming 2001:db8:1432::/64 is this peer's
> AllowedIPs:
>
>     $ ip route add 2001:db8:1432::/64 dev wg0 mtu 1432 proto static
>
> You should be able to add this in PostUp in your wg.conf. The "proto
> static" is optional, I just like to use that to mark administratively
> created routes.
>
> You're still going to want to set the peer's wg device MTU to 1432 or you
> can create "mtu" routes in a similar fashion there. Up to you.
>
> Also note MSSFIX or the nft equivalent mouthful `tcp flags syn tcp option
> maxseg size set rt mtu` is really only appropriate for IPv4 traffic since
> IPv4-PMTU is broken by too many networks. However over in always-sunny IPv6
> land PMTU does work and should be preferred to mangling TCP headers. The
> static PMTU route we created should cause the kernel to start sending the
> appropriate ICMPv6 packet-too-big errors when it's configured for IPv6
> forwarding.
>
> You can test the PTB behaviour with `ping 2001:db8:1432::1 -s3000 -M do`.
> The -s3000 sends large packets (careful: that size is the ICMP
> _payload size_, so it's not equivalent to the MTU), and `-M do` disables
> local fragmentation so you can see when PMTU is doing its job. You'll get
> something like "ping: local error: message too long, mtu: XXXX" showing the
> PMTU value if ICMP-PTB error generation is working along the path.
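
A side note on the -s value, since it is the ICMP payload size and not the
MTU: a minimal sketch of the arithmetic, assuming fixed-size IP headers with
no options or extension headers (20 bytes for IPv4, 40 for IPv6) plus the
8-byte ICMP echo header. The function name ping_payload_size is made up for
illustration.

```shell
#!/bin/sh
# Hypothetical helper: largest ping payload (-s) that still fits in one
# packet of the given MTU, assuming no IP options or extension headers.
ping_payload_size() {
    # $1 = MTU to probe, $2 = IP version of the ping (4 or 6)
    case "$2" in
        4) echo $(( $1 - 20 - 8 )) ;;   # IPv4 header + ICMP echo header
        6) echo $(( $1 - 40 - 8 )) ;;   # IPv6 header + ICMPv6 echo header
        *) echo "unknown IP version: $2" >&2; return 1 ;;
    esac
}

# Payload size that exactly fills the 1432-byte route MTU over IPv6.
size=$(ping_payload_size 1432 6)
echo "ping 2001:db8:1432::1 -s $size -M do"
```

Under these assumptions, a payload of that size should go through
unfragmented, and anything larger should trigger the "message too long"
error once the PMTU is known.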

I didn't think about adding the MTU directly to the route table. Now
it is more interesting. WireGuard (via wg-quick) adds a route for each
AllowedIPs entry. If we detect a PMTU change for a target, we could adjust those
routes to avoid fragmentation. I just don't know if we would break the
connection if we modify MTU up or down during a transfer. I believe
increasing it won't matter for existing connections as MSS is already
negotiated and bringing it down will just fragment the traffic.
Anyway, I believe it is better to fragment the plain packet than the
encrypted one. And for new TCP connections, the firewall can clamp the TCP
MSS to the optimal value, taking into account whether the traffic is IPv4
or IPv6.
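
To make the numbers concrete, here is a rough sketch of the arithmetic such
an automatic adjustment would need. It assumes the usual WireGuard
data-packet overhead (outer IP header, 8-byte UDP header, 32 bytes of
WireGuard header and authentication tag) and TCP/IP headers without options.
The function names wg_inner_mtu and tcp_mss are made up for illustration,
and 1492 is only the common PPPoE MTU, not a measured value.

```shell
#!/bin/sh
# Hypothetical helper: MTU usable inside the tunnel for a given endpoint
# path MTU. Overhead = outer IP header + UDP (8) + WireGuard data header
# and authentication tag (32).
wg_inner_mtu() {
    # $1 = path MTU between the endpoints, $2 = outer IP version (4 or 6)
    case "$2" in
        4) echo $(( $1 - 20 - 8 - 32 )) ;;
        6) echo $(( $1 - 40 - 8 - 32 )) ;;
        *) echo "unknown IP version: $2" >&2; return 1 ;;
    esac
}

# Hypothetical helper: TCP MSS for traffic inside the tunnel, assuming
# no IP options, extension headers or TCP options.
tcp_mss() {
    # $1 = inner (route) MTU, $2 = inner IP version (4 or 6)
    case "$2" in
        4) echo $(( $1 - 20 - 20 )) ;;
        6) echo $(( $1 - 40 - 20 )) ;;
        *) echo "unknown IP version: $2" >&2; return 1 ;;
    esac
}

# Example: peer behind PPPoE (assumed path MTU 1492), endpoint reached over IPv4.
mtu=$(wg_inner_mtu 1492 4)
echo "route MTU: $mtu"
echo "MSS for inner IPv4: $(tcp_mss "$mtu" 4)"
echo "MSS for inner IPv6: $(tcp_mss "$mtu" 6)"
```

With a 1492-byte path and an IPv4 endpoint this gives the 1432 used in the
route above. If the same peer were reached over IPv6 instead, the result
would be 20 bytes lower, which is why the outer IP protocol matters.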