[PATCH] wg-quick: linux: fix MTU calculation (use PMTUD)

Thomas Brierley tomxor at gmail.com
Sun Oct 29 19:23:00 UTC 2023


Currently the MTU calculation does not make use of the kernel's built-in
path MTU discovery mechanism (PMTUD). Fixing this required a rewrite of
the set_mtu_up() function, which also addresses two related MTU issues
as a side effect:

1. Trigger PMTUD Before Query

Currently the endpoint path MTU acquired from `ip route get` will almost
always be empty, because this only queries the routing cache. To trigger
PMTUD on the endpoint and fill this cache, it is necessary to send an
ICMP echo request with the DF bit set.

We now perform a ping beforehand with a total packet size equal to the
interface MTU: a larger packet will not trigger PMTUD, and a smaller one
can miss a bottleneck. To calculate the ping payload, the device MTU and
IP header size must be determined first.
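
As a rough illustration of the payload arithmetic described above, the
sketch below derives the `ping -s` value from a device MTU and an
endpoint address. The helper name icmp_payload_size is hypothetical and
is not part of wg-quick; it assumes a 20 byte IPv4 header (no options),
a 40 byte IPv6 header and an 8 byte ICMP echo header.

```shell
# Sketch only: icmp_payload_size is a hypothetical helper, not wg-quick code.
# Usage: icmp_payload_size <device-mtu> <endpoint-address>
icmp_payload_size() {
	# An address containing ':' is treated as IPv6 (40 byte header),
	# anything else as IPv4 (20 byte header, assuming no IP options).
	case $2 in
		*:*) iph=40 ;;
		*)   iph=20 ;;
	esac
	# Total packet size = payload + ICMP echo header (8) + IP header,
	# so the payload that fills the device MTU exactly is:
	echo $(( $1 - $iph - 8 ))
}

# Example of how the result might be fed to ping (not executed here):
#   ping -w 1 -M do -s "$(icmp_payload_size 1500 192.0.2.1)" 192.0.2.1
```

With a 1500 byte device MTU this yields a 1472 byte payload for an IPv4
endpoint and 1452 bytes for an IPv6 endpoint.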

2. Consider IPv6/4 Header Size

Currently an 80 byte header size is assumed, i.e. IPv6=40 +
WireGuard=40. However, this is not optimal in the case of IPv4, where
the IP header is only 20 bytes. Since determining the IP header size is
required for PMTUD anyway, this is now optimised as a side effect of
endpoint MTU calculation.
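
The difference can be made concrete with a small sketch. The helper
name tunnel_mtu is hypothetical; the 40 byte WireGuard overhead is the
figure used in this patch, and the 20 byte IPv4 header assumes no IP
options.

```shell
# Sketch only: tunnel_mtu is a hypothetical helper, not wg-quick code.
# Usage: tunnel_mtu <path-or-device-mtu> <endpoint-address>
tunnel_mtu() {
	wgh=40	# WireGuard overhead as assumed by the patch
	# Pick the outer IP header size from the endpoint address family.
	case $2 in
		*:*) iph=40 ;;	# IPv6
		*)   iph=20 ;;	# IPv4, assuming no IP options
	esac
	echo $(( $1 - $iph - $wgh ))
}

tunnel_mtu 1500 2001:db8::1	# 1420, same as the old fixed 80 byte overhead
tunnel_mtu 1500 192.0.2.1	# 1440, 20 bytes regained on IPv4
```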

3. Use Smallest Endpoint MTU

Currently, in the case of multiple endpoints, the largest endpoint path
MTU is used. However, WireGuard will dynamically switch between
endpoints when e.g. one fails, so the smallest MTU is now used to ensure
all endpoints will function correctly.
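
A minimal sketch of the selection rule, using a hypothetical helper
named smallest_mtu that receives one already computed tunnel MTU per
endpoint (the values in the example are made up for illustration):

```shell
# Sketch only: smallest_mtu is a hypothetical helper, not wg-quick code.
# Usage: smallest_mtu <mtu> [<mtu> ...]
smallest_mtu() {
	min=$1
	for candidate in "$@"; do
		# Keep the lowest value seen so far, so that traffic fits
		# whichever endpoint happens to be in use.
		if [ "$candidate" -lt "$min" ]; then
			min=$candidate
		fi
	done
	echo "$min"
}

smallest_mtu 1440 1392 1420	# prints 1392
```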

Signed-off-by: Thomas Brierley <tomxor at gmail.com>
---
 src/wg-quick/linux.bash | 41 ++++++++++++++++++++++++++---------------
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/src/wg-quick/linux.bash b/src/wg-quick/linux.bash
index 4193ce5..5aba2cb 100755
--- a/src/wg-quick/linux.bash
+++ b/src/wg-quick/linux.bash
@@ -123,22 +123,33 @@ add_addr() {
 }
 
 set_mtu_up() {
-	local mtu=0 endpoint output
-	if [[ -n $MTU ]]; then
-		cmd ip link set mtu "$MTU" up dev "$INTERFACE"
-		return
-	fi
-	while read -r _ endpoint; do
-		[[ $endpoint =~ ^\[?([a-z0-9:.]+)\]?:[0-9]+$ ]] || continue
-		output="$(ip route get "${BASH_REMATCH[1]}" || true)"
-		[[ ( $output =~ mtu\ ([0-9]+) || ( $output =~ dev\ ([^ ]+) && $(ip link show dev "${BASH_REMATCH[1]}") =~ mtu\ ([0-9]+) ) ) && ${BASH_REMATCH[1]} -gt $mtu ]] && mtu="${BASH_REMATCH[1]}"
-	done < <(wg show "$INTERFACE" endpoints)
-	if [[ $mtu -eq 0 ]]; then
-		read -r output < <(ip route show default || true) || true
-		[[ ( $output =~ mtu\ ([0-9]+) || ( $output =~ dev\ ([^ ]+) && $(ip link show dev "${BASH_REMATCH[1]}") =~ mtu\ ([0-9]+) ) ) && ${BASH_REMATCH[1]} -gt $mtu ]] && mtu="${BASH_REMATCH[1]}"
+	local dev devmtu end endmtu iph=40 wgh=40 mtu
+	# Device MTU
+	if [[ -n $(ip route show default) ]]; then
+		[[ $(ip route show default) =~ dev\ ([^ ]+) ]]
+		dev=${BASH_REMATCH[1]}
+		[[ $(ip addr show $dev scope global) =~ inet6 ]] &&
+			iph=40 || iph=20
+		if [[ $(ip link show dev $dev) =~ mtu\ ([0-9]+) ]]; then
+			devmtu=${BASH_REMATCH[1]}
+			[[ $(( devmtu - iph - wgh )) -gt $mtu ]] &&
+				mtu=$(( devmtu - iph - wgh ))
+		fi
+		# Endpoint MTU
+		while read -r _ end; do
+			[[ $end =~ ^\[?([a-f0-9:.]+)\]?:[0-9]+$ ]]
+			end=${BASH_REMATCH[1]}
+			[[ $end =~ [:] ]] &&
+				iph=40 || iph=20
+			ping -w 1 -M do -s $(( devmtu - iph - 8 )) $end &> /dev/null || true
+			if [[ $(ip route get $end) =~ mtu\ ([0-9]+) ]]; then
+				endmtu=${BASH_REMATCH[1]}
+				[[ $(( endmtu - iph - wgh )) -lt $mtu ]] &&
+					mtu=$(( endmtu - iph - wgh ))
+			fi
+		done < <(wg show "$INTERFACE" endpoints)
 	fi
-	[[ $mtu -gt 0 ]] || mtu=1500
-	cmd ip link set mtu $(( mtu - 80 )) up dev "$INTERFACE"
+	cmd ip link set mtu ${MTU:-${mtu:-1420}} up dev "$INTERFACE"
 }
 
 resolvconf_iface_prefix() {
-- 
2.30.2


