From fbausch at ernw.de Sun Feb 5 19:19:20 2023
From: fbausch at ernw.de (Florian Bausch)
Date: Sun, 5 Feb 2023 20:19:20 +0100
Subject: [PATCH] wg-tools: Fix too strict file permissions on resolv.conf
Message-ID: <90cadce0-51e9-d9f3-4b27-084f49e99f1c@ernw.de>

Hi,

I hardened my system by setting a strict umask of 077 in /etc/login.defs. However, this breaks DNS as soon as wg-quick is used to bring up a WireGuard tunnel. This is because the strict umask value is applied to /etc/resolv.conf (at least if the DNS hatchet is used), and therefore unprivileged processes are not able to read /etc/resolv.conf.

While the behavior can be worked around by setting umask in other places, the fix below would prevent this behavior from occurring. The umask 022 is applied before creating the new /etc/resolv.conf in the DNS hatchet.

Kind regards

---
 contrib/dns-hatchet/hatchet.bash | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/dns-hatchet/hatchet.bash b/contrib/dns-hatchet/hatchet.bash
index bc4d090..807a14a 100644
--- a/contrib/dns-hatchet/hatchet.bash
+++ b/contrib/dns-hatchet/hatchet.bash
@@ -20,6 +20,7 @@ set_dns() {
 		  [[ ${#DNS_SEARCH[@]} -eq 0 ]] || printf 'search %s\n' "${DNS_SEARCH[*]}"
 		} | unshare -m --propagation shared bash -c "$(cat <<-_EOF
 			set -e
+			umask 022
 			context="\$(stat -c %C /etc/resolv.conf 2>/dev/null)" || unset context
 			mount --make-private /dev/shm
 			mount -t tmpfs none /dev/shm
--
2.39.1

From icepic.dz at gmail.com Tue Feb 7 08:30:12 2023
From: icepic.dz at gmail.com (Janne Johansson)
Date: Tue, 7 Feb 2023 09:30:12 +0100
Subject: Encapsulation support in wireguard-go
In-Reply-To:
References:
Message-ID:

On Tue, 7 Feb 2023 at 05:38, Berkant İpek wrote:
> Does wireguard-go provide any facility in order to encapsulate and
> decapsulate WireGuard packets as they leave for and arrive from the
> remote peer? Or, am I just better off to use kernel implementation
> with a TUN device to handle en- and decapsulation and relaying?

I think this is what you might be looking for,
https://github.com/WireGuard/wireguard-go/tree/master/tun/netstack/examples
which has the complete endpoint in the server (and client) so that the respective programs only need to be able to send/receive UDP packets and will build themselves a wg tunnel and talk http inside it.

--
May the most significant bit of your life be positive.

From Jason at zx2c4.com Tue Feb 7 22:10:29 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 7 Feb 2023 19:10:29 -0300
Subject: User on mailinglist that collects addresses to send spam?
In-Reply-To: <5d0d60df-ee8c-87ab-379b-31ed33675668@chil.at>
References: <5d0d60df-ee8c-87ab-379b-31ed33675668@chil.at>
Message-ID:

On Tue, Jan 17, 2023 at 12:45:48AM +0100, Christoph Loesch wrote:
> Hi,
>
> sorry for this unrelated mail.
>
> I use a separate mail-address for every site/contact who wants my mail-address.
> Just checked my spam-folder and what do I see? a mail from janisa at onertronics.com to my wireguard mail-address which I use only for this mailinglist.
> So I guess janisa at onertronics.com is collecting mail-addresses here and then sending unwanted spam-mails.
>
> If this user is on this list, I would recommend to remove this address.
>
> proof: (on request I can also provide full mail headers)

lore.kernel.org archives the mailing list, which preserves full headers:
https://lore.kernel.org/wireguard/5d0d60df-ee8c-87ab-379b-31ed33675668@chil.at/raw

Anybody could subscribe and set up such a thing.

Jason

From Jason at zx2c4.com Tue Feb 7 22:53:48 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 7 Feb 2023 19:53:48 -0300
Subject: [PATCH] handle a network adapter ending in a space character
In-Reply-To:
References:
Message-ID:

Thanks. Can you send this with a `Signed-off-by:` line like every commit in that repository does, please? Then I'll apply this.

Jason

From Jason at zx2c4.com Tue Feb 7 22:54:19 2023
From: Jason at zx2c4.com (Jason A.
Donenfeld)
Date: Tue, 7 Feb 2023 19:54:19 -0300
Subject: [PATCH] wg-tools: Fix too strict file permissions on resolv.conf
In-Reply-To: <90cadce0-51e9-d9f3-4b27-084f49e99f1c@ernw.de>
References: <90cadce0-51e9-d9f3-4b27-084f49e99f1c@ernw.de>
Message-ID:

Thanks. Can you send this with a `Signed-off-by:` line like every commit in that repository does, please? Then I'll apply this.

Jason

From Jason at zx2c4.com Tue Feb 7 22:56:43 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 7 Feb 2023 19:56:43 -0300
Subject: [PATCH] wg: Fix show all endpoints output
In-Reply-To: <11c3d877-2d92-8593-0a9f-e2c918a791c3@gmail.com>
References: <11c3d877-2d92-8593-0a9f-e2c918a791c3@gmail.com>
Message-ID:

Thanks. Can you send this with a `Signed-off-by:` line like every commit in that repository does, please? Then I'll apply this.

Jason

From Jason at zx2c4.com Wed Feb 8 02:03:54 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 7 Feb 2023 23:03:54 -0300
Subject: Allow client-side encrypted backups for Android app
In-Reply-To: <5e029a99-a860-0ae0-be72-df53cf82d0ce@mokrynskyi.com>
References: <5e029a99-a860-0ae0-be72-df53cf82d0ce@mokrynskyi.com>
Message-ID:

I think I'd prefer to still keep this a bit more locked down. There is the "export tunnels as zip" feature (which requires an explicit authentication step each time), which you can use for backup/restore.

Jason

From dcow at pm.me Wed Feb 8 02:19:42 2023
From: dcow at pm.me (David Cowden)
Date: Wed, 08 Feb 2023 02:19:42 +0000
Subject: Allow client-side encrypted backups for Android app
In-Reply-To:
References: <5e029a99-a860-0ae0-be72-df53cf82d0ce@mokrynskyi.com>
Message-ID: <7SV3pRtTQ0fygsJjyhdMRte9uso_M0G_jTfDeqnELBrym4Z_3NeGeIvgNYpEdXttpmafNk_A2NqH26O6VF_8pgrXjzUOmrCVPehRX7Iu_eE=@pm.me>

On Android 12+ you can configure which files are backed up (among other things) at runtime using the BackupAgent API https://developer.android.com/guide/topics/data/autobackup.
Would you be opposed to this being a configurable option that defaults to off?

David

------- Original Message -------
On Tuesday, February 7th, 2023 at 7:03 PM, Jason A. Donenfeld wrote:

>
>
> I think I'd prefer to still keep this a bit more locked down. There is
> the "export tunnels as zip" feature (which requires an explicit
> authentication step each time), which you can use for backup/restore.
>
> Jason

From dseliv at gmail.com Wed Feb 8 06:30:16 2023
From: dseliv at gmail.com (Dmitry Selivanov)
Date: Wed, 8 Feb 2023 09:30:16 +0300
Subject: [PATCH] wg: Fix show all endpoints output
In-Reply-To:
References: <11c3d877-2d92-8593-0a9f-e2c918a791c3@gmail.com>
Message-ID: <2166b145-1508-b407-da27-93393989d209@gmail.com>

Currently "wg show all endpoints" prints interface name only once while other "show all" commands print it on each line as man says.

Signed-off-by: Dmitry Selivanov
---
 src/show.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/show.c b/src/show.c
index 3fd3d9e..13777cf 100644
--- a/src/show.c
+++ b/src/show.c
@@ -312,9 +312,9 @@ static bool ugly_print(struct wgdevice *device, const char *param, bool with_int
 		else
 			printf("off\n");
 	} else if (!strcmp(param, "endpoints")) {
-		if (with_interface)
-			printf("%s\t", device->name);
 		for_each_wgpeer(device, peer) {
+			if (with_interface)
+				printf("%s\t", device->name);
 			printf("%s\t", key(peer->public_key));
 			if (peer->endpoint.addr.sa_family == AF_INET || peer->endpoint.addr.sa_family == AF_INET6)
 				printf("%s\n", endpoint(&peer->endpoint.addr));
--
2.30.2

From nazar at mokrynskyi.com Wed Feb 8 12:38:16 2023
From: nazar at mokrynskyi.com (Nazar Mokrynskyi)
Date: Wed, 8 Feb 2023 14:38:16 +0200
Subject: Allow client-side encrypted backups for Android app
In-Reply-To: <7SV3pRtTQ0fygsJjyhdMRte9uso_M0G_jTfDeqnELBrym4Z_3NeGeIvgNYpEdXttpmafNk_A2NqH26O6VF_8pgrXjzUOmrCVPehRX7Iu_eE=@pm.me>
References: <5e029a99-a860-0ae0-be72-df53cf82d0ce@mokrynskyi.com>
 <7SV3pRtTQ0fygsJjyhdMRte9uso_M0G_jTfDeqnELBrym4Z_3NeGeIvgNYpEdXttpmafNk_A2NqH26O6VF_8pgrXjzUOmrCVPehRX7Iu_eE=@pm.me>
Message-ID:

I know there is an export feature in the app and I used it successfully, but it doesn't make much sense to me to have that and disable OS backups at the same time.
There are use cases for one-off copying of things for which exporting as zip is great, but there are also others.

I don't want to have to set a reminder and regularly go through every single app manually, use their flavor of backup feature (that doesn't necessarily store everything BTW, including in Wireguard), then collect the files somehow, encrypt them and send to the destination.

What I want is automation: configure the tool (SeedVault in my case) to create backups of all apps every day and store them in encrypted form on my private Nextcloud instance with ability to restore backups easily later on.
The issue is that some apps like Wireguard prevent me from enjoying that workflow fully and right now I don't see why it would be beneficial for Wireguard to intentionally prevent that.

With that context I hope it is clearer why I'd appreciate the current design decision around that being re-evaluated.

Sincerely, Nazar Mokrynskyi
github.com/nazar-pc

08.02.23 04:19, David Cowden wrote:
> On Android 12+ you can configure which files are backed up (among other things) at runtime using the BackupAgent API https://developer.android.com/guide/topics/data/autobackup. Would you be opposed to this being a configurable option that defaults to off?
>
> David
>
> ------- Original Message -------
> On Tuesday, February 7th, 2023 at 7:03 PM, Jason A. Donenfeld wrote:
>
>
>>
>> I think I'd prefer to still keep this a bit more locked down. There is
>> the "export tunnels as zip" feature (which requires an explicit
>> authentication step each time), which you can use for backup/restore.
>>
>> Jason
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_0x8CF6D73DB34AAFEA.asc
Type: application/pgp-keys
Size: 4678 bytes
Desc: OpenPGP public key
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 840 bytes
Desc: OpenPGP digital signature
URL:

From nazar at mokrynskyi.com Wed Feb 8 14:00:09 2023
From: nazar at mokrynskyi.com (Nazar Mokrynskyi)
Date: Wed, 8 Feb 2023 16:00:09 +0200
Subject: Allow client-side encrypted backups for Android app
In-Reply-To:
References: <5e029a99-a860-0ae0-be72-df53cf82d0ce@mokrynskyi.com> <7SV3pRtTQ0fygsJjyhdMRte9uso_M0G_jTfDeqnELBrym4Z_3NeGeIvgNYpEdXttpmafNk_A2NqH26O6VF_8pgrXjzUOmrCVPehRX7Iu_eE=@pm.me>
Message-ID: <918a6ce2-436e-3b98-de88-0c4735163830@mokrynskyi.com>

No, I'm requesting that the Wireguard Android app stop intentionally disallowing backups:
https://git.zx2c4.com/wireguard-android/tree/ui/src/main/AndroidManifest.xml?id=713947e432126e0e29dcf497960e5fa0f6301e2b#n36

Sincerely, Nazar Mokrynskyi
github.com/nazar-pc

08.02.23 15:34, John Sahhar wrote:
> I missed the intro to this thread, but if I'm understanding correctly
> you need a safe way to back up your wg keys/configs? I wrote a bash
> script a few years ago which I use for that, perhaps a starting place
> for what you're trying to accomplish.
>
> https://github.com/ok-john/wireguard-tools/tree/master/contrib/key-grid
> https://syscall.network/releases/key-grid.svg
>
> --
> Regards,
> John Sahhar
> Cryptographer @ Entropy
>
> On Wed, Feb 8, 2023 at 12:44 PM Nazar Mokrynskyi wrote:
>> I know there is an export feature in the app and I used it successfully, but it doesn't make much sense to me to have that and disable OS backups at the same time.
>> There are use cases for one-off copying of things for which exporting as zip is great, but there are also others.
>>
>> I don't want to have to set a reminder and regularly go through every single app manually, use their flavor of backup feature (that doesn't necessarily store everything BTW, including in Wireguard), then collect the files somehow, encrypt them and send to the destination.
>>
>> What I want is automation: configure the tool (SeedVault in my case) to create backups of all apps every day and store them in encrypted form on my private Nextcloud instance with ability to restore backups easily later on.
>> The issue is that some apps like Wireguard prevent me from enjoying that workflow fully and right now I don't see why it would be beneficial for Wireguard to intentionally prevent that.
>>
>> With that context I hope it is clearer why I'd appreciate the current design decision around that being re-evaluated.
>>
>> Sincerely, Nazar Mokrynskyi
>> github.com/nazar-pc
>>
>> 08.02.23 04:19, David Cowden wrote:
>>> On Android 12+ you can configure which files are backed up (among other things) at runtime using the BackupAgent API https://developer.android.com/guide/topics/data/autobackup. Would you be opposed to this being a configurable option that defaults to off?
>>>
>>> David
>>>
>>> ------- Original Message -------
>>> On Tuesday, February 7th, 2023 at 7:03 PM, Jason A. Donenfeld wrote:
>>>
>>>
>>>> I think I'd prefer to still keep this a bit more locked down. There is
>>>> the "export tunnels as zip" feature (which requires an explicit
>>>> authentication step each time), which you can use for backup/restore.
>>>>
>>>> Jason
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_0x8CF6D73DB34AAFEA.asc
Type: application/pgp-keys
Size: 4678 bytes
Desc: OpenPGP public key
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 840 bytes
Desc: OpenPGP digital signature
URL:

From Jason at zx2c4.com Wed Feb 8 16:48:51 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Wed, 8 Feb 2023 13:48:51 -0300
Subject: [PATCH] wg: Fix show all endpoints output
In-Reply-To: <2166b145-1508-b407-da27-93393989d209@gmail.com>
References: <11c3d877-2d92-8593-0a9f-e2c918a791c3@gmail.com> <2166b145-1508-b407-da27-93393989d209@gmail.com>
Message-ID:

Applied, thanks.
https://git.zx2c4.com/wireguard-tools/commit/?id=b4f6b4f229d291daf7c35c6f1e7f4841cc6d69bc

From aptalca at linuxserver.io Wed Feb 8 19:33:37 2023
From: aptalca at linuxserver.io (aptalca)
Date: Wed, 8 Feb 2023 14:33:37 -0500
Subject: [PATCH] wg-quick: Set sysctl only if necessary
Message-ID:

Currently, wg-quick script on linux attempts to set the sysctl "net.ipv4.conf.all.src_valid_mark=1" every time, no matter if it's already set or not.

The issue is, when the script is run inside a container lacking the privilege for setting sysctls, it fails with a warning message. In such cases, like a docker container, the user is expected to set the sysctl via docker arguments when creating the container so the sysctl is already set correctly. There is no need for wg-quick to set it inside the container as it's already set. The warning in such cases is a false positive and is confusing to the user as it leads them to believe the sysctl is not set correctly.

One example is the linuxserver wireguard docker image: https://github.com/linuxserver/docker-wireguard

The container is meant to be created with the docker argument '--sysctl="net.ipv4.conf.all.src_valid_mark=1"' so there is no need for wg-quick to set it inside the container. It tries anyway and fails with a warning as listed below. Since the sysctl is already set correctly, everything works as expected.

[#] ip link add wg0 type wireguard
[#] wg setconf wg0 /dev/fd/63
[#] ip -4 address add 10.1.13.12/32 dev wg0
[#] ip link set mtu 1420 up dev wg0
[#] wg set wg0 fwmark 51820
[#] ip -4 route add 0.0.0.0/0 dev wg0 table 51820
[#] ip -4 rule add not fwmark 51820 table 51820
[#] ip -4 rule add table main suppress_prefixlength 0
[#] sysctl -q net.ipv4.conf.all.src_valid_mark=1
sysctl: setting key "net.ipv4.conf.all.src_valid_mark", ignoring: Read-only file system
[#] iptables-restore -n
[#] iptables -t nat -A POSTROUTING -o wg+ -j MASQUERADE

Here's a patch that makes the sysctl setting attempt to be conditional. It first checks whether it's already set correctly, and only attempts to set it if necessary.

Signed-off-by: aptalca
---
 src/wg-quick/linux.bash | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/wg-quick/linux.bash b/src/wg-quick/linux.bash
index 69e5bef..5a8048f 100755
--- a/src/wg-quick/linux.bash
+++ b/src/wg-quick/linux.bash
@@ -237,7 +237,7 @@ add_default() {
 	printf -v restore '%sCOMMIT\n*mangle\n-I POSTROUTING -m mark --mark %d -p udp -j CONNMARK --save-mark %s\n-I PREROUTING -p udp -j CONNMARK --restore-mark %s\nCOMMIT\n' "$restore" $table "$marker" "$marker"
 	printf -v nftcmd '%sadd rule %s %s postmangle meta l4proto udp mark %d ct mark set mark \n' "$nftcmd" "$pf" "$nftable" $table
 	printf -v nftcmd '%sadd rule %s %s premangle meta l4proto udp meta mark set ct mark \n' "$nftcmd" "$pf" "$nftable"
-	[[ $proto == -4 ]] && cmd sysctl -q net.ipv4.conf.all.src_valid_mark=1
+	[[ $proto == -4 ]] && [[ $(sysctl -n net.ipv4.conf.all.src_valid_mark) != 1 ]] && cmd sysctl -q net.ipv4.conf.all.src_valid_mark=1
 	if type -p nft >/dev/null; then
 		cmd nft -f <(echo -n "$nftcmd")
 	else
--
2.34.1

From houmie at gmail.com Thu Feb 9 08:05:35 2023
From: houmie at gmail.com (Houman)
Date: Thu, 9 Feb 2023 08:05:35 +0000
Subject: Compiling Wireguard-Android repo on a M1 Silicon Mac
Message-ID:

I have difficulties compiling the Wireguard-Android repo on a M1
Silicon Mac.

I got this error message below. Any idea what I need to do? Thanks

* What went wrong:
Execution failed for task ':tunnel:buildCMakeDebug[arm64-v8a]'.
> com.android.ide.common.process.ProcessException: ninja: Entering directory `/Users/houmie/Projects/wireguard-android/tunnel/.cxx/Debug/2w2e1e1x/arm64-v8a'
[1/18] Building C object CMakeFiles/libwg-quick.so.dir/ndk-compat/compat.c.o
[2/18] Building C object CMakeFiles/libwg-quick.so.dir/wireguard-tools/src/wg-quick/android.c.o
[3/18] Linking C executable /Users/houmie/Projects/wireguard-android/tunnel/build/intermediates/cxx/Debug/2w2e1e1x/obj/arm64-v8a/libwg-quick.so
[4/18] Building wireguard-go
FAILED: CMakeFiles/libwg-go.so /Users/houmie/Projects/wireguard-android/tunnel/.cxx/Debug/2w2e1e1x/arm64-v8a/CMakeFiles/libwg-go.so
cd /Users/houmie/Projects/wireguard-android/tunnel/tools/libwg-go && make ANDROID_ARCH_NAME=arm64 ANDROID_C_COMPILER=/Users/houmie/Library/Android/sdk/ndk/21.4.7075529/toolchains/llvm/prebuilt/darwin-x86_64/bin/clang ANDROID_TOOLCHAIN_ROOT=/Users/houmie/Library/Android/sdk/ndk/21.4.7075529/toolchains/llvm/prebuilt/darwin-x86_64 ANDROID_LLVM_TRIPLE=aarch64-none-linux-android21 ANDROID_SYSROOT= ANDROID_PACKAGE_NAME=com.wireguard.android.debug GRADLE_USER_HOME=/Users/houmie/.gradle "CFLAGS=-g -DANDROID -fdata-sections -ffunction-sections -funwind-tables -fstack-protector-strong -no-canonical-prefixes -D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -Wno-unused-command-line-argument" "LDFLAGS=-Wl,--exclude-libs,libgcc.a -Wl,--exclude-libs,libgcc_real.a -Wl,--exclude-libs,libatomic.a -static-libstdc++ -Wl,--build-id -Wl,--fatal-warnings -Wl,--no-undefined -Qunused-arguments -fuse-ld=gold" DESTDIR=/Users/houmie/Projects/wireguard-android/tunnel/build/intermediates/cxx/Debug/2w2e1e1x/obj/arm64-v8a BUILDDIR=/Users/houmie/Projects/wireguard-android/tunnel/build/intermediates/cxx/Debug/2w2e1e1x/obj/arm64-v8a/../generated-src
mkdir -p "/Users/houmie/.gradle/caches/golang/"
flock
"/Users/houmie/.gradle/caches/golang/go1.18.2.darwin-arm64.tar.gz.lock" -c ' \ [ -f "/Users/houmie/.gradle/caches/golang/go1.18.2.darwin-arm64.tar.gz" ] && exit 0; \ curl -o "/Users/houmie/.gradle/caches/golang/go1.18.2.darwin-arm64.tar.gz.tmp" "https://dl.google.com/go/go1.18.2.darwin-arm64.tar.gz" && \ echo "6c7df9a2405f09aa9bab55c93c9c4ce41d3e58127d626bc1825ba5d0a0045d5c /Users/houmie/.gradle/caches/golang/go1.18.2.darwin-arm64.tar.gz.tmp" | sha256sum -c && \ mv "/Users/houmie/.gradle/caches/golang/go1.18.2.darwin-arm64.tar.gz.tmp" "/Users/houmie/.gradle/caches/golang/go1.18.2.darwin-arm64.tar.gz"' % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 131M 100 131M 0 0 17.0M 0 0:00:07 0:00:07 --:--:-- 20.0M zsh:1: command not found: sha256sum make: *** [/Users/houmie/.gradle/caches/golang/go1.18.2.darwin-arm64.tar.gz] Error 127 ninja: build stopped: subcommand failed. C++ build system [build] failed while executing: /Users/houmie/Library/Android/sdk/cmake/3.22.1/bin/ninja \ -C \ /Users/houmie/Projects/wireguard-android/tunnel/.cxx/Debug/2w2e1e1x/arm64-v8a \ libwg-quick.so \ libwg.so from /Users/houmie/Projects/wireguard-android/tunnel * Try: > Run with --stacktrace option to get the stack trace. > Run with --info or --debug option to get more log output. > Run with --scan to get full insights. From nomad at null.net Thu Feb 9 08:58:16 2023 From: nomad at null.net (Mark Lawrence) Date: Thu, 9 Feb 2023 09:58:16 +0100 Subject: Compiling Wireguard-Android repo on a M1 Silicon Mac In-Reply-To: References: Message-ID: On Thu Feb 09, 2023 at 08:05:35AM +0000, Houman wrote: > >I got this error message below. Any idea what I need to do? Thanks Just read it a little more carefully :-) >"https://dl.google.com/go/go1.18.2.darwin-arm64.tar.gz" && \ > echo "6c7df9a2405f09aa9bab55c93c9c4ce41d3e58127d626bc1825ba5d0a0045d5c > /Users/houmie/.gradle/caches/golang/go1.18.2.darwin-arm64.tar.gz.tmp" >| sha256sum -c && \ ^^^^^^^^ > ... 
> zsh:1: command not found: sha256sum

It seems your system does not have the SHA utilities installed.

--
Mark Lawrence

From fbausch at ernw.de Wed Feb 15 12:54:05 2023
From: fbausch at ernw.de (Florian Bausch)
Date: Wed, 15 Feb 2023 13:54:05 +0100
Subject: [PATCH] wg-tools: Fix too strict file permissions on resolv.conf
In-Reply-To:
References: <90cadce0-51e9-d9f3-4b27-084f49e99f1c@ernw.de>
Message-ID: <5dd37668-9c40-38a9-4655-199d0f11b4d9@ernw.de>

Hi,

I hardened my system by setting a strict umask of 077 in /etc/login.defs. However, this breaks DNS as soon as wg-quick is used to bring up a WireGuard tunnel. This is because the strict umask value is applied to /etc/resolv.conf (at least if the DNS hatchet is used), and therefore unprivileged processes are not able to read /etc/resolv.conf.

While the behavior can be worked around by setting umask in other places, the fix below would prevent this behavior from occurring. The umask 022 is applied before creating the new /etc/resolv.conf in the DNS hatchet.

Kind regards

Signed-off-by: Florian Bausch
---
 contrib/dns-hatchet/hatchet.bash | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/dns-hatchet/hatchet.bash b/contrib/dns-hatchet/hatchet.bash
index bc4d090..807a14a 100644
--- a/contrib/dns-hatchet/hatchet.bash
+++ b/contrib/dns-hatchet/hatchet.bash
@@ -20,6 +20,7 @@ set_dns() {
 		  [[ ${#DNS_SEARCH[@]} -eq 0 ]] || printf 'search %s\n' "${DNS_SEARCH[*]}"
 		} | unshare -m --propagation shared bash -c "$(cat <<-_EOF
 			set -e
+			umask 022
 			context="\$(stat -c %C /etc/resolv.conf 2>/dev/null)" || unset context
 			mount --make-private /dev/shm
 			mount -t tmpfs none /dev/shm
--
2.39.1
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4819 bytes
Desc: S/MIME Cryptographic Signature
URL:

From fbausch at ernw.de Wed Feb 15 12:55:47 2023
From: fbausch at ernw.de (Florian Bausch)
Date: Wed, 15 Feb 2023 13:55:47 +0100
Subject: [PATCH] wg-tools: Fix too strict file permissions on resolv.conf
In-Reply-To: <5dd37668-9c40-38a9-4655-199d0f11b4d9@ernw.de>
References: <90cadce0-51e9-d9f3-4b27-084f49e99f1c@ernw.de> <5dd37668-9c40-38a9-4655-199d0f11b4d9@ernw.de>
Message-ID: <9191bd2e-5d28-fee3-7fab-246050f20b56@ernw.de>

(This time without signature)

Hi,

I hardened my system by setting a strict umask of 077 in /etc/login.defs. However, this breaks DNS as soon as wg-quick is used to bring up a WireGuard tunnel. This is because the strict umask value is applied to /etc/resolv.conf (at least if the DNS hatchet is used), and therefore unprivileged processes are not able to read /etc/resolv.conf.

While the behavior can be worked around by setting umask in other places, the fix below would prevent this behavior from occurring. The umask 022 is applied before creating the new /etc/resolv.conf in the DNS hatchet.
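
[Editorial sketch, not part of the submitted patch.] The effect described above can be reproduced without touching /etc/resolv.conf. The helper below is hypothetical (the name mode_under_umask is made up for illustration) and assumes a Linux system with GNU coreutils, since it relies on `stat -c %a` to print the octal mode. It creates a scratch file under a given umask and reports the resulting permission bits.

```shell
# Hypothetical helper, for illustration only: create a scratch file under the
# given umask and print the permission bits the file ends up with.
mode_under_umask() {
    local mask=$1 dir file
    dir=$(mktemp -d) || return 1
    file=$dir/resolv.conf
    # The subshell keeps the umask change from leaking into the caller.
    ( umask "$mask" && : > "$file" ) || { rm -rf "$dir"; return 1; }
    stat -c %a "$file"   # GNU stat assumed; prints the octal mode
    rm -rf "$dir"
}

mode_under_umask 077   # prints 600: readable by the owner only
mode_under_umask 022   # prints 644: readable by unprivileged processes too
```

A file created by plain shell redirection starts from mode 666 and has the umask bits cleared, which is why 077 leaves a file that resolvers running as other users cannot open.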

Kind regards

Signed-off-by: Florian Bausch
---
 contrib/dns-hatchet/hatchet.bash | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/dns-hatchet/hatchet.bash b/contrib/dns-hatchet/hatchet.bash
index bc4d090..807a14a 100644
--- a/contrib/dns-hatchet/hatchet.bash
+++ b/contrib/dns-hatchet/hatchet.bash
@@ -20,6 +20,7 @@ set_dns() {
 		  [[ ${#DNS_SEARCH[@]} -eq 0 ]] || printf 'search %s\n' "${DNS_SEARCH[*]}"
 		} | unshare -m --propagation shared bash -c "$(cat <<-_EOF
 			set -e
+			umask 022
 			context="\$(stat -c %C /etc/resolv.conf 2>/dev/null)" || unset context
 			mount --make-private /dev/shm
 			mount -t tmpfs none /dev/shm
--
2.39.1

From dzm at unexpl0.red Sat Feb 11 15:39:12 2023
From: dzm at unexpl0.red (z)
Date: Sat, 11 Feb 2023 15:39:12 +0000
Subject: Noise Protocol Question
Message-ID: <0685312b-2d0f-495b-b321-80d46326b764@app.fastmail.com>

Hi,

I was reading over the source code for wireguard-go, and I noticed something in the device/noise-protocol.go file that I didn't understand.

There are six invocations of the sharedSecret() function, which performs the X25519 operation on a local private key and a remote public key as part of an ECDH key agreement. The first two invocations check for an all zero ECDH result. a.la

    ss := pk.sharedSecret(pubkey)
    if isZero(ss) {
        return nil, errZeroECDHResult
    }

If the result is zero, the operation is aborted.

The subsequent 4 invocations, however, don't check for zero on the output of sharedSecret(), and continue processing regardless.

In two of the 4 cases, I think I get why it isn't necessary, because the sharedSecret is used as input into an aead.Open, which would simply fail if the ECDH got zero'd out somehow. However the remaining two calls are associated with an aead.Seal, which would succeed, no matter what the shared secret is.

TL;DR Why is wireguard go not calling isZero() on the output of the ECDH key agreement every time?

Thanks,
dzm

From Jason at zx2c4.com Thu Feb 16 15:39:35 2023
From: Jason at zx2c4.com (Jason A.
Donenfeld)
Date: Thu, 16 Feb 2023 16:39:35 +0100
Subject: Noise Protocol Question
In-Reply-To: <0685312b-2d0f-495b-b321-80d46326b764@app.fastmail.com>
References: <0685312b-2d0f-495b-b321-80d46326b764@app.fastmail.com>
Message-ID:

On Sat, Feb 11, 2023 at 03:39:12PM +0000, z wrote:
> TL;DR Why is wireguard go not calling isZero() on the output of the ECDH key agreement every time?

Good question. AFAICT, this was something I had noticed back when this code was in development, but then zero checking only got added to the initiation side, not the response side, in 8c34c4c ("First set of code review patches"). I don't know whether this was a mistake or if there was a rationale at the time.

Fortunately, there aren't really any real consequences. But I did fix it up, so thanks very much for reporting this:

https://git.zx2c4.com/wireguard-go/commit/?id=c7b76d3d9ecdc2ffde80decadda88c0c7cdfeedf

Jason

From aleksander.lobakin at intel.com Thu Feb 16 17:05:53 2023
From: aleksander.lobakin at intel.com (Alexander Lobakin)
Date: Thu, 16 Feb 2023 18:05:53 +0100
Subject: [Patch] [testing][wireguard] Remove unneeded version.h include pointed out by 'make versioncheck'
In-Reply-To: <83474b0e-9e44-642f-10c9-2e0ff94b06ca@gmail.com>
References: <83474b0e-9e44-642f-10c9-2e0ff94b06ca@gmail.com>
Message-ID: <597822c9-b859-7a08-6987-1d8a552f6f32@intel.com>

From: Jesper Juhl
Date: Thu, 16 Feb 2023 02:01:05 +0100 (CET)

>> From e2fa4955c676960d0809e4afe8273075c94451c9 Mon Sep 17 00:00:00 2001
> From: Jesper Juhl
> Date: Mon, 13 Feb 2023 02:58:36 +0100
> Subject: [PATCH 06/12] [testing][wireguard] Remove unneeded version.h
> include
>  pointed out by 'make versioncheck'

Your patch is broken, pls resend.
Also I've no idea about the subject/prefix, shouldn't it be like:

[PATCH net-next] wireguard: selftests: remove unneeded version.h

?

>
> Signed-off-by: Jesper Juhl
> ---
>  tools/testing/selftests/wireguard/qemu/init.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/tools/testing/selftests/wireguard/qemu/init.c
> b/tools/testing/selftests/wireguard/qemu/init.c
> index 3e49924dd77e..20d8d3192f75 100644
> --- a/tools/testing/selftests/wireguard/qemu/init.c
> +++ b/tools/testing/selftests/wireguard/qemu/init.c
> @@ -24,7 +24,6 @@
>  #include
>  #include
>  #include
> -#include
>
>  __attribute__((noreturn)) static void poweroff(void)
>  {

Thanks,
Olek

From Jason at zx2c4.com Thu Feb 16 17:35:51 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Thu, 16 Feb 2023 18:35:51 +0100
Subject: [Patch] [testing][wireguard] Remove unneeded version.h include pointed out by 'make versioncheck'
In-Reply-To: <83474b0e-9e44-642f-10c9-2e0ff94b06ca@gmail.com>
References: <83474b0e-9e44-642f-10c9-2e0ff94b06ca@gmail.com>
Message-ID:

No idea if this is something intended for me to apply or if it's an automated email. Fix the formatting, resend, and then maybe I'll apply it?

From rm at romanrm.net Thu Feb 16 19:07:47 2023
From: rm at romanrm.net (Roman Mamedov)
Date: Fri, 17 Feb 2023 00:07:47 +0500
Subject: Force a specific IP for outgoing WG traffic with SNAT?
Message-ID: <20230217000747.0825b2e9@nvm>

Hello,

I'm trying to move all my WG communication with peers to a non-primary IP of my server.

It has IPs added like this:

    inet6 2001:db8::ca6c/128 scope global deprecated
       valid_lft forever preferred_lft 0sec
    inet6 2001:db8::1/128 scope global nodad
       valid_lft forever preferred_lft forever

What I tried:

  ip6tables -t nat -I POSTROUTING -d 2000::/3 -p udp --dport 51820 -j SNAT --to-source 2001:db8::ca6c

Also tried to filter by --sport, and also briefly without a port filter at all.

This has zero effect: as shown by tcpdump, all the WG traffic still originates from 2001:db8::1

Does anyone have an idea why that is?

Thanks

--
With respect,
Roman

From stephen at networkplumber.org Sat Feb 18 17:50:36 2023
From: stephen at networkplumber.org (Stephen Hemminger)
Date: Sat, 18 Feb 2023 09:50:36 -0800
Subject: Fw: [Bug 217054] New: wireguard - allowedips.c - warning: the frame size of 1032 bytes is larger than 1024 bytes
Message-ID: <20230218095036.7c558146@hermes.local>

Begin forwarded message:

Date: Sat, 18 Feb 2023 15:49:26 +0000
From: bugzilla-daemon at kernel.org
To: stephen at networkplumber.org
Subject: [Bug 217054] New: wireguard - allowedips.c - warning: the frame size of 1032 bytes is larger than 1024 bytes

https://bugzilla.kernel.org/show_bug.cgi?id=217054

            Bug ID: 217054
           Summary: wireguard - allowedips.c - warning: the frame size of 1032 bytes is larger than 1024 bytes
           Product: Networking
           Version: 2.5
    Kernel Version: 6.1.12
          Hardware: AMD
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: stephen at networkplumber.org
          Reporter: ionut_n2001 at yahoo.com
        Regression: No

  CC [M]  drivers/memstick/core/memstick.o
drivers/net/wireguard/allowedips.c: In function 'root_remove_peer_lists':
drivers/net/wireguard/allowedips.c:80:1: warning: the frame size of 1032 bytes is larger than 1024 bytes [-Wframe-larger-than=]
   80 | }
      | ^
drivers/net/wireguard/allowedips.c: In function 'root_free_rcu':
drivers/net/wireguard/allowedips.c:67:1: warning: the frame size of 1032 bytes is larger than 1024 bytes [-Wframe-larger-than=]
   67 | }
      | ^
  CC [M]  drivers/net/wireguard/ratelimiter.o
  CC [M]  drivers/memstick/core/ms_block.o

--
You may reply to this email to add a comment.

You are receiving this mail because:
You are the assignee for the bug.

From Jason at zx2c4.com Sat Feb 18 19:06:18 2023
From: Jason at zx2c4.com (Jason A.
Donenfeld)
Date: Sat, 18 Feb 2023 20:06:18 +0100
Subject: Fw: [Bug 217054] New: wireguard - allowedips.c - warning: the frame size of 1032 bytes is larger than 1024 bytes
In-Reply-To: <20230218095036.7c558146@hermes.local>
References: <20230218095036.7c558146@hermes.local>
Message-ID:

On Sat, Feb 18, 2023 at 09:50:36AM -0800, Stephen Hemminger wrote:
>
>
> Begin forwarded message:
>
> Date: Sat, 18 Feb 2023 15:49:26 +0000
> From: bugzilla-daemon at kernel.org
> To: stephen at networkplumber.org
> Subject: [Bug 217054] New: wireguard - allowedips.c - warning: the frame size of 1032 bytes is larger than 1024 bytes
>
>
> https://bugzilla.kernel.org/show_bug.cgi?id=217054
>
>             Bug ID: 217054
>            Summary: wireguard - allowedips.c - warning: the frame size of
>                     1032 bytes is larger than 1024 bytes
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 6.1.12
>           Hardware: AMD
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>           Assignee: stephen at networkplumber.org
>           Reporter: ionut_n2001 at yahoo.com
>         Regression: No
>
>   CC [M]  drivers/memstick/core/memstick.o
> drivers/net/wireguard/allowedips.c: In function 'root_remove_peer_lists':
> drivers/net/wireguard/allowedips.c:80:1: warning: the frame size of 1032 bytes
> is larger than 1024 bytes [-Wframe-larger-than=]
>    80 | }
>       | ^
> drivers/net/wireguard/allowedips.c: In function 'root_free_rcu':
> drivers/net/wireguard/allowedips.c:67:1: warning: the frame size of 1032 bytes
> is larger than 1024 bytes [-Wframe-larger-than=]
>    67 | }
>       | ^
>   CC [M]  drivers/net/wireguard/ratelimiter.o
>   CC [M]  drivers/memstick/core/ms_block.o

This keeps coming up. The frame size that the compiler targets on 64-bit is 1280, not 1024. The reporter misconfigured the .config. Maybe there should be a min value for that. Dunno. Old topic.
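
[Editorial sketch, not from the thread.] The threshold behind -Wframe-larger-than= comes from the kernel's CONFIG_FRAME_WARN option, so one way to see what a given build was configured with is to read that value out of its .config. The function name frame_warn_from_config below is hypothetical, and the sample file is fabricated purely so the snippet is self-contained; in practice the argument would be the path to the reporter's actual .config.

```shell
# Hypothetical helper: print the CONFIG_FRAME_WARN value found in a kernel
# .config file. Prints nothing if the option is not set in that file.
frame_warn_from_config() {
    sed -n 's/^CONFIG_FRAME_WARN=\([0-9][0-9]*\)$/\1/p' "$1"
}

# Self-contained demo on a made-up two-line config, not a real kernel config.
cfg=$(mktemp)
printf '%s\n' 'CONFIG_64BIT=y' 'CONFIG_FRAME_WARN=1024' > "$cfg"
frame_warn_from_config "$cfg"   # prints 1024 for this sample file
rm -f "$cfg"
```

A value of 1024 in the reporter's configuration would match the "larger than 1024 bytes" wording in the warnings quoted above.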
Jason

From nico.schottelius at ungleich.ch Sat Feb 18 20:14:46 2023
From: nico.schottelius at ungleich.ch (Nico Schottelius)
Date: Sat, 18 Feb 2023 21:14:46 +0100
Subject: Source IP incorrect on multi homed systems
Message-ID: <87bklqd7vb.fsf@ungleich.ch>

Dear group,

I was wondering how wireguard [Linux kernel] or wireguard-go [FreeBSD]
are supposed to decide which IP address to use for replying?

I have seen both on FreeBSD and Linux that wireguard seems to use the IP
address of the outgoing interface, i.e. the one with the route returning
to the sender. However, in multi-homed situations this can be wrong.
Take this example:

19:57:24.607526 net1 In IP 194.5.220.43.60770 > 147.78.195.254.51820: UDP, length 148
19:57:24.608358 net2 Out IP 195.141.200.73.51820 > 194.5.220.43.60770: UDP, length 92

The initiator sends from 194.5.220.43 to the receiver 147.78.195.254.
Wireguard then replies with the source IP of 195.141.200.73 instead of
147.78.195.254.

As the node is multi-homed, the reply might leave through any of its
uplinks and thus arrive with an unexpected source IP address, fail to
match NAT rules on firewalls and finally be dropped. For instance, in the
above example the firewall drops the packet from 195.141.200.73, because
there is no session entry for it.

I have observed this behaviour both on Linux 6.1.11 and on
wireguard-go 0.0.20220316_8,1 on FreeBSD, and in both cases the
connection breaks depending on which active interface is taken as
exit.

I would argue that wireguard should by default invert the IP
addresses, i.e. switch dst=src, src=dst and then reply with that,
instead of adopting an interface-specific address. Or is there a good
reason for the current behaviour?
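
For illustration only, here is a minimal Python sketch of what "switch
dst=src, src=dst" can look like for a plain UDP responder on Linux. It is
not WireGuard's code and makes no claim about what the kernel module or
wireguard-go do internally. It assumes the Linux IP_PKTINFO socket option;
the helper names (parse_pktinfo, build_pktinfo, serve_once) are made up
for this example, and the value 8 is the Linux constant, used as a
fallback where Python's socket module does not export IP_PKTINFO.

```python
import socket
import struct

# Linux value of IP_PKTINFO; assumed fallback for Python versions
# whose socket module does not export the constant.
IP_PKTINFO = getattr(socket, "IP_PKTINFO", 8)

# struct in_pktinfo { int ipi_ifindex;
#                     struct in_addr ipi_spec_dst;
#                     struct in_addr ipi_addr; }
PKTINFO = struct.Struct("@i4s4s")


def parse_pktinfo(data):
    """Return (ifindex, destination address) of a received datagram."""
    ifindex, _spec_dst, addr = PKTINFO.unpack(data[:PKTINFO.size])
    return ifindex, socket.inet_ntoa(addr)


def build_pktinfo(source_ip, ifindex=0):
    """Build control data asking the kernel to send from source_ip."""
    # ifindex 0 leaves the choice of outgoing interface to the routing table.
    return PKTINFO.pack(ifindex, socket.inet_aton(source_ip), b"\x00" * 4)


def serve_once(port):
    """Receive one datagram and echo it back from the address it was sent to."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Ask the kernel to report the destination address of each datagram.
        sock.setsockopt(socket.IPPROTO_IP, IP_PKTINFO, 1)
        sock.bind(("0.0.0.0", port))
        data, ancdata, _flags, peer = sock.recvmsg(
            65535, socket.CMSG_SPACE(PKTINFO.size))
        for level, ctype, cdata in ancdata:
            if level == socket.IPPROTO_IP and ctype == IP_PKTINFO:
                _ifindex, local_ip = parse_pktinfo(cdata)
                # Reply with src = the request's dst, dst = the request's src.
                reply = [(socket.IPPROTO_IP, IP_PKTINFO, build_pktinfo(local_ip))]
                sock.sendmsg([data], reply, 0, peer)
                return local_ip
    return None
```

With a responder like this, a datagram sent to 147.78.195.254 would be
answered from 147.78.195.254 regardless of which uplink the routing table
picks for the reply, which is the behaviour asked for above.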
Best regards,

Nico

--
Sustainable and modern Infrastructures by ungleich.ch

From nico.schottelius at ungleich.ch Sat Feb 18 22:34:59 2023
From: nico.schottelius at ungleich.ch (Nico Schottelius)
Date: Sat, 18 Feb 2023 23:34:59 +0100
Subject: Source IP incorrect on multi homed systems
In-Reply-To:
References: <87bklqd7vb.fsf@ungleich.ch>
Message-ID: <877cwed218.fsf@ungleich.ch>

Hello Omkhar,

I tend to disagree. The problem is not the routing, but the selected
source address, which is independent of routing.

To be more specific: as there is BGP routing on all interfaces,
147.78.195.254 is an accepted IP address on any interface.

Best regards,

Nico

Omkhar Arasaratnam writes:

> This looks like an asymmetric routing issue from what you're describing, not a wireguard issue.
>
> You may want to look into policy based routing to address it.
>
> On Sat, Feb 18, 2023 at 15:54 Nico Schottelius wrote:
>
> Dear group,
>
> I was wondering how wireguard [Linux kernel] or wireguard-go [FreeBSD]
> are supposed to decide which IP address to use for replying?
>
> I have seen both on FreeBSD and Linux that wireguard seems to use the IP
> address of the outgoing interface, i.e. the one with the route returning
> to the sender. However in multi homed situations, this can be wrong,
> let's take this example:
>
> 19:57:24.607526 net1 In IP 194.5.220.43.60770 > 147.78.195.254.51820: UDP, length 148
> 19:57:24.608358 net2 Out IP 195.141.200.73.51820 > 194.5.220.43.60770: UDP, length 92
>
> The initiator sends from 194.5.220.43 to the receiver 147.78.195.254.
> Wireguard then replies with the source IP of 195.141.200.73 instead of
> 147.78.195.254.
>
> As the node is multi homed, the packet might leave through any of its
> uplinks and thus return with a random (unexpected) IP address and will
> not pass NAT rules on firewalls and finally be dropped. F.i. in above
> example the firewall drops the packet from 195.141.200.73, because there
> is no session entry for that.
> > I have observed this behaviour both on Linux 6.1.11 as well as > wireguard-go 0.0.20220316_8,1 on FreeBSD and in both cases the > connection will break depending on which active interface is taken as > exit. > > I would argue that wireguard should by default invert the IP > addresses, i.e. switch dst=src, src=dst and then reply with that, > instead of adapting an interface specific address, or is there a good > reason for the current behaviour? > > Best regards, > > Nico > > -- > Sustainable and modern Infrastructures by ungleich.ch -- Sustainable and modern Infrastructures by ungleich.ch From mike at pineview.net Sun Feb 19 00:45:55 2023 From: mike at pineview.net (Mike O'Connor) Date: Sun, 19 Feb 2023 11:15:55 +1030 Subject: Source IP incorrect on multi homed systems In-Reply-To: <87bklqd7vb.fsf@ungleich.ch> References: <87bklqd7vb.fsf@ungleich.ch> Message-ID: Generally all OSs will if sending from a local process will use the address of the outgoing interface for the packet. If the packet is forwarded and no NAT is used the address will be routed via the interface suggested by the routing table. So local routing can be a real pain, policy based routing is an option. The other option could be to setup an 'output' NAT to an address which is multi-homed. I have a system running which is multi-homed with out issue other than the actual routing machine. This machine is BGP connected to three locations. There is no NAT setup and because I also add the wireguard link addresses to the BGP sessions. Cheers On 19/2/2023 6:44 am, Nico Schottelius wrote: > Dear group, > > I was wondering how wireguard [Linux kernel] or wireguard-go [FreeBSD] > are supposed to decide which IP address to use for replying? > > I have seen both on FreeBSD and Linux that wireguard seems to use the IP > address of the outgoing interface, i.e. the one with the route returning > to the sender. 
However in multi homed situations, this can be wrong, > let's take this example: > > 19:57:24.607526 net1 In IP 194.5.220.43.60770 > 147.78.195.254.51820: UDP, length 148 > 19:57:24.608358 net2 Out IP 195.141.200.73.51820 > 194.5.220.43.60770: UDP, length 92 > > The initiator sends from 194.5.220.43 to the receiver 147.78.195.254. > Wireguard then replies with the source IP of 195.141.200.73 instead of > 147.78.195.254. > > As the node is multi homed, the packet might leave through any of its > uplinks and thus return with a random (unexpected) IP address and will > not pass NAT rules on firewalls and finally be dropped. F.i. in above > example the firewall drops the packet from 195.141.200.73, because there > is no session entry for that. > > I have observed this behaviour both on Linux 6.1.11 as well as > wireguard-go 0.0.20220316_8,1 on FreeBSD and in both cases the > connection will break depending on which active interface is taken as > exit. > > I would argue that wireguard should by default invert the IP > addresses, i.e. switch dst=src, src=dst and then reply with that, > instead of adapting an interface specific address, or is there a good > reason for the current behaviour? > > Best regards, > > Nico > > -- > Sustainable and modern Infrastructures by ungleich.ch From nico.schottelius at ungleich.ch Sun Feb 19 08:01:31 2023 From: nico.schottelius at ungleich.ch (Nico Schottelius) Date: Sun, 19 Feb 2023 09:01:31 +0100 Subject: Source IP incorrect on multi homed systems In-Reply-To: References: <87bklqd7vb.fsf@ungleich.ch> Message-ID: <875yby83n2.fsf@ungleich.ch> Let me rephrase the problem statement: - ping and http calls to the multi homed machine work correctly: I can ping 147.78.195.254 and the reply contains the same address. I can ping 195.141.200.73 and the reply contains the same address. I can curl 147.78.195.254 and the reply contains the same address. I can curl 195.141.200.73 and the reply contains the same address. 
- wireguard does NOT work because it changes the reply address: A packet sent to 147.78.195.254 is being replied with 195.141.200.73 In general, processes reply with the IP address that was used to contact them and not with the outgoing interface address, which would also break adding IP addresses to the loopback interface. For full detail, see ip addresses [0] and routing below [1] and tests executed [2]. I believe that this is a bug in wireguard. -------------------------------------------------------------------------------- [2] Let's see how it looks like in detail: 1) ping to 147.78.195.254: works [9:14] nb3:~% ping -c2 147.78.195.254 PING 147.78.195.254 (147.78.195.254) 56(84) bytes of data. 64 bytes from 147.78.195.254: icmp_seq=1 ttl=53 time=7.27 ms 64 bytes from 147.78.195.254: icmp_seq=2 ttl=53 time=6.30 ms --- 147.78.195.254 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1002ms rtt min/avg/max/mdev = 6.296/6.781/7.267/0.485 ms / # tcpdump -ni any host 194.5.220.43 tcpdump: data link type LINUX_SLL2 tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes 08:14:48.379618 net1 In IP 194.5.220.43 > 147.78.195.254: ICMP echo request, id 89, seq 1, length 64 08:14:48.379651 net2 Out IP 147.78.195.254 > 194.5.220.43: ICMP echo reply, id 89, seq 1, length 64 08:14:49.380340 net1 In IP 194.5.220.43 > 147.78.195.254: ICMP echo request, id 89, seq 2, length 64 08:14:49.380392 net2 Out IP 147.78.195.254 > 194.5.220.43: ICMP echo reply, id 89, seq 2, length 64 2) ping to 195.141.200.73 [9:14] nb3:~% ping -c2 195.141.200.73 PING 195.141.200.73 (195.141.200.73) 56(84) bytes of data. 
64 bytes from 195.141.200.73: icmp_seq=1 ttl=53 time=11.3 ms 64 bytes from 195.141.200.73: icmp_seq=2 ttl=53 time=6.81 ms --- 195.141.200.73 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1002ms rtt min/avg/max/mdev = 6.813/9.057/11.301/2.244 ms [9:15] nb3:~% / # tcpdump -ni any host 194.5.220.43 tcpdump: data link type LINUX_SLL2 tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes 08:16:19.257697 net2 In IP 194.5.220.43 > 195.141.200.73: ICMP echo request, id 91, seq 1, length 64 08:16:19.257730 net2 Out IP 195.141.200.73 > 194.5.220.43: ICMP echo reply, id 91, seq 1, length 64 08:16:20.250948 net2 In IP 194.5.220.43 > 195.141.200.73: ICMP echo request, id 91, seq 2, length 64 08:16:20.250980 net2 Out IP 195.141.200.73 > 194.5.220.43: ICMP echo reply, id 91, seq 2, length 64 3) http to 147.78.195.254 [9:16] nb3:~% curl -s 147.78.195.254 > /dev/null ; echo $? 0 / # tcpdump -ni any host 194.5.220.43 tcpdump: data link type LINUX_SLL2 tcpdump: verbose output suppressed, use -v[v]... 
for full protocol decode listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes 08:17:04.082945 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [S], seq 1405408358, win 64240, options [mss 1460,sackOK,TS val 1380610701 ecr 0,nop,wscale 7], length 0 08:17:04.082983 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [S.], seq 3790092363, ack 1405408359, win 65160, options [mss 1460,sackOK,TS val 520503591 ecr 1380610701,nop,wscale 7], length 0 08:17:04.089996 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 1, win 502, options [nop,nop,TS val 1380610709 ecr 520503591], length 0 08:17:04.090121 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [P.], seq 1:79, ack 1, win 502, options [nop,nop,TS val 1380610709 ecr 520503591], length 78: HTTP: GET / HTTP/1.1 08:17:04.090136 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [.], ack 79, win 509, options [nop,nop,TS val 520503598 ecr 1380610709], length 0 08:17:04.090301 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [P.], seq 1:239, ack 79, win 509, options [nop,nop,TS val 520503598 ecr 1380610709], length 238: HTTP: HTTP/1.1 200 OK 08:17:04.090381 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [P.], seq 239:854, ack 79, win 509, options [nop,nop,TS val 520503598 ecr 1380610709], length 615: HTTP 08:17:04.096058 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 239, win 501, options [nop,nop,TS val 1380610715 ecr 520503598], length 0 08:17:04.096059 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 854, win 497, options [nop,nop,TS val 1380610715 ecr 520503598], length 0 08:17:04.096339 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [F.], seq 79, ack 854, win 501, options [nop,nop,TS val 1380610715 ecr 520503598], length 0 08:17:04.096450 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [F.], seq 854, ack 80, win 509, options [nop,nop,TS val 520503604 ecr 1380610715], 
length 0 08:17:04.102609 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 855, win 501, options [nop,nop,TS val 1380610721 ecr 520503604], length 0 4) http to 195.141.200.73 [9:17] nb3:~% curl -s 195.141.200.73 > /dev/null ; echo $? 0 / # tcpdump -ni any host 194.5.220.43 tcpdump: data link type LINUX_SLL2 tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes 08:18:05.951066 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [S], seq 1556080700, win 64240, options [mss 1460,sackOK,TS val 765965336 ecr 0,nop,wscale 7], length 0 08:18:05.951106 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [S.], seq 3465881361, ack 1556080701, win 65160, options [mss 1460,sackOK,TS val 3168643538 ecr 765965336,nop,wscale 7], length 0 08:18:05.958699 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 1, win 502, options [nop,nop,TS val 765965342 ecr 3168643538], length 0 08:18:05.958749 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [P.], seq 1:79, ack 1, win 502, options [nop,nop,TS val 765965342 ecr 3168643538], length 78: HTTP: GET / HTTP/1.1 08:18:05.958763 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [.], ack 79, win 509, options [nop,nop,TS val 3168643545 ecr 765965342], length 0 08:18:05.959216 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [P.], seq 1:239, ack 79, win 509, options [nop,nop,TS val 3168643546 ecr 765965342], length 238: HTTP: HTTP/1.1 200 OK 08:18:05.959327 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [P.], seq 239:854, ack 79, win 509, options [nop,nop,TS val 3168643546 ecr 765965342], length 615: HTTP 08:18:05.965244 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 239, win 501, options [nop,nop,TS val 765965350 ecr 3168643546], length 0 08:18:05.965348 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 854, win 497, options 
[nop,nop,TS val 765965350 ecr 3168643546], length 0 08:18:05.965487 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [F.], seq 79, ack 854, win 501, options [nop,nop,TS val 765965350 ecr 3168643546], length 0 08:18:05.965573 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [F.], seq 854, ack 80, win 509, options [nop,nop,TS val 3168643552 ecr 765965350], length 0 08:18:05.971916 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 855, win 501, options [nop,nop,TS val 765965356 ecr 3168643552], length 0 [0] wireguard "server" that changes the source ip: / # ip a 1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 3: eth0 at if29: mtu 1500 qdisc noqueue state UP link/ether 66:4a:9c:12:5b:6c brd ff:ff:ff:ff:ff:ff inet6 2a0a:e5c0:10:1e:7f21:83ca:a7d:46d2/128 scope global valid_lft forever preferred_lft forever inet6 fe80::644a:9cff:fe12:5b6c/64 scope link valid_lft forever preferred_lft forever 4: net1: mtu 1500 qdisc mq state UP qlen 1000 link/ether 3c:ec:ef:cb:d8:1b brd ff:ff:ff:ff:ff:ff inet 147.78.195.254/27 brd 147.78.195.255 scope global net1 valid_lft forever preferred_lft forever inet6 2a0a:e5c0:1:8::53/64 scope global valid_lft forever preferred_lft forever inet6 fe80::3eec:efff:fecb:d81b/64 scope link valid_lft forever preferred_lft forever 5: v1477819464: mtu 1420 qdisc noqueue state UNKNOWN qlen 1000 link/[65534] inet 147.78.194.65/26 scope global v1477819464 valid_lft forever preferred_lft forever inet6 2a0a:e5c0:2e::1/64 scope global valid_lft forever preferred_lft forever 26: net2: mtu 1500 qdisc mq state UP qlen 1000 link/ether 3c:ec:ef:cb:d8:1c brd ff:ff:ff:ff:ff:ff inet 195.141.200.73/31 scope global net2 valid_lft forever preferred_lft forever inet6 2001:1700:3500:2::12/124 scope global valid_lft forever preferred_lft forever 
inet6 fe80::3eec:efff:fecb:d81c/64 scope link valid_lft forever preferred_lft forever / # wireguard client behind nat: nb3:/etc/wireguard# curl -4 ifconfig.io 194.5.220.43 nb3:/etc/wireguard# ip a sh dev wlan0 2: wlan0: mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 84:5c:f3:ed:52:9c brd ff:ff:ff:ff:ff:ff inet 192.168.4.85/24 brd 192.168.4.255 scope global dynamic noprefixroute wlan0 valid_lft 317sec preferred_lft 242sec inet6 2a0a:e5c0:13:0:865c:f3ff:feed:529c/64 scope global dynamic mngtmpaddr noprefixroute valid_lft 86394sec preferred_lft 14394sec inet6 fe80::865c:f3ff:feed:529c/64 scope link valid_lft forever preferred_lft forever nb3:/etc/wireguard# [1] / # ip route get 194.5.220.43 194.5.220.43 via 195.141.200.72 dev net2 src 195.141.200.73 / # Mike O'Connor writes: > Generally all OSs will if sending from a local process will use the > address of the outgoing interface for the packet. > > If the packet is forwarded and no NAT is used the address will be > routed via the interface suggested by the routing table. > > So local routing can be a real pain, policy based routing is an > option. The other option could be to setup an 'output' NAT to an > address which is multi-homed. > > I have a system running which is multi-homed with out issue other than > the actual routing machine. This machine is BGP connected to three > locations. > > There is no NAT setup and because I also add the wireguard link > addresses to the BGP sessions. > > Cheers > > > > On 19/2/2023 6:44 am, Nico Schottelius wrote: >> Dear group, >> >> I was wondering how wireguard [Linux kernel] or wireguard-go [FreeBSD] >> are supposed to decide which IP address to use for replying? >> >> I have seen both on FreeBSD and Linux that wireguard seems to use the IP >> address of the outgoing interface, i.e. the one with the route returning >> to the sender. 
However in multi homed situations, this can be wrong, >> let's take this example: >> >> 19:57:24.607526 net1 In IP 194.5.220.43.60770 > 147.78.195.254.51820: UDP, length 148 >> 19:57:24.608358 net2 Out IP 195.141.200.73.51820 > 194.5.220.43.60770: UDP, length 92 >> >> The initiator sends from 194.5.220.43 to the receiver 147.78.195.254. >> Wireguard then replies with the source IP of 195.141.200.73 instead of >> 147.78.195.254. >> >> As the node is multi homed, the packet might leave through any of its >> uplinks and thus return with a random (unexpected) IP address and will >> not pass NAT rules on firewalls and finally be dropped. F.i. in above >> example the firewall drops the packet from 195.141.200.73, because there >> is no session entry for that. >> >> I have observed this behaviour both on Linux 6.1.11 as well as >> wireguard-go 0.0.20220316_8,1 on FreeBSD and in both cases the >> connection will break depending on which active interface is taken as >> exit. >> >> I would argue that wireguard should by default invert the IP >> addresses, i.e. switch dst=src, src=dst and then reply with that, >> instead of adapting an interface specific address, or is there a good >> reason for the current behaviour? >> >> Best regards, >> >> Nico >> >> -- >> Sustainable and modern Infrastructures by ungleich.ch -- Sustainable and modern Infrastructures by ungleich.ch From mikma.wg at lists.m7n.se Sun Feb 19 09:19:39 2023 From: mikma.wg at lists.m7n.se (Mikma) Date: Sun, 19 Feb 2023 10:19:39 +0100 Subject: Source IP incorrect on multi homed systems In-Reply-To: <875yby83n2.fsf@ungleich.ch> References: <87bklqd7vb.fsf@ungleich.ch> <875yby83n2.fsf@ungleich.ch> Message-ID: <60C522A0-DFAA-4A25-9E6C-8C4AF0962F5B@lists.m7n.se> Have you tried setting the preferred src address of the route(s) to the addresses you desire? From "man ip": > src ADDRESS the source address to prefer when sending to the destinations covered by the route prefix. 
On 19 February 2023 09:01:31 CET, Nico Schottelius wrote:
>
>Let me rephrase the problem statement:
>
> - ping and http calls to the multi homed machine work correctly:
>   I can ping 147.78.195.254 and the reply contains the same address.
>   I can ping 195.141.200.73 and the reply contains the same address.
>   I can curl 147.78.195.254 and the reply contains the same address.
>   I can curl 195.141.200.73 and the reply contains the same address.
>
> - wireguard does NOT work because it changes the reply address:
>   A packet sent to 147.78.195.254 is being replied with 195.141.200.73
>
>In general, processes reply with the IP address that was used to contact
>them and not with the outgoing interface address, which would also break
>adding IP addresses to the loopback interface.
>
>For full detail, see ip addresses [0] and routing below [1] and tests
>executed [2].
>
>I believe that this is a bug in wireguard.

From nico.schottelius at ungleich.ch Sun Feb 19 12:04:39 2023
From: nico.schottelius at ungleich.ch (Nico Schottelius)
Date: Sun, 19 Feb 2023 13:04:39 +0100
Subject: Source IP incorrect on multi homed systems
In-Reply-To: <60C522A0-DFAA-4A25-9E6C-8C4AF0962F5B@lists.m7n.se>
References: <87bklqd7vb.fsf@ungleich.ch> <875yby83n2.fsf@ungleich.ch> <60C522A0-DFAA-4A25-9E6C-8C4AF0962F5B@lists.m7n.se>
Message-ID: <873571dfff.fsf@ungleich.ch>

Hello Mikma,

Mikma writes:

> Have you tried setting the preferred src address of the route(s) to the addresses you desire?
>
> From "man ip":
>
>> src ADDRESS the source address to prefer when sending to the destinations covered by the route prefix.

Unfortunately, this does not solve the problem. The expected behaviour of
wireguard is to reply with the same IP address, like nginx and the
kernel ICMP handler do, not with a route-based outgoing interface IP
address.

In a BGP-based environment the route can vary dynamically and I showed a
stripped-down version to make it easier to understand.
In practice, many of our systems have 4-7 different upstreams and the
packet can come in on any interface and should leave the machine on the
currently correct interface depending on the route import.

In no case, however, should wireguard change the response address,
because this breaks stateful firewalls.

As demonstrated in my last email, both the in-kernel ICMP handler and
user space applications like nginx behave correctly on the same
machine.

I briefly checked the wireguard source code and I did not immediately
spot the network handling part that sets the source IP, so I am
wondering if this bug is due to wireguard not handling it at all?

Best regards,

Nico

--
Sustainable and modern Infrastructures by ungleich.ch

From nico.schottelius at ungleich.ch Sun Feb 19 12:10:19 2023
From: nico.schottelius at ungleich.ch (Nico Schottelius)
Date: Sun, 19 Feb 2023 13:10:19 +0100
Subject: Source IP incorrect on multi homed systems
In-Reply-To: <875yby83n2.fsf@ungleich.ch>
References: <87bklqd7vb.fsf@ungleich.ch> <875yby83n2.fsf@ungleich.ch>
Message-ID: <87y1otc0p5.fsf@ungleich.ch>

Aside from nginx + icmp being handled correctly as a reference, I want
to further elaborate on this case to show that something is really wrong
with the current behaviour:

A typical scenario for routers is to have a lot of globally reachable IP
addresses (IPv6, IPv4) assigned to the loopback interface, such as
this system:

[13:11] router2.place6:~# ip a sh dev lo
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:1e:a::b/128 scope global
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:1e:a::a/128 scope global
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:2:a::b/128 scope global
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:2:a::a/128 scope global
       valid_lft forever preferred_lft forever
    inet6 2a0a:e5c0:2:1::7/128
scope global valid_lft forever preferred_lft forever inet6 2a0a:e5c0:2:1::6/128 scope global valid_lft forever preferred_lft forever inet6 2a0a:e5c0:2:1::5/128 scope global valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever The motivation behind that is that independent of the actual routing interface, these IP addresses are always reachable. Now in the case of wireguard selecting the source IP based on the outgoing interface, this is never going to work, as lo cannot send packets to the outside world. Nico Schottelius writes: > Let me rephrase the problem statement: > > - ping and http calls to the multi homed machine work correctly: > I can ping 147.78.195.254 and the reply contains the same address. > I can ping 195.141.200.73 and the reply contains the same address. > I can curl 147.78.195.254 and the reply contains the same address. > I can curl 195.141.200.73 and the reply contains the same address. > > - wireguard does NOT work because it changes the reply address: > A packet sent to 147.78.195.254 is being replied with 195.141.200.73 > > In general, processes reply with the IP address that was used to contact > them and not with the outgoing interface address, which would also break > adding IP addresses to the loopback interface. > > For full detail, see ip addresses [0] and routing below [1] and tests > executed [2]. > > I believe that this is a bug in wireguard. > > -------------------------------------------------------------------------------- > > [2] > > Let's see how it looks like in detail: > > 1) ping to 147.78.195.254: works > > [9:14] nb3:~% ping -c2 147.78.195.254 > PING 147.78.195.254 (147.78.195.254) 56(84) bytes of data. 
> 64 bytes from 147.78.195.254: icmp_seq=1 ttl=53 time=7.27 ms > 64 bytes from 147.78.195.254: icmp_seq=2 ttl=53 time=6.30 ms > > --- 147.78.195.254 ping statistics --- > 2 packets transmitted, 2 received, 0% packet loss, time 1002ms > rtt min/avg/max/mdev = 6.296/6.781/7.267/0.485 ms > > / # tcpdump -ni any host 194.5.220.43 > tcpdump: data link type LINUX_SLL2 > tcpdump: verbose output suppressed, use -v[v]... for full protocol decode > listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes > 08:14:48.379618 net1 In IP 194.5.220.43 > 147.78.195.254: ICMP echo request, id 89, seq 1, length 64 > 08:14:48.379651 net2 Out IP 147.78.195.254 > 194.5.220.43: ICMP echo reply, id 89, seq 1, length 64 > 08:14:49.380340 net1 In IP 194.5.220.43 > 147.78.195.254: ICMP echo request, id 89, seq 2, length 64 > 08:14:49.380392 net2 Out IP 147.78.195.254 > 194.5.220.43: ICMP echo reply, id 89, seq 2, length 64 > > 2) ping to 195.141.200.73 > > [9:14] nb3:~% ping -c2 195.141.200.73 > PING 195.141.200.73 (195.141.200.73) 56(84) bytes of data. > 64 bytes from 195.141.200.73: icmp_seq=1 ttl=53 time=11.3 ms > 64 bytes from 195.141.200.73: icmp_seq=2 ttl=53 time=6.81 ms > > --- 195.141.200.73 ping statistics --- > 2 packets transmitted, 2 received, 0% packet loss, time 1002ms > rtt min/avg/max/mdev = 6.813/9.057/11.301/2.244 ms > [9:15] nb3:~% > / # tcpdump -ni any host 194.5.220.43 > tcpdump: data link type LINUX_SLL2 > tcpdump: verbose output suppressed, use -v[v]... 
for full protocol decode > listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes > 08:16:19.257697 net2 In IP 194.5.220.43 > 195.141.200.73: ICMP echo request, id 91, seq 1, length 64 > 08:16:19.257730 net2 Out IP 195.141.200.73 > 194.5.220.43: ICMP echo reply, id 91, seq 1, length 64 > 08:16:20.250948 net2 In IP 194.5.220.43 > 195.141.200.73: ICMP echo request, id 91, seq 2, length 64 > 08:16:20.250980 net2 Out IP 195.141.200.73 > 194.5.220.43: ICMP echo reply, id 91, seq 2, length 64 > > 3) http to 147.78.195.254 > > [9:16] nb3:~% curl -s 147.78.195.254 > /dev/null ; echo $? > 0 > / # tcpdump -ni any host 194.5.220.43 > tcpdump: data link type LINUX_SLL2 > tcpdump: verbose output suppressed, use -v[v]... for full protocol decode > listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes > 08:17:04.082945 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [S], seq 1405408358, win 64240, options [mss 1460,sackOK,TS val 1380610701 ecr 0,nop,wscale 7], length 0 > 08:17:04.082983 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [S.], seq 3790092363, ack 1405408359, win 65160, options [mss 1460,sackOK,TS val 520503591 ecr 1380610701,nop,wscale 7], length 0 > 08:17:04.089996 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 1, win 502, options [nop,nop,TS val 1380610709 ecr 520503591], length 0 > 08:17:04.090121 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [P.], seq 1:79, ack 1, win 502, options [nop,nop,TS val 1380610709 ecr 520503591], length 78: HTTP: GET / HTTP/1.1 > 08:17:04.090136 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [.], ack 79, win 509, options [nop,nop,TS val 520503598 ecr 1380610709], length 0 > 08:17:04.090301 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [P.], seq 1:239, ack 79, win 509, options [nop,nop,TS val 520503598 ecr 1380610709], length 238: HTTP: HTTP/1.1 200 OK > 08:17:04.090381 net2 Out IP 147.78.195.254.80 > 
194.5.220.43.39274: Flags [P.], seq 239:854, ack 79, win 509, options [nop,nop,TS val 520503598 ecr 1380610709], length 615: HTTP > 08:17:04.096058 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 239, win 501, options [nop,nop,TS val 1380610715 ecr 520503598], length 0 > 08:17:04.096059 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 854, win 497, options [nop,nop,TS val 1380610715 ecr 520503598], length 0 > 08:17:04.096339 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [F.], seq 79, ack 854, win 501, options [nop,nop,TS val 1380610715 ecr 520503598], length 0 > 08:17:04.096450 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [F.], seq 854, ack 80, win 509, options [nop,nop,TS val 520503604 ecr 1380610715], length 0 > 08:17:04.102609 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 855, win 501, options [nop,nop,TS val 1380610721 ecr 520503604], length 0 > > > 4) http to 195.141.200.73 > > [9:17] nb3:~% curl -s 195.141.200.73 > /dev/null ; echo $? > 0 > > / # tcpdump -ni any host 194.5.220.43 > tcpdump: data link type LINUX_SLL2 > tcpdump: verbose output suppressed, use -v[v]... 
for full protocol decode > listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes > 08:18:05.951066 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [S], seq 1556080700, win 64240, options [mss 1460,sackOK,TS val 765965336 ecr 0,nop,wscale 7], length 0 > 08:18:05.951106 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [S.], seq 3465881361, ack 1556080701, win 65160, options [mss 1460,sackOK,TS val 3168643538 ecr 765965336,nop,wscale 7], length 0 > 08:18:05.958699 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 1, win 502, options [nop,nop,TS val 765965342 ecr 3168643538], length 0 > 08:18:05.958749 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [P.], seq 1:79, ack 1, win 502, options [nop,nop,TS val 765965342 ecr 3168643538], length 78: HTTP: GET / HTTP/1.1 > 08:18:05.958763 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [.], ack 79, win 509, options [nop,nop,TS val 3168643545 ecr 765965342], length 0 > 08:18:05.959216 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [P.], seq 1:239, ack 79, win 509, options [nop,nop,TS val 3168643546 ecr 765965342], length 238: HTTP: HTTP/1.1 200 OK > 08:18:05.959327 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [P.], seq 239:854, ack 79, win 509, options [nop,nop,TS val 3168643546 ecr 765965342], length 615: HTTP > 08:18:05.965244 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 239, win 501, options [nop,nop,TS val 765965350 ecr 3168643546], length 0 > 08:18:05.965348 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 854, win 497, options [nop,nop,TS val 765965350 ecr 3168643546], length 0 > 08:18:05.965487 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [F.], seq 79, ack 854, win 501, options [nop,nop,TS val 765965350 ecr 3168643546], length 0 > 08:18:05.965573 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [F.], seq 854, ack 80, win 509, options [nop,nop,TS val 
3168643552 ecr 765965350], length 0 > 08:18:05.971916 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 855, win 501, options [nop,nop,TS val 765965356 ecr 3168643552], length 0 > > > > [0] > wireguard "server" that changes the source ip: > > / # ip a > 1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1000 > link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 > inet 127.0.0.1/8 scope host lo > valid_lft forever preferred_lft forever > inet6 ::1/128 scope host > valid_lft forever preferred_lft forever > 3: eth0 at if29: mtu 1500 qdisc noqueue state UP > link/ether 66:4a:9c:12:5b:6c brd ff:ff:ff:ff:ff:ff > inet6 2a0a:e5c0:10:1e:7f21:83ca:a7d:46d2/128 scope global > valid_lft forever preferred_lft forever > inet6 fe80::644a:9cff:fe12:5b6c/64 scope link > valid_lft forever preferred_lft forever > 4: net1: mtu 1500 qdisc mq state UP qlen 1000 > link/ether 3c:ec:ef:cb:d8:1b brd ff:ff:ff:ff:ff:ff > inet 147.78.195.254/27 brd 147.78.195.255 scope global net1 > valid_lft forever preferred_lft forever > inet6 2a0a:e5c0:1:8::53/64 scope global > valid_lft forever preferred_lft forever > inet6 fe80::3eec:efff:fecb:d81b/64 scope link > valid_lft forever preferred_lft forever > 5: v1477819464: mtu 1420 qdisc noqueue state UNKNOWN qlen 1000 > link/[65534] > inet 147.78.194.65/26 scope global v1477819464 > valid_lft forever preferred_lft forever > inet6 2a0a:e5c0:2e::1/64 scope global > valid_lft forever preferred_lft forever > 26: net2: mtu 1500 qdisc mq state UP qlen 1000 > link/ether 3c:ec:ef:cb:d8:1c brd ff:ff:ff:ff:ff:ff > inet 195.141.200.73/31 scope global net2 > valid_lft forever preferred_lft forever > inet6 2001:1700:3500:2::12/124 scope global > valid_lft forever preferred_lft forever > inet6 fe80::3eec:efff:fecb:d81c/64 scope link > valid_lft forever preferred_lft forever > / # > > wireguard client behind nat: > > nb3:/etc/wireguard# curl -4 ifconfig.io > 194.5.220.43 > nb3:/etc/wireguard# ip a sh dev wlan0 > 2: wlan0: mtu 1500 qdisc noqueue state UP 
group default qlen 1000 > link/ether 84:5c:f3:ed:52:9c brd ff:ff:ff:ff:ff:ff > inet 192.168.4.85/24 brd 192.168.4.255 scope global dynamic noprefixroute wlan0 > valid_lft 317sec preferred_lft 242sec > inet6 2a0a:e5c0:13:0:865c:f3ff:feed:529c/64 scope global dynamic mngtmpaddr noprefixroute > valid_lft 86394sec preferred_lft 14394sec > inet6 fe80::865c:f3ff:feed:529c/64 scope link > valid_lft forever preferred_lft forever > nb3:/etc/wireguard# > > > [1] > / # ip route get 194.5.220.43 > 194.5.220.43 via 195.141.200.72 dev net2 src 195.141.200.73 > / # > > > Mike O'Connor writes: > >> Generally all OSs will if sending from a local process will use the >> address of the outgoing interface for the packet. >> >> If the packet is forwarded and no NAT is used the address will be >> routed via the interface suggested by the routing table. >> >> So local routing can be a real pain, policy based routing is an >> option. The other option could be to setup an 'output' NAT to an >> address which is multi-homed. >> >> I have a system running which is multi-homed with out issue other than >> the actual routing machine. This machine is BGP connected to three >> locations. >> >> There is no NAT setup and because I also add the wireguard link >> addresses to the BGP sessions. >> >> Cheers >> >> >> >> On 19/2/2023 6:44 am, Nico Schottelius wrote: >>> Dear group, >>> >>> I was wondering how wireguard [Linux kernel] or wireguard-go [FreeBSD] >>> are supposed to decide which IP address to use for replying? >>> >>> I have seen both on FreeBSD and Linux that wireguard seems to use the IP >>> address of the outgoing interface, i.e. the one with the route returning >>> to the sender. 
However in multi homed situations, this can be wrong, >>> let's take this example: >>> >>> 19:57:24.607526 net1 In IP 194.5.220.43.60770 > 147.78.195.254.51820: UDP, length 148 >>> 19:57:24.608358 net2 Out IP 195.141.200.73.51820 > 194.5.220.43.60770: UDP, length 92 >>> >>> The initiator sends from 194.5.220.43 to the receiver 147.78.195.254. >>> Wireguard then replies with the source IP of 195.141.200.73 instead of >>> 147.78.195.254. >>> >>> As the node is multi homed, the packet might leave through any of its >>> uplinks and thus return with a random (unexpected) IP address and will >>> not pass NAT rules on firewalls and finally be dropped. F.i. in above >>> example the firewall drops the packet from 195.141.200.73, because there >>> is no session entry for that. >>> >>> I have observed this behaviour both on Linux 6.1.11 as well as >>> wireguard-go 0.0.20220316_8,1 on FreeBSD and in both cases the >>> connection will break depending on which active interface is taken as >>> exit. >>> >>> I would argue that wireguard should by default invert the IP >>> addresses, i.e. switch dst=src, src=dst and then reply with that, >>> instead of adapting an interface specific address, or is there a good >>> reason for the current behaviour? >>> >>> Best regards, >>> >>> Nico >>> >>> -- >>> Sustainable and modern Infrastructures by ungleich.ch -- Sustainable and modern Infrastructures by ungleich.ch From nico.schottelius at ungleich.ch Sun Feb 19 12:13:58 2023 From: nico.schottelius at ungleich.ch (Nico Schottelius) Date: Sun, 19 Feb 2023 13:13:58 +0100 Subject: Source IP incorrect on multi homed systems In-Reply-To: <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> Message-ID: <87ttzhc0jt.fsf@ungleich.ch> Hey Sebastian, Sebastian Hyrwall writes: > It is kinda. 
It's been mentioned multiple times over the years but no one seems to want to fix it. Atleast you should be able to specify bind/src ip in the
> config. I gave up WG because of it. Wasn't accepted by my projects security policy since src ip could not be configured.
>
> There is an unofficial patch however,
>
> https://github.com/torvalds/linux/commit/5fa98082093344c86345f9f63305cae9d5f9f281

the binding is somewhat related to this issue and I was looking for that
feature some time ago, too. While it is correlated and I would really
appreciate binding support, I am not sure whether the linked patch does
actually fix the problem I am seeing in multi homed devices.

As long as wireguard does not reply with the same IP address it was
contacted with, packets will get dropped on stateful firewalls, because
the returning packet does not match the state session database.

Best regards,

Nico

--
Sustainable and modern Infrastructures by ungleich.ch

From wireguard-mail at chil.at Sun Feb 19 14:39:20 2023
From: wireguard-mail at chil.at (Christoph Loesch)
Date: Sun, 19 Feb 2023 15:39:20 +0100
Subject: Source IP incorrect on multi homed systems
In-Reply-To: <87ttzhc0jt.fsf@ungleich.ch>
References: <875yby83n2.fsf@ungleich.ch>
 <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id>
 <87ttzhc0jt.fsf@ungleich.ch>
Message-ID: <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at>

Hi,

I don't think no one wants to fix it, there are several users having this issue. I rather guess no one could find a suitable solution to fix it.

@Nico: did you try to delete the affected route and add it again with the correct source IP?

as I mentioned it in https://lists.zx2c4.com/pipermail/wireguard/2021-November/007324.html

ip route del
ip route add dev src

This way I was able to (at least temporarily) fix this issue on multi homed systems.

Kind regards,
Christoph

Am 19.02.2023 um 13:13 schrieb Nico Schottelius:
> Hey Sebastian,
>
> Sebastian Hyrwall writes:
>
>> It is kinda.
It's been mentioned multiple times over the years but no one seems to want to fix it. Atleast you should be able to specify bind/src ip in the >> config. I gave up WG because of it. Wasn't accepted by my projects security policy since src ip could not be configured. >> >> There is an unofficial patch however, >> >> https://github.com/torvalds/linux/commit/5fa98082093344c86345f9f63305cae9d5f9f281 > the binding is somewhat related to this issue and I was looking for that > feature some time ago, too. While it is correlated and I would really > appreciate binding support, I am not sure whether the linked patch does > actually fix the problem I am seeing in multi homed devices. > > As long as wireguard does not reply with the same IP address it was > contacted with, packets will get dropped on stateful firewalls, because > the returning packet does not match the state session database. > > Best regards, > > Nico > > -- > Sustainable and modern Infrastructures by ungleich.ch From david at kerr.net Sun Feb 19 16:32:23 2023 From: david at kerr.net (David Kerr) Date: Sun, 19 Feb 2023 11:32:23 -0500 Subject: Source IP incorrect on multi homed systems In-Reply-To: <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> Message-ID: Without getting into the debate of whether wireguard is acting correctly or not, I think there is a possible workaround. 1. In the iptables mangle table PREROUTING, match the incoming interface and destination address and --set-xmark a firewall MARK unique to this interface/destination 2. Create a new ip route table that sets the default route to go out on the interface with the source address you want (same as destination address in iptables) 3. 
Create a new ip rule that sends all packets with firewall mark set
in iptables to the routing table you just created

Repeat above for each interface/address you need to mangle, with a
unique firewall mark and routing table for each.

It may be necessary to use CONNMARK in PREROUTING and OUTPUT to
--restore-mark. I can't remember if this is needed or not, it's been a
while since I configured iptables with this.

This should ensure that any packet that comes into an
interface/address is replied to from the same interface/address.

David


On Sun, Feb 19, 2023 at 9:44 AM Christoph Loesch wrote:
>
> Hi,
>
> I don't think no one wants to fix it, there are several users having this issue. I rather guess no one could find a suitable solution to fix it.
>
> @Nico: did you try to delete the affected route and add it again with the correct source IP ?
>
> as I mentioned it in https://lists.zx2c4.com/pipermail/wireguard/2021-November/007324.html
>
> ip route del
> ip route add dev src
>
> This way I was able to (at least temporary) fix this issue on multi homed systems.
>
> Kind regards,
> Christoph
>
> Am 19.02.2023 um 13:13 schrieb Nico Schottelius:
> > Hey Sebastian,
> >
> > Sebastian Hyrwall writes:
> >
> >> It is kinda. It's been mentioned multiple times over the years but no one seems to want to fix it. Atleast you should be able to specify bind/src ip in the
> >> config. I gave up WG because of it. Wasn't accepted by my projects security policy since src ip could not be configured.
> >>
> >> There is an unofficial patch however,
> >>
> >> https://github.com/torvalds/linux/commit/5fa98082093344c86345f9f63305cae9d5f9f281
> > the binding is somewhat related to this issue and I was looking for that
> > feature some time ago, too. While it is correlated and I would really
> > appreciate binding support, I am not sure whether the linked patch does
> > actually fix the problem I am seeing in multi homed devices.
> > > > As long as wireguard does not reply with the same IP address it was > > contacted with, packets will get dropped on stateful firewalls, because > > the returning packet does not match the state session database. > > > > Best regards, > > > > Nico > > > > -- > > Sustainable and modern Infrastructures by ungleich.ch From luhe at fsoj.de Sun Feb 19 14:32:08 2023 From: luhe at fsoj.de (Luca Heise) Date: Sun, 19 Feb 2023 15:32:08 +0100 Subject: Missing translation keys from recent on-demand updates Message-ID: <4f87360e-acd3-3e0c-ad54-8cf3c54b6b4f@fsoj.de> Hello, I noticed that some translations are missing on macOS and iOS as just the translation-keys are shown. According to a diff (at least) the following keys from the original/English version are neither in the German Localizable.strings nor in the German crowdin. Looks like this in other languages too. All of these keys were added in the last updates, so I assume that they were just forgotten to be added. tunnelListCaptionOnDemand tunnelStatusAddendumOnDemand tunnelStatusOnDemandDisabled tunnelStatusAddendumOnDemandEnabled tunnelStatusAddendumOnDemandDisabled tunnelOnDemandSSIDTextFieldPlaceholder I don't really know the translation process here. I think adding these keys to crowdin so people could start to translate would be the first step. Can anyone do that? Luca From sh at keff.org Sun Feb 19 16:54:10 2023 From: sh at keff.org (Sebastian Hyrvall) Date: Sun, 19 Feb 2023 23:54:10 +0700 Subject: Source IP incorrect on multi homed systems In-Reply-To: References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> Message-ID: <00b94fdf-22d5-00ad-e068-30ad4a453236@keff.org> You should get into that debate. Proposing firewall workarounds is not a correct solution so please don't do it. It needs to be fixed. 
It's an immature VPN solution that always just proposed a workaround instead of
fixing the problem. It seems to be designed by people that are good at
software and cryptography but have no clue about networking stacks.

On 2023-02-19 23:32, David Kerr wrote:
> Without getting into the debate of whether wireguard is acting
> correctly or not, I think there is a possible workaround.
>
> 1. In the iptables mangle table PREROUTING, match the incoming
> interface and destination address and --set-xmark a firewall MARK
> unique to this interface/destination
> 2. Create a new ip route table that sets the default route to go out
> on the interface with the source address you want (same as destination
> address in iptables)
> 3. Create a new ip rule that sends all packets with firewall mark set
> in iptables to the routing table you just created
>
> Repeat above for each interface/address you need to mangle, with a
> unique firewall mark and routing table for each.
>
> It may be necessary to use CONNMARK in PREROUTING and OUTPUT to
> --restore_mark. I can't remember if this is needed or not, its been a
> while since I configured iptables with this.
>
> This should ensure that any packet that comes into an
> interface/address is replied to from the same interface/address.
>
> David
>
>
> On Sun, Feb 19, 2023 at 9:44 AM Christoph Loesch wrote:
>> Hi,
>>
>> I don't think no one wants to fix it, there are several users having this issue. I rather guess no one could find a suitable solution to fix it.
>>
>> @Nico: did you try to delete the affected route and add it again with the correct source IP ?
>>
>> as I mentioned it in https://lists.zx2c4.com/pipermail/wireguard/2021-November/007324.html
>>
>> ip route del
>> ip route add dev src
>>
>> This way I was able to (at least temporary) fix this issue on multi homed systems.
>>
>> Kind regards,
>> Christoph
>>
>> Am 19.02.2023 um 13:13 schrieb Nico Schottelius:
>>> Hey Sebastian,
>>>
>>> Sebastian Hyrwall writes:
>>>
>>>> It is kinda. It's been mentioned multiple times over the years but no one seems to want to fix it. Atleast you should be able to specify bind/src ip in the
>>>> config. I gave up WG because of it. Wasn't accepted by my projects security policy since src ip could not be configured.
>>>>
>>>> There is an unofficial patch however,
>>>>
>>>> https://github.com/torvalds/linux/commit/5fa98082093344c86345f9f63305cae9d5f9f281
>>> the binding is somewhat related to this issue and I was looking for that
>>> feature some time ago, too. While it is correlated and I would really
>>> appreciate binding support, I am not sure whether the linked patch does
>>> actually fix the problem I am seeing in multi homed devices.
>>>
>>> As long as wireguard does not reply with the same IP address it was
>>> contacted with, packets will get dropped on stateful firewalls, because
>>> the returning packet does not match the state session database.
>>>
>>> Best regards,
>>>
>>> Nico
>>>
>>> --
>>> Sustainable and modern Infrastructures by ungleich.ch

From tlhackque at yahoo.com Sun Feb 19 17:05:11 2023
From: tlhackque at yahoo.com (tlhackque)
Date: Sun, 19 Feb 2023 12:05:11 -0500
Subject: Source IP incorrect on multi homed systems
In-Reply-To:
References: <875yby83n2.fsf@ungleich.ch>
 <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id>
 <87ttzhc0jt.fsf@ungleich.ch>
 <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at>
Message-ID:

FWIW, while clever, I don't think that iptables mark solves all cases. E.g., consider an interface with multiple addresses, where a packet comes in on a secondary address. The proposed rule would send it out the right interface, but still with the wrong (primary) address picked from the interface...

With IPv6 it's common to assign an address to a service rather than a host so services can move easily.
So multiple addresses per interface are the rule, not the exception.

I do the same with IPv4 inside addresses, though these days public IPv4 addresses are scarce enough that it's not common for public IPs. It amounts to the same issue - the NAT tracking is stateful.

Trying to work around this with routing seems like a maze of twisty passages - so I agree that the right solution is for WG to respond from the address that receives a packet.

On 19-Feb-23 11:32, David Kerr wrote:
> Without getting into the debate of whether wireguard is acting
> correctly or not, I think there is a possible workaround.
>
> 1. In the iptables mangle table PREROUTING, match the incoming
> interface and destination address and --set-xmark a firewall MARK
> unique to this interface/destination
> 2. Create a new ip route table that sets the default route to go out
> on the interface with the source address you want (same as destination
> address in iptables)
> 3. Create a new ip rule that sends all packets with firewall mark set
> in iptables to the routing table you just created
>
> Repeat above for each interface/address you need to mangle, with a
> unique firewall mark and routing table for each.
>
> It may be necessary to use CONNMARK in PREROUTING and OUTPUT to
> --restore_mark. I can't remember if this is needed or not, its been a
> while since I configured iptables with this.
>
> This should ensure that any packet that comes into an
> interface/address is replied to from the same interface/address.
>
> David
>
>
> On Sun, Feb 19, 2023 at 9:44 AM Christoph Loesch wrote:
>> Hi,
>>
>> I don't think no one wants to fix it, there are several users having this issue. I rather guess no one could find a suitable solution to fix it.
>>
>> @Nico: did you try to delete the affected route and add it again with the correct source IP ?
>>
>> as I mentioned it in https://lists.zx2c4.com/pipermail/wireguard/2021-November/007324.html
>>
>> ip route del
>> ip route add dev src
>>
>> This way I was able to (at least temporary) fix this issue on multi homed systems.
>>
>> Kind regards,
>> Christoph
>>
>> Am 19.02.2023 um 13:13 schrieb Nico Schottelius:
>>> Hey Sebastian,
>>>
>>> Sebastian Hyrwall writes:
>>>
>>>> It is kinda. It's been mentioned multiple times over the years but no one seems to want to fix it. Atleast you should be able to specify bind/src ip in the
>>>> config. I gave up WG because of it. Wasn't accepted by my projects security policy since src ip could not be configured.
>>>>
>>>> There is an unofficial patch however,
>>>>
>>>> https://github.com/torvalds/linux/commit/5fa98082093344c86345f9f63305cae9d5f9f281
>>> the binding is somewhat related to this issue and I was looking for that
>>> feature some time ago, too. While it is correlated and I would really
>>> appreciate binding support, I am not sure whether the linked patch does
>>> actually fix the problem I am seeing in multi homed devices.
>>>
>>> As long as wireguard does not reply with the same IP address it was
>>> contacted with, packets will get dropped on stateful firewalls, because
>>> the returning packet does not match the state session database.
>>>
>>> Best regards,
>>>
>>> Nico
>>>
>>> --
>>> Sustainable and modern Infrastructures by ungleich.ch

-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 840 bytes
Desc: OpenPGP digital signature
URL:

From dxld at darkboxed.org Sun Feb 19 18:04:28 2023
From: dxld at darkboxed.org (Daniel Gröber)
Date: Sun, 19 Feb 2023 19:04:28 +0100
Subject: [RESEND PATCH v3] wg: Support restricting address family of DNS resolved Endpoint
Message-ID: <20230219180428.438453-1-dxld@darkboxed.org>

When using wireguard tunnels for providing IPv6 connectivity to machines it can be important to pin which IP address family should be used.

Consider a peer using a DNS name with both A and AAAA records: wg will currently blindly follow system policy and use the first address returned by getaddrinfo(). In typical deployments this will cause the IPv6 address of the peer to be used, however when the whole IPv6 internet is being routed over our wg iface all this accomplishes is a traffic black hole.

Naturally this can be worked around by having different DNS names for v4-only / dual-stack addresses, however this may not be possible in some situations where, say, a dynamic-DNS service is also in use.

To fix this we allow users to control which address family they want using the new AddressFamily= config option, see wg.8 for details. We also update reresolve-dns to take the AddressFamily option into account.

We would like to note that the not_oif patch[1] would also alleviate this problem but since this never got merged it's not a workable solution.

[1]: http://marc.info/?t=145452167200014&r=1&w=2

Signed-off-by: Daniel Gröber
---
 contrib/reresolve-dns/reresolve-dns.sh |  4 ++-
 src/config.c                           | 41 ++++++++++++++++++++------
 src/config.h                           |  2 +-
 src/containers.h                       |  5 ++++
 src/man/wg.8                           |  8 ++++-
 src/set.c                              |  9 +++++-
 src/setconf.c                          |  2 +-
 7 files changed, 57 insertions(+), 14 deletions(-)

diff --git a/contrib/reresolve-dns/reresolve-dns.sh b/contrib/reresolve-dns/reresolve-dns.sh
index 711c332..bdb47ac 100755
--- a/contrib/reresolve-dns/reresolve-dns.sh
+++ b/contrib/reresolve-dns/reresolve-dns.sh
@@ -17,7 +17,7 @@ process_peer() {
 	[[ $PEER_SECTION -ne 1 || -z $PUBLIC_KEY || -z $ENDPOINT ]] && return 0
 	[[ $(wg show "$INTERFACE" latest-handshakes) =~ ${PUBLIC_KEY//+/\\+}\ ([0-9]+) ]] || return 0
 	(( ($EPOCHSECONDS - ${BASH_REMATCH[1]}) > 135 )) || return 0
-	wg set "$INTERFACE" peer "$PUBLIC_KEY" endpoint "$ENDPOINT"
+	wg set "$INTERFACE" peer "$PUBLIC_KEY" endpoint "$ENDPOINT" address-family "$FAMILY"
 	reset_peer_section
 }
 
@@ -25,6 +25,7 @@ reset_peer_section() {
 	PEER_SECTION=0
 	PUBLIC_KEY=""
 	ENDPOINT=""
+	FAMILY=unspec
 }
 
 reset_peer_section
@@ -38,6 +39,7 @@ while read -r line || [[ -n $line ]]; do
 		case "$key" in
 		PublicKey) PUBLIC_KEY="$value"; continue ;;
 		Endpoint) ENDPOINT="$value"; continue ;;
+		AddressFamily) FAMILY="$value"; continue ;;
 		esac
 	fi
 done < "$CONFIG_FILE"
diff --git a/src/config.c b/src/config.c
index 81ccb47..e8db900 100644
--- a/src/config.c
+++ b/src/config.c
@@ -192,14 +192,14 @@ static inline int parse_dns_retries(void)
 	return (int)ret;
 }
 
-static inline bool parse_endpoint(struct sockaddr *endpoint, const char *value)
+static inline bool parse_endpoint(struct sockaddr *endpoint, const char *value, int family)
 {
 	char *mutable = strdup(value);
 	char *begin, *end;
 	int ret, retries = parse_dns_retries();
 	struct addrinfo *resolved;
 	struct addrinfo hints = {
-		.ai_family = AF_UNSPEC,
+		.ai_family = family,
 		.ai_socktype = SOCK_DGRAM,
 		.ai_protocol = IPPROTO_UDP
 	};
@@ -279,6 +279,20 @@ static inline bool parse_endpoint(struct sockaddr *endpoint, const char *value)
 	return true;
 }
 
+static inline bool parse_address_family(int *family, const char *value)
+{
+	if (strcmp(value, "inet") == 0)
+		*family = AF_INET;
+	else if (strcmp(value, "inet6") == 0)
+		*family = AF_INET6;
+	else if (strcmp(value, "unspec") == 0)
+		*family = AF_UNSPEC;
+	else
+		return false;
+
+	return true;
+}
+
 static inline bool parse_persistent_keepalive(uint16_t *interval, uint32_t *flags, const char *value)
 {
 	unsigned long ret;
@@ -454,8 +468,10 @@ static bool process_line(struct config_ctx *ctx, const char *line)
 			goto error;
 	} else if (ctx->is_peer_section) {
 		if (key_match("Endpoint"))
-			ret = parse_endpoint(&ctx->last_peer->endpoint.addr, value);
-		else if (key_match("PublicKey")) {
+			ctx->last_peer->endpoint_value = strdup(value);
+		else if (key_match("AddressFamily")) {
+			ret = parse_address_family(&ctx->last_peer->addr_fam, value);
+		} else if (key_match("PublicKey")) {
 			ret = parse_key(ctx->last_peer->public_key, value);
 			if (ret)
 				ctx->last_peer->flags |= WGPEER_HAS_PUBLIC_KEY;
@@ -527,19 +543,22 @@ bool config_read_init(struct config_ctx *ctx, bool append)
 	return true;
 }
 
-struct wgdevice *config_read_finish(struct config_ctx *ctx)
+struct wgdevice *config_read_finish(struct wgdevice *device)
 {
 	struct wgpeer *peer;
 
-	for_each_wgpeer(ctx->device, peer) {
+	for_each_wgpeer(device, peer) {
 		if (!(peer->flags & WGPEER_HAS_PUBLIC_KEY)) {
 			fprintf(stderr, "A peer is missing a public key\n");
 			goto err;
 		}
+
+		if (!parse_endpoint(&peer->endpoint.addr, peer->endpoint_value, peer->addr_fam))
+			goto err;
 	}
-	return ctx->device;
+	return device;
 err:
-	free_wgdevice(ctx->device);
+	free_wgdevice(device);
 	return NULL;
 }
 
@@ -611,7 +630,11 @@ struct wgdevice *config_read_cmd(const char *argv[], int argc)
 			argv += 1;
 			argc -= 1;
 		} else if (!strcmp(argv[0], "endpoint") && argc >= 2 && peer) {
-			if (!parse_endpoint(&peer->endpoint.addr, argv[1]))
+			peer->endpoint_value = strdup(argv[1]);
+			argv += 2;
+			argc -= 2;
+		} else if (!strcmp(argv[0], "address-family") && argc >= 2 && peer) {
+			if (!parse_address_family(&peer->addr_fam, argv[1]))
 				goto error;
 			argv += 2;
 			argc -= 2;
diff --git a/src/config.h b/src/config.h
index 443cf21..6f81da2 100644
--- a/src/config.h
+++ b/src/config.h
@@ -22,6 +22,6 @@ struct config_ctx {
 struct wgdevice *config_read_cmd(const char *argv[], int argc);
 bool config_read_init(struct config_ctx *ctx, bool append);
 bool config_read_line(struct config_ctx *ctx, const char *line);
-struct wgdevice *config_read_finish(struct config_ctx *ctx);
+struct wgdevice *config_read_finish(struct wgdevice *device);
 
 #endif
diff --git a/src/containers.h b/src/containers.h
index a82e8dd..c111621 100644
--- a/src/containers.h
+++ b/src/containers.h
@@ -52,12 +52,15 @@ struct wgpeer {
 	uint8_t public_key[WG_KEY_LEN];
 	uint8_t preshared_key[WG_KEY_LEN];
 
+	char *endpoint_value;
 	union {
 		struct sockaddr addr;
 		struct sockaddr_in addr4;
 		struct sockaddr_in6 addr6;
 	} endpoint;
 
+	int addr_fam;
+
 	struct timespec64 last_handshake_time;
 	uint64_t rx_bytes, tx_bytes;
 	uint16_t persistent_keepalive_interval;
@@ -99,6 +102,8 @@ static inline void free_wgdevice(struct wgdevice *dev)
 	for (struct wgpeer *peer = dev->first_peer, *np = peer ? peer->next_peer : NULL; peer; peer = np, np = peer ? peer->next_peer : NULL) {
 		for (struct wgallowedip *allowedip = peer->first_allowedip, *na = allowedip ? allowedip->next_allowedip : NULL; allowedip; allowedip = na, na = allowedip ? allowedip->next_allowedip : NULL)
 			free(allowedip);
+		if (peer->endpoint_value)
+			free(peer->endpoint_value);
 		free(peer);
 	}
 	free(dev);
diff --git a/src/man/wg.8 b/src/man/wg.8
index 7984539..fd9fde7 100644
--- a/src/man/wg.8
+++ b/src/man/wg.8
@@ -55,7 +55,7 @@ transfer-rx, transfer-tx, persistent-keepalive.
 Shows the current configuration of \fI\fP in the format
 described by \fICONFIGURATION FILE FORMAT\fP below.
 .TP
-\fBset\fP \fI\fP [\fIlisten-port\fP \fI\fP] [\fIfwmark\fP \fI\fP] [\fIprivate-key\fP \fI\fP] [\fIpeer\fP \fI\fP [\fIremove\fP] [\fIpreshared-key\fP \fI\fP] [\fIendpoint\fP \fI:\fP] [\fIpersistent-keepalive\fP \fI\fP] [\fIallowed-ips\fP \fI/\fP[,\fI/\fP]...] ]...
+\fBset\fP \fI\fP [\fIlisten-port\fP \fI\fP] [\fIfwmark\fP \fI\fP] [\fIprivate-key\fP \fI\fP] [\fIpeer\fP \fI\fP [\fIremove\fP] [\fIpreshared-key\fP \fI\fP] [\fIendpoint\fP \fI:\fP] [\fIaddress-family\fP \fI\fP] [\fIpersistent-keepalive\fP \fI\fP] [\fIallowed-ips\fP \fI/\fP[,\fI/\fP]...] ]...
 Sets configuration values for the specified \fI\fP. Multiple
 \fIpeer\fPs may be specified, and if the \fIremove\fP argument is given
 for a peer, that peer is removed, not configured. If \fIlisten-port\fP
@@ -163,6 +163,12 @@ port number. This endpoint will be updated automatically to the most recent
 source IP address and port of correctly authenticated packets from the peer.
 Optional.
 .IP \(bu
+AddressFamily \(em one of \fIinet\fP, \fIinet6\fP or \fIunspec\fP. When a
+hostname is given for \fIEndpoint\fP, setting this to \fIinet\fP or
+\fIinet6\fP will allow only addresses of the given family to be
+used. Defaults to \fIunspec\fP, which causes the first returned address of
+any type to be used.
+.IP \(bu
 PersistentKeepalive \(em a seconds interval, between 1 and 65535 inclusive, of how often to
 send an authenticated empty packet to the peer for the purpose of keeping a stateful
 firewall or NAT mapping valid persistently. For example, if the interface
diff --git a/src/set.c b/src/set.c
index 75560fd..20ee85e 100644
--- a/src/set.c
+++ b/src/set.c
@@ -18,13 +18,20 @@ int set_main(int argc, const char *argv[])
 	int ret = 1;
 
 	if (argc < 3) {
-		fprintf(stderr, "Usage: %s %s [listen-port ] [fwmark ] [private-key ] [peer [remove] [preshared-key ] [endpoint :] [persistent-keepalive ] [allowed-ips /[,/]...] ]...\n", PROG_NAME, argv[0]);
+		fprintf(stderr, "Usage: %s %s [listen-port ] [fwmark ] [private-key ] [peer [remove] [preshared-key ] [endpoint :] [address-family ] [persistent-keepalive ] [allowed-ips /[,/]...] ]...\n", PROG_NAME, argv[0]);
 		return 1;
 	}
 
 	device = config_read_cmd(argv + 2, argc - 2);
 	if (!device)
 		goto cleanup;
+
+	device = config_read_finish(device);
+	if (!device) {
+		fprintf(stderr, "Invalid configuration\n");
+		goto cleanup;
+	}
+
 	strncpy(device->name, argv[1], IFNAMSIZ - 1);
 	device->name[IFNAMSIZ - 1] = '\0';
 
diff --git a/src/setconf.c b/src/setconf.c
index 1c5b138..c90fd30 100644
--- a/src/setconf.c
+++ b/src/setconf.c
@@ -127,7 +127,7 @@ int setconf_main(int argc, const char *argv[])
 			goto cleanup;
 		}
 	}
-	device = config_read_finish(&ctx);
+	device = config_read_finish(ctx.device);
 	if (!device) {
 		fprintf(stderr, "Invalid configuration\n");
 		goto cleanup;
--
2.30.2

From icepic.dz at gmail.com Sun Feb 19 18:04:49 2023
From: icepic.dz at gmail.com (Janne Johansson)
Date: Sun, 19 Feb 2023 19:04:49 +0100
Subject: Source IP incorrect on multi homed systems
In-Reply-To: <00b94fdf-22d5-00ad-e068-30ad4a453236@keff.org>
References: <875yby83n2.fsf@ungleich.ch>
 <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id>
 <87ttzhc0jt.fsf@ungleich.ch>
 <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at>
 <00b94fdf-22d5-00ad-e068-30ad4a453236@keff.org>
Message-ID:

Den sön 19 feb. 2023 kl 18:06 skrev Sebastian Hyrvall :
>
> You should get into that debate. Proposing firewall workarounds is not a
> correct solution so please don't do it. It needs to be fixed. It's an
> immature VPN solution that always just proposed a workaround instead of
> fixing the problem.

I would make sure that you are not mis-ascribing the problem* to "an
immature VPN" and not what the default UDP behaviour of the kernel is,
to pick a working interface to send packets from based on the routing
table, in which any/all udp based tunnel would suffer the same
problem.
If you google it, you may find that other udp transports face
the same "problem".

*) https://en.wiktionary.org/wiki/Chesterton%27s_fence

--
May the most significant bit of your life be positive.

From sh at keff.org Sun Feb 19 18:08:44 2023
From: sh at keff.org (Sebastian Hyrvall)
Date: Mon, 20 Feb 2023 01:08:44 +0700
Subject: Source IP incorrect on multi homed systems
In-Reply-To:
References: <875yby83n2.fsf@ungleich.ch>
 <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id>
 <87ttzhc0jt.fsf@ungleich.ch>
 <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at>
 <00b94fdf-22d5-00ad-e068-30ad4a453236@keff.org>
Message-ID:

It is the default behavior of the kernel. But all networking software dealing in security knows how to correctly behave. You are welcome to inform me of something else suffering the same problem.

On 2023-02-20 01:04, Janne Johansson wrote:
> Den sön 19 feb. 2023 kl 18:06 skrev Sebastian Hyrvall :
>> You should get into that debate. Proposing firewall workarounds is not a
>> correct solution so please don't do it. It needs to be fixed. It's an
>> immature VPN solution that always just proposed a workaround instead of
>> fixing the problem.
> I would make sure that you are not mis-ascribing the problem* to "an
> immature VPN" and not what the default UDP behaviour of the kernel is,
> to pick a working interface to send packets from based on the routing
> table, in which any/all udp based tunnel would suffer the same
> problem. If you google it, you may find that other udp transports face
> the same "problem".
>
> *) https://en.wiktionary.org/wiki/Chesterton%27s_fence
>

From dxld at darkboxed.org Sun Feb 19 18:23:57 2023
From: dxld at darkboxed.org (Daniel Gröber)
Date: Sun, 19 Feb 2023 19:23:57 +0100
Subject: [PATCH v2] wg: Allow config to read secret keys from file
Message-ID: <20230219182357.444395-1-dxld@darkboxed.org>

This adds two new config keys PrivateKeyFile= and PresharedKeyFile= that
simply hook up the existing code for the `wg set ... private-key /file`
codepath.

By using the new options wireguard configs can become a lot easier to
manage and deploy as we don't have to treat them as secrets anymore. This
way they can, for example, be tracked in public git repos while the secret
keys can be provisioned using an out of band system or with a manual
one-time step instead.

Before this patch we were using an ugly hack: it's possible to simply omit
PrivateKey= and set it using `PostUp = wg set %i private-key /some/file`.
However this breaks when we try to use setconf or syncconf as they will
(rightly) unset the private key when it's missing in the underlying config
file, breaking connectivity.

Reviewed-By: Michael Tokarev
Signed-off-by: Daniel Gröber
---
 src/config.c | 8 ++++++++
 src/man/wg.8 | 4 ++++
 2 files changed, 12 insertions(+)

diff --git a/src/config.c b/src/config.c
index e8db900..f9980fe 100644
--- a/src/config.c
+++ b/src/config.c
@@ -464,6 +464,10 @@ static bool process_line(struct config_ctx *ctx, const char *line)
 			ret = parse_key(ctx->device->private_key, value);
 			if (ret)
 				ctx->device->flags |= WGDEVICE_HAS_PRIVATE_KEY;
+		} else if (key_match("PrivateKeyFile")) {
+			ret = parse_keyfile(ctx->device->private_key, value);
+			if (ret)
+				ctx->device->flags |= WGDEVICE_HAS_PRIVATE_KEY;
 		} else
 			goto error;
 	} else if (ctx->is_peer_section) {
@@ -483,6 +487,10 @@ static bool process_line(struct config_ctx *ctx, const char *line)
 			ret = parse_key(ctx->last_peer->preshared_key, value);
 			if (ret)
 				ctx->last_peer->flags |= WGPEER_HAS_PRESHARED_KEY;
+		} else if (key_match("PresharedKeyFile")) {
+			ret = parse_keyfile(ctx->last_peer->preshared_key, value);
+			if (ret)
+				ctx->last_peer->flags |= WGPEER_HAS_PRESHARED_KEY;
 		} else
 			goto error;
 	} else
diff --git a/src/man/wg.8 b/src/man/wg.8
index fd9fde7..48f084d 100644
--- a/src/man/wg.8
+++ b/src/man/wg.8
@@ -134,6 +134,8 @@ The \fIInterface\fP section may contain the following fields:
 .IP \(bu
 PrivateKey \(em a base64 private key generated by \fIwg genkey\fP. Required.
 .IP \(bu
+PrivateKeyFile \(em path to a file containing a base64 private key. May be used instead of \fIPrivateKey\fP. Optional.
+.IP \(bu
 ListenPort \(em a 16-bit port for listening. Optional; if not specified, chosen
 randomly.
 .IP \(bu
@@ -151,6 +153,8 @@ and may be omitted. This option adds an additional layer of symmetric-key
 cryptography to be mixed into the already existing public-key cryptography,
 for post-quantum resistance.
 .IP \(bu
+PresharedKeyFile \(em path to a file containing a base64 preshared key. May be used instead of \fIPresharedKey\fP. Optional.
+.IP \(bu
 AllowedIPs \(em a comma-separated list of IP (v4 or v6) addresses with CIDR masks
 from which incoming traffic for this peer is allowed and to which outgoing traffic
 for this peer is directed. The catch-all
-- 
2.30.2

From rm at romanrm.net Sun Feb 19 18:31:18 2023
From: rm at romanrm.net (Roman Mamedov)
Date: Sun, 19 Feb 2023 23:31:18 +0500
Subject: [RESEND PATCH v3] wg: Support restricting address family of DNS resolved Endpoint
In-Reply-To: <20230219180428.438453-1-dxld@darkboxed.org>
References: <20230219180428.438453-1-dxld@darkboxed.org>
Message-ID: <20230219233118.2d9654f9@nvm>

On Sun, 19 Feb 2023 19:04:28 +0100
Daniel Gröber wrote:

> +static inline bool parse_address_family(int *family, const char *value)
> +{
> +	if (strcmp(value, "inet") == 0)
> +		*family = AF_INET;
> +	else if (strcmp(value, "inet6") == 0)
> +		*family = AF_INET6;

Wouldn't the first condition match "inet6" as well, not ever checking the
second condition?

> +	else if (strcmp(value, "unspec") == 0)
> +		*family = AF_UNSPEC;
> +	else
> +		return false;
> +
> +	return true;
> +}

-- 
With respect,
Roman

From david at kerr.net Sun Feb 19 18:37:52 2023
From: david at kerr.net (David Kerr)
Date: Sun, 19 Feb 2023 13:37:52 -0500
Subject: Source IP incorrect on multi homed systems
In-Reply-To: 
References: <875yby83n2.fsf@ungleich.ch>
 <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id>
 <87ttzhc0jt.fsf@ungleich.ch>
 <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at>
Message-ID: 

My proposed workaround specifically stated to match on both the
interface and destination address, and to set a route with both
interface and [source] address. This allows for multiple IP addresses
on the same interface -- which you can do with both IPv4 and IPv6.

But yes, it is a nasty hack. You really need to understand what is
going on between the firewall and routing tables/rules and it is easy
to get confused.
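
For comparison, a userspace UDP server on Linux can avoid routing rules of this kind by asking the kernel which local address each datagram was sent to, and then replying from that address. The following is a minimal sketch of that general technique using the IP_PKTINFO socket option (IPv4 only). It is not taken from WireGuard or from any patch in this thread, and the helper names enable_pktinfo() and echo_from_received_addr() are made up for illustration.

```c
#define _GNU_SOURCE             /* makes glibc expose struct in_pktinfo */
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: ask the kernel to attach an IP_PKTINFO control
 * message (carrying the header destination address) to every datagram
 * received on fd. Returns 0 on success, -1 on error. */
int enable_pktinfo(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof on);
}

/* Hypothetical helper: receive one datagram on fd and echo it back,
 * forcing the reply's source address to be the address the request was
 * sent to. Returns the number of bytes echoed, or -1 on error. */
ssize_t echo_from_received_addr(int fd)
{
    char buf[1500];
    union {                     /* union keeps the buffer cmsghdr-aligned */
        char raw[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;
    } ctl;
    struct sockaddr_in peer;
    struct iovec iov = { .iov_base = buf, .iov_len = sizeof buf };
    struct in_pktinfo received, reply;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int have_dst = 0;
    ssize_t n;

    memset(&msg, 0, sizeof msg);
    msg.msg_name = &peer;
    msg.msg_namelen = sizeof peer;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.raw;
    msg.msg_controllen = sizeof ctl.raw;

    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;

    /* Find the destination address the sender actually used. */
    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IP && cmsg->cmsg_type == IP_PKTINFO) {
            memcpy(&received, CMSG_DATA(cmsg), sizeof received);
            have_dst = 1;
        }
    }
    if (!have_dst)
        return -1;

    /* On send, ipi_spec_dst selects the source address of the datagram;
     * ipi_ifindex == 0 leaves the outgoing interface to the routing table. */
    memset(&reply, 0, sizeof reply);
    reply.ipi_spec_dst = received.ipi_addr;

    /* Reuse msg for the reply: same peer, same payload, new control data. */
    memset(&ctl, 0, sizeof ctl);
    iov.iov_len = (size_t)n;
    msg.msg_control = ctl.raw;
    msg.msg_controllen = sizeof ctl.raw;
    cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);       /* buffer is sized for exactly one pktinfo */
    cmsg->cmsg_level = IPPROTO_IP;
    cmsg->cmsg_type = IP_PKTINFO;
    cmsg->cmsg_len = CMSG_LEN(sizeof reply);
    memcpy(CMSG_DATA(cmsg), &reply, sizeof reply);

    return sendmsg(fd, &msg, 0);
}
```

A caller would bind a UDP socket to INADDR_ANY, call enable_pktinfo() once, and then call echo_from_received_addr() for each datagram. IPv6 has the analogous IPV6_RECVPKTINFO and IPV6_PKTINFO options with struct in6_pktinfo.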
On Sun, Feb 19, 2023 at 12:10 PM tlhackque wrote: > > FWIW, while clever, I don't think that iptables mark solves all cases. > E.g., consider an interface with multiple addresses, where a packet > comes in on a secondary address. The proposed rule would send it out > the right interface, but still with the wrong (primary) address picked > from the interface... > > With IPv6 it's common to assign an address to a service rather than a > host so services can move easily. So multiple addresses per interface > are the rule, not the exception. > > I do the same with IPv4 inside addresses, though these days public IPv4 > addresses are scarce enough that it's not common for public IPs. It > amounts to the same issue - the NAT tracking is stateful. > > Trying to work around this with routing seems like a maze of twisty > passages - so I agree that the right solution is for WG to respond from > the address that receives a packet. > > On 19-Feb-23 11:32, David Kerr wrote: > > Without getting into the debate of whether wireguard is acting > > correctly or not, I think there is a possible workaround. > > > > 1. In the iptables mangle table PREROUTING, match the incoming > > interface and destination address and --set-xmark a firewall MARK > > unique to this interface/destination > > 2. Create a new ip route table that sets the default route to go out > > on the interface with the source address you want (same as destination > > address in iptables) > > 3. Create a new ip rule that sends all packets with firewall mark set > > in iptables to the routing table you just created > > > > Repeat above for each interface/address you need to mangle, with a > > unique firewall mark and routing table for each. > > > > It may be necessary to use CONNMARK in PREROUTING and OUTPUT to > > --restore_mark. I can't remember if this is needed or not, its been a > > while since I configured iptables with this. 
> > > > This should ensure that any packet that comes into an > > interface/address is replied to from the same interface/address. > > > > David > > > > > > On Sun, Feb 19, 2023 at 9:44 AM Christoph Loesch wrote: > >> Hi, > >> > >> I don't think no one wants to fix it, there are several users having this issue. I rather guess no one could find a suitable solution to fix it. > >> > >> @Nico: did you try to delete the affected route and add it again with the correct source IP ? > >> > >> as I mentioned it inhttps://lists.zx2c4.com/pipermail/wireguard/2021-November/007324.html > >> > >> ip route del > >> ip route add dev src > >> > >> This way I was able to (at least temporary) fix this issue on multi homed systems. > >> > >> Kind regards, > >> Christoph > >> > >> Am 19.02.2023 um 13:13 schrieb Nico Schottelius: > >>> Hey Sebastian, > >>> > >>> Sebastian Hyrwall writes: > >>> > >>>> It is kinda. It's been mentioned multiple times over the years but no one seems to want to fix it. Atleast you should be able to specify bind/src ip in the > >>>> config. I gave up WG because of it. Wasn't accepted by my projects security policy since src ip could not be configured. > >>>> > >>>> There is an unofficial patch however, > >>>> > >>>> https://github.com/torvalds/linux/commit/5fa98082093344c86345f9f63305cae9d5f9f281 > >>> the binding is somewhat related to this issue and I was looking for that > >>> feature some time ago, too. While it is correlated and I would really > >>> appreciate binding support, I am not sure whether the linked patch does > >>> actually fix the problem I am seeing in multi homed devices. > >>> > >>> As long as wireguard does not reply with the same IP address it was > >>> contacted with, packets will get dropped on stateful firewalls, because > >>> the returning packet does not match the state session database. 
> >>> >>> Best regards,
> >>> >>> Nico
> >>> >>> --
> >>> Sustainable and modern Infrastructures by ungleich.ch
>

From tlhackque at yahoo.com Sun Feb 19 18:42:02 2023
From: tlhackque at yahoo.com (tlhackque)
Date: Sun, 19 Feb 2023 13:42:02 -0500
Subject: Source IP incorrect on multi homed systems
In-Reply-To: 
References: <875yby83n2.fsf@ungleich.ch>
 <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id>
 <87ttzhc0jt.fsf@ungleich.ch>
 <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at>
Message-ID: 

BTW, DNS is a common UDP (well, mostly) protocol that encountered the
same issue. See RFC 2181 (1997), where you'll find (emphasis added):

> 4. Server Reply Source Address Selection
>
> Most, if not all, DNS clients, expect the address from which a reply
> is received to be the same address as that to which the query
> eliciting the reply was sent. This is true for servers acting as
> clients for the purposes of recursive query resolution, as well as
> simple resolver clients. The address, along with the identifier (ID)
> in the reply is used for disambiguating replies, and filtering
> spurious responses. This may, or may not, have been intended when
> the DNS was designed, but is now a fact of life.
>
> Some multi-homed hosts running DNS servers generate a reply using a
> source address that is not the same as the destination address from
> the client's request packet.
> **Such replies will be discarded by the client because the source
> address of the reply does not match that of a host to which the client
> sent the original request.** That is, it
> appears to be an unsolicited response.
>
> 4.1. UDP Source Address Selection
>
> **To avoid these problems, servers when responding to queries using
> UDP _must_ cause the reply to be sent with the source address field in
> the IP header set to the address that was in the destination address
> field of the IP header of the packet containing the query causing the
> response.**
> If this would cause the response to be sent from an IP
> address that is not permitted for this purpose, then the response may
> be sent from any legal IP address allocated to the server. That
> address should be chosen to maximise the possibility that the client
> will be able to use it for further queries. Servers configured in
> such a way that not all their addresses are equally reachable from
> all potential clients need take particular care when responding to
> queries sent to anycast, multicast, or similar, addresses.
>

On 19-Feb-23 12:05, tlhackque wrote:
> FWIW, while clever, I don't think that iptables mark solves all cases.
> E.g., consider an interface with multiple addresses, where a packet
> comes in on a secondary address. The proposed rule would send it out
> the right interface, but still with the wrong (primary) address picked
> from the interface...
>
> With IPv6 it's common to assign an address to a service rather than a
> host so services can move easily. So multiple addresses per interface
> are the rule, not the exception.
>
> I do the same with IPv4 inside addresses, though these days public
> IPv4 addresses are scarce enough that it's not common for public IPs.
> It amounts to the same issue - the NAT tracking is stateful.
>
> Trying to work around this with routing seems like a maze of twisty
> passages - so I agree that the right solution is for WG to respond
> from the address that receives a packet.
>
> On 19-Feb-23 11:32, David Kerr wrote:
>> Without getting into the debate of whether wireguard is acting
>> correctly or not, I think there is a possible workaround.
>>
>> 1. In the iptables mangle table PREROUTING, match the incoming
>> interface and destination address and --set-xmark a firewall MARK
>> unique to this interface/destination
>> 2. Create a new ip route table that sets the default route to go out
>> on the interface with the source address you want (same as destination
>> address in iptables)
>> 3. Create a new ip rule that sends all packets with firewall mark set
>> in iptables to the routing table you just created
>>
>> Repeat above for each interface/address you need to mangle, with a
>> unique firewall mark and routing table for each.
>>
>> It may be necessary to use CONNMARK in PREROUTING and OUTPUT to
>> --restore_mark. I can't remember if this is needed or not, its been a
>> while since I configured iptables with this.
>>
>> This should ensure that any packet that comes into an
>> interface/address is replied to from the same interface/address.
>>
>> David
>>
>>
>> On Sun, Feb 19, 2023 at 9:44 AM Christoph
>> Loesch wrote:
>>> Hi,
>>>
>>> I don't think no one wants to fix it, there are several users having
>>> this issue. I rather guess no one could find a suitable solution to
>>> fix it.
>>>
>>> @Nico: did you try to delete the affected route and add it again
>>> with the correct source IP ?
>>>
>>> as I mentioned it
>>> in https://lists.zx2c4.com/pipermail/wireguard/2021-November/007324.html
>>>
>>> ip route del
>>> ip route add dev src
>>>
>>> This way I was able to (at least temporary) fix this issue on multi
>>> homed systems.
>>>
>>> Kind regards,
>>> Christoph
>>>
>>> Am 19.02.2023 um 13:13 schrieb Nico Schottelius:
>>>> Hey Sebastian,
>>>>
>>>> Sebastian Hyrwall writes:
>>>>
>>>>> It is kinda. It's been mentioned multiple times over the years but
>>>>> no one seems to want to fix it. Atleast you should be able to
>>>>> specify bind/src ip in the
>>>>> config. I gave up WG because of it. Wasn't accepted by my projects
>>>>> security policy since src ip could not be configured.
>>>>>
>>>>> There is an unofficial patch however,
>>>>>
>>>>> https://github.com/torvalds/linux/commit/5fa98082093344c86345f9f63305cae9d5f9f281
>>>>>
>>>> the binding is somewhat related to this issue and I was looking for
>>>> that
>>>> feature some time ago, too. While it is correlated and I would really
>>>> appreciate binding support, I am not sure whether the linked patch
>>>> does
>>>> actually fix the problem I am seeing in multi homed devices.
>>>>
>>>> As long as wireguard does not reply with the same IP address it was
>>>> contacted with, packets will get dropped on stateful firewalls,
>>>> because
>>>> the returning packet does not match the state session database.
>>>>
>>>> Best regards,
>>>>
>>>> Nico
>>>>
>>>> --
>>>> Sustainable and modern Infrastructures by ungleich.ch
>

-- 
This communication may not represent my employer's views,
if any, on the matters discussed.

From tlhackque at yahoo.com Sun Feb 19 18:52:35 2023
From: tlhackque at yahoo.com (tlhackque)
Date: Sun, 19 Feb 2023 13:52:35 -0500
Subject: Source IP incorrect on multi homed systems
In-Reply-To: 
References: <875yby83n2.fsf@ungleich.ch>
 <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id>
 <87ttzhc0jt.fsf@ungleich.ch>
 <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at>
Message-ID: 

On 19-Feb-23 13:37, David Kerr wrote:
> My proposed workaround specifically stated to match on both the
> interface and destination address, and to set a route with both
> interface and [source] address. This allows for multiple IP addresses
> on the same interface -- which you can do with both IPv4 and IPv6.

Fair enough. Of course, that means having a unique rule and mark for
each if/destination address, which you now have to manage - and avoid
conflicts with all other uses of mark. One of which is wg-quick...
"manage" includes remembering to add/remove the rule and allocate/deallocate the mark synchronously with wg-enabled IP addresses - and if wg is listening on all addresses, that means every ip address. You can get there, but as I said, it's a maze of twisty passages and the complications of managing it pile up. > But yes, it is a nasty hack. You really need to understand what is > going on between the firewall and routing tables/rules and it is easy > to get confused. > > > On Sun, Feb 19, 2023 at 12:10 PM tlhackque wrote: >> FWIW, while clever, I don't think that iptables mark solves all cases. >> E.g., consider an interface with multiple addresses, where a packet >> comes in on a secondary address. The proposed rule would send it out >> the right interface, but still with the wrong (primary) address picked >> from the interface... >> >> With IPv6 it's common to assign an address to a service rather than a >> host so services can move easily. So multiple addresses per interface >> are the rule, not the exception. >> >> I do the same with IPv4 inside addresses, though these days public IPv4 >> addresses are scarce enough that it's not common for public IPs. It >> amounts to the same issue - the NAT tracking is stateful. >> >> Trying to work around this with routing seems like a maze of twisty >> passages - so I agree that the right solution is for WG to respond from >> the address that receives a packet. >> >> On 19-Feb-23 11:32, David Kerr wrote: >>> Without getting into the debate of whether wireguard is acting >>> correctly or not, I think there is a possible workaround. >>> >>> 1. In the iptables mangle table PREROUTING, match the incoming >>> interface and destination address and --set-xmark a firewall MARK >>> unique to this interface/destination >>> 2. Create a new ip route table that sets the default route to go out >>> on the interface with the source address you want (same as destination >>> address in iptables) >>> 3. 
Create a new ip rule that sends all packets with firewall mark set >>> in iptables to the routing table you just created >>> >>> Repeat above for each interface/address you need to mangle, with a >>> unique firewall mark and routing table for each. >>> >>> It may be necessary to use CONNMARK in PREROUTING and OUTPUT to >>> --restore_mark. I can't remember if this is needed or not, its been a >>> while since I configured iptables with this. >>> >>> This should ensure that any packet that comes into an >>> interface/address is replied to from the same interface/address. >>> >>> David >>> >>> >>> On Sun, Feb 19, 2023 at 9:44 AM Christoph Loesch wrote: >>>> Hi, >>>> >>>> I don't think no one wants to fix it, there are several users having this issue. I rather guess no one could find a suitable solution to fix it. >>>> >>>> @Nico: did you try to delete the affected route and add it again with the correct source IP ? >>>> >>>> as I mentioned it inhttps://lists.zx2c4.com/pipermail/wireguard/2021-November/007324.html >>>> >>>> ip route del >>>> ip route add dev src >>>> >>>> This way I was able to (at least temporary) fix this issue on multi homed systems. >>>> >>>> Kind regards, >>>> Christoph >>>> >>>> Am 19.02.2023 um 13:13 schrieb Nico Schottelius: >>>>> Hey Sebastian, >>>>> >>>>> Sebastian Hyrwall writes: >>>>> >>>>>> It is kinda. It's been mentioned multiple times over the years but no one seems to want to fix it. Atleast you should be able to specify bind/src ip in the >>>>>> config. I gave up WG because of it. Wasn't accepted by my projects security policy since src ip could not be configured. >>>>>> >>>>>> There is an unofficial patch however, >>>>>> >>>>>> https://github.com/torvalds/linux/commit/5fa98082093344c86345f9f63305cae9d5f9f281 >>>>> the binding is somewhat related to this issue and I was looking for that >>>>> feature some time ago, too. 
While it is correlated and I would really
>>>>> appreciate binding support, I am not sure whether the linked patch does
>>>>> actually fix the problem I am seeing in multi homed devices.
>>>>>
>>>>> As long as wireguard does not reply with the same IP address it was
>>>>> contacted with, packets will get dropped on stateful firewalls, because
>>>>> the returning packet does not match the state session database.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Nico
>>>>>
>>>>> --
>>>>> Sustainable and modern Infrastructures by ungleich.ch

-- 
This communication may not represent my employer's views,
if any, on the matters discussed.

From peter at fiberdirekt.se Sun Feb 19 18:59:13 2023
From: peter at fiberdirekt.se (Peter Linder)
Date: Sun, 19 Feb 2023 19:59:13 +0100
Subject: Source IP incorrect on multi homed systems
In-Reply-To: <87y1otc0p5.fsf@ungleich.ch>
References: <87bklqd7vb.fsf@ungleich.ch>
 <875yby83n2.fsf@ungleich.ch>
 <87y1otc0p5.fsf@ungleich.ch>
Message-ID: 

Indeed this is how you typically set up a multihomed service (addresses
on lo and then announce that using BGP or something). If you use one of
the network links directly for the service and that link network goes
down (it may not even be in your AS so you may not know?) then the
service is offline.
Use a route-map in your bgp config to set the src address of routes to
the address on lo, that works for wg :)

/Peter

On 2023-02-19 13:10, Nico Schottelius wrote:
> Aside from nginx + icmp being handled correctly as a reference,
> I want to further elaborate on this case to show that something is
> really wrong with the current behaviour:
>
> A typical scenario for routers is to have a lot of global reachable IP
> addresses (IPv6, IPv4) assigned to the loopback interface, such as this
> system:
>
> [13:11] router2.place6:~# ip a sh dev lo
> 1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>        valid_lft forever preferred_lft forever
>     inet6 2a0a:e5c0:1e:a::b/128 scope global
>        valid_lft forever preferred_lft forever
>     inet6 2a0a:e5c0:1e:a::a/128 scope global
>        valid_lft forever preferred_lft forever
>     inet6 2a0a:e5c0:2:a::b/128 scope global
>        valid_lft forever preferred_lft forever
>     inet6 2a0a:e5c0:2:a::a/128 scope global
>        valid_lft forever preferred_lft forever
>     inet6 2a0a:e5c0:2:1::7/128 scope global
>        valid_lft forever preferred_lft forever
>     inet6 2a0a:e5c0:2:1::6/128 scope global
>        valid_lft forever preferred_lft forever
>     inet6 2a0a:e5c0:2:1::5/128 scope global
>        valid_lft forever preferred_lft forever
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
>
> The motivation behind that is that independent of the actual routing
> interface, these IP addresses are always reachable.
>
> Now in the case of wireguard selecting the source IP based on the
> outgoing interface, this is never going to work, as lo cannot send
> packets to the outside world.
>
>
> Nico Schottelius writes:
>
>> Let me rephrase the problem statement:
>>
>> - ping and http calls to the multi homed machine work correctly:
>>   I can ping 147.78.195.254 and the reply contains the same address.
>>   I can ping 195.141.200.73 and the reply contains the same address.
>> I can curl 147.78.195.254 and the reply contains the same address. >> I can curl 195.141.200.73 and the reply contains the same address. >> >> - wireguard does NOT work because it changes the reply address: >> A packet sent to 147.78.195.254 is being replied with 195.141.200.73 >> >> In general, processes reply with the IP address that was used to contact >> them and not with the outgoing interface address, which would also break >> adding IP addresses to the loopback interface. >> >> For full detail, see ip addresses [0] and routing below [1] and tests >> executed [2]. >> >> I believe that this is a bug in wireguard. >> >> -------------------------------------------------------------------------------- >> >> [2] >> >> Let's see how it looks like in detail: >> >> 1) ping to 147.78.195.254: works >> >> [9:14] nb3:~% ping -c2 147.78.195.254 >> PING 147.78.195.254 (147.78.195.254) 56(84) bytes of data. >> 64 bytes from 147.78.195.254: icmp_seq=1 ttl=53 time=7.27 ms >> 64 bytes from 147.78.195.254: icmp_seq=2 ttl=53 time=6.30 ms >> >> --- 147.78.195.254 ping statistics --- >> 2 packets transmitted, 2 received, 0% packet loss, time 1002ms >> rtt min/avg/max/mdev = 6.296/6.781/7.267/0.485 ms >> >> / # tcpdump -ni any host 194.5.220.43 >> tcpdump: data link type LINUX_SLL2 >> tcpdump: verbose output suppressed, use -v[v]... 
for full protocol decode >> listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes >> 08:14:48.379618 net1 In IP 194.5.220.43 > 147.78.195.254: ICMP echo request, id 89, seq 1, length 64 >> 08:14:48.379651 net2 Out IP 147.78.195.254 > 194.5.220.43: ICMP echo reply, id 89, seq 1, length 64 >> 08:14:49.380340 net1 In IP 194.5.220.43 > 147.78.195.254: ICMP echo request, id 89, seq 2, length 64 >> 08:14:49.380392 net2 Out IP 147.78.195.254 > 194.5.220.43: ICMP echo reply, id 89, seq 2, length 64 >> >> 2) ping to 195.141.200.73 >> >> [9:14] nb3:~% ping -c2 195.141.200.73 >> PING 195.141.200.73 (195.141.200.73) 56(84) bytes of data. >> 64 bytes from 195.141.200.73: icmp_seq=1 ttl=53 time=11.3 ms >> 64 bytes from 195.141.200.73: icmp_seq=2 ttl=53 time=6.81 ms >> >> --- 195.141.200.73 ping statistics --- >> 2 packets transmitted, 2 received, 0% packet loss, time 1002ms >> rtt min/avg/max/mdev = 6.813/9.057/11.301/2.244 ms >> [9:15] nb3:~% >> / # tcpdump -ni any host 194.5.220.43 >> tcpdump: data link type LINUX_SLL2 >> tcpdump: verbose output suppressed, use -v[v]... for full protocol decode >> listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes >> 08:16:19.257697 net2 In IP 194.5.220.43 > 195.141.200.73: ICMP echo request, id 91, seq 1, length 64 >> 08:16:19.257730 net2 Out IP 195.141.200.73 > 194.5.220.43: ICMP echo reply, id 91, seq 1, length 64 >> 08:16:20.250948 net2 In IP 194.5.220.43 > 195.141.200.73: ICMP echo request, id 91, seq 2, length 64 >> 08:16:20.250980 net2 Out IP 195.141.200.73 > 194.5.220.43: ICMP echo reply, id 91, seq 2, length 64 >> >> 3) http to 147.78.195.254 >> >> [9:16] nb3:~% curl -s 147.78.195.254 > /dev/null ; echo $? >> 0 >> / # tcpdump -ni any host 194.5.220.43 >> tcpdump: data link type LINUX_SLL2 >> tcpdump: verbose output suppressed, use -v[v]... 
for full protocol decode >> listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes >> 08:17:04.082945 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [S], seq 1405408358, win 64240, options [mss 1460,sackOK,TS val 1380610701 ecr 0,nop,wscale 7], length 0 >> 08:17:04.082983 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [S.], seq 3790092363, ack 1405408359, win 65160, options [mss 1460,sackOK,TS val 520503591 ecr 1380610701,nop,wscale 7], length 0 >> 08:17:04.089996 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 1, win 502, options [nop,nop,TS val 1380610709 ecr 520503591], length 0 >> 08:17:04.090121 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [P.], seq 1:79, ack 1, win 502, options [nop,nop,TS val 1380610709 ecr 520503591], length 78: HTTP: GET / HTTP/1.1 >> 08:17:04.090136 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [.], ack 79, win 509, options [nop,nop,TS val 520503598 ecr 1380610709], length 0 >> 08:17:04.090301 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [P.], seq 1:239, ack 79, win 509, options [nop,nop,TS val 520503598 ecr 1380610709], length 238: HTTP: HTTP/1.1 200 OK >> 08:17:04.090381 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [P.], seq 239:854, ack 79, win 509, options [nop,nop,TS val 520503598 ecr 1380610709], length 615: HTTP >> 08:17:04.096058 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 239, win 501, options [nop,nop,TS val 1380610715 ecr 520503598], length 0 >> 08:17:04.096059 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 854, win 497, options [nop,nop,TS val 1380610715 ecr 520503598], length 0 >> 08:17:04.096339 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [F.], seq 79, ack 854, win 501, options [nop,nop,TS val 1380610715 ecr 520503598], length 0 >> 08:17:04.096450 net2 Out IP 147.78.195.254.80 > 194.5.220.43.39274: Flags [F.], seq 854, ack 80, win 509, options 
[nop,nop,TS val 520503604 ecr 1380610715], length 0 >> 08:17:04.102609 net1 In IP 194.5.220.43.39274 > 147.78.195.254.80: Flags [.], ack 855, win 501, options [nop,nop,TS val 1380610721 ecr 520503604], length 0 >> >> >> 4) http to 195.141.200.73 >> >> [9:17] nb3:~% curl -s 195.141.200.73 > /dev/null ; echo $? >> 0 >> >> / # tcpdump -ni any host 194.5.220.43 >> tcpdump: data link type LINUX_SLL2 >> tcpdump: verbose output suppressed, use -v[v]... for full protocol decode >> listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes >> 08:18:05.951066 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [S], seq 1556080700, win 64240, options [mss 1460,sackOK,TS val 765965336 ecr 0,nop,wscale 7], length 0 >> 08:18:05.951106 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [S.], seq 3465881361, ack 1556080701, win 65160, options [mss 1460,sackOK,TS val 3168643538 ecr 765965336,nop,wscale 7], length 0 >> 08:18:05.958699 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 1, win 502, options [nop,nop,TS val 765965342 ecr 3168643538], length 0 >> 08:18:05.958749 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [P.], seq 1:79, ack 1, win 502, options [nop,nop,TS val 765965342 ecr 3168643538], length 78: HTTP: GET / HTTP/1.1 >> 08:18:05.958763 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [.], ack 79, win 509, options [nop,nop,TS val 3168643545 ecr 765965342], length 0 >> 08:18:05.959216 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [P.], seq 1:239, ack 79, win 509, options [nop,nop,TS val 3168643546 ecr 765965342], length 238: HTTP: HTTP/1.1 200 OK >> 08:18:05.959327 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [P.], seq 239:854, ack 79, win 509, options [nop,nop,TS val 3168643546 ecr 765965342], length 615: HTTP >> 08:18:05.965244 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 239, win 501, options [nop,nop,TS val 765965350 ecr 3168643546], length 0 >> 
08:18:05.965348 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 854, win 497, options [nop,nop,TS val 765965350 ecr 3168643546], length 0 >> 08:18:05.965487 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [F.], seq 79, ack 854, win 501, options [nop,nop,TS val 765965350 ecr 3168643546], length 0 >> 08:18:05.965573 net2 Out IP 195.141.200.73.80 > 194.5.220.43.41484: Flags [F.], seq 854, ack 80, win 509, options [nop,nop,TS val 3168643552 ecr 765965350], length 0 >> 08:18:05.971916 net2 In IP 194.5.220.43.41484 > 195.141.200.73.80: Flags [.], ack 855, win 501, options [nop,nop,TS val 765965356 ecr 3168643552], length 0 >> >> >> >> [0] >> wireguard "server" that changes the source ip: >> >> / # ip a >> 1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1000 >> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 >> inet 127.0.0.1/8 scope host lo >> valid_lft forever preferred_lft forever >> inet6 ::1/128 scope host >> valid_lft forever preferred_lft forever >> 3: eth0 at if29: mtu 1500 qdisc noqueue state UP >> link/ether 66:4a:9c:12:5b:6c brd ff:ff:ff:ff:ff:ff >> inet6 2a0a:e5c0:10:1e:7f21:83ca:a7d:46d2/128 scope global >> valid_lft forever preferred_lft forever >> inet6 fe80::644a:9cff:fe12:5b6c/64 scope link >> valid_lft forever preferred_lft forever >> 4: net1: mtu 1500 qdisc mq state UP qlen 1000 >> link/ether 3c:ec:ef:cb:d8:1b brd ff:ff:ff:ff:ff:ff >> inet 147.78.195.254/27 brd 147.78.195.255 scope global net1 >> valid_lft forever preferred_lft forever >> inet6 2a0a:e5c0:1:8::53/64 scope global >> valid_lft forever preferred_lft forever >> inet6 fe80::3eec:efff:fecb:d81b/64 scope link >> valid_lft forever preferred_lft forever >> 5: v1477819464: mtu 1420 qdisc noqueue state UNKNOWN qlen 1000 >> link/[65534] >> inet 147.78.194.65/26 scope global v1477819464 >> valid_lft forever preferred_lft forever >> inet6 2a0a:e5c0:2e::1/64 scope global >> valid_lft forever preferred_lft forever >> 26: net2: mtu 1500 qdisc mq state UP qlen 1000 >> 
link/ether 3c:ec:ef:cb:d8:1c brd ff:ff:ff:ff:ff:ff >> inet 195.141.200.73/31 scope global net2 >> valid_lft forever preferred_lft forever >> inet6 2001:1700:3500:2::12/124 scope global >> valid_lft forever preferred_lft forever >> inet6 fe80::3eec:efff:fecb:d81c/64 scope link >> valid_lft forever preferred_lft forever >> / # >> >> wireguard client behind nat: >> >> nb3:/etc/wireguard# curl -4 ifconfig.io >> 194.5.220.43 >> nb3:/etc/wireguard# ip a sh dev wlan0 >> 2: wlan0: mtu 1500 qdisc noqueue state UP group default qlen 1000 >> link/ether 84:5c:f3:ed:52:9c brd ff:ff:ff:ff:ff:ff >> inet 192.168.4.85/24 brd 192.168.4.255 scope global dynamic noprefixroute wlan0 >> valid_lft 317sec preferred_lft 242sec >> inet6 2a0a:e5c0:13:0:865c:f3ff:feed:529c/64 scope global dynamic mngtmpaddr noprefixroute >> valid_lft 86394sec preferred_lft 14394sec >> inet6 fe80::865c:f3ff:feed:529c/64 scope link >> valid_lft forever preferred_lft forever >> nb3:/etc/wireguard# >> >> >> [1] >> / # ip route get 194.5.220.43 >> 194.5.220.43 via 195.141.200.72 dev net2 src 195.141.200.73 >> / # >> >> >> Mike O'Connor writes: >> >>> Generally all OSs will if sending from a local process will use the >>> address of the outgoing interface for the packet. >>> >>> If the packet is forwarded and no NAT is used the address will be >>> routed via the interface suggested by the routing table. >>> >>> So local routing can be a real pain, policy based routing is an >>> option. The other option could be to setup an 'output' NAT to an >>> address which is multi-homed. >>> >>> I have a system running which is multi-homed with out issue other than >>> the actual routing machine. This machine is BGP connected to three >>> locations. >>> >>> There is no NAT setup and because I also add the wireguard link >>> addresses to the BGP sessions. 
>>> >>> Cheers >>> >>> >>> >>> On 19/2/2023 6:44 am, Nico Schottelius wrote: >>>> Dear group, >>>> >>>> I was wondering how wireguard [Linux kernel] or wireguard-go [FreeBSD] >>>> are supposed to decide which IP address to use for replying? >>>> >>>> I have seen both on FreeBSD and Linux that wireguard seems to use the IP >>>> address of the outgoing interface, i.e. the one with the route returning >>>> to the sender. However in multi homed situations, this can be wrong, >>>> let's take this example: >>>> >>>> 19:57:24.607526 net1 In IP 194.5.220.43.60770 > 147.78.195.254.51820: UDP, length 148 >>>> 19:57:24.608358 net2 Out IP 195.141.200.73.51820 > 194.5.220.43.60770: UDP, length 92 >>>> >>>> The initiator sends from 194.5.220.43 to the receiver 147.78.195.254. >>>> Wireguard then replies with the source IP of 195.141.200.73 instead of >>>> 147.78.195.254. >>>> >>>> As the node is multi homed, the packet might leave through any of its >>>> uplinks and thus return with a random (unexpected) IP address and will >>>> not pass NAT rules on firewalls and finally be dropped. F.i. in above >>>> example the firewall drops the packet from 195.141.200.73, because there >>>> is no session entry for that. >>>> >>>> I have observed this behaviour both on Linux 6.1.11 as well as >>>> wireguard-go 0.0.20220316_8,1 on FreeBSD and in both cases the >>>> connection will break depending on which active interface is taken as >>>> exit. >>>> >>>> I would argue that wireguard should by default invert the IP >>>> addresses, i.e. switch dst=src, src=dst and then reply with that, >>>> instead of adapting an interface specific address, or is there a good >>>> reason for the current behaviour? 
>>>> >>>> Best regards, >>>> >>>> Nico >>>> >>>> -- >>>> Sustainable and modern Infrastructures by ungleich.ch > > -- > Sustainable and modern Infrastructures by ungleich.ch From nico.schottelius at ungleich.ch Sun Feb 19 20:02:38 2023 From: nico.schottelius at ungleich.ch (Nico Schottelius) Date: Sun, 19 Feb 2023 21:02:38 +0100 Subject: Source IP incorrect on multi homed systems In-Reply-To: <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> Message-ID: <87wn4d76xd.fsf@ungleich.ch> Hello Christoph, Christoph Loesch writes: > @Nico: did you try to delete the affected route and add it again with the correct source IP ? No, I did not because the routes are really dynamic on the affected systems and I would need to overwrite the BGP routes with a better metric, which in turn will likely break the return path. > as I mentioned it in https://lists.zx2c4.com/pipermail/wireguard/2021-November/007324.html > > ip route del > ip route add dev src > > This way I was able to (at least temporary) fix this issue on multi homed systems. Much appreciate the hint. However changing routes manually on as many routers/vpn endpoints as we have is not a practical solution. To fix the current project's issue we have shifted the VPN endpoint to a single homed device for the moment. 
Best regards, Nico -- Sustainable and modern Infrastructures by ungleich.ch From nico.schottelius at ungleich.ch Sun Feb 19 20:11:20 2023 From: nico.schottelius at ungleich.ch (Nico Schottelius) Date: Sun, 19 Feb 2023 21:11:20 +0100 Subject: Source IP incorrect on multi homed systems In-Reply-To: References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> <00b94fdf-22d5-00ad-e068-30ad4a453236@keff.org> Message-ID: <87sff176lw.fsf@ungleich.ch> Hey Janne, Janne Johansson writes: > *) https://en.wiktionary.org/wiki/Chesterton%27s_fence I am happy to have learned a new principle today, thanks for that. And to be sure that everyone is on the same page: Wireguard should reply by default with the source address that used to be the destination address, but at the moment wireguard is not doing that. If anyone disagrees with the above statement, please let me know. -- Sustainable and modern Infrastructures by ungleich.ch From mjt at tls.msk.ru Sun Feb 19 20:18:47 2023 From: mjt at tls.msk.ru (Michael Tokarev) Date: Sun, 19 Feb 2023 23:18:47 +0300 Subject: [RESEND PATCH v3] wg: Support restricting address family of DNS resolved Endpoint In-Reply-To: <20230219233118.2d9654f9@nvm> References: <20230219180428.438453-1-dxld@darkboxed.org> <20230219233118.2d9654f9@nvm> Message-ID: <24c6fa52-a512-a01e-5351-90cb33a32a3f@msgid.tls.msk.ru> 19.02.2023 21:31, Roman Mamedov wrote: > On Sun, 19 Feb 2023 19:04:28 +0100 > Daniel Gröber wrote: > >> +static inline bool parse_address_family(int *family, const char *value) >> +{ >> + if (strcmp(value, "inet") == 0) >> + *family = AF_INET; >> + else if (strcmp(value, "inet6") == 0) >> + *family = AF_INET6; > > Wouldn't the first condition match "inet6" as well, not ever checking the > second condition? No. It is not memcmp. 
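For readers following this exchange, a minimal sketch of why the first branch cannot swallow "inet6": strcmp() compares both strings up to and including their terminating NUL byte, so only a length-limited comparison (memcmp() or strncmp() over the first four bytes) would treat "inet6" as "inet". The function mirrors the quoted patch fragment; the final `else return false` branch and the `prefix_matches_inet` helper are assumptions added for illustration, since the rest of the patch is not quoted here.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h> /* AF_INET, AF_INET6 */

/* Mirrors the quoted fragment; the trailing "else" branch is an assumption. */
static inline bool parse_address_family(int *family, const char *value)
{
	if (strcmp(value, "inet") == 0)        /* whole-string match only */
		*family = AF_INET;
	else if (strcmp(value, "inet6") == 0)
		*family = AF_INET6;
	else
		return false;
	return true;
}

/* Hypothetical helper: a prefix comparison, which WOULD also accept "inet6". */
static inline bool prefix_matches_inet(const char *value)
{
	return strncmp(value, "inet", 4) == 0;
}
```

With this sketch, passing "inet6" reaches the second branch and yields AF_INET6, while the prefix helper shows the behaviour Roman was worried about.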
/mjt From nico.schottelius at ungleich.ch Sun Feb 19 20:18:34 2023 From: nico.schottelius at ungleich.ch (Nico Schottelius) Date: Sun, 19 Feb 2023 21:18:34 +0100 Subject: Source IP incorrect on multi homed systems In-Reply-To: References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> Message-ID: <87o7pp76a2.fsf@ungleich.ch> tlhackque writes: >> [...] >> 4.1. UDP Source Address Selection >> >> To avoid these problems, servers when responding to queries >> using UDP must cause the reply to be sent with the source address >> field in the IP header set to the address that was in the >> destination address field of the IP header of the packet containing >> the query causing the response. OMG, we really have seen everything already, haven't we? Jason, what do you think about adopting the RFC 2181 Source Address Selection algorithm for wireguard? If I am not mistaken that would mean in practice: if original_pkt.ip_dst == one_of_my_ips then return_pkt.ip_src = original_pkt.ip_dst return_pkt.ip_dst = original_pkt.ip_src fi For me that sounds like a sane approach (aside from my very simplified algorithm). 
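The simplified algorithm above can be written out as a small, self-contained C sketch. This is only a model of the idea, not WireGuard code: `struct pkt_addrs`, `is_local_ip`, `mirror_reply_addrs` and the `IP4` macro are hypothetical names, and addresses are plain host-order integers rather than real IP headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build an IPv4 address (host byte order) from its dotted-quad parts. */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical, stripped-down view of a packet's IP header. */
struct pkt_addrs {
	uint32_t ip_src;
	uint32_t ip_dst;
};

/* True if ip is one of the addresses configured on this host. */
static bool is_local_ip(uint32_t ip, const uint32_t *local_ips, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (local_ips[i] == ip)
			return true;
	return false;
}

/*
 * Fill *out with mirrored addresses (src = old dst, dst = old src).
 * Returns false if the old destination is not a local address, in which
 * case the caller would fall back to routing-based source selection.
 */
static bool mirror_reply_addrs(const struct pkt_addrs *in,
			       const uint32_t *local_ips, size_t n,
			       struct pkt_addrs *out)
{
	if (!is_local_ip(in->ip_dst, local_ips, n))
		return false;
	out->ip_src = in->ip_dst;
	out->ip_dst = in->ip_src;
	return true;
}
```

Applied to the capture from the start of this thread (a packet from 194.5.220.43 to 147.78.195.254), the sketch picks 147.78.195.254 as the reply source instead of the net2 address 195.141.200.73.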
Best regards, Nico -- Sustainable and modern Infrastructures by ungleich.ch From rm at romanrm.net Sun Feb 19 20:42:52 2023 From: rm at romanrm.net (Roman Mamedov) Date: Mon, 20 Feb 2023 01:42:52 +0500 Subject: Source IP incorrect on multi homed systems In-Reply-To: <87o7pp76a2.fsf@ungleich.ch> References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> <87o7pp76a2.fsf@ungleich.ch> Message-ID: <20230220014252.21178988@nvm> On Sun, 19 Feb 2023 21:18:34 +0100 Nico Schottelius wrote: > If I am not mistaken that would mean in practice: > > if original_pkt.ip_dst == one_of_my_ips then > return_pkt.ip_src = original_pkt.ip_dst > return_pkt.ip_dst = original_pkt.ip_src > fi > > For me that sounds like a sane approach (aside from > my very simplified algorithm). Except there is no request and response in WG, and as such no original or return packet. Another peer contacts you, then some time later you contact the other peer. Or the other way round. WG-wise what will need to be done is to store in each peer's information structure the local IP that we are supposed to use for communication with that peer, and to update it when receiving packets from the peer, using the destination of those. So you would see a "Local IP" in each "peer" section when doing a "wg show". Also, until such an IP is initially stored, it will have to be some default outgoing IP of the system towards that peer. BTW, how would this work in your setup: what if the peer does not contact you first, but your machine needs to contact the peer? 
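The per-peer bookkeeping described above could look roughly like the following sketch. It is a model under stated assumptions, not the WireGuard implementation: `struct peer_endpoint`, `peer_note_incoming` and `peer_source_for_send` are hypothetical names, addresses are host-order integers, and 0 is used to mean "no local address learned yet".

```c
#include <stdint.h>

/* Build an IPv4 address (host byte order) from its dotted-quad parts. */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical per-peer state: who the peer is and which of OUR addresses it uses. */
struct peer_endpoint {
	uint32_t remote_ip; /* peer address as last seen */
	uint32_t local_ip;  /* our address the peer last sent to; 0 = unknown */
};

/* On every authenticated packet from the peer, remember both addresses. */
static void peer_note_incoming(struct peer_endpoint *ep,
			       uint32_t pkt_src, uint32_t pkt_dst)
{
	ep->remote_ip = pkt_src;
	ep->local_ip = pkt_dst;
}

/*
 * Source address for outgoing packets: the learned local address if there
 * is one, otherwise whatever the system would pick by default.
 */
static uint32_t peer_source_for_send(const struct peer_endpoint *ep,
				     uint32_t system_default)
{
	return ep->local_ip ? ep->local_ip : system_default;
}
```

Before the peer has been heard from, the sketch falls back to the system default, which corresponds to the "default outgoing IP" case raised at the end of the message.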
-- With respect, Roman From tlhackque at yahoo.com Sun Feb 19 21:39:53 2023 From: tlhackque at yahoo.com (tlhackque) Date: Sun, 19 Feb 2023 16:39:53 -0500 Subject: Source IP incorrect on multi homed systems In-Reply-To: <20230220014252.21178988@nvm> References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> <87o7pp76a2.fsf@ungleich.ch> <20230220014252.21178988@nvm> Message-ID: <7d5c1dd6-fded-37bb-2ac1-d8dc541a822e@yahoo.com> On 19-Feb-23 15:42, Roman Mamedov wrote: > On Sun, 19 Feb 2023 21:18:34 +0100 > Nico Schottelius wrote: > >> If I am not mistaken that would mean in practice: >> >> if original_pkt.ip_dst == one_of_my_ips then >> return_pkt.ip_src = original_pkt.ip_dst >> return_pkt.ip_dst = original_pkt.ip_src >> fi >> >> For me that sounds like a sane approach (aside from >> my very simplified algorithm). > Except there is no request and response in WG, and as such no original or > return packet. Another peer contacts you, then some time later you contact the > other peer. Or the other way round. > > WG-wise what will need to be done is to store in each peer's information > structure the local IP that we are supposed to use for communication with that > peer, and to update it when receiving packets from the peer, using the > destination of those. So you would see a "Local IP" in each "peer" section > when doing a "wg show". > > Also, until such an IP is initially stored, it will have to be some default > outgoing IP of the system towards that peer. BTW, how would this work in your > setup: what if the peer does not contact you first, but your machine needs to > contact the peer? > The situation can be (and often is) the same for both peers. If you're the initiator, you send to the peer address using its configured or DNS IP address, and normal routing. 
You note the address used to send, and use it for future communications to that peer. The first packet sets state in the posited firewall/NAT. Subsequent packets using the same source address ensure that the firewall sees them as the same flow. When the peer gets around to saying something (which it will at the latest when the keepalive timer goes off, but probably sooner), it will have noted your source address and its local IP address (the one you used). So it will send using the source address that you know about. This is the same algorithm used by the peer, so they should agree. When either end detects an address change, the process restarts. There is a possibility that the initial packets pass in flight, but I think that would at most result in a dropped packet, which will be resent. I don't think there's a deadlock, but in the event of thrashing, a tie-breaker of using the lowest candidate IP address generally works. When there are multiple choices, it doesn't really matter which pair of IP addresses is picked, as long as they're stable while the systems reside on the same networks. (E.g., it could be two notebook PCs in different hotel rooms, not just two fixed servers or one fixed server and one mobile.) The goal is to establish a flow that stateful packet inspection, NAT, and routing can recognize and use to keep a pinhole open. I don't have time at the moment to work out the corner cases, but that's the overall approach. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: OpenPGP_signature Type: application/pgp-signature Size: 840 bytes Desc: OpenPGP digital signature URL: From nico.schottelius at ungleich.ch Sun Feb 19 21:19:23 2023 From: nico.schottelius at ungleich.ch (Nico Schottelius) Date: Sun, 19 Feb 2023 22:19:23 +0100 Subject: Source IP incorrect on multi homed systems In-Reply-To: <20230220014252.21178988@nvm> References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> <87o7pp76a2.fsf@ungleich.ch> <20230220014252.21178988@nvm> Message-ID: <87h6vh72d4.fsf@ungleich.ch> Hey Roman, Roman Mamedov writes: > On Sun, 19 Feb 2023 21:18:34 +0100 > Nico Schottelius wrote: > >> If I am not mistaken that would mean in practice: >> >> if original_pkt.ip_dst == one_of_my_ips then >> return_pkt.ip_src = original_pkt.ip_dst >> return_pkt.ip_dst = original_pkt.ip_src >> fi >> >> For me that sounds like a sane approach (aside from >> my very simplified algorithm). > > Except there is no request and response in WG, and as such no original or > return packet. Another peer contacts you, then some time later you contact the > other peer. Or the other way round. > > WG-wise what will need to be done is to store in each peer's information > structure the local IP that we are supposed to use for communication with that > peer, and to update it when receiving packets from the peer, using the > destination of those. So you would see a "Local IP" in each "peer" section > when doing a "wg show". That is very interesting, thanks for the insight. Reading the above paragraph, I had a very similar thought: we need to record the local IP. > Also, until such an IP is initially stored, it will have to be some default > outgoing IP of the system towards that peer. BTW, how would this work in your > setup: what if the peer does not contact you first, but your machine needs to > contact the peer? 
So far this situation doesn't exist for us, because only servers are multi homed. However, having an option to specify something like a local address in each peer section would probably be a good solution to disambiguate it and, if not specified, use the default, as in whatever other processes are using that don't define it explicitly - i.e. follow the principle of least surprise. Best regards, Nico -- Sustainable and modern Infrastructures by ungleich.ch From tlhackque at yahoo.com Sun Feb 19 22:06:44 2023 From: tlhackque at yahoo.com (tlhackque) Date: Sun, 19 Feb 2023 17:06:44 -0500 Subject: Source IP incorrect on multi homed systems In-Reply-To: <87h6vh72d4.fsf@ungleich.ch> References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> <87o7pp76a2.fsf@ungleich.ch> <20230220014252.21178988@nvm> <87h6vh72d4.fsf@ungleich.ch> Message-ID: <48ed189f-3b11-7649-8d0a-fbef3f788bdc@yahoo.com> On 19-Feb-23 16:19, Nico Schottelius wrote: > So far this situation doesn't exist for us, because only servers are > multi homed. It's not that uncommon; consider a docked notebook that has a WiFi address and an Ethernet address on the same subnet. While typically the routing priorities favor the Ethernet, the mobile will have both addresses. In a car, you can have WiFi thru the car and mobile data. (Not saying I like this, but ...) There are probably other cases, but I wouldn't assume it's only a server issue. As I also noted in another note: two servers can have the same issue, if both are multi-homed. The solution really needs to be symmetric. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: OpenPGP_signature Type: application/pgp-signature Size: 840 bytes Desc: OpenPGP digital signature URL: From tlhackque at yahoo.com Sun Feb 19 22:28:48 2023 From: tlhackque at yahoo.com (tlhackque) Date: Sun, 19 Feb 2023 17:28:48 -0500 Subject: Source IP incorrect on multi homed systems In-Reply-To: References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> Message-ID: Actually in my case (I'm not the originator of this thread), I don't run BGP. But I do have both site-site and mobile-site clients. Much simpler environment, but same issue. I do understand UDP. As I've noted, DNS UDP has the same issue, and an RFC was issued to clarify that responses MUST come from the address on which a query is received. WG isn't quite the same, as it isn't a request/response protocol. But it is a flow between two endpoints, and NAT/firewalls will open a pinhole for incoming packets when they see an outbound packet. One of the nice things about WG is that except for this issue, it has no dependencies on custom routing (or anything but UDP) and "just works". It should "just work" on multihomed hosts, without handstands, BGP routing, different ports, and the like. It also needs to work where it's not feasible to layer on work-arounds, such as VPSs where you don't get to pick your kernel...or your firewall. Picking stable endpoint addresses would make the traffic look like the kind of flow that these middleboxes recognize, and things would "just work". On 19-Feb-23 13:25, John Lauro wrote: > I think the ip route with src would work, but only as a short lived > work around. The problem with that is if dealing with dynamic routes > is it could go a way when a link is down and then come back and the > src setting would be lost. You would need the bgp software to add the > src. > > UDP is connectionless. 
Sending back out the same as it's coming in > isn't strictly the same.? The streams are not attached the same as > they would be with TCP on nginx or a reply with icmp. You should be > able to whitelist the udp port on the NAT devices, as it shouldn't use > state info. > > I am not sure if you are attempting to do site to site or client to > server/site and which end has the NAT (or both). What I do for site to > site is use a different port for each connection and have a separate > BGP connection for each possible connection (ie: different one for > different network providers).? Have a full mesh with 8 sites and upto > 3 providers per site. > > That said, you probably have floating IPs on the client side, and > don't want to lock in a single IP on the multi-homed server side?? You > could nat the incoming IPs on the border from an internal IP and then > then lock to a single private IP on the wireguard server for in/out > and that border nat would force the reply back to the same gateway it > came in from. > > I know, you don't want work arounds, just want to mention it's not the > same as comparing a single stream to something that handles routing > though it.? As you are doing bgp and redundant routes I assume you > also reset rp_filter on all nat/wireguard/routers so the routers will > allow packets to come from different sources. > > On Sun, Feb 19, 2023 at 12:07 PM tlhackque wrote: > > FWIW, while clever, I don't think that iptables mark solves all > cases. > E.g., consider an interface with multiple addresses, where a packet > comes in on a secondary address.? The proposed rule would send it out > the right interface, but still with the wrong (primary) address > picked > from the interface... > > With IPv6 it's common to assign an address to a service rather than a > host so services can move easily.? So multiple addresses per > interface > are the rule, not the exception. 
> > I do the same with IPv4 inside addresses, though these days public > IPv4 > addresses are scarce enough that it's not common for public IPs.? It > amounts to the same issue - the NAT tracking is stateful. > > Trying to work around this with routing seems like a maze of twisty > passages - so I agree that the right solution is for WG to respond > from > the address that receives a packet. > > On 19-Feb-23 11:32, David Kerr wrote: > > Without getting into the debate of whether wireguard is acting > > correctly or not, I think there is a possible workaround. > > > > 1. In the iptables mangle table PREROUTING, match the incoming > > interface and destination address and --set-xmark a firewall MARK > > unique to this interface/destination > > 2. Create a new ip route table that sets the default route to go out > > on the interface with the source address you want (same as > destination > > address in iptables) > > 3. Create a new ip rule that sends all packets with firewall > mark set > > in iptables to the routing table you just created > > > > Repeat above for each interface/address you need to mangle, with a > > unique firewall mark and routing table for each. > > > > It may be necessary to use CONNMARK in PREROUTING and OUTPUT to > > --restore_mark.? I can't remember if this is needed or not, its > been a > > while since I configured iptables with this. > > > > This should ensure that any packet that comes into an > > interface/address is replied to from the same interface/address. > > > > David > > > > > > On Sun, Feb 19, 2023 at 9:44 AM Christoph > Loesch wrote: > >> Hi, > >> > >> I don't think no one wants to fix it, there are several users > having this issue. I rather guess no one could find a suitable > solution to fix it. > >> > >> @Nico: did you try to delete the affected route and add it > again with the correct source IP ? 
> >> > >> as I mentioned it > inhttps://lists.zx2c4.com/pipermail/wireguard/2021-November/007324.html > > >> > >> ip route del > >> ip route add dev src > >> > >> This way I was able to (at least temporary) fix this issue on > multi homed systems. > >> > >> Kind regards, > >> Christoph > >> > >> Am 19.02.2023 um 13:13 schrieb Nico Schottelius: > >>> Hey Sebastian, > >>> > >>> Sebastian Hyrwall? writes: > >>> > >>>> It is kinda. It's been mentioned multiple times over the > years but no one seems to want to fix it. Atleast you should be > able to specify bind/src ip in the > >>>> config. I gave up WG because of it. Wasn't accepted by my > projects security policy since src ip could not be configured. > >>>> > >>>> There is an unofficial patch however, > >>>> > >>>> > https://github.com/torvalds/linux/commit/5fa98082093344c86345f9f63305cae9d5f9f281 > >>> the binding is somewhat related to this issue and I was > looking for that > >>> feature some time ago, too. While it is correlated and I would > really > >>> appreciate binding support, I am not sure whether the linked > patch does > >>> actually fix the problem I am seeing in multi homed devices. > >>> > >>> As long as wireguard does not reply with the same IP address > it was > >>> contacted with, packets will get dropped on stateful > firewalls, because > >>> the returning packet does not match the state session database. > >>> > >>> Best regards, > >>> > >>> Nico > >>> > >>> -- > >>> Sustainable and modern Infrastructures by ungleich.ch > > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: OpenPGP_signature Type: application/pgp-signature Size: 840 bytes Desc: OpenPGP digital signature URL: From dxld at darkboxed.org Sun Feb 19 22:42:00 2023 From: dxld at darkboxed.org (Daniel Gröber) Date: Sun, 19 Feb 2023 23:42:00 +0100 Subject: Src addr code review (Was: Source IP incorrect on multi homed systems) In-Reply-To: <87h6vh72d4.fsf@ungleich.ch> References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> <87o7pp76a2.fsf@ungleich.ch> <20230220014252.21178988@nvm> <87h6vh72d4.fsf@ungleich.ch> Message-ID: <20230219224200.g5mwcaybee4hujov@House.clients.dxld.at> Hi, I thought it might be useful to do some quick and dirty code review instead of speculating wildly to figure out where these source IP selection problems could be coming from ;) From previous code deep dives I know the udp_tunnel_xmit_skb function is where tunnel packets get handed off to the kernel. So in net/wireguard/socket.c:send4 we have: udp_tunnel_xmit_skb(rt, sock, skb, fl.saddr, fl.daddr, ds, ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport, fl.fl4_dport, false, false); Where fl.saddr is the source address that's supposedly wrong (sometimes? I guess?) Where does that come from? Let's look at the code (heavily culled): struct flowi4 fl = { .saddr = endpoint->src4.s_addr, }; if (cache) rt = dst_cache_get_ip4(cache, &fl.saddr); if (!rt) { if (unlikely(!inet_confirm_addr(sock_net(sock), NULL, 0, fl.saddr, RT_SCOPE_HOST))) fl.saddr = 0; if (unlikely(endpoint->src_if4 && ((IS_ERR(rt) && PTR_ERR(rt) == -EINVAL) || (!IS_ERR(rt) && rt->dst.dev->ifindex != endpoint->src_if4)))) fl.saddr = 0; Well, it's initialized from endpoint->src4.s_addr, overwritten with zero in some cases, which I believe lets the kernel do its regular source addr selection, and populated from something called dst_cache at some callsites. 
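As a rough userspace model of the culled fragment, the decision it makes about fl.saddr can be sketched as below. This is an illustration of the reading given in this message, not kernel code: `struct cached_endpoint` and `choose_saddr` are hypothetical names, the two boolean and integer inputs stand in for the results of inet_confirm_addr() and the route lookup, and the dst_cache path and the -EINVAL route error case are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build an IPv4 address (host byte order) from its dotted-quad parts. */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical stand-in for the cached endpoint fields seen in the snippet. */
struct cached_endpoint {
	uint32_t src4;    /* remembered local source address */
	int      src_if4; /* remembered interface index; 0 = not set */
};

/*
 * Returns the source address to request, or 0 to leave the choice to the
 * kernel's regular source address selection.
 *   src_still_local: models the inet_confirm_addr() result
 *   route_ifindex:   models rt->dst.dev->ifindex of the route about to be used
 */
static uint32_t choose_saddr(const struct cached_endpoint *ep,
			     bool src_still_local, int route_ifindex)
{
	uint32_t saddr = ep->src4;

	if (!src_still_local)                               /* address was removed */
		saddr = 0;
	if (ep->src_if4 && route_ifindex != ep->src_if4)    /* route moved to another interface */
		saddr = 0;
	return saddr;
}
```

Using the interface indexes from the earlier `ip a` output (net1 = 4, net2 = 26), a remembered address on net1 is dropped in this model as soon as the route towards the peer points at net2, which would match the observed reply from 195.141.200.73.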
@Nico could it perhaps simply be that you're hitting one of these zero'ing cases and that's why it's using regular kernel src addr selection instead of the cached endpoint src4 address? The first case !inet_confirm_addr(..., RT_SCOPE_HOST) ought to confirm that the saddr is actually still a local address. Makes sense: if the address we remembered was removed from the interface, we can't use it anymore. The second case looks like it's checking if the (sometimes cached) src_if4 interface index is still what the route we're about to use points to. If neither of those seem likely we can keep reading :) --Daniel From cao88yu at gmail.com Mon Feb 20 00:28:48 2023 From: cao88yu at gmail.com (曹煜) Date: Mon, 20 Feb 2023 08:28:48 +0800 Subject: Src addr code review (Was: Source IP incorrect on multi homed systems) In-Reply-To: <20230219224200.g5mwcaybee4hujov@House.clients.dxld.at> References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> <87o7pp76a2.fsf@ungleich.ch> <20230220014252.21178988@nvm> <87h6vh72d4.fsf@ungleich.ch> <20230219224200.g5mwcaybee4hujov@House.clients.dxld.at> Message-ID: Hi all, I hacked that source code myself months ago, and it works well for my use case (I have 4 dual-stack PPPoE WANs set up on my OpenWrt router, and set up a WireGuard server on it). My hack picks up the dst_addr from the incoming handshake packet in the kernel sk_buff, and then uses that addr as src_addr to reply. I'm not good at source code, and I know that my hack may be ugly, but it works. I hope this patch can help: https://github.com/openwrt/packages/issues/9538#issuecomment-1150592803 Daniel Gröber wrote on Mon, 20 Feb 2023 at 06:42: 
> > Hi, > > I though it might be useful to do some quick and dirty code review instead > of speculating wildly to figure out where these source IP selection > problems could be coming from ;) > > From previous code deep dives I know the udp_tunnel_xmit_skb function is > where tunnel packets get handed off to the kernel. So in > net/wireguard/socket.c:send4 we have: > > udp_tunnel_xmit_skb(rt, sock, skb, fl.saddr, fl.daddr, ds, > ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport, > fl.fl4_dport, false, false); > > Where fl.saddr is the source address that's supposedly wrong (sometimes? I > guess?) Where does that come from? > > Let's look at the code (heavily culled): > > struct flowi4 fl = { > .saddr = endpoint->src4.s_addr, > }; > if (cache) > rt = dst_cache_get_ip4(cache, &fl.saddr); > if (!rt) { > if (unlikely(!inet_confirm_addr(sock_net(sock), NULL, 0, > fl.saddr, RT_SCOPE_HOST))) > fl.saddr = 0; > if (unlikely(endpoint->src_if4 && ((IS_ERR(rt) && > PTR_ERR(rt) == -EINVAL) || (!IS_ERR(rt) && > rt->dst.dev->ifindex != endpoint->src_if4)))) > fl.saddr = 0; > > Well it's initialized from endpoint->src4.s_addr, overwritten with zero in > some cases, which I believe lets the kernel do it's regular source addr > selection, and populated from something called dst_cache at some callsites. > > @Nico could it perhaps simply be that you're hitting one of these zero'ing > cases and that's why it's using regular kernel src addr selection instead > of the cached endpoint src4 address? > > The first case !inet_confirm_addr(..., RT_SCOPE_HOST) ought to confirm that > the saddr is actually still a local address. Makes sens if the address we > remembered was removed from the interface we can't use it anymore. > > The second case looks like it's checking if the (sometimes cached) src_if4 > interface index is still what the route we're about to use points to. 
> > If neither of those seem likely we can keep reading :) > > --Daniel > > > From luizluca at gmail.com Mon Feb 20 00:58:43 2023 From: luizluca at gmail.com (Luiz Angelo Daros de Luca) Date: Sun, 19 Feb 2023 21:58:43 -0300 Subject: Source IP incorrect on multi homed systems In-Reply-To: References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> Message-ID: Yes, wg is not a request/response protocol. But it does have some state. Can't wireguard remember the last local address that each peer sent traffic to? It is just like the tracking already in use for the peer IP address. If there is a "last address" it would be nice if we could hint the kernel to use that as the source address, with a fallback to the current behavior if the address is not available. It might solve a couple of problems. I just don't know if it is possible to hint the source address without enforcing it. If not, wg would have to deal with cases when the address is gone. Regards, Luiz From tech at tootai.net Mon Feb 20 09:33:36 2023 From: tech at tootai.net (Daniel) Date: Mon, 20 Feb 2023 10:33:36 +0100 Subject: Wish - Add PostUp/PostDown per peer section Message-ID: Hello. Would it be possible to add the PostUp and PostDown commands in the peer section? Ex. 
of use case: dynamically add route when a peer connect Have a nice day -- Daniel From nico.schottelius at ungleich.ch Mon Feb 20 09:47:36 2023 From: nico.schottelius at ungleich.ch (Nico Schottelius) Date: Mon, 20 Feb 2023 10:47:36 +0100 Subject: Src addr code review (Was: Source IP incorrect on multi homed systems) In-Reply-To: <20230219224200.g5mwcaybee4hujov@House.clients.dxld.at> References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> <87o7pp76a2.fsf@ungleich.ch> <20230220014252.21178988@nvm> <87h6vh72d4.fsf@ungleich.ch> <20230219224200.g5mwcaybee4hujov@House.clients.dxld.at> Message-ID: <87leksbr5n.fsf@ungleich.ch> Hey Daniel, thanks a lot for diving in ... Daniel Gr?ber writes: > Let's look at the code (heavily culled): > > struct flowi4 fl = { > .saddr = endpoint->src4.s_addr, > }; > if (cache) > rt = dst_cache_get_ip4(cache, &fl.saddr); What I am wondering is, how did it get into the cache in the first place? > [...] > > @Nico could it perhaps simply be that you're hitting one of these zero'ing > cases and that's why it's using regular kernel src addr selection instead > of the cached endpoint src4 address? That could absolutely be the case. What is funky is that I see the problem on two very different systems, but maybe it's a good time to elaborate on this: - System A: - Wireguard module loaded on the host - Wireguard wg-quick used within a kubernetes pods that has permissions for managing wireguard - The same pod also runs bird for BGP peering - System B: - Wireguard running as wireguard-go on OpnSense / FreeBSD - BGP running with frr Both systems exhibit the behaviour, but maybe it's better to focus on System A first, as this seems to be more the "upstream" source. 
Best regards, Nico -- Sustainable and modern Infrastructures by ungleich.ch From nico.schottelius at ungleich.ch Mon Feb 20 10:40:28 2023 From: nico.schottelius at ungleich.ch (Nico Schottelius) Date: Mon, 20 Feb 2023 11:40:28 +0100 Subject: Src addr code review (Was: Source IP incorrect on multi homed systems) In-Reply-To: References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> <87o7pp76a2.fsf@ungleich.ch> <20230220014252.21178988@nvm> <87h6vh72d4.fsf@ungleich.ch> <20230219224200.g5mwcaybee4hujov@House.clients.dxld.at> Message-ID: <87sff0oc06.fsf@ungleich.ch> Hello ??, on github it seems your patch was applied / the issue was closed - is that the correct current status? Best regards, Nico ?? writes: > Hi all, > I've hacked that source code myself months ago, and it works well on > my use case (I have 4 dual stack pppoe wan set on my openwrt router, > and seted a wireguard sever on it), my hack will pickup the dst_addr > from incoming handshake packet in kernel sk_buff, and then use that > addr as src_addr to reply. > I'm not good at source code, and I know that my hack may be ugly, but > it works, hope this patch can help: > https://github.com/openwrt/packages/issues/9538#issuecomment-1150592803 > > Daniel Gr?ber ?2023?2?20??? 06:42??? >> >> Hi, >> >> I though it might be useful to do some quick and dirty code review instead >> of speculating wildly to figure out where these source IP selection >> problems could be coming from ;) >> >> From previous code deep dives I know the udp_tunnel_xmit_skb function is >> where tunnel packets get handed off to the kernel. So in >> net/wireguard/socket.c:send4 we have: >> >> udp_tunnel_xmit_skb(rt, sock, skb, fl.saddr, fl.daddr, ds, >> ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport, >> fl.fl4_dport, false, false); >> >> Where fl.saddr is the source address that's supposedly wrong (sometimes? 
I >> guess?) Where does that come from? >> >> Let's look at the code (heavily culled): >> >> struct flowi4 fl = { >> .saddr = endpoint->src4.s_addr, >> }; >> if (cache) >> rt = dst_cache_get_ip4(cache, &fl.saddr); >> if (!rt) { >> if (unlikely(!inet_confirm_addr(sock_net(sock), NULL, 0, >> fl.saddr, RT_SCOPE_HOST))) >> fl.saddr = 0; >> if (unlikely(endpoint->src_if4 && ((IS_ERR(rt) && >> PTR_ERR(rt) == -EINVAL) || (!IS_ERR(rt) && >> rt->dst.dev->ifindex != endpoint->src_if4)))) >> fl.saddr = 0; >> >> Well it's initialized from endpoint->src4.s_addr, overwritten with zero in >> some cases, which I believe lets the kernel do it's regular source addr >> selection, and populated from something called dst_cache at some callsites. >> >> @Nico could it perhaps simply be that you're hitting one of these zero'ing >> cases and that's why it's using regular kernel src addr selection instead >> of the cached endpoint src4 address? >> >> The first case !inet_confirm_addr(..., RT_SCOPE_HOST) ought to confirm that >> the saddr is actually still a local address. Makes sens if the address we >> remembered was removed from the interface we can't use it anymore. >> >> The second case looks like it's checking if the (sometimes cached) src_if4 >> interface index is still what the route we're about to use points to. >> >> If neither of those seem likely we can keep reading :) >> >> --Daniel >> >> >> -- Sustainable and modern Infrastructures by ungleich.ch From icepic.dz at gmail.com Mon Feb 20 11:09:25 2023 From: icepic.dz at gmail.com (Janne Johansson) Date: Mon, 20 Feb 2023 12:09:25 +0100 Subject: Source IP incorrect on multi homed systems Message-ID: rewriting for the lists, managed to bold some pasted text and hence get blocked due to html-mails not allowed on list. Den s?n 19 feb. 2023 kl 21:17 skrev Nico Schottelius : > Janne Johansson writes: > > *) https://en.wiktionary.org/wiki/Chesterton%27s_fence > > I am happy to have learned a new principle today, thanks for that. 
>
> And to be sure that everyone is on the same page:
>
>     Wireguard should reply by default with the source address that
>     used to be the destination address, but at the moment wireguard is not
>     doing that at the moment.
>
> If anyone disagrees with above statement, please let me know.

I disagree, but perhaps only because that statement is slightly too short.

Let's assume I have two ISPs and hence a multihomed wg peer, with IP
A.x.x.x from ISP A, and IP B.x.x.x from ISP B. For some reason, this
box has a routing table that says "prefer link A to reach the
internet", but I set up client C to set up wireguard to B.1.2.3 and
client C sends it a UDP packet with src IP C and dest IP B.x.x.x.

Since UDP is stateless, the "response" from the multihomed server is
created "out of thin air" as a random UDP packet destined for C. We
don't feel it is unrelated to the previous received packet, but from
the TCP/IP stack's perspective it is. The routing table now decides
that interface A will be the awesomest for sending UDP to C, and
therefore creates a packet with source IP A.x.x.x and dest IP C.x.x.x
and sends it off. This surprise seems to be the main issue in this
thread.

Perhaps we see this multihomed box as slightly misconfigured as far as
wireguard goes, perhaps it should have posted A.x.x.x instead of
B.x.x.x as the wg endpoint to the client or whatever, but the facts
remain.

Now, the above statement that you hope to get everyone to agree on
would also need to include "sending it back on interface B, to the gw
used by interface B to ISP B if there is one", or else ISP A might drop
the packet as being sent from a "forged" address, since it looks like a
fake source IP from ISP A's perspective. The routing lookups - before
any applied tricks - will look at destination IPs only and make the
decision based on that.
I think the proposed solution, while attractive at first glance, may
be trading one kind of "surprise" behaviour for another, where
interface B might be less useful than A, which would explain why the
default route is set to use A.

If you look at the many posts on the internet over many years about
"why udp source ip got chosen wrong on multihomed boxes" you see
answers like:

"You either bind(2) to each interface address and manage multiple
sockets, or let the kernel do the implicit source IP assignment with
INADDR_ANY. There is no other way."

( https://stackoverflow.com/questions/3062205/setting-the-source-ip-for-a-udp-socket
, not about vpns and lots older than wireguard)

What this means is that if you have a box where links and interfaces
come and go (usb wifi dongles, tethered cell phones..) then wireguard
now has to do a lot of extra work, trying to keep tabs on what
interfaces exist or not, instead of just binding to the wildcard
address and letting the kernel handle this by itself in the normal
but, to some, surprising way for udp packets.

My gut feeling is that if you have a setup like this example
multihomed peer, you get to do some extra steps, which may include the
aforementioned firewall mark "tricks", use VRFs/Namespaced
interfaces/routing domains, add a specific route to client C ip over
link B, bind wg to a loopback interface or source-nat on outgoing wg
traffic or something along those lines in order to have a wg endpoint
on a less-preferred interface and not cause issues with
stateful-nat-gws at client C.

-- 
May the most significant bit of your life be positive.
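For a userspace UDP daemon on Linux (not the WireGuard kernel module), there is a mechanism the quoted answer does not mention: the IP_PKTINFO socket option described in ip(7). A single wildcard-bound socket can learn which local address each datagram was sent to and ask for that address as the source of the reply. The following is only a rough IPv4 sketch; the helper names enable_pktinfo, recv_with_dst and send_from are made up for illustration, and it does not address the routing/gateway concern raised above.

```c
#define _GNU_SOURCE /* exposes struct in_pktinfo in glibc's <netinet/in.h> */
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control-message buffer, aligned the way cmsg(3) recommends. */
union pktinfo_buf {
	char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
	struct cmsghdr align;
};

/* Ask the kernel to attach an IP_PKTINFO control message to every
 * datagram received on fd. Returns 0 on success, -1 on error. */
int enable_pktinfo(int fd)
{
	int on = 1;
	return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

/* Receive one datagram. *peer gets the sender, *local gets the
 * destination address from the packet's IP header (0.0.0.0 if the
 * kernel supplied no IP_PKTINFO). Returns payload length or -1. */
ssize_t recv_with_dst(int fd, void *buf, size_t len,
		      struct sockaddr_in *peer, struct in_addr *local)
{
	union pktinfo_buf u;
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg = {
		.msg_name = peer, .msg_namelen = sizeof(*peer),
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = u.buf, .msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *c;
	ssize_t n;

	assert(peer && local);
	local->s_addr = htonl(INADDR_ANY);
	n = recvmsg(fd, &msg, 0);
	if (n < 0)
		return -1;
	for (c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
		if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO) {
			struct in_pktinfo pi;
			memcpy(&pi, CMSG_DATA(c), sizeof(pi));
			*local = pi.ipi_addr;
		}
	}
	return n;
}

/* Send a datagram to *peer and request `local` as its source address
 * via ipi_spec_dst. Returns bytes sent or -1. */
ssize_t send_from(int fd, const void *buf, size_t len,
		  const struct sockaddr_in *peer, struct in_addr local)
{
	union pktinfo_buf u;
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	struct msghdr msg = {
		.msg_name = (void *)peer, .msg_namelen = sizeof(*peer),
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = u.buf, .msg_controllen = sizeof(u.buf),
	};
	struct in_pktinfo pi;
	struct cmsghdr *c;

	memset(&u, 0, sizeof(u));
	memset(&pi, 0, sizeof(pi));
	pi.ipi_spec_dst = local; /* ipi_ifindex stays 0: routing picks the link */
	c = CMSG_FIRSTHDR(&msg);
	c->cmsg_level = IPPROTO_IP;
	c->cmsg_type = IP_PKTINFO;
	c->cmsg_len = CMSG_LEN(sizeof(pi));
	memcpy(CMSG_DATA(c), &pi, sizeof(pi));
	return sendmsg(fd, &msg, 0);
}
```

A server loop would call enable_pktinfo once after bind(2), then pair every recv_with_dst with a send_from that passes the same local address back. Whether the reply then leaves through a usable gateway is still up to the routing table, which is exactly the caveat discussed in this mail.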
From cao88yu at gmail.com Mon Feb 20 11:21:07 2023
From: cao88yu at gmail.com (曹煜)
Date: Mon, 20 Feb 2023 19:21:07 +0800
Subject: Src addr code review (Was: Source IP incorrect on multi homed systems)
In-Reply-To: <87sff0oc06.fsf@ungleich.ch>
References: <875yby83n2.fsf@ungleich.ch>
 <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id>
 <87ttzhc0jt.fsf@ungleich.ch>
 <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at>
 <87o7pp76a2.fsf@ungleich.ch>
 <20230220014252.21178988@nvm>
 <87h6vh72d4.fsf@ungleich.ch>
 <20230219224200.g5mwcaybee4hujov@House.clients.dxld.at>
 <87sff0oc06.fsf@ungleich.ch>
Message-ID:

Hi Nico,
I closed that issue myself, but the patch didn't get applied because
the problem comes from WireGuard itself, and the maintainer told me
that I should send my patch to WireGuard upstream (but I gave up on
sending it to the WireGuard team).

On Mon, Feb 20, 2023 at 18:41, Nico Schottelius wrote:
>
>
> Hello 曹煜,
>
> on github it seems your patch was applied / the issue was closed - is
> that the correct current status?
>
> Best regards,
>
> Nico
>
> 曹煜 writes:
>
> > Hi all,
> > I've hacked that source code myself months ago, and it works well on
> > my use case (I have 4 dual stack pppoe wan set on my openwrt router,
> > and seted a wireguard sever on it), my hack will pickup the dst_addr
> > from incoming handshake packet in kernel sk_buff, and then use that
> > addr as src_addr to reply.
> > I'm not good at source code, and I know that my hack may be ugly, but
> > it works, hope this patch can help:
> > https://github.com/openwrt/packages/issues/9538#issuecomment-1150592803
> >
> > On Mon, Feb 20, 2023 at 06:42, Daniel Gröber wrote:
> >> > >> Hi, > >> > >> I though it might be useful to do some quick and dirty code review instead > >> of speculating wildly to figure out where these source IP selection > >> problems could be coming from ;) > >> > >> From previous code deep dives I know the udp_tunnel_xmit_skb function is > >> where tunnel packets get handed off to the kernel. So in > >> net/wireguard/socket.c:send4 we have: > >> > >> udp_tunnel_xmit_skb(rt, sock, skb, fl.saddr, fl.daddr, ds, > >> ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport, > >> fl.fl4_dport, false, false); > >> > >> Where fl.saddr is the source address that's supposedly wrong (sometimes? I > >> guess?) Where does that come from? > >> > >> Let's look at the code (heavily culled): > >> > >> struct flowi4 fl = { > >> .saddr = endpoint->src4.s_addr, > >> }; > >> if (cache) > >> rt = dst_cache_get_ip4(cache, &fl.saddr); > >> if (!rt) { > >> if (unlikely(!inet_confirm_addr(sock_net(sock), NULL, 0, > >> fl.saddr, RT_SCOPE_HOST))) > >> fl.saddr = 0; > >> if (unlikely(endpoint->src_if4 && ((IS_ERR(rt) && > >> PTR_ERR(rt) == -EINVAL) || (!IS_ERR(rt) && > >> rt->dst.dev->ifindex != endpoint->src_if4)))) > >> fl.saddr = 0; > >> > >> Well it's initialized from endpoint->src4.s_addr, overwritten with zero in > >> some cases, which I believe lets the kernel do it's regular source addr > >> selection, and populated from something called dst_cache at some callsites. > >> > >> @Nico could it perhaps simply be that you're hitting one of these zero'ing > >> cases and that's why it's using regular kernel src addr selection instead > >> of the cached endpoint src4 address? > >> > >> The first case !inet_confirm_addr(..., RT_SCOPE_HOST) ought to confirm that > >> the saddr is actually still a local address. Makes sens if the address we > >> remembered was removed from the interface we can't use it anymore. 
> >> > >> The second case looks like it's checking if the (sometimes cached) src_if4 > >> interface index is still what the route we're about to use points to. > >> > >> If neither of those seem likely we can keep reading :) > >> > >> --Daniel > >> > >> > >> > > > -- > Sustainable and modern Infrastructures by ungleich.ch From johnalauro at gmail.com Sun Feb 19 18:30:42 2023 From: johnalauro at gmail.com (John Lauro) Date: Sun, 19 Feb 2023 13:30:42 -0500 Subject: Fwd: Source IP incorrect on multi homed systems In-Reply-To: References: <875yby83n2.fsf@ungleich.ch> <2ed829aaed9fec59ac2a9b32c4ce0a9005b8d8b850be81c81a226791855fe4eb@mu.id> <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> Message-ID: I think the ip route with src would work, but only as a short lived work around. The problem with it is if dealing with dynamic routes is it could go a way when a link is down and then come back and the src setting would be lost. You would need the bgp software to add the src. UDP is connectionless. Sending back out the same as it's coming in isn't strictly the same. The streams are not attached the same as they would be with TCP on nginx or a reply with icmp. You should be able to whitelist the udp port on the NAT devices, as it shouldn't use state info. I am not sure if you are attempting to do site to site or client to server/site and which end has the NAT (or both). What I do for site to site is use a different port for each connection and have a separate BGP connection for each possible connection (ie: different one for different network providers). Have a full mesh with 8 sites and upto 3 providers per site. That said, you probably have floating IPs on the client side, and don't want to lock in a single IP on the multi-homed server side? 
You could NAT the incoming IPs on the border from an internal IP and
then lock to a single private IP on the wireguard server for in/out,
and that border NAT would force the reply back to the same gateway it
came in from.

I know, you don't want workarounds, I just want to mention it's not
the same as comparing a single stream to something that handles
routing through it.

As you are doing BGP and redundant routes, I assume you also reset
rp_filter on all nat/wireguard/routers so the routers will allow
packets to come from different sources.

On Sun, Feb 19, 2023 at 12:07 PM tlhackque wrote:
>
> FWIW, while clever, I don't think that iptables mark solves all cases.
> E.g., consider an interface with multiple addresses, where a packet
> comes in on a secondary address. The proposed rule would send it out
> the right interface, but still with the wrong (primary) address picked
> from the interface...
>
> With IPv6 it's common to assign an address to a service rather than a
> host so services can move easily. So multiple addresses per interface
> are the rule, not the exception.
>
> I do the same with IPv4 inside addresses, though these days public IPv4
> addresses are scarce enough that it's not common for public IPs. It
> amounts to the same issue - the NAT tracking is stateful.
>
> Trying to work around this with routing seems like a maze of twisty
> passages - so I agree that the right solution is for WG to respond from
> the address that receives a packet.
From karog at jgibbons.com Sun Feb 19 18:41:27 2023
From: karog at jgibbons.com (karog)
Date: Sun, 19 Feb 2023 13:41:27 -0500
Subject: [PATCH v2] Allow config to read secret keys from file
In-Reply-To: <20230219182357.444395-1-dxld@darkboxed.org>
References: <20230219182357.444395-1-dxld@darkboxed.org>
Message-ID: <0EB0C3F5-AB25-4E14-9390-7FE24CAD7BB8@jgibbons.com>

Instead of using new config keys, did you consider using special values
for this case, like

PrivateKey=file:/path/to/key
PresharedKey=file:/path/to/preshared

In addition to not proliferating new config keys, it also prevents the
possibility of erring by including both PrivateKey and PrivateKeyFile.

This kind of syntax is used in systemd service files for things like
StandardOutput and StandardError.

karog

> On Feb 19, 2023, at 1:23 PM, Daniel Gröber wrote:
>
> This adds two new config keys PrivateKeyFile= and PresharedKeyFile= that
> simply hook up the existing code for the `wg set ... private-key /file`
> codepath.
>
> By using the new options wireguard configs can become a lot easier to
> manage and deploy as we don't have to treat them as secrets anymore. This
> way they can, for example, be tracked in public git repos while the secret
> keys can be provisioned using an out of band system or with a manual
> one-time step instead.
>
> Before this patch we were using an ugly hack: it's possible to simply omit
> PrivateKey= and set it using `PostUp = wg set %i private-key /some/file`.
> However this breaks when we try to use setconf or synconf as they
> will (rightly) unset the private key when it's missing in the underlying
> config file breaking connectivity.
> > Reviewed-By: Michael Tokarev > Signed-off-by: Daniel Gr?ber > --- > src/config.c | 8 ++++++++ > src/man/wg.8 | 4 ++++ > 2 files changed, 12 insertions(+) > > diff --git a/src/config.c b/src/config.c > index e8db900..f9980fe 100644 > --- a/src/config.c > +++ b/src/config.c > @@ -464,6 +464,10 @@ static bool process_line(struct config_ctx *ctx, const char *line) > ret = parse_key(ctx->device->private_key, value); > if (ret) > ctx->device->flags |= WGDEVICE_HAS_PRIVATE_KEY; > + } else if (key_match("PrivateKeyFile")) { > + ret = parse_keyfile(ctx->device->private_key, value); > + if (ret) > + ctx->device->flags |= WGDEVICE_HAS_PRIVATE_KEY; > } else > goto error; > } else if (ctx->is_peer_section) { > @@ -483,6 +487,10 @@ static bool process_line(struct config_ctx *ctx, const char *line) > ret = parse_key(ctx->last_peer->preshared_key, value); > if (ret) > ctx->last_peer->flags |= WGPEER_HAS_PRESHARED_KEY; > + } else if (key_match("PresharedKeyFile")) { > + ret = parse_keyfile(ctx->last_peer->preshared_key, value); > + if (ret) > + ctx->last_peer->flags |= WGPEER_HAS_PRESHARED_KEY; > } else > goto error; > } else > diff --git a/src/man/wg.8 b/src/man/wg.8 > index fd9fde7..48f084d 100644 > --- a/src/man/wg.8 > +++ b/src/man/wg.8 > @@ -134,6 +134,8 @@ The \fIInterface\fP section may contain the following fields: > .IP \(bu > PrivateKey \(em a base64 private key generated by \fIwg genkey\fP. Required. > .IP \(bu > +PrivateKeyFile \(em path to a file containing a base64 private key. May be used instead of \fIPrivateKey\fP. Optional. > +.IP \(bu > ListenPort \(em a 16-bit port for listening. Optional; if not specified, chosen > randomly. > .IP \(bu > @@ -151,6 +153,8 @@ and may be omitted. This option adds an additional layer of symmetric-key > cryptography to be mixed into the already existing public-key cryptography, > for post-quantum resistance. > .IP \(bu > +PresharedKeyFile \(em path to a file containing a base64 preshared key. 
May be used instead of \fIPresharedKey\fP. Optional. > +.IP \(bu > AllowedIPs \(em a comma-separated list of IP (v4 or v6) addresses with > CIDR masks from which incoming traffic for this peer is allowed and to > which outgoing traffic for this peer is directed. The catch-all > -- > 2.30.2 From me at FingerlessGloves.me Mon Feb 20 11:29:21 2023 From: me at FingerlessGloves.me (FingerlessGloves) Date: Mon, 20 Feb 2023 11:29:21 +0000 Subject: Wish - Add PostUp/PostDown per peer section In-Reply-To: References: Message-ID: <538cc161fec50f06cffcc0d0e08b0a10@FingerlessGloves.me> This would be very handy. Currently we have a python script that reads in the peers and add/removes ARP entries from the physical interface, which runs as a service and checks WireGuard every 10 seconds ?. --- FingerlessGloves On 2023-02-20 09:33 AM, Daniel wrote: > Hello. Would it be possible to add the PostUp and PostDown commands in > peer section ? > > Ex. of use case: dynamically add route when a peer connect > > Have a nice day From dxld at darkboxed.org Mon Feb 20 20:43:47 2023 From: dxld at darkboxed.org (dxld at darkboxed.org) Date: Mon, 20 Feb 2023 21:43:47 +0100 Subject: Src addr code review (Was: Source IP incorrect on multi homed systems) In-Reply-To: <87leksbr5n.fsf@ungleich.ch> References: <87ttzhc0jt.fsf@ungleich.ch> <7d7bc930-65d9-f13e-cedc-e0451407be85@chil.at> <87o7pp76a2.fsf@ungleich.ch> <20230220014252.21178988@nvm> <87h6vh72d4.fsf@ungleich.ch> <20230219224200.g5mwcaybee4hujov@House.clients.dxld.at> <87leksbr5n.fsf@ungleich.ch> Message-ID: <20230220204347.nqeusqotqtxbjiw2@House.clients.dxld.at> Hi Nico, On Mon, Feb 20, 2023 at 10:47:36AM +0100, Nico Schottelius wrote: > Daniel Gr?ber writes: > > Let's look at the code (heavily culled): > > > > struct flowi4 fl = { > > .saddr = endpoint->src4.s_addr, > > }; > > if (cache) > > rt = dst_cache_get_ip4(cache, &fl.saddr); > > What I am wondering is, how did it get into the cache in the first place? 
Right so, endpoint->src4 is set in wg_socket_set_peer_endpoint, which is
called either through wg_socket_set_peer_endpoint_from_skb in the
handshake receive code or directly in the data path. The _from_skb
variant also calls wg_socket_endpoint_from_skb.

Here we're remembering the src addr of the (received) packet in addr4,
and its dst addr, which we're going to use as the source when sending,
in src4, as you'd expect:

    endpoint->addr4.sin_family = AF_INET;
    endpoint->addr4.sin_port = udp_hdr(skb)->source;
    endpoint->addr4.sin_addr.s_addr = ip_hdr(skb)->saddr;
    endpoint->src4.s_addr = ip_hdr(skb)->daddr;
    endpoint->src_if4 = skb->skb_iif;

The dst_cache is set just after those zero'ing conditionals we were
looking at before. It's cleared whenever the endpoint/port changes or one
of those cases is hit. Note the dst_cache is only used for data packets, so
handshakes would be unaffected if it was the cause of your woes.

> > @Nico could it perhaps simply be that you're hitting one of these zero'ing
> > cases and that's why it's using regular kernel src addr selection instead
> > of the cached endpoint src4 address?
>
> That could absolutely be the case. What is funky is that I see the
> problem on two very different systems
>
> Both systems exhibit the behaviour, but maybe it's better to focus on
> System A first, as this seems to be more the "upstream" source.

It is weird indeed, but yeah. One thing at a time. BTW, what kernel
version/distro are we dealing with?

--Daniel
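To make the bookkeeping above easier to follow, here is a rough userspace model of it. This is not the kernel code: struct endpoint_model, struct rx_packet, endpoint_from_packet and pick_source are hypothetical names, the locality and route checks are passed in as plain parameters, and the -EINVAL route-lookup case from the real conditional is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-peer endpoint state. */
struct endpoint_model {
	uint32_t peer_addr; /* like addr4.sin_addr: where the peer sent from */
	uint16_t peer_port; /* like addr4.sin_port */
	uint32_t src_addr;  /* like src4: local address the peer sent to */
	int src_if;         /* like src_if4: interface the packet arrived on */
};

/* Hypothetical, simplified view of a received UDP/IPv4 packet. */
struct rx_packet {
	uint32_t saddr, daddr;
	uint16_t sport;
	int iif;
};

/* Mirrors the assignment block quoted above: the packet's source
 * becomes our destination, and its destination becomes the source
 * we would like to reply from. */
void endpoint_from_packet(struct endpoint_model *ep, const struct rx_packet *p)
{
	assert(ep && p);
	ep->peer_addr = p->saddr;
	ep->peer_port = p->sport;
	ep->src_addr = p->daddr;
	ep->src_if = p->iif;
}

/* Returns the source address to request for a reply, or 0 to fall back
 * to the kernel's regular source address selection. The two early
 * returns model the two zeroing cases: the remembered address is no
 * longer local, or the chosen route leaves via another interface. */
uint32_t pick_source(const struct endpoint_model *ep, bool addr_is_local,
		     int route_ifindex)
{
	assert(ep);
	if (!addr_is_local)
		return 0;
	if (ep->src_if && route_ifindex != ep->src_if)
		return 0;
	return ep->src_addr;
}
```

In this model a reply only keeps the remembered source while both checks pass, which is one way the "wrong" source address could show up on a multihomed box whose preferred route points at a different interface than the one the packet arrived on.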