From syzbot+48f45f6dd79ca20c3283 at syzkaller.appspotmail.com Sun May 4 19:10:35 2025
From: syzbot+48f45f6dd79ca20c3283 at syzkaller.appspotmail.com (syzbot)
Date: Sun, 04 May 2025 19:10:35 -0000
Subject: [syzbot] [wireguard?] INFO: rcu detected stall in wg_packet_handshake_receive_worker (3)
Message-ID: <6817bba9.050a0220.11da1b.0036.GAE@google.com>

Hello,

syzbot found the following issue on:

HEAD commit:    ebd297a2affa Merge tag 'net-6.15-rc5' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=152b41cc580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=541aa584278da96c
dashboard link: https://syzkaller.appspot.com/bug?extid=48f45f6dd79ca20c3283
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=170e1f74580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/6ddda4d4b637/disk-ebd297a2.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/e2a2d6ca1abd/vmlinux-ebd297a2.xz
kernel image: https://storage.googleapis.com/syzbot-assets/1b7bc593408e/bzImage-ebd297a2.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+48f45f6dd79ca20c3283 at syzkaller.appspotmail.com

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 0-...!: (3 ticks this GP) idle=f5e4/1/0x4000000000000000 softirq=17502/17502 fqs=0
rcu: (detected by 1, t=10503 jiffies, g=8145, q=1636 ncpus=2)
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 5843 Comm: kworker/0:3 Not tainted 6.15.0-rc4-syzkaller-00147-gebd297a2affa #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/19/2025
Workqueue: wg-kex-wg0 wg_packet_handshake_receive_worker
RIP: 0010:bytes_is_nonzero mm/kasan/generic.c:89 [inline]
RIP: 0010:memory_is_nonzero mm/kasan/generic.c:104 [inline]
RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:129
[inline] RIP: 0010:memory_is_poisoned mm/kasan/generic.c:161 [inline] RIP: 0010:check_region_inline mm/kasan/generic.c:180 [inline] RIP: 0010:kasan_check_range+0x105/0x1a0 mm/kasan/generic.c:189 Code: 75 0a b8 01 00 00 00 45 3a 11 7c 0b 44 89 c2 e8 61 ec ff ff 83 f0 01 5b 5d 41 5c c3 cc cc cc cc 48 85 d2 74 4f 48 01 ea eb 09 <48> 83 c0 01 48 39 d0 74 41 80 38 00 74 f2 eb b2 41 bc 08 00 00 00 RSP: 0018:ffffc90000007d00 EFLAGS: 00000046 RAX: ffffed100f465a10 RBX: ffffed100f465a11 RCX: ffffffff89808340 RDX: ffffed100f465a11 RSI: 0000000000000004 RDI: ffff88807a32d080 RBP: ffffed100f465a10 R08: 0000000000000001 R09: ffffed100f465a10 R10: ffff88807a32d083 R11: 0000000000000000 R12: 0000000000000000 R13: ffff88807a32d000 R14: 0000000000000000 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8881249e2000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000560c50fa2060 CR3: 0000000029abe000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: instrument_atomic_write include/linux/instrumented.h:82 [inline] atomic_set include/linux/atomic/atomic-instrumented.h:67 [inline] taprio_set_budgets+0x1a0/0x310 net/sched/sch_taprio.c:672 advance_sched+0x5f6/0xc80 net/sched/sch_taprio.c:977 __run_hrtimer kernel/time/hrtimer.c:1761 [inline] __hrtimer_run_queues+0x1ff/0xad0 kernel/time/hrtimer.c:1825 hrtimer_interrupt+0x397/0x8e0 kernel/time/hrtimer.c:1887 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1038 [inline] __sysvec_apic_timer_interrupt+0x108/0x3f0 arch/x86/kernel/apic/apic.c:1055 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] sysvec_apic_timer_interrupt+0x9f/0xc0 arch/x86/kernel/apic/apic.c:1049 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 RIP: 0010:lock_acquire+0x62/0x350 kernel/locking/lockdep.c:5870 Code: ce 0b 12 83 f8 07 0f 87 bc 02 00 00 89 c0 48 0f 
a3 05 42 ea ec 0e 0f 82 74 02 00 00 8b 35 da 19 ed 0e 85 f6 0f 85 8d 00 00 00 <48> 8b 44 24 30 65 48 2b 05 19 ce 0b 12 0f 85 c7 02 00 00 48 83 c4 RSP: 0018:ffffc90003537850 EFLAGS: 00000206 RAX: 0000000000000046 RBX: ffffffff8e588130 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffffff8dbbb25f RDI: ffffffff8bf47e20 RBP: 0000000000000002 R08: 52b2bffd12c8faba R09: ffffffff968457c8 R10: 0000000000000004 R11: 0000000000002bc0 R12: 0000000000000001 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 srcu_lock_acquire include/linux/srcu.h:161 [inline] srcu_read_lock include/linux/srcu.h:253 [inline] kasan_quarantine_reduce+0x8e/0x1e0 mm/kasan/quarantine.c:259 __kasan_slab_alloc+0x69/0x90 mm/kasan/common.c:329 kasan_slab_alloc include/linux/kasan.h:250 [inline] slab_post_alloc_hook mm/slub.c:4161 [inline] slab_alloc_node mm/slub.c:4210 [inline] __kmalloc_cache_noprof+0x1f1/0x3e0 mm/slub.c:4367 kmalloc_noprof include/linux/slab.h:905 [inline] kzalloc_noprof include/linux/slab.h:1039 [inline] keypair_create drivers/net/wireguard/noise.c:100 [inline] wg_noise_handshake_begin_session+0xe5/0xe80 drivers/net/wireguard/noise.c:827 wg_packet_send_handshake_response+0x216/0x310 drivers/net/wireguard/send.c:96 wg_receive_handshake_packet+0x247/0xbf0 drivers/net/wireguard/receive.c:154 wg_packet_handshake_receive_worker+0x17f/0x3a0 drivers/net/wireguard/receive.c:213 process_one_work+0x9cc/0x1b70 kernel/workqueue.c:3238 process_scheduled_works kernel/workqueue.c:3319 [inline] worker_thread+0x6c8/0xf10 kernel/workqueue.c:3400 kthread+0x3c2/0x780 kernel/kthread.c:464 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:153 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 rcu: rcu_preempt kthread timer wakeup didn't happen for 10502 jiffies! g8145 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 rcu: Possible timer handling issue on cpu=0 timer-softirq=4379 rcu: rcu_preempt kthread starved for 10503 jiffies! 
g8145 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:I stack:28728 pid:16 tgid:16 ppid:2 task_flags:0x208040 flags:0x00004000
Call Trace:
 context_switch kernel/sched/core.c:5382 [inline]
 __schedule+0x116f/0x5de0 kernel/sched/core.c:6767
 __schedule_loop kernel/sched/core.c:6845 [inline]
 schedule+0xe7/0x3a0 kernel/sched/core.c:6860
 schedule_timeout+0x123/0x290 kernel/time/sleep_timeout.c:99
 rcu_gp_fqs_loop+0x1ea/0xb00 kernel/rcu/tree.c:2046
 rcu_gp_kthread+0x270/0x380 kernel/rcu/tree.c:2248
 kthread+0x3c2/0x780 kernel/kthread.c:464
 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:153
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller at googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

From syzbot+listad97b905a104dc343053 at syzkaller.appspotmail.com Mon May 12 06:34:21 2025
From: syzbot+listad97b905a104dc343053 at syzkaller.appspotmail.com (syzbot)
Date: Sun, 11 May 2025 23:34:21 -0700
Subject: [syzbot] Monthly wireguard report (May 2025)
Message-ID: <6821966d.050a0220.f2294.0052.GAE@google.com>

Hello wireguard maintainers/developers,

This is a 31-day syzbot report for the wireguard subsystem.
All related reports/information can be found at:
https://syzkaller.appspot.com/upstream/s/wireguard

During the period, 0 new issues were detected and 0 were fixed.
In total, 5 issues are still open and 19 have already been fixed.

Some of the still happening issues:

Ref Crashes Repro Title
<1> 12253   Yes   BUG: workqueue lockup (5)
                  https://syzkaller.appspot.com/bug?extid=f0b66b520b54883d4b9d
<2> 360     No    INFO: task hung in wg_netns_pre_exit (5)
                  https://syzkaller.appspot.com/bug?extid=f2fbf7478a35a94c8b7c
<3> 248     No    INFO: task hung in netdev_run_todo (4)
                  https://syzkaller.appspot.com/bug?extid=894cca71fa925aabfdb2
<4> 3       Yes   INFO: rcu detected stall in wg_packet_handshake_receive_worker (3)
                  https://syzkaller.appspot.com/bug?extid=48f45f6dd79ca20c3283

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller at googlegroups.com.

To disable reminders for individual bugs, reply with the following command:
#syz set no-reminders

To change bug's subsystems, reply with:
#syz set subsystems: new-subsystem

You may send multiple commands in a single email message.

From trianglesnake2002 at gmail.com Mon May 5 07:13:14 2025
From: trianglesnake2002 at gmail.com (TriangleSnake)
Date: Mon, 05 May 2025 07:13:14 -0000
Subject: [PATCH] wg-quick: add 'dev' to 'ip link add' to avoid keyword conflicts
Message-ID: <20250505071306.80342-1-trianglesnake2002@gmail.com>

Signed-off-by: TriangleSnake
---
 src/wg-quick/linux.bash | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/wg-quick/linux.bash b/src/wg-quick/linux.bash
index 4193ce5..93df80d 100755
--- a/src/wg-quick/linux.bash
+++ b/src/wg-quick/linux.bash
@@ -87,7 +87,7 @@ auto_su() {
 add_if() {
 	local ret
-	if ! cmd ip link add "$INTERFACE" type wireguard; then
+	if ! cmd ip link add dev "$INTERFACE" type wireguard; then
 		ret=$?
 		[[ -e /sys/module/wireguard ]] || ! command -v "${WG_QUICK_USERSPACE_IMPLEMENTATION:-wireguard-go}" >/dev/null && exit $ret
 		echo "[!] Missing WireGuard kernel module. Falling back to slow userspace implementation." >&2
-- 
2.39.5 (Apple Git-154)

From jordan at jrife.io Sat May 17 19:29:51 2025
From: jordan at jrife.io (Jordan Rife)
Date: Sat, 17 May 2025 12:29:51 -0700
Subject: [RESEND PATCH v1 wireguard-tools] ipc: linux: Support incremental allowed ips updates
Message-ID: <20250517192955.594735-1-jordan@jrife.io>

Extend the interface of `wg set` to leverage the WGALLOWEDIP_F_REMOVE_ME
flag, a direct way of removing a single allowed ip from a peer,
allowing for incremental updates to a peer's configuration. By default,
allowed-ips fully replaces a peer's allowed ips using
WGPEER_REPLACE_ALLOWEDIPS under the hood. When '+' or '-' is prepended
to any ip in the list, wg clears WGPEER_F_REPLACE_ALLOWEDIPS and sets
the WGALLOWEDIP_F_REMOVE_ME flag on any ip prefixed with '-'.

  $ wg set wg0 peer allowed-ips +192.168.88.0/24,-192.168.0.1/32

This command means "add 192.168.88.0/24 to this peer's allowed ips if
not present, and remove 192.168.0.1/32 if present".
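
The prefix semantics described above can be sketched in a few lines of
C. This is a standalone illustration under stated assumptions, not the
code from this patch: the names entry_op, parsed_list, classify_entry
and summarize_allowed_ips are hypothetical, and address validation is
deliberately left out. It only shows that a single '+' or '-' anywhere
in the list switches the whole update from "replace" to "incremental",
and that unprefixed entries in such a list count as additions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration; not taken from wireguard-tools. */
enum entry_op { ENTRY_ADD, ENTRY_REMOVE };

struct parsed_list {
	bool replace_all; /* stays true only if no entry has a '+'/'-' prefix */
	size_t adds;      /* entries that would be added */
	size_t removes;   /* entries that would be removed */
};

/*
 * Strip an optional '+' or '-' prefix from *entry and report the
 * resulting operation. *incremental is set when a prefix was present.
 */
static enum entry_op classify_entry(const char **entry, bool *incremental)
{
	switch ((*entry)[0]) {
	case '-':
		*incremental = true;
		(*entry)++;
		return ENTRY_REMOVE;
	case '+':
		*incremental = true;
		(*entry)++;
		return ENTRY_ADD;
	default:
		/* No prefix: treated as an addition. */
		return ENTRY_ADD;
	}
}

/*
 * Walk a comma-separated allowed-ips list and count additions and
 * removals. The addresses themselves are not validated here.
 */
static struct parsed_list summarize_allowed_ips(const char *list)
{
	struct parsed_list out = { .replace_all = true, .adds = 0, .removes = 0 };
	const char *p = list;

	assert(list != NULL);
	while (*p != '\0') {
		const char *entry = p;
		const char *comma = strchr(p, ',');
		bool incremental = false;

		if (classify_entry(&entry, &incremental) == ENTRY_REMOVE)
			out.removes++;
		else
			out.adds++;
		/* One prefixed entry makes the whole update incremental. */
		if (incremental)
			out.replace_all = false;
		if (comma == NULL)
			break;
		p = comma + 1;
	}
	return out;
}
```

For the example list "+192.168.88.0/24,-192.168.0.1/32" this sketch
counts one addition and one removal and clears replace_all; a list with
no prefixes leaves replace_all set, which corresponds to the default
replace behaviour.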
Use -isystem so that headers in uapi/ take precedence over system headers; otherwise, the build will fail on systems running kernels without the WGALLOWEDIP_F_REMOVE_ME flag. Note that this patch is meant to be merged alongside the kernel patch that introduces the flag. Signed-off-by: Jordan Rife --- src/Makefile | 2 +- src/config.c | 27 +++++++++++++++++++++++++++ src/containers.h | 5 +++++ src/ipc-linux.h | 2 ++ src/man/wg.8 | 8 ++++++-- src/set.c | 2 +- src/uapi/linux/linux/wireguard.h | 9 +++++++++ 7 files changed, 51 insertions(+), 4 deletions(-) diff --git a/src/Makefile b/src/Makefile index 0533910..1c4b3f6 100644 --- a/src/Makefile +++ b/src/Makefile @@ -39,7 +39,7 @@ PLATFORM ?= $(shell uname -s | tr '[:upper:]' '[:lower:]') CFLAGS ?= -O3 ifneq ($(wildcard uapi/$(PLATFORM)/.),) -CFLAGS += -idirafter uapi/$(PLATFORM) +CFLAGS += -isystem uapi/$(PLATFORM) endif CFLAGS += -std=gnu99 -D_GNU_SOURCE CFLAGS += -Wall -Wextra diff --git a/src/config.c b/src/config.c index 81ccb47..b740f73 100644 --- a/src/config.c +++ b/src/config.c @@ -337,6 +337,29 @@ static bool validate_netmask(struct wgallowedip *allowedip) return true; } +#if defined(__linux__) +static inline void parse_ip_prefix(struct wgpeer *peer, uint32_t *flags, char **mask) +{ + /* If the IP is prefixed with either '+' or '-' consider + * this an incremental change. Disable WGPEER_REPLACE_ALLOWEDIPS. 
+ */ + switch ((*mask)[0]) { + case '-': + *flags |= WGALLOWEDIP_REMOVE_ME; + /* fall through */ + case '+': + peer->flags &= ~WGPEER_REPLACE_ALLOWEDIPS; + (*mask)++; + } +} +#else +static inline void parse_ip_prefix(struct wgpeer *peer __attribute__ ((unused)), + uint32_t *flags __attribute__ ((unused)), + char **mask __attribute__ ((unused))) +{ +} +#endif + static inline bool parse_allowedips(struct wgpeer *peer, struct wgallowedip **last_allowedip, const char *value) { struct wgallowedip *allowedip = *last_allowedip, *new_allowedip; @@ -353,9 +376,12 @@ static inline bool parse_allowedips(struct wgpeer *peer, struct wgallowedip **la } sep = mutable; while ((mask = strsep(&sep, ","))) { + uint32_t flags = 0; unsigned long cidr; char *end, *ip; + parse_ip_prefix(peer, &flags, &mask); + saved_entry = strdup(mask); ip = strsep(&mask, "/"); @@ -387,6 +413,7 @@ static inline bool parse_allowedips(struct wgpeer *peer, struct wgallowedip **la else goto err; new_allowedip->cidr = cidr; + new_allowedip->flags = flags; if (!validate_netmask(new_allowedip)) fprintf(stderr, "Warning: AllowedIP has nonzero host part: %s/%s\n", ip, mask); diff --git a/src/containers.h b/src/containers.h index a82e8dd..8fd813a 100644 --- a/src/containers.h +++ b/src/containers.h @@ -28,6 +28,10 @@ struct timespec64 { int64_t tv_nsec; }; +enum { + WGALLOWEDIP_REMOVE_ME = 1U << 0, +}; + struct wgallowedip { uint16_t family; union { @@ -35,6 +39,7 @@ struct wgallowedip { struct in6_addr ip6; }; uint8_t cidr; + uint32_t flags; struct wgallowedip *next_allowedip; }; diff --git a/src/ipc-linux.h b/src/ipc-linux.h index d29c0c5..01247f1 100644 --- a/src/ipc-linux.h +++ b/src/ipc-linux.h @@ -228,6 +228,8 @@ again: } if (!mnl_attr_put_u8_check(nlh, SOCKET_BUFFER_SIZE, WGALLOWEDIP_A_CIDR_MASK, allowedip->cidr)) goto toobig_allowedips; + if (allowedip->flags && !mnl_attr_put_u32_check(nlh, SOCKET_BUFFER_SIZE, WGALLOWEDIP_A_FLAGS, allowedip->flags)) + goto toobig_allowedips; mnl_attr_nest_end(nlh, 
allowedip_nest); allowedip_nest = NULL; } diff --git a/src/man/wg.8 b/src/man/wg.8 index 7984539..1ec68df 100644 --- a/src/man/wg.8 +++ b/src/man/wg.8 @@ -55,7 +55,7 @@ transfer-rx, transfer-tx, persistent-keepalive. Shows the current configuration of \fI\fP in the format described by \fICONFIGURATION FILE FORMAT\fP below. .TP -\fBset\fP \fI\fP [\fIlisten-port\fP \fI\fP] [\fIfwmark\fP \fI\fP] [\fIprivate-key\fP \fI\fP] [\fIpeer\fP \fI\fP [\fIremove\fP] [\fIpreshared-key\fP \fI\fP] [\fIendpoint\fP \fI:\fP] [\fIpersistent-keepalive\fP \fI\fP] [\fIallowed-ips\fP \fI/\fP[,\fI/\fP]...] ]... +\fBset\fP \fI\fP [\fIlisten-port\fP \fI\fP] [\fIfwmark\fP \fI\fP] [\fIprivate-key\fP \fI\fP] [\fIpeer\fP \fI\fP [\fIremove\fP] [\fIpreshared-key\fP \fI\fP] [\fIendpoint\fP \fI:\fP] [\fIpersistent-keepalive\fP \fI\fP] [\fIallowed-ips\fP \fI[+|-]/\fP[,\fI[+|-]/\fP]...] ]... Sets configuration values for the specified \fI\fP. Multiple \fIpeer\fPs may be specified, and if the \fIremove\fP argument is given for a peer, that peer is removed, not configured. If \fIlisten-port\fP @@ -72,7 +72,11 @@ the device. The use of \fIpreshared-key\fP is optional, and may be omitted; it adds an additional layer of symmetric-key cryptography to be mixed into the already existing public-key cryptography, for post-quantum resistance. If \fIallowed-ips\fP is specified, but the value is the empty string, all -allowed ips are removed from the peer. The use of \fIpersistent-keepalive\fP +allowed ips are removed from the peer. By default, \fIallowed-ips\fP replaces +a peer's allowed ips. (Linux only) If + or - is prepended to any of the ips then +the update is incremental; ips prefixed with '+' or '' are added to the peer's +allowed ips if not present while ips prefixed with '-' are removed if present. +The use of \fIpersistent-keepalive\fP is optional and is by default off; setting it to 0 or "off" disables it. 
Otherwise it represents, in seconds, between 1 and 65535 inclusive, how often to send an authenticated empty packet to the peer, for the purpose of keeping diff --git a/src/set.c b/src/set.c index 75560fd..992ffa2 100644 --- a/src/set.c +++ b/src/set.c @@ -18,7 +18,7 @@ int set_main(int argc, const char *argv[]) int ret = 1; if (argc < 3) { - fprintf(stderr, "Usage: %s %s [listen-port ] [fwmark ] [private-key ] [peer [remove] [preshared-key ] [endpoint :] [persistent-keepalive ] [allowed-ips /[,/]...] ]...\n", PROG_NAME, argv[0]); + fprintf(stderr, "Usage: %s %s [listen-port ] [fwmark ] [private-key ] [peer [remove] [preshared-key ] [endpoint :] [persistent-keepalive ] [allowed-ips [+|-]/[,[+|-]/]...] ]...\n", PROG_NAME, argv[0]); return 1; } diff --git a/src/uapi/linux/linux/wireguard.h b/src/uapi/linux/linux/wireguard.h index 0efd52c..6ca266a 100644 --- a/src/uapi/linux/linux/wireguard.h +++ b/src/uapi/linux/linux/wireguard.h @@ -101,6 +101,10 @@ * WGALLOWEDIP_A_FAMILY: NLA_U16 * WGALLOWEDIP_A_IPADDR: struct in_addr or struct in6_addr * WGALLOWEDIP_A_CIDR_MASK: NLA_U8 + * WGALLOWEDIP_A_FLAGS: NLA_U32, WGALLOWEDIP_F_REMOVE_ME if + * the specified IP should be removed; + * otherwise, this IP will be added if + * it is not already present. * 0: NLA_NESTED * ... 
 * 0: NLA_NESTED
@@ -184,11 +188,16 @@ enum wgpeer_attribute {
 };
 #define WGPEER_A_MAX (__WGPEER_A_LAST - 1)
+enum wgallowedip_flag {
+	WGALLOWEDIP_F_REMOVE_ME = 1U << 0,
+	__WGALLOWEDIP_F_ALL = WGALLOWEDIP_F_REMOVE_ME
+};
 enum wgallowedip_attribute {
 	WGALLOWEDIP_A_UNSPEC,
 	WGALLOWEDIP_A_FAMILY,
 	WGALLOWEDIP_A_IPADDR,
 	WGALLOWEDIP_A_CIDR_MASK,
+	WGALLOWEDIP_A_FLAGS,
 	__WGALLOWEDIP_A_LAST
 };
 #define WGALLOWEDIP_A_MAX (__WGALLOWEDIP_A_LAST - 1)
-- 
2.43.0

From jordan at jrife.io Sat May 17 19:29:52 2025
From: jordan at jrife.io (Jordan Rife)
Date: Sat, 17 May 2025 12:29:52 -0700
Subject: [RESEND PATCH v3 net-next] wireguard: allowedips: Add WGALLOWEDIP_F_REMOVE_ME flag
In-Reply-To: <20250517192955.594735-1-jordan@jrife.io>
References: <20250517192955.594735-1-jordan@jrife.io>
Message-ID: <20250517192955.594735-2-jordan@jrife.io>

The current netlink API for WireGuard does not directly support removal
of allowed ips from a peer. A user can remove an allowed ip from a peer
in one of two ways:

1. By using the WGPEER_F_REPLACE_ALLOWEDIPS flag and providing a new
   list of allowed ips which omits the allowed ip that is to be removed.
2. By reassigning an allowed ip to a "dummy" peer then removing that
   peer with WGPEER_F_REMOVE_ME.

With the first approach, the driver completely rebuilds the allowed ip
list for a peer. If my current configuration is such that a peer has
allowed ips 192.168.0.2 and 192.168.0.3 and I want to remove 192.168.0.2
the actual transition looks like this.

[192.168.0.2, 192.168.0.3] <-- Initial state
[]                         <-- Step 1: Allowed ips removed for peer
[192.168.0.3]              <-- Step 2: Allowed ips added back for peer

This is true even if the allowed ip list is small and the update does
not need to be batched into multiple WG_CMD_SET_DEVICE requests, as the
removal and subsequent addition of ips is non-atomic within a single
request.
Consequently, wg_allowedips_lookup_dst and wg_allowedips_lookup_src may
return NULL while reconfiguring a peer even for packets bound for ips a
user did not intend to remove leading to unintended interruptions in
connectivity. This presents in userspace as failed calls to sendto and
sendmsg for UDP sockets. In my case, I ran netperf while repeatedly
reconfiguring the allowed ips for a peer with wg.

  /usr/local/bin/netperf -H 10.102.73.72 -l 10m -t UDP_STREAM -- -R 1 -m 1024
  send_data: data send error: No route to host (errno 113)
  netperf: send_omni: send_data failed: No route to host

While this may not be of particular concern for environments where peers
and allowed ips are mostly static, systems like Cilium manage peers and
allowed ips in a dynamic environment where peers (i.e. Kubernetes nodes)
and allowed ips (i.e. pods running on those nodes) can frequently change
making WGPEER_F_REPLACE_ALLOWEDIPS problematic.

The second approach avoids any possible connectivity interruptions but
is hacky and less direct, requiring the creation of a temporary peer
just to dispose of an allowed ip.

Introduce a new flag called WGALLOWEDIP_F_REMOVE_ME which in the same
way that WGPEER_F_REMOVE_ME allows a user to remove a single peer from a
WireGuard device's configuration allows a user to remove an ip from a
peer's set of allowed ips. This enables incremental updates to a
device's configuration without any connectivity blips or messy
workarounds.

A corresponding patch for wg extends the existing `wg set` interface to
leverage this feature.

  $ wg set wg0 peer allowed-ips +192.168.88.0/24,-192.168.0.1/32

When '+' or '-' is prepended to any ip in the list, wg clears
WGPEER_F_REPLACE_ALLOWEDIPS and sets the WGALLOWEDIP_F_REMOVE_ME flag on
any ip prefixed with '-'.

v2->v3
------
* Revert WG_GENL_VERSION back to 1 (Jason).
* Rename _remove() to remove_node() (Jason).
* Remove unnecessary !peer guard from remove() (Jason).
* Adjust line length for calls to wg_allowedips_(remove|insert)_v(4|6) (Jason). * Fix punctuation inside uapi docs for WGALLOWEDIP_A_FLAGS (Jason). * Get rid of remove-ip program and use wg instead in selftests (Jason). * Use NLA_POLICY_MASK for WGALLOWEDIP_A_FLAGS validation (Jakub). v1->v2 ------ * Fixed some Sparse warnings. Link: https://lore.kernel.org/netdev/20240905200551.4099064-1-jrife at google.com/ Signed-off-by: Jordan Rife --- drivers/net/wireguard/allowedips.c | 106 ++++++++++++++------ drivers/net/wireguard/allowedips.h | 4 + drivers/net/wireguard/netlink.c | 37 ++++--- drivers/net/wireguard/selftest/allowedips.c | 48 +++++++++ include/uapi/linux/wireguard.h | 9 ++ tools/testing/selftests/wireguard/netns.sh | 32 ++++++ 6 files changed, 193 insertions(+), 43 deletions(-) diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c index 4b8528206cc8..dcf068ba2881 100644 --- a/drivers/net/wireguard/allowedips.c +++ b/drivers/net/wireguard/allowedips.c @@ -249,6 +249,56 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, return 0; } +static void remove_node(struct allowedips_node *node, struct mutex *lock) +{ + struct allowedips_node *child, **parent_bit, *parent; + bool free_parent; + + list_del_init(&node->peer_list); + RCU_INIT_POINTER(node->peer, NULL); + if (node->bit[0] && node->bit[1]) + return; + child = rcu_dereference_protected(node->bit[!rcu_access_pointer(node->bit[0])], + lockdep_is_held(lock)); + if (child) + child->parent_bit_packed = node->parent_bit_packed; + parent_bit = (struct allowedips_node **)(node->parent_bit_packed & ~3UL); + *parent_bit = child; + parent = (void *)parent_bit - + offsetof(struct allowedips_node, bit[node->parent_bit_packed & 1]); + free_parent = !rcu_access_pointer(node->bit[0]) && + !rcu_access_pointer(node->bit[1]) && + (node->parent_bit_packed & 3) <= 1 && + !rcu_access_pointer(parent->peer); + if (free_parent) + child = 
rcu_dereference_protected(parent->bit[!(node->parent_bit_packed & 1)], + lockdep_is_held(lock)); + call_rcu(&node->rcu, node_free_rcu); + if (!free_parent) + return; + if (child) + child->parent_bit_packed = parent->parent_bit_packed; + *(struct allowedips_node **)(parent->parent_bit_packed & ~3UL) = child; + call_rcu(&parent->rcu, node_free_rcu); +} + +static int remove(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, + u8 cidr, struct wg_peer *peer, struct mutex *lock) +{ + struct allowedips_node *node; + + if (unlikely(cidr > bits)) + return -EINVAL; + if (!rcu_access_pointer(*trie) || + !node_placement(*trie, key, cidr, bits, &node, lock) || + peer != rcu_access_pointer(node->peer)) + return 0; + + remove_node(node, lock); + + return 0; +} + void wg_allowedips_init(struct allowedips *table) { table->root4 = table->root6 = NULL; @@ -300,44 +350,38 @@ int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip, return add(&table->root6, 128, key, cidr, peer, lock); } +int wg_allowedips_remove_v4(struct allowedips *table, const struct in_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock) +{ + /* Aligned so it can be passed to fls */ + u8 key[4] __aligned(__alignof(u32)); + + ++table->seq; + swap_endian(key, (const u8 *)ip, 32); + return remove(&table->root4, 32, key, cidr, peer, lock); +} + +int wg_allowedips_remove_v6(struct allowedips *table, const struct in6_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock) +{ + /* Aligned so it can be passed to fls64 */ + u8 key[16] __aligned(__alignof(u64)); + + ++table->seq; + swap_endian(key, (const u8 *)ip, 128); + return remove(&table->root6, 128, key, cidr, peer, lock); +} + void wg_allowedips_remove_by_peer(struct allowedips *table, struct wg_peer *peer, struct mutex *lock) { - struct allowedips_node *node, *child, **parent_bit, *parent, *tmp; - bool free_parent; + struct allowedips_node *node, *tmp; if (list_empty(&peer->allowedips_list)) return; ++table->seq; - 
list_for_each_entry_safe(node, tmp, &peer->allowedips_list, peer_list) { - list_del_init(&node->peer_list); - RCU_INIT_POINTER(node->peer, NULL); - if (node->bit[0] && node->bit[1]) - continue; - child = rcu_dereference_protected(node->bit[!rcu_access_pointer(node->bit[0])], - lockdep_is_held(lock)); - if (child) - child->parent_bit_packed = node->parent_bit_packed; - parent_bit = (struct allowedips_node **)(node->parent_bit_packed & ~3UL); - *parent_bit = child; - parent = (void *)parent_bit - - offsetof(struct allowedips_node, bit[node->parent_bit_packed & 1]); - free_parent = !rcu_access_pointer(node->bit[0]) && - !rcu_access_pointer(node->bit[1]) && - (node->parent_bit_packed & 3) <= 1 && - !rcu_access_pointer(parent->peer); - if (free_parent) - child = rcu_dereference_protected( - parent->bit[!(node->parent_bit_packed & 1)], - lockdep_is_held(lock)); - call_rcu(&node->rcu, node_free_rcu); - if (!free_parent) - continue; - if (child) - child->parent_bit_packed = parent->parent_bit_packed; - *(struct allowedips_node **)(parent->parent_bit_packed & ~3UL) = child; - call_rcu(&parent->rcu, node_free_rcu); - } + list_for_each_entry_safe(node, tmp, &peer->allowedips_list, peer_list) + remove_node(node, lock); } int wg_allowedips_read_node(struct allowedips_node *node, u8 ip[16], u8 *cidr) diff --git a/drivers/net/wireguard/allowedips.h b/drivers/net/wireguard/allowedips.h index 2346c797eb4d..931958cb6e10 100644 --- a/drivers/net/wireguard/allowedips.h +++ b/drivers/net/wireguard/allowedips.h @@ -38,6 +38,10 @@ int wg_allowedips_insert_v4(struct allowedips *table, const struct in_addr *ip, u8 cidr, struct wg_peer *peer, struct mutex *lock); int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip, u8 cidr, struct wg_peer *peer, struct mutex *lock); +int wg_allowedips_remove_v4(struct allowedips *table, const struct in_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock); +int wg_allowedips_remove_v6(struct allowedips *table, const 
struct in6_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock); void wg_allowedips_remove_by_peer(struct allowedips *table, struct wg_peer *peer, struct mutex *lock); /* The ip input pointer should be __aligned(__alignof(u64))) */ diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c index f7055180ba4a..386f65042072 100644 --- a/drivers/net/wireguard/netlink.c +++ b/drivers/net/wireguard/netlink.c @@ -46,7 +46,8 @@ static const struct nla_policy peer_policy[WGPEER_A_MAX + 1] = { static const struct nla_policy allowedip_policy[WGALLOWEDIP_A_MAX + 1] = { [WGALLOWEDIP_A_FAMILY] = { .type = NLA_U16 }, [WGALLOWEDIP_A_IPADDR] = NLA_POLICY_MIN_LEN(sizeof(struct in_addr)), - [WGALLOWEDIP_A_CIDR_MASK] = { .type = NLA_U8 } + [WGALLOWEDIP_A_CIDR_MASK] = { .type = NLA_U8 }, + [WGALLOWEDIP_A_FLAGS] = NLA_POLICY_MASK(NLA_U32, __WGALLOWEDIP_F_ALL), }; static struct wg_device *lookup_interface(struct nlattr **attrs, @@ -329,6 +330,7 @@ static int set_port(struct wg_device *wg, u16 port) static int set_allowedip(struct wg_peer *peer, struct nlattr **attrs) { int ret = -EINVAL; + u32 flags = 0; u16 family; u8 cidr; @@ -337,19 +339,30 @@ static int set_allowedip(struct wg_peer *peer, struct nlattr **attrs) return ret; family = nla_get_u16(attrs[WGALLOWEDIP_A_FAMILY]); cidr = nla_get_u8(attrs[WGALLOWEDIP_A_CIDR_MASK]); + if (attrs[WGALLOWEDIP_A_FLAGS]) + flags = nla_get_u32(attrs[WGALLOWEDIP_A_FLAGS]); if (family == AF_INET && cidr <= 32 && - nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in_addr)) - ret = wg_allowedips_insert_v4( - &peer->device->peer_allowedips, - nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer, - &peer->device->device_update_lock); - else if (family == AF_INET6 && cidr <= 128 && - nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in6_addr)) - ret = wg_allowedips_insert_v6( - &peer->device->peer_allowedips, - nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer, - &peer->device->device_update_lock); + 
nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in_addr)) { + if (flags & WGALLOWEDIP_F_REMOVE_ME) + ret = wg_allowedips_remove_v4(&peer->device->peer_allowedips, + nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, + peer, &peer->device->device_update_lock); + else + ret = wg_allowedips_insert_v4(&peer->device->peer_allowedips, + nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, + peer, &peer->device->device_update_lock); + } else if (family == AF_INET6 && cidr <= 128 && + nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in6_addr)) { + if (flags & WGALLOWEDIP_F_REMOVE_ME) + ret = wg_allowedips_remove_v6(&peer->device->peer_allowedips, + nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, + peer, &peer->device->device_update_lock); + else + ret = wg_allowedips_insert_v6(&peer->device->peer_allowedips, + nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, + peer, &peer->device->device_update_lock); + } return ret; } diff --git a/drivers/net/wireguard/selftest/allowedips.c b/drivers/net/wireguard/selftest/allowedips.c index 25de7058701a..41837efa70cb 100644 --- a/drivers/net/wireguard/selftest/allowedips.c +++ b/drivers/net/wireguard/selftest/allowedips.c @@ -460,6 +460,10 @@ static __init struct wg_peer *init_peer(void) wg_allowedips_insert_v##version(&t, ip##version(ipa, ipb, ipc, ipd), \ cidr, mem, &mutex) +#define remove(version, mem, ipa, ipb, ipc, ipd, cidr) \ + wg_allowedips_remove_v##version(&t, ip##version(ipa, ipb, ipc, ipd), \ + cidr, mem, &mutex) + #define maybe_fail() do { \ ++i; \ if (!_s) { \ @@ -585,6 +589,50 @@ bool __init wg_allowedips_selftest(void) test_negative(4, a, 192, 0, 0, 0); test_negative(4, a, 255, 0, 0, 0); + insert(4, a, 1, 0, 0, 0, 32); + insert(4, a, 192, 0, 0, 0, 24); + insert(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128); + insert(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0, 98); + test(4, a, 1, 0, 0, 0); + test(4, a, 192, 0, 0, 1); + test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + test(6, a, 0x24446800, 0xf0e40800, 
0xeeaebeef, 0x10101010); + /* Must be an exact match to remove */ + remove(4, a, 192, 0, 0, 0, 32); + test(4, a, 192, 0, 0, 1); + /* NULL peer should have no effect and return 0 */ + test_boolean(!remove(4, NULL, 192, 0, 0, 0, 24)); + test(4, a, 192, 0, 0, 1); + /* different peer should have no effect and return 0 */ + test_boolean(!remove(4, b, 192, 0, 0, 0, 24)); + test(4, a, 192, 0, 0, 1); + /* invalid CIDR should have no effect and return -EINVAL */ + test_boolean(remove(4, b, 192, 0, 0, 0, 33) == -EINVAL); + test(4, a, 192, 0, 0, 1); + remove(4, a, 192, 0, 0, 0, 24); + test_negative(4, a, 192, 0, 0, 1); + remove(4, a, 1, 0, 0, 0, 32); + test_negative(4, a, 1, 0, 0, 0); + /* Must be an exact match to remove */ + remove(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 96); + test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + /* NULL peer should have no effect and return 0 */ + test_boolean(!remove(6, NULL, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128)); + test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + /* different peer should have no effect and return 0 */ + test_boolean(!remove(6, b, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128)); + test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + /* invalid CIDR should have no effect and return -EINVAL */ + test_boolean(remove(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 129) == -EINVAL); + test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + remove(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128); + test_negative(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + /* Must match the peer to remove */ + remove(6, b, 0x24446800, 0xf0e40800, 0xeeaebeef, 0, 98); + test(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0x10101010); + remove(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0, 98); + test_negative(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0x10101010); + wg_allowedips_free(&t, &mutex); wg_allowedips_init(&t); insert(4, a, 192, 168, 0, 0, 16); diff --git 
a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h index ae88be14c947..8c26391196d5 100644 --- a/include/uapi/linux/wireguard.h +++ b/include/uapi/linux/wireguard.h @@ -101,6 +101,10 @@ * WGALLOWEDIP_A_FAMILY: NLA_U16 * WGALLOWEDIP_A_IPADDR: struct in_addr or struct in6_addr * WGALLOWEDIP_A_CIDR_MASK: NLA_U8 + * WGALLOWEDIP_A_FLAGS: NLA_U32, WGALLOWEDIP_F_REMOVE_ME if + * the specified IP should be removed; + * otherwise, this IP will be added if + * it is not already present. * 0: NLA_NESTED * ... * 0: NLA_NESTED @@ -184,11 +188,16 @@ enum wgpeer_attribute { }; #define WGPEER_A_MAX (__WGPEER_A_LAST - 1) +enum wgallowedip_flag { + WGALLOWEDIP_F_REMOVE_ME = 1U << 0, + __WGALLOWEDIP_F_ALL = WGALLOWEDIP_F_REMOVE_ME +}; enum wgallowedip_attribute { WGALLOWEDIP_A_UNSPEC, WGALLOWEDIP_A_FAMILY, WGALLOWEDIP_A_IPADDR, WGALLOWEDIP_A_CIDR_MASK, + WGALLOWEDIP_A_FLAGS, __WGALLOWEDIP_A_LAST }; #define WGALLOWEDIP_A_MAX (__WGALLOWEDIP_A_LAST - 1) diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh index 55500f901fbc..70248c77ce11 100755 --- a/tools/testing/selftests/wireguard/netns.sh +++ b/tools/testing/selftests/wireguard/netns.sh @@ -611,6 +611,38 @@ n0 wg set wg0 peer "$pub2" allowed-ips "$allowedips" } < <(n0 wg show wg0 allowed-ips) ip0 link del wg0 +# Test IP removal +allowedips=( ) +for i in {1..197}; do + allowedips+=( 192.168.0.$i ) + allowedips+=( abcd::$i ) +done +saved_ifs="$IFS" +IFS=, +allowedips="${allowedips[*]}" +IFS="$saved_ifs" +ip0 link add wg0 type wireguard +n0 wg set wg0 peer "$pub1" allowed-ips "$allowedips" +pub1_hex=$(echo "$pub1" | base64 -d | xxd -p -c 50) +n0 wg set wg0 peer "$pub1" allowed-ips -192.168.0.1/32,-192.168.0.20/32,-192.168.0.100/32,-abcd::1/128,-abcd::20/128,-abcd::100/128 +n0 wg show wg0 allowed-ips +{ + read -r pub allowedips + [[ $pub == "$pub1" ]] + i=0 + for ip in $allowedips; do + [[ "$ip" != "192.168.0.1" ]] + [[ "$ip" != "192.168.0.20" ]] + [[ "$ip" != 
"192.168.0.100" ]] + [[ "$ip" != "abcd::1" ]] + [[ "$ip" != "abcd::20" ]] + [[ "$ip" != "abcd::100" ]] + ((++i)) + done + ((i == 388)) +} < <(n0 wg show wg0 allowed-ips) +ip0 link del wg0 + ! n0 wg show doesnotexist || false ip0 link add wg0 type wireguard -- 2.43.0 From Jason at zx2c4.com Tue May 20 20:14:38 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Tue, 20 May 2025 22:14:38 +0200 Subject: [RESEND PATCH v1 wireguard-tools] ipc: linux: Support incremental allowed ips updates In-Reply-To: <20250517192955.594735-1-jordan@jrife.io> References: <20250517192955.594735-1-jordan@jrife.io> Message-ID: On Sat, May 17, 2025 at 12:29:51PM -0700, Jordan Rife wrote: > Extend the interface of `wg set` to leverage the WGALLOWEDIP_F_REMOVE_ME > flag, a direct way of removing a single allowed ip from a peer, > allowing for incremental updates to a peer's configuration. By default, > allowed-ips fully replaces a peer's allowed ips using > WGPEER_REPLACE_ALLOWEDIPS under the hood. When '+' or '-' is prepended > to any ip in the list, wg clears WGPEER_F_REPLACE_ALLOWEDIPS and sets > the WGALLOWEDIP_F_REMOVE_ME flag on any ip prefixed with '-'. > > $ wg set wg0 peer allowed-ips +192.168.88.0/24,-192.168.0.1/32 > > This command means "add 192.168.88.0/24 to this peer's allowed ips if > not present, and remove 192.168.0.1/32 if present". > > Use -isystem so that headers in uapi/ take precedence over system > headers; otherwise, the build will fail on systems running kernels > without the WGALLOWEDIP_F_REMOVE_ME flag. > > Note that this patch is meant to be merged alongside the kernel patch > that introduces the flag. Merged here: https://git.zx2c4.com/wireguard-tools/commit/?id=0788f90810efde88cfa07ed96e7eca77c7f2eedd With a followup here: https://git.zx2c4.com/wireguard-tools/commit/?id=dce8ac6e2fa30f8b07e84859f244f81b3c6b2353 Sorry for the delay. Next, the kernel changes. Regards, Jason From Jason at zx2c4.com Tue May 20 21:10:30 2025 From: Jason at zx2c4.com (Jason A. 
Donenfeld) Date: Tue, 20 May 2025 23:10:30 +0200 Subject: [RESEND PATCH v1 wireguard-tools] ipc: linux: Support incremental allowed ips updates In-Reply-To: References: <20250517192955.594735-1-jordan@jrife.io> Message-ID: On Tue, May 20, 2025 at 10:14:38PM +0200, Jason A. Donenfeld wrote: > On Sat, May 17, 2025 at 12:29:51PM -0700, Jordan Rife wrote: > > Extend the interface of `wg set` to leverage the WGALLOWEDIP_F_REMOVE_ME > > flag, a direct way of removing a single allowed ip from a peer, > > allowing for incremental updates to a peer's configuration. By default, > > allowed-ips fully replaces a peer's allowed ips using > > WGPEER_REPLACE_ALLOWEDIPS under the hood. When '+' or '-' is prepended > > to any ip in the list, wg clears WGPEER_F_REPLACE_ALLOWEDIPS and sets > > the WGALLOWEDIP_F_REMOVE_ME flag on any ip prefixed with '-'. > > > > $ wg set wg0 peer allowed-ips +192.168.88.0/24,-192.168.0.1/32 > > > > This command means "add 192.168.88.0/24 to this peer's allowed ips if > > not present, and remove 192.168.0.1/32 if present". > > > > Use -isystem so that headers in uapi/ take precedence over system > > headers; otherwise, the build will fail on systems running kernels > > without the WGALLOWEDIP_F_REMOVE_ME flag. > > > > Note that this patch is meant to be merged alongside the kernel patch > > that introduces the flag. > > Merged here: > https://git.zx2c4.com/wireguard-tools/commit/?id=0788f90810efde88cfa07ed96e7eca77c7f2eedd > > With a followup here: > https://git.zx2c4.com/wireguard-tools/commit/?id=dce8ac6e2fa30f8b07e84859f244f81b3c6b2353 Also, https://git.zx2c4.com/wireguard-go/commit/?id=256bcbd70d5b4eaae2a9f21a9889498c0f89041c From Jason at zx2c4.com Tue May 20 21:47:56 2025 From: Jason at zx2c4.com (Jason A. 
Donenfeld) Date: Tue, 20 May 2025 23:47:56 +0200 Subject: [RESEND PATCH v3 net-next] wireguard: allowedips: Add WGALLOWEDIP_F_REMOVE_ME flag In-Reply-To: <20250517192955.594735-2-jordan@jrife.io> References: <20250517192955.594735-1-jordan@jrife.io> <20250517192955.594735-2-jordan@jrife.io> Message-ID: Hi Jakub, Jordan, On Sat, May 17, 2025 at 12:29:52PM -0700, Jordan Rife wrote: > * Use NLA_POLICY_MASK for WGALLOWEDIP_A_FLAGS validation (Jakub). [...] > + [WGALLOWEDIP_A_FLAGS] = NLA_POLICY_MASK(NLA_U32, __WGALLOWEDIP_F_ALL), I wonder... Can we update, in a separate patch, these to also use NLA_POLICY_MASK? ... [WGDEVICE_A_FLAGS] = { .type = NLA_U32 }, ... [WGPEER_A_FLAGS] = { .type = NLA_U32 }, ... Some consistency would be nice. Regards, Jason From Jason at zx2c4.com Tue May 20 21:50:36 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Tue, 20 May 2025 23:50:36 +0200 Subject: [RESEND PATCH v3 net-next] wireguard: allowedips: Add WGALLOWEDIP_F_REMOVE_ME flag In-Reply-To: <20250517192955.594735-2-jordan@jrife.io> References: <20250517192955.594735-1-jordan@jrife.io> <20250517192955.594735-2-jordan@jrife.io> Message-ID: On Sat, May 17, 2025 at 12:29:52PM -0700, Jordan Rife wrote: > +pub1_hex=$(echo "$pub1" | base64 -d | xxd -p -c 50) There's no xxd or base64 commands on the test harness vm, but also this line isn't used, so I'll just nix it on commit. From Jason at zx2c4.com Tue May 20 22:00:00 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Wed, 21 May 2025 00:00:00 +0200 Subject: [RESEND PATCH v3 net-next] wireguard: allowedips: Add WGALLOWEDIP_F_REMOVE_ME flag In-Reply-To: References: <20250517192955.594735-1-jordan@jrife.io> <20250517192955.594735-2-jordan@jrife.io> Message-ID: On Tue, May 20, 2025 at 11:47:56PM +0200, Jason A. Donenfeld wrote: > Hi Jakub, Jordan, > > On Sat, May 17, 2025 at 12:29:52PM -0700, Jordan Rife wrote: > > * Use NLA_POLICY_MASK for WGALLOWEDIP_A_FLAGS validation (Jakub). > [...] 
> > + [WGALLOWEDIP_A_FLAGS] = NLA_POLICY_MASK(NLA_U32, __WGALLOWEDIP_F_ALL), > > I wonder... Can we update, in a separate patch, these to also use > NLA_POLICY_MASK? > > ... > [WGDEVICE_A_FLAGS] = { .type = NLA_U32 }, > ... > [WGPEER_A_FLAGS] = { .type = NLA_U32 }, > ... > > Some consistency would be nice. Perhaps I'll commit something like this? From 22b6d15ad2a2e38bc80ebf65694106ff554b572f Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Tue, 20 May 2025 23:56:18 +0200 Subject: [PATCH] wireguard: netlink: use NLA_POLICY_MASK where possible Rather than manually validating flags against the various __ALL_* constants, put this in the netlink policy description and have the upper layer machinery check it for us. Signed-off-by: Jason A. Donenfeld --- drivers/net/wireguard/netlink.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c index f7055180ba4a..b82266da949a 100644 --- a/drivers/net/wireguard/netlink.c +++ b/drivers/net/wireguard/netlink.c @@ -24,7 +24,7 @@ static const struct nla_policy device_policy[WGDEVICE_A_MAX + 1] = { [WGDEVICE_A_IFNAME] = { .type = NLA_NUL_STRING, .len = IFNAMSIZ - 1 }, [WGDEVICE_A_PRIVATE_KEY] = NLA_POLICY_EXACT_LEN(NOISE_PUBLIC_KEY_LEN), [WGDEVICE_A_PUBLIC_KEY] = NLA_POLICY_EXACT_LEN(NOISE_PUBLIC_KEY_LEN), - [WGDEVICE_A_FLAGS] = { .type = NLA_U32 }, + [WGDEVICE_A_FLAGS] = NLA_POLICY_MASK(NLA_U32, __WGDEVICE_F_ALL), [WGDEVICE_A_LISTEN_PORT] = { .type = NLA_U16 }, [WGDEVICE_A_FWMARK] = { .type = NLA_U32 }, [WGDEVICE_A_PEERS] = { .type = NLA_NESTED } @@ -33,7 +33,7 @@ static const struct nla_policy device_policy[WGDEVICE_A_MAX + 1] = { static const struct nla_policy peer_policy[WGPEER_A_MAX + 1] = { [WGPEER_A_PUBLIC_KEY] = NLA_POLICY_EXACT_LEN(NOISE_PUBLIC_KEY_LEN), [WGPEER_A_PRESHARED_KEY] = NLA_POLICY_EXACT_LEN(NOISE_SYMMETRIC_KEY_LEN), - [WGPEER_A_FLAGS] = { .type = NLA_U32 }, + [WGPEER_A_FLAGS] = 
NLA_POLICY_MASK(NLA_U32, __WGPEER_F_ALL), [WGPEER_A_ENDPOINT] = NLA_POLICY_MIN_LEN(sizeof(struct sockaddr)), [WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL] = { .type = NLA_U16 }, [WGPEER_A_LAST_HANDSHAKE_TIME] = NLA_POLICY_EXACT_LEN(sizeof(struct __kernel_timespec)), @@ -373,9 +373,6 @@ static int set_peer(struct wg_device *wg, struct nlattr **attrs) if (attrs[WGPEER_A_FLAGS]) flags = nla_get_u32(attrs[WGPEER_A_FLAGS]); - ret = -EOPNOTSUPP; - if (flags & ~__WGPEER_F_ALL) - goto out; ret = -EPFNOSUPPORT; if (attrs[WGPEER_A_PROTOCOL_VERSION]) { @@ -506,9 +503,6 @@ static int wg_set_device(struct sk_buff *skb, struct genl_info *info) if (info->attrs[WGDEVICE_A_FLAGS]) flags = nla_get_u32(info->attrs[WGDEVICE_A_FLAGS]); - ret = -EOPNOTSUPP; - if (flags & ~__WGDEVICE_F_ALL) - goto out; if (info->attrs[WGDEVICE_A_LISTEN_PORT] || info->attrs[WGDEVICE_A_FWMARK]) { struct net *net; -- 2.48.1 From Jason at zx2c4.com Tue May 20 22:18:26 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Wed, 21 May 2025 00:18:26 +0200 Subject: [PATCH] wg-quick: add 'dev' to 'ip link add' to avoid keyword conflicts In-Reply-To: <20250505071306.80342-1-trianglesnake2002@gmail.com> References: <20250505071306.80342-1-trianglesnake2002@gmail.com> Message-ID: On Mon, May 05, 2025 at 03:13:06PM +0800, TriangleSnake wrote: > Signed-off-by: TriangleSnake > --- > src/wg-quick/linux.bash | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/wg-quick/linux.bash b/src/wg-quick/linux.bash > index 4193ce5..93df80d 100755 > --- a/src/wg-quick/linux.bash > +++ b/src/wg-quick/linux.bash > @@ -87,7 +87,7 @@ auto_su() { > > add_if() { > local ret > - if ! cmd ip link add "$INTERFACE" type wireguard; then > + if ! cmd ip link add dev "$INTERFACE" type wireguard; then > ret=$? > [[ -e /sys/module/wireguard ]] || ! command -v "${WG_QUICK_USERSPACE_IMPLEMENTATION:-wireguard-go}" >/dev/null && exit $ret > echo "[!] Missing WireGuard kernel module. 
Falling back to slow userspace implementation." >&2 Applied, thanks. Jason From Jason at zx2c4.com Tue May 20 22:59:30 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Wed, 21 May 2025 00:59:30 +0200 Subject: [PATCH 1/1] src/config.c: handle strdup failure In-Reply-To: <20240709090728.872-2-chipitsine@gmail.com> References: <20240709090728.872-1-chipitsine@gmail.com> <20240709090728.872-2-chipitsine@gmail.com> Message-ID: Applied, thanks for the patch. From Jason at zx2c4.com Tue May 20 23:04:06 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Wed, 21 May 2025 01:04:06 +0200 Subject: [PATCH] wg-quick: linux: check iptables existance prior trying restore In-Reply-To: <20240204101029.1805-1-athoik@gmail.com> References: <20240204101029.1805-1-athoik@gmail.com> Message-ID: Shouldn't this be an error if it can't apply the rule, though? What it's doing is somewhat important. Not sure I like the idea of failing open. From Jason at zx2c4.com Tue May 20 23:21:55 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Wed, 21 May 2025 01:21:55 +0200 Subject: [ANNOUNCE] wireguard-tools v1.0.20250521 released Message-ID: <5343c669ed36696c5dde065a1675a298@thinkpad.zx2c4.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 Hello, A new version, v1.0.20250521, of wireguard-tools has been tagged in the git repository, containing various required userspace utilities, such as the wg(8) and wg-quick(8) commands and documentation. 
== Changes == * ipc: use more clever PnP enumerator * embeddable-wg-library: add named wg_endpoint union * reresolve-dns: use $EPOCHSECONDS instead of $(date +%s) * wg-quick: android: use right regex for host-vs-IP * global: dual license core files as MIT for FreeBSD * wg-quick: linux: prevent traffic from momentarily leaking into tunnel * ipc: freebsd: move if_wg path to reflect new in-tree location * show: apply const to right part of pointer * ipc: freebsd: avoid leaking memory in kernel_get_device() * ipc: freebsd: NULL out some freed memory in kernel_set_device() * show: fix show all endpoints output * man: set private key in PreUp rather than PostUp * ipc: linux: enforce IFNAMSIZ limit * ipc: freebsd: use AF_LOCAL for the control socket * wg-quick: linux: add 'dev' to 'ip link add' to avoid keyword conflicts * config: handle strdup failure * wg-quick: run PreUp hook after creating interface This makes PreUp actually useful. * ipc: linux: support incremental allowed ips updates * ipc: add stub for allowedips flags on other platforms These two are neat and worth mentioning. Soon it will be possible to run: # wg set wg0 peer ... allowed-ips +1.2.3.4/32,-2.4.6.8/32 in order to add and remove allowedips without first clearing all of them. This release contains commits from: Jason A. Donenfeld, Kyle Evans, TriangleSnake, Tom Yan, Mikael Magnusson, Jordan Rife, Ilia Shipitsin, Dmitry Selivanov, and Daniel Gröber. As always, the source is available at https://git.zx2c4.com/wireguard-tools/ and information about the project is available at https://www.wireguard.com/ . 
This release is available in compressed tarball form here: https://git.zx2c4.com/wireguard-tools/snapshot/wireguard-tools-1.0.20250521.tar.xz SHA2-256: b6f2628b85b1b23cc06517ec9c74f82d52c4cdbd020f3dd2f00c972a1782950e A PGP signature of that file decompressed is available here: https://git.zx2c4.com/wireguard-tools/snapshot/wireguard-tools-1.0.20250521.tar.asc Signing key: AB9942E6D4A4CFC3412620A749FC7012A5DE03AE Remember to unxz the tarball before verifying the signature. If you're a package maintainer, please bump your package version. If you're a user, the WireGuard team welcomes any and all feedback on this latest version. Finally, WireGuard development thrives on donations. By popular demand, we have a webpage for this: https://www.wireguard.com/donations/ Thank you, Jason Donenfeld -----BEGIN PGP SIGNATURE----- iQJEBAEBCAAuFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmgtDoIQHGphc29uQHp4 MmM0LmNvbQAKCRBJ/HASpd4DrqIuEACLc6L9YTFeIdMux9b4z2D7EiiX2+7+skp5 S0fEh07EelPTbPFPbLswvogyJrOK7SWXfLdeHgPSSxBFH95OZUhdnorM8XfOC5ST VmMktVy9UZwxmrCVDd401rXRdZpkv4Dfob9iDU7McOcLoV+QnugEofljRuQbQExy LB7e99hj06TP8eFBsQydfxv8OGSHu9H61ypVHCESZFFheFCfcJWjo8OzQu+CuhFl fIklPiiAQL3ZQQA+v7r6wm93QuYwyQKfwNUC4F0quwwDMtWpgKs2hXRxMa93fZ3W f2GlemJVWO5bj7VQKf6qZpILkCOLeIXPvAi8PIZX5QOvX2Yon88KiCw2caZqrm67 XWoqD75c1xZX78wHR0xniyAi5w12Hzpcnci7mBm7RBNBDN15ZQY01V5t4H6VulM2 uOUGspm/rcK1tCYDFV9DWAFon6881JJllXOe30n/dBsE0mNL2r4k5R5qCNxbWMmB vp1qP6Thu3XO1vLo42A5EmJEbu04blCSVi3BKfby/21VTzCbrgufvv3JKAauTFEz mmNwP1OW927/4C0aypkxUSkAv8aOwfrC+VWX5UAiFm9Zirc1nBgGFGjxZDQ+biV5 FT4zoEfqQQJ2pwZQq2leNKiFQE7pIcbvmY1O7LB1WXA92Ba4pGk7m1fP0Q8jLl7L YBEORDlgig== =de0O -----END PGP SIGNATURE----- From Jason at zx2c4.com Tue May 20 23:25:04 2025 From: Jason at zx2c4.com (Jason A. 
Donenfeld) Date: Wed, 21 May 2025 01:25:04 +0200 Subject: [RESEND PATCH v3 net-next] wireguard: allowedips: Add WGALLOWEDIP_F_REMOVE_ME flag In-Reply-To: <20250517192955.594735-2-jordan@jrife.io> References: <20250517192955.594735-1-jordan@jrife.io> <20250517192955.594735-2-jordan@jrife.io> Message-ID: On Sat, May 17, 2025 at 12:29:52PM -0700, Jordan Rife wrote: > Introduce a new flag called WGALLOWEDIP_F_REMOVE_ME which in the same > way that WGPEER_F_REMOVE_ME allows a user to remove a single peer from > a WireGuard device's configuration allows a user to remove an ip from a > peer's set of allowed ips. This enables incremental updates to a > device's configuration without any connectivity blips or messy > workarounds. Applied as: https://git.zx2c4.com/wireguard-linux/commit/?h=devel&id=8f697b71a615c5dfff98fe93554036a2643d1976 And the userspace changes have been released already: https://lists.zx2c4.com/pipermail/wireguard/2025-May/008789.html Thanks for this! And sorry it took so long to get it applied. I'll send this up via net-next in a few days after a bunch of testing. Jason From simon at rozman.si Wed May 21 07:25:51 2025 From: simon at rozman.si (Simon Rozman) Date: Wed, 21 May 2025 07:25:51 +0000 Subject: Crash on Windows ARM64 when Import and potential fix In-Reply-To: References: Message-ID: Hi, > There are numerous reports that the import function causes a crash on > Windows Arm64. (I believe it is actually the file selector that causes > the crash) > > https://www.reddit.com/r/WireGuard/comments/kwqnb5/wireguard_client_crashes_when_trying_to_add/ > Thanks for reaching out. We already have a patch for this in the wireguard-windows repo: https://git.zx2c4.com/wireguard-windows/commit/?id=8e6558eba6665b51de35779bffa46803dbc4c10d It is pending review and official release. 
Best regards, Simon From simon at rozman.si Wed May 21 08:39:50 2025 From: simon at rozman.si (Simon Rozman) Date: Wed, 21 May 2025 08:39:50 +0000 Subject: Race-condition when removing instance of WinTUN adapter? In-Reply-To: <3026a6c9-ee83-43b4-97a1-0904d85b8ad7@app.fastmail.com> References: <3026a6c9-ee83-43b4-97a1-0904d85b8ad7@app.fastmail.com> Message-ID: Hi, > We are receiving variations of the following errors on multiple Windows > machines from customers. The UUID is what we are setting as the tunnel > adapter ID. > > ``` > Spawning native process to remove instance Error executing worker > process: "SWD\WINTUN\{E9245BC1-B8C1-44CA-AB1D-C6AAD4F13B9C}": The system > cannot find the path specified. (Code 0x00000003) ``` > > and > > ``` > Spawning native process to remove instance Failed to create process: > rundll32 > "C:\Windows\Temp\ab11b60bba2fb3bcc9a355e9e3a89003522ede647bc6c00704e734a > 8447c1ce5\setupapihost.dll",RemoveInstance "SWD\WINTUN\{E9245BC1-B8C1- > 44CA-AB1D-C6AAD4F13B9C}": Het systeem kan het opgegeven pad niet vinden. > (Code 0x00000003) ``` > > (Error message here is Dutch because that particular customer is in the > Netherlands. It means "The system cannot find the path specified") > > From reading the source code at [0], I suspect that this is "only" a > race condition where the adapter is simply no longer present at the time > we want to remove it? I am not very well versed at C code which is why I > figured I'd ask here before attempting to patch the code. Would it make > sense to add a check for this particular error code and not fail as a > result, making the function idempotent? Exactly: failure to remove something that is already removed is not a failure. I've updated the source code to ignore this error when deleting the adapter instance: https://git.zx2c4.com/wintun/commit/?id=28627e00f0889f19ecb0abbf013e04bc897ab9f5 Thank you for reporting this. 
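As a rough illustration of the idempotent-removal idea discussed above, here is a minimal sketch in portable C. It is not the wintun change: POSIX unlink() and ENOENT stand in for the Windows device-removal call and for error code 0x00000003, and remove_if_present() is a hypothetical helper name used only for this example.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not from wintun): remove `path`, treating
 * "already gone" as success. Returns 0 if the path was removed or did
 * not exist in the first place, and -1 on any other error, in which
 * case errno is left as set by unlink(). */
static int remove_if_present(const char *path)
{
	if (unlink(path) == 0)
		return 0;
	if (errno == ENOENT)
		return 0; /* nothing left to remove: not a failure */
	return -1;
}
```

With this shape, calling the helper twice in a row on the same path succeeds both times, which is the property a racing caller needs.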
Best regards, Simon From Jason at zx2c4.com Wed May 21 21:08:43 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Wed, 21 May 2025 23:08:43 +0200 Subject: Incorrect computation of the MTU in wg-quick In-Reply-To: References: Message-ID: > I would like to report a behavior that seems to be incorrect in the way > wg-quick computes the MTU to assign to a wireguard interface: > https://git.zx2c4.com/wireguard-tools/tree/src/wg-quick/linux.bash#n125 > In this block, wg-quick goes through every endpoint it knows about, > gets the mtu of the route to reach the endpoint, and takes the highest > value among all the computed values. > However it appears to me that the chosen value should instead be the > lowest among all endpoints rather than the highest. > > As an example, if I declare myself (localhost) as an endpoint (it may or > may not be supported, but that's how I found out about this issue), then the > mtu will be set to 65456 (65536-80) which is higher than what the other > endpoints are able to manage, and I'll only be able to contact myself > properly. Thanks for the report. Fixed: https://git.zx2c4.com/wireguard-tools/commit/?id=5150cd647073be1f1c12688aef291bdf17970154 Let me know if that looks okay to you. 
Jason From jordan at jrife.io Wed May 21 23:02:19 2025 From: jordan at jrife.io (Jordan Rife) Date: Wed, 21 May 2025 16:02:19 -0700 Subject: [RESEND PATCH v1 wireguard-tools] ipc: linux: Support incremental allowed ips updates In-Reply-To: References: <20250517192955.594735-1-jordan@jrife.io> Message-ID: > > Merged here: > > https://git.zx2c4.com/wireguard-tools/commit/?id=0788f90810efde88cfa07ed96e7eca77c7f2eedd > > > > With a followup here: > > https://git.zx2c4.com/wireguard-tools/commit/?id=dce8ac6e2fa30f8b07e84859f244f81b3c6b2353 > > Also, > https://git.zx2c4.com/wireguard-go/commit/?id=256bcbd70d5b4eaae2a9f21a9889498c0f89041c Nice, cool to see this extended to wireguard-go as well. 
As a follow up, I was planning to also create a patch for golang.zx2c4.com/wireguard/wgctrl so the feature can be used from there too. Jordan From jordan at jrife.io Wed May 21 23:11:14 2025 From: jordan at jrife.io (Jordan Rife) Date: Wed, 21 May 2025 16:11:14 -0700 Subject: [RESEND PATCH v3 net-next] wireguard: allowedips: Add WGALLOWEDIP_F_REMOVE_ME flag In-Reply-To: References: <20250517192955.594735-1-jordan@jrife.io> <20250517192955.594735-2-jordan@jrife.io> Message-ID: On Wed, May 21, 2025 at 12:00:00AM +0200, Jason A. Donenfeld wrote: > On Tue, May 20, 2025 at 11:47:56PM +0200, Jason A. Donenfeld wrote: > > Hi Jakub, Jordan, > > > > On Sat, May 17, 2025 at 12:29:52PM -0700, Jordan Rife wrote: > > > * Use NLA_POLICY_MASK for WGALLOWEDIP_A_FLAGS validation (Jakub). > > [...] > > > + [WGALLOWEDIP_A_FLAGS] = NLA_POLICY_MASK(NLA_U32, __WGALLOWEDIP_F_ALL), > > > > I wonder... Can we update, in a separate patch, these to also use > > NLA_POLICY_MASK? > > > > ... > > [WGDEVICE_A_FLAGS] = { .type = NLA_U32 }, > > ... > > [WGPEER_A_FLAGS] = { .type = NLA_U32 }, > > ... > > > > Some consistency would be nice. > > Perhaps I'll commit something like this? > > From 22b6d15ad2a2e38bc80ebf65694106ff554b572f Mon Sep 17 00:00:00 2001 > From: "Jason A. Donenfeld" > Date: Tue, 20 May 2025 23:56:18 +0200 > Subject: [PATCH] wireguard: netlink: use NLA_POLICY_MASK where possible > > Rather than manually validating flags against the various __ALL_* > constants, put this in the netlink policy description and have the upper > layer machinery check it for us. > > Signed-off-by: Jason A. 
Donenfeld > --- > drivers/net/wireguard/netlink.c | 10 ++-------- > 1 file changed, 2 insertions(+), 8 deletions(-) > > diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c > index f7055180ba4a..b82266da949a 100644 > --- a/drivers/net/wireguard/netlink.c > +++ b/drivers/net/wireguard/netlink.c > @@ -24,7 +24,7 @@ static const struct nla_policy device_policy[WGDEVICE_A_MAX + 1] = { > [WGDEVICE_A_IFNAME] = { .type = NLA_NUL_STRING, .len = IFNAMSIZ - 1 }, > [WGDEVICE_A_PRIVATE_KEY] = NLA_POLICY_EXACT_LEN(NOISE_PUBLIC_KEY_LEN), > [WGDEVICE_A_PUBLIC_KEY] = NLA_POLICY_EXACT_LEN(NOISE_PUBLIC_KEY_LEN), > - [WGDEVICE_A_FLAGS] = { .type = NLA_U32 }, > + [WGDEVICE_A_FLAGS] = NLA_POLICY_MASK(NLA_U32, __WGDEVICE_F_ALL), > [WGDEVICE_A_LISTEN_PORT] = { .type = NLA_U16 }, > [WGDEVICE_A_FWMARK] = { .type = NLA_U32 }, > [WGDEVICE_A_PEERS] = { .type = NLA_NESTED } > @@ -33,7 +33,7 @@ static const struct nla_policy device_policy[WGDEVICE_A_MAX + 1] = { > static const struct nla_policy peer_policy[WGPEER_A_MAX + 1] = { > [WGPEER_A_PUBLIC_KEY] = NLA_POLICY_EXACT_LEN(NOISE_PUBLIC_KEY_LEN), > [WGPEER_A_PRESHARED_KEY] = NLA_POLICY_EXACT_LEN(NOISE_SYMMETRIC_KEY_LEN), > - [WGPEER_A_FLAGS] = { .type = NLA_U32 }, > + [WGPEER_A_FLAGS] = NLA_POLICY_MASK(NLA_U32, __WGPEER_F_ALL), > [WGPEER_A_ENDPOINT] = NLA_POLICY_MIN_LEN(sizeof(struct sockaddr)), > [WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL] = { .type = NLA_U16 }, > [WGPEER_A_LAST_HANDSHAKE_TIME] = NLA_POLICY_EXACT_LEN(sizeof(struct __kernel_timespec)), > @@ -373,9 +373,6 @@ static int set_peer(struct wg_device *wg, struct nlattr **attrs) > > if (attrs[WGPEER_A_FLAGS]) > flags = nla_get_u32(attrs[WGPEER_A_FLAGS]); > - ret = -EOPNOTSUPP; > - if (flags & ~__WGPEER_F_ALL) > - goto out; > > ret = -EPFNOSUPPORT; > if (attrs[WGPEER_A_PROTOCOL_VERSION]) { > @@ -506,9 +503,6 @@ static int wg_set_device(struct sk_buff *skb, struct genl_info *info) > > if (info->attrs[WGDEVICE_A_FLAGS]) > flags = 
nla_get_u32(info->attrs[WGDEVICE_A_FLAGS]); > - ret = -EOPNOTSUPP; > - if (flags & ~__WGDEVICE_F_ALL) > - goto out; > > if (info->attrs[WGDEVICE_A_LISTEN_PORT] || info->attrs[WGDEVICE_A_FWMARK]) { > struct net *net; > -- > 2.48.1 This changes the error code returned in userspace in these cases from EOPNOTSUPP to EINVAL I think, but if there's nothing relying on that behavior then it seems like a nice cleanup to me. Jordan From jordan at jrife.io Wed May 21 23:13:19 2025 From: jordan at jrife.io (Jordan Rife) Date: Wed, 21 May 2025 16:13:19 -0700 Subject: [RESEND PATCH v3 net-next] wireguard: allowedips: Add WGALLOWEDIP_F_REMOVE_ME flag In-Reply-To: References: <20250517192955.594735-1-jordan@jrife.io> <20250517192955.594735-2-jordan@jrife.io> Message-ID: On Wed, May 21, 2025 at 01:25:04AM +0200, Jason A. Donenfeld wrote: > On Sat, May 17, 2025 at 12:29:52PM -0700, Jordan Rife wrote: > > Introduce a new flag called WGALLOWEDIP_F_REMOVE_ME which in the same > > way that WGPEER_F_REMOVE_ME allows a user to remove a single peer from > > a WireGuard device's configuration allows a user to remove an ip from a > > peer's set of allowed ips. This enables incremental updates to a > > device's configuration without any connectivity blips or messy > > workarounds. > > Applied as: > https://git.zx2c4.com/wireguard-linux/commit/?h=devel&id=8f697b71a615c5dfff98fe93554036a2643d1976 > > And the userspace changes have been released already: > https://lists.zx2c4.com/pipermail/wireguard/2025-May/008789.html > > Thanks for this! And sorry it took so long to get it applied. I'll send > this up via net-next in a few days after a bunch of testing. > > Jason No problem, we all get busy :). Thanks for applying. Jordan From Jason at zx2c4.com Wed May 21 23:51:39 2025 From: Jason at zx2c4.com (Jason A. 
Donenfeld) Date: Thu, 22 May 2025 01:51:39 +0200 Subject: [RESEND PATCH v1 wireguard-tools] ipc: linux: Support incremental allowed ips updates In-Reply-To: References: <20250517192955.594735-1-jordan@jrife.io> Message-ID: On Thu, May 22, 2025 at 1:02 AM Jordan Rife wrote: > > > Merged here: > > > https://git.zx2c4.com/wireguard-tools/commit/?id=0788f90810efde88cfa07ed96e7eca77c7f2eedd > > > > > > With a followup here: > > > https://git.zx2c4.com/wireguard-tools/commit/?id=dce8ac6e2fa30f8b07e84859f244f81b3c6b2353 > > > > Also, > > https://git.zx2c4.com/wireguard-go/commit/?id=256bcbd70d5b4eaae2a9f21a9889498c0f89041c > > Nice, cool to see this extended to wireguard-go as well. As a follow up, > I was planning to also create a patch for golang.zx2c4.com/wireguard/wgctrl > so the feature can be used from there too. Wonderful, please do! Looking forward to merging that. There's already an open PR in FreeBSD too. From liuhangbin at gmail.com Thu May 22 04:34:44 2025 From: liuhangbin at gmail.com (Hangbin Liu) Date: Thu, 22 May 2025 04:34:44 +0000 Subject: [PATCHv6 net-next 0/2] wireguard: selftests: use nftables for testing In-Reply-To: <20250408081652.1330-1-liuhangbin@gmail.com> References: <20250408081652.1330-1-liuhangbin@gmail.com> Message-ID: Hi Jason, I just saw that this patch set has not been applied to the wireguard tree. Did I miss any change request? Should I repost the patch? BTW, what prefix should I use when the target is wireguard next? [PATCH wireguard-next]? Thanks Hangbin On Tue, Apr 08, 2025 at 08:16:50AM +0000, Hangbin Liu wrote: > This patch set converts the wireguard selftest to nftables, as iptables is > deprecated and nftables is the default framework of most releases. > > v6: fix typo in patch 1/2. Update the description (Phil Sutter) > v5: remove the counter in nft rules and link nft statically (Jason A. Donenfeld) > v4: no update, just re-send > v3: drop iptables directly (Jason A. 
Donenfeld) > Also convert to using nft for qemu testing (Jason A. 
Donenfeld) > v2: use one nft table for testing (Phil Sutter) > > Hangbin Liu (2): > wireguard: selftests: convert iptables to nft > wireguard: selftests: update to using nft for qemu test > > tools/testing/selftests/wireguard/netns.sh | 29 +++++++++------ > .../testing/selftests/wireguard/qemu/Makefile | 36 ++++++++++++++----- > .../selftests/wireguard/qemu/kernel.config | 7 ++-- > 3 files changed, 49 insertions(+), 23 deletions(-) > > -- > 2.46.0 > From calestyo at scientia.org Thu May 22 22:36:46 2025 From: calestyo at scientia.org (Christoph Anton Mitterer) Date: Fri, 23 May 2025 00:36:46 +0200 Subject: are WG clients expected to automatically handle it when the endpoint is within the AllowedIPs Message-ID: <1a897464d3fb56184b83cb6ac7b4a2407047b10e.camel@scientia.org> (re-posting, now that the list seems to work again) Hey folks. In science/education, many organisations (I could find the total list only in the Android app, but there it seems to be several 1000) use eduVPN to provide VPN access to their users. It comes with a client which, AFAIU, either sets up some OpenVPN or WG VPN. I've previously used the OpenVPN profile files successfully with NetworkManager but now wanted to switch to WG, and again I don't wanna use the eduVPN client, because I think this should be done with the native tools that integrate nicely into the system (e.g. NM for desktop environments, ifupdown/systemd-networkd/etc. for servers). I guess quite a few sites offer two kinds of profiles, "full" (where the VPN is set up so that all traffic goes via it) and "split" (where only the subnets of the respective organisations go via the VPN). 
For WG and split a provided config looks like:
[Interface]
MTU = 1392
PrivateKey = blafasl
Address = 10.153.154.19/24,2001:4ca0:4fff:2:4::13/96
DNS = 10.156.33.53,129.187.5.1,2001:4ca0::53:1,2001:4ca0::53:2,lmu.de,uni-muenchen.de,mwn.de
[Peer]
PublicKey = 7Bp04UdAbZDqChLFgm0sJa6YUaIsye0mZ2c0AxKe5RE=
AllowedIPs = 10.0.0.0/8,85.208.24.0/22,129.27.124.136/32,129.187.0.0/16,131.159.0.0/16,138.244.0.0/15,138.246.0.0/16,141.39.128.0/18,141.39.240.0/20,141.40.0.0/16,141.84.0.0/16,172.16.0.0/12,192.54.42.0/24,192.55.197.0/24,192.68.211.0/24,192.68.212.0/24,192.168.0.0/16,193.174.96.0/23,194.94.155.224/28,2001:4ca0::/29,2a09:80c0::/29
Endpoint = eduvpn-n14.srv.lrz.de:51820
for full it's effectively the same, except for: AllowedIPs = 0.0.0.0/0,::/0 Using that config with NM fails. I've opened [0], which is mostly about the "split" setup; there is also [1], which is mostly about the full setup. The reason is that the endpoint has IPs that are also within the AllowedIPs subnets, and no special care is taken (well for full, it seems they're about to handle it [2]) to ensure that packets to the endpoint don't go via the tunnel. With wg-quick, full works, but split fails, too, I guess because add_default is only called in the AllowedIPs = 0.0.0.0/0,::/0 case. https://github.com/WireGuard/wireguard-tools/blob/17c78d31c27a3c311a2ff42a881057753c6ef2a4/src/wg-quick/linux.bash#L169-L170 So the question is now, should clients be expected to automatically handle the split case (they apparently are for the full case)... ... or are (split) profiles expected to "simply" (well it could be ugly in practice) provide their AllowedIPs so that it doesn't contain any endpoints. The practical problem with the latter would of course be that the endpoints will typically be within subnets that shall also be tunnelled. Thanks, Chris. 
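To make the overlap described above concrete, here is a minimal sketch in C of the check a client could run before installing routes: does a resolved endpoint address fall inside one of the AllowedIPs prefixes? This is only an illustration under stated assumptions, not what wg-quick or NetworkManager do. ipv4_in_prefix() is a hypothetical helper, it handles IPv4 only, and the address the endpoint hostname resolves to is not given in this thread, so a caller would have to supply it.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical helper (not from wg-quick or NetworkManager): returns 1
 * if the IPv4 address `addr` lies inside the prefix `net`/`cidr`, 0 if
 * it does not, and -1 if either string fails to parse or the prefix
 * length is larger than 32. */
static int ipv4_in_prefix(const char *addr, const char *net, unsigned int cidr)
{
	struct in_addr a, n;
	uint32_t mask;

	if (cidr > 32 || inet_pton(AF_INET, addr, &a) != 1 ||
	    inet_pton(AF_INET, net, &n) != 1)
		return -1;
	/* A /0 prefix matches everything; shifting by 32 would be undefined. */
	mask = cidr ? htonl((uint32_t)0xffffffffu << (32 - cidr)) : 0;
	return (a.s_addr & mask) == (n.s_addr & mask);
}
```

For example, 129.187.5.1 (one of the DNS servers in the config above) is inside 129.187.0.0/16, so a peer endpoint with such an address would need an explicit route outside the tunnel.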
[0] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1737 [1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1521 [2] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2158 From Jason at zx2c4.com Fri May 23 18:28:33 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Fri, 23 May 2025 20:28:33 +0200 Subject: [PATCH] wg: syncconf: also handle psk changes In-Reply-To: <20250314093920.3448871-1-patrick.havelange_ext@softathome.com> References: <20250314093920.3448871-1-patrick.havelange_ext@softathome.com> Message-ID: Your patch is wrong; did you test it? When it detects a mismatch in PSK, it removes the peer, rather than removing the PSK. I implemented a different fix: https://git.zx2c4.com/wireguard-tools/commit/?id=780182e37d2b5981171766b8f31bcefd64da7a43 It seems to work for me, but please test this and let me know. Either way, thanks for the bug report. Thanks, Jason From Jason at zx2c4.com Fri May 23 18:46:36 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Fri, 23 May 2025 20:46:36 +0200 Subject: [PATCH wireguard-tools] wg-quick: escaped # in Pre/PostUp/Down recognised In-Reply-To: <20250115113349.106339-1-robyn@kosching.me> References: <20250115113349.106339-1-robyn@kosching.me> Message-ID: Thanks. Applied with some minor changes here: https://git.zx2c4.com/wireguard-tools/commit/?id=90deacd33da06534ee98d41c6e76e108b35cf077 From Jason at zx2c4.com Fri May 23 19:02:39 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Fri, 23 May 2025 21:02:39 +0200 Subject: wg-quick fails with systemd resolvconf compatibility shim In-Reply-To: References: Message-ID: Thanks for the report. 
Fixed here: https://git.zx2c4.com/wireguard-tools/commit/?id=d3b40aff964789a2a0533cb7a070592a75a996e3 From git at claire.sharkgirl.ing Sun May 25 08:04:57 2025 From: git at claire.sharkgirl.ing (Claire Elaina) Date: Sun, 25 May 2025 18:04:57 +1000 Subject: [PATCH wireguard-tools] wg-quick: android: add support for {Pre, Post}{Up, Down} hooks Message-ID: <20250525080457.998659-1-git@claire.sharkgirl.ing> --- src/wg-quick/android.c | 96 +++++++++++++++++++++++++++++++++++++++--- 1 file changed, 89 insertions(+), 7 deletions(-) diff --git a/src/wg-quick/android.c b/src/wg-quick/android.c index 1263ee4..8a8df47 100644 --- a/src/wg-quick/android.c +++ b/src/wg-quick/android.c @@ -60,6 +60,15 @@ static void *xcalloc(size_t nmemb, size_t size) exit(errno); } +static void *xrealloc(void *ptr, size_t size) +{ + void *ret = realloc(ptr, size); + if (ret) + return ret; + perror("Error: realloc"); + exit(errno); +} + static void *xstrdup(const char *str) { char *ret = strdup(str); @@ -126,6 +135,27 @@ static void free_command_buffer(struct command_buffer *c) free(c->line); } +struct str_list { + char **items; + size_t len; +}; + +static void append_str_list(struct str_list *l, char *item) +{ + l->len++; + l->items = xrealloc(l->items, sizeof(char*) * l->len); + l->items[l->len - 1] = item; +} + +static void free_str_list(struct str_list *l) +{ + if (!l) + return; + for (size_t i = 0; i < l->len; ++i) + free(l->items[i]); + free(l->items); +} + static void freep(void *p) { free(*(void **)p); @@ -140,6 +170,7 @@ static void fclosep(FILE **f) #define _cleanup_regfree_ _cleanup_(regfree) #define DEFINE_CMD(name) _cleanup_(free_command_buffer) struct command_buffer name = { 0 }; +#define DEFINE_STR_LIST(name) _cleanup_(free_str_list) struct str_list name = { 0 }; static char *vcmd_ret(struct command_buffer *c, const char *cmd_fmt, va_list args) { @@ -239,6 +270,12 @@ _printf_(1, 2) static void cndc(const char *cmd_fmt, ...) 
} } +static void execute_hooks(const struct str_list *hooks) +{ + for (size_t i = 0; i < hooks->len; ++i) + cmd("%s", hooks->items[i]); +} + /* Values are from AOSP repository platform/frameworks/native in libs/binder/ndk/include_ndk/android/binder_status.h. */ enum { STATUS_OK = 0, @@ -1112,7 +1149,7 @@ static void cmd_up_cleanup(void) free(cleanup_iface); } -static void cmd_up(const char *iface, const char *config, unsigned int mtu, const char *addrs, const char *dnses, const char *excluded_applications, const char *included_applications) +static void cmd_up(const char *iface, const char *config, unsigned int mtu, const char *addrs, const char *dnses, const char *excluded_applications, const char *included_applications, const struct str_list *pre_up, const struct str_list *post_up) { DEFINE_CMD(c); unsigned int netid = 0; @@ -1127,6 +1164,7 @@ static void cmd_up(const char *iface, const char *config, unsigned int mtu, cons atexit(cmd_up_cleanup); add_if(iface); + execute_hooks(pre_up); set_config(iface, config); listen_port = determine_listen_port(iface); up_if(&netid, iface, listen_port); @@ -1135,6 +1173,7 @@ static void cmd_up(const char *iface, const char *config, unsigned int mtu, cons set_routes(iface, netid); set_mtu(iface, mtu); set_users(netid, excluded_applications, included_applications); + execute_hooks(post_up); broadcast_change(); free(cleanup_iface); @@ -1142,7 +1181,7 @@ static void cmd_up(const char *iface, const char *config, unsigned int mtu, cons exit(EXIT_SUCCESS); } -static void cmd_down(const char *iface) +static void cmd_down(const char *iface, const struct str_list *pre_down, const struct str_list *post_down) { DEFINE_CMD(c); bool found = false; @@ -1161,12 +1200,14 @@ static void cmd_down(const char *iface) exit(EMEDIUMTYPE); } + execute_hooks(pre_down); del_if(iface); + execute_hooks(post_down); broadcast_change(); exit(EXIT_SUCCESS); } -static void parse_options(char **iface, char **config, unsigned int *mtu, char **addrs, char 
**dnses, char **excluded_applications, char **included_applications, const char *arg) +static void parse_options(char **iface, char **config, unsigned int *mtu, char **addrs, char **dnses, char **excluded_applications, char **included_applications, struct str_list *pre_up, struct str_list *post_up, struct str_list *pre_down, struct str_list *post_down, const char *arg) { _cleanup_fclose_ FILE *file = NULL; _cleanup_free_ char *line = NULL; @@ -1236,6 +1277,27 @@ static void parse_options(char **iface, char **config, unsigned int *mtu, char * } clean[j] = '\0'; + char *line_value = strchr(line, '='); + _cleanup_free_ char *unstripped_value = NULL; + if (line_value) { + /* Skip equal sign. */ + line_value++; + + /* Skip leading whitespace. */ + while (isspace(line_value[0])) + line_value++; + + /* Calculate length of the value without trailing whitespace. */ + size_t line_value_len = strlen(line_value); + while (line_value_len && isspace(line_value[line_value_len - 1])) + line_value_len--; + + /* Create the string. 
*/ + unstripped_value = xmalloc(line_value_len + 1); + memcpy(unstripped_value, line_value, line_value_len); + unstripped_value[line_value_len] = '\0'; + } + if (clean[0] == '[') in_interface_section = false; if (!strcasecmp(clean, "[Interface]")) @@ -1256,6 +1318,22 @@ static void parse_options(char **iface, char **config, unsigned int *mtu, char * } else if (!strncasecmp(clean, "MTU=", 4) && j > 4) { *mtu = atoi(clean + 4); continue; + } else if (!strncasecmp(clean, "PreUp=", 6) && j > 6) { + append_str_list(pre_up, unstripped_value); + unstripped_value = NULL; + continue; + } else if (!strncasecmp(clean, "PostUp=", 7) && j > 7) { + append_str_list(post_up, unstripped_value); + unstripped_value = NULL; + continue; + } else if (!strncasecmp(clean, "PreDown=", 8) && j > 8) { + append_str_list(pre_down, unstripped_value); + unstripped_value = NULL; + continue; + } else if (!strncasecmp(clean, "PostDown=", 9) && j > 9) { + append_str_list(post_down, unstripped_value); + unstripped_value = NULL; + continue; } } *config = concat_and_free(*config, "", line); @@ -1279,6 +1357,10 @@ int main(int argc, char *argv[]) _cleanup_free_ char *dnses = NULL; _cleanup_free_ char *excluded_applications = NULL; _cleanup_free_ char *included_applications = NULL; + DEFINE_STR_LIST(pre_up); + DEFINE_STR_LIST(post_up); + DEFINE_STR_LIST(pre_down); + DEFINE_STR_LIST(post_down); unsigned int mtu; char prop[PROP_VALUE_MAX + 1]; @@ -1289,12 +1371,12 @@ int main(int argc, char *argv[]) cmd_usage(argv[0]); else if (argc == 3 && !strcmp(argv[1], "up")) { auto_su(argc, argv); - parse_options(&iface, &config, &mtu, &addrs, &dnses, &excluded_applications, &included_applications, argv[2]); - cmd_up(iface, config, mtu, addrs, dnses, excluded_applications, included_applications); + parse_options(&iface, &config, &mtu, &addrs, &dnses, &excluded_applications, &included_applications, &pre_up, &post_up, &pre_down, &post_down, argv[2]); + cmd_up(iface, config, mtu, addrs, dnses, excluded_applications, 
included_applications, &pre_up, &post_up); } else if (argc == 3 && !strcmp(argv[1], "down")) { auto_su(argc, argv); - parse_options(&iface, &config, &mtu, &addrs, &dnses, &excluded_applications, &included_applications, argv[2]); - cmd_down(iface); + parse_options(&iface, &config, &mtu, &addrs, &dnses, &excluded_applications, &included_applications, &pre_up, &post_up, &pre_down, &post_down, argv[2]); + cmd_down(iface, &pre_down, &post_down); } else { cmd_usage(argv[0]); return 1; -- 2.49.0 From Jason at zx2c4.com Sun May 25 12:45:29 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Sun, 25 May 2025 14:45:29 +0200 Subject: [PATCH wireguard-tools] wg-quick: android: add support for {Pre, Post}{Up, Down} hooks In-Reply-To: <20250525080457.998659-1-git@claire.sharkgirl.ing> References: <20250525080457.998659-1-git@claire.sharkgirl.ing> Message-ID: On Sun, May 25, 2025 at 06:04:57PM +1000, Claire Elaina wrote: > +static void execute_hooks(const struct str_list *hooks) > +{ > + for (size_t i = 0; i < hooks->len; ++i) > + cmd("%s", hooks->items[i]); > +} This was also posted here, so copying Adam: https://lore.kernel.org/wireguard/DM6PR13MB24579CD788EF28E019933C0A92609 at DM6PR13MB2457.namprd13.prod.outlook.com/ https://github.com/WireGuard/wireguard-android/pull/23 This feature is appealing, but I've always held off on it because I'm afraid of the malware potential on client platforms where people are pretty looseygoosey with loading in random config files. Even on Windows, it only got added behind a hidden registry setting. If we added it here, maybe it'd need to be quite gated too. But then how do we handle cases where a config had it but it was disabled and then it gets enabled and it's there by surprise? Maybe strip it out on import if it's disabled? What about the transition from root to non-root and back? Anyway, many questions. Wondering, what commands do you want to run? 
Jason

From ismael at bouya.org Sun May 25 22:51:02 2025
From: ismael at bouya.org (Ismael Bouya)
Date: Mon, 26 May 2025 00:51:02 +0200
Subject: Incorrect computation of the MTU in wg-quick
In-Reply-To:
References:
Message-ID:

Hi Jason,

It seems to work correctly now, thanks for the fix!

Kind regards,

(Wed, May 21, 2025 at 11:08:43PM +0200) Jason A. Donenfeld :
> > I would like to report a behavior that seems to be incorrect in the way
> > wg-quick computes the MTU to assign to a wireguard interface:
> > https://git.zx2c4.com/wireguard-tools/tree/src/wg-quick/linux.bash#n125
> > In this block, wg-quick goes through every endpoint it knows about,
> > gets the mtu of the route to reach the endpoint, and takes the highest
> > value among all the computed values.
> > However it appears to me that the chosen value should instead be the
> > lowest among all endpoints rather than the highest.
> >
> > As an example, if I declare myself (localhost) as an endpoint (it may or
> > may not be supported, but that's how I found about this issue), then the
> > mtu will be set to 65456 (65536-80) which is higher than what the other
> > endpoints are able to manage, and I'll only be able to contact myself
> > properly.
>
> Thanks for the report. Fixed:
> https://git.zx2c4.com/wireguard-tools/commit/?id=5150cd647073be1f1c12688aef291bdf17970154
>
> Let me know if that looks okay to you.
>
> Jason

-- 
Ismael

From claire at sharkgirl.ing Sun May 25 22:45:12 2025
From: claire at sharkgirl.ing (Claire)
Date: Mon, 26 May 2025 08:45:12 +1000
Subject: [PATCH wireguard-tools] wg-quick: android: add support for {Pre, Post}{Up, Down} hooks
In-Reply-To:
References: <20250525080457.998659-1-git@claire.sharkgirl.ing>
Message-ID:

> Wondering, what commands do you want to run?
PostUp = wg set CelesteWAN fwmark 0 X problem: I have a Raspberry Pi at home, and I want to have an encrypted link between it and client devices. When I'm at home (i.e. connected to the Pi's LAN), I want the clients to connect directly to the Pi with its LAN IP address. When I'm away from home, I want them to connect through a remote server that has access to the Pi. Y problem: I cannot do port forwarding on my home internet connection because of CGNAT (hence, I cannot have the clients use the Pi's public IP address). My cursed idea is to nest Wireguard over Wireguard when not on LAN, so the connection would be "Phone -> Server -> Pi". This works fine on my laptop, but unfortunately not on my phone (pings to the Pi result in no response). However, when I manually run `wg set CelesteWAN fwmark 0` after the tunnel is already set up, the connection works. I have made a patch to allow setting FwMark in the config, but it doesn't work when testing. Perhaps the `iptables -m mark ...` rules are interfering. I want to try only setting the `fwmark` for the interface, but I feel like it's too niche to upstream, so I wanted to add generic command execution. If there's a less cursed way to make Wireguard over Wireguard work, or even not having to do WoW, I'd appreciate it. Sincerely, Claire Elaina From patrick.havelange_ext at softathome.com Mon May 26 06:55:03 2025 From: patrick.havelange_ext at softathome.com (Patrick HAVELANGE (EXT)) Date: Mon, 26 May 2025 06:55:03 +0000 Subject: [PATCH] wg: syncconf: also handle psk changes In-Reply-To: References: <20250314093920.3448871-1-patrick.havelange_ext@softathome.com> Message-ID: Hello Jason, I indeed tested it. The peer is indeed removed but is then added in a step later (if I remember correctly) so it produces the proper result in the end. At least it looked correct from my point of view. P.H. ________________________________________ From: Jason A. 
Donenfeld
Sent: 23 May 2025 20:28
To: Patrick HAVELANGE (EXT)
Cc: wireguard at lists.zx2c4.com
Subject: Re: [PATCH] wg: syncconf: also handle psk changes

Your patch is wrong; did you test it? When it detects a mismatch in PSK, it removes the peer, rather than removing the PSK.

I implemented a different fix:
https://git.zx2c4.com/wireguard-tools/commit/?id=780182e37d2b5981171766b8f31bcefd64da7a43

It seems to work for me, but please test this and let me know. Either way, thanks for the bug report.

Thanks,
Jason

From liuhangbin at gmail.com Tue May 27 03:26:33 2025
From: liuhangbin at gmail.com (Hangbin Liu)
Date: Tue, 27 May 2025 03:26:33 +0000
Subject: [PATCHv7 RESEND wireguard 0/2] wireguard: selftests: use nftables for testing
Message-ID: <20250527032635.10361-1-liuhangbin@gmail.com>

This patch set converts the wireguard selftest to nftables, as iptables is deprecated and nftables is the default framework in most releases.

v7: re-post, no update.
v6: fix typo in patch 1/2.
    Update the description (Phil Sutter)
v5: remove the counter in nft rules and link nft statically (Jason A. Donenfeld)
v4: no update, just re-send
v3: drop iptables directly (Jason A. Donenfeld)
    Also convert to using nft for qemu testing (Jason A. Donenfeld)
v2: use one nft table for testing (Phil Sutter)

Hangbin Liu (2):
  wireguard: selftests: convert iptables to nft
  wireguard: selftests: update to using nft for qemu test

 tools/testing/selftests/wireguard/netns.sh    | 29 +++++++++------
 .../testing/selftests/wireguard/qemu/Makefile | 36 ++++++++++++++-----
 .../selftests/wireguard/qemu/kernel.config    |  7 ++--
 3 files changed, 49 insertions(+), 23 deletions(-)

-- 
2.46.0

From liuhangbin at gmail.com Tue May 27 03:26:34 2025
From: liuhangbin at gmail.com (Hangbin Liu)
Date: Tue, 27 May 2025 03:26:34 +0000
Subject: [PATCHv7 RESEND wireguard 1/2] wireguard: selftests: convert iptables to nft
In-Reply-To: <20250527032635.10361-1-liuhangbin@gmail.com>
References: <20250527032635.10361-1-liuhangbin@gmail.com>
Message-ID: <20250527032635.10361-2-liuhangbin@gmail.com>

Convert the selftest to nft, which replaces iptables and is used by default in most releases.
Signed-off-by: Hangbin Liu --- tools/testing/selftests/wireguard/netns.sh | 29 ++++++++++++++-------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh index 55500f901fbc..8b840fef90af 100755 --- a/tools/testing/selftests/wireguard/netns.sh +++ b/tools/testing/selftests/wireguard/netns.sh @@ -75,6 +75,11 @@ pp ip netns add $netns1 pp ip netns add $netns2 ip0 link set up dev lo +# init nft tables +n0 nft add table ip wgtest +n1 nft add table ip wgtest +n2 nft add table ip wgtest + ip0 link add dev wg0 type wireguard ip0 link set wg0 netns $netns1 ip0 link add dev wg0 type wireguard @@ -196,13 +201,14 @@ ip1 link set wg0 mtu 1300 ip2 link set wg0 mtu 1300 n1 wg set wg0 peer "$pub2" endpoint 127.0.0.1:2 n2 wg set wg0 peer "$pub1" endpoint 127.0.0.1:1 -n0 iptables -A INPUT -m length --length 1360 -j DROP +n0 nft add chain ip wgtest INPUT { type filter hook input priority filter \; policy accept \; } +n0 nft add rule ip wgtest INPUT meta length 1360 drop n1 ip route add 192.168.241.2/32 dev wg0 mtu 1299 n2 ip route add 192.168.241.1/32 dev wg0 mtu 1299 n2 ping -c 1 -W 1 -s 1269 192.168.241.1 n2 ip route delete 192.168.241.1/32 dev wg0 mtu 1299 n1 ip route delete 192.168.241.2/32 dev wg0 mtu 1299 -n0 iptables -F INPUT +n0 nft flush table ip wgtest ip1 link set wg0 mtu $orig_mtu ip2 link set wg0 mtu $orig_mtu @@ -335,7 +341,8 @@ n0 bash -c 'printf 1 > /proc/sys/net/ipv4/ip_forward' [[ -e /proc/sys/net/netfilter/nf_conntrack_udp_timeout ]] || modprobe nf_conntrack n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout' n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout_stream' -n0 iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -d 10.0.0.0/24 -j SNAT --to 10.0.0.1 +n0 nft add chain ip wgtest POSTROUTING { type nat hook postrouting priority srcnat\; policy accept \; } +n0 nft add rule ip wgtest POSTROUTING ip saddr 192.168.1.0/24 ip 
daddr 10.0.0.0/24 snat to 10.0.0.1 n1 wg set wg0 peer "$pub2" endpoint 10.0.0.100:2 persistent-keepalive 1 n1 ping -W 1 -c 1 192.168.241.2 @@ -349,10 +356,11 @@ n1 wg set wg0 peer "$pub2" persistent-keepalive 0 # Test that sk_bound_dev_if works n1 ping -I wg0 -c 1 -W 1 192.168.241.2 # What about when the mark changes and the packet must be rerouted? -n1 iptables -t mangle -I OUTPUT -j MARK --set-xmark 1 +n1 nft add chain ip wgtest OUTPUT { type route hook output priority mangle\; policy accept \; } +n1 nft add rule ip wgtest OUTPUT meta mark set 0x1 n1 ping -c 1 -W 1 192.168.241.2 # First the boring case n1 ping -I wg0 -c 1 -W 1 192.168.241.2 # Then the sk_bound_dev_if case -n1 iptables -t mangle -D OUTPUT -j MARK --set-xmark 1 +n1 nft flush table ip wgtest # Test that onion routing works, even when it loops n1 wg set wg0 peer "$pub3" allowed-ips 192.168.242.2/32 endpoint 192.168.241.2:5 @@ -386,16 +394,17 @@ n1 ping -W 1 -c 100 -f 192.168.99.7 n1 ping -W 1 -c 100 -f abab::1111 # Have ns2 NAT into wg0 packets from ns0, but return an icmp error along the right route. -n2 iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -d 192.168.241.0/24 -j SNAT --to 192.168.241.2 -n0 iptables -t filter -A INPUT \! -s 10.0.0.0/24 -i vethrs -j DROP # Manual rpfilter just to be explicit. +n2 nft add chain ip wgtest POSTROUTING { type nat hook postrouting priority srcnat\; policy accept \; } +n2 nft add rule ip wgtest POSTROUTING ip saddr 10.0.0.0/24 ip daddr 192.168.241.0/24 snat to 192.168.241.2 +n0 nft add chain ip wgtest INPUT { type filter hook input priority filter \; policy accept \; } +n0 nft add rule ip wgtest INPUT iifname "vethrs" ip saddr != 10.0.0.0/24 drop n2 bash -c 'printf 1 > /proc/sys/net/ipv4/ip_forward' ip0 -4 route add 192.168.241.1 via 10.0.0.100 n2 wg set wg0 peer "$pub1" remove [[ $(! 
n0 ping -W 1 -c 1 192.168.241.1 || false) == *"From 10.0.0.100 icmp_seq=1 Destination Host Unreachable"* ]] -n0 iptables -t nat -F -n0 iptables -t filter -F -n2 iptables -t nat -F +n0 nft flush table ip wgtest +n2 nft flush table ip wgtest ip0 link del vethrc ip0 link del vethrs ip1 link del wg0 -- 2.46.0 From liuhangbin at gmail.com Tue May 27 03:26:35 2025 From: liuhangbin at gmail.com (Hangbin Liu) Date: Tue, 27 May 2025 03:26:35 +0000 Subject: [PATCHv7 RESEND wireguard 2/2] wireguard: selftests: update to using nft for qemu test In-Reply-To: <20250527032635.10361-1-liuhangbin@gmail.com> References: <20250527032635.10361-1-liuhangbin@gmail.com> Message-ID: <20250527032635.10361-3-liuhangbin@gmail.com> Since we will replace iptables with nft for wireguard netns testing, let's also convert the qemu test to use nft at the same time. Co-developed-by: Phil Sutter Signed-off-by: Phil Sutter Signed-off-by: Hangbin Liu --- .../testing/selftests/wireguard/qemu/Makefile | 36 ++++++++++++++----- .../selftests/wireguard/qemu/kernel.config | 7 ++-- 2 files changed, 30 insertions(+), 13 deletions(-) diff --git a/tools/testing/selftests/wireguard/qemu/Makefile b/tools/testing/selftests/wireguard/qemu/Makefile index 35856b11c143..2442ae99f007 100644 --- a/tools/testing/selftests/wireguard/qemu/Makefile +++ b/tools/testing/selftests/wireguard/qemu/Makefile @@ -40,7 +40,9 @@ endef $(eval $(call tar_download,IPERF,iperf,3.11,.tar.gz,https://downloads.es.net/pub/iperf/,de8cb409fad61a0574f4cb07eb19ce1159707403ac2dc01b5d175e91240b7e5f)) $(eval $(call tar_download,BASH,bash,5.1.16,.tar.gz,https://ftp.gnu.org/gnu/bash/,5bac17218d3911834520dad13cd1f85ab944e1c09ae1aba55906be1f8192f558)) $(eval $(call tar_download,IPROUTE2,iproute2,5.17.0,.tar.gz,https://www.kernel.org/pub/linux/utils/net/iproute2/,bda331d5c4606138892f23a565d78fca18919b4d508a0b7ca8391c2da2db68b9)) -$(eval $(call 
tar_download,IPTABLES,iptables,1.8.7,.tar.bz2,https://www.netfilter.org/projects/iptables/files/,c109c96bb04998cd44156622d36f8e04b140701ec60531a10668cfdff5e8d8f0)) +$(eval $(call tar_download,LIBMNL,libmnl,1.0.5,.tar.bz2,https://www.netfilter.org/projects/libmnl/files/,274b9b919ef3152bfb3da3a13c950dd60d6e2bcd54230ffeca298d03b40d0525)) +$(eval $(call tar_download,LIBNFTNL,libnftnl,1.2.8,.tar.xz,https://www.netfilter.org/projects/libnftnl/files/,37fea5d6b5c9b08de7920d298de3cdc942e7ae64b1a3e8b880b2d390ae67ad95)) +$(eval $(call tar_download,NFTABLES,nftables,1.1.1,.tar.xz,https://www.netfilter.org/projects/nftables/files/,6358830f3a64f31e39b0ad421d7dadcd240b72343ded48d8ef13b8faf204865a)) $(eval $(call tar_download,NMAP,nmap,7.92,.tgz,https://nmap.org/dist/,064183ea642dc4c12b1ab3b5358ce1cef7d2e7e11ffa2849f16d339f5b717117)) $(eval $(call tar_download,IPUTILS,iputils,s20190709,.tar.gz,https://github.com/iputils/iputils/archive/s20190709.tar.gz/#,a15720dd741d7538dd2645f9f516d193636ae4300ff7dbc8bfca757bf166490a)) $(eval $(call tar_download,WIREGUARD_TOOLS,wireguard-tools,1.0.20210914,.tar.xz,https://git.zx2c4.com/wireguard-tools/snapshot/,97ff31489217bb265b7ae850d3d0f335ab07d2652ba1feec88b734bc96bd05ac)) @@ -322,8 +324,7 @@ $(BUILD_PATH)/init-cpio-spec.txt: $(TOOLCHAIN_PATH)/.installed $(BUILD_PATH)/ini echo "file /bin/ss $(IPROUTE2_PATH)/misc/ss 755 0 0" >> $@ echo "file /bin/ping $(IPUTILS_PATH)/ping 755 0 0" >> $@ echo "file /bin/ncat $(NMAP_PATH)/ncat/ncat 755 0 0" >> $@ - echo "file /bin/xtables-legacy-multi $(IPTABLES_PATH)/iptables/xtables-legacy-multi 755 0 0" >> $@ - echo "slink /bin/iptables xtables-legacy-multi 777 0 0" >> $@ + echo "file /bin/nft $(NFTABLES_PATH)/src/nft 755 0 0" >> $@ echo "slink /bin/ping6 ping 777 0 0" >> $@ echo "dir /lib 755 0 0" >> $@ echo "file /lib/libc.so $(TOOLCHAIN_PATH)/$(CHOST)/lib/libc.so 755 0 0" >> $@ @@ -338,7 +339,7 @@ $(KERNEL_BUILD_PATH)/.config: $(TOOLCHAIN_PATH)/.installed kernel.config arch/$( cd $(KERNEL_BUILD_PATH) && 
ARCH=$(KERNEL_ARCH) $(KERNEL_PATH)/scripts/kconfig/merge_config.sh -n $(KERNEL_BUILD_PATH)/.config $(KERNEL_BUILD_PATH)/minimal.config $(if $(findstring yes,$(DEBUG_KERNEL)),cp debug.config $(KERNEL_BUILD_PATH) && cd $(KERNEL_BUILD_PATH) && ARCH=$(KERNEL_ARCH) $(KERNEL_PATH)/scripts/kconfig/merge_config.sh -n $(KERNEL_BUILD_PATH)/.config debug.config,) -$(KERNEL_BZIMAGE): $(TOOLCHAIN_PATH)/.installed $(KERNEL_BUILD_PATH)/.config $(BUILD_PATH)/init-cpio-spec.txt $(IPERF_PATH)/src/iperf3 $(IPUTILS_PATH)/ping $(BASH_PATH)/bash $(IPROUTE2_PATH)/misc/ss $(IPROUTE2_PATH)/ip/ip $(IPTABLES_PATH)/iptables/xtables-legacy-multi $(NMAP_PATH)/ncat/ncat $(WIREGUARD_TOOLS_PATH)/src/wg $(BUILD_PATH)/init +$(KERNEL_BZIMAGE): $(TOOLCHAIN_PATH)/.installed $(KERNEL_BUILD_PATH)/.config $(BUILD_PATH)/init-cpio-spec.txt $(IPERF_PATH)/src/iperf3 $(IPUTILS_PATH)/ping $(BASH_PATH)/bash $(IPROUTE2_PATH)/misc/ss $(IPROUTE2_PATH)/ip/ip $(LIBMNL_PATH)/libmnl $(LIBNFTNL_PATH)/libnftnl $(NFTABLES_PATH)/src/nft $(NMAP_PATH)/ncat/ncat $(WIREGUARD_TOOLS_PATH)/src/wg $(BUILD_PATH)/init $(MAKE) -C $(KERNEL_PATH) O=$(KERNEL_BUILD_PATH) ARCH=$(KERNEL_ARCH) CROSS_COMPILE=$(CROSS_COMPILE) .PHONY: $(KERNEL_BZIMAGE) @@ -421,15 +422,32 @@ $(IPROUTE2_PATH)/misc/ss: | $(IPROUTE2_PATH)/.installed $(USERSPACE_DEPS) $(MAKE) -C $(IPROUTE2_PATH) PREFIX=/ misc/ss $(STRIP) -s $@ -$(IPTABLES_PATH)/.installed: $(IPTABLES_TAR) +$(LIBMNL_PATH)/.installed: $(LIBMNL_TAR) mkdir -p $(BUILD_PATH) flock -s $<.lock tar -C $(BUILD_PATH) -xf $< - sed -i -e "/nfnetlink=[01]/s:=[01]:=0:" -e "/nfconntrack=[01]/s:=[01]:=0:" $(IPTABLES_PATH)/configure touch $@ -$(IPTABLES_PATH)/iptables/xtables-legacy-multi: | $(IPTABLES_PATH)/.installed $(USERSPACE_DEPS) - cd $(IPTABLES_PATH) && ./configure --prefix=/ $(CROSS_COMPILE_FLAG) --enable-static --disable-shared --disable-nftables --disable-bpf-compiler --disable-nfsynproxy --disable-libipq --disable-connlabel --with-kernel=$(BUILD_PATH)/include - $(MAKE) -C $(IPTABLES_PATH) 
+$(LIBMNL_PATH)/libmnl: | $(LIBMNL_PATH)/.installed $(USERSPACE_DEPS) + cd $(LIBMNL_PATH) && ./configure --prefix=$(TOOLCHAIN_PATH) $(CROSS_COMPILE_FLAG) --enable-static --disable-shared + $(MAKE) -C $(LIBMNL_PATH) install + +$(LIBNFTNL_PATH)/.installed: $(LIBNFTNL_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + touch $@ + +$(LIBNFTNL_PATH)/libnftnl: | $(LIBNFTNL_PATH)/.installed $(USERSPACE_DEPS) + cd $(LIBNFTNL_PATH) && PKG_CONFIG_PATH="$(TOOLCHAIN_PATH)/lib/pkgconfig" ./configure --prefix=$(TOOLCHAIN_PATH) $(CROSS_COMPILE_FLAG) --enable-static --disable-shared + $(MAKE) -C $(LIBNFTNL_PATH) install + +$(NFTABLES_PATH)/.installed: $(NFTABLES_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + touch $@ + +$(NFTABLES_PATH)/src/nft: | $(NFTABLES_PATH)/.installed $(USERSPACE_DEPS) + cd $(NFTABLES_PATH) && PKG_CONFIG_PATH="$(TOOLCHAIN_PATH)/lib/pkgconfig" ./configure --prefix=/ $(CROSS_COMPILE_FLAG) --enable-static --disable-shared --disable-debug --disable-man-doc --with-mini-gmp --without-cli + $(MAKE) -C $(NFTABLES_PATH) PREFIX=/ $(STRIP) -s $@ $(NMAP_PATH)/.installed: $(NMAP_TAR) diff --git a/tools/testing/selftests/wireguard/qemu/kernel.config b/tools/testing/selftests/wireguard/qemu/kernel.config index f314d3789f17..9930116ecd81 100644 --- a/tools/testing/selftests/wireguard/qemu/kernel.config +++ b/tools/testing/selftests/wireguard/qemu/kernel.config @@ -19,10 +19,9 @@ CONFIG_NETFILTER_XTABLES=y CONFIG_NETFILTER_XT_NAT=y CONFIG_NETFILTER_XT_MATCH_LENGTH=y CONFIG_NETFILTER_XT_MARK=y -CONFIG_IP_NF_IPTABLES=y -CONFIG_IP_NF_FILTER=y -CONFIG_IP_NF_MANGLE=y -CONFIG_IP_NF_NAT=y +CONFIG_NF_TABLES=m +CONFIG_NF_TABLES_INET=y +CONFIG_NFT_NAT=y CONFIG_IP_ADVANCED_ROUTER=y CONFIG_IP_MULTIPLE_TABLES=y CONFIG_IPV6_MULTIPLE_TABLES=y -- 2.46.0 From mirco.barone at polito.it Tue May 27 09:08:30 2025 From: mirco.barone at polito.it (Mirco Barone) Date: Tue, 27 May 2025 09:08:30 +0000 Subject: [PATCH] Enabling Threaded NAPI 
by Default
Message-ID:

Hi everyone,

While testing WireGuard with a large number of tunnels, we expected throughput to scale linearly with the number of active tunnels. Instead, we observed very poor performance due to a bottleneck caused by multiple NAPI functions stacking on the same CPU core, preventing the system from scaling effectively. More details are provided on page 3 of this paper:
https://netdevconf.info/0x18/docs/netdev-0x18-paper23-talk-paper.pdf

Since each peer has its own NAPI struct, the problem can potentially occur when many peers are created on the same machine.

The simple solution we found is to enable threaded NAPI, which considerably improves throughput in our testing conditions while showing no drawbacks in traditional deployment scenarios (i.e., a single tunnel). Hence, we feel we could slightly modify the code and move to threaded NAPI as the new default. Any comments?

The option to revert to NAPI handled by a softirq is still preserved, by simply changing the `/sys/class/net//threaded` flag.

-----------------------------------------------------------------------
CHANGES
-----------------------------------------------------------------------
 drivers/net/wireguard/device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c
index 45e9b908dbfb..bb77f54d7526 100644
--- a/drivers/net/wireguard/device.c
+++ b/drivers/net/wireguard/device.c
@@ -363,6 +363,7 @@ static int wg_newlink(struct net *src_net, struct net_device *dev,
 	ret = wg_ratelimiter_init();
 	if (ret < 0)
 		goto err_free_handshake_queue;
+	dev_set_threaded(dev,true);
 
 	ret = register_netdevice(dev);
 	if (ret < 0)

From Jason at zx2c4.com Tue May 27 12:17:22 2025
From: Jason at zx2c4.com (Jason A.
Donenfeld) Date: Tue, 27 May 2025 14:17:22 +0200 Subject: [PATCH] Enabling Threaded NAPI by Default In-Reply-To: References: Message-ID: Hi, Indeed I'm interested in this, but I need this as a proper git formatted patch with a commit message that has real information: - What kind of speedups and under which circumstances? - Are there any known performance regressions? Small packets? Bursty traffic? - Why is this not enabled by default everywhere and what makes WireGuard special? And so forth. All of this should be in a normally written git message, with the patch sent using `git send-email`. Thanks, Jason From mirco.barone at polito.it Wed May 28 17:26:34 2025 From: mirco.barone at polito.it (Mirco Barone) Date: Wed, 28 May 2025 17:26:34 +0000 Subject: R: [PATCH] Enabling Threaded NAPI by Default In-Reply-To: References: Message-ID: This patch enables threaded NAPI by default for WireGuard devices in response to low performance behavior that we observed when multiple tunnels (and thus multiple wg devices) are deployed on a single host. This affects any kind of multi-tunnel deployment, regardless of whether the tunnels share the same endpoints or not (i.e., a VPN concentrator type of gateway would also be affected). The problem is caused by the fact that, in case of a traffic surge that involves multiple tunnels at the same time, the polling of the NAPI instance of all these wg devices tends to converge onto the same core, causing underutilization of the CPU and bottlenecking performance. This happens because NAPI polling is hosted by default in softirq context, but the WireGuard driver only raises this softirq after the rx peer queue has been drained, which doesn't happen during high traffic. In this case, the softirq already active on a core is reused instead of raising a new one. As a result, once two or more tunnel softirqs have been scheduled on the same core, they remain pinned there until the surge ends. 
In our experiments, this almost always leads to all tunnel NAPIs being handled on a single core shortly after a surge begins, limiting scalability to less than 3× the performance of a single tunnel, despite plenty of unused CPU cores being available. The proposed mitigation is to enable threaded NAPI for all WireGuard devices. This moves the NAPI polling context to a dedicated per-device kernel thread, allowing the scheduler to balance the load across all available cores. On our 32-core gateways, enabling threaded NAPI yields a ~4× performance improvement with 16 tunnels, increasing throughput from ~13 Gbps to ~48 Gbps. Meanwhile, CPU usage on the receiver (which is the bottleneck) jumps from 20% to 100%. We have found no performance regressions in any scenario we tested. Single-tunnel throughput remains unchanged. More details are available in our Netdev paper: https://netdevconf.info/0x18/docs/netdev-0x18-paper23-talk-paper.pdf --- drivers/net/wireguard/device.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c index 45e9b908dbfb..bb77f54d7526 100644 --- a/drivers/net/wireguard/device.c +++ b/drivers/net/wireguard/device.c @@ -363,6 +363,7 @@ static int wg_newlink(struct net *src_net, struct net_device *dev, ret = wg_ratelimiter_init(); if (ret < 0) goto err_free_handshake_queue; + dev_set_threaded(dev,true); ret = register_netdevice(dev); if (ret < 0) -- 2.34.1 From Jason at zx2c4.com Wed May 28 18:10:52 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Wed, 28 May 2025 20:10:52 +0200 Subject: R: [PATCH] Enabling Threaded NAPI by Default In-Reply-To: References: Message-ID: On Wed, May 28, 2025 at 05:26:34PM +0000, Mirco Barone wrote: > This happens because NAPI polling is hosted by default in softirq > context, but the WireGuard driver only raises this softirq after the rx > peer queue has been drained, which doesn't happen during high traffic.
> In this case, the softirq already active on a core is reused instead of > raising a new one. > > As a result, once two or more tunnel softirqs have been scheduled on > the same core, they remain pinned there until the surge ends. > > In our experiments, this almost always leads to all tunnel NAPIs being > handled on a single core shortly after a surge begins, limiting > scalability to less than 3× the performance of a single tunnel, despite > plenty of unused CPU cores being available. So *that's* what's been going on! Holy Moses, nice discovery. > On our 32-core gateways, enabling threaded NAPI yields a ~4× performance > improvement with 16 tunnels, increasing throughput from ~13 Gbps to > ~48 Gbps. Meanwhile, CPU usage on the receiver (which is the bottleneck) > jumps from 20% to 100%. Shut up and take my money! Patch applied. > --- > drivers/net/wireguard/device.c | 1 + > 1 file changed, 1 insertion(+) Actually, no, wait, sorry, this needs your Signed-off-by line, per the kernel contribution guidelines, for me to be able to push it. Can you just reply to your initial patch email, quote all of the text, and append the string: Signed-off-by: Mirco Barone And then I'll push this up. Sorry for the hassle; kernel development has its particularities. Jason From mirco.barone at polito.it Thu May 29 10:22:31 2025 From: mirco.barone at polito.it (Mirco Barone) Date: Thu, 29 May 2025 12:22:31 +0200 Subject: [PATCH] Enabling Threaded NAPI by Default In-Reply-To: References: Message-ID: <110667b9-ec00-44d4-a46d-2c5fb1892455@polito.it> On 5/28/2025 7:26 PM, Mirco Barone wrote: > This patch enables threaded NAPI by default for WireGuard devices in > response to low performance behavior that we observed when multiple > tunnels (and thus multiple wg devices) are deployed on a single host. > This affects any kind of multi-tunnel deployment, regardless of whether > the tunnels share the same endpoints or not (i.e., a VPN concentrator > type of gateway would also be affected).
> > The problem is caused by the fact that, in case of a traffic surge that > involves multiple tunnels at the same time, the polling of the NAPI > instance of all these wg devices tends to converge onto the same core, > causing underutilization of the CPU and bottlenecking performance. > > This happens because NAPI polling is hosted by default in softirq > context, but the WireGuard driver only raises this softirq after the rx > peer queue has been drained, which doesn't happen during high traffic. > In this case, the softirq already active on a core is reused instead of > raising a new one. > > As a result, once two or more tunnel softirqs have been scheduled on > the same core, they remain pinned there until the surge ends. > > In our experiments, this almost always leads to all tunnel NAPIs being > handled on a single core shortly after a surge begins, limiting > scalability to less than 3× the performance of a single tunnel, despite > plenty of unused CPU cores being available. > > The proposed mitigation is to enable threaded NAPI for all WireGuard > devices. This moves the NAPI polling context to a dedicated per-device > kernel thread, allowing the scheduler to balance the load across all > available cores. > > On our 32-core gateways, enabling threaded NAPI yields a ~4× performance > improvement with 16 tunnels, increasing throughput from ~13 Gbps to > ~48 Gbps. Meanwhile, CPU usage on the receiver (which is the bottleneck) > jumps from 20% to 100%. > > We have found no performance regressions in any scenario we tested. > Single-tunnel throughput remains unchanged.
> > More details are available in our Netdev paper: > https://netdevconf.info/0x18/docs/netdev-0x18-paper23-talk-paper.pdf > --- > drivers/net/wireguard/device.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c > index 45e9b908dbfb..bb77f54d7526 100644 > --- a/drivers/net/wireguard/device.c > +++ b/drivers/net/wireguard/device.c > @@ -363,6 +363,7 @@ static int wg_newlink(struct net *src_net, struct net_device *dev, > ret = wg_ratelimiter_init(); > if (ret < 0) > goto err_free_handshake_queue; > + dev_set_threaded(dev,true); > > ret = register_netdevice(dev); > if (ret < 0) Signed-off-by: Mirco Barone
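For readers who want to try the per-interface switch mentioned at the start of this thread (the `threaded` flag under `/sys/class/net/`), the following is a minimal userspace sketch, not part of the patch. It assumes the kernel exposes that sysfs attribute and accepts "0" or "1" written to it; the helper names `napi_threaded_path` and `napi_set_threaded` are hypothetical and exist only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "/sys/class/net/<ifname>/threaded" into buf.
 * Returns 0 on success, -1 if the name is empty, contains '/', or the
 * resulting path does not fit in buf.
 */
static int napi_threaded_path(char *buf, size_t len, const char *ifname)
{
	int n;

	if (!ifname || !*ifname || strchr(ifname, '/'))
		return -1;
	n = snprintf(buf, len, "/sys/class/net/%s/threaded", ifname);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write "1" (threaded NAPI) or "0" (softirq NAPI)
 * to the interface's flag. Assumes sufficient privileges (typically root).
 * Returns 0 on success, -1 on any failure.
 */
static int napi_set_threaded(const char *ifname, int enable)
{
	char path[128];
	FILE *f;
	int ret = 0;

	if (napi_threaded_path(path, sizeof(path), ifname) < 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(enable ? "1" : "0", f) == EOF)
		ret = -1;
	/* The write may only be flushed (and rejected) at close time. */
	if (fclose(f) == EOF)
		ret = -1;
	return ret;
}
```

Calling `napi_set_threaded("wg0", 0)` from a small `main()` would, under these assumptions, switch an interface named `wg0` (an example name) back to softirq-based polling, and passing `1` would re-enable the threaded mode that the patch makes the default.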