From syzbot+c4c7bf27f6b0c4bd97fe at syzkaller.appspotmail.com Mon Jun 2 13:21:34 2025 From: syzbot+c4c7bf27f6b0c4bd97fe at syzkaller.appspotmail.com (syzbot) Date: Mon, 02 Jun 2025 06:21:34 -0700 Subject: [syzbot] [net?] general protection fault in veth_xdp_rcv Message-ID: <683da55e.a00a0220.d8eae.0052.GAE@google.com> Hello, syzbot found the following issue on: HEAD commit: 4cb6c8af8591 selftests/filesystems: Fix build of anon_inod.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=11e8300c580000 kernel config: https://syzkaller.appspot.com/x/.config?x=5319177d225a42f1 dashboard link: https://syzkaller.appspot.com/bug?extid=c4c7bf27f6b0c4bd97fe compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 Unfortunately, I don't have any reproducer for this issue yet. Downloadable assets: disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-4cb6c8af.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/bc0e5dfdd686/vmlinux-4cb6c8af.xz kernel image: https://storage.googleapis.com/syzbot-assets/2cdd323de6ca/bzImage-4cb6c8af.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+c4c7bf27f6b0c4bd97fe at syzkaller.appspotmail.com Oops: general protection fault, probably for non-canonical address 0xdffffc0000000098: 0000 [#1] SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x00000000000004c0-0x00000000000004c7] CPU: 1 UID: 0 PID: 5975 Comm: kworker/1:4 Not tainted 6.15.0-syzkaller-10402-g4cb6c8af8591 #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Workqueue: wg-kex-wg0 wg_packet_handshake_receive_worker RIP: 0010:netdev_get_tx_queue include/linux/netdevice.h:2636 [inline] RIP: 0010:veth_xdp_rcv.constprop.0+0x142/0xda0 drivers/net/veth.c:912 Code: 54 d9 31 fb 45 85 e4 0f 85 db 08 00 00 e8 06 de 31 fb 48 8d bd c0 04 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 
c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 18 0c 00 00 44 8b a5 c0 04 00 RSP: 0018:ffffc900006a09b8 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff868a1686 RDX: 0000000000000098 RSI: ffffffff868a0d9a RDI: 00000000000004c0 RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000001 R11: ffffc900006a0ff8 R12: 0000000000000001 R13: 1ffff920000d4145 R14: ffffc900006a0e58 R15: ffff8880503d0000 FS: 0000000000000000(0000) GS:ffff8880d686e000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe5e3a6ad58 CR3: 000000000e382000 CR4: 0000000000352ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: veth_poll+0x19c/0x9c0 drivers/net/veth.c:979 __napi_poll.constprop.0+0xba/0x550 net/core/dev.c:7414 napi_poll net/core/dev.c:7478 [inline] net_rx_action+0xa9f/0xfe0 net/core/dev.c:7605 handle_softirqs+0x219/0x8e0 kernel/softirq.c:579 do_softirq kernel/softirq.c:480 [inline] do_softirq+0xb2/0xf0 kernel/softirq.c:467 __local_bh_enable_ip+0x100/0x120 kernel/softirq.c:407 local_bh_enable include/linux/bottom_half.h:33 [inline] fpregs_unlock arch/x86/include/asm/fpu/api.h:77 [inline] kernel_fpu_end+0x5e/0x70 arch/x86/kernel/fpu/core.c:476 blake2s_compress+0x7f/0xe0 arch/x86/lib/crypto/blake2s-glue.c:46 blake2s_final+0xc9/0x150 lib/crypto/blake2s.c:54 hmac.constprop.0+0x335/0x420 drivers/net/wireguard/noise.c:333 kdf.constprop.0+0x122/0x280 drivers/net/wireguard/noise.c:360 mix_dh+0xe8/0x150 drivers/net/wireguard/noise.c:413 wg_noise_handshake_consume_initiation+0x265/0x880 drivers/net/wireguard/noise.c:608 wg_receive_handshake_packet+0x219/0xbf0 drivers/net/wireguard/receive.c:144 wg_packet_handshake_receive_worker+0x17f/0x3a0 drivers/net/wireguard/receive.c:213 process_one_work+0x9cc/0x1b70 kernel/workqueue.c:3238 process_scheduled_works kernel/workqueue.c:3321 [inline] worker_thread+0x6c8/0xf10 
kernel/workqueue.c:3402 kthread+0x3c2/0x780 kernel/kthread.c:464 ret_from_fork+0x5d4/0x6f0 arch/x86/kernel/process.c:148 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:netdev_get_tx_queue include/linux/netdevice.h:2636 [inline] RIP: 0010:veth_xdp_rcv.constprop.0+0x142/0xda0 drivers/net/veth.c:912 Code: 54 d9 31 fb 45 85 e4 0f 85 db 08 00 00 e8 06 de 31 fb 48 8d bd c0 04 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 18 0c 00 00 44 8b a5 c0 04 00 RSP: 0018:ffffc900006a09b8 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff868a1686 RDX: 0000000000000098 RSI: ffffffff868a0d9a RDI: 00000000000004c0 RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000001 R11: ffffc900006a0ff8 R12: 0000000000000001 R13: 1ffff920000d4145 R14: ffffc900006a0e58 R15: ffff8880503d0000 FS: 0000000000000000(0000) GS:ffff8880d686e000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe5e3a6ad58 CR3: 000000000e382000 CR4: 0000000000352ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 ---------------- Code disassembly (best guess): 0: 54 push %rsp 1: d9 31 fnstenv (%rcx) 3: fb sti 4: 45 85 e4 test %r12d,%r12d 7: 0f 85 db 08 00 00 jne 0x8e8 d: e8 06 de 31 fb call 0xfb31de18 12: 48 8d bd c0 04 00 00 lea 0x4c0(%rbp),%rdi 19: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 20: fc ff df 23: 48 89 fa mov %rdi,%rdx 26: 48 c1 ea 03 shr $0x3,%rdx * 2a: 0f b6 04 02 movzbl (%rdx,%rax,1),%eax <-- trapping instruction 2e: 84 c0 test %al,%al 30: 74 08 je 0x3a 32: 3c 03 cmp $0x3,%al 34: 0f 8e 18 0c 00 00 jle 0xc52 3a: 44 rex.R 3b: 8b .byte 0x8b 3c: a5 movsl %ds:(%rsi),%es:(%rdi) 3d: c0 .byte 0xc0 3e: 04 00 add $0x0,%al --- This report is generated by a bot. It may contain errors. 
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller at googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

From yury.norov at gmail.com Wed Jun 4 23:36:55 2025
From: yury.norov at gmail.com (Yury Norov)
Date: Wed, 4 Jun 2025 19:36:55 -0400
Subject: [PATCH] wireguard/queueing: simplify wg_cpumask_next_online()
Message-ID: <20250604233656.41896-1-yury.norov@gmail.com>

wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the
function significantly simpler. While there, fix opencoded cpu_online()
too.

Signed-off-by: Yury Norov
---
 drivers/net/wireguard/queueing.h | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
index 7eb76724b3ed..3bfe16f71af0 100644
--- a/drivers/net/wireguard/queueing.h
+++ b/drivers/net/wireguard/queueing.h
@@ -104,17 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating)
 
 static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
 {
-	unsigned int cpu = *stored_cpu, cpu_index, i;
+	if (likely(*stored_cpu < nr_cpu_ids && cpu_online(*stored_cpu)))
+		return cpu;
 
-	if (unlikely(cpu >= nr_cpu_ids ||
-		     !cpumask_test_cpu(cpu, cpu_online_mask))) {
-		cpu_index = id % cpumask_weight(cpu_online_mask);
-		cpu = cpumask_first(cpu_online_mask);
-		for (i = 0; i < cpu_index; ++i)
-			cpu = cpumask_next(cpu, cpu_online_mask);
-		*stored_cpu = cpu;
-	}
-	return cpu;
+	*stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask);
+	return *stored_cpu;
 }
 
 /* This function is racy, in the sense that it's called while last_cpu is
-- 
2.43.0

From yury.norov at gmail.com Thu Jun 5 04:23:29 2025
From: yury.norov at gmail.com (Yury Norov)
Date: Thu, 5 Jun 2025 00:23:29 -0400
Subject: [PATCH] wireguard/queueing: simplify wg_cpumask_next_online()
In-Reply-To: <20250604233656.41896-1-yury.norov@gmail.com>
References: <20250604233656.41896-1-yury.norov@gmail.com>
Message-ID:

On Wed, Jun 04, 2025 at 07:36:55PM -0400, Yury Norov wrote:
> wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the
> function significantly simpler. While there, fix opencoded cpu_online()
> too.
>
> Signed-off-by: Yury Norov
> ---
>  drivers/net/wireguard/queueing.h | 14 ++++----------
>  1 file changed, 4 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
> index 7eb76724b3ed..3bfe16f71af0 100644
> --- a/drivers/net/wireguard/queueing.h
> +++ b/drivers/net/wireguard/queueing.h
> @@ -104,17 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating)
>
>  static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
>  {
> -	unsigned int cpu = *stored_cpu, cpu_index, i;
> +	if (likely(*stored_cpu < nr_cpu_ids && cpu_online(*stored_cpu)))
> +		return cpu;

Oops... This should be
		return *stored_cpu;

I'll resend, sorry for noise.

>
> -	if (unlikely(cpu >= nr_cpu_ids ||
> -		     !cpumask_test_cpu(cpu, cpu_online_mask))) {
> -		cpu_index = id % cpumask_weight(cpu_online_mask);
> -		cpu = cpumask_first(cpu_online_mask);
> -		for (i = 0; i < cpu_index; ++i)
> -			cpu = cpumask_next(cpu, cpu_online_mask);
> -		*stored_cpu = cpu;
> -	}
> -	return cpu;
> +	*stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask);
> +	return *stored_cpu;
>  }
>
>  /* This function is racy, in the sense that it's called while last_cpu is
> --
> 2.43.0

From vegeta at tuxpowered.net Thu Jun 5 10:27:22 2025
From: vegeta at tuxpowered.net (Kajetan Staszkiewicz)
Date: Thu, 5 Jun 2025 12:27:22 +0200
Subject: are WG clients expected to automatically handle it when the endpoint is within the AllowedIPs
In-Reply-To: <1a897464d3fb56184b83cb6ac7b4a2407047b10e.camel@scientia.org>
References: <1a897464d3fb56184b83cb6ac7b4a2407047b10e.camel@scientia.org>
Message-ID: <8e9f8b10-f438-40d5-a03a-85ef64632b11@tuxpowered.net>

On 2025-05-23 00:36, Christoph Anton Mitterer wrote:
> (re-posting, now that the list seems to work again)
>
>
> Hey folks.
>
> In science/education, many organisations (I could find the total list
> only in the Android app, but there it seems to be several 1000) use
> eduVPN to provide VPN access to their users.
> It comes with a client which, AFAIU, either sets up some OpenVPN or WG
> VPN.
>
> I've previously used the OpenVPN profile files successfully with
> NetworkManager but now wanted to switch to WG, and again I don't wanna
> use the eduVPN client, because I think this should be done with the
> native tools that integrate nicely into the system (e.g. NM for desktop
> environments, ifupdown/systemd-networkd/etc. for servers).
>
> [...]
>
> Using that config with NM fails

NetworkManager's Wireguard implementation already has a way of
supporting it by using fwmarks. It's just that the fwmark operation is
not automatically turned on unless the tunnel is configured with
AllowedIPs=::/0

See my comment and a workaround which always forces the fwmark operation
on
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1157#note_2426757

-- 
| pozdrawiam / regards | Powered by Debian and FreeBSD  |
| Kajetan Staszkiewicz | www: http://tuxpowered.net     |
|                      | matrix: @vegeta:tuxpowered.net |
`----------------------^--------------------------------'

From atann at alphasrv.net Wed Jun 4 14:53:33 2025
From: atann at alphasrv.net (Andre Tann)
Date: Wed, 4 Jun 2025 16:53:33 +0200
Subject: Delay in ipv6
Message-ID: <7e977585-5b93-4591-94f7-bf33c8923543@alphasrv.net>

Hi all,

I hope this is not offtopic here. If so, pls let me know a better place
for this question.

I configured wireguard to route both IPv4 and IPv6. Both protocols work
fine. But test-ipv6.com only gives 9/10 points because the browser does
not use IPv6 even though it is available.

Then I investigated a bit and found this:

ping -4 dns.google => ping sequence starts immediately
ping -6 dns.google => .5 secs delay => ping sequence starts
ping -6 dns.google => ping sequence starts immediately

i.e.: On the first try, ping6 takes longer, but the second time there is
no delay anymore.

I suspected DNS trouble, but pinging 2001:4860:4860::8844 shows the
exact same behavior: delay the first time, no delay next time.

Yet I couldn't determine how long I need to wait until a second try
becomes a first one again, i.e. the delay shows up again.

Any ideas where to look next?

-- 
Andre Tann

From Jason at zx2c4.com Thu Jun 5 11:33:19 2025
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Thu, 5 Jun 2025 13:33:19 +0200
Subject: [PATCH] wireguard/queueing: simplify wg_cpumask_next_online()
In-Reply-To:
References: <20250604233656.41896-1-yury.norov@gmail.com>
Message-ID:

On Thu, Jun 05, 2025 at 12:23:29AM -0400, Yury Norov wrote:
> On Wed, Jun 04, 2025 at 07:36:55PM -0400, Yury Norov wrote:
> > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the
> > function significantly simpler. While there, fix opencoded cpu_online()
> > too.
> >
> > Signed-off-by: Yury Norov
> > ---
> >  drivers/net/wireguard/queueing.h | 14 ++++----------
> >  1 file changed, 4 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
> > index 7eb76724b3ed..3bfe16f71af0 100644
> > --- a/drivers/net/wireguard/queueing.h
> > +++ b/drivers/net/wireguard/queueing.h
> > @@ -104,17 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating)
> >
> >  static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
> >  {
> > -	unsigned int cpu = *stored_cpu, cpu_index, i;
> > +	if (likely(*stored_cpu < nr_cpu_ids && cpu_online(*stored_cpu)))
> > +		return cpu;
>
> Oops...
This should be
> 		return *stored_cpu;

Maybe it's best to structure the function something like:

	unsigned int cpu = *stored_cpu;
	if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu)))
		cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask);
	return cpu;

From rm at romanrm.net Thu Jun 5 12:04:59 2025
From: rm at romanrm.net (Roman Mamedov)
Date: Thu, 5 Jun 2025 17:04:59 +0500
Subject: Delay in ipv6
In-Reply-To: <7e977585-5b93-4591-94f7-bf33c8923543@alphasrv.net>
References: <7e977585-5b93-4591-94f7-bf33c8923543@alphasrv.net>
Message-ID: <20250605170459.43439532@nvm>

On Wed, 4 Jun 2025 16:53:33 +0200
Andre Tann wrote:

> Hi all,
>
> I hope this is not offtopic here. If so, pls let me know a better place
> for this question.
>
> I configured wireguard to route both IPv4 and IPv6. Both protocols work
> fine. But test-ipv6.com only gives 9/10 points because the browser does
> not use IPv6 even though it is available.
>
> Then I investigated a bit and found this:
>
> ping -4 dns.google => ping sequence starts immediately
> ping -6 dns.google => .5 secs delay => ping sequence starts
> ping -6 dns.google => ping sequence starts immediately
>
> i.e.: On the first try, ping6 takes longer, but the second time there is
> no delay anymore.

Hello,

Which DNS resolvers do you use? Try 8.8.8.8 or 1.1.1.1 at first, and then
their v6 equivalents.

> I suspected DNS trouble, but pinging 2001:4860:4860::8844 shows the
> exact same behavior: delay the first time, no delay next time.

This might be caused by DNS again, trying to resolve the PTR record for the
IP. Recheck if "ping -n" starts in this case without a delay.

-- 
With respect,
Roman

From yury.norov at gmail.com Thu Jun 5 14:24:32 2025
From: yury.norov at gmail.com (Yury Norov)
Date: Thu, 5 Jun 2025 10:24:32 -0400
Subject: [PATCH] wireguard/queueing: simplify wg_cpumask_next_online()
In-Reply-To:
References: <20250604233656.41896-1-yury.norov@gmail.com>
Message-ID:

On Thu, Jun 05, 2025 at 01:33:19PM +0200, Jason A.
Donenfeld wrote:
> On Thu, Jun 05, 2025 at 12:23:29AM -0400, Yury Norov wrote:
> > On Wed, Jun 04, 2025 at 07:36:55PM -0400, Yury Norov wrote:
> > > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the
> > > function significantly simpler. While there, fix opencoded cpu_online()
> > > too.
> > >
> > > Signed-off-by: Yury Norov
> > > ---
> > >  drivers/net/wireguard/queueing.h | 14 ++++----------
> > >  1 file changed, 4 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
> > > index 7eb76724b3ed..3bfe16f71af0 100644
> > > --- a/drivers/net/wireguard/queueing.h
> > > +++ b/drivers/net/wireguard/queueing.h
> > > @@ -104,17 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating)
> > >
> > >  static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
> > >  {
> > > -	unsigned int cpu = *stored_cpu, cpu_index, i;
> > > +	if (likely(*stored_cpu < nr_cpu_ids && cpu_online(*stored_cpu)))
> > > +		return cpu;
> >
> > Oops... This should be
> > 		return *stored_cpu;
>
> Maybe it's best to structure the function something like:
>
> 	unsigned int cpu = *stored_cpu;
> 	if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu)))
> 		cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask);
> 	return cpu;

If you prefer. I'll send v2 shortly.

From Jason at zx2c4.com Thu Jun 5 15:47:32 2025
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Thu, 5 Jun 2025 17:47:32 +0200
Subject: [PATCH] wireguard/queueing: simplify wg_cpumask_next_online()
In-Reply-To:
References: <20250604233656.41896-1-yury.norov@gmail.com>
Message-ID:

On Thu, Jun 05, 2025 at 10:24:32AM -0400, Yury Norov wrote:
> On Thu, Jun 05, 2025 at 01:33:19PM +0200, Jason A. Donenfeld wrote:
> > On Thu, Jun 05, 2025 at 12:23:29AM -0400, Yury Norov wrote:
> > > On Wed, Jun 04, 2025 at 07:36:55PM -0400, Yury Norov wrote:
> > > > wg_cpumask_choose_online() opencodes cpumask_nth().
Use it and make the
> > > > function significantly simpler. While there, fix opencoded cpu_online()
> > > > too.
> > > >
> > > > Signed-off-by: Yury Norov
> > > > ---
> > > >  drivers/net/wireguard/queueing.h | 14 ++++----------
> > > >  1 file changed, 4 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
> > > > index 7eb76724b3ed..3bfe16f71af0 100644
> > > > --- a/drivers/net/wireguard/queueing.h
> > > > +++ b/drivers/net/wireguard/queueing.h
> > > > @@ -104,17 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating)
> > > >
> > > >  static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
> > > >  {
> > > > -	unsigned int cpu = *stored_cpu, cpu_index, i;
> > > > +	if (likely(*stored_cpu < nr_cpu_ids && cpu_online(*stored_cpu)))
> > > > +		return cpu;
> > >
> > > Oops... This should be
> > > 		return *stored_cpu;
> >
> > Maybe it's best to structure the function something like:
> >
> > 	unsigned int cpu = *stored_cpu;
> > 	if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu)))
> > 		cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask);
> > 	return cpu;
>
> If you prefer. I'll send v2 shortly

While you're at it, fix the commit subject to match the format used by
every single other wireguard commit. Run
`$ git log --oneline drivers/net/wireguard` to see what I mean.

From calestyo at scientia.org Sun Jun 8 21:11:01 2025
From: calestyo at scientia.org (Christoph Anton Mitterer)
Date: Sun, 08 Jun 2025 23:11:01 +0200
Subject: are WG clients expected to automatically handle it when the endpoint is within the AllowedIPs
In-Reply-To: <8e9f8b10-f438-40d5-a03a-85ef64632b11@tuxpowered.net>
References: <1a897464d3fb56184b83cb6ac7b4a2407047b10e.camel@scientia.org> <8e9f8b10-f438-40d5-a03a-85ef64632b11@tuxpowered.net>
Message-ID: <27eacd15c147889708652227cedadf5f01d79d8e.camel@scientia.org>

Hey.

On Thu, 2025-06-05 at 12:27 +0200, Kajetan Staszkiewicz wrote:
> NetworkManager's Wireguard implementation already has a way of
> supporting it by using fwmarks. It's just that the fwmark operation is
> not automatically turned on unless the tunnel is configured with
> AllowedIPs=::/0

AFAIU, even the AllowedIPs=::/0 case was only fixed[0] (in the sense of
making it work out-of-the-box) recently, right?

But nevertheless, my main point was: is it expected to be handled
*automatically* by WG clients?
It's clear that one can always make it work manually somehow, e.g. the
way from your comment, or how I did it by adding a specific route for
the endpoint in [1] (though your approach is probably cleaner).

And at least as of now, neither NM nor wg-quick seems to work
out-of-the-box with a split profile as described before.

> See my comment and a workaround which always forces the fwmark
> operation on
> https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1157#note_2426757

I would rather not have that imposed on "end-users"... it cannot be
ruled out that they get it wrong and perhaps even compromise security.

Cheers,
Chris.

[0] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/2158
[1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1737

From petrm at nvidia.com Mon Jun 9 20:50:17 2025
From: petrm at nvidia.com (Petr Machata)
Date: Mon, 9 Jun 2025 22:50:17 +0200
Subject: [PATCH net-next 01/14] net: ipv4: Add a flags argument to iptunnel_xmit(), udp_tunnel_xmit_skb()
In-Reply-To:
References:
Message-ID:

iptunnel_xmit() erases the contents of the SKB control block. In order to
be able to set particular IPCB flags on the SKB, add a corresponding
parameter, and propagate it to udp_tunnel_xmit_skb() as well.

In one of the following patches, VXLAN driver will use this facility to
mark packets as subject to IP multicast routing.

Signed-off-by: Petr Machata Reviewed-by: Ido Schimmel --- Notes: CC: Pablo Neira Ayuso CC: osmocom-net-gprs at lists.osmocom.org CC: Andrew Lunn CC: Taehee Yoo CC: Antonio Quartulli CC: "Jason A. Donenfeld" CC: wireguard at lists.zx2c4.com CC: Marcelo Ricardo Leitner CC: linux-sctp at vger.kernel.org CC: Jon Maloy CC: tipc-discussion at lists.sourceforge.net drivers/net/amt.c | 9 ++++++--- drivers/net/bareudp.c | 4 ++-- drivers/net/geneve.c | 4 ++-- drivers/net/gtp.c | 10 ++++++---- drivers/net/ovpn/udp.c | 2 +- drivers/net/vxlan/vxlan_core.c | 2 +- drivers/net/wireguard/socket.c | 2 +- include/net/ip_tunnels.h | 2 +- include/net/udp_tunnel.h | 2 +- net/ipv4/ip_tunnel.c | 4 ++-- net/ipv4/ip_tunnel_core.c | 4 +++- net/ipv4/udp_tunnel_core.c | 5 +++-- net/ipv6/sit.c | 2 +- net/sctp/protocol.c | 3 ++- net/tipc/udp_media.c | 2 +- 15 files changed, 33 insertions(+), 24 deletions(-) diff --git a/drivers/net/amt.c b/drivers/net/amt.c index 734a0b3242a9..d0f719531499 100644 --- a/drivers/net/amt.c +++ b/drivers/net/amt.c @@ -1046,7 +1046,8 @@ static bool amt_send_membership_update(struct amt_dev *amt, amt->gw_port, amt->relay_port, false, - false); + false, + 0); amt_update_gw_status(amt, AMT_STATUS_SENT_UPDATE, true); return false; } @@ -1103,7 +1104,8 @@ static void amt_send_multicast_data(struct amt_dev *amt, amt->relay_port, tunnel->source_port, false, - false); + false, + 0); } static bool amt_send_membership_query(struct amt_dev *amt, @@ -1161,7 +1163,8 @@ static bool amt_send_membership_query(struct amt_dev *amt, amt->relay_port, tunnel->source_port, false, - false); + false, + 0); amt_update_relay_status(tunnel, AMT_STATUS_SENT_QUERY, true); return false; } diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index a9dffdcac805..5e613080d3f8 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -362,8 +362,8 @@ static int bareudp_xmit_skb(struct sk_buff *skb, struct net_device *dev, udp_tunnel_xmit_skb(rt, sock->sk, skb, saddr, 
info->key.u.ipv4.dst, tos, ttl, df, sport, bareudp->port, !net_eq(bareudp->net, dev_net(bareudp->dev)), - !test_bit(IP_TUNNEL_CSUM_BIT, - info->key.tun_flags)); + !test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags), + 0); return 0; free_dst: diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index ffc15a432689..c668e8b00ed2 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -921,8 +921,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev, udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, saddr, info->key.u.ipv4.dst, tos, ttl, df, sport, geneve->cfg.info.key.tp_dst, !net_eq(geneve->net, dev_net(geneve->dev)), - !test_bit(IP_TUNNEL_CSUM_BIT, - info->key.tun_flags)); + !test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags), + 0); return 0; } diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index d4dec741c7f4..14584793fe4e 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -446,7 +446,8 @@ static int gtp0_send_echo_resp_ip(struct gtp_dev *gtp, struct sk_buff *skb) htons(GTP0_PORT), htons(GTP0_PORT), !net_eq(sock_net(gtp->sk1u), dev_net(gtp->dev)), - false); + false, + 0); return 0; } @@ -704,7 +705,8 @@ static int gtp1u_send_echo_resp(struct gtp_dev *gtp, struct sk_buff *skb) htons(GTP1U_PORT), htons(GTP1U_PORT), !net_eq(sock_net(gtp->sk1u), dev_net(gtp->dev)), - false); + false, + 0); return 0; } @@ -1304,7 +1306,7 @@ static netdev_tx_t gtp_dev_xmit(struct sk_buff *skb, struct net_device *dev) pktinfo.gtph_port, pktinfo.gtph_port, !net_eq(sock_net(pktinfo.pctx->sk), dev_net(dev)), - false); + false, 0); break; case AF_INET6: #if IS_ENABLED(CONFIG_IPV6) @@ -2405,7 +2407,7 @@ static int gtp_genl_send_echo_req(struct sk_buff *skb, struct genl_info *info) port, port, !net_eq(sock_net(sk), dev_net(gtp->dev)), - false); + false, 0); return 0; } diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c index bff00946eae2..d866e6bfda70 100644 --- a/drivers/net/ovpn/udp.c +++ b/drivers/net/ovpn/udp.c @@ -199,7 +199,7 @@ static int 
ovpn_udp4_output(struct ovpn_peer *peer, struct ovpn_bind *bind, transmit: udp_tunnel_xmit_skb(rt, sk, skb, fl.saddr, fl.daddr, 0, ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport, - fl.fl4_dport, false, sk->sk_no_check_tx); + fl.fl4_dport, false, sk->sk_no_check_tx, 0); ret = 0; err: local_bh_enable(); diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index a56d7239b127..d7a5d8873a1b 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -2522,7 +2522,7 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, udp_tunnel_xmit_skb(rt, sock4->sock->sk, skb, saddr, pkey->u.ipv4.dst, tos, ttl, df, - src_port, dst_port, xnet, !udp_sum); + src_port, dst_port, xnet, !udp_sum, 0); #if IS_ENABLED(CONFIG_IPV6) } else { struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock); diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c index 0414d7a6ce74..88e685667bc0 100644 --- a/drivers/net/wireguard/socket.c +++ b/drivers/net/wireguard/socket.c @@ -84,7 +84,7 @@ static int send4(struct wg_device *wg, struct sk_buff *skb, skb->ignore_df = 1; udp_tunnel_xmit_skb(rt, sock, skb, fl.saddr, fl.daddr, ds, ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport, - fl.fl4_dport, false, false); + fl.fl4_dport, false, false, 0); goto out; err: diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index 0c3d571a04a1..8cf1380f3656 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -603,7 +603,7 @@ static inline int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, void iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, u8 proto, - u8 tos, u8 ttl, __be16 df, bool xnet); + u8 tos, u8 ttl, __be16 df, bool xnet, u16 ipcb_flags); struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md, gfp_t flags); int skb_tunnel_check_pmtu(struct sk_buff *skb, struct dst_entry *encap_dst, diff --git a/include/net/udp_tunnel.h 
b/include/net/udp_tunnel.h index 2df3b8344eb5..28102c8fd8a8 100644 --- a/include/net/udp_tunnel.h +++ b/include/net/udp_tunnel.h @@ -150,7 +150,7 @@ static inline void udp_tunnel_drop_rx_info(struct net_device *dev) void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb, __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df, __be16 src_port, __be16 dst_port, - bool xnet, bool nocheck); + bool xnet, bool nocheck, u16 ipcb_flags); int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk, struct sk_buff *skb, diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 678b8f96e3e9..aaeb5d16f0c9 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -668,7 +668,7 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, ip_tunnel_adj_headroom(dev, headroom); iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, proto, tos, ttl, - df, !net_eq(tunnel->net, dev_net(dev))); + df, !net_eq(tunnel->net, dev_net(dev)), 0); return; tx_error: DEV_STATS_INC(dev, tx_errors); @@ -857,7 +857,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, ip_tunnel_adj_headroom(dev, max_headroom); iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, protocol, tos, ttl, - df, !net_eq(tunnel->net, dev_net(dev))); + df, !net_eq(tunnel->net, dev_net(dev)), 0); return; #if IS_ENABLED(CONFIG_IPV6) diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c index f65d2f727381..cc9915543637 100644 --- a/net/ipv4/ip_tunnel_core.c +++ b/net/ipv4/ip_tunnel_core.c @@ -49,7 +49,8 @@ EXPORT_SYMBOL(ip6tun_encaps); void iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, __u8 proto, - __u8 tos, __u8 ttl, __be16 df, bool xnet) + __u8 tos, __u8 ttl, __be16 df, bool xnet, + u16 ipcb_flags) { int pkt_len = skb->len - skb_inner_network_offset(skb); struct net *net = dev_net(rt->dst.dev); @@ -62,6 +63,7 @@ void iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb, 
skb_clear_hash_if_not_l4(skb); skb_dst_set(skb, &rt->dst); memset(IPCB(skb), 0, sizeof(*IPCB(skb))); + IPCB(skb)->flags = ipcb_flags; /* Push down and install the IP header. */ skb_push(skb, sizeof(struct iphdr)); diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c index 2326548997d3..9efd62505916 100644 --- a/net/ipv4/udp_tunnel_core.c +++ b/net/ipv4/udp_tunnel_core.c @@ -169,7 +169,7 @@ EXPORT_SYMBOL_GPL(udp_tunnel_notify_del_rx_port); void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb, __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df, __be16 src_port, __be16 dst_port, - bool xnet, bool nocheck) + bool xnet, bool nocheck, u16 ipcb_flags) { struct udphdr *uh; @@ -185,7 +185,8 @@ void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb udp_set_csum(nocheck, skb, src, dst, skb->len); - iptunnel_xmit(sk, rt, skb, src, dst, IPPROTO_UDP, tos, ttl, df, xnet); + iptunnel_xmit(sk, rt, skb, src, dst, IPPROTO_UDP, tos, ttl, df, xnet, + ipcb_flags); } EXPORT_SYMBOL_GPL(udp_tunnel_xmit_skb); diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index a72dbca9e8fc..12496ba1b7d4 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -1035,7 +1035,7 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb, skb_set_inner_ipproto(skb, IPPROTO_IPV6); iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, protocol, tos, ttl, - df, !net_eq(tunnel->net, dev_net(dev))); + df, !net_eq(tunnel->net, dev_net(dev)), 0); return NETDEV_TX_OK; tx_error_icmp: diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index 8c3b80c4d40b..bfbb73e359f5 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -1103,7 +1103,8 @@ static inline int sctp_v4_xmit(struct sk_buff *skb, struct sctp_transport *t) skb_set_inner_ipproto(skb, IPPROTO_SCTP); udp_tunnel_xmit_skb(dst_rtable(dst), sk, skb, fl4->saddr, fl4->daddr, dscp, ip4_dst_hoplimit(dst), df, - sctp_sk(sk)->udp_port, t->encap_port, false, false); + sctp_sk(sk)->udp_port, 
t->encap_port, false, false, + 0); return 0; } diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c index 108a4cc2e001..87e8c1e6d550 100644 --- a/net/tipc/udp_media.c +++ b/net/tipc/udp_media.c @@ -197,7 +197,7 @@ static int tipc_udp_xmit(struct net *net, struct sk_buff *skb, ttl = ip4_dst_hoplimit(&rt->dst); udp_tunnel_xmit_skb(rt, ub->ubsock->sk, skb, src->ipv4.s_addr, dst->ipv4.s_addr, 0, ttl, 0, src->port, - dst->port, false, true); + dst->port, false, true, 0); #if IS_ENABLED(CONFIG_IPV6) } else { if (!ndst) { -- 2.49.0 From ihor.solodrai at linux.dev Mon Jun 9 20:55:18 2025 From: ihor.solodrai at linux.dev (Ihor Solodrai) Date: Mon, 9 Jun 2025 13:55:18 -0700 Subject: [syzbot] [net?] general protection fault in veth_xdp_rcv In-Reply-To: <683da55e.a00a0220.d8eae.0052.GAE@google.com> References: <683da55e.a00a0220.d8eae.0052.GAE@google.com> Message-ID: <6fd7a5b5-ee26-4cc5-8eb0-449c4e326ccc@linux.dev> On 6/2/25 6:21 AM, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit: 4cb6c8af8591 selftests/filesystems: Fix build of anon_inod.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=11e8300c580000 > kernel config: https://syzkaller.appspot.com/x/.config?x=5319177d225a42f1 > dashboard link: https://syzkaller.appspot.com/bug?extid=c4c7bf27f6b0c4bd97fe > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 > > Unfortunately, I don't have any reproducer for this issue yet. 
> > Downloadable assets: > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-4cb6c8af.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/bc0e5dfdd686/vmlinux-4cb6c8af.xz > kernel image: https://storage.googleapis.com/syzbot-assets/2cdd323de6ca/bzImage-4cb6c8af.xz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+c4c7bf27f6b0c4bd97fe at syzkaller.appspotmail.com > > Oops: general protection fault, probably for non-canonical address 0xdffffc0000000098: 0000 [#1] SMP KASAN NOPTI > KASAN: null-ptr-deref in range [0x00000000000004c0-0x00000000000004c7] > CPU: 1 UID: 0 PID: 5975 Comm: kworker/1:4 Not tainted 6.15.0-syzkaller-10402-g4cb6c8af8591 #0 PREEMPT(full) > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 > Workqueue: wg-kex-wg0 wg_packet_handshake_receive_worker > RIP: 0010:netdev_get_tx_queue include/linux/netdevice.h:2636 [inline] > RIP: 0010:veth_xdp_rcv.constprop.0+0x142/0xda0 drivers/net/veth.c:912 > Code: 54 d9 31 fb 45 85 e4 0f 85 db 08 00 00 e8 06 de 31 fb 48 8d bd c0 04 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 18 0c 00 00 44 8b a5 c0 04 00 > RSP: 0018:ffffc900006a09b8 EFLAGS: 00010202 > RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff868a1686 > RDX: 0000000000000098 RSI: ffffffff868a0d9a RDI: 00000000000004c0 > RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000 > R10: 0000000000000001 R11: ffffc900006a0ff8 R12: 0000000000000001 > R13: 1ffff920000d4145 R14: ffffc900006a0e58 R15: ffff8880503d0000 > FS: 0000000000000000(0000) GS:ffff8880d686e000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007fe5e3a6ad58 CR3: 000000000e382000 CR4: 0000000000352ef0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400 > Call Trace: > > veth_poll+0x19c/0x9c0 drivers/net/veth.c:979 > __napi_poll.constprop.0+0xba/0x550 net/core/dev.c:7414 > napi_poll net/core/dev.c:7478 [inline] > net_rx_action+0xa9f/0xfe0 net/core/dev.c:7605 > handle_softirqs+0x219/0x8e0 kernel/softirq.c:579 > do_softirq kernel/softirq.c:480 [inline] > do_softirq+0xb2/0xf0 kernel/softirq.c:467 > > > __local_bh_enable_ip+0x100/0x120 kernel/softirq.c:407 > local_bh_enable include/linux/bottom_half.h:33 [inline] > fpregs_unlock arch/x86/include/asm/fpu/api.h:77 [inline] > kernel_fpu_end+0x5e/0x70 arch/x86/kernel/fpu/core.c:476 > blake2s_compress+0x7f/0xe0 arch/x86/lib/crypto/blake2s-glue.c:46 > blake2s_final+0xc9/0x150 lib/crypto/blake2s.c:54 > hmac.constprop.0+0x335/0x420 drivers/net/wireguard/noise.c:333 > kdf.constprop.0+0x122/0x280 drivers/net/wireguard/noise.c:360 > mix_dh+0xe8/0x150 drivers/net/wireguard/noise.c:413 > wg_noise_handshake_consume_initiation+0x265/0x880 drivers/net/wireguard/noise.c:608 > wg_receive_handshake_packet+0x219/0xbf0 drivers/net/wireguard/receive.c:144 > wg_packet_handshake_receive_worker+0x17f/0x3a0 drivers/net/wireguard/receive.c:213 > process_one_work+0x9cc/0x1b70 kernel/workqueue.c:3238 > process_scheduled_works kernel/workqueue.c:3321 [inline] > worker_thread+0x6c8/0xf10 kernel/workqueue.c:3402 > kthread+0x3c2/0x780 kernel/kthread.c:464 > ret_from_fork+0x5d4/0x6f0 arch/x86/kernel/process.c:148 > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 > > Modules linked in: > ---[ end trace 0000000000000000 ]--- > RIP: 0010:netdev_get_tx_queue include/linux/netdevice.h:2636 [inline] > RIP: 0010:veth_xdp_rcv.constprop.0+0x142/0xda0 drivers/net/veth.c:912 > Code: 54 d9 31 fb 45 85 e4 0f 85 db 08 00 00 e8 06 de 31 fb 48 8d bd c0 04 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 18 0c 00 00 44 8b a5 c0 04 00 > RSP: 0018:ffffc900006a09b8 EFLAGS: 00010202 > RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff868a1686 
> RDX: 0000000000000098 RSI: ffffffff868a0d9a RDI: 00000000000004c0 > RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000 > R10: 0000000000000001 R11: ffffc900006a0ff8 R12: 0000000000000001 > R13: 1ffff920000d4145 R14: ffffc900006a0e58 R15: ffff8880503d0000 > FS: 0000000000000000(0000) GS:ffff8880d686e000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007fe5e3a6ad58 CR3: 000000000e382000 CR4: 0000000000352ef0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > ---------------- Got a very similar call trace on current bpf-next (e41079f53e87) [1], see a paste below. It's flaky, couldn't reproduce so far. Any relevant fixes in flight? #629/1 xdp_veth_broadcast_redirect/0/BROADCAST:OK #629/2 xdp_veth_broadcast_redirect/0/(BROADCAST | EXCLUDE_INGRESS):OK #629/3 xdp_veth_broadcast_redirect/DRV_MODE/BROADCAST:OK #629/4 xdp_veth_broadcast_redirect/DRV_MODE/(BROADCAST | EXCLUDE_INGRESS):OK #629/5 xdp_veth_broadcast_redirect/SKB_MODE/BROADCAST:OK #629/6 xdp_veth_broadcast_redirect/SKB_MODE/(BROADCAST | EXCLUDE_INGRESS):OK #629 xdp_veth_broadcast_redirect:OK [ 343.217465] BUG: kernel NULL pointer dereference, address: 0000000000000018 [ 343.218173] #PF: supervisor read access in kernel mode [ 343.218644] #PF: error_code(0x0000) - not-present page [ 343.219128] PGD 0 P4D 0 [ 343.219379] Oops: Oops: 0000 [#1] SMP NOPTI [ 343.219768] CPU: 1 UID: 0 PID: 7635 Comm: kworker/1:11 Tainted: G W OE 6.15.0-g2b36f2252b0a-dirty #7 PREEMPT(full) [ 343.220844] Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 343.221436] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 343.222356] Workqueue: mld mld_dad_work [ 343.222730] RIP: 0010:veth_xdp_rcv.constprop.0+0x6b/0x380 [ 343.223242] Code: 01 48 89 84 24 90 00 00 00 31 c0 48 8b aa 80 0c 00 00 f3 48 ab e8 f5 e3 48 00 85 c0 0f 85 9c 02 00 00 4c 8d 
34 5b 49 c1 e6 07 <4c> 03 75 18 45 85 e4 0f 8e ec 02 00 00 31 db 31 ed eb 4c 48 83 e6 [ 343.224977] RSP: 0018:ffff9aaa400e8ca8 EFLAGS: 00010246 [ 343.225475] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000002 [ 343.226139] RDX: 0000000000000001 RSI: ffff8f22912a5000 RDI: ffff9aaa400e8d38 [ 343.226808] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [ 343.227484] R10: 0000000000000001 R11: ffff9aaa400e8ff8 R12: 0000000000000040 [ 343.228143] R13: ffff9aaa400e8d78 R14: 0000000000000000 R15: ffff8f220ad0f000 [ 343.228820] FS: 0000000000000000(0000) GS:ffff8f22912a5000(0000) knlGS:0000000000000000 [ 343.229572] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 343.230118] CR2: 0000000000000018 CR3: 000000010ce45005 CR4: 0000000000770ef0 [ 343.230794] PKRU: 55555554 [ 343.231061] Call Trace: [ 343.231306] [ 343.231522] veth_poll+0x7b/0x3a0 [ 343.231856] __napi_poll.constprop.0+0x28/0x1d0 [ 343.232297] net_rx_action+0x199/0x350 [ 343.232682] handle_softirqs+0xd3/0x400 [ 343.233057] ? __dev_queue_xmit+0x27b/0x1250 [ 343.233473] do_softirq+0x43/0x90 [ 343.233804] [ 343.234016] [ 343.234226] __local_bh_enable_ip+0xb5/0xd0 [ 343.234622] ? __dev_queue_xmit+0x27b/0x1250 [ 343.235035] __dev_queue_xmit+0x290/0x1250 [ 343.235431] ? lock_acquire+0xbe/0x2c0 [ 343.235797] ? ip6_finish_output+0x25e/0x540 [ 343.236210] ? mark_held_locks+0x40/0x70 [ 343.236583] ip6_finish_output2+0x38f/0xb80 [ 343.237002] ? lock_release+0xc6/0x290 [ 343.237364] ip6_finish_output+0x25e/0x540 [ 343.237761] mld_sendpack+0x1c1/0x3a0 [ 343.238123] mld_dad_work+0x3e/0x150 [ 343.238473] process_one_work+0x1f8/0x580 [ 343.238859] worker_thread+0x1ce/0x3c0 [ 343.239224] ? __pfx_worker_thread+0x10/0x10 [ 343.239638] kthread+0x128/0x250 [ 343.239954] ? __pfx_kthread+0x10/0x10 [ 343.240320] ? __pfx_kthread+0x10/0x10 [ 343.240691] ret_from_fork+0x15c/0x1b0 [ 343.241056] ? 
__pfx_kthread+0x10/0x10 [ 343.241418] ret_from_fork_asm+0x1a/0x30 [ 343.241800] [ 343.242021] Modules linked in: bpf_testmod(OE) [last unloaded: bpf_test_no_cfi(OE)] [ 343.242737] CR2: 0000000000000018 [ 343.243064] ---[ end trace 0000000000000000 ]--- [ 343.243503] RIP: 0010:veth_xdp_rcv.constprop.0+0x6b/0x380 [ 343.244014] Code: 01 48 89 84 24 90 00 00 00 31 c0 48 8b aa 80 0c 00 00 f3 48 ab e8 f5 e3 48 00 85 c0 0f 85 9c 02 00 00 4c 8d 34 5b 49 c1 e6 07 <4c> 03 75 18 45 85 e4 0f 8e ec 02 00 00 31 db 31 ed eb 4c 48 83 e6 [ 343.245743] RSP: 0018:ffff9aaa400e8ca8 EFLAGS: 00010246 [ 343.246236] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000002 [ 343.246897] RDX: 0000000000000001 RSI: ffff8f22912a5000 RDI: ffff9aaa400e8d38 [ 343.247557] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [ 343.248219] R10: 0000000000000001 R11: ffff9aaa400e8ff8 R12: 0000000000000040 [ 343.248868] R13: ffff9aaa400e8d78 R14: 0000000000000000 R15: ffff8f220ad0f000 [ 343.249496] FS: 0000000000000000(0000) GS:ffff8f22912a5000(0000) knlGS:0000000000000000 [ 343.250109] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 343.250651] CR2: 0000000000000018 CR3: 000000010ce45005 CR4: 0000000000770ef0 [ 343.251320] PKRU: 55555554 [ 343.251548] Kernel panic - not syncing: Fatal exception in interrupt [ 343.252317] Kernel Offset: 0x27000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Failed to run command Caused by: 0: Failed to QGA guest-exec-status 1: error running guest_exec_status 2: Broken pipe (os error 32) 3: Broken pipe (os error 32) ##[error]Process completed with exit code 2. 
[1] https://github.com/kernel-patches/bpf/actions/runs/15543380196/job/43759847203

> Code disassembly (best guess):
>    0:  54                      push   %rsp
>    1:  d9 31                   fnstenv (%rcx)
>    3:  fb                      sti
>    4:  45 85 e4                test   %r12d,%r12d
>    7:  0f 85 db 08 00 00       jne    0x8e8
>    d:  e8 06 de 31 fb          call   0xfb31de18
>   12:  48 8d bd c0 04 00 00    lea    0x4c0(%rbp),%rdi
>   19:  48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
>   20:  fc ff df
>   23:  48 89 fa                mov    %rdi,%rdx
>   26:  48 c1 ea 03             shr    $0x3,%rdx
> * 2a:  0f b6 04 02             movzbl (%rdx,%rax,1),%eax <-- trapping instruction
>   2e:  84 c0                   test   %al,%al
>   30:  74 08                   je     0x3a
>   32:  3c 03                   cmp    $0x3,%al
>   34:  0f 8e 18 0c 00 00       jle    0xc52
>   3a:  44                      rex.R
>   3b:  8b                      .byte 0x8b
>   3c:  a5                      movsl  %ds:(%rsi),%es:(%rdi)
>   3d:  c0                      .byte 0xc0
>   3e:  04 00                   add    $0x0,%al
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller at googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup

From razor at blackwall.org Thu Jun 12 10:28:19 2025
From: razor at blackwall.org (Nikolay Aleksandrov)
Date: Thu, 12 Jun 2025 13:28:19 +0300
Subject: [PATCH net-next 01/14] net: ipv4: Add a flags argument to iptunnel_xmit(), udp_tunnel_xmit_skb()
In-Reply-To:
References:
Message-ID:

On 6/9/25 23:50, Petr Machata wrote:
> iptunnel_xmit() erases the contents of the SKB control block. In order to
> be able to set particular IPCB flags on the SKB, add a corresponding
> parameter, and propagate it to udp_tunnel_xmit_skb() as well.
>
> In one of the following patches, VXLAN driver will use this facility to
> mark packets as subject to IP multicast routing.
>
> Signed-off-by: Petr Machata
> Reviewed-by: Ido Schimmel
> ---
>
> Notes:
>     CC: Pablo Neira Ayuso
>     CC: osmocom-net-gprs at lists.osmocom.org
>     CC: Andrew Lunn
>     CC: Taehee Yoo
>     CC: Antonio Quartulli
>     CC: "Jason A. Donenfeld"
>     CC: wireguard at lists.zx2c4.com
>     CC: Marcelo Ricardo Leitner
>     CC: linux-sctp at vger.kernel.org
>     CC: Jon Maloy
>     CC: tipc-discussion at lists.sourceforge.net
>
>  drivers/net/amt.c              |  9 ++++++---
>  drivers/net/bareudp.c          |  4 ++--
>  drivers/net/geneve.c           |  4 ++--
>  drivers/net/gtp.c              | 10 ++++++----
>  drivers/net/ovpn/udp.c         |  2 +-
>  drivers/net/vxlan/vxlan_core.c |  2 +-
>  drivers/net/wireguard/socket.c |  2 +-
>  include/net/ip_tunnels.h       |  2 +-
>  include/net/udp_tunnel.h       |  2 +-
>  net/ipv4/ip_tunnel.c           |  4 ++--
>  net/ipv4/ip_tunnel_core.c      |  4 +++-
>  net/ipv4/udp_tunnel_core.c     |  5 +++--
>  net/ipv6/sit.c                 |  2 +-
>  net/sctp/protocol.c            |  3 ++-
>  net/tipc/udp_media.c           |  2 +-
>  15 files changed, 33 insertions(+), 24 deletions(-)
>

Reviewed-by: Nikolay Aleksandrov

From petrm at nvidia.com Thu Jun 12 20:10:35 2025
From: petrm at nvidia.com (Petr Machata)
Date: Thu, 12 Jun 2025 22:10:35 +0200
Subject: [PATCH net-next v2 01/14] net: ipv4: Add a flags argument to iptunnel_xmit(), udp_tunnel_xmit_skb()
In-Reply-To:
References:
Message-ID: <93258d0156bab6c2d8c7c6e1a43d23e13e9830ec.1749757582.git.petrm@nvidia.com>

iptunnel_xmit() erases the contents of the SKB control block. In order to
be able to set particular IPCB flags on the SKB, add a corresponding
parameter, and propagate it to udp_tunnel_xmit_skb() as well.

In one of the following patches, VXLAN driver will use this facility to
mark packets as subject to IP multicast routing.
Signed-off-by: Petr Machata Reviewed-by: Ido Schimmel Reviewed-by: Nikolay Aleksandrov Acked-by: Antonio Quartulli --- Notes: CC: Pablo Neira Ayuso CC: osmocom-net-gprs at lists.osmocom.org CC: Andrew Lunn CC: Taehee Yoo CC: Antonio Quartulli CC: "Jason A. Donenfeld" CC: wireguard at lists.zx2c4.com CC: Marcelo Ricardo Leitner CC: linux-sctp at vger.kernel.org CC: Jon Maloy CC: tipc-discussion at lists.sourceforge.net drivers/net/amt.c | 9 ++++++--- drivers/net/bareudp.c | 4 ++-- drivers/net/geneve.c | 4 ++-- drivers/net/gtp.c | 10 ++++++---- drivers/net/ovpn/udp.c | 2 +- drivers/net/vxlan/vxlan_core.c | 2 +- drivers/net/wireguard/socket.c | 2 +- include/net/ip_tunnels.h | 2 +- include/net/udp_tunnel.h | 2 +- net/ipv4/ip_tunnel.c | 4 ++-- net/ipv4/ip_tunnel_core.c | 4 +++- net/ipv4/udp_tunnel_core.c | 5 +++-- net/ipv6/sit.c | 2 +- net/sctp/protocol.c | 3 ++- net/tipc/udp_media.c | 2 +- 15 files changed, 33 insertions(+), 24 deletions(-) diff --git a/drivers/net/amt.c b/drivers/net/amt.c index 734a0b3242a9..d0f719531499 100644 --- a/drivers/net/amt.c +++ b/drivers/net/amt.c @@ -1046,7 +1046,8 @@ static bool amt_send_membership_update(struct amt_dev *amt, amt->gw_port, amt->relay_port, false, - false); + false, + 0); amt_update_gw_status(amt, AMT_STATUS_SENT_UPDATE, true); return false; } @@ -1103,7 +1104,8 @@ static void amt_send_multicast_data(struct amt_dev *amt, amt->relay_port, tunnel->source_port, false, - false); + false, + 0); } static bool amt_send_membership_query(struct amt_dev *amt, @@ -1161,7 +1163,8 @@ static bool amt_send_membership_query(struct amt_dev *amt, amt->relay_port, tunnel->source_port, false, - false); + false, + 0); amt_update_relay_status(tunnel, AMT_STATUS_SENT_QUERY, true); return false; } diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index a9dffdcac805..5e613080d3f8 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -362,8 +362,8 @@ static int bareudp_xmit_skb(struct sk_buff *skb, struct net_device *dev, 
udp_tunnel_xmit_skb(rt, sock->sk, skb, saddr, info->key.u.ipv4.dst, tos, ttl, df, sport, bareudp->port, !net_eq(bareudp->net, dev_net(bareudp->dev)), - !test_bit(IP_TUNNEL_CSUM_BIT, - info->key.tun_flags)); + !test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags), + 0); return 0; free_dst: diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index ffc15a432689..c668e8b00ed2 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -921,8 +921,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev, udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, saddr, info->key.u.ipv4.dst, tos, ttl, df, sport, geneve->cfg.info.key.tp_dst, !net_eq(geneve->net, dev_net(geneve->dev)), - !test_bit(IP_TUNNEL_CSUM_BIT, - info->key.tun_flags)); + !test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags), + 0); return 0; } diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index d4dec741c7f4..14584793fe4e 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -446,7 +446,8 @@ static int gtp0_send_echo_resp_ip(struct gtp_dev *gtp, struct sk_buff *skb) htons(GTP0_PORT), htons(GTP0_PORT), !net_eq(sock_net(gtp->sk1u), dev_net(gtp->dev)), - false); + false, + 0); return 0; } @@ -704,7 +705,8 @@ static int gtp1u_send_echo_resp(struct gtp_dev *gtp, struct sk_buff *skb) htons(GTP1U_PORT), htons(GTP1U_PORT), !net_eq(sock_net(gtp->sk1u), dev_net(gtp->dev)), - false); + false, + 0); return 0; } @@ -1304,7 +1306,7 @@ static netdev_tx_t gtp_dev_xmit(struct sk_buff *skb, struct net_device *dev) pktinfo.gtph_port, pktinfo.gtph_port, !net_eq(sock_net(pktinfo.pctx->sk), dev_net(dev)), - false); + false, 0); break; case AF_INET6: #if IS_ENABLED(CONFIG_IPV6) @@ -2405,7 +2407,7 @@ static int gtp_genl_send_echo_req(struct sk_buff *skb, struct genl_info *info) port, port, !net_eq(sock_net(sk), dev_net(gtp->dev)), - false); + false, 0); return 0; } diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c index bff00946eae2..d866e6bfda70 100644 --- a/drivers/net/ovpn/udp.c +++ 
b/drivers/net/ovpn/udp.c @@ -199,7 +199,7 @@ static int ovpn_udp4_output(struct ovpn_peer *peer, struct ovpn_bind *bind, transmit: udp_tunnel_xmit_skb(rt, sk, skb, fl.saddr, fl.daddr, 0, ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport, - fl.fl4_dport, false, sk->sk_no_check_tx); + fl.fl4_dport, false, sk->sk_no_check_tx, 0); ret = 0; err: local_bh_enable(); diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index a56d7239b127..d7a5d8873a1b 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -2522,7 +2522,7 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, udp_tunnel_xmit_skb(rt, sock4->sock->sk, skb, saddr, pkey->u.ipv4.dst, tos, ttl, df, - src_port, dst_port, xnet, !udp_sum); + src_port, dst_port, xnet, !udp_sum, 0); #if IS_ENABLED(CONFIG_IPV6) } else { struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock); diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c index 0414d7a6ce74..88e685667bc0 100644 --- a/drivers/net/wireguard/socket.c +++ b/drivers/net/wireguard/socket.c @@ -84,7 +84,7 @@ static int send4(struct wg_device *wg, struct sk_buff *skb, skb->ignore_df = 1; udp_tunnel_xmit_skb(rt, sock, skb, fl.saddr, fl.daddr, ds, ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport, - fl.fl4_dport, false, false); + fl.fl4_dport, false, false, 0); goto out; err: diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index 0c3d571a04a1..8cf1380f3656 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -603,7 +603,7 @@ static inline int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, void iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, u8 proto, - u8 tos, u8 ttl, __be16 df, bool xnet); + u8 tos, u8 ttl, __be16 df, bool xnet, u16 ipcb_flags); struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md, gfp_t flags); int skb_tunnel_check_pmtu(struct sk_buff *skb, struct dst_entry *encap_dst, 
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h index 2df3b8344eb5..28102c8fd8a8 100644 --- a/include/net/udp_tunnel.h +++ b/include/net/udp_tunnel.h @@ -150,7 +150,7 @@ static inline void udp_tunnel_drop_rx_info(struct net_device *dev) void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb, __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df, __be16 src_port, __be16 dst_port, - bool xnet, bool nocheck); + bool xnet, bool nocheck, u16 ipcb_flags); int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk, struct sk_buff *skb, diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 678b8f96e3e9..aaeb5d16f0c9 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -668,7 +668,7 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, ip_tunnel_adj_headroom(dev, headroom); iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, proto, tos, ttl, - df, !net_eq(tunnel->net, dev_net(dev))); + df, !net_eq(tunnel->net, dev_net(dev)), 0); return; tx_error: DEV_STATS_INC(dev, tx_errors); @@ -857,7 +857,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, ip_tunnel_adj_headroom(dev, max_headroom); iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, protocol, tos, ttl, - df, !net_eq(tunnel->net, dev_net(dev))); + df, !net_eq(tunnel->net, dev_net(dev)), 0); return; #if IS_ENABLED(CONFIG_IPV6) diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c index f65d2f727381..cc9915543637 100644 --- a/net/ipv4/ip_tunnel_core.c +++ b/net/ipv4/ip_tunnel_core.c @@ -49,7 +49,8 @@ EXPORT_SYMBOL(ip6tun_encaps); void iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, __u8 proto, - __u8 tos, __u8 ttl, __be16 df, bool xnet) + __u8 tos, __u8 ttl, __be16 df, bool xnet, + u16 ipcb_flags) { int pkt_len = skb->len - skb_inner_network_offset(skb); struct net *net = dev_net(rt->dst.dev); @@ -62,6 +63,7 @@ void iptunnel_xmit(struct sock *sk, struct rtable *rt, 
struct sk_buff *skb, skb_clear_hash_if_not_l4(skb); skb_dst_set(skb, &rt->dst); memset(IPCB(skb), 0, sizeof(*IPCB(skb))); + IPCB(skb)->flags = ipcb_flags; /* Push down and install the IP header. */ skb_push(skb, sizeof(struct iphdr)); diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c index 2326548997d3..9efd62505916 100644 --- a/net/ipv4/udp_tunnel_core.c +++ b/net/ipv4/udp_tunnel_core.c @@ -169,7 +169,7 @@ EXPORT_SYMBOL_GPL(udp_tunnel_notify_del_rx_port); void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb, __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df, __be16 src_port, __be16 dst_port, - bool xnet, bool nocheck) + bool xnet, bool nocheck, u16 ipcb_flags) { struct udphdr *uh; @@ -185,7 +185,8 @@ void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb udp_set_csum(nocheck, skb, src, dst, skb->len); - iptunnel_xmit(sk, rt, skb, src, dst, IPPROTO_UDP, tos, ttl, df, xnet); + iptunnel_xmit(sk, rt, skb, src, dst, IPPROTO_UDP, tos, ttl, df, xnet, + ipcb_flags); } EXPORT_SYMBOL_GPL(udp_tunnel_xmit_skb); diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index a72dbca9e8fc..12496ba1b7d4 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -1035,7 +1035,7 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb, skb_set_inner_ipproto(skb, IPPROTO_IPV6); iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, protocol, tos, ttl, - df, !net_eq(tunnel->net, dev_net(dev))); + df, !net_eq(tunnel->net, dev_net(dev)), 0); return NETDEV_TX_OK; tx_error_icmp: diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index 8c3b80c4d40b..bfbb73e359f5 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -1103,7 +1103,8 @@ static inline int sctp_v4_xmit(struct sk_buff *skb, struct sctp_transport *t) skb_set_inner_ipproto(skb, IPPROTO_SCTP); udp_tunnel_xmit_skb(dst_rtable(dst), sk, skb, fl4->saddr, fl4->daddr, dscp, ip4_dst_hoplimit(dst), df, - sctp_sk(sk)->udp_port, t->encap_port, false, false); + 
			    sctp_sk(sk)->udp_port, t->encap_port, false, false,
+			    0);
 	return 0;
 }

diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c
index 108a4cc2e001..87e8c1e6d550 100644
--- a/net/tipc/udp_media.c
+++ b/net/tipc/udp_media.c
@@ -197,7 +197,7 @@ static int tipc_udp_xmit(struct net *net, struct sk_buff *skb,
 		ttl = ip4_dst_hoplimit(&rt->dst);
 		udp_tunnel_xmit_skb(rt, ub->ubsock->sk, skb, src->ipv4.s_addr,
 				    dst->ipv4.s_addr, 0, ttl, 0, src->port,
-				    dst->port, false, true);
+				    dst->port, false, true, 0);
 #if IS_ENABLED(CONFIG_IPV6)
 	} else {
 		if (!ndst) {
-- 
2.49.0

From kuba at kernel.org Fri Jun 13 16:48:58 2025
From: kuba at kernel.org (Jakub Kicinski)
Date: Fri, 13 Jun 2025 09:48:58 -0700
Subject: [PATCH net-next v2 01/14] net: ipv4: Add a flags argument to iptunnel_xmit(), udp_tunnel_xmit_skb()
In-Reply-To: <93258d0156bab6c2d8c7c6e1a43d23e13e9830ec.1749757582.git.petrm@nvidia.com>
References: <93258d0156bab6c2d8c7c6e1a43d23e13e9830ec.1749757582.git.petrm@nvidia.com>
Message-ID: <20250613094858.5dfa435e@kernel.org>

On Thu, 12 Jun 2025 22:10:35 +0200 Petr Machata wrote:
> void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb,
> 			 __be32 src, __be32 dst, __u8 tos, __u8 ttl,
> 			 __be16 df, __be16 src_port, __be16 dst_port,
> -			 bool xnet, bool nocheck)
> +			 bool xnet, bool nocheck, u16 ipcb_flags)

This is a lot of arguments for a function.
I don't have a great suggestion off the top of my head, but maybe
think more about it?
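
For readers following the concern about the argument count, here is one conceivable direction, shown only as a user-space sketch. It is not something proposed in this thread and not kernel code: every demo_* name is hypothetical, and struct demo_cb merely stands in for the control block that iptunnel_xmit() clears. The one detail taken from the patch is the ordering: the control block is zeroed first, and the caller-supplied flags are stored afterwards.

```c
/* Illustrative user-space sketch only; every demo_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the per-packet control block (IPCB) that gets zeroed. */
struct demo_cb {
	uint16_t flags;
	uint8_t other[6];
};

/* Hypothetical bundle of what is passed today as separate arguments. */
struct demo_xmit_params {
	uint32_t src, dst;		/* IPv4 addresses */
	uint8_t tos, ttl;
	uint16_t df;
	uint16_t src_port, dst_port;
	bool xnet, nocheck;
	uint16_t ipcb_flags;		/* 0 for callers that need no flags */
};

/* Same order as the patch: wipe the control block, then set the flags. */
static void demo_iptunnel_xmit(struct demo_cb *cb,
			       const struct demo_xmit_params *p)
{
	assert(cb && p);
	memset(cb, 0, sizeof(*cb));
	cb->flags = p->ipcb_flags;
}

/* Helper for experimenting: returns the flags left in the control block. */
static uint16_t demo_send(uint16_t stale_flags, uint16_t ipcb_flags)
{
	struct demo_cb cb = { .flags = stale_flags };
	struct demo_xmit_params p = {
		.ttl = 64,
		.ipcb_flags = ipcb_flags,	/* unnamed fields default to 0 */
	};

	demo_iptunnel_xmit(&cb, &p);
	return cb.flags;
}
```

With designated initializers, a caller that does not care about the flags would simply leave the field out instead of appending a trailing 0. The cost is that every call site has to be converted once, and whether that trade-off is worthwhile is not settled by this thread.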

From petrm at nvidia.com Fri Jun 13 19:23:20 2025
From: petrm at nvidia.com (Petr Machata)
Date: Fri, 13 Jun 2025 21:23:20 +0200
Subject: [PATCH net-next v2 01/14] net: ipv4: Add a flags argument to iptunnel_xmit(), udp_tunnel_xmit_skb()
In-Reply-To: <20250613094858.5dfa435e@kernel.org>
References: <93258d0156bab6c2d8c7c6e1a43d23e13e9830ec.1749757582.git.petrm@nvidia.com> <20250613094858.5dfa435e@kernel.org>
Message-ID: <87wm9f2zwd.fsf@nvidia.com>

Jakub Kicinski writes:

> On Thu, 12 Jun 2025 22:10:35 +0200 Petr Machata wrote:
>> void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb,
>> 			 __be32 src, __be32 dst, __u8 tos, __u8 ttl,
>> 			 __be16 df, __be16 src_port, __be16 dst_port,
>> -			 bool xnet, bool nocheck)
>> +			 bool xnet, bool nocheck, u16 ipcb_flags)
>
> This is a lot of arguments for a function.
> I don't have a great suggestion off the top of my head, but maybe
> think more about it?

It wraps functions that take many arguments ^o^

We could exchange src_port, dst_port by passing in the UDP header
directly, but I don't think that's a good idea.

I guess I don't have great ideas either.

From petrm at nvidia.com Mon Jun 16 22:44:09 2025
From: petrm at nvidia.com (Petr Machata)
Date: Tue, 17 Jun 2025 00:44:09 +0200
Subject: [PATCH net-next v3 01/15] net: ipv4: Add a flags argument to iptunnel_xmit(), udp_tunnel_xmit_skb()
In-Reply-To:
References:
Message-ID: <89c9daf9f2dc088b6b92ccebcc929f51742de91f.1750113335.git.petrm@nvidia.com>

iptunnel_xmit() erases the contents of the SKB control block. In order to
be able to set particular IPCB flags on the SKB, add a corresponding
parameter, and propagate it to udp_tunnel_xmit_skb() as well.

In one of the following patches, VXLAN driver will use this facility to
mark packets as subject to IP multicast routing.
Signed-off-by: Petr Machata Reviewed-by: Ido Schimmel Reviewed-by: Nikolay Aleksandrov Acked-by: Antonio Quartulli --- Notes: CC: Pablo Neira Ayuso CC: osmocom-net-gprs at lists.osmocom.org CC: Andrew Lunn CC: Taehee Yoo CC: Antonio Quartulli CC: "Jason A. Donenfeld" CC: wireguard at lists.zx2c4.com CC: Marcelo Ricardo Leitner CC: linux-sctp at vger.kernel.org CC: Jon Maloy CC: tipc-discussion at lists.sourceforge.net drivers/net/amt.c | 9 ++++++--- drivers/net/bareudp.c | 4 ++-- drivers/net/geneve.c | 4 ++-- drivers/net/gtp.c | 10 ++++++---- drivers/net/ovpn/udp.c | 2 +- drivers/net/vxlan/vxlan_core.c | 2 +- drivers/net/wireguard/socket.c | 2 +- include/net/ip_tunnels.h | 2 +- include/net/udp_tunnel.h | 2 +- net/ipv4/ip_tunnel.c | 4 ++-- net/ipv4/ip_tunnel_core.c | 4 +++- net/ipv4/udp_tunnel_core.c | 5 +++-- net/ipv6/sit.c | 2 +- net/sctp/protocol.c | 3 ++- net/tipc/udp_media.c | 2 +- 15 files changed, 33 insertions(+), 24 deletions(-) diff --git a/drivers/net/amt.c b/drivers/net/amt.c index fb130fde68c0..ed86537b2f61 100644 --- a/drivers/net/amt.c +++ b/drivers/net/amt.c @@ -1046,7 +1046,8 @@ static bool amt_send_membership_update(struct amt_dev *amt, amt->gw_port, amt->relay_port, false, - false); + false, + 0); amt_update_gw_status(amt, AMT_STATUS_SENT_UPDATE, true); return false; } @@ -1103,7 +1104,8 @@ static void amt_send_multicast_data(struct amt_dev *amt, amt->relay_port, tunnel->source_port, false, - false); + false, + 0); } static bool amt_send_membership_query(struct amt_dev *amt, @@ -1161,7 +1163,8 @@ static bool amt_send_membership_query(struct amt_dev *amt, amt->relay_port, tunnel->source_port, false, - false); + false, + 0); amt_update_relay_status(tunnel, AMT_STATUS_SENT_QUERY, true); return false; } diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index a9dffdcac805..5e613080d3f8 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -362,8 +362,8 @@ static int bareudp_xmit_skb(struct sk_buff *skb, struct net_device *dev, 
udp_tunnel_xmit_skb(rt, sock->sk, skb, saddr, info->key.u.ipv4.dst, tos, ttl, df, sport, bareudp->port, !net_eq(bareudp->net, dev_net(bareudp->dev)), - !test_bit(IP_TUNNEL_CSUM_BIT, - info->key.tun_flags)); + !test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags), + 0); return 0; free_dst: diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index ffc15a432689..c668e8b00ed2 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -921,8 +921,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev, udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, saddr, info->key.u.ipv4.dst, tos, ttl, df, sport, geneve->cfg.info.key.tp_dst, !net_eq(geneve->net, dev_net(geneve->dev)), - !test_bit(IP_TUNNEL_CSUM_BIT, - info->key.tun_flags)); + !test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags), + 0); return 0; } diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index d4dec741c7f4..14584793fe4e 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -446,7 +446,8 @@ static int gtp0_send_echo_resp_ip(struct gtp_dev *gtp, struct sk_buff *skb) htons(GTP0_PORT), htons(GTP0_PORT), !net_eq(sock_net(gtp->sk1u), dev_net(gtp->dev)), - false); + false, + 0); return 0; } @@ -704,7 +705,8 @@ static int gtp1u_send_echo_resp(struct gtp_dev *gtp, struct sk_buff *skb) htons(GTP1U_PORT), htons(GTP1U_PORT), !net_eq(sock_net(gtp->sk1u), dev_net(gtp->dev)), - false); + false, + 0); return 0; } @@ -1304,7 +1306,7 @@ static netdev_tx_t gtp_dev_xmit(struct sk_buff *skb, struct net_device *dev) pktinfo.gtph_port, pktinfo.gtph_port, !net_eq(sock_net(pktinfo.pctx->sk), dev_net(dev)), - false); + false, 0); break; case AF_INET6: #if IS_ENABLED(CONFIG_IPV6) @@ -2405,7 +2407,7 @@ static int gtp_genl_send_echo_req(struct sk_buff *skb, struct genl_info *info) port, port, !net_eq(sock_net(sk), dev_net(gtp->dev)), - false); + false, 0); return 0; } diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c index bff00946eae2..d866e6bfda70 100644 --- a/drivers/net/ovpn/udp.c +++ 
b/drivers/net/ovpn/udp.c @@ -199,7 +199,7 @@ static int ovpn_udp4_output(struct ovpn_peer *peer, struct ovpn_bind *bind, transmit: udp_tunnel_xmit_skb(rt, sk, skb, fl.saddr, fl.daddr, 0, ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport, - fl.fl4_dport, false, sk->sk_no_check_tx); + fl.fl4_dport, false, sk->sk_no_check_tx, 0); ret = 0; err: local_bh_enable(); diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index 97792de896b7..1cc18acd242d 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -2522,7 +2522,7 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, udp_tunnel_xmit_skb(rt, sock4->sock->sk, skb, saddr, pkey->u.ipv4.dst, tos, ttl, df, - src_port, dst_port, xnet, !udp_sum); + src_port, dst_port, xnet, !udp_sum, 0); #if IS_ENABLED(CONFIG_IPV6) } else { struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock); diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c index 0414d7a6ce74..88e685667bc0 100644 --- a/drivers/net/wireguard/socket.c +++ b/drivers/net/wireguard/socket.c @@ -84,7 +84,7 @@ static int send4(struct wg_device *wg, struct sk_buff *skb, skb->ignore_df = 1; udp_tunnel_xmit_skb(rt, sock, skb, fl.saddr, fl.daddr, ds, ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport, - fl.fl4_dport, false, false); + fl.fl4_dport, false, false, 0); goto out; err: diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index 0c3d571a04a1..8cf1380f3656 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -603,7 +603,7 @@ static inline int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, void iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, u8 proto, - u8 tos, u8 ttl, __be16 df, bool xnet); + u8 tos, u8 ttl, __be16 df, bool xnet, u16 ipcb_flags); struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md, gfp_t flags); int skb_tunnel_check_pmtu(struct sk_buff *skb, struct dst_entry *encap_dst, 
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h index 2df3b8344eb5..28102c8fd8a8 100644 --- a/include/net/udp_tunnel.h +++ b/include/net/udp_tunnel.h @@ -150,7 +150,7 @@ static inline void udp_tunnel_drop_rx_info(struct net_device *dev) void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb, __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df, __be16 src_port, __be16 dst_port, - bool xnet, bool nocheck); + bool xnet, bool nocheck, u16 ipcb_flags); int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk, struct sk_buff *skb, diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 678b8f96e3e9..aaeb5d16f0c9 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -668,7 +668,7 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, ip_tunnel_adj_headroom(dev, headroom); iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, proto, tos, ttl, - df, !net_eq(tunnel->net, dev_net(dev))); + df, !net_eq(tunnel->net, dev_net(dev)), 0); return; tx_error: DEV_STATS_INC(dev, tx_errors); @@ -857,7 +857,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, ip_tunnel_adj_headroom(dev, max_headroom); iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, protocol, tos, ttl, - df, !net_eq(tunnel->net, dev_net(dev))); + df, !net_eq(tunnel->net, dev_net(dev)), 0); return; #if IS_ENABLED(CONFIG_IPV6) diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c index f65d2f727381..cc9915543637 100644 --- a/net/ipv4/ip_tunnel_core.c +++ b/net/ipv4/ip_tunnel_core.c @@ -49,7 +49,8 @@ EXPORT_SYMBOL(ip6tun_encaps); void iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, __u8 proto, - __u8 tos, __u8 ttl, __be16 df, bool xnet) + __u8 tos, __u8 ttl, __be16 df, bool xnet, + u16 ipcb_flags) { int pkt_len = skb->len - skb_inner_network_offset(skb); struct net *net = dev_net(rt->dst.dev); @@ -62,6 +63,7 @@ void iptunnel_xmit(struct sock *sk, struct rtable *rt, 
struct sk_buff *skb, skb_clear_hash_if_not_l4(skb); skb_dst_set(skb, &rt->dst); memset(IPCB(skb), 0, sizeof(*IPCB(skb))); + IPCB(skb)->flags = ipcb_flags; /* Push down and install the IP header. */ skb_push(skb, sizeof(struct iphdr)); diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c index 2326548997d3..9efd62505916 100644 --- a/net/ipv4/udp_tunnel_core.c +++ b/net/ipv4/udp_tunnel_core.c @@ -169,7 +169,7 @@ EXPORT_SYMBOL_GPL(udp_tunnel_notify_del_rx_port); void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb, __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df, __be16 src_port, __be16 dst_port, - bool xnet, bool nocheck) + bool xnet, bool nocheck, u16 ipcb_flags) { struct udphdr *uh; @@ -185,7 +185,8 @@ void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb udp_set_csum(nocheck, skb, src, dst, skb->len); - iptunnel_xmit(sk, rt, skb, src, dst, IPPROTO_UDP, tos, ttl, df, xnet); + iptunnel_xmit(sk, rt, skb, src, dst, IPPROTO_UDP, tos, ttl, df, xnet, + ipcb_flags); } EXPORT_SYMBOL_GPL(udp_tunnel_xmit_skb); diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index a72dbca9e8fc..12496ba1b7d4 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -1035,7 +1035,7 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb, skb_set_inner_ipproto(skb, IPPROTO_IPV6); iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, protocol, tos, ttl, - df, !net_eq(tunnel->net, dev_net(dev))); + df, !net_eq(tunnel->net, dev_net(dev)), 0); return NETDEV_TX_OK; tx_error_icmp: diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index f402f90eb6b6..a5ccada55f2b 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -1103,7 +1103,8 @@ static inline int sctp_v4_xmit(struct sk_buff *skb, struct sctp_transport *t) skb_set_inner_ipproto(skb, IPPROTO_SCTP); udp_tunnel_xmit_skb(dst_rtable(dst), sk, skb, fl4->saddr, fl4->daddr, dscp, ip4_dst_hoplimit(dst), df, - sctp_sk(sk)->udp_port, t->encap_port, false, false); + 
sctp_sk(sk)->udp_port, t->encap_port, false, false, + 0); return 0; } diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c index 108a4cc2e001..87e8c1e6d550 100644 --- a/net/tipc/udp_media.c +++ b/net/tipc/udp_media.c @@ -197,7 +197,7 @@ static int tipc_udp_xmit(struct net *net, struct sk_buff *skb, ttl = ip4_dst_hoplimit(&rt->dst); udp_tunnel_xmit_skb(rt, ub->ubsock->sk, skb, src->ipv4.s_addr, dst->ipv4.s_addr, 0, ttl, 0, src->port, - dst->port, false, true); + dst->port, false, true, 0); #if IS_ENABLED(CONFIG_IPV6) } else { if (!ndst) { -- 2.49.0 From petrm at nvidia.com Mon Jun 16 22:44:14 2025 From: petrm at nvidia.com (Petr Machata) Date: Tue, 17 Jun 2025 00:44:14 +0200 Subject: [PATCH net-next v3 06/15] net: ipv6: Add a flags argument to ip6tunnel_xmit(), udp_tunnel6_xmit_skb() In-Reply-To: References: Message-ID: ip6tunnel_xmit() erases the contents of the SKB control block. In order to be able to set particular IP6CB flags on the SKB, add a corresponding parameter, and propagate it to udp_tunnel6_xmit_skb() as well. In one of the following patches, VXLAN driver will use this facility to mark packets as subject to IPv6 multicast routing. Signed-off-by: Petr Machata Reviewed-by: Ido Schimmel Reviewed-by: Nikolay Aleksandrov --- Notes: CC: Pablo Neira Ayuso CC: osmocom-net-gprs at lists.osmocom.org CC: Andrew Lunn CC: Antonio Quartulli CC: "Jason A. 
Donenfeld" CC: wireguard at lists.zx2c4.com CC: Marcelo Ricardo Leitner CC: linux-sctp at vger.kernel.org CC: Jon Maloy CC: tipc-discussion at lists.sourceforge.net drivers/net/bareudp.c | 3 ++- drivers/net/geneve.c | 3 ++- drivers/net/gtp.c | 2 +- drivers/net/ovpn/udp.c | 2 +- drivers/net/vxlan/vxlan_core.c | 3 ++- drivers/net/wireguard/socket.c | 2 +- include/net/ip6_tunnel.h | 3 ++- include/net/udp_tunnel.h | 3 ++- net/ipv6/ip6_tunnel.c | 2 +- net/ipv6/ip6_udp_tunnel.c | 5 +++-- net/sctp/ipv6.c | 2 +- net/tipc/udp_media.c | 2 +- 12 files changed, 19 insertions(+), 13 deletions(-) diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index 5e613080d3f8..0df3208783ad 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -431,7 +431,8 @@ static int bareudp6_xmit_skb(struct sk_buff *skb, struct net_device *dev, &saddr, &daddr, prio, ttl, info->key.label, sport, bareudp->port, !test_bit(IP_TUNNEL_CSUM_BIT, - info->key.tun_flags)); + info->key.tun_flags), + 0); return 0; free_dst: diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index c668e8b00ed2..f6bd155aae7f 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -1014,7 +1014,8 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev, &saddr, &key->u.ipv6.dst, prio, ttl, info->key.label, sport, geneve->cfg.info.key.tp_dst, !test_bit(IP_TUNNEL_CSUM_BIT, - info->key.tun_flags)); + info->key.tun_flags), + 0); return 0; } #endif diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index 14584793fe4e..4b668ebaa0f7 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -1316,7 +1316,7 @@ static netdev_tx_t gtp_dev_xmit(struct sk_buff *skb, struct net_device *dev) ip6_dst_hoplimit(&pktinfo.rt->dst), 0, pktinfo.gtph_port, pktinfo.gtph_port, - false); + false, 0); #else goto tx_err; #endif diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c index d866e6bfda70..254cc94c4617 100644 --- a/drivers/net/ovpn/udp.c +++ b/drivers/net/ovpn/udp.c @@ -274,7 +274,7 @@ static 
int ovpn_udp6_output(struct ovpn_peer *peer, struct ovpn_bind *bind, skb->ignore_df = 1; udp_tunnel6_xmit_skb(dst, sk, skb, skb->dev, &fl.saddr, &fl.daddr, 0, ip6_dst_hoplimit(dst), 0, fl.fl6_sport, - fl.fl6_dport, udp_get_no_check6_tx(sk)); + fl.fl6_dport, udp_get_no_check6_tx(sk), 0); ret = 0; err: local_bh_enable(); diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index 1cc18acd242d..b22f9866be8e 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -2586,7 +2586,8 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, udp_tunnel6_xmit_skb(ndst, sock6->sock->sk, skb, dev, &saddr, &pkey->u.ipv6.dst, tos, ttl, - pkey->label, src_port, dst_port, !udp_sum); + pkey->label, src_port, dst_port, !udp_sum, + 0); #endif } vxlan_vnifilter_count(vxlan, vni, NULL, VXLAN_VNI_STATS_TX, pkt_len); diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c index 88e685667bc0..253488f8c00f 100644 --- a/drivers/net/wireguard/socket.c +++ b/drivers/net/wireguard/socket.c @@ -151,7 +151,7 @@ static int send6(struct wg_device *wg, struct sk_buff *skb, skb->ignore_df = 1; udp_tunnel6_xmit_skb(dst, sock, skb, skb->dev, &fl.saddr, &fl.daddr, ds, ip6_dst_hoplimit(dst), 0, fl.fl6_sport, - fl.fl6_dport, false); + fl.fl6_dport, false, 0); goto out; err: diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h index 399592405c72..dd163495f353 100644 --- a/include/net/ip6_tunnel.h +++ b/include/net/ip6_tunnel.h @@ -152,11 +152,12 @@ int ip6_tnl_get_iflink(const struct net_device *dev); int ip6_tnl_change_mtu(struct net_device *dev, int new_mtu); static inline void ip6tunnel_xmit(struct sock *sk, struct sk_buff *skb, - struct net_device *dev) + struct net_device *dev, u16 ip6cb_flags) { int pkt_len, err; memset(skb->cb, 0, sizeof(struct inet6_skb_parm)); + IP6CB(skb)->flags = ip6cb_flags; pkt_len = skb->len - skb_inner_network_offset(skb); err = ip6_local_out(dev_net(skb_dst(skb)->dev), sk, skb); 
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h index 0b01f6ade20d..e3c70b579095 100644 --- a/include/net/udp_tunnel.h +++ b/include/net/udp_tunnel.h @@ -158,7 +158,8 @@ void udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk, const struct in6_addr *saddr, const struct in6_addr *daddr, __u8 prio, __u8 ttl, __be32 label, - __be16 src_port, __be16 dst_port, bool nocheck); + __be16 src_port, __be16 dst_port, bool nocheck, + u16 ip6cb_flags); void udp_tunnel_sock_release(struct socket *sock); diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 894d3158a6f0..a885bb5c98ea 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -1278,7 +1278,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield, ipv6h->nexthdr = proto; ipv6h->saddr = fl6->saddr; ipv6h->daddr = fl6->daddr; - ip6tunnel_xmit(NULL, skb, dev); + ip6tunnel_xmit(NULL, skb, dev, 0); return 0; tx_err_link_failure: DEV_STATS_INC(dev, tx_carrier_errors); diff --git a/net/ipv6/ip6_udp_tunnel.c b/net/ipv6/ip6_udp_tunnel.c index 21681718b7bb..8ebe17a6058a 100644 --- a/net/ipv6/ip6_udp_tunnel.c +++ b/net/ipv6/ip6_udp_tunnel.c @@ -80,7 +80,8 @@ void udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk, const struct in6_addr *saddr, const struct in6_addr *daddr, __u8 prio, __u8 ttl, __be32 label, - __be16 src_port, __be16 dst_port, bool nocheck) + __be16 src_port, __be16 dst_port, bool nocheck, + u16 ip6cb_flags) { struct udphdr *uh; struct ipv6hdr *ip6h; @@ -108,7 +109,7 @@ void udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk, ip6h->daddr = *daddr; ip6h->saddr = *saddr; - ip6tunnel_xmit(sk, skb, dev); + ip6tunnel_xmit(sk, skb, dev, ip6cb_flags); } EXPORT_SYMBOL_GPL(udp_tunnel6_xmit_skb); diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c index d1ecf7454827..3336dcfb4515 100644 --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -263,7 +263,7 @@ static int sctp_v6_xmit(struct sk_buff *skb, struct sctp_transport *t) 
udp_tunnel6_xmit_skb(dst, sk, skb, NULL, &fl6->saddr, &fl6->daddr, tclass, ip6_dst_hoplimit(dst), label, - sctp_sk(sk)->udp_port, t->encap_port, false); + sctp_sk(sk)->udp_port, t->encap_port, false, 0); return 0; } diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c index 414713fcd8c5..a024fcc8c0cb 100644 --- a/net/tipc/udp_media.c +++ b/net/tipc/udp_media.c @@ -219,7 +219,7 @@ static int tipc_udp_xmit(struct net *net, struct sk_buff *skb, ttl = ip6_dst_hoplimit(ndst); udp_tunnel6_xmit_skb(ndst, ub->ubsock->sk, skb, NULL, &src->ipv6, &dst->ipv6, 0, ttl, 0, - src->port, dst->port, false); + src->port, dst->port, false, 0); #endif } local_bh_enable(); -- 2.49.0 From kuba at kernel.org Wed Jun 18 18:47:35 2025 From: kuba at kernel.org (Jakub Kicinski) Date: Wed, 18 Jun 2025 11:47:35 -0700 Subject: Issue with rtnl_lock in wg_pm_notification In-Reply-To: <2c5257a7-e330-4983-8447-3e217b616b2e@quicinc.com> References: <2c5257a7-e330-4983-8447-3e217b616b2e@quicinc.com> Message-ID: <20250618114735.5472f2cd@kernel.org> On Tue, 17 Jun 2025 01:44:56 +0530 Sharath Chandra Vurukala wrote: > I do not understand fully what wireguard functionality is, but > considering that rtnl_lock is a global one, it does not seem to be a > good design to have notification callback acquire this lock. I'm not very familiar with the PM locks, but isn't the PM notification lock also a global one? Again, not an expert but having PM lock outside the rtnl_lock would be more intuitive to me. 
From vegeta at tuxpowered.net Wed Jun 18 20:38:43 2025
From: vegeta at tuxpowered.net (Kajetan Staszkiewicz)
Date: Wed, 18 Jun 2025 22:38:43 +0200
Subject: [PATCH] Wireguard-Apple: Restore iOS-like NWPath handling on MacOS app
Message-ID: <01a00e74-0eca-4914-9507-f2b96118b0e7@tuxpowered.net>

I've sent this already when the mailing list was down, maybe it went unnoticed:

Sometimes after a network path change, especially when only an "unsatisfied" network path is available, for example when a laptop loses all LAN and WiFi networks, further network path changes are ignored.

When "satisfied" networks disappear, the cloned route for the bound socket is removed by the system and WireGuard packets are routed through the tunnel. This results in a non-operational tunnel.

The iOS code does not manifest this behaviour, as it properly disables the tunnel when no "satisfied" networks are available.

Remove the special MacOS case, use the iOS code on MacOS app.

-- 
| pozdrawiam / regards | Powered by Debian and FreeBSD  |
| Kajetan Staszkiewicz | www: http://tuxpowered.net     |
|                      | matrix: @vegeta:tuxpowered.net |
`----------------------^--------------------------------'
-------------- next part --------------
From 0dc1630f54201dac005125b065265b7b3394bc29 Mon Sep 17 00:00:00 2001
From: Kajetan Staszkiewicz
Date: Mon, 27 Jan 2025 12:48:36 +0100
Subject: [PATCH] Restore iOS-like NWPath handling on MacOS app

Sometimes after a network path change, especially when only an "unsatisfied" network path is available, for example when a laptop loses all LAN and WiFi networks, further network path changes are ignored.

When "satisfied" networks disappear, the cloned route for the bound socket is removed by the system and WireGuard packets are routed through the tunnel. This results in a non-operational tunnel.

The iOS code does not manifest this behaviour, as it properly disables the tunnel when no "satisfied" networks are available.

Remove the special MacOS case, use the iOS code on MacOS app.
--- Sources/WireGuardKit/WireGuardAdapter.swift | 8 -------- 1 file changed, 8 deletions(-) diff --git a/Sources/WireGuardKit/WireGuardAdapter.swift b/Sources/WireGuardKit/WireGuardAdapter.swift index f7be19b..f5bf115 100644 --- a/Sources/WireGuardKit/WireGuardAdapter.swift +++ b/Sources/WireGuardKit/WireGuardAdapter.swift @@ -409,25 +409,20 @@ public class WireGuardAdapter { self.logHandler(.error, "Failed to resolve endpoint \(resolutionError.address): \(resolutionError.errorDescription ?? "(nil)")") } } } /// Helper method used by network path monitor. /// - Parameter path: new network path private func didReceivePathUpdate(path: Network.NWPath) { self.logHandler(.verbose, "Network change detected with \(path.status) route and interface order \(path.availableInterfaces)") - #if os(macOS) - if case .started(let handle, _) = self.state { - wgBumpSockets(handle) - } - #elseif os(iOS) switch self.state { case .started(let handle, let settingsGenerator): if path.status.isSatisfiable { let (wgConfig, resolutionResults) = settingsGenerator.endpointUapiConfiguration() self.logEndpointResolutionResults(resolutionResults) wgSetConfig(handle, wgConfig) wgDisableSomeRoamingForBrokenMobileSemantics(handle) wgBumpSockets(handle) } else { @@ -453,23 +448,20 @@ public class WireGuardAdapter { settingsGenerator ) } catch { self.logHandler(.error, "Failed to restart backend: \(error.localizedDescription)") } case .stopped: // no-op break } - #else - #error("Unsupported") - #endif } } /// A enum describing WireGuard log levels defined in `api-apple.go`. public enum WireGuardLogLevel: Int32 { case verbose = 0 case error = 1 } private extension Network.NWPath.Status { -- 2.47.0 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: OpenPGP_signature.asc Type: application/pgp-signature Size: 840 bytes Desc: OpenPGP digital signature URL: From yury.norov at gmail.com Thu Jun 19 14:54:59 2025 From: yury.norov at gmail.com (Yury Norov) Date: Thu, 19 Jun 2025 10:54:59 -0400 Subject: [PATCH v2] wireguard: queueing: simplify wg_cpumask_next_online() Message-ID: <20250619145501.351951-1-yury.norov@gmail.com> From: Yury Norov [NVIDIA] wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the function significantly simpler. While there, fix opencoded cpu_online() too. Signed-off-by: Yury Norov [NVIDIA] --- v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov at gmail.com/ v2: - fix 'cpu' undeclared; - change subject (Jason); - keep the original function structure (Jason); drivers/net/wireguard/queueing.h | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h index 7eb76724b3ed..56314f98b6ba 100644 --- a/drivers/net/wireguard/queueing.h +++ b/drivers/net/wireguard/queueing.h @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) { - unsigned int cpu = *stored_cpu, cpu_index, i; + unsigned int cpu = *stored_cpu; + + if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu))) + cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask); - if (unlikely(cpu >= nr_cpu_ids || - !cpumask_test_cpu(cpu, cpu_online_mask))) { - cpu_index = id % cpumask_weight(cpu_online_mask); - cpu = cpumask_first(cpu_online_mask); - for (i = 0; i < cpu_index; ++i) - cpu = cpumask_next(cpu, cpu_online_mask); - *stored_cpu = cpu; - } return cpu; } -- 2.43.0 From kevans at FreeBSD.org Thu Jun 26 03:37:55 2025 From: kevans at FreeBSD.org (Kyle Evans) Date: Wed, 25 Jun 2025 22:37:55 -0500 Subject: [RESEND PATCH v1 wireguard-tools] ipc: linux: Support incremental allowed ips updates 
In-Reply-To:
References: <20250517192955.594735-1-jordan@jrife.io>
Message-ID:

On 5/21/25 18:51, Jason A. Donenfeld wrote:
> On Thu, May 22, 2025 at 1:02 AM Jordan Rife wrote:
>>>> Merged here:
>>>> https://git.zx2c4.com/wireguard-tools/commit/?id=0788f90810efde88cfa07ed96e7eca77c7f2eedd
>>>>
>>>> With a followup here:
>>>> https://git.zx2c4.com/wireguard-tools/commit/?id=dce8ac6e2fa30f8b07e84859f244f81b3c6b2353
>>>
>>> Also,
>>> https://git.zx2c4.com/wireguard-go/commit/?id=256bcbd70d5b4eaae2a9f21a9889498c0f89041c
>>
>> Nice, cool to see this extended to wireguard-go as well. As a follow up,
>> I was planning to also create a patch for golang.zx2c4.com/wireguard/wgctrl
>> so the feature can be used from there too.
>
> Wonderful, please do! Looking forward to merging that.
>
> There's already an open PR in FreeBSD too.

FreeBSD support landed as of:

https://cgit.freebsd.org/src/commit/?id=f6d9e22982a

It will be available in FreeBSD 15.0 and probably 14.4 (to be released next
year) as well. I have pushed a branch, ke/fbsd_aip, to the wireguard-tools
repository for your consideration.

Aside: this is a really neat feature.

Thanks!

Kyle Evans

From jordan at jrife.io Sat Jun 28 16:05:24 2025
From: jordan at jrife.io (Jordan Rife)
Date: Sat, 28 Jun 2025 09:05:24 -0700
Subject: [RESEND PATCH v1 wireguard-tools] ipc: linux: Support incremental allowed ips updates
In-Reply-To:
References: <20250517192955.594735-1-jordan@jrife.io>
Message-ID:

On Wed, Jun 25, 2025 at 10:37:55PM -0500, Kyle Evans wrote:
> On 5/21/25 18:51, Jason A.
Donenfeld wrote:
> > On Thu, May 22, 2025 at 1:02 AM Jordan Rife wrote:
> > > > > Merged here:
> > > > > https://git.zx2c4.com/wireguard-tools/commit/?id=0788f90810efde88cfa07ed96e7eca77c7f2eedd
> > > > >
> > > > > With a followup here:
> > > > > https://git.zx2c4.com/wireguard-tools/commit/?id=dce8ac6e2fa30f8b07e84859f244f81b3c6b2353
> > > >
> > > > Also,
> > > > https://git.zx2c4.com/wireguard-go/commit/?id=256bcbd70d5b4eaae2a9f21a9889498c0f89041c
> > >
> > > Nice, cool to see this extended to wireguard-go as well. As a follow up,
> > > I was planning to also create a patch for golang.zx2c4.com/wireguard/wgctrl
> > > so the feature can be used from there too.
> >
> > Wonderful, please do! Looking forward to merging that.
> >
> > There's already an open PR in FreeBSD too.
>
> FreeBSD support landed as of:
>
> https://cgit.freebsd.org/src/commit/?id=f6d9e22982a
>
> It will be available in FreeBSD 15.0 and probably 14.4 (to be released next
> year) as well. I have pushed a branch, ke/fbsd_aip, to the wireguard-tools
> repository for your consideration.
>
> Aside: this is a really neat feature.
>
> Thanks!
>
> Kyle Evans

That's great news. It's nice to see this feature percolating through
the WireGuard ecosystem.

I was working on adding support for direct IP removal to wgctrl-go too,
a Go library for controlling WireGuard devices:

https://github.com/WireGuard/wgctrl-go/pull/156

While I'm at it, I'll try to add native support for IP removal on
FreeBSD if I can get a dev build working with the latest and greatest
( I am a FreeBSD noob :) ).
Jordan

From kevans at FreeBSD.org Mon Jun 30 01:44:33 2025
From: kevans at FreeBSD.org (Kyle Evans)
Date: Sun, 29 Jun 2025 20:44:33 -0500
Subject: [RESEND PATCH v1 wireguard-tools] ipc: linux: Support incremental allowed ips updates
In-Reply-To:
References: <20250517192955.594735-1-jordan@jrife.io>
Message-ID: <2d9b26c6-2512-4031-bce5-afacfdb780c2@FreeBSD.org>

On 6/28/25 11:05, Jordan Rife wrote:
> On Wed, Jun 25, 2025 at 10:37:55PM -0500, Kyle Evans wrote:
>> On 5/21/25 18:51, Jason A. Donenfeld wrote:
>>> On Thu, May 22, 2025 at 1:02 AM Jordan Rife wrote:
>>>>>> Merged here:
>>>>>> https://git.zx2c4.com/wireguard-tools/commit/?id=0788f90810efde88cfa07ed96e7eca77c7f2eedd
>>>>>>
>>>>>> With a followup here:
>>>>>> https://git.zx2c4.com/wireguard-tools/commit/?id=dce8ac6e2fa30f8b07e84859f244f81b3c6b2353
>>>>>
>>>>> Also,
>>>>> https://git.zx2c4.com/wireguard-go/commit/?id=256bcbd70d5b4eaae2a9f21a9889498c0f89041c
>>>>
>>>> Nice, cool to see this extended to wireguard-go as well. As a follow up,
>>>> I was planning to also create a patch for golang.zx2c4.com/wireguard/wgctrl
>>>> so the feature can be used from there too.
>>>
>>> Wonderful, please do! Looking forward to merging that.
>>>
>>> There's already an open PR in FreeBSD too.
>>
>> FreeBSD support landed as of:
>>
>> https://cgit.freebsd.org/src/commit/?id=f6d9e22982a
>>
>> It will be available in FreeBSD 15.0 and probably 14.4 (to be released next
>> year) as well. I have pushed a branch, ke/fbsd_aip, to the wireguard-tools
>> repository for your consideration.
>>
>> Aside: this is a really neat feature.
>>
>> Thanks!
>>
>> Kyle Evans
>
> That's great news. It's nice to see this feature percolating through
> the WireGuard ecosystem.
>
> I was working on adding support for direct IP removal to wgctrl-go too,
> a Go library for controlling WireGuard devices:
>
> https://github.com/WireGuard/wgctrl-go/pull/156
>

Ah, neat!
> While I'm at it, I'll try to add native support for IP removal on > FreeBSD if I can get a dev build working with the latest and greatest > ( I am a FreeBSD noob :) ). > Feel free to shoot me an e-mail off-list if you need any assistance there, more than happy to lend a hand for the cause. Thanks, Kyle Evans From horms at kernel.org Mon Jun 30 16:52:44 2025 From: horms at kernel.org (Simon Horman) Date: Mon, 30 Jun 2025 17:52:44 +0100 Subject: [PATCH v2] wireguard: queueing: simplify wg_cpumask_next_online() In-Reply-To: <20250619145501.351951-1-yury.norov@gmail.com> References: <20250619145501.351951-1-yury.norov@gmail.com> Message-ID: <20250630165244.GL41770@horms.kernel.org> On Thu, Jun 19, 2025 at 10:54:59AM -0400, Yury Norov wrote: > From: Yury Norov [NVIDIA] > > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the > function significantly simpler. While there, fix opencoded cpu_online() > too. > > Signed-off-by: Yury Norov [NVIDIA] > --- > v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov at gmail.com/ > v2: > - fix 'cpu' undeclared; > - change subject (Jason); > - keep the original function structure (Jason); Reviewed-by: Simon Horman From Jason at zx2c4.com Mon Jun 30 17:24:33 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Mon, 30 Jun 2025 19:24:33 +0200 Subject: [PATCH v2] wireguard: queueing: simplify wg_cpumask_next_online() In-Reply-To: <20250619145501.351951-1-yury.norov@gmail.com> References: <20250619145501.351951-1-yury.norov@gmail.com> Message-ID: On Thu, Jun 19, 2025 at 10:54:59AM -0400, Yury Norov wrote: > From: Yury Norov [NVIDIA] > > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the > function significantly simpler. While there, fix opencoded cpu_online() > too. 
> > Signed-off-by: Yury Norov [NVIDIA] > --- > v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov at gmail.com/ > v2: > - fix 'cpu' undeclared; > - change subject (Jason); > - keep the original function structure (Jason); > > drivers/net/wireguard/queueing.h | 13 ++++--------- > 1 file changed, 4 insertions(+), 9 deletions(-) > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h > index 7eb76724b3ed..56314f98b6ba 100644 > --- a/drivers/net/wireguard/queueing.h > +++ b/drivers/net/wireguard/queueing.h > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) > > static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) > { > - unsigned int cpu = *stored_cpu, cpu_index, i; > + unsigned int cpu = *stored_cpu; > + > + if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu))) > + cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask); I was about to apply this but then it occurred to me: what happens if cpu_online_mask changes (shrinks) after num_online_cpus() is evaluated? cpumask_nth() will then return nr_cpu_ids? Jason From yury.norov at gmail.com Mon Jun 30 17:33:37 2025 From: yury.norov at gmail.com (Yury Norov) Date: Mon, 30 Jun 2025 13:33:37 -0400 Subject: [PATCH v2] wireguard: queueing: simplify wg_cpumask_next_online() In-Reply-To: References: <20250619145501.351951-1-yury.norov@gmail.com> Message-ID: On Mon, Jun 30, 2025 at 07:24:33PM +0200, Jason A. Donenfeld wrote: > On Thu, Jun 19, 2025 at 10:54:59AM -0400, Yury Norov wrote: > > From: Yury Norov [NVIDIA] > > > > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the > > function significantly simpler. While there, fix opencoded cpu_online() > > too. 
> >
> > Signed-off-by: Yury Norov [NVIDIA]
> > ---
> > v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov at gmail.com/
> > v2:
> > - fix 'cpu' undeclared;
> > - change subject (Jason);
> > - keep the original function structure (Jason);
> >
> > drivers/net/wireguard/queueing.h | 13 ++++---------
> > 1 file changed, 4 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
> > index 7eb76724b3ed..56314f98b6ba 100644
> > --- a/drivers/net/wireguard/queueing.h
> > +++ b/drivers/net/wireguard/queueing.h
> > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating)
> >
> >  static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
> >  {
> > -	unsigned int cpu = *stored_cpu, cpu_index, i;
> > +	unsigned int cpu = *stored_cpu;
> > +
> > +	if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu)))
> > +		cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask);
>
> I was about to apply this but then it occurred to me: what happens if
> cpu_online_mask changes (shrinks) after num_online_cpus() is evaluated?
> cpumask_nth() will then return nr_cpu_ids?

It will return >= nr_cpu_ids. The original version based on a for-loop
does the same, so I decided that the caller is safe against it.

If not, I can send a v3. But, what should we do - retry, or return a local cpu? Or something else?

Thanks,
Yury

From Jason at zx2c4.com Mon Jun 30 17:38:02 2025
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Mon, 30 Jun 2025 19:38:02 +0200
Subject: [PATCH v2] wireguard: queueing: simplify wg_cpumask_next_online()
In-Reply-To:
References: <20250619145501.351951-1-yury.norov@gmail.com>
Message-ID:

On Mon, Jun 30, 2025 at 01:33:37PM -0400, Yury Norov wrote:
> On Mon, Jun 30, 2025 at 07:24:33PM +0200, Jason A.
Donenfeld wrote:
> > On Thu, Jun 19, 2025 at 10:54:59AM -0400, Yury Norov wrote:
> > > From: Yury Norov [NVIDIA]
> > >
> > > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the
> > > function significantly simpler. While there, fix opencoded cpu_online()
> > > too.
> > >
> > > Signed-off-by: Yury Norov [NVIDIA]
> > > ---
> > > v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov at gmail.com/
> > > v2:
> > > - fix 'cpu' undeclared;
> > > - change subject (Jason);
> > > - keep the original function structure (Jason);
> > >
> > > drivers/net/wireguard/queueing.h | 13 ++++---------
> > > 1 file changed, 4 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
> > > index 7eb76724b3ed..56314f98b6ba 100644
> > > --- a/drivers/net/wireguard/queueing.h
> > > +++ b/drivers/net/wireguard/queueing.h
> > > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating)
> > >
> > >  static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
> > >  {
> > > -	unsigned int cpu = *stored_cpu, cpu_index, i;
> > > +	unsigned int cpu = *stored_cpu;
> > > +
> > > +	if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu)))
> > > +		cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask);
> >
> > I was about to apply this but then it occurred to me: what happens if
> > cpu_online_mask changes (shrinks) after num_online_cpus() is evaluated?
> > cpumask_nth() will then return nr_cpu_ids?
>
> It will return >= nr_cpu_ids. The original version based on a for-loop
> does the same, so I decided that the caller is safe against it.

Good point. I just checked... This goes into queue_work_on() which eventually hits:

	/* pwq which will be used unless @work is executing elsewhere */
	if (req_cpu == WORK_CPU_UNBOUND) {

And it turns out WORK_CPU_UNBOUND is the same as nr_cpu_ids. So I guess that's a fine failure mode. I'll queue this patch up.
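The failure mode discussed above can be sketched in plain userspace C. This is only a model under stated assumptions, not kernel code: a 64-bit integer stands in for cpu_online_mask, NR_CPU_IDS is an arbitrary constant, and mask_weight(), mask_nth() and choose_cpu() are hypothetical stand-ins for num_online_cpus(), cpumask_nth() and the patched selection logic. It shows that when the mask shrinks between taking the weight and doing the n-th lookup, the result can land past the last valid CPU index.

```c
#include <stdint.h>

#define NR_CPU_IDS 8u /* hypothetical CPU count for this model */

/* Count the set bits in the mask (stand-in for num_online_cpus()). */
static unsigned int mask_weight(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Return the index of the n-th set bit (0-based), or NR_CPU_IDS when the
 * mask has fewer than n + 1 bits set (stand-in for cpumask_nth()).
 */
static unsigned int mask_nth(unsigned int n, uint64_t mask)
{
	for (unsigned int cpu = 0; cpu < NR_CPU_IDS; cpu++) {
		if (!(mask & (1ull << cpu)))
			continue;
		if (n-- == 0)
			return cpu;
	}
	return NR_CPU_IDS;
}

/*
 * Model of the selection: the weight is sampled from "before" and the
 * lookup runs on "after", as if CPUs went offline between the two steps.
 * "before" must be non-zero, mirroring the fact that at least one CPU is
 * always online.
 */
static unsigned int choose_cpu(unsigned int id, uint64_t before, uint64_t after)
{
	return mask_nth(id % mask_weight(before), after);
}
```

With an unchanged mask, choose_cpu(5, 0xFF, 0xFF) picks CPU 5. If the mask shrinks from 0xFF to 0x0F in between, choose_cpu(7, 0xFF, 0x0F) yields NR_CPU_IDS, which is the out-of-range value the caller has to tolerate.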
Jason From yury.norov at gmail.com Mon Jun 30 17:54:01 2025 From: yury.norov at gmail.com (Yury Norov) Date: Mon, 30 Jun 2025 13:54:01 -0400 Subject: [PATCH v2] wireguard: queueing: simplify wg_cpumask_next_online() In-Reply-To: References: <20250619145501.351951-1-yury.norov@gmail.com> Message-ID: On Mon, Jun 30, 2025 at 07:38:02PM +0200, Jason A. Donenfeld wrote: > On Mon, Jun 30, 2025 at 01:33:37PM -0400, Yury Norov wrote: > > On Mon, Jun 30, 2025 at 07:24:33PM +0200, Jason A. Donenfeld wrote: > > > On Thu, Jun 19, 2025 at 10:54:59AM -0400, Yury Norov wrote: > > > > From: Yury Norov [NVIDIA] > > > > > > > > wg_cpumask_choose_online() opencodes cpumask_nth(). Use it and make the > > > > function significantly simpler. While there, fix opencoded cpu_online() > > > > too. > > > > > > > > Signed-off-by: Yury Norov [NVIDIA] > > > > --- > > > > v1: https://lore.kernel.org/all/20250604233656.41896-1-yury.norov at gmail.com/ > > > > v2: > > > > - fix 'cpu' undeclared; > > > > - change subject (Jason); > > > > - keep the original function structure (Jason); > > > > > > > > drivers/net/wireguard/queueing.h | 13 ++++--------- > > > > 1 file changed, 4 insertions(+), 9 deletions(-) > > > > > > > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h > > > > index 7eb76724b3ed..56314f98b6ba 100644 > > > > --- a/drivers/net/wireguard/queueing.h > > > > +++ b/drivers/net/wireguard/queueing.h > > > > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) > > > > > > > > static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) > > > > { > > > > - unsigned int cpu = *stored_cpu, cpu_index, i; > > > > + unsigned int cpu = *stored_cpu; > > > > + > > > > + if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu))) > > > > + cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask); > > > > > > I was about to apply this but then it occurred to me: what happens if > > > cpu_online_mask 
changes (shrinks) after num_online_cpus() is evaluated? > > > cpumask_nth() will then return nr_cpu_ids? > > > > It will return >= nr_cpu_ids. The original version, based on a for-loop, > > does the same, so I decided that the caller is safe against it. > > Good point. I just checked... This goes into queue_work_on() which > eventually hits: > > /* pwq which will be used unless @work is executing elsewhere */ > if (req_cpu == WORK_CPU_UNBOUND) { > > And it turns out WORK_CPU_UNBOUND is the same as nr_cpu_ids. So I guess > that's a fine failure mode. Actually, cpumask_nth_cpu may return >= nr_cpu_ids because of small_cpumask_nbits optimization. So it's safer to relax the condition. Can you consider applying the following patch for that? Thanks, Yury From fbdce972342437fb12703cae0c3a4f8f9e218a1b Mon Sep 17 00:00:00 2001 From: Yury Norov (NVIDIA) Date: Mon, 30 Jun 2025 13:47:49 -0400 Subject: [PATCH] workqueue: relax condition in __queue_work() Some cpumask search functions may return a number greater than nr_cpu_ids when nothing is found. Adjust __queue_work() to it. Signed-off-by: Yury Norov (NVIDIA) --- kernel/workqueue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 9f9148075828..abacfe157fe6 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -2261,7 +2261,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, rcu_read_lock(); retry: /* pwq which will be used unless @work is executing elsewhere */ - if (req_cpu == WORK_CPU_UNBOUND) { + if (req_cpu >= WORK_CPU_UNBOUND) { if (wq->flags & WQ_UNBOUND) cpu = wq_select_unbound_cpu(raw_smp_processor_id()); else -- 2.43.0 From Jason at zx2c4.com Mon Jun 30 17:55:49 2025 From: Jason at zx2c4.com (Jason A. 
Donenfeld) Date: Mon, 30 Jun 2025 19:55:49 +0200 Subject: [PATCH v2] wireguard: queueing: simplify wg_cpumask_next_online() In-Reply-To: References: <20250619145501.351951-1-yury.norov@gmail.com> Message-ID: Hi Yury, > > > > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h > > > > > index 7eb76724b3ed..56314f98b6ba 100644 > > > > > --- a/drivers/net/wireguard/queueing.h > > > > > +++ b/drivers/net/wireguard/queueing.h > > > > > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) > > > > > > > > > > static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) > > > > > { > > > > > - unsigned int cpu = *stored_cpu, cpu_index, i; > > > > > + unsigned int cpu = *stored_cpu; > > > > > + > > > > > + if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu))) > > > > > + cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask); > > > > > > > > I was about to apply this but then it occurred to me: what happens if > > > > cpu_online_mask changes (shrinks) after num_online_cpus() is evaluated? > > > > cpumask_nth() will then return nr_cpu_ids? > > > > > > It will return >= nr_cpu_ids. The original version, based on a for-loop, > > > does the same, so I decided that the caller is safe against it. > > > > Good point. I just checked... This goes into queue_work_on() which > > eventually hits: > > > > /* pwq which will be used unless @work is executing elsewhere */ > > if (req_cpu == WORK_CPU_UNBOUND) { > > > > And it turns out WORK_CPU_UNBOUND is the same as nr_cpu_ids. So I guess > > that's a fine failure mode. > > Actually, cpumask_nth_cpu may return >= nr_cpu_ids because of > small_cpumask_nbits optimization. So it's safer to relax the > condition. > > Can you consider applying the following patch for that? 
> > Thanks, > Yury > > > From fbdce972342437fb12703cae0c3a4f8f9e218a1b Mon Sep 17 00:00:00 2001 > From: Yury Norov (NVIDIA) > Date: Mon, 30 Jun 2025 13:47:49 -0400 > Subject: [PATCH] workqueue: relax condition in __queue_work() > > Some cpumask search functions may return a number greater than > nr_cpu_ids when nothing is found. Adjust __queue_work() to it. > > Signed-off-by: Yury Norov (NVIDIA) > --- > kernel/workqueue.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/workqueue.c b/kernel/workqueue.c > index 9f9148075828..abacfe157fe6 100644 > --- a/kernel/workqueue.c > +++ b/kernel/workqueue.c > @@ -2261,7 +2261,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, > rcu_read_lock(); > retry: > /* pwq which will be used unless @work is executing elsewhere */ > - if (req_cpu == WORK_CPU_UNBOUND) { > + if (req_cpu >= WORK_CPU_UNBOUND) { > if (wq->flags & WQ_UNBOUND) > cpu = wq_select_unbound_cpu(raw_smp_processor_id()); > else > Seems reasonable to me... Maybe submit this to Tejun and CC me? Jason From yury.norov at gmail.com Mon Jun 30 17:59:15 2025 From: yury.norov at gmail.com (Yury Norov) Date: Mon, 30 Jun 2025 13:59:15 -0400 Subject: [PATCH v2] wireguard: queueing: simplify wg_cpumask_next_online() In-Reply-To: References: <20250619145501.351951-1-yury.norov@gmail.com> Message-ID: On Mon, Jun 30, 2025 at 07:55:49PM +0200, Jason A. 
Donenfeld wrote: > Hi Yury, > > > > > > > diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h > > > > > > index 7eb76724b3ed..56314f98b6ba 100644 > > > > > > --- a/drivers/net/wireguard/queueing.h > > > > > > +++ b/drivers/net/wireguard/queueing.h > > > > > > @@ -104,16 +104,11 @@ static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) > > > > > > > > > > > > static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) > > > > > > { > > > > > > - unsigned int cpu = *stored_cpu, cpu_index, i; > > > > > > + unsigned int cpu = *stored_cpu; > > > > > > + > > > > > > + if (unlikely(cpu >= nr_cpu_ids || !cpu_online(cpu))) > > > > > > + cpu = *stored_cpu = cpumask_nth(id % num_online_cpus(), cpu_online_mask); > > > > > > > > > > I was about to apply this but then it occurred to me: what happens if > > > > > cpu_online_mask changes (shrinks) after num_online_cpus() is evaluated? > > > > > cpumask_nth() will then return nr_cpu_ids? > > > > > > > > It will return >= nr_cpu_ids. The original version, based on a for-loop, > > > > does the same, so I decided that the caller is safe against it. > > > > > > Good point. I just checked... This goes into queue_work_on() which > > > eventually hits: > > > > > > /* pwq which will be used unless @work is executing elsewhere */ > > > if (req_cpu == WORK_CPU_UNBOUND) { > > > > > > And it turns out WORK_CPU_UNBOUND is the same as nr_cpu_ids. So I guess > > > that's a fine failure mode. > > > > Actually, cpumask_nth_cpu may return >= nr_cpu_ids because of > > small_cpumask_nbits optimization. So it's safer to relax the > > condition. > > > > Can you consider applying the following patch for that? 
> > > > Thanks, > > Yury > > > > > > From fbdce972342437fb12703cae0c3a4f8f9e218a1b Mon Sep 17 00:00:00 2001 > > From: Yury Norov (NVIDIA) > > Date: Mon, 30 Jun 2025 13:47:49 -0400 > > Subject: [PATCH] workqueue: relax condition in __queue_work() > > > > Some cpumask search functions may return a number greater than > > nr_cpu_ids when nothing is found. Adjust __queue_work() to it. > > > > Signed-off-by: Yury Norov (NVIDIA) > > --- > > kernel/workqueue.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/kernel/workqueue.c b/kernel/workqueue.c > > index 9f9148075828..abacfe157fe6 100644 > > --- a/kernel/workqueue.c > > +++ b/kernel/workqueue.c > > @@ -2261,7 +2261,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, > > rcu_read_lock(); > > retry: > > /* pwq which will be used unless @work is executing elsewhere */ > > - if (req_cpu == WORK_CPU_UNBOUND) { > > + if (req_cpu >= WORK_CPU_UNBOUND) { > > if (wq->flags & WQ_UNBOUND) > > cpu = wq_select_unbound_cpu(raw_smp_processor_id()); > > else > > > > Seems reasonable to me... Maybe submit this to Tejun and CC me? Sure, no problem. From yury.norov at gmail.com Mon Jun 30 18:15:06 2025 From: yury.norov at gmail.com (Yury Norov) Date: Mon, 30 Jun 2025 14:15:06 -0400 Subject: [PATCH v2] wireguard: queueing: simplify wg_cpumask_next_online() In-Reply-To: References: <20250619145501.351951-1-yury.norov@gmail.com> Message-ID: > > > From fbdce972342437fb12703cae0c3a4f8f9e218a1b Mon Sep 17 00:00:00 2001 > > > From: Yury Norov (NVIDIA) > > > Date: Mon, 30 Jun 2025 13:47:49 -0400 > > > Subject: [PATCH] workqueue: relax condition in __queue_work() > > > > > > Some cpumask search functions may return a number greater than > > > nr_cpu_ids when nothing is found. Adjust __queue_work() to it. 
> > > > > > Signed-off-by: Yury Norov (NVIDIA) > > > --- > > > kernel/workqueue.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/kernel/workqueue.c b/kernel/workqueue.c > > > index 9f9148075828..abacfe157fe6 100644 > > > --- a/kernel/workqueue.c > > > +++ b/kernel/workqueue.c > > > @@ -2261,7 +2261,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, > > > rcu_read_lock(); > > > retry: > > > /* pwq which will be used unless @work is executing elsewhere */ > > > - if (req_cpu == WORK_CPU_UNBOUND) { > > > + if (req_cpu >= WORK_CPU_UNBOUND) { > > > if (wq->flags & WQ_UNBOUND) > > > cpu = wq_select_unbound_cpu(raw_smp_processor_id()); > > > else > > > > > > > Seems reasonable to me... Maybe submit this to Tejun and CC me? > > Sure, no problem. Hmm... So, actually WORK_CPU_UNBOUND is NR_CPUS, which is not the same as nr_cpu_ids. For example, on my Ubuntu machine, CONFIG_NR_CPUS is 8192, and nr_cpu_ids is 8. So, for wg_cpumask_next_online() to work properly, we need to return WORK_CPU_UNBOUND if nothing is found. I think I need to send a v3... Thanks, Yury From Jason at zx2c4.com Mon Jun 30 18:20:45 2025 From: Jason at zx2c4.com (Jason A. Donenfeld) Date: Mon, 30 Jun 2025 20:20:45 +0200 Subject: [PATCH v2] wireguard: queueing: simplify wg_cpumask_next_online() In-Reply-To: References: <20250619145501.351951-1-yury.norov@gmail.com> Message-ID: On Mon, Jun 30, 2025 at 8:15 PM Yury Norov wrote: > > > > > From fbdce972342437fb12703cae0c3a4f8f9e218a1b Mon Sep 17 00:00:00 2001 > > > > From: Yury Norov (NVIDIA) > > > > Date: Mon, 30 Jun 2025 13:47:49 -0400 > > > > Subject: [PATCH] workqueue: relax condition in __queue_work() > > > > > > > > Some cpumask search functions may return a number greater than > > > > nr_cpu_ids when nothing is found. Adjust __queue_work() to it. 
> > > > > > > > Signed-off-by: Yury Norov (NVIDIA) > > > > --- > > > > kernel/workqueue.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/kernel/workqueue.c b/kernel/workqueue.c > > > > index 9f9148075828..abacfe157fe6 100644 > > > > --- a/kernel/workqueue.c > > > > +++ b/kernel/workqueue.c > > > > @@ -2261,7 +2261,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, > > > > rcu_read_lock(); > > > > retry: > > > > /* pwq which will be used unless @work is executing elsewhere */ > > > > - if (req_cpu == WORK_CPU_UNBOUND) { > > > > + if (req_cpu >= WORK_CPU_UNBOUND) { > > > > if (wq->flags & WQ_UNBOUND) > > > > cpu = wq_select_unbound_cpu(raw_smp_processor_id()); > > > > else > > > > > > > > > > Seems reasonable to me... Maybe submit this to Tejun and CC me? > > > > Sure, no problem. > > Hmm... So, actually WORK_CPU_UNBOUND is NR_CPUS, which is not the same > as nr_cpu_ids. For example, on my Ubuntu machine, CONFIG_NR_CPUS > is 8192, and nr_cpu_ids is 8. > > So, for wg_cpumask_next_online() to work properly, we need to > return WORK_CPU_UNBOUND if nothing is found. Or just try again? Could just make your if into a while. From antonio at openvpn.net Thu Jun 12 11:21:10 2025 From: antonio at openvpn.net (Antonio Quartulli) Date: Thu, 12 Jun 2025 11:21:10 -0000 Subject: [PATCH net-next 01/14] net: ipv4: Add a flags argument to iptunnel_xmit(), udp_tunnel_xmit_skb() In-Reply-To: References: Message-ID: On 09/06/2025 22:50, Petr Machata wrote: > iptunnel_xmit() erases the contents of the SKB control block. In order to > be able to set particular IPCB flags on the SKB, add a corresponding > parameter, and propagate it to udp_tunnel_xmit_skb() as well. > > In one of the following patches, VXLAN driver will use this facility to > mark packets as subject to IP multicast routing. 
> > Signed-off-by: Petr Machata > Reviewed-by: Ido Schimmel > --- > > Notes: > CC: Pablo Neira Ayuso > CC: osmocom-net-gprs at lists.osmocom.org > CC: Andrew Lunn > CC: Taehee Yoo > CC: Antonio Quartulli > CC: "Jason A. Donenfeld" > CC: wireguard at lists.zx2c4.com > CC: Marcelo Ricardo Leitner > CC: linux-sctp at vger.kernel.org > CC: Jon Maloy > CC: tipc-discussion at lists.sourceforge.net Acked-by: Antonio Quartulli -- Antonio Quartulli OpenVPN Inc.