From syzbot+listded2f47f5f1d416c4059 at syzkaller.appspotmail.com Mon May 1 09:03:54 2023
From: syzbot+listded2f47f5f1d416c4059 at syzkaller.appspotmail.com (syzbot)
Date: Mon, 01 May 2023 02:03:54 -0700
Subject: [syzbot] Monthly wireguard report (Apr 2023)
Message-ID: <0000000000002e8a7c05fa9e1a7a@google.com>

Hello wireguard maintainers/developers,

This is a 31-day syzbot report for the wireguard subsystem. All related reports/information can be found at:
https://syzkaller.appspot.com/upstream/s/wireguard

During the period, 1 new issue was detected and 0 were fixed. In total, 4 issues are still open and 13 have been fixed so far.

Some of the still happening issues:

Ref Crashes Repro Title
<1> 620     No    KCSAN: data-race in wg_packet_send_staged_packets / wg_packet_send_staged_packets (3)
                  https://syzkaller.appspot.com/bug?extid=6ba34f16b98fe40daef1
<2> 440     No    KCSAN: data-race in wg_packet_decrypt_worker / wg_packet_rx_poll (2)
                  https://syzkaller.appspot.com/bug?extid=d1de830e4ecdaac83d89
<3> 6       No    KASAN: slab-use-after-free Write in enqueue_timer
                  https://syzkaller.appspot.com/bug?extid=c2775460db0e1c70018e

---
This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller at googlegroups.com.

To disable reminders for individual bugs, reply with the following command:
#syz set no-reminders

To change a bug's subsystems, reply with:
#syz set subsystems: new-subsystem

You may send multiple commands in a single email message.

From nogikh at google.com Tue May 2 09:03:29 2023
From: nogikh at google.com (Aleksandr Nogikh)
Date: Tue, 2 May 2023 11:03:29 +0200
Subject: [syzbot] Monthly wireguard report (Apr 2023)
In-Reply-To:
References: <0000000000002e8a7c05fa9e1a7a@google.com>
Message-ID:

Hello John,

Do you mean only these monthly reports or all messages from the mailing lists? You received this specific email because you're subscribed to one of the following lists: linux-kernel at vger.kernel.org, netdev at vger.kernel.org, wireguard at lists.zx2c4.com (the email was also sent to syzkaller-bugs at googlegroups.com, but you're not a member of it -- I've just checked). You could determine the exact one by looking at the "Mailing-list" header in the raw message.

--
Aleksandr

On Mon, May 1, 2023 at 5:56 PM J.F. Samuels - K2CIB wrote:
>
> I don't know how I subscribed to this - wish I knew enough to be of help!
>
> Please unsubscribe me from all related lists.
>
> Thanks,
>
> John
>
> On 5/1/2023 5:03 AM, syzbot wrote:
>
> Hello wireguard maintainers/developers,
>
> This is a 31-day syzbot report for the wireguard subsystem. All related reports/information can be found at:
> https://syzkaller.appspot.com/upstream/s/wireguard
>
> During the period, 1 new issue was detected and 0 were fixed. In total, 4 issues are still open and 13 have been fixed so far.
>
> Some of the still happening issues:
>
> Ref Crashes Repro Title
> <1> 620     No    KCSAN: data-race in wg_packet_send_staged_packets / wg_packet_send_staged_packets (3)
>                   https://syzkaller.appspot.com/bug?extid=6ba34f16b98fe40daef1
> <2> 440     No    KCSAN: data-race in wg_packet_decrypt_worker / wg_packet_rx_poll (2)
>                   https://syzkaller.appspot.com/bug?extid=d1de830e4ecdaac83d89
> <3> 6       No    KASAN: slab-use-after-free Write in enqueue_timer
>                   https://syzkaller.appspot.com/bug?extid=c2775460db0e1c70018e
>
> ---
> This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller at googlegroups.com.
>
> To disable reminders for individual bugs, reply with the following command:
> #syz set no-reminders
>
> To change a bug's subsystems, reply with:
> #syz set subsystems: new-subsystem
>
> You may send multiple commands in a single email message.
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group. To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe at googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/bdb29f17-dac3-20a3-c726-963259b95208%40gmail.com.

From Jason at zx2c4.com Mon May 15 12:40:41 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Mon, 15 May 2023 14:40:41 +0200
Subject: Direct APKs for WireGuard Android are now available
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi folks,

By popular demand, there's now a (somewhat small) link in the Android section of the WireGuard website called "Download APK File", for those who want to sideload the package or bundle it in an OS image or whatever else. The recommended method of installation is still of course the Play Store, because its updater is known to work very well across devices. But now there's a decent alternative method.

The new direct APK download and the Play Store are the *only* two supported installation sources. Alternative builds and alternative app stores aren't supported (unless they're shipping the direct APK file that the WireGuard project provides).

You can verify those direct APK files using OpenBSD's signify(1):

$ cat wireguard-android-release.pub
untrusted comment: wireguard android release key public key
RWTAzwGRYr3EC9px0Ia3fbttz8WcVN6wrOwWp2delz4el6SI8XmkKSMp
$ curl -O https://download.wireguard.com/android-client/latest.sig
$ signify -V -p wireguard-android-release.pub -e -x latest.sig -m latest
Signature Verified
$ read _ file < <(sort -k2 -Vr latest)
$ curl -O https://download.wireguard.com/android-client/"$file"
$ sha256sum -c latest --ignore-missing
com.wireguard.android-1.0.20230512.apk: OK

This is the same Ed25519-based signature mechanism that is used by the WireGuard Windows client. The private key lives in an HSM [1].

Jason

[1] https://marc.info/?l=openbsd-misc&m=155723329924761&w=2

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmRiJ/QACgkQSfxwEqXe
A669fg/9GnopO43uXGIB6T1IZbY1WkIDZt7pMdq52rypqMq9PwG4HK+kQbGJWJYC
IBW9v3ae3uhVhX84Qnke7RJ3aYVMltfyp0BoTzsIsyk4v4U8KguGchdI5Mn59sj/
2HSUVMQ9+5n7SCQqsJp9CW0GSBoME2AU1zzjEyzwr1SM7zq/5CCLEBvMsImhP0rw
n1Vzb0o24CUNyiNbNy4op4eEAuLs8lpfj95qs0kpaLM2vH13LBeO0sKHdKUQe9dd
iOJRXBrx8FAy/kwweycFww6KhGtO1fKzWwLyAwEhKvvcBC+kBhFfEU/mO6iIuao+
YQ8VDw4uSaHrP3RFBFxVUlcMhI/ytShwnW2CIuKd1/tpCk9Pdq5tg+QQB5FqVv0A
evAhjuI0ggzmsEpnh9ldYDWCDViKBz7TdBYgsQ+lW4lwQLNIAn3jzqHTSLNtJPY2
Obw9E5PvZK/kw+cHbZJP4mRXpSl2sLL6HocDPUwRWNwEFAVawHlPNSkaNhiiWpmg
HO0m7FMh7NP7R/IVA+7ULaUFL3X+R9d66znn2uoGwU783FQFlfKb4X5CCsP8h3+A
YoJJ5v7328LHc6tajprvPSEH5Lt0ok+4cKxq/wAQb4AI2SUIFW77MRmM44q2TYrd
mr0v2FmZxOlTdENn5lMyj2580k9E41zuH85/Pz0VxmgdbUuGqKM=
=ACE8
-----END PGP SIGNATURE-----
From Jason at zx2c4.com Mon May 15 23:43:58 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 16 May 2023 01:43:58 +0200
Subject: APK outside of Play Store?
In-Reply-To: <517b5e80-d3ac-4829-9096-1415ecad5aa4@www.fastmail.com>
References: <517b5e80-d3ac-4829-9096-1415ecad5aa4@www.fastmail.com>
Message-ID:

On Wed, Sep 21, 2022 at 04:09:56AM +0000, wireguard at bulletin.elitemail.org wrote:
> For users who prefer to avoid the Play Store as a delivery channel, is there an official pre-built APK available? Such users are typically steered towards APKPure/APKMirror/F-Droid, with questionable authenticity and (in the case of F-Droid) the prospect of old build dependencies built on an EOL OS (Stretch). Aurora is an option, though a dev-provided build (with an accompanying checksum/signing cert fingerprint) would be preferable.

Done: https://lists.zx2c4.com/pipermail/wireguard/2023-May/008057.html

From maxim.cournoyer at gmail.com Mon May 15 18:04:36 2023
From: maxim.cournoyer at gmail.com (Maxim Cournoyer)
Date: Mon, 15 May 2023 14:04:36 -0400
Subject: [bug] No keep-alives sent when private is set via PostUp
Message-ID: <87fs7xtqrv.fsf@gmail.com>

Hello,

I've encountered an edge case where no keep-alives would be sent after recreating a connection with:

--8<---------------cut here---------------start------------->8---
wg-quick down my-config-file
wg-quick up my-config-file
--8<---------------cut here---------------end--------------->8---

where my-config-file contains something like:

--8<---------------cut here---------------start------------->8---
cat /gnu/store/zilv4f0jqa8nz8apqv8y3a6g0ifymxhc-wireguard-config/wg0.conf
[Interface]
Address = 10.0.0.7/32
Table = auto
PostUp = /gnu/store/4cnl0h79zc599xryr5jh66d7yq643zk4-wireguard-tools-1.0.20210914/bin/wg set %i private-key /etc/wireguard/private.key
ListenPort = 51820

[Peer]
#apteryx
PublicKey = JPWIbC9qMlnTkWfqGp0plOxWJ/ewOO/C9BuxIJles28=
AllowedIPs = 10.0.1.1/32
Endpoint = apteryx.duckdns.org:51820
PersistentKeepalive = 25
--8<---------------cut here---------------end--------------->8---

The following command on that machine:

--8<---------------cut here---------------start------------->8---
tcpdump -n -i any port 51820
--8<---------------cut here---------------end--------------->8---

wouldn't show any traffic. Discussing this on #wireguard (libera.chat IRC), the user 'another|' thinks the problem could be triggered by setting the private key using a PostUp directive; more specifically, it is believed the problem happens when "no private key is defined when the interface comes up".

--
Thanks,
Maxim

From Jason at zx2c4.com Thu May 18 01:17:42 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Thu, 18 May 2023 03:17:42 +0200
Subject: [bug] No keep-alives sent when private is set via PostUp
In-Reply-To: <87fs7xtqrv.fsf@gmail.com>
References: <87fs7xtqrv.fsf@gmail.com>
Message-ID:

Hi Maxim,

Thanks for the bug report! I think you're indeed right about this. Can you test whether this commit fixes the issue for you?

https://git.zx2c4.com/wireguard-linux/commit/?id=3ac1bf099766f1e9735883d5127148054cd5b30a

It at least satisfies the test case I added.

Until this patch hits stable kernels, you can probably work around this by changing your PostUp into a PreUp. I adjusted the man page here:

https://git.zx2c4.com/wireguard-tools/commit/?id=9d42bd1ab9d707f7a72162d36c9b37cc9bdf480e

Jason

From Jason at zx2c4.com Thu May 18 01:22:36 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Thu, 18 May 2023 03:22:36 +0200
Subject: [bug] No keep-alives sent when private is set via PostUp
In-Reply-To:
References: <87fs7xtqrv.fsf@gmail.com>
Message-ID:

On Thu, May 18, 2023 at 3:17 AM Jason A. Donenfeld wrote:
> Until this patch hits stable kernels, you can probably work around this by changing your PostUp into a PreUp. I adjusted the man page here:
>
> https://git.zx2c4.com/wireguard-tools/commit/?id=9d42bd1ab9d707f7a72162d36c9b37cc9bdf480e

Er, never mind about this part. PreUp executes before the interface is added.

From maxim.cournoyer at gmail.com Thu May 18 02:04:34 2023
From: maxim.cournoyer at gmail.com (Maxim Cournoyer)
Date: Wed, 17 May 2023 22:04:34 -0400
Subject: [bug] No keep-alives sent when private is set via PostUp
In-Reply-To: (Jason A. Donenfeld's message of "Thu, 18 May 2023 03:22:36 +0200")
References: <87fs7xtqrv.fsf@gmail.com>
Message-ID: <87ttwamm31.fsf@gmail.com>

Hi,

"Jason A. Donenfeld" writes:

> On Thu, May 18, 2023 at 3:17 AM Jason A. Donenfeld wrote:
>> Until this patch hits stable kernels, you can probably work around this by changing your PostUp into a PreUp. I adjusted the man page here:
>>
>> https://git.zx2c4.com/wireguard-tools/commit/?id=9d42bd1ab9d707f7a72162d36c9b37cc9bdf480e
>
> Er, never mind about this part. PreUp executes before the interface is added.

Does that mean that the example bit changed in the man page needs to be reverted?

--
Thanks,
Maxim

From Jason at zx2c4.com Thu May 18 14:39:38 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Thu, 18 May 2023 16:39:38 +0200
Subject: [PATCH] wg-quick: Allow setting iface VRF in PreUp hook
In-Reply-To: <20221207180031.301766-1-dxld@darkboxed.org>
References: <20221207180031.301766-1-dxld@darkboxed.org>
Message-ID:

Applied, thanks.

From nwfilardo at gmail.com Mon May 22 06:48:04 2023
From: nwfilardo at gmail.com (Nathaniel Filardo)
Date: Mon, 22 May 2023 07:48:04 +0100
Subject: IPv6-only flag set on v6 sockets prevents the use of v4-mapped addresses
Message-ID:

Hello wireguard@,

I recently found out that in-Linux wireguard has, since its inception, set its v6 sockets to v6-only (https://github.com/torvalds/linux/blob/e7096c131e5161fa3b8e52a650d7719d2857adfd/drivers/net/wireguard/socket.c#L381) and it keys only off the address family to decide which socket to use (https://github.com/torvalds/linux/blob/e7096c131e5161fa3b8e52a650d7719d2857adfd/drivers/net/wireguard/socket.c#L188). This means that v4-mapped v6 addresses (::ffff:a.b.c.d) can be registered as peer endpoints, but the kernel very silently won't try to reach out.

Is that deliberate for some reason that eludes me? If it is, could the userspace tooling be educated about v4-mapped addresses and translate them accordingly before handing them up to the kernel; if it isn't, could we drop the v6-only flag on the kernel socket?

Thanks for any input,
--nwf;
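The userspace translation nwf proposes is a small amount of code. Below is a minimal sketch of the idea (a hypothetical helper, not actual wg(8) source), assuming the tooling has already parsed the endpoint into a sockaddr_in6: if the address turns out to be v4-mapped, rewrite it as a plain AF_INET endpoint so the kernel would use its v4 socket.

#include <netinet/in.h>
#include <string.h>

/* Sketch only: returns 1 and fills *out4 if *in6 held a v4-mapped
 * address (::ffff:a.b.c.d), 0 otherwise. */
static int unmap_v4_endpoint(const struct sockaddr_in6 *in6,
                             struct sockaddr_in *out4)
{
    if (in6->sin6_family != AF_INET6 ||
        !IN6_IS_ADDR_V4MAPPED(&in6->sin6_addr))
        return 0;
    memset(out4, 0, sizeof(*out4));
    out4->sin_family = AF_INET;
    out4->sin_port = in6->sin6_port;  /* already in network byte order */
    /* the IPv4 address occupies the last 4 bytes of the v6 address */
    memcpy(&out4->sin_addr, &in6->sin6_addr.s6_addr[12], 4);
    return 1;
}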
From Jason at zx2c4.com Tue May 23 15:46:20 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 23 May 2023 17:46:20 +0200
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To: <000000000000c0b11d05fa917fe3@google.com>
References: <000000000000c0b11d05fa917fe3@google.com>
Message-ID:

Hey Syzkaller & Netdev folks,

I've been looking at this a bit and am slightly puzzled. At first I saw this:

> enqueue_timer+0xad/0x560 kernel/time/timer.c:605
> internal_add_timer kernel/time/timer.c:634 [inline]
> __mod_timer+0xa76/0xf40 kernel/time/timer.c:1131
> mod_peer_timer+0x158/0x220 drivers/net/wireguard/timers.c:37
> wg_packet_consume_data_done drivers/net/wireguard/receive.c:354 [inline]
> wg_packet_rx_poll+0xd9e/0x2250 drivers/net/wireguard/receive.c:474

And I thought - darn, it's a bug where a struct wg_peer's timer is modified -- in this case, timer_persistent_keepalive by way of wg_timers_any_authenticated_packet_traversal() -- after the peer object has been freed. This fits most clearly the designated line receive.c:354, and the subsequent 8-byte write when enqueuing the timer. So I traced through the peer shutdown code in peer.c -- the peer_make_dead() + peer_remove_after_dead() combo -- and made sure the peer->is_dead RCU logic was correct. And I couldn't find a bug.

But then I looked further down at the syzbot report:

> Allocated by task 16792:
>  kvzalloc include/linux/slab.h:705 [inline]
>  alloc_netdev_mqs+0x89/0xf30 net/core/dev.c:10626
>  rtnl_create_link+0x2f7/0xc00 net/core/rtnetlink.c:3315

and

> Freed by task 41:
>  __kmem_cache_free+0x264/0x3c0 mm/slub.c:3799
>  device_release+0x95/0x1c0
>  kobject_cleanup lib/kobject.c:683 [inline]
>  kobject_release lib/kobject.c:714 [inline]
>  kref_put include/linux/kref.h:65 [inline]
>  kobject_put+0x228/0x470 lib/kobject.c:731
>  netdev_run_todo+0xe5a/0xf50 net/core/dev.c:10400

So that means the memory in question is actually the one that's allocated and freed by the networking stack. Specifically, dev.c:10626 is allocating a struct net_device with a trailing struct wg_device (its priv_data). However, wg_device does not have any struct timer_lists in it, and I don't see how net_device's watchdog_timer would be related to the stacktrace, which is clearly operating over a wg_peer timer.

So what on earth is going on here?

Jason

PS - Jakub, I have some WG fixes queued up for you, but I wanted to have some resolution with this first before sending a tranche.
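The mod_peer_timer() frame in that first trace is the guard Jason describes auditing. Paraphrased from memory of drivers/net/wireguard/timers.c (so details may differ slightly from the exact tree syzbot ran), it only re-arms a peer timer while the device is running and the peer has not been marked dead under RCU:

static inline void mod_peer_timer(struct wg_peer *peer,
                                  struct timer_list *timer,
                                  unsigned long expires)
{
        rcu_read_lock_bh();
        /* A dead peer's timers must never be re-armed, since the peer
         * object is freed after an RCU grace period. */
        if (likely(netif_running(peer->device->dev) &&
                   !READ_ONCE(peer->is_dead)))
                mod_timer(timer, expires);
        rcu_read_unlock_bh();
}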
From tmittermair at cvl.tuwien.ac.at Tue May 23 13:00:09 2023
From: tmittermair at cvl.tuwien.ac.at (Theodor Mittermair)
Date: Tue, 23 May 2023 15:00:09 +0200
Subject: Odd behaviour with wireguard on windows when using subnets in AllowedIPs
Message-ID:

Hi!

I already asked about this on https://web.libera.chat/#wireguard and was told to post to the mailing list.

tl;dr: under Windows, when using multiple subnets different from /0 or /32, they are treated as networks with broadcast addresses, making certain addresses not be tunneled as expected.

Long story:

Preface (I changed details to avoid privacy conflicts, but to the best of my knowledge that should not change the results): Under my administration is a public /24 network, let's call it 1.2.3.0/24. There is a wireguard server; assume its address is 1.2.3.66. In the network are some ssh servers which are not publicly reachable from outside the network, as well as an http(s) server on 1.2.3.79 which is generally reachable from everywhere. The goal is to tunnel only the necessary addresses to the wireguard server, such that a client can access the internal ssh servers, while the generally available http(s) server continues to be reachable directly.

To achieve this, I have client configurations that generally look like this:

==== ==== ==== ====
[Interface]
PrivateKey=clientprivkey
Address=10.20.0.4/32

[Peer]
PublicKey=serverpubkey
AllowedIPs=1.2.3.0/26
AllowedIPs=1.2.3.64/31
AllowedIPs=1.2.3.67/32
AllowedIPs=1.2.3.72/29
AllowedIPs=1.2.3.80/28
AllowedIPs=1.2.3.96/27
AllowedIPs=1.2.3.128/25
Endpoint=1.2.3.66:51820
PersistentKeepalive=15
==== ==== ==== ====

The reason for the ugly list of AllowedIPs is of course that if I just wrote "AllowedIPs=1.2.3.0/24", that would attempt to tunnel all traffic, including the tunnel traffic that should go to the wireguard server itself. While I hoped that this might work by the same magic that makes "AllowedIPs=0.0.0.0/0" work, experiments showed that a single "AllowedIPs=1.2.3.0/24" does not work on Linux.

These configurations successfully connect to the wireguard server on both Windows and Linux and provide at least _some_ functionality. On Linux, this config seems to work as I desire. On Windows, I can reach at least one of the internal ssh servers that would not be accessible without the tunneling, but the http(s) server on .79 is not reachable anymore. If I attempt to enter the webserver's URL or its IP address in Chrome on Windows, I get an "ERR_ADDRESS_INVALID" error as long as the wireguard tunnel is active.

For debugging purposes I replaced "AllowedIPs=1.2.3.72/29" with "AllowedIPs=1.2.3.79/32" (ignoring the lost hosts for a moment), which makes it work (read as: I can access the http service on .79 as expected).

Out of curiosity, I tried replacing all AllowedIPs with a single "AllowedIPs=1.2.3.0/24" on Windows (which was not functional under Linux), and surprisingly that worked (meaning: ssh and http work as expected).

Further online reading brought up the "Table=off" option in the "[Interface]" section. When used with the sample configuration above, it makes it work on Windows as well (though I have no reasonable explanation why). The same config with "Table=off" does not work at all on Linux (understandable, since the routes to the subnets just aren't added anymore at all).

What at one point dawned on me is that .79 is the last address of the .72/29 block, i.e. its broadcast address. That would also kind of explain the error my browser showed me (since it's not really a thing to connect to a broadcast address over http, I think). The theory is somewhat verified by obtaining the same result when attempting to connect to other "broadcast" addresses (e.g. 1.2.3.127) while the wireguard tunnel is active.

Conclusion: I am now somewhat stuck with two variations (AllowedIPs netlist vs. net/24, and Table=auto vs. Table=off), each of which makes the config work on Windows OR Linux, but never fully functionally on both at the same time. Because I would like to distribute a number of client configurations to unknown-OS users, a single config that works regardless of client OS would be strongly preferred. I _could_ just list each address in the /24 as a /32 and be done, but that's really awful (and makes the config parser that does the pretty colors on Windows hang for quite some seconds).

Am I missing something, or did I find somewhat of a bug?

Best regards,
Theodor Mittermair
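The "ugly list" above can be generated mechanically: to cover a prefix minus one host, emit the sibling block of the excluded address at every prefix length between the network's own and /32. A self-contained sketch of that algorithm follows (illustration only; IPv4 in host byte order, no input validation). Incidentally, run against 1.2.3.0/24 minus 1.2.3.66, it also emits 1.2.3.68/30, which the configuration above appears to be missing.

#include <stdio.h>
#include <stdint.h>

/* Print the CIDR blocks covering a /plen network minus the single
 * host address "host" (which must lie inside that network). */
static void exclude_host(int plen, uint32_t host)
{
    for (int p = plen + 1; p <= 32; p++) {
        uint32_t bit = 1u << (32 - p);                /* bit that splits the block */
        uint32_t sibling = (host ^ bit) & ~(bit - 1); /* the half not holding host */
        printf("AllowedIPs = %u.%u.%u.%u/%d\n",
               sibling >> 24, (sibling >> 16) & 0xff,
               (sibling >> 8) & 0xff, sibling & 0xff, p);
    }
}

int main(void)
{
    exclude_host(24, 0x01020342u); /* exclude 1.2.3.66 from 1.2.3.0/24 */
    return 0;
}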
From kuba at kernel.org Tue May 23 16:05:12 2023
From: kuba at kernel.org (Jakub Kicinski)
Date: Tue, 23 May 2023 09:05:12 -0700
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To:
References: <000000000000c0b11d05fa917fe3@google.com>
Message-ID: <20230523090512.19ca60b6@kernel.org>

On Tue, 23 May 2023 17:46:20 +0200 Jason A. Donenfeld wrote:
> > Freed by task 41:
> >  __kmem_cache_free+0x264/0x3c0 mm/slub.c:3799
> >  device_release+0x95/0x1c0
> >  kobject_cleanup lib/kobject.c:683 [inline]
> >  kobject_release lib/kobject.c:714 [inline]
> >  kref_put include/linux/kref.h:65 [inline]
> >  kobject_put+0x228/0x470 lib/kobject.c:731
> >  netdev_run_todo+0xe5a/0xf50 net/core/dev.c:10400
>
> So that means the memory in question is actually the one that's allocated and freed by the networking stack. Specifically, dev.c:10626 is allocating a struct net_device with a trailing struct wg_device (its priv_data). However, wg_device does not have any struct timer_lists in it, and I don't see how net_device's watchdog_timer would be related to the stacktrace, which is clearly operating over a wg_peer timer.
>
> So what on earth is going on here?

Your timer had the pleasure of getting queued _after_ a dead watchdog timer, no? IOW it tries to update the ->next pointer of a queued watchdog timer. We should probably do:

diff --git a/net/core/dev.c b/net/core/dev.c
index 374d38fb8b9d..f3ed20ebcf5a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10389,6 +10389,8 @@ void netdev_run_todo(void)
 		WARN_ON(rcu_access_pointer(dev->ip_ptr));
 		WARN_ON(rcu_access_pointer(dev->ip6_ptr));
 
+		WARN_ON(timer_shutdown_sync(&dev->watchdog_timer));
+
 		if (dev->priv_destructor)
 			dev->priv_destructor(dev);
 		if (dev->needs_free_netdev)

to catch how that watchdog_timer is getting queued. Would that make sense, Eric?

From edumazet at google.com Tue May 23 16:12:32 2023
From: edumazet at google.com (Eric Dumazet)
Date: Tue, 23 May 2023 18:12:32 +0200
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To: <20230523090512.19ca60b6@kernel.org>
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org>
Message-ID:

On Tue, May 23, 2023 at 6:05 PM Jakub Kicinski wrote:
>
> Your timer had the pleasure of getting queued _after_ a dead watchdog timer, no? IOW it tries to update the ->next pointer of a queued watchdog timer. We should probably do:
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 374d38fb8b9d..f3ed20ebcf5a 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10389,6 +10389,8 @@ void netdev_run_todo(void)
>  		WARN_ON(rcu_access_pointer(dev->ip_ptr));
>  		WARN_ON(rcu_access_pointer(dev->ip6_ptr));
>  
> +		WARN_ON(timer_shutdown_sync(&dev->watchdog_timer));
> +
>  		if (dev->priv_destructor)
>  			dev->priv_destructor(dev);
>  		if (dev->needs_free_netdev)
>
> to catch how that watchdog_timer is getting queued. Would that make sense, Eric?

Would this case be caught at the time the device is freed?

(CONFIG_DEBUG_OBJECTS_FREE=y or something)

From Jason at zx2c4.com Tue May 23 16:14:18 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 23 May 2023 18:14:18 +0200
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To: <20230523090512.19ca60b6@kernel.org>
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org>
Message-ID:

On Tue, May 23, 2023 at 09:05:12AM -0700, Jakub Kicinski wrote:
> On Tue, 23 May 2023 17:46:20 +0200 Jason A. Donenfeld wrote:
> > > Freed by task 41:
> > >  __kmem_cache_free+0x264/0x3c0 mm/slub.c:3799
> > >  device_release+0x95/0x1c0
> > >  kobject_cleanup lib/kobject.c:683 [inline]
> > >  kobject_release lib/kobject.c:714 [inline]
> > >  kref_put include/linux/kref.h:65 [inline]
> > >  kobject_put+0x228/0x470 lib/kobject.c:731
> > >  netdev_run_todo+0xe5a/0xf50 net/core/dev.c:10400
> >
> > So what on earth is going on here?
>
> Your timer had the pleasure of getting queued _after_ a dead watchdog timer, no? IOW it tries to update the ->next pointer of a queued watchdog timer.

Ahh, you're right! Specifically,

> hlist_add_head include/linux/list.h:945 [inline]
> enqueue_timer+0xad/0x560 kernel/time/timer.c:605

The write on line 945 refers to the side of the timer base, not the peer's timer_list being queued. So indeed, the wireguard netdev is still alive at this point, but it's being queued to a timer in a different netdev that's already been freed (whether the watchdog or otherwise in some privdata).

So, IOW, not a wireguard bug, right?

Jason
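The line the report points at is worth seeing. Paraphrased from include/linux/list.h (quoted from memory, annotations added), the store into the old head is the candidate for the flagged 8-byte write: adding a node writes into the node that was already at the head of the bucket, so enqueueing a perfectly healthy wireguard timer writes into whatever freed timer is still sitting in the timer base's hash bucket.

static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
        struct hlist_node *first = h->first;

        WRITE_ONCE(n->next, first);
        if (first)
                /* This is the use-after-free write when "first" is a
                 * timer_list embedded in an already-freed netdev. */
                WRITE_ONCE(first->pprev, &n->next);
        WRITE_ONCE(h->first, n);
        n->pprev = &h->first;
}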
From kuba at kernel.org Tue May 23 16:41:08 2023
From: kuba at kernel.org (Jakub Kicinski)
Date: Tue, 23 May 2023 09:41:08 -0700
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To:
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org>
Message-ID: <20230523094108.0c624d47@kernel.org>

On Tue, 23 May 2023 18:12:32 +0200 Eric Dumazet wrote:
> > Your timer had the pleasure of getting queued _after_ a dead watchdog timer, no? IOW it tries to update the ->next pointer of a queued watchdog timer. We should probably do:
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 374d38fb8b9d..f3ed20ebcf5a 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -10389,6 +10389,8 @@ void netdev_run_todo(void)
> >  		WARN_ON(rcu_access_pointer(dev->ip_ptr));
> >  		WARN_ON(rcu_access_pointer(dev->ip6_ptr));
> >  
> > +		WARN_ON(timer_shutdown_sync(&dev->watchdog_timer));
> > +
> >  		if (dev->priv_destructor)
> >  			dev->priv_destructor(dev);
> >  		if (dev->needs_free_netdev)
> >
> > to catch how that watchdog_timer is getting queued. Would that make sense, Eric?
>
> Would this case be caught at the time the device is freed?
>
> (CONFIG_DEBUG_OBJECTS_FREE=y or something)

It should, no idea why it isn't. Looking thru the code now I don't see any obvious gaps where the timer object is on a list but not active :S There's no way to get a vmcore from syzbot, right? :)

Also I thought the shutdown leads to a warning when someone tries to schedule the dead timer, but in fact add_timer() just exits cleanly. So the shutdown won't help us find the culprit :(

From Jason at zx2c4.com Tue May 23 16:42:53 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 23 May 2023 18:42:53 +0200
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To: <20230523094108.0c624d47@kernel.org>
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org> <20230523094108.0c624d47@kernel.org>
Message-ID:

On Tue, May 23, 2023 at 6:41 PM Jakub Kicinski wrote:
> It should, no idea why it isn't. Looking thru the code now I don't see any obvious gaps where the timer object is on a list but not active :S There's no way to get a vmcore from syzbot, right? :)
>
> Also I thought the shutdown leads to a warning when someone tries to schedule the dead timer, but in fact add_timer() just exits cleanly. So the shutdown won't help us find the culprit :(

Worth noting that it could also be caused by adding to a dead timer anywhere in the priv_data of another netdev, not just the sole timer_list in net_device.

From kuba at kernel.org Tue May 23 16:46:06 2023
From: kuba at kernel.org (Jakub Kicinski)
Date: Tue, 23 May 2023 09:46:06 -0700
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To:
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org>
Message-ID: <20230523094606.6f4f8f4f@kernel.org>

On Tue, 23 May 2023 18:14:18 +0200 Jason A. Donenfeld wrote:
> So, IOW, not a wireguard bug, right?

What's slightly concerning is that there aren't any other timers leading to

  KASAN: slab-use-after-free Write in enqueue_timer

:( If WG was just an innocent bystander there should be, right?

From kuba at kernel.org Tue May 23 16:47:36 2023
From: kuba at kernel.org (Jakub Kicinski)
Date: Tue, 23 May 2023 09:47:36 -0700
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To:
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org> <20230523094108.0c624d47@kernel.org>
Message-ID: <20230523094736.3a9f6f8c@kernel.org>

On Tue, 23 May 2023 18:42:53 +0200 Jason A. Donenfeld wrote:
> > It should, no idea why it isn't. Looking thru the code now I don't see any obvious gaps where the timer object is on a list but not active :S There's no way to get a vmcore from syzbot, right? :)
> >
> > Also I thought the shutdown leads to a warning when someone tries to schedule the dead timer, but in fact add_timer() just exits cleanly. So the shutdown won't help us find the culprit :(
>
> Worth noting that it could also be caused by adding to a dead timer anywhere in the priv_data of another netdev, not just the sole timer_list in net_device.

Oh, I thought you zeroed in on the watchdog based on offsets. Still, object debug should track all timers in the slab and complain on the free path.

From Jason at zx2c4.com Tue May 23 16:47:41 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 23 May 2023 18:47:41 +0200
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To: <20230523094606.6f4f8f4f@kernel.org>
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org> <20230523094606.6f4f8f4f@kernel.org>
Message-ID:

On Tue, May 23, 2023 at 6:46 PM Jakub Kicinski wrote:
>
> On Tue, 23 May 2023 18:14:18 +0200 Jason A. Donenfeld wrote:
> > So, IOW, not a wireguard bug, right?
>
> What's slightly concerning is that there aren't any other timers leading to
>
>   KASAN: slab-use-after-free Write in enqueue_timer
>
> :( If WG was just an innocent bystander there should be, right?

Well, WG does mod this timer for every single packet in its RX path. So that's bound to turn things up I suppose.

From Jason at zx2c4.com Tue May 23 17:01:31 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 23 May 2023 19:01:31 +0200
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To: <20230523094736.3a9f6f8c@kernel.org>
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org> <20230523094108.0c624d47@kernel.org> <20230523094736.3a9f6f8c@kernel.org>
Message-ID:

On Tue, May 23, 2023 at 09:47:36AM -0700, Jakub Kicinski wrote:
> On Tue, 23 May 2023 18:42:53 +0200 Jason A. Donenfeld wrote:
> > Worth noting that it could also be caused by adding to a dead timer anywhere in the priv_data of another netdev, not just the sole timer_list in net_device.
>
> Oh, I thought you zeroed in on the watchdog based on offsets. Still, object debug should track all timers in the slab and complain on the free path.

No, I mentioned the watchdog because it's the only timer_list in struct net_device.

Offset analysis is an interesting idea though. Look at this:

> The buggy address belongs to the object at ffff88801ecc0000
> which belongs to the cache kmalloc-cg-8k of size 8192
> The buggy address is located 5376 bytes inside of
> freed 8192-byte region [ffff88801ecc0000, ffff88801ecc2000)

IDA says that for syzkaller's vmlinux, net_device has a size of 0xc80 and wg_device has a size of 0x880. 0xc80 + 0x880 = 5376. Coincidence that the address offset is just after what wg uses? Hm.

Jason
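The arithmetic behind that coincidence follows from netdev_priv()'s layout rule: the private area starts at ALIGN(sizeof(struct net_device), NETDEV_ALIGN) from the base of the allocation. A quick back-of-envelope sketch, using the config-dependent sizes from Jason's IDA analysis as stated assumptions:

#include <stddef.h>
#include <stdio.h>

#define NETDEV_ALIGN 32
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

int main(void)
{
    /* Sizes from Jason's IDA look at syzkaller's vmlinux; these vary
     * with kernel config, so treat them as assumptions. */
    size_t net_device_sz = 0xc80; /* struct net_device */
    size_t wg_device_sz  = 0x880; /* struct wg_device (the priv_data) */

    size_t priv_start = ALIGN_UP(net_device_sz, NETDEV_ALIGN);
    printf("priv_data starts at offset %zu\n", priv_start);
    printf("first byte past wg_device:  %zu\n", priv_start + wg_device_sz);
    /* 3200 + 2176 = 5376: the faulting offset lands at the first byte
     * past what a wireguard netdev would use, so in an 8k slab object
     * the freed timer must belong to some other, larger priv_data. */
    return 0;
}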
From edumazet at google.com Tue May 23 17:05:27 2023
From: edumazet at google.com (Eric Dumazet)
Date: Tue, 23 May 2023 19:05:27 +0200
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To:
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org> <20230523094108.0c624d47@kernel.org> <20230523094736.3a9f6f8c@kernel.org>
Message-ID:

On Tue, May 23, 2023 at 7:01 PM Jason A. Donenfeld wrote:
>
> Offset analysis is an interesting idea though. Look at this:
>
> > The buggy address belongs to the object at ffff88801ecc0000
> > which belongs to the cache kmalloc-cg-8k of size 8192
> > The buggy address is located 5376 bytes inside of
> > freed 8192-byte region [ffff88801ecc0000, ffff88801ecc2000)
>
> IDA says that for syzkaller's vmlinux, net_device has a size of 0xc80 and wg_device has a size of 0x880. 0xc80 + 0x880 = 5376. Coincidence that the address offset is just after what wg uses?

Note that the syzkaller report mentioned:

 alloc_netdev_mqs+0x89/0xf30 net/core/dev.c:10626
 usbnet_probe+0x196/0x2770 drivers/net/usb/usbnet.c:1698
 usb_probe_interface+0x5c4/0xb00 drivers/usb/core/driver.c:396
 really_probe+0x294/0xc30 drivers/base/dd.c:658
 __driver_probe_device+0x1a2/0x3d0 drivers/base/dd.c:800
 driver_probe_device+0x50/0x420 drivers/base/dd.c:830
 __device_attach_driver+0x2d3/0x520 drivers/base/dd.c:958

So maybe an usbnet driver has a timer_list in its priv_data.

From edumazet at google.com Tue May 23 17:07:05 2023
From: edumazet at google.com (Eric Dumazet)
Date: Tue, 23 May 2023 19:07:05 +0200
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To:
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org> <20230523094108.0c624d47@kernel.org> <20230523094736.3a9f6f8c@kernel.org>
Message-ID:

On Tue, May 23, 2023 at 7:05 PM Eric Dumazet wrote:
>
> So maybe an usbnet driver has a timer_list in its priv_data.

struct usbnet {
...
	struct timer_list delay;

From Jason at zx2c4.com Tue May 23 17:16:20 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 23 May 2023 19:16:20 +0200
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To:
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org> <20230523094606.6f4f8f4f@kernel.org>
Message-ID:

On Tue, May 23, 2023 at 06:47:41PM +0200, Jason A. Donenfeld wrote:
> On Tue, May 23, 2023 at 6:46 PM Jakub Kicinski wrote:
> > What's slightly concerning is that there aren't any other timers leading to
> >
> >   KASAN: slab-use-after-free Write in enqueue_timer
> >
> > :( If WG was just an innocent bystander there should be, right?
>
> Well, WG does mod this timer for every single packet in its RX path. So that's bound to turn things up I suppose.

Here's one that is seemingly the same -- enqueuing a timer to a freed base -- with the allocation and free being the same netdev core function, but the UaF trigger for it is a JBD2 transaction thing:
https://syzkaller.appspot.com/text?tag=CrashReport&x=17dd2446280000

No WG at all in it, but there's still the mysterious 5376 value...

Jason

From Jason at zx2c4.com Tue May 23 17:28:54 2023
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Tue, 23 May 2023 19:28:54 +0200
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To:
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org> <20230523094606.6f4f8f4f@kernel.org>
Message-ID:

On Tue, May 23, 2023 at 07:16:20PM +0200, Jason A. Donenfeld wrote:
> Here's one that is seemingly the same -- enqueuing a timer to a freed base -- with the allocation and free being the same netdev core function, but the UaF trigger for it is a JBD2 transaction thing:
> https://syzkaller.appspot.com/text?tag=CrashReport&x=17dd2446280000
>
> No WG at all in it, but there's still the mysterious 5376 value...

In this one, you see the free happens in some infiniband code. Looking at ipoib_dev_priv, and going to the member that lands at offset 5320 from the start of net_device, we get:

	struct delayed_work neigh_reap_task;

5376 - 5320 = 56, which doesn't quite put us at the timer_list. Close but no cigar?

From dvyukov at google.com Wed May 24 08:24:31 2023
From: dvyukov at google.com (Dmitry Vyukov)
Date: Wed, 24 May 2023 10:24:31 +0200
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To:
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org> <20230523094108.0c624d47@kernel.org> <20230523094736.3a9f6f8c@kernel.org>
Message-ID:

On Tue, 23 May 2023 at 19:07, 'Eric Dumazet' via syzkaller-bugs wrote:
> > So maybe an usbnet driver has a timer_list in its priv_data.
>
> struct usbnet {
> ...
> 	struct timer_list delay;

FWIW there are more report examples on the dashboard. There are some that don't mention wireguard nor usbnet, e.g.:
https://syzkaller.appspot.com/text?tag=CrashReport&x=17dd2446280000
So that's probably a red herring. But they all seem to mention alloc_netdev_mqs.

Let's do for now:

#syz set subsystems: net

From kuba at kernel.org Wed May 24 15:33:41 2023
From: kuba at kernel.org (Jakub Kicinski)
Date: Wed, 24 May 2023 08:33:41 -0700
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To:
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org> <20230523094108.0c624d47@kernel.org> <20230523094736.3a9f6f8c@kernel.org>
Message-ID: <20230524083341.0cd435f7@kernel.org>

On Wed, 24 May 2023 10:24:31 +0200 Dmitry Vyukov wrote:
> FWIW there are more report examples on the dashboard. There are some that don't mention wireguard nor usbnet, e.g.:
> https://syzkaller.appspot.com/text?tag=CrashReport&x=17dd2446280000
> So that's probably a red herring. But they all seem to mention alloc_netdev_mqs.

While we have you, let me ask about the possibility of having vmcore access - I think it'd be very useful for solving this mystery. With a bit of luck the timer still has the function set.

From kuba at kernel.org Wed May 24 15:39:35 2023
From: kuba at kernel.org (Jakub Kicinski)
Date: Wed, 24 May 2023 08:39:35 -0700
Subject: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
In-Reply-To: <20230524083341.0cd435f7@kernel.org>
References: <000000000000c0b11d05fa917fe3@google.com> <20230523090512.19ca60b6@kernel.org> <20230523094108.0c624d47@kernel.org> <20230523094736.3a9f6f8c@kernel.org> <20230524083341.0cd435f7@kernel.org>
Message-ID: <20230524083935.7108f17f@kernel.org>

On Wed, 24 May 2023 08:33:41 -0700 Jakub Kicinski wrote:
> While we have you, let me ask about the possibility of having vmcore access - I think it'd be very useful for solving this mystery. With a bit of luck the timer still has the function set.

I take that back.

Memory state around the buggy address:
 ffff88801ecc1400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88801ecc1480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88801ecc1500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff88801ecc1580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88801ecc1600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
From rumen.telbizov at menlosecurity.com Tue May 9 22:17:21 2023
From: rumen.telbizov at menlosecurity.com (Rumen Telbizov)
Date: Tue, 09 May 2023 22:17:21 -0000
Subject: WireGuard IRQ distribution
Message-ID:

Hello WireGuard,

New subscriber to the list here. I've been running performance tests between two bare-metal machines, trying to gauge what performance, at what CPU utilization, I can expect out of WireGuard. While doing so I noticed that the immediate bottleneck becomes an IRQ which lands on a single CPU core. I strongly suspect that this is because the underlying packet flow between the two machines is exactly the same 5-tuple: UDP, src IP:51280, dst IP:51280. Since WireGuard doesn't vary the source UDP port, all packets land on the same IRQ and thus the same CPU. No huge surprises so far, if my understanding is correct.

The interesting part comes when I try to introduce UDP source-port variability artificially through nftables - see below for details. Even though I am able to distribute the IRQ load pretty well across all cores, the overall performance actually drops by about 50%. I was hoping to get some ideas as to what might be going on and whether this is expected behaviour. Any further pointers as to how I can fully utilize all my CPU capacity and get as close to wire speed as possible would be appreciated.

Setup -- 2 x of the following:

* Xeon(R) E-2378G CPU @ 2.80GHz, 64GB RAM
* MT27800 Family [ConnectX-5] - 2 x 25Gbit/s in LACP bond = 50Gbit/s
* Debian 11, kernel: 5.10.178-3
* modinfo wireguard: version: 1.0.0
* Server running: iperf3 -s
* Client running: iperf3 -c XXX -Z -t 30

Baseline iperf3 performance over plain VLAN:

* Stable 24Gbit/s and 2Mpps

bmon:
  [RX graphs: ~24.5 Gb/s and ~2.03 Mpps sustained over the 30-second run]

top:
%Cpu0  : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1  : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2  : 1.0 us, 1.0 sy, 0.0 ni, 98.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3  : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4  : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5  : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6  : 1.0 us, 0.0 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7  : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu8  : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu9  : 1.0 us, 1.0 sy, 0.0 ni, 98.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu10 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu11 : 0.0 us, 0.9 sy, 0.0 ni, 16.8 id, 0.0 wa, 0.0 hi, 82.2 si, 0.0 st
%Cpu12 : 0.0 us, 32.3 sy, 0.0 ni, 65.6 id, 0.0 wa, 0.0 hi, 2.1 si, 0.0 st
%Cpu13 : 1.0 us, 36.3 sy, 0.0 ni, 59.8 id, 0.0 wa, 0.0 hi, 2.9 si, 0.0 st
%Cpu14 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu15 : 0.0 us, 1.0 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

The IRQs do pile up behind CPU 11 because iperf3 is single-threaded. Still, I can reach the full bandwidth of a single NIC (25Gbit/s), which is also an artefact of the LACP hashing of a single packet flow.

Scenario 1: No port randomization (stock wireguard setup)

* all IRQs land on a single CPU core
* 8Gbit/s and 660Kpps

bmon:
  [RX graphs: ~8.0 Gb/s and ~662 Kpps sustained over the 30-second run]

top:
%Cpu0  : 0.0 us, 28.0 sy, 0.0 ni, 69.0 id, 0.0 wa, 0.0 hi, 3.0 si, 0.0 st
%Cpu1  : 0.0 us, 18.1 sy, 0.0 ni, 79.8 id, 0.0 wa, 0.0 hi, 2.1 si, 0.0 st
%Cpu2  : 0.0 us, 20.2 sy, 0.0 ni, 77.9 id, 0.0 wa, 0.0 hi, 1.9 si, 0.0 st
%Cpu3  : 0.0 us, 22.8 sy, 0.0 ni, 74.3 id, 0.0 wa, 0.0 hi, 3.0 si, 0.0 st
%Cpu4  : 0.0 us, 14.6 sy, 0.0 ni, 85.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5  : 0.0 us, 12.6 sy, 0.0 ni, 87.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6  : 0.0 us, 21.3 sy, 0.0 ni, 75.5 id, 0.0 wa, 0.0 hi, 3.2 si, 0.0 st
%Cpu7  : 0.0 us, 17.6 sy, 0.0 ni, 76.9 id, 0.0 wa, 0.0 hi, 5.5 si, 0.0 st
%Cpu8  : 1.1 us, 24.2 sy, 0.0 ni, 70.5 id, 0.0 wa, 0.0 hi, 4.2 si, 0.0 st
%Cpu9  : 0.0 us, 20.2 sy, 0.0 ni, 74.5 id, 0.0 wa, 0.0 hi, 5.3 si, 0.0 st
%Cpu10 : 0.0 us, 30.3 sy, 0.0 ni, 62.6 id, 0.0 wa, 0.0 hi, 7.1 si, 0.0 st
%Cpu11 : 0.0 us, 22.3 sy, 0.0 ni, 71.3 id, 0.0 wa, 0.0 hi, 6.4 si, 0.0 st
%Cpu12 : 1.1 us, 15.8 sy, 0.0 ni, 76.8 id, 0.0 wa, 0.0 hi, 6.3 si, 0.0 st
%Cpu13 : 0.0 us, 0.0 sy, 0.0 ni, 5.0 id, 0.0 wa, 0.0 hi, 95.0 si, 0.0 st
%Cpu14 : 1.0 us, 23.7 sy, 0.0 ni, 71.1 id, 0.0 wa, 0.0 hi, 4.1 si, 0.0 st
%Cpu15 : 0.0 us, 23.2 sy, 0.0 ni, 73.7 id, 0.0 wa, 0.0 hi, 3.2 si, 0.0 st

As mentioned above, I suspect this is an effect of the single 5-tuple UDP, src 169.254.100.2:51280, dst 169.254.100.1:51280 that WireGuard uses under the hood. Parallelizing iperf3 has no effect, since it all comes down to the same flow on the wire after encapsulation. This is the point where I decided to try to diversify/randomize the source UDP port, to distribute the CPU load over the remaining cores.

Scenario 2: UDP source port randomization via nftables

* 4Gbit/s and 337Kpps
* I applied the following nftables rules to transparently change the source UDP port at transmit time and then bring it back to what WireGuard expects:

table inet raw {
	chain POSTROUTING {
		type filter hook postrouting priority raw; policy accept;
		oif bond0.2000 udp dport 51280 notrack udp sport set ip id
	}
	chain PREROUTING {
		type filter hook prerouting priority raw; policy accept;
		iif bond0.2000 udp dport 51280 notrack udp sport set 51280
	}
}

In essence I set the source UDP port to the IP ID field, which gives me a pretty good distribution of source UDP ports. I tried using the random and inc modules of nftables, but with no luck; the port was always 0. This trick seems to work though.

bmon:
  [RX graphs: ~4.1 Gb/s and ~337 Kpps sustained over the 30-second run]

top:
%Cpu0  : 0.0 us, 16.5 sy, 0.0 ni, 62.9 id, 0.0 wa, 0.0 hi, 20.6 si, 0.0 st
%Cpu1  : 0.0 us, 50.5 sy, 0.0 ni, 31.1 id, 0.0 wa, 0.0 hi, 18.4 si, 0.0 st
%Cpu2  : 0.0 us, 16.8 sy, 0.0 ni, 68.4 id, 0.0 wa, 0.0 hi, 14.7 si, 0.0 st
%Cpu3  : 0.0 us, 20.6 sy, 0.0 ni, 61.8 id, 0.0 wa, 0.0 hi, 17.6 si, 0.0 st
%Cpu4  : 0.0 us, 13.1 sy, 0.0 ni, 68.7 id, 0.0 wa, 0.0 hi, 18.2 si, 0.0 st
%Cpu5  : 0.0 us, 19.2 sy, 0.0 ni, 61.6 id, 0.0 wa, 0.0 hi, 19.2 si, 0.0 st
%Cpu6  : 0.0 us, 15.5 sy, 0.0 ni, 62.1 id, 0.0 wa, 0.0 hi, 22.3 si, 0.0 st
%Cpu7  : 0.0 us, 29.3 sy, 0.0 ni, 53.5 id, 0.0 wa, 0.0 hi, 17.2 si, 0.0 st
%Cpu8  : 1.0 us, 18.0 sy, 0.0 ni, 59.0 id, 0.0 wa, 0.0 hi, 22.0 si, 0.0 st
%Cpu9  : 0.0 us, 20.8 sy, 0.0 ni, 68.9 id, 0.0 wa, 0.0 hi, 10.4 si, 0.0 st
%Cpu10 : 1.0 us, 16.8 sy, 0.0 ni, 66.3 id, 0.0 wa, 0.0 hi, 15.8 si, 0.0 st
%Cpu11 : 0.0 us, 13.4 sy, 0.0 ni, 66.0 id, 0.0 wa, 0.0 hi, 20.6 si, 0.0 st
%Cpu12 : 0.0 us, 21.9 sy, 0.0 ni, 64.6 id, 0.0 wa, 0.0 hi, 13.5 si, 0.0 st
%Cpu13 : 0.0 us, 22.4 sy, 0.0 ni, 60.2 id, 0.0 wa, 0.0 hi, 17.3 si, 0.0 st
%Cpu14 : 0.0 us, 23.0 sy, 0.0 ni, 61.0 id, 0.0 wa, 0.0 hi, 16.0 si, 0.0 st
%Cpu15 : 0.0 us, 16.8 sy, 0.0 ni, 67.4 id, 0.0 wa, 0.0 hi, 15.8 si, 0.0 st

As you can see, the IRQs are pretty well balanced and I have tons of idle on all cores, yet I get half the performance. I'll continue with my tests and try a newer kernel, but I wanted to share this with the community and get your feedback.

Thank you,
Rumen Telbizov
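The pinning described above is a consequence of receive-side scaling: the NIC picks an RX queue (and therefore an IRQ) by hashing the flow tuple, so a tunnel whose tuple never changes hashes to the same queue forever. A toy illustration of the effect follows -- an arbitrary mixing function for demonstration, not the Toeplitz hash real NICs use:

#include <stdio.h>
#include <stdint.h>

static uint32_t toy_flow_hash(uint32_t saddr, uint32_t daddr,
                              uint16_t sport, uint16_t dport)
{
    uint32_t h = saddr * 2654435761u;                    /* multiplicative mix */
    h ^= daddr * 2246822519u;
    h ^= (((uint32_t)sport << 16) | dport) * 3266489917u;
    return h;
}

int main(void)
{
    const unsigned nqueues = 16;
    /* WireGuard here always sends 169.254.100.2:51280 -> 169.254.100.1:51280,
     * so every encrypted packet maps to the same queue: */
    printf("fixed tuple -> queue %u\n",
           toy_flow_hash(0xa9fe6402, 0xa9fe6401, 51280, 51280) % nqueues);
    /* Varying the source port (what the nftables trick does) spreads it: */
    for (uint16_t sp = 40000; sp < 40004; sp++)
        printf("sport %u -> queue %u\n", sp,
               toy_flow_hash(0xa9fe6402, 0xa9fe6401, sp, 51280) % nqueues);
    return 0;
}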
From jnashicq at googlemail.com Tue May 16 13:07:09 2023
From: jnashicq at googlemail.com (Anton)
Date: Tue, 16 May 2023 13:07:09 -0000
Subject: Possible race condition in Wireguard-go
Message-ID:

Hello all,

I've found a possible race condition resulting in a panic in wireguard-go. It happens when a client session disconnects - not often, once in a few days with a few (5-10) sessions running. The app I'm working on is based on the wireguard-go/tun/netstack/tun.go code.

The problem reveals itself as a panic (see below). It happens when the peer.RoutineSequentialReceiver() goroutine does a (*tun.Device).Write(), which calls gvisor's (*Endpoint).InjectInbound(), but endpoint could have been made nil by this point, because tun.stack.RemoveNIC(1), called from tunDev.Close(), assigns nil to endpoint.

A possible solution: https://github.com/mysteriumnetwork/wireguard-go/pull/6/files

If I move

> device.tun.device.Close()

below the

> device.RemoveAllPeers()

thus making the peer-related operations finish before device.tun.device.Close(), then the crash doesn't happen. By now the code has been running for a week. I'll test it for another week or two.

Trace:

> 2023-05-04T00:34:10.000 INF services\wireguard\service\service.go:162 Cleaning up session 7f100e49-6517-4141-be66-1ac7c47ed5e8
> DEBUG: (myst) 2023/05/04 00:34:10 Device closing
> 2023-05-04T00:34:10.000 INF services\wireguard\service\stats_publisher.go:65 Stopped publishing statistics for session 7f100e49-6517-4141-be66-1ac7c47ed5e8
> DEBUG: (myst) 2023/05/04 00:34:10 peer(/Zbg…wTzA) - Routine: sequential receiver - stopped
> panic: runtime error: invalid memory address or nil pointer dereference
> [signal 0xc0000005 code=0x0 addr=0x20 pc=0x7ff62082c781]
> goroutine 485845 [running]:
> gvisor.dev/gvisor/pkg/tcpip/link/channel.(*Endpoint).InjectInbound(...)
>         C:/Users/user/go/pkg/mod/gvisor.dev/gvisor at v0.0.0-20221203005347-703fd9b7fbc0/pkg/tcpip/link/channel/channel.go:194
> github.com/mysteriumnetwork/node/services/wireguard/endpoint/netstack-provider.(*netTun).Write(0xc002211600, {0xc0020348a0?, 0x1, 0xc0015ac810?}, 0x10)
>         C:/Users/user/src/node/services/wireguard/endpoint/netstack-provider/netstack.go:164 +0x141
> golang.zx2c4.com/wireguard/device.(*Peer).RoutineSequentialReceiver(0xc001229c00, 0x1)
>         C:/Users/user/go/pkg/mod/golang.zx2c4.com/wireguard at v0.0.0-20230325221338-052af4a8072b/device/receive.go:513 +0x23a
> created by golang.zx2c4.com/wireguard/device.(*Peer).Start
>         C:/Users/user/go/pkg/mod/golang.zx2c4.com/wireguard at v0.0.0-20230325221338-052af4a8072b/device/peer.go:199 +0x2e5

A link to the related code: https://github.com/mysteriumnetwork/node/blob/5c109f64858da7c0c0add4e2dd7ce9e4e46c99e1/services/wireguard/endpoint/netstack-provider/netstack.go#L164

--
regards, Anton

From hgcoin at gmail.com Wed May 17 23:13:30 2023
From: hgcoin at gmail.com (Harry G Coin)
Date: Wed, 17 May 2023 23:13:30 -0000
Subject: ip netns del zaps wg link
Message-ID: <4fd6c9cb-c2cf-7a16-ee62-d958790652ea@gmail.com>

First, hi and thanks for all the effort!

At least on Ubuntu's latest LTS: as advertised, if a wireguard link gets created by systemd/networkd and then set into a different net namespace, all works well. However, if that namespace is deleted, the link appears to be 'gone forever'. Other link types reappear in the primary namespace when the namespace they are in gets deleted. I'm not sure whether the link retains its 'up' or 'down' state when the namespace it's in gets deleted and it is reset to primary. Not a big deal, doesn't happen often.

This is 100% repeatable. Some answer other than 'inaccessible until the next reboot' would be nice.

From spatel at cloudflare.com Fri May 5 18:14:55 2023
From: spatel at cloudflare.com (Shiv Patel)
Date: Fri, 05 May 2023 18:14:55 -0000
Subject: WinTun - Blue Screen on Windows (DPC_WATCHDOG_VIOLATION)
Message-ID:

I am a Product Manager at Cloudflare working on our WARP application. As of Cloudflare WARP version 2023.3.381.0, we are using WinTun to set up our wireguard tunnel, and it had been working great. In the last few days we've received several reports of a Blue Screen of Death on Windows 10 and Windows 11 devices when the product is turned on.

The error code is DPC_WATCHDOG_VIOLATION (133). The args are usually something like:

Arg1: 0000000000000001, The system cumulatively spent an extended period of time at DISPATCH_LEVEL or above.
Arg2: 0000000000001e00, The watchdog period (in ticks).
Arg3: fffff8017a4fb320, cast to nt!DPC_WATCHDOG_GLOBAL_TRIAGE_BLOCK, which contains additional information regarding the cumulative timeout
Arg4: 0000000000000000

The function is usually different each time, but always in tcpip.sys.
Here are a few examples:

SYMBOL_NAME: tcpip!RtlAcquireScalableWriteLock+15
MODULE_NAME: tcpip
IMAGE_NAME: tcpip.sys
IMAGE_VERSION: 10.0.19041.1221
STACK_COMMAND: .cxr; .ecxr ; kb
BUCKET_ID_FUNC_OFFSET: 15
FAILURE_BUCKET_ID: 0x133_ISR_tcpip!RtlAcquireScalableWriteLock
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {5c8b9f1f-7ae1-233f-1a70-950d5491c936}

OR

SYMBOL_NAME: tcpip!IppFindOrCreatePath+f47
MODULE_NAME: tcpip
IMAGE_NAME: tcpip.sys
IMAGE_VERSION: 10.0.19041.1221
STACK_COMMAND: .cxr; .ecxr ; kb
BUCKET_ID_FUNC_OFFSET: f47
FAILURE_BUCKET_ID: 0x133_ISR_tcpip!IppFindOrCreatePath
OS_VERSION: 10.0.19041.1
BUILDLAB_STR: vb_release
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {bf177f95-f651-3035-3f86-b46e476b90e2}

Followup: MachineOwner

OR

SYMBOL_NAME: tcpip!IppFindBestSourceAddressOnInterfaceUnderLock+107
MODULE_NAME: tcpip
IMAGE_NAME: tcpip.sys
STACK_COMMAND: .cxr; .ecxr ; kb
BUCKET_ID_FUNC_OFFSET: 107
FAILURE_BUCKET_ID: 0x133_ISR_tcpip!IppFindBestSourceAddressOnInterfaceUnderLock
OS_VERSION: 10.0.19041.1
BUILDLAB_STR: vb_release
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {2e016b48-a7c2-0910-9f41-71eded0c9f01}

Has anyone seen this with WinTun? We can't find any commonality between Windows versions, hardware, or software on the affected devices. Hoping this is a known issue.