Kernel module sends infinite netlink messages on v0.0.20180802
Matt Layher
mdlayher at gmail.com
Wed Aug 8 21:36:57 CEST 2018
Hi all,
While working on wireguardctrl today, I found what I believe to be a bug
in the kernel module. I'm using v0.0.20180802. At first I assumed
that my code was doing something wrong, but I'm able to make "wg show"
hang forever as well, so I believe this to be a problem with the kernel
module itself.
System information:
matt@nerr-2:~$ dmesg | grep wireguard
[ 1075.085912] wireguard: module verification failed: signature and/or
required key missing - tainting kernel
[ 1075.086235] wireguard: WireGuard 0.0.20180802 loaded. See
www.wireguard.com for information.
[ 1075.086235] wireguard: Copyright (C) 2015-2018 Jason A. Donenfeld
<Jason at zx2c4.com>. All Rights Reserved.
matt@nerr-2:~$ uname -a
Linux nerr-2 4.15.0-30-generic #32-Ubuntu SMP Thu Jul 26 17:42:43 UTC
2018 x86_64 x86_64 x86_64 GNU/Linux
Here are the steps to reproduce the issue:
Grab my "wgnlbug" Go program and build it:
https://github.com/mdlayher/wireguardctrl/blob/master/cmd/wgnlbug/main.go
$ go install github.com/mdlayher/wireguardctrl/cmd/wgnlbug
Reset wg0 to a clean state:
$ sudo ip link del dev wg0 && sudo ip link add dev wg0 type wireguard
Attempt to add multiple peers with 511 allowed IPs each (the actual CIDR
is hard-coded for both peers and doesn't seem to matter). Note that you
have to press Ctrl+C to stop the program, or it will hang forever.
$ sudo time ./bin/wgnlbug -n 2
before: wg0
^CCommand terminated by signal 2
1.29user 2.62system 0:02.74elapsed 142%CPU (0avgtext+0avgdata
385236maxresident)k
0inputs+0outputs (0major+98292minor)pagefaults 0swaps
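For context on why 511 allowed IPs per peer is an interesting number, here
is a rough size estimate of what the kernel has to send back for one such
peer. This is only a sketch: it assumes each allowed IP is a nested netlink
attribute holding a u16 family, the raw address, and a u8 CIDR mask, which
is my reading of WireGuard's netlink header. The helper names (nlaAlign,
attrSize, allowedIPSize) are made up for illustration, and 4096 is the
iov_len that shows up in the strace output further down.

```go
package main

import "fmt"

// nlaHdrLen is the size of struct nlattr: a u16 length plus a u16 type.
const nlaHdrLen = 4

// nlaAlign rounds n up to the 4-byte netlink attribute alignment.
func nlaAlign(n int) int { return (n + 3) &^ 3 }

// attrSize returns the padded on-wire size of one attribute
// carrying payloadLen bytes of data.
func attrSize(payloadLen int) int { return nlaAlign(nlaHdrLen + payloadLen) }

// allowedIPSize estimates the size of one nested allowed-IP entry.
// ASSUMPTION: the entry contains a u16 family, the raw address
// (4 bytes for IPv4, 16 for IPv6), and a u8 CIDR mask.
func allowedIPSize(addrLen int) int {
	return nlaHdrLen + attrSize(2) + attrSize(addrLen) + attrSize(1)
}

func main() {
	const ips = 511      // allowed IPs per peer, as in the steps above
	const recvBuf = 4096 // iov_len seen in the strace output

	per := allowedIPSize(4)
	total := per * ips
	fmt.Printf("per entry: %d bytes, per peer: %d bytes\n", per, total)
	fmt.Println("fits in one receive buffer:", total <= recvBuf)
}
```

Under those assumptions an IPv4 entry is 28 bytes, so one peer's allowed IPs
come to 14308 bytes. That is well over a single 4096-byte buffer, so the
reply for even one peer has to be split across several multi-part messages.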
At this point, "wg show" also appears to hang forever, until something
sends it a SIGKILL (the kernel, maybe?):
$ sudo time wg show
Command terminated by signal 9
20.88user 40.39system 1:03.31elapsed 96%CPU (0avgtext+0avgdata
12233204maxresident)k
16128inputs+0outputs (92major+3058349minor)pagefaults 0swaps
A look at strace reveals what appears to be an infinite stream of
multi-part netlink messages with identical sequence numbers:
$ sudo strace wg show
...
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0,
nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base={{len=4072,
type=wireguard, flags=NLM_F_MULTI, seq=1533756618, pid=946},
"\x00\x01\x00\x00\x06\x00\x06\x00\x00\x00\x00\x00\x08\x00\x07\x00\x00\x00\x00\x00\x08\x00\x01\x00\x81\x00\x00\x00\x08\x00\x02\x00"...},
iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 4072
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0,
nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base={{len=4068,
type=wireguard, flags=NLM_F_MULTI, seq=1533756618, pid=946},
"\x00\x01\x00\x00\xd0\x0f\x08\x00\xcc\x0f\x00\x00\x24\x00\x01\x00\xc6\x24\x8a\x34\xcc\x3c\x4a\x23\x00\xd4\x94\x8d\xec\x58\xc6\x7c"...},
iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 4068
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0,
nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base={{len=4072,
type=wireguard, flags=NLM_F_MULTI, seq=1533756618, pid=946},
"\x00\x01\x00\x00\x06\x00\x06\x00\x00\x00\x00\x00\x08\x00\x07\x00\x00\x00\x00\x00\x08\x00\x01\x00\x81\x00\x00\x00\x08\x00\x02\x00"...},
iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 4072
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0,
nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base={{len=4068,
type=wireguard, flags=NLM_F_MULTI, seq=1533756618, pid=946},
"\x00\x01\x00\x00\xd0\x0f\x08\x00\xcc\x0f\x00\x00\x24\x00\x01\x00\xc6\x24\x8a\x34\xcc\x3c\x4a\x23\x00\xd4\x94\x8d\xec\x58\xc6\x7c"...},
iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 4068
recvmsg(3, ^C{msg_name={sa_family=AF_NETLINK, nl_pid=0,
nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base={{len=4072,
type=wireguard, flags=NLM_F_MULTI, seq=1533756618, pid=946},
"\x00\x01\x00\x00\x06\x00\x06\x00\x00\x00\x00\x00\x08\x00\x07\x00\x00\x00\x00\x00\x08\x00\x01\x00\x81\x00\x00\x00\x08\x00\x02\x00"...},
iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 4072
--- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
strace: Process 946 detached
Hope this is helpful. If it isn't a kernel module problem, I'd be
curious to see what both my code and "wg" are doing that causes this.
It seems to be reproducible 100% of the time on my system.
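As for why a client would hang here: a netlink dump reader normally keeps
receiving until it sees NLMSG_DONE, so if the kernel never sends one, the
loop never ends. Below is a minimal sketch of such a loop with a message
limit added as a guard. It is not the code from wireguardctrl or "wg"; the
names readDump, loopingSource, and errRunaway, the family ID 0x1c, and the
limit of 8 are all made up for illustration, and it assumes one netlink
message per receive, which is what the strace output shows.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	nlmsgDone   = 0x3 // NLMSG_DONE
	nlmFMulti   = 0x2 // NLM_F_MULTI
	nlmsgHdrLen = 16  // sizeof(struct nlmsghdr)
)

var errRunaway = errors.New("dump exceeded message limit without NLMSG_DONE")

// fakeMsg builds a netlink message of the given total length.
// Header layout: len u32, type u16, flags u16, seq u32, pid u32.
func fakeMsg(typ, flags uint16, seq uint32, total int) []byte {
	b := make([]byte, total)
	binary.LittleEndian.PutUint32(b[0:4], uint32(total))
	binary.LittleEndian.PutUint16(b[4:6], typ)
	binary.LittleEndian.PutUint16(b[6:8], flags)
	binary.LittleEndian.PutUint32(b[8:12], seq)
	return b
}

// loopingSource imitates the trace above: two multi-part messages with the
// same sequence number, alternating forever, and never an NLMSG_DONE.
func loopingSource() func() ([]byte, error) {
	const family = 0x1c // hypothetical generic netlink family ID
	sizes := []int{4072, 4068}
	i := 0
	return func() ([]byte, error) {
		m := fakeMsg(family, nlmFMulti, 1533756618, sizes[i%2])
		i++
		return m, nil
	}
}

// readDump collects multi-part messages until NLMSG_DONE, a message without
// NLM_F_MULTI, an error, or limit messages, whichever comes first.
func readDump(recv func() ([]byte, error), limit int) ([][]byte, error) {
	var parts [][]byte
	for {
		if len(parts) >= limit {
			return parts, errRunaway
		}
		b, err := recv()
		if err != nil {
			return parts, err
		}
		if len(b) < nlmsgHdrLen {
			return parts, fmt.Errorf("short netlink message: %d bytes", len(b))
		}
		typ := binary.LittleEndian.Uint16(b[4:6])
		flags := binary.LittleEndian.Uint16(b[6:8])
		if typ == nlmsgDone {
			return parts, nil
		}
		parts = append(parts, b)
		if flags&nlmFMulti == 0 {
			return parts, nil
		}
	}
}

func main() {
	parts, err := readDump(loopingSource(), 8)
	fmt.Println(len(parts), err)
}
```

Without the limit, a loop like this would keep appending messages forever,
which may be consistent with the memory growth seen in the "wg show" run.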
- Matt Layher