Wireguard connection lost between peers

Raoul Bhatia raoul.bhatia at radarcs.com
Wed May 12 05:19:48 UTC 2021


Hi Jason

Apologies for taking some time to get back to you.
We tried to verify a few things and to see if we spot anything unusual,
and waited for a few mor instances to happen to get sufficient right data.

> That's surprising behavior. Thanks for debugging it. Can you see if
> you can reproduce with dynamic logging enabled? That'll give some
> useful information in dmesg:
>
>            # modprobe wireguard && echo module wireguard +p >
> /sys/kernel/debug/dynamic_debug/control

I did enable the debug control and also set
  sysctl -w net.core.message_cost=0
and have extracted a sample of the issue.
Please find it here https://nem3d.net/wireguard_20210512a.txt

From my observation, it is always the following symptoms:
1. Everything is WORKING:
LXC container d1-h sends handshake initiation.
Host wg0 receives, re-creates keypair, answers
d1-h receives, re-creates keypair, sends keepalive
wg0 receives keepalive
etc.


2. Somewhen it BREAKS
d1-h stopps hearing back after 15 seconds.
Initialization loop like mentioned above
d1-h stopps hearing back after 15 seconds.
etc.

As mentioned, the resolution is to dump the config, 
remove the peer, and syncconf to restore.
This time,  I used "nsenter -n" to apply this procedure to the
unprivileged container interface d1-h.

Lastly, we also saw similar behavior even between 2 physical hosts.
I will try to gather similar debug information.

Please let me know if further information is needed to
better understand the problem.

Raoul

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 6069 bytes
Desc: not available
URL: <http://lists.zx2c4.com/pipermail/wireguard/attachments/20210512/ee03ee9f/attachment.p7s>


More information about the WireGuard mailing list