wireguard-freebsd handshaking issue upon underlying WAN

Ryan Roosa ryanroosa at gmail.com
Mon Oct 25 17:17:09 UTC 2021

First off, I want to say thank you for the FreeBSD kernel module work
as it is greatly appreciated by myself and many others running *sense
firewalls :)

Generally wireguard-freebsd (wireguard-kmod 0.0.20210606_1) is running
quite well in my experience. However, there is one issue which I have
been able to reproduce consistently: when the underlying WAN
connection that a tunnel is using is disrupted for long enough to
miss two handshake attempts (~4-5 minutes, given the ~2 minute
average interval between handshakes), the tunnel never handshakes
again after the WAN is restored. This is the case even if one
resets the tunnel with 'wg-quick down ; wg-quick up' or restarts the
underlying OS (tried with the latest stable versions of both pfSense
and OPNsense community). For reference, I am using a keepalive value
of 30 seconds in this scenario.

The only thing I've been able to do to get an existing tunnel
configuration handshaking with a peer endpoint again after its
Internet connection has been disrupted (short of a complete removal
and rebuild) is to arbitrarily change the configured tunnel's
listening port (e.g. 51820 to 51821). Once the port change is saved
and applied, the tunnel immediately handshakes with the peer endpoint
again.

Given the symptom, it seems there may be some issue surrounding tunnel
handshaking resiliency when the underlying WAN drops out unexpectedly
for an extended period. If there is any way to look into this to
improve upon it so that after a 5+ minute internet outage a tunnel
could resume handshaking on its own without manual intervention, this
would be greatly appreciated.

I've got a 'bandaid' script currently running every 5 minutes which
checks the peer's handshake age and, if it is stale, arbitrarily
changes the tunnel listen port to restore connectivity. It then
changes the port back after 5 minutes of successful handshaking.
Obviously this is less than ideal. As an additional data point, I
found that if I switched the port and tried to switch it back before
another 5 minutes had passed, the tunnel would stop handshaking
again, so there seems to be something special about the 5 minute
mark regarding handshakes. Not sure if this is helpful or not, but I
thought I would include it.
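
For anyone wanting to reproduce the workaround, its logic can be
sketched roughly as below. This is an illustration only, not the
actual script: the constants, the function names and the since_switch
bookkeeping are hypothetical. It assumes the standard wg(8)
subcommands 'wg show <iface> latest-handshakes', 'wg show <iface>
listen-port' and 'wg set <iface> listen-port <port>' are available.
On pfSense/OPNsense the port would more likely have to be changed
through the GUI-managed configuration, which is not shown here.

```python
import subprocess
import time

STALE_AFTER = 300       # seconds without a handshake before acting (assumed)
HOLD_TIME = 300         # seconds to stay on the alternate port (assumed)
PRIMARY_PORT = 51820    # hypothetical primary listen port
ALTERNATE_PORT = 51821  # hypothetical alternate listen port


def parse_latest_handshakes(output):
    """Parse 'wg show <iface> latest-handshakes' output.

    Each line is '<public key><TAB><unix timestamp>'; a timestamp of 0
    means the peer has never completed a handshake.
    """
    handshakes = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            handshakes[parts[0]] = int(parts[1])
    return handshakes


def next_port(current_port, handshake_age, since_switch):
    """Return the listen port the tunnel should use next.

    handshake_age: seconds since the peer's last handshake.
    since_switch:  seconds since the port was last changed.
    """
    if handshake_age > STALE_AFTER:
        # Handshake is stale: flip to the other port to kick the tunnel.
        return ALTERNATE_PORT if current_port == PRIMARY_PORT else PRIMARY_PORT
    if current_port != PRIMARY_PORT and since_switch >= HOLD_TIME:
        # Healthy for long enough on the alternate port: switch back.
        return PRIMARY_PORT
    return current_port


def run_once(iface, peer_key, since_switch):
    """One pass of the workaround; the caller must persist since_switch."""
    out = subprocess.run(["wg", "show", iface, "latest-handshakes"],
                         capture_output=True, text=True, check=True).stdout
    last = parse_latest_handshakes(out).get(peer_key, 0)
    port = int(subprocess.run(["wg", "show", iface, "listen-port"],
                              capture_output=True, text=True,
                              check=True).stdout.strip())
    wanted = next_port(port, time.time() - last, since_switch)
    if wanted != port:
        subprocess.run(["wg", "set", iface, "listen-port", str(wanted)],
                       check=True)
    return wanted
```

Run from cron every 5 minutes, run_once() flips the port when the
handshake has gone stale and only flips it back once the hold time has
elapsed, which matches the 5 minute behaviour described above.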

Thank you in advance for looking into this and if there is any
additional information I can provide which may be of assistance I
would be happy to provide it.

Ryan Roosa
