Fast failover and handshake renegotiation for multihomed WireGuard servers

Justin Kilpatrick justin at althea.net
Mon Jul 8 17:10:29 CEST 2019


I'm running a small fleet of WireGuard servers and clients. The clients use the Babel routing protocol to measure latency and packet loss to each of the servers and select the best one accordingly.

The WireGuard servers are multihomed: they share a user list, keys, and an IP address. Whenever a different server becomes the better option, Babel inserts a route to the same destination IP that leads to that server instead.

Sadly, I've had to keep this feature out of production because switching between two servers involves around a minute of zero connectivity, and that's simply too disruptive to expose to customers. The client continues to send packets using the session it negotiated in its handshake with the previous server, the new server dutifully discards them as invalid, and everyone involved waits around for the old session to time out and a new handshake to be negotiated.
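To illustrate the failure mode described above, here is a toy Python model. It is a sketch under stated assumptions, not WireGuard code: the `ToyServer` class and its methods are hypothetical, and the only protocol detail it borrows is that each data packet carries a receiver index identifying a session that exists on one specific server. In real WireGuard the session keys also differ between servers, so a packet would fail authentication even if an index happened to match.

```python
# Toy model (hypothetical names, not WireGuard code): each server keeps its
# own table of sessions, keyed by the receiver index it handed out during
# the handshake.
import secrets


class ToyServer:
    def __init__(self, name):
        self.name = name
        self.sessions = {}  # receiver index -> client id

    def handshake(self, client_id):
        """Create a session and return the index the client must use."""
        index = secrets.randbits(32)
        while index in self.sessions:
            index = secrets.randbits(32)
        self.sessions[index] = client_id
        return index

    def receive(self, index):
        """Return True if the packet maps to a known session, else drop it."""
        return index in self.sessions


server_a = ToyServer("a")
server_b = ToyServer("b")

# The client handshakes with server A and keeps using that session.
index = server_a.handshake("client-1")
accepted_by_a = server_a.receive(index)

# Babel reroutes the shared IP to server B, which has never seen this session.
accepted_by_b = server_b.receive(index)

# Only a fresh handshake with server B restores connectivity.
new_index = server_b.handshake("client-1")
accepted_after_rehandshake = server_b.receive(new_index)
```

In this model the outage lasts exactly as long as the client takes to notice that its packets are going unanswered and start a new handshake, which is the delay I would like to shorten.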
 
Is there any way to trigger a handshake renegotiation quickly that is also secure? Ideally, I would like users to be able to roam between servers without any detectable change, much as they can roam between routes inside a Babel network.

-- 
  Justin Kilpatrick
  justin at althea.net

