Built-in roaming is limited due to a design fault; adding STUN and TURN support would be good and make WireGuard connections more durable.

Sun Jan 15 09:39:09 CET 2017

Hi Peter,

I followed this thread and would like to express some concerns. Although I see the problem and ran into it myself, I would like to see a solution outside the WireGuard code, like the one Jason proposed or even a new approach. I am afraid that network-layer problems (legacy IP and especially NAT) are about to uglify yet another beautiful protocol. None of the problems stated in this thread have I ever observed in a dual-stack or in an IP (read IPv6) environment. This is all inconvenience that legacy IP (read IPv4) brings, caused by ten-plus years of overprovisioning and not taking care of the network layer of the infrastructure. It is 2017, run IPv6.

There should not be a single line of code in WireGuard that deals with broken infrastructure. There is plenty of room for all kinds of workarounds in userspace, like the scripts in the WireGuard repository. I see the problem and agree on it, but it is out of scope for WireGuard to solve. The infrastructure must somehow be able to connect to peers via UDP. That is, I think, the least one can expect from a network layer, whatever _outside_ magic it may need due to historical protocol usage.

My concerns expressed and all that said, I would love to see some code or PoC. Code and pcaps are king :)

I solved the problem by using IPv6 only when I ran into it. This may require finally investing in state-of-the-art layer 3 protocol usage in some cases; however, it's overdue anyway. WireGuard's roaming feature took care of the sites where even the IPv6 prefix changes from time to time. HTH.

Cheers,

Dan

> On 9 Jan 2017, at 14:43, Peter Dolding <oiaohm at gmail.com> wrote:
> 
>> On Fri, Jan 6, 2017 at 6:33 AM, Jason A. Donenfeld <Jason at zx2c4.com> wrote:
>> Hi Peter,
>> 
>> On Thu, Jan 5, 2017 at 12:08 PM, Peter Dolding <oiaohm at gmail.com> wrote:
>>>> 1. Dynamic IPs.
>>>> 2. Both peers behind NAT.
>>> That misses one completely
>>> 3. Server and Peers both behind NAT.   Yes there is a usage case for this one.
>> 
>> WireGuard has no concept of client/server. There are only peers. So,
>> when I wrote "both peers behind NAT", I most certainly had in mind
>> what you refer to as "server and peer". Please reread my answer in
>> light of this new understanding.
>> 
>>> Dynamic DNS has its weaknesses.   Transparent DNS caching and DNS
>>> access restrictions in some networks interfere with the solution you
>>> describe.
>>> 
>>> https://tools.ietf.org/html/rfc3489
>>> 
>>> STUN was developed for VoIP for many reasons; three key ones are:
>>> 1. You can username- and password-protect a STUN server, restricting
>>> the users who can find out about the service.
>>> 2. It supports TLS, so it is encrypted.
>>> 3. Information on a STUN server is not replicated to other servers the
>>> way a lot of dynamic DNS data is, so in case of attack there are
>>> limited sources of information.
>>> 
>>> The dynamic DNS option is really a hack that was not suitable for VoIP,
>>> yet for some reason people think it is suitable for VPNs.   Dynamic
>>> DNS is not designed for punching through NAT to reach the server
>>> address like STUN is, nor designed to limit access to the address
>>> information like STUN is.
>> 
>> So, as I already wrote in my previous email, implement a STUN tool for
>> setting up initial WireGuard communications. The example code I linked
>> to already provides a framework on how this might be done. Just
>> replace my homebaked hooks with whatever STUN library tickles your
>> fancy.
>> 
>>> In this example the WireGuard server has a public IP address.   In the
>>> case I am describing, the WireGuard server may not have a public IP
>>> address.
>> 
>> Um, no. Did you even read the example? Both WireGuard peers have
>> private IP addresses. Only the NAT-helper server has a public IP
>> address. This is the same model as STUN. Spend some time actually
>> reading and studying the work already done on this before wasting more
>> time with long emails.
>> 
>>> Now STUN will attempt hole punching in the case where you don't have a
>>> public IP address for the WireGuard server, if the NATs in use are
>>> cooperative.   Of course, if you read the STUN RFC, it states the cases
>>> where STUN cannot be used to create a link between a server and client
>>> both behind NATs as long as the STUN server is in the open.
>> 
>> The example code I linked to presents the same model.
>> 
> It is not the same model; something critical is missing.
> 
> Your example gets you a connection.   Your example does not cope with
> the IP address changes that happen in NAT environments.
> 
> 
> You need a particular pattern of operations.
> 
> NAT Hole punch/dynamic DNS resolve.
> Start VPN.
> VPN detects that the connection is lost and triggers resolve again to
> check whether the IP address/port it is using is still current and
> correct.
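
The three steps quoted above can be approximated entirely in userspace. Below is a minimal Python sketch, not a tested tool: it relies on the `wg show <iface> latest-handshakes` and `wg set <iface> peer <key> endpoint <host:port>` subcommands of wg(8), while the `resolve` callback (DNS lookup, STUN query, or anything else) and the 180-second staleness threshold are assumptions made for illustration only.

```python
import subprocess
import time


def parse_latest_handshakes(output):
    """Parse `wg show <iface> latest-handshakes`: one 'pubkey<TAB>unix_time' per line."""
    handshakes = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            handshakes[parts[0]] = int(parts[1])
    return handshakes


def stale_peers(handshakes, now, max_age=180):
    """Peers whose last handshake is older than max_age seconds.

    A timestamp of 0 means no handshake has happened yet, so it counts as stale.
    The 180 s default is an assumed threshold, not a WireGuard constant.
    """
    return [key for key, ts in handshakes.items() if now - ts > max_age]


def watchdog_once(iface, resolve, now=None):
    """Re-resolve and update the endpoint of every peer that looks dead.

    `resolve` is a hypothetical callback: public key -> 'host:port' or None.
    """
    now = time.time() if now is None else now
    out = subprocess.run(
        ["wg", "show", iface, "latest-handshakes"],
        capture_output=True, text=True, check=True,
    ).stdout
    for key in stale_peers(parse_latest_handshakes(out), now):
        endpoint = resolve(key)
        if endpoint:
            subprocess.run(
                ["wg", "set", iface, "peer", key, "endpoint", endpoint],
                check=True,
            )
```

Run `watchdog_once` from a timer or cron job; nothing in WireGuard itself has to change for this, although it only notices a dead peer after the threshold has passed.
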
> 
> This way the VPN does not fall over and die.   To implement STUN
> completely, I need a way of connecting an application to WireGuard
> that WireGuard will resort to when the keepalive messages fail,
> before informing applications or users that the VPN is lost.
> 
> So I need a way to connect resolve to WireGuard so it can be done as
> part of maintaining the VPN connection.   This resolve, triggered
> when keepalive fails, also needs to happen when using dynamic DNS.
> 
> The problem you have is this: when the IP address of the WireGuard
> server changes and the server then attempts to inform its clients,
> via WireGuard, that the address has changed, this is going to fail
> for NAT-guarded clients.   Why?  NAT rejects packets from IP
> addresses that the clients behind the NAT have not attempted to
> contact.   This is why your example only really works with WireGuard
> clients behind NAT with dynamic IPs; if you put a WireGuard server
> behind a NAT with a dynamic IP, your example code completely fails.
> 
> WireGuard's idea that a server can update its clients when the
> server's IP address changes only works when you have two public IP
> addresses, old and new, so that you can send change messages from the
> old address and have the new address receive the clients.   The
> problem is that, in reality, with a dynamic IP address you don't know
> what your new address will be until after you have changed to it.
> So WireGuard's design for handling a server's changing IP address
> fails in the real world.  When the server changes IP address, clients
> need to drop back to a resolve solution and, when the server is
> behind NAT and needs hole punching, rerun the hole punching.
> 
> The reality is that you cannot run resolve just once when you are
> operating with dynamic IP addresses and NAT.
> 
> 
> 
>>> Jason, the issue here is that to be able to use the STUN/TURN
>>> combination in all cases, I need WireGuard to be able to be directed
>>> to use TURN.
>> 
>> Great. I already outlined how this could be done, and I provided
>> example code. Plug your STUN or TURN library into that, and you'll be
>> all set.
>> 
>> No, I'm not going to write it for you. You got a NAT-punching example
>> from me. You can get a STUN/TURN library from somebody else. All
>> that's left for you is putting them together.
>>> Current model is
>>> [Client]-[Server]
>>> What is needed is
>>> [Client]-[Relay]-[Server] with [Client]-[Server] to cover all usage
>>> cases.   Of course the relay being something standard and common
>>> reduces the number of servers that have to be publicly deployed and
>>> maintained.
>> 
>> Yes, I'm aware. And this is exactly what the example code demonstrates
>> is possible.
>> 
> No, relay is something different from what you demoed.   TURN is a
> relay.   When using TURN, WireGuard clients and servers would not be
> able to connect to each other directly, so they would be sending all
> packets via the TURN server.
> 
> Relaying is not exactly possible with the existing code.   From what
> I have seen of WireGuard, I cannot tell it to use a standard socket
> file, like /var/lib/mysql/mysql.sock with MySQL, because when you are
> going to be relaying all packets you don't need an IP-addressed port.
> So now we have extra performance overhead and an over-expanded packet
> trying to get into TURN, affecting every packet sent by the current
> WireGuard design.
> 
> Currently, packets roughly look like this (yes, this is rough; I have
> skipped over lots):
> client sends [destination IP address of WireGuard server][source IP
> address of WireGuard client][WireGuard data]
> server sends [destination IP address of WireGuard client][source IP
> address of WireGuard server][WireGuard data]
> With TURN, it needs to do the following once the connection has been
> set up on the TURN server:
> WireGuard client sends [destination IP address of TURN server][source
> IP address of WireGuard client][TURN channel ID for server][WireGuard
> data]
> WireGuard server sends [destination IP address of TURN server][source
> IP address of WireGuard server][TURN channel ID for client][WireGuard
> data]
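
For reference, the "[TURN channel ID][WireGuard data]" part of that layout corresponds to TURN's ChannelData framing (RFC 5766): a 16-bit channel number in the range 0x4000 to 0x7FFF, a 16-bit payload length, then the payload. Here is a minimal Python sketch of that framing, assuming UDP transport (where no padding is required) and leaving out the allocation, permission, and ChannelBind steps that have to happen first. The function names are made up for illustration.

```python
import struct

# Channel numbers usable for ChannelData according to RFC 5766.
CHANNEL_MIN, CHANNEL_MAX = 0x4000, 0x7FFF


def wrap_channel_data(channel, payload):
    """Prefix a datagram (e.g. one WireGuard packet) with a ChannelData header."""
    if not CHANNEL_MIN <= channel <= CHANNEL_MAX:
        raise ValueError("channel number out of range")
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 16-bit length field")
    # Network byte order: 2-byte channel number, 2-byte payload length.
    return struct.pack("!HH", channel, len(payload)) + payload


def unwrap_channel_data(frame):
    """Split a ChannelData frame into (channel, payload)."""
    if len(frame) < 4:
        raise ValueError("frame shorter than ChannelData header")
    channel, length = struct.unpack("!HH", frame[:4])
    if not CHANNEL_MIN <= channel <= CHANNEL_MAX:
        raise ValueError("not a ChannelData frame")
    if len(frame) - 4 < length:
        raise ValueError("truncated payload")
    return channel, frame[4:4 + length]
```

The header costs 4 bytes per packet, and the channel range holds 16384 values per allocation, which would match the "16 thousand" figure mentioned further down.
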
> 
> Of course, all packets the WireGuard client or server receives would
> carry the address of the TURN server, because it is a relay.
> 
> So with the current design, all the packet headers would have to be
> rewritten to send WireGuard traffic to TURN.   This is why TURN would
> be far better built into the base design.
> 
> If the connection fails, for any number of causes, a connection using
> TURN needs to somehow trigger running resolve again.
> 
> Using TURN, you are basically no longer using IP addresses but an
> abstraction system.   Using TURN, you can technically send packets
> between anywhere from two to 16 thousand clients with identical IP
> addresses.
> 
> The relay solution is basically for when you cannot punch a hole
> through the NAT or have a direct connection between client and server.
> 
> What your code demonstrates is a resolve server with client code to
> interface with it.    This is what STUN is.   STUN is just an RFC
> standard defining how to go about implementing a resolve server and
> clients with all the nice stuff.
> 
> The STUN RFC documents all the ways a resolve-server solution with
> NAT will hit brick walls.   So if you want to design something for
> punching holes in NAT, read the STUN RFC completely.
> https://tools.ietf.org/html/rfc5389
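
To make the "resolve" step concrete, here is a minimal Python sketch of the core of a STUN exchange as defined in RFC 5389: building a Binding Request and decoding the XOR-MAPPED-ADDRESS attribute from a Binding Success Response. It is a sketch under assumptions, not a complete client: it handles IPv4 only, does not check the transaction ID or any authentication attributes, and leaves out the socket I/O. The function names are made up for illustration.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value from RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """20-byte STUN header: type, length 0 (no attributes), cookie, 96-bit ID."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def parse_xor_mapped_address(response):
    """Return (ip, port) as seen by the STUN server, or None if not found."""
    if len(response) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    pos, end = 20, min(20 + length, len(response))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[pos:pos + 4])
        value = response[pos + 4:pos + 4 + attr_len]
        # value layout: reserved byte, family (0x01 = IPv4), X-Port, X-Address
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None
```

For the mapping to be of any use to WireGuard, the request would have to leave from the same UDP source port that WireGuard listens on; how to share that port is exactly the integration question discussed in this thread and is not solved by this sketch.
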
> 
> Basically, you have been suffering from NIH syndrome.   Punching
> through NAT has been done very completely with STUN, to make VoIP and
> other services like it work.   In fact, STUN has taken it about as far
> as it can go, which is why TURN support for relaying was added as its
> own RFC, to cover the cases where there is no way to punch through the
> NAT.
> 
> My step one is to somehow have WireGuard support calling/registering
> with a userspace resolve program for when the connection fails.   The
> one thing about going through NAT with dynamic IP addresses is that
> connection failure is a given.  The key trick is handling connection
> failure, not making the connection.   Basically, punching a NAT hole
> to make a connection is in most cases either easy or not possible;
> maintaining the hole is the hard bit, because NATs can clear the
> information, IP addresses can change, and so on.  WireGuard servers
> and clients have a keepalive message; these messages not turning up
> can be a clear sign that the resolve program needs to rerun.
> 
> I am not going to write something that I know end users are going to
> have trouble with.   Both STUN and TURN require that resolve
> interface in order to be workable.
> 
> Basically, you did the easy bit and punched the hole, and have missed
> all the steps required to maintain that hole as the network
> environment changes around you.   The resolver should not be running
> on the clients all the time and does not need to be perfectly fast;
> it should just run whenever the connection looks dead, to make sure
> the settings are right before calling the connection absolutely dead.
> If the settings are wrong, correct them and attempt to rejoin.  This
> is an interface for auto-healing of failures.
> 
> Peter Dolding