WireGuard protocol blocking in China, swgp-go (userspace obfuscation proxy)

Sat Jul 2 23:21:07 UTC 2022

On Tue, Jun 14, 2022 at 03:13:11PM +0200, Nico Schottelius wrote:
> > I am forwarding some information about WireGuard blocking and
> > anti-blocking that was posted to a censorship circumvention forum.
> 
> In regards to this topic I was wondering if it makes sense to have a
> more generic obfuscation proxy that can carry tcp/udp payload?
> 
> Maybe this already exists, but I would think that something that hops
> protocols (IPv6, IPv4 endpoints, tcp/udp encapsulation), changes ports
> and uses envelope based tunneling (http, https, smtp, imap - worst case
> DNS) would make it easier to sustain communication even in more serious
> filtering scenarios.
> 
> Given such a "generic obfuscator", it could be combined with "protocol"
> modes, i.e. enhancing protocols such as wireguard with the presented
> algorithm, making it even harder to predict the content.
> 
> I'd assume some performance regressions using such an obfuscator, but
> maybe it could even "learn" the proper obfuscation by detecting blocks
> on easier to detect obfuscation and then switching to a stronger, but
> less efficient obfuscation.

There are many designs for anti-censorship protocol obfuscation proxies,
including multi-modal ones like you suggest, more or less fanciful, more
or less realized. You can get a list of some of the more popular and
practical ones under the "censorship-circumvention" topic on GitHub:
	https://github.com/topics/censorship-circumvention
There's plenty of research on the topic as well, sometimes leading to
practical systems. CensorBib has the most important papers:
	https://censorbib.nymity.ch/
For an introduction and survey of ideas that have been proposed, I
recommend reading "Making Sense of Censorship Resistance Systems"
( https://censorbib.nymity.ch/#Khattak2016a ) and "Towards Grounding
Censorship Circumvention in Empiricism" ( https://censorbib.nymity.ch/#Tschantz2016a ).
There are summaries of recent research papers on censorship at
https://github.com/net4people/bbs/issues?q=label%3A%22reading+group%22 .
The idea of pluggable obfuscation modules is fairly ubiquitous, and has
been systematized in various ways:
	https://www.pluggabletransports.info/
	https://shadowsocks.org/guide/sip003.html
	https://guide.v2fly.org/en_US/advanced/advanced.html

Let me suggest, however, that it is a mistake to focus too narrowly on
protocol obfuscation. It is a necessary element, but not the most
important one nor the one that's hardest to get right. There's no
shortage of obfuscators, and even naive protocol obfuscation usually
works even against well-resourced censors like the GFW. The weak link of
circumvention systems tends not to be what cover protocols they use, but
the particulars of their connection establishment. (This point is made
in the two SoK papers I linked above.) Empirically, censors prefer to
disrupt communications during the early stages, when the connection is
being first set up. It is faster, cheaper, and overall more effective
than long-term steady-state protocol processing.

Take HTTP for example. It's not hard to obfuscate a network tunnel so
that it resembles HTTP at some degree of fidelity. One might think that,
in order to block such a tunnel, a censor needs to somehow
distinguish tunnel HTTP from "real" HTTP. But most real-world censors
will never go that far: instead they will look at the destination IP
address or Host header of HTTP requests, or send their own probes to
suspected servers to see how they respond; but in any case they'll add
the server's IP address to a firewall block list, and call it a day. The
hard part of obfuscating a tunnel as HTTP is not faithfully implementing
HTTP; it's protecting the server's address from being blocked for
reasons unrelated to HTTP. Protocol obfuscation cannot help when it is
not the protocol that is being attacked.
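
As a rough illustration of the point above, here is a toy model in
Python of a censor that decides only on addresses. It is a sketch under
my own assumptions, not a description of any real censor: the class
AddressCensor, the helper host_header, and the example names and
addresses are all hypothetical. Note that nothing in it inspects how
faithful the HTTP is.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


def host_header(request: bytes) -> Optional[str]:
    """Return the lowercased Host header of a raw HTTP/1.x request, if any."""
    for line in request.split(b"\r\n")[1:]:  # skip the request line
        if not line:  # blank line ends the header section
            break
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"host":
            return value.strip().decode("ascii", "replace").lower()
    return None


@dataclass
class AddressCensor:
    """Hypothetical censor that blocks by IP address and Host header only."""
    blocked_ips: Set[str] = field(default_factory=set)
    blocked_hosts: Set[str] = field(default_factory=set)

    def allows(self, dst_ip: str, request: bytes) -> bool:
        if dst_ip in self.blocked_ips:
            return False
        if host_header(request) in self.blocked_hosts:
            # Once a server is identified, block its address outright.
            self.blocked_ips.add(dst_ip)
            return False
        return True


censor = AddressCensor(blocked_hosts={"proxy.example"})
first = b"GET / HTTP/1.1\r\nHost: proxy.example\r\n\r\n"
later = b"GET /index.html HTTP/1.1\r\nHost: innocuous.example\r\n\r\n"
print(censor.allows("203.0.113.5", first))   # blocked by Host header
print(censor.allows("203.0.113.5", later))   # blocked by IP, whatever the HTTP looks like
print(censor.allows("198.51.100.7", later))  # a different, unlisted server
```

After the first request, the server at 203.0.113.5 stays blocked no
matter how well later requests imitate ordinary HTTP.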

Having the tunnel hop across different endpoints is not a bad idea, but
consider: how does a legitimate user learn what endpoints to use,
without a censor learning them also? There are ways to protect
circumvention endpoints, from single-user servers (which I believe is
the model swgp-go intends), to strategic distribution of endpoint
addresses, to colocating proxy servers with other network services, but
these go beyond the realm of protocol obfuscation.

What I like about swgp-go is that it has a narrowly targeted goal (to
hide the most salient distinguishers of WireGuard, with low overhead)
and it is designed for a realistic and informed threat model. The
protocol obfuscation is re-encryption and padding, but it doesn't have
to be more than that.
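
To make "most salient distinguishers" concrete, here is a minimal
Python sketch. It is not swgp-go's code or wire format. It assumes a
naive classifier that keys on two well-known properties of WireGuard
packets: the first byte is a message type from 1 to 4 followed by three
zero bytes, and the handshake initiation, handshake response, and
cookie reply are always 148, 92, and 64 bytes long. The functions
looks_like_wireguard, mask, and unmask are hypothetical names, and the
SHA-256 keystream is only a stand-in for real encryption; it is not
secure and is there just to show the effect of re-encryption and
padding on such a classifier.

```python
import hashlib
import os
import struct

# Fixed sizes of WireGuard's non-data messages, keyed by message type.
FIXED_SIZES = {1: 148, 2: 92, 3: 64}  # initiation, response, cookie reply
MIN_DATA_SIZE = 32  # type 4: 16-byte header plus 16-byte tag (a keepalive)


def looks_like_wireguard(payload: bytes) -> bool:
    """Naive classifier: message type, reserved zero bytes, and length."""
    if len(payload) < 4 or payload[1:4] != b"\x00\x00\x00":
        return False
    msg_type = payload[0]
    if msg_type in FIXED_SIZES:
        return len(payload) == FIXED_SIZES[msg_type]
    return msg_type == 4 and len(payload) >= MIN_DATA_SIZE


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream (NOT secure); a placeholder for a real cipher."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + struct.pack("<I", counter)).digest()
        counter += 1
    return out[:n]


def mask(payload: bytes, key: bytes, max_pad: int = 64) -> bytes:
    """Prefix the true length, append random padding, and mask everything."""
    nonce = os.urandom(16)
    pad_len = os.urandom(1)[0] % (max_pad + 1)
    body = struct.pack("<H", len(payload)) + payload + os.urandom(pad_len)
    stream = _keystream(key, nonce, len(body))
    return nonce + bytes(a ^ b for a, b in zip(body, stream))


def unmask(packet: bytes, key: bytes) -> bytes:
    """Invert mask(): strip the nonce, unmask, and drop the padding."""
    nonce, body = packet[:16], packet[16:]
    stream = _keystream(key, nonce, len(body))
    plain = bytes(a ^ b for a, b in zip(body, stream))
    (length,) = struct.unpack("<H", plain[:2])
    return plain[2:2 + length]


if __name__ == "__main__":
    key = os.urandom(32)  # shared out of band in this sketch
    initiation = bytes([1, 0, 0, 0]) + os.urandom(144)  # fake 148-byte handshake
    masked = mask(initiation, key)
    print(looks_like_wireguard(initiation), len(initiation))
    print(looks_like_wireguard(masked), len(masked))
    print(unmask(masked, key) == initiation)
```

The masked packet no longer starts with a fixed type field and no
longer has a fixed length, which is all this particular classifier
looks at. A censor that blocks by server address is unaffected by any
of this, as discussed above.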