passing-through TOS/DSCP marking

Thu Jun 17 12:24:29 UTC 2021

Daniel Golle <daniel at makrotopia.org> writes:

> Hi Florent,
>
> On Thu, Jun 17, 2021 at 07:55:09AM +0000, Florent Daigniere wrote:
>> On Thu, 2021-06-17 at 01:33 +0200, Toke Høiland-Jørgensen wrote:
>> > Daniel Golle <daniel at makrotopia.org> writes:
>> > 
>> > > Hi Jason,
>> > > 
>> > > On Wed, Jun 16, 2021 at 06:28:12PM +0200, Jason A. Donenfeld wrote:
>> > > > WireGuard does not copy the inner DSCP mark to the outside, aside
>> > > > from the ECN bits, in order to avoid a data leak.
>> > > 
>> > > That's a very valid argument.
>> > > 
>> > > However, in my experience so far, WireGuard is not suitable for
>> > > sending VoIP/RTP data (minimize-delay) through the same tunnel as TCP
>> > > bulk (maximize-throughput) traffic in bandwidth-constrained and/or
>> > > high-latency environments, as that ruins the VoIP calls to the point
>> > > of being unintelligible. ECN helps quite a bit when it comes to
>> > > avoiding packet drops for TCP traffic, but that's not enough to avoid
>> > > high jitter and drops for RTP/UDP traffic at the same time.
>> > > 
>> > > I thought about ways to improve that and wonder what you would
>> > > suggest.
>> > > My ideas are:
>> > >  * have different tunnels depending on inner DSCP bits and mark them
>> > >    accordingly on the outside.
>> > >    => we already have multiple tunnels and that would double the
>> > >       number.
>> > > 
>> > >  * mark outer packets with DSCP bits based on their size.
>> > >    VoIP RTP/UDP packets are typically "medium sized" while TCP
>> > >    packets typically max out the MTU.
>> > >    => we would not leak information, but that assumption may not
>> > >       always be true
>> > > 
>> > >  * patch wireguard kernel code to allow preserving inner DSCP bits.
>> > >    => even having only 2 different classes of traffic (critical vs.
>> > >       bulk) would already help a lot...
>> > > 
>> > > 
>> > > What do you think? Any other ideas?
>> > 
>> > Can you share a few more details about the network setup? I.e., where
>> > is the bottleneck link that requires this special treatment?
>> 
>> I can tell you about mine. WiFi in a congested environment: "voip on
>> mobile phones". WMM/802.11e uses the diffserv markings; most commercial
>> APs will do the right thing provided packets are marked appropriately.
>> 
>> Back in 2019 I sent patches for both the Go and Linux implementations
>> that turned it on by default. I believe that Russell Strong further
>> improved upon them by adding a knob (20190318 on this mailing list).
>
> Thank you very much for the hint!
> This patch is exactly what I was looking for:
> https://lists.zx2c4.com/pipermail/wireguard/2019-March/004026.html
>
> Unfortunately it did not receive much feedback back then.
> I'll try forward-porting and deploying it now, because to me it looks
> like the best solution money can buy :)

I think you can achieve something similar using BPF filters, by relying
on wireguard passing through the skb->hash value when encrypting.

Simply attach a TC-BPF filter to the wireguard netdev, pull out the DSCP
value and store it in a map keyed on skb->hash. Then, run a second BPF
filter on the physical interface that shares that same map, look up the
DSCP value based on the skb->hash value, and rewrite the outer IP
header.

The read-side filter will need to use bpf_get_hash_recalc() to make sure
the hash is calculated before the packet gets handed to wireguard, and
it'll be subject to hash collisions, but I think it should generally
work fairly well (for anything that's flow-based of course). And it can
be done without patching wireguard itself :)
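
As a rough illustration of the two-filter scheme above, here is a small
userspace model in Python. It is not BPF and has not been run against a
real tunnel: the dict stands in for the shared BPF map, the flow hash is
simply passed in where a real filter would read skb->hash, and every
function name is hypothetical. It only shows the bookkeeping: store the
inner DSCP by hash, then rewrite the outer TOS byte while keeping the
ECN bits and fixing up the IPv4 header checksum.

```python
# Userspace model only. A real deployment would be two TC-BPF programs
# written in C that share one BPF map. All names here are hypothetical.
import struct

DSCP_MASK = 0xFC  # upper six bits of the IPv4 TOS byte
ECN_MASK = 0x03   # lower two bits; WireGuard already handles these itself

dscp_by_hash = {}  # stands in for the BPF map shared by both filters


def ipv4_checksum(header: bytes) -> int:
    """Ones' complement checksum of an IPv4 header, skipping bytes 10-11."""
    total = 0
    for i in range(0, len(header), 2):
        if i == 10:  # the checksum field itself counts as zero
            continue
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def record_inner_dscp(flow_hash: int, inner_header: bytes) -> None:
    """Filter on the wireguard netdev: remember the inner DSCP per hash."""
    dscp_by_hash[flow_hash] = inner_header[1] & DSCP_MASK


def mark_outer_header(flow_hash: int, outer_header: bytes) -> bytes:
    """Filter on the physical interface: apply the stored DSCP, keep ECN."""
    dscp = dscp_by_hash.get(flow_hash)
    if dscp is None:
        return outer_header  # unknown hash: leave the packet untouched
    hdr = bytearray(outer_header)
    hdr[1] = dscp | (hdr[1] & ECN_MASK)
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)


def make_header(tos: int) -> bytes:
    """Build a minimal 20-byte IPv4 header, for illustration only."""
    hdr = bytearray(20)
    hdr[0] = 0x45  # version 4, IHL 5
    hdr[1] = tos
    hdr[8] = 64    # TTL
    hdr[9] = 17    # protocol: UDP
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)


inner = make_header(0xB8)  # DSCP 46 (EF), commonly used for VoIP
outer = make_header(0x01)  # default DSCP, ECN bits set to ECT(1)
record_inner_dscp(0x1234, inner)
marked = mark_outer_header(0x1234, outer)
print(hex(marked[1]))  # 0xb9: EF from the inner packet, ECN bits kept
```

Note that a later record_inner_dscp() call with the same hash simply
overwrites the earlier entry, which is the collision case mentioned
above: two flows that hash to the same value end up sharing whichever
DSCP was stored last.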

-Toke