[PATCH RFC v1] wireguard: queueing: get rid of per-peer ring buffers

Wed Feb 17 22:28:43 UTC 2021

On Wed, Feb 17, 2021 at 7:36 PM Toke Høiland-Jørgensen <toke at toke.dk> wrote:
> Are these performance measurements are based on micro-benchmarks of the
> queueing structure, or overall wireguard performance? Do you see any
> measurable difference in the overall performance (i.e., throughput
> drop)?

These are from counting cycles per instruction using perf and seeing
which instructions are hotspots that take a greater or smaller
percentage of the overall time.

> And what about relative to using one of the existing skb queueing
> primitives in the kernel? Including some actual numbers would be nice to
> justify adding yet-another skb queueing scheme to the kernel :)

If you're referring to skb_queue_* and friends, those very much will
not work in any way, shape, or form here. Aside from the fact that the
MPSC nature of it is problematic for performance, those functions use
a doubly linked list. In wireguard's case, there is only one pointer
available (skb->prev), as skb->next is used to create the singly
linked skb_list (see skb_list_walk_safe) of gso frags. And in fact, by
having these two pointers next to each other for the separate lists,
it doesn't need to pull in another cache line. This isn't "yet-another
queueing scheme" in the kernel. This is just a singly linked list
queue.

> I say this also because the actual queueing of the packets has never
> really shown up on any performance radar in the qdisc and mac80211
> layers, which both use traditional spinlock-protected queueing
> structures.

Those are single threaded and the locks aren't really contended much.

> that would be good; also for figuring out if this algorithm might be
> useful in other areas as well (and don't get me wrong, I'm fascinated by
> it!).

If I find the motivation -- and if the mailing list conversations
don't become overly miserable -- I might try to fashion the queueing
mechanism into a general header-only data structure in include/linux/.
But that'd take a bit of work to see if there are actually places
where it matters and where it's useful. WireGuard can get away with it
because of its workqueue design, but other things probably aren't as
lucky like that. So I'm on the fence about generality.

> > -     if (wg_packet_queue_init(&peer->tx_queue, wg_packet_tx_worker, false,
> > -                              MAX_QUEUED_PACKETS))
> > -             goto err_2;
> > +     INIT_WORK(&peer->transmit_packet_work, wg_packet_tx_worker);
>
> It's not quite clear to me why changing the queue primitives requires
> adding another work queue?

It doesn't require a new workqueue. It's just that a workqueue was
init'd earlier in the call to "wg_packet_queue_init", which allocated
a ring buffer at the same time. We're not going through that
infrastructure anymore, but I still want the workqueue it used, so I
init it there instead. I truncated the diff in my quoted reply -- take
a look at that quote above and you'll see more clearly what I mean.

> > +#define NEXT(skb) ((skb)->prev)
>
> In particular, please explain this oxymoronic define :)

I can write more about that, sure. But it's what I wrote earlier in
this email -- the next pointer is taken; the prev one is free. So,
this uses the prev one.

> While this is nice and compact it's also really hard to read.

Actually I've already reworked that a bit in master to get the memory
barrier better.

> I don't see anywhere that you're clearing the next pointer (or prev, as
> it were). Which means you'll likely end up passing packets up or down
> the stack with that pointer still set, right? See this commit for a
> previous instance where something like this has lead to issues:
>
> 22f6bbb7bcfc ("net: use skb_list_del_init() to remove from RX sublists")

The prev pointer is never used for anything or initialized to NULL
anywhere. skb_mark_not_on_list concerns skb->next.

Thanks for the review.

Jason