[WireGuard] [PATCH] poly1305: generic C can be faster on chips with slow unaligned access
Jason A. Donenfeld
Jason at zx2c4.com
Mon Nov 7 19:23:05 CET 2016
On Mon, Nov 7, 2016 at 7:08 PM, Jason A. Donenfeld <Jason at zx2c4.com> wrote:
> Hmm... The general data flow that strikes me as most pertinent is
> something like:
>
> struct sk_buff *skb = get_it_from_somewhere();
> skb = skb_share_check(skb, GFP_ATOMIC);
> num_frags = skb_cow_data(skb, ..., ...);
> struct scatterlist sg[num_frags];
> sg_init_table(sg, num_frags);
> skb_to_sgvec(skb, sg, ..., ...);
> blkcipher_walk_init(&walk, sg, sg, len);
> blkcipher_walk_virt_block(&desc, &walk, BLOCK_SIZE);
> while (walk.nbytes >= BLOCK_SIZE) {
> size_t chunk_len = rounddown(walk.nbytes, BLOCK_SIZE);
> poly1305_update(&poly1305_state, walk.src.virt.addr, chunk_len);
> blkcipher_walk_done(&desc, &walk, walk.nbytes % BLOCK_SIZE);
> }
> if (walk.nbytes) {
> poly1305_update(&poly1305_state, walk.src.virt.addr, walk.nbytes);
> blkcipher_walk_done(&desc, &walk, 0);
> }
>
> Is your suggestion that that in the final if block, walk.src.virt.addr
> might be unaligned? Like in the case of the last fragment being 67
> bytes long?
In fact, I'm not so sure this happens here. In the while loop, each
new walk.src.virt.addr will be aligned to BLOCK_SIZE or be aligned by
virtue of being at the start of a new page. In the subsequent if
block, walk.src.virt.addr will either be
some_aligned_address+BLOCK_SIZE, which will be aligned, or it will be
a start of a new page, which will be aligned.
So what did you have in mind exactly?
I don't think anybody is running code like:
for (size_t i = 0; i < len; i += 17)
poly1305_update(&poly, &buffer[i], 17);
(And if so, those consumers should be fixed.)
More information about the WireGuard
mailing list