From horms at kernel.org Tue Dec 3 09:10:32 2024
From: horms at kernel.org (Simon Horman)
Date: Tue, 03 Dec 2024 09:10:32 -0000
Subject: [PATCH] net: wireguard: Allow binding to specific ifindex
In-Reply-To: <20241125212111.1533982-1-greearb@candelatech.com>
References: <20241125212111.1533982-1-greearb@candelatech.com>
Message-ID: <20241203090927.GA9361@kernel.org>

On Mon, Nov 25, 2024 at 01:21:11PM -0800, greearb at candelatech.com wrote:
> From: Ben Greear
>
> Which allows us to bind to VRF.
>
> Signed-off-by: Ben Greear
> ---
>
> NOTE: Modified user-space to utilize this may be found here:
> https://github.com/greearb/wireguard-tools-ct
> Only the 'wg' part has been tested with this new feature as of today.

...

> diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c
> index 0414d7a6ce74..a7cb1c7c3112 100644
> --- a/drivers/net/wireguard/socket.c
> +++ b/drivers/net/wireguard/socket.c
> @@ -25,7 +25,8 @@ static int send4(struct wg_device *wg, struct sk_buff *skb,
>  		.daddr = endpoint->addr4.sin_addr.s_addr,
>  		.fl4_dport = endpoint->addr4.sin_port,
>  		.flowi4_mark = wg->fwmark,
> -		.flowi4_proto = IPPROTO_UDP
> +		.flowi4_proto = IPPROTO_UDP,
> +		.flowi4_oif = wg->lowerdev,
>  	};
>  	struct rtable *rt = NULL;
>  	struct sock *sock;
> @@ -111,6 +112,9 @@ static int send6(struct wg_device *wg, struct sk_buff *skb,
>  	struct sock *sock;
>  	int ret = 0;
>
> +	if (wg->lowerdev)
> +		fl.flowi6_oif = wg->lowerdev,

Hi Ben,

I think that the trailing ',' on the line above should be a ';'.

As written, with a ',', the call to skb_mark_not_on_list() below
will be included in the conditional block above. And this doesn't
seem to be the intention of the code based on indentation.

Flagged by clang-19 with -Wcomma

> +
>  	skb_mark_not_on_list(skb);
>  	skb->dev = wg->dev;
>  	skb->mark = wg->fwmark;

...

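For readers unfamiliar with the pitfall, here is a minimal user-space sketch of the ',' versus ';' difference pointed out in the review above. It is not the WireGuard code: with_comma(), with_semicolon(), mark() and check_comma_vs_semicolon() are made-up names, and mark() merely stands in for skb_mark_not_on_list().

```c
#include <assert.h>

static int calls; /* how many times mark() ran in the last experiment */

/* Hypothetical stand-in for skb_mark_not_on_list(); only counts calls. */
static void mark(void)
{
	calls++;
}

/* Same shape as the v1 hunk: the assignment ends in ',' so the next
 * expression is glued onto it by the comma operator and both become
 * the body of the if statement. */
int with_comma(int lowerdev)
{
	int oif = 0;

	calls = 0;
	if (lowerdev)
		oif = lowerdev,

	mark();
	(void)oif;
	return calls;
}

/* Same code with ';': mark() is a separate, unconditional statement. */
int with_semicolon(int lowerdev)
{
	int oif = 0;

	calls = 0;
	if (lowerdev)
		oif = lowerdev;

	mark();
	(void)oif;
	return calls;
}

/* Records the behaviour one would expect from the two variants. */
void check_comma_vs_semicolon(void)
{
	assert(with_comma(0) == 0);     /* mark() skipped with the assignment */
	assert(with_comma(7) == 1);
	assert(with_semicolon(0) == 1); /* mark() always runs */
	assert(with_semicolon(7) == 1);
}
```

Calling check_comma_vs_semicolon() from a test main() exercises both variants. In the sketch, the ',' form only calls mark() when lowerdev is non-zero, which is the behaviour the review describes for the unset lowerdev case.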
From dsahern at kernel.org Tue Dec 3 20:35:13 2024
From: dsahern at kernel.org (David Ahern)
Date: Tue, 03 Dec 2024 20:35:13 -0000
Subject: [PATCH v2] net: wireguard: Allow binding to specific ifindex
In-Reply-To: <20241203193939.1953303-1-greearb@candelatech.com>
References: <20241203193939.1953303-1-greearb@candelatech.com>
Message-ID:

On 12/3/24 12:39 PM, greearb at candelatech.com wrote:
> From: Ben Greear
>
> Which allows us to bind to VRF.
>
> Signed-off-by: Ben Greear
> ---
>
> v2: Fix bad use of comma, semicolon now used instead.
>
> drivers/net/wireguard/device.h  |  1 +
> drivers/net/wireguard/netlink.c | 12 +++++++++++-
> drivers/net/wireguard/socket.c  |  8 +++++++-
> include/uapi/linux/wireguard.h  |  3 +++
> 4 files changed, 22 insertions(+), 2 deletions(-)
>

LGTM

Reviewed-by: David Ahern

be good to throw a test case under selftests

From jrife at google.com Wed Dec 4 01:23:34 2024
From: jrife at google.com (Jordan Rife)
Date: Wed, 04 Dec 2024 01:23:34 -0000
Subject: [PATCH v2 net-next] wireguard: allowedips: Add WGALLOWEDIP_F_REMOVE_ME flag
In-Reply-To: <20241130095624.1c34a12c@kernel.org>
References: <20241127232133.3928793-1-jrife@google.com> <20241130095624.1c34a12c@kernel.org>
Message-ID:

> Better still use NLA_POLICY_MASK() so that nla_parse_nested() can
> perform the validation and attach a machine readable info about
> the failure.

This is definitely cleaner for the new WGALLOWEDIP_A_FLAGS parameter.
Thanks for the suggestion.

Applying this to WGPEER_A_FLAGS would simplify the existing validation
logic as well, although I think it changes the error code returned if a
user provides an invalid flag from EOPNOTSUPP to EINVAL. I'm not sure
if there's anything relying on this behavior. I'll let Jason make the
final call there.

-Jordan

From jrife at google.com Wed Dec 4 01:50:56 2024
From: jrife at google.com (Jordan Rife)
Date: Wed, 04 Dec 2024 01:50:56 -0000
Subject: [PATCH v3 net-next] wireguard: allowedips: Add WGALLOWEDIP_F_REMOVE_ME flag
Message-ID: <20241204014909.2760696-1-jrife@google.com>

The current netlink API for WireGuard does not directly support removal
of allowed ips from a peer. A user can remove an allowed ip from a peer
in one of two ways:

1. By using the WGPEER_F_REPLACE_ALLOWEDIPS flag and providing a new list
   of allowed ips which omits the allowed ip that is to be removed.
2. By reassigning an allowed ip to a "dummy" peer then removing that peer
   with WGPEER_F_REMOVE_ME.

With the first approach, the driver completely rebuilds the allowed ip
list for a peer. If my current configuration is such that a peer has
allowed ips 192.168.0.2 and 192.168.0.3 and I want to remove 192.168.0.2
the actual transition looks like this.

[192.168.0.2, 192.168.0.3] <-- Initial state
[]                         <-- Step 1: Allowed ips removed for peer
[192.168.0.3]              <-- Step 2: Allowed ips added back for peer

This is true even if the allowed ip list is small and the update does not
need to be batched into multiple WG_CMD_SET_DEVICE requests, as the
removal and subsequent addition of ips is non-atomic within a single
request. Consequently, wg_allowedips_lookup_dst and
wg_allowedips_lookup_src may return NULL while reconfiguring a peer even
for packets bound for ips a user did not intend to remove leading to
unintended interruptions in connectivity. This presents in userspace as
failed calls to sendto and sendmsg for UDP sockets. In my case, I ran
netperf while repeatedly reconfiguring the allowed ips for a peer with
wg.

/usr/local/bin/netperf -H 10.102.73.72 -l 10m -t UDP_STREAM -- -R 1 -m 1024
send_data: data send error: No route to host (errno 113)
netperf: send_omni: send_data failed: No route to host

While this may not be of particular concern for environments where peers
and allowed ips are mostly static, systems like Cilium manage peers and
allowed ips in a dynamic environment where peers (i.e. Kubernetes nodes)
and allowed ips (i.e. pods running on those nodes) can frequently change,
making WGPEER_F_REPLACE_ALLOWEDIPS problematic.

The second approach avoids any possible connectivity interruptions but
is hacky and less direct, requiring the creation of a temporary peer
just to dispose of an allowed ip.

Introduce a new flag called WGALLOWEDIP_F_REMOVE_ME which, in the same
way that WGPEER_F_REMOVE_ME allows a user to remove a single peer from a
WireGuard device's configuration, allows a user to remove an ip from a
peer's set of allowed ips. This enables incremental updates to a
device's configuration without any connectivity blips or messy
workarounds.

A corresponding patch for wg extends the existing `wg set` interface to
leverage this feature.

$ wg set wg0 peer allowed-ips +192.168.88.0/24,-192.168.0.1/32

When '+' or '-' is prepended to any ip in the list, wg clears
WGPEER_F_REPLACE_ALLOWEDIPS and sets the WGALLOWEDIP_F_REMOVE_ME flag on
any ip prefixed with '-'.

v2->v3
------
* Revert WG_GENL_VERSION back to 1.
* Rename _remove() to remove_node().
* Remove unnecessary !peer guard from remove().
* Adjust line length for calls to wg_allowedips_(remove|insert)_v(4|6).
* Fix punctuation inside uapi docs for WGALLOWEDIP_A_FLAGS.
* Get rid of remove-ip program and use wg instead in selftests.
* Use NLA_POLICY_MASK for WGALLOWEDIP_A_FLAGS validation.

v1->v2
------
* Fixed some Sparse warnings.

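As a rough illustration of the '+'/'-' handling described in the commit message above, the following is a user-space sketch only; it is neither the kernel patch nor the wg implementation. sketch_parse_allowedips(), struct sketch_entry and SKETCH_F_REMOVE_ME are hypothetical names, the last one standing in for WGALLOWEDIP_F_REMOVE_ME, and the replace output models whether WGPEER_F_REPLACE_ALLOWEDIPS would stay set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for WGALLOWEDIP_F_REMOVE_ME. */
#define SKETCH_F_REMOVE_ME (1U << 0)

struct sketch_entry {
	char cidr[64];   /* address/prefix text with any '+' or '-' stripped */
	uint32_t flags;  /* SKETCH_F_REMOVE_ME for '-' entries, else 0 */
};

/*
 * Splits a comma-separated allowed-ips list into entries.
 * Returns the number of entries stored, or -1 if out[] or an entry's
 * text buffer is too small. *replace stays true only when no entry
 * carries a '+' or '-' prefix, i.e. the list is a full replacement.
 */
int sketch_parse_allowedips(const char *list, struct sketch_entry *out,
			    size_t max, bool *replace)
{
	size_t n = 0;

	assert(list && out && replace);
	*replace = true;
	while (*list) {
		size_t len = strcspn(list, ","); /* length of this token */
		const char *tok = list;
		size_t tlen = len;
		uint32_t flags = 0;

		if (tlen && (tok[0] == '+' || tok[0] == '-')) {
			if (tok[0] == '-')
				flags |= SKETCH_F_REMOVE_ME;
			*replace = false; /* any prefix makes it incremental */
			tok++;
			tlen--;
		}
		if (n == max || tlen >= sizeof(out[n].cidr))
			return -1;
		memcpy(out[n].cidr, tok, tlen);
		out[n].cidr[tlen] = '\0';
		out[n].flags = flags;
		n++;

		list += len;
		if (*list == ',')
			list++; /* step over the separator */
	}
	return (int)n;
}
```

Feeding it "+192.168.88.0/24,-192.168.0.1/32" yields two entries, with only the second flagged for removal and replace cleared; a list without prefixes leaves replace set, which corresponds to the existing full-replacement behaviour.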
Link: https://lore.kernel.org/netdev/20240905200551.4099064-1-jrife at google.com/ Signed-off-by: Jordan Rife --- drivers/net/wireguard/allowedips.c | 106 ++++++++++++++------ drivers/net/wireguard/allowedips.h | 4 + drivers/net/wireguard/netlink.c | 37 ++++--- drivers/net/wireguard/selftest/allowedips.c | 48 +++++++++ include/uapi/linux/wireguard.h | 9 ++ tools/testing/selftests/wireguard/netns.sh | 32 ++++++ 6 files changed, 193 insertions(+), 43 deletions(-) diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c index 4b8528206cc8..dcf068ba2881 100644 --- a/drivers/net/wireguard/allowedips.c +++ b/drivers/net/wireguard/allowedips.c @@ -249,6 +249,56 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, return 0; } +static void remove_node(struct allowedips_node *node, struct mutex *lock) +{ + struct allowedips_node *child, **parent_bit, *parent; + bool free_parent; + + list_del_init(&node->peer_list); + RCU_INIT_POINTER(node->peer, NULL); + if (node->bit[0] && node->bit[1]) + return; + child = rcu_dereference_protected(node->bit[!rcu_access_pointer(node->bit[0])], + lockdep_is_held(lock)); + if (child) + child->parent_bit_packed = node->parent_bit_packed; + parent_bit = (struct allowedips_node **)(node->parent_bit_packed & ~3UL); + *parent_bit = child; + parent = (void *)parent_bit - + offsetof(struct allowedips_node, bit[node->parent_bit_packed & 1]); + free_parent = !rcu_access_pointer(node->bit[0]) && + !rcu_access_pointer(node->bit[1]) && + (node->parent_bit_packed & 3) <= 1 && + !rcu_access_pointer(parent->peer); + if (free_parent) + child = rcu_dereference_protected(parent->bit[!(node->parent_bit_packed & 1)], + lockdep_is_held(lock)); + call_rcu(&node->rcu, node_free_rcu); + if (!free_parent) + return; + if (child) + child->parent_bit_packed = parent->parent_bit_packed; + *(struct allowedips_node **)(parent->parent_bit_packed & ~3UL) = child; + call_rcu(&parent->rcu, node_free_rcu); +} + +static int 
remove(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, + u8 cidr, struct wg_peer *peer, struct mutex *lock) +{ + struct allowedips_node *node; + + if (unlikely(cidr > bits)) + return -EINVAL; + if (!rcu_access_pointer(*trie) || + !node_placement(*trie, key, cidr, bits, &node, lock) || + peer != rcu_access_pointer(node->peer)) + return 0; + + remove_node(node, lock); + + return 0; +} + void wg_allowedips_init(struct allowedips *table) { table->root4 = table->root6 = NULL; @@ -300,44 +350,38 @@ int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip, return add(&table->root6, 128, key, cidr, peer, lock); } +int wg_allowedips_remove_v4(struct allowedips *table, const struct in_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock) +{ + /* Aligned so it can be passed to fls */ + u8 key[4] __aligned(__alignof(u32)); + + ++table->seq; + swap_endian(key, (const u8 *)ip, 32); + return remove(&table->root4, 32, key, cidr, peer, lock); +} + +int wg_allowedips_remove_v6(struct allowedips *table, const struct in6_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock) +{ + /* Aligned so it can be passed to fls64 */ + u8 key[16] __aligned(__alignof(u64)); + + ++table->seq; + swap_endian(key, (const u8 *)ip, 128); + return remove(&table->root6, 128, key, cidr, peer, lock); +} + void wg_allowedips_remove_by_peer(struct allowedips *table, struct wg_peer *peer, struct mutex *lock) { - struct allowedips_node *node, *child, **parent_bit, *parent, *tmp; - bool free_parent; + struct allowedips_node *node, *tmp; if (list_empty(&peer->allowedips_list)) return; ++table->seq; - list_for_each_entry_safe(node, tmp, &peer->allowedips_list, peer_list) { - list_del_init(&node->peer_list); - RCU_INIT_POINTER(node->peer, NULL); - if (node->bit[0] && node->bit[1]) - continue; - child = rcu_dereference_protected(node->bit[!rcu_access_pointer(node->bit[0])], - lockdep_is_held(lock)); - if (child) - child->parent_bit_packed = 
node->parent_bit_packed; - parent_bit = (struct allowedips_node **)(node->parent_bit_packed & ~3UL); - *parent_bit = child; - parent = (void *)parent_bit - - offsetof(struct allowedips_node, bit[node->parent_bit_packed & 1]); - free_parent = !rcu_access_pointer(node->bit[0]) && - !rcu_access_pointer(node->bit[1]) && - (node->parent_bit_packed & 3) <= 1 && - !rcu_access_pointer(parent->peer); - if (free_parent) - child = rcu_dereference_protected( - parent->bit[!(node->parent_bit_packed & 1)], - lockdep_is_held(lock)); - call_rcu(&node->rcu, node_free_rcu); - if (!free_parent) - continue; - if (child) - child->parent_bit_packed = parent->parent_bit_packed; - *(struct allowedips_node **)(parent->parent_bit_packed & ~3UL) = child; - call_rcu(&parent->rcu, node_free_rcu); - } + list_for_each_entry_safe(node, tmp, &peer->allowedips_list, peer_list) + remove_node(node, lock); } int wg_allowedips_read_node(struct allowedips_node *node, u8 ip[16], u8 *cidr) diff --git a/drivers/net/wireguard/allowedips.h b/drivers/net/wireguard/allowedips.h index 2346c797eb4d..931958cb6e10 100644 --- a/drivers/net/wireguard/allowedips.h +++ b/drivers/net/wireguard/allowedips.h @@ -38,6 +38,10 @@ int wg_allowedips_insert_v4(struct allowedips *table, const struct in_addr *ip, u8 cidr, struct wg_peer *peer, struct mutex *lock); int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip, u8 cidr, struct wg_peer *peer, struct mutex *lock); +int wg_allowedips_remove_v4(struct allowedips *table, const struct in_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock); +int wg_allowedips_remove_v6(struct allowedips *table, const struct in6_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock); void wg_allowedips_remove_by_peer(struct allowedips *table, struct wg_peer *peer, struct mutex *lock); /* The ip input pointer should be __aligned(__alignof(u64))) */ diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c index 
f7055180ba4a..386f65042072 100644 --- a/drivers/net/wireguard/netlink.c +++ b/drivers/net/wireguard/netlink.c @@ -46,7 +46,8 @@ static const struct nla_policy peer_policy[WGPEER_A_MAX + 1] = { static const struct nla_policy allowedip_policy[WGALLOWEDIP_A_MAX + 1] = { [WGALLOWEDIP_A_FAMILY] = { .type = NLA_U16 }, [WGALLOWEDIP_A_IPADDR] = NLA_POLICY_MIN_LEN(sizeof(struct in_addr)), - [WGALLOWEDIP_A_CIDR_MASK] = { .type = NLA_U8 } + [WGALLOWEDIP_A_CIDR_MASK] = { .type = NLA_U8 }, + [WGALLOWEDIP_A_FLAGS] = NLA_POLICY_MASK(NLA_U32, __WGALLOWEDIP_F_ALL), }; static struct wg_device *lookup_interface(struct nlattr **attrs, @@ -329,6 +330,7 @@ static int set_port(struct wg_device *wg, u16 port) static int set_allowedip(struct wg_peer *peer, struct nlattr **attrs) { int ret = -EINVAL; + u32 flags = 0; u16 family; u8 cidr; @@ -337,19 +339,30 @@ static int set_allowedip(struct wg_peer *peer, struct nlattr **attrs) return ret; family = nla_get_u16(attrs[WGALLOWEDIP_A_FAMILY]); cidr = nla_get_u8(attrs[WGALLOWEDIP_A_CIDR_MASK]); + if (attrs[WGALLOWEDIP_A_FLAGS]) + flags = nla_get_u32(attrs[WGALLOWEDIP_A_FLAGS]); if (family == AF_INET && cidr <= 32 && - nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in_addr)) - ret = wg_allowedips_insert_v4( - &peer->device->peer_allowedips, - nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer, - &peer->device->device_update_lock); - else if (family == AF_INET6 && cidr <= 128 && - nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in6_addr)) - ret = wg_allowedips_insert_v6( - &peer->device->peer_allowedips, - nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer, - &peer->device->device_update_lock); + nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in_addr)) { + if (flags & WGALLOWEDIP_F_REMOVE_ME) + ret = wg_allowedips_remove_v4(&peer->device->peer_allowedips, + nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, + peer, &peer->device->device_update_lock); + else + ret = wg_allowedips_insert_v4(&peer->device->peer_allowedips, + 
nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, + peer, &peer->device->device_update_lock); + } else if (family == AF_INET6 && cidr <= 128 && + nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in6_addr)) { + if (flags & WGALLOWEDIP_F_REMOVE_ME) + ret = wg_allowedips_remove_v6(&peer->device->peer_allowedips, + nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, + peer, &peer->device->device_update_lock); + else + ret = wg_allowedips_insert_v6(&peer->device->peer_allowedips, + nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, + peer, &peer->device->device_update_lock); + } return ret; } diff --git a/drivers/net/wireguard/selftest/allowedips.c b/drivers/net/wireguard/selftest/allowedips.c index 3d1f64ff2e12..dc51223a1d3a 100644 --- a/drivers/net/wireguard/selftest/allowedips.c +++ b/drivers/net/wireguard/selftest/allowedips.c @@ -461,6 +461,10 @@ static __init struct wg_peer *init_peer(void) wg_allowedips_insert_v##version(&t, ip##version(ipa, ipb, ipc, ipd), \ cidr, mem, &mutex) +#define remove(version, mem, ipa, ipb, ipc, ipd, cidr) \ + wg_allowedips_remove_v##version(&t, ip##version(ipa, ipb, ipc, ipd), \ + cidr, mem, &mutex) + #define maybe_fail() do { \ ++i; \ if (!_s) { \ @@ -586,6 +590,50 @@ bool __init wg_allowedips_selftest(void) test_negative(4, a, 192, 0, 0, 0); test_negative(4, a, 255, 0, 0, 0); + insert(4, a, 1, 0, 0, 0, 32); + insert(4, a, 192, 0, 0, 0, 24); + insert(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128); + insert(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0, 98); + test(4, a, 1, 0, 0, 0); + test(4, a, 192, 0, 0, 1); + test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + test(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0x10101010); + /* Must be an exact match to remove */ + remove(4, a, 192, 0, 0, 0, 32); + test(4, a, 192, 0, 0, 1); + /* NULL peer should have no effect and return 0 */ + test_boolean(!remove(4, NULL, 192, 0, 0, 0, 24)); + test(4, a, 192, 0, 0, 1); + /* different peer should have no effect and return 0 */ + 
test_boolean(!remove(4, b, 192, 0, 0, 0, 24)); + test(4, a, 192, 0, 0, 1); + /* invalid CIDR should have no effect and return -EINVAL */ + test_boolean(remove(4, b, 192, 0, 0, 0, 33) == -EINVAL); + test(4, a, 192, 0, 0, 1); + remove(4, a, 192, 0, 0, 0, 24); + test_negative(4, a, 192, 0, 0, 1); + remove(4, a, 1, 0, 0, 0, 32); + test_negative(4, a, 1, 0, 0, 0); + /* Must be an exact match to remove */ + remove(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 96); + test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + /* NULL peer should have no effect and return 0 */ + test_boolean(!remove(6, NULL, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128)); + test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + /* different peer should have no effect and return 0 */ + test_boolean(!remove(6, b, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128)); + test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + /* invalid CIDR should have no effect and return -EINVAL */ + test_boolean(remove(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 129) == -EINVAL); + test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + remove(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128); + test_negative(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef); + /* Must match the peer to remove */ + remove(6, b, 0x24446800, 0xf0e40800, 0xeeaebeef, 0, 98); + test(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0x10101010); + remove(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0, 98); + test_negative(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0x10101010); + wg_allowedips_free(&t, &mutex); wg_allowedips_init(&t); insert(4, a, 192, 168, 0, 0, 16); diff --git a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h index ae88be14c947..8c26391196d5 100644 --- a/include/uapi/linux/wireguard.h +++ b/include/uapi/linux/wireguard.h @@ -101,6 +101,10 @@ * WGALLOWEDIP_A_FAMILY: NLA_U16 * WGALLOWEDIP_A_IPADDR: struct in_addr or struct in6_addr * WGALLOWEDIP_A_CIDR_MASK: 
NLA_U8 + * WGALLOWEDIP_A_FLAGS: NLA_U32, WGALLOWEDIP_F_REMOVE_ME if + * the specified IP should be removed; + * otherwise, this IP will be added if + * it is not already present. * 0: NLA_NESTED * ... * 0: NLA_NESTED @@ -184,11 +188,16 @@ enum wgpeer_attribute { }; #define WGPEER_A_MAX (__WGPEER_A_LAST - 1) +enum wgallowedip_flag { + WGALLOWEDIP_F_REMOVE_ME = 1U << 0, + __WGALLOWEDIP_F_ALL = WGALLOWEDIP_F_REMOVE_ME +}; enum wgallowedip_attribute { WGALLOWEDIP_A_UNSPEC, WGALLOWEDIP_A_FAMILY, WGALLOWEDIP_A_IPADDR, WGALLOWEDIP_A_CIDR_MASK, + WGALLOWEDIP_A_FLAGS, __WGALLOWEDIP_A_LAST }; #define WGALLOWEDIP_A_MAX (__WGALLOWEDIP_A_LAST - 1) diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh index 405ff262ca93..853922c895cd 100755 --- a/tools/testing/selftests/wireguard/netns.sh +++ b/tools/testing/selftests/wireguard/netns.sh @@ -610,6 +610,38 @@ n0 wg set wg0 peer "$pub2" allowed-ips "$allowedips" } < <(n0 wg show wg0 allowed-ips) ip0 link del wg0 +# Test IP removal +allowedips=( ) +for i in {1..197}; do + allowedips+=( 192.168.0.$i ) + allowedips+=( abcd::$i ) +done +saved_ifs="$IFS" +IFS=, +allowedips="${allowedips[*]}" +IFS="$saved_ifs" +ip0 link add wg0 type wireguard +n0 wg set wg0 peer "$pub1" allowed-ips "$allowedips" +pub1_hex=$(echo "$pub1" | base64 -d | xxd -p -c 50) +n0 wg set wg0 peer "$pub1" allowed-ips -192.168.0.1/32,-192.168.0.20/32,-192.168.0.100/32,-abcd::1/128,-abcd::20/128,-abcd::100/128 +n0 wg show wg0 allowed-ips +{ + read -r pub allowedips + [[ $pub == "$pub1" ]] + i=0 + for ip in $allowedips; do + [[ "$ip" != "192.168.0.1" ]] + [[ "$ip" != "192.168.0.20" ]] + [[ "$ip" != "192.168.0.100" ]] + [[ "$ip" != "abcd::1" ]] + [[ "$ip" != "abcd::20" ]] + [[ "$ip" != "abcd::100" ]] + ((++i)) + done + ((i == 388)) +} < <(n0 wg show wg0 allowed-ips) +ip0 link del wg0 + ! 
n0 wg show doesnotexist || false ip0 link add wg0 type wireguard -- 2.47.0.338.g60cca15819-goog From jrife at google.com Wed Dec 4 01:53:41 2024 From: jrife at google.com (Jordan Rife) Date: Wed, 04 Dec 2024 01:53:41 -0000 Subject: [PATCH v1 wireguard-tools] ipc: linux: Support incremental allowed ips updates Message-ID: <20241204015331.2762169-1-jrife@google.com> [1] adds support for the WGALLOWEDIP_F_REMOVE_ME flag to WireGuard's Linux driver which is a direct way of removing a single allowed ip from a peer. Extend the interface of `wg set` to leverage this feature, allowing for incremental updates to a peer's configuration. By default, allowed-ips fully replaces a peer's allowed ips using WGPEER_REPLACE_ALLOWEDIPS under the hood. When '+' or '-' is prepended to any ip in the list, wg clears WGPEER_F_REPLACE_ALLOWEDIPS and sets the WGALLOWEDIP_F_REMOVE_ME flag on any ip prefixed with '-'. $ wg set wg0 peer allowed-ips +192.168.88.0/24,-192.168.0.1/32 This command means "add 192.168.88.0/24 to this peer's allowed ips if not present, and remove 192.168.0.1/32 if present". Currently, this feature is only enabled for Linux builds. [1]: https://lore.kernel.org/netdev/20241204014909.2760696-1-jrife at google.com/T/#u Signed-off-by: Jordan Rife --- src/config.c | 27 +++++++++++++++++++++++++++ src/containers.h | 5 +++++ src/ipc-linux.h | 2 ++ src/man/wg.8 | 8 ++++++-- src/set.c | 2 +- src/uapi/linux/linux/wireguard.h | 9 +++++++++ 6 files changed, 50 insertions(+), 3 deletions(-) diff --git a/src/config.c b/src/config.c index 81ccb47..b740f73 100644 --- a/src/config.c +++ b/src/config.c @@ -337,6 +337,29 @@ static bool validate_netmask(struct wgallowedip *allowedip) return true; } +#if defined(__linux__) +static inline void parse_ip_prefix(struct wgpeer *peer, uint32_t *flags, char **mask) +{ + /* If the IP is prefixed with either '+' or '-' consider + * this an incremental change. Disable WGPEER_REPLACE_ALLOWEDIPS. 
+ */ + switch ((*mask)[0]) { + case '-': + *flags |= WGALLOWEDIP_REMOVE_ME; + /* fall through */ + case '+': + peer->flags &= ~WGPEER_REPLACE_ALLOWEDIPS; + (*mask)++; + } +} +#else +static inline void parse_ip_prefix(struct wgpeer *peer __attribute__ ((unused)), + uint32_t *flags __attribute__ ((unused)), + char **mask __attribute__ ((unused))) +{ +} +#endif + static inline bool parse_allowedips(struct wgpeer *peer, struct wgallowedip **last_allowedip, const char *value) { struct wgallowedip *allowedip = *last_allowedip, *new_allowedip; @@ -353,9 +376,12 @@ static inline bool parse_allowedips(struct wgpeer *peer, struct wgallowedip **la } sep = mutable; while ((mask = strsep(&sep, ","))) { + uint32_t flags = 0; unsigned long cidr; char *end, *ip; + parse_ip_prefix(peer, &flags, &mask); + saved_entry = strdup(mask); ip = strsep(&mask, "/"); @@ -387,6 +413,7 @@ static inline bool parse_allowedips(struct wgpeer *peer, struct wgallowedip **la else goto err; new_allowedip->cidr = cidr; + new_allowedip->flags = flags; if (!validate_netmask(new_allowedip)) fprintf(stderr, "Warning: AllowedIP has nonzero host part: %s/%s\n", ip, mask); diff --git a/src/containers.h b/src/containers.h index a82e8dd..8fd813a 100644 --- a/src/containers.h +++ b/src/containers.h @@ -28,6 +28,10 @@ struct timespec64 { int64_t tv_nsec; }; +enum { + WGALLOWEDIP_REMOVE_ME = 1U << 0, +}; + struct wgallowedip { uint16_t family; union { @@ -35,6 +39,7 @@ struct wgallowedip { struct in6_addr ip6; }; uint8_t cidr; + uint32_t flags; struct wgallowedip *next_allowedip; }; diff --git a/src/ipc-linux.h b/src/ipc-linux.h index d29c0c5..01247f1 100644 --- a/src/ipc-linux.h +++ b/src/ipc-linux.h @@ -228,6 +228,8 @@ again: } if (!mnl_attr_put_u8_check(nlh, SOCKET_BUFFER_SIZE, WGALLOWEDIP_A_CIDR_MASK, allowedip->cidr)) goto toobig_allowedips; + if (allowedip->flags && !mnl_attr_put_u32_check(nlh, SOCKET_BUFFER_SIZE, WGALLOWEDIP_A_FLAGS, allowedip->flags)) + goto toobig_allowedips; mnl_attr_nest_end(nlh, 
allowedip_nest); allowedip_nest = NULL; } diff --git a/src/man/wg.8 b/src/man/wg.8 index 7984539..1ec68df 100644 --- a/src/man/wg.8 +++ b/src/man/wg.8 @@ -55,7 +55,7 @@ transfer-rx, transfer-tx, persistent-keepalive. Shows the current configuration of \fI\fP in the format described by \fICONFIGURATION FILE FORMAT\fP below. .TP -\fBset\fP \fI\fP [\fIlisten-port\fP \fI\fP] [\fIfwmark\fP \fI\fP] [\fIprivate-key\fP \fI\fP] [\fIpeer\fP \fI\fP [\fIremove\fP] [\fIpreshared-key\fP \fI\fP] [\fIendpoint\fP \fI:\fP] [\fIpersistent-keepalive\fP \fI\fP] [\fIallowed-ips\fP \fI/\fP[,\fI/\fP]...] ]... +\fBset\fP \fI\fP [\fIlisten-port\fP \fI\fP] [\fIfwmark\fP \fI\fP] [\fIprivate-key\fP \fI\fP] [\fIpeer\fP \fI\fP [\fIremove\fP] [\fIpreshared-key\fP \fI\fP] [\fIendpoint\fP \fI:\fP] [\fIpersistent-keepalive\fP \fI\fP] [\fIallowed-ips\fP \fI[+|-]/\fP[,\fI[+|-]/\fP]...] ]... Sets configuration values for the specified \fI\fP. Multiple \fIpeer\fPs may be specified, and if the \fIremove\fP argument is given for a peer, that peer is removed, not configured. If \fIlisten-port\fP @@ -72,7 +72,11 @@ the device. The use of \fIpreshared-key\fP is optional, and may be omitted; it adds an additional layer of symmetric-key cryptography to be mixed into the already existing public-key cryptography, for post-quantum resistance. If \fIallowed-ips\fP is specified, but the value is the empty string, all -allowed ips are removed from the peer. The use of \fIpersistent-keepalive\fP +allowed ips are removed from the peer. By default, \fIallowed-ips\fP replaces +a peer's allowed ips. (Linux only) If + or - is prepended to any of the ips then +the update is incremental; ips prefixed with '+' or '' are added to the peer's +allowed ips if not present while ips prefixed with '-' are removed if present. +The use of \fIpersistent-keepalive\fP is optional and is by default off; setting it to 0 or "off" disables it. 
Otherwise it represents, in seconds, between 1 and 65535 inclusive, how often to send an authenticated empty packet to the peer, for the purpose of keeping diff --git a/src/set.c b/src/set.c index 75560fd..992ffa2 100644 --- a/src/set.c +++ b/src/set.c @@ -18,7 +18,7 @@ int set_main(int argc, const char *argv[]) int ret = 1; if (argc < 3) { - fprintf(stderr, "Usage: %s %s [listen-port ] [fwmark ] [private-key ] [peer [remove] [preshared-key ] [endpoint :] [persistent-keepalive ] [allowed-ips /[,/]...] ]...\n", PROG_NAME, argv[0]); + fprintf(stderr, "Usage: %s %s [listen-port ] [fwmark ] [private-key ] [peer [remove] [preshared-key ] [endpoint :] [persistent-keepalive ] [allowed-ips [+|-]/[,[+|-]/]...] ]...\n", PROG_NAME, argv[0]); return 1; } diff --git a/src/uapi/linux/linux/wireguard.h b/src/uapi/linux/linux/wireguard.h index 0efd52c..6ca266a 100644 --- a/src/uapi/linux/linux/wireguard.h +++ b/src/uapi/linux/linux/wireguard.h @@ -101,6 +101,10 @@ * WGALLOWEDIP_A_FAMILY: NLA_U16 * WGALLOWEDIP_A_IPADDR: struct in_addr or struct in6_addr * WGALLOWEDIP_A_CIDR_MASK: NLA_U8 + * WGALLOWEDIP_A_FLAGS: NLA_U32, WGALLOWEDIP_F_REMOVE_ME if + * the specified IP should be removed; + * otherwise, this IP will be added if + * it is not already present. * 0: NLA_NESTED * ... 
  *                0: NLA_NESTED
@@ -184,11 +188,16 @@ enum wgpeer_attribute {
 };
 #define WGPEER_A_MAX (__WGPEER_A_LAST - 1)

+enum wgallowedip_flag {
+	WGALLOWEDIP_F_REMOVE_ME = 1U << 0,
+	__WGALLOWEDIP_F_ALL = WGALLOWEDIP_F_REMOVE_ME
+};
 enum wgallowedip_attribute {
 	WGALLOWEDIP_A_UNSPEC,
 	WGALLOWEDIP_A_FAMILY,
 	WGALLOWEDIP_A_IPADDR,
 	WGALLOWEDIP_A_CIDR_MASK,
+	WGALLOWEDIP_A_FLAGS,
 	__WGALLOWEDIP_A_LAST
 };
 #define WGALLOWEDIP_A_MAX (__WGALLOWEDIP_A_LAST - 1)
--
2.47.0.338.g60cca15819-goog

From shaw.leon at gmail.com Mon Dec 9 14:02:30 2024
From: shaw.leon at gmail.com (Xiao Liang)
Date: Mon, 09 Dec 2024 14:02:30 -0000
Subject: [PATCH net-next v5 0/5] net: Improve netns handling in RTNL and ip_tunnel
Message-ID: <20241209140151.231257-1-shaw.leon@gmail.com>

This patch series includes some netns-related improvements and fixes for
RTNL and ip_tunnel, to make link creation more intuitive:

 - Creating a link in another net namespace doesn't conflict with link
   names in the current one.
 - Refactor rtnetlink link creation. Create the link in the target
   namespace directly. Pass both source and link netns to drivers via
   the newlink() callback.

So that

  # ip link add netns ns1 link-netns ns2 tun0 type gre ...

will create tun0 in ns1, rather than create it in ns2 and move it to ns1.
It also does not conflict with another interface named "tun0" in the
current netns.

---

v5:
 - Fix function doc in batman-adv.
 - Include peer_net in rtnl newlink parameters.

v4:
 link: https://lore.kernel.org/all/20241118143244.1773-1-shaw.leon at gmail.com/
 - Pack newlink() parameters to a single struct.
 - Use ynl async_msg_queue.empty() in selftest.

v3:
 link: https://lore.kernel.org/all/20241113125715.150201-1-shaw.leon at gmail.com/
 - Drop "netns_atomic" flag and module parameter. Add netns parameter to
   newlink() instead, and convert drivers accordingly.
 - Move python NetNSEnter helper to net selftest lib.

v2:
 link: https://lore.kernel.org/all/20241107133004.7469-1-shaw.leon at gmail.com/
 - Check NLM_F_EXCL to ensure only link creation is affected.
- Add self tests for link name/ifindex conflict and notifications in different netns. - Changes in dummy driver and ynl in order to add the test case. v1: link: https://lore.kernel.org/all/20241023023146.372653-1-shaw.leon at gmail.com/ Xiao Liang (5): net: ip_tunnel: Build flow in underlay net namespace rtnetlink: Lookup device in target netns when creating link rtnetlink: Decouple net namespaces in rtnl_newlink_create() selftests: net: Add python context manager for netns entering selftests: net: Add two test cases for link netns drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 11 +++-- drivers/net/amt.c | 13 +++--- drivers/net/bareudp.c | 11 +++-- drivers/net/bonding/bond_netlink.c | 8 ++-- drivers/net/can/dev/netlink.c | 4 +- drivers/net/can/vxcan.c | 9 ++-- .../ethernet/qualcomm/rmnet/rmnet_config.c | 11 +++-- drivers/net/geneve.c | 11 +++-- drivers/net/gtp.c | 9 ++-- drivers/net/ipvlan/ipvlan.h | 4 +- drivers/net/ipvlan/ipvlan_main.c | 11 +++-- drivers/net/ipvlan/ipvtap.c | 7 ++- drivers/net/macsec.c | 11 +++-- drivers/net/macvlan.c | 8 ++-- drivers/net/macvtap.c | 8 ++-- drivers/net/netkit.c | 9 ++-- drivers/net/pfcp.c | 8 ++-- drivers/net/ppp/ppp_generic.c | 10 +++-- drivers/net/team/team_core.c | 7 +-- drivers/net/veth.c | 9 ++-- drivers/net/vrf.c | 7 +-- drivers/net/vxlan/vxlan_core.c | 11 +++-- drivers/net/wireguard/device.c | 8 ++-- drivers/net/wireless/virtual/virt_wifi.c | 10 +++-- drivers/net/wwan/wwan_core.c | 15 +++++-- include/net/ip_tunnels.h | 5 ++- include/net/rtnetlink.h | 44 ++++++++++++++++--- net/8021q/vlan_netlink.c | 11 +++-- net/batman-adv/soft-interface.c | 12 ++--- net/bridge/br_netlink.c | 8 ++-- net/caif/chnl_net.c | 6 +-- net/core/rtnetlink.c | 35 ++++++++------- net/hsr/hsr_netlink.c | 14 +++--- net/ieee802154/6lowpan/core.c | 9 ++-- net/ipv4/ip_gre.c | 27 ++++++++---- net/ipv4/ip_tunnel.c | 16 ++++--- net/ipv4/ip_vti.c | 10 +++-- net/ipv4/ipip.c | 10 +++-- net/ipv6/ip6_gre.c | 28 +++++++----- net/ipv6/ip6_tunnel.c | 16 +++---- 
net/ipv6/ip6_vti.c | 15 +++---- net/ipv6/sit.c | 16 +++---- net/xfrm/xfrm_interface_core.c | 14 +++--- tools/testing/selftests/net/Makefile | 1 + .../testing/selftests/net/lib/py/__init__.py | 2 +- tools/testing/selftests/net/lib/py/netns.py | 18 ++++++++ tools/testing/selftests/net/netns-name.sh | 10 +++++ tools/testing/selftests/net/netns_atomic.py | 39 ++++++++++++++++ 48 files changed, 385 insertions(+), 211 deletions(-) create mode 100755 tools/testing/selftests/net/netns_atomic.py -- 2.47.1 From shaw.leon at gmail.com Mon Dec 9 14:02:39 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Mon, 09 Dec 2024 14:02:39 -0000 Subject: [PATCH net-next v5 1/5] net: ip_tunnel: Build flow in underlay net namespace In-Reply-To: <20241209140151.231257-1-shaw.leon@gmail.com> References: <20241209140151.231257-1-shaw.leon@gmail.com> Message-ID: <20241209140151.231257-2-shaw.leon@gmail.com> Build IPv4 flow in underlay net namespace, where encapsulated packets are routed. Signed-off-by: Xiao Liang --- net/ipv4/ip_tunnel.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 25505f9b724c..09b73acf037a 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -294,7 +294,7 @@ static int ip_tunnel_bind_dev(struct net_device *dev) ip_tunnel_init_flow(&fl4, iph->protocol, iph->daddr, iph->saddr, tunnel->parms.o_key, - iph->tos & INET_DSCP_MASK, dev_net(dev), + iph->tos & INET_DSCP_MASK, tunnel->net, tunnel->parms.link, tunnel->fwmark, 0, 0); rt = ip_route_output_key(tunnel->net, &fl4); @@ -611,7 +611,7 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, } ip_tunnel_init_flow(&fl4, proto, key->u.ipv4.dst, key->u.ipv4.src, tunnel_id_to_key32(key->tun_id), - tos & INET_DSCP_MASK, dev_net(dev), 0, skb->mark, + tos & INET_DSCP_MASK, tunnel->net, 0, skb->mark, skb_get_hash(skb), key->flow_flags); if (!tunnel_hlen) @@ -774,7 +774,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device 
*dev, ip_tunnel_init_flow(&fl4, protocol, dst, tnl_params->saddr, tunnel->parms.o_key, tos & INET_DSCP_MASK, - dev_net(dev), READ_ONCE(tunnel->parms.link), + tunnel->net, READ_ONCE(tunnel->parms.link), tunnel->fwmark, skb_get_hash(skb), 0); if (ip_tunnel_encap(skb, &tunnel->encap, &protocol, &fl4) < 0) -- 2.47.1 From shaw.leon at gmail.com Mon Dec 9 14:03:48 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Mon, 09 Dec 2024 14:03:48 -0000 Subject: [PATCH net-next v5 2/5] rtnetlink: Lookup device in target netns when creating link In-Reply-To: <20241209140151.231257-1-shaw.leon@gmail.com> References: <20241209140151.231257-1-shaw.leon@gmail.com> Message-ID: <20241209140151.231257-3-shaw.leon@gmail.com> When creating link, lookup for existing device in target net namespace instead of current one. For example, two links created by: # ip link add dummy1 type dummy # ip link add netns ns1 dummy1 type dummy should have no conflict since they are in different namespaces. Signed-off-by: Xiao Liang --- net/core/rtnetlink.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index ab5f201bf0ab..7855f81c917b 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3851,20 +3851,26 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, { struct nlattr ** const tb = tbs->tb; struct net *net = sock_net(skb->sk); + struct net *device_net; struct net_device *dev; struct ifinfomsg *ifm; bool link_specified; + /* When creating, lookup for existing device in target net namespace */ + device_net = (nlh->nlmsg_flags & NLM_F_CREATE) && + (nlh->nlmsg_flags & NLM_F_EXCL) ? 
+ tgt_net : net; + ifm = nlmsg_data(nlh); if (ifm->ifi_index > 0) { link_specified = true; - dev = __dev_get_by_index(net, ifm->ifi_index); + dev = __dev_get_by_index(device_net, ifm->ifi_index); } else if (ifm->ifi_index < 0) { NL_SET_ERR_MSG(extack, "ifindex can't be negative"); return -EINVAL; } else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME]) { link_specified = true; - dev = rtnl_dev_get(net, tb); + dev = rtnl_dev_get(device_net, tb); } else { link_specified = false; dev = NULL; -- 2.47.1
From shaw.leon at gmail.com Mon Dec 9 14:06:00 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Mon, 09 Dec 2024 14:06:00 -0000 Subject: [PATCH net-next v5 3/5] rtnetlink: Decouple net namespaces in rtnl_newlink_create() In-Reply-To: <20241209140151.231257-1-shaw.leon@gmail.com> References: <20241209140151.231257-1-shaw.leon@gmail.com> Message-ID: <20241209140151.231257-4-shaw.leon@gmail.com>
There are 4 net namespaces involved when creating links:
- source netns - where the netlink socket resides,
- target netns - where to put the device being created,
- link netns - netns associated with the device (backend),
- peer netns - netns of peer device.
Currently, two nets are passed to newlink() callback - "src_net" parameter and "dev_net" (implicitly in net_device). They are set as follows, depending on netlink attributes.
+------------+-------------------+---------+---------+
| peer netns | IFLA_LINK_NETNSID | src_net | dev_net |
+------------+-------------------+---------+---------+
|            | absent            | source  | target  |
| absent     +-------------------+---------+---------+
|            | present           | link    | link    |
+------------+-------------------+---------+---------+
|            | absent            | peer    | target  |
| present    +-------------------+---------+---------+
|            | present           | peer    | link    |
+------------+-------------------+---------+---------+
When IFLA_LINK_NETNSID is present, the device is created in link netns first.
This has some side effects, including extra ifindex allocation, ifname validation and link notifications. There's also an extra step to move the device to target netns. These could be avoided if we create it in target netns at the beginning.
On the other hand, the meaning of src_net is ambiguous. It varies depending on how parameters are passed. It is the effective link or peer netns by design, but some drivers ignore it and use dev_net instead.
This patch refactors netns handling by packing newlink() parameters into a struct, and passing source, link and peer netns as is through this struct. Fallback logic is implemented in helper functions - rtnl_newlink_link_net() and rtnl_newlink_peer_net(). If not set, peer netns falls back to link netns, and link netns falls back to source netns. rtnl_newlink_create() now creates devices in target netns directly, so dev_net is always target netns.
For drivers that use dev_net as fallback of link_netns, current behavior is kept for compatibility.
Signed-off-by: Xiao Liang
---
There are some issues found when converting drivers. Please check if they work as intended:
- In amt_newlink() in drivers/net/amt.c: amt->net = net; ... amt->stream_dev = dev_get_by_index(net, ... Uses net (src_net actually), but amt_lookup_upper_dev() only searches in dev_net.
- In gtp_newlink() in drivers/net/gtp.c: gtp->net = src_net; ... gn = net_generic(dev_net(dev), gtp_net_id); list_add_rcu(&gtp->list, &gn->gtp_dev_list); Uses src_net, but is linked to list in dev_net.
- In pfcp_newlink() in drivers/net/pfcp.c: pfcp->net = net; ... pn = net_generic(dev_net(dev), pfcp_net_id); list_add_rcu(&pfcp->list, &pn->pfcp_dev_list); Same.
- In lowpan_newlink() in net/ieee802154/6lowpan/core.c: wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK])); Looks for IFLA_LINK in dev_net, but in theory the ifindex is defined in link netns.
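For reference, the fallback order described in the commit message (peer netns falls back to link netns, link netns falls back to source netns) can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct net here is a stand-in with a name field, and newlink_params_sketch, sketch_link_net(), sketch_peer_net() and sketch_self_check() are hypothetical names, not kernel API. The real helpers are rtnl_newlink_link_net() and rtnl_newlink_peer_net() added to include/net/rtnetlink.h by this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct net; only pointer identity matters. */
struct net {
	const char *name;
};

/* Hypothetical, reduced version of the params struct: netns fields only. */
struct newlink_params_sketch {
	struct net *src_net;   /* netns of the netlink socket, always set */
	struct net *link_net;  /* from IFLA_LINK_NETNSID, NULL if absent */
	struct net *peer_net;  /* peer netns, NULL if absent */
};

/* Effective link netns: link_net if given, otherwise src_net. */
static struct net *sketch_link_net(const struct newlink_params_sketch *p)
{
	return p->link_net ? p->link_net : p->src_net;
}

/* Effective peer netns: peer_net if given, otherwise effective link netns. */
static struct net *sketch_peer_net(const struct newlink_params_sketch *p)
{
	return p->peer_net ? p->peer_net : sketch_link_net(p);
}

/* Walks the three interesting combinations of absent/present attributes. */
static void sketch_self_check(void)
{
	struct net src = { "src" }, link = { "link" }, peer = { "peer" };
	struct newlink_params_sketch p = { .src_net = &src };

	/* Nothing specified: both resolve to the source netns. */
	assert(sketch_link_net(&p) == &src);
	assert(sketch_peer_net(&p) == &src);

	/* Link netns specified: the peer follows the link netns. */
	p.link_net = &link;
	assert(sketch_link_net(&p) == &link);
	assert(sketch_peer_net(&p) == &link);

	/* Peer netns specified: it wins for the peer only. */
	p.peer_net = &peer;
	assert(sketch_link_net(&p) == &link);
	assert(sketch_peer_net(&p) == &peer);
}
```

Note that the sketch does not model the compatibility case mentioned above, where a driver keeps using dev_net(dev) instead of the effective link netns.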
--- drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 11 +++-- drivers/net/amt.c | 13 +++--- drivers/net/bareudp.c | 11 +++-- drivers/net/bonding/bond_netlink.c | 8 ++-- drivers/net/can/dev/netlink.c | 4 +- drivers/net/can/vxcan.c | 9 ++-- .../ethernet/qualcomm/rmnet/rmnet_config.c | 11 +++-- drivers/net/geneve.c | 11 +++-- drivers/net/gtp.c | 9 ++-- drivers/net/ipvlan/ipvlan.h | 4 +- drivers/net/ipvlan/ipvlan_main.c | 11 +++-- drivers/net/ipvlan/ipvtap.c | 7 ++- drivers/net/macsec.c | 11 +++-- drivers/net/macvlan.c | 8 ++-- drivers/net/macvtap.c | 8 ++-- drivers/net/netkit.c | 9 ++-- drivers/net/pfcp.c | 8 ++-- drivers/net/ppp/ppp_generic.c | 10 +++-- drivers/net/team/team_core.c | 7 +-- drivers/net/veth.c | 9 ++-- drivers/net/vrf.c | 7 +-- drivers/net/vxlan/vxlan_core.c | 11 +++-- drivers/net/wireguard/device.c | 8 ++-- drivers/net/wireless/virtual/virt_wifi.c | 10 +++-- drivers/net/wwan/wwan_core.c | 15 +++++-- include/net/ip_tunnels.h | 5 ++- include/net/rtnetlink.h | 44 ++++++++++++++++--- net/8021q/vlan_netlink.c | 11 +++-- net/batman-adv/soft-interface.c | 12 ++--- net/bridge/br_netlink.c | 8 ++-- net/caif/chnl_net.c | 6 +-- net/core/rtnetlink.c | 25 +++++------ net/hsr/hsr_netlink.c | 14 +++--- net/ieee802154/6lowpan/core.c | 9 ++-- net/ipv4/ip_gre.c | 27 ++++++++---- net/ipv4/ip_tunnel.c | 10 +++-- net/ipv4/ip_vti.c | 10 +++-- net/ipv4/ipip.c | 10 +++-- net/ipv6/ip6_gre.c | 28 +++++++----- net/ipv6/ip6_tunnel.c | 16 +++---- net/ipv6/ip6_vti.c | 15 +++---- net/ipv6/sit.c | 16 +++---- net/xfrm/xfrm_interface_core.c | 14 +++--- 43 files changed, 305 insertions(+), 205 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c index 9ad8d9856275..da587af85d4f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c @@ -97,10 +97,13 @@ static int ipoib_changelink(struct net_device *dev, struct nlattr *tb[], return ret; } -static int 
ipoib_new_child_link(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipoib_new_child_link(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *link_net = rtnl_newlink_link_net(params); struct net_device *pdev; struct ipoib_dev_priv *ppriv; u16 child_pkey; @@ -109,7 +112,7 @@ static int ipoib_new_child_link(struct net *src_net, struct net_device *dev, if (!tb[IFLA_LINK]) return -EINVAL; - pdev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + pdev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!pdev || pdev->type != ARPHRD_INFINIBAND) return -ENODEV; diff --git a/drivers/net/amt.c b/drivers/net/amt.c index 98c6205ed19f..2f7bf50e05d2 100644 --- a/drivers/net/amt.c +++ b/drivers/net/amt.c @@ -3161,14 +3161,17 @@ static int amt_validate(struct nlattr *tb[], struct nlattr *data[], return 0; } -static int amt_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int amt_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *link_net = rtnl_newlink_link_net(params); struct amt_dev *amt = netdev_priv(dev); int err = -EINVAL; - amt->net = net; + amt->net = link_net; amt->mode = nla_get_u32(data[IFLA_AMT_MODE]); if (data[IFLA_AMT_MAX_TUNNELS] && @@ -3183,7 +3186,7 @@ static int amt_newlink(struct net *net, struct net_device *dev, amt->hash_buckets = AMT_HSIZE; amt->nr_tunnels = 0; get_random_bytes(&amt->hash_seed, sizeof(amt->hash_seed)); - amt->stream_dev = dev_get_by_index(net, + amt->stream_dev = dev_get_by_index(link_net, 
nla_get_u32(data[IFLA_AMT_LINK])); if (!amt->stream_dev) { NL_SET_ERR_MSG_ATTR(extack, tb[IFLA_AMT_LINK], diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index 70814303aab8..91e1c02ada72 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -698,10 +698,13 @@ static void bareudp_dellink(struct net_device *dev, struct list_head *head) unregister_netdevice_queue(dev, head); } -static int bareudp_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int bareudp_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *link_net = rtnl_newlink_link_net(params); struct bareudp_conf conf; int err; @@ -709,7 +712,7 @@ static int bareudp_newlink(struct net *net, struct net_device *dev, if (err) return err; - err = bareudp_configure(net, dev, &conf, extack); + err = bareudp_configure(link_net, dev, &conf, extack); if (err) return err; diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c index 2a6a424806aa..db3062c6dbe0 100644 --- a/drivers/net/bonding/bond_netlink.c +++ b/drivers/net/bonding/bond_netlink.c @@ -564,10 +564,12 @@ static int bond_changelink(struct net_device *bond_dev, struct nlattr *tb[], return 0; } -static int bond_newlink(struct net *src_net, struct net_device *bond_dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int bond_newlink(struct rtnl_newlink_params *params) { + struct net_device *bond_dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; int err; err = bond_changelink(bond_dev, tb, data, extack); diff --git a/drivers/net/can/dev/netlink.c b/drivers/net/can/dev/netlink.c index 01aacdcda260..52dae0e94858 100644 --- 
a/drivers/net/can/dev/netlink.c +++ b/drivers/net/can/dev/netlink.c @@ -624,9 +624,7 @@ static int can_fill_xstats(struct sk_buff *skb, const struct net_device *dev) return -EMSGSIZE; } -static int can_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int can_newlink(struct rtnl_newlink_params *params) { return -EOPNOTSUPP; } diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c index ca8811941085..65ae07116c91 100644 --- a/drivers/net/can/vxcan.c +++ b/drivers/net/can/vxcan.c @@ -172,10 +172,13 @@ static void vxcan_setup(struct net_device *dev) /* forward declaration for rtnl_create_link() */ static struct rtnl_link_ops vxcan_link_ops; -static int vxcan_newlink(struct net *peer_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vxcan_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *peer_net = rtnl_newlink_peer_net(params); struct vxcan_priv *priv; struct net_device *peer; diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index f3bea196a8f9..d45555d784e6 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -117,10 +117,13 @@ static void rmnet_unregister_bridge(struct rmnet_port *port) rmnet_unregister_real_device(bridge_dev); } -static int rmnet_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int rmnet_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = 
params->extack; + struct net *link_net = rtnl_newlink_link_net(params); u32 data_format = RMNET_FLAGS_INGRESS_DEAGGREGATION; struct net_device *real_dev; int mode = RMNET_EPMODE_VND; @@ -134,7 +137,7 @@ static int rmnet_newlink(struct net *src_net, struct net_device *dev, return -EINVAL; } - real_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + real_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) { NL_SET_ERR_MSG_MOD(extack, "link does not exist"); return -ENODEV; diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index 642155cb8315..77978617f509 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -1614,10 +1614,13 @@ static void geneve_link_config(struct net_device *dev, geneve_change_mtu(dev, ldev_mtu - info->options_len); } -static int geneve_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int geneve_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *link_net = rtnl_newlink_link_net(params); struct geneve_config cfg = { .df = GENEVE_DF_UNSET, .use_udp6_rx_checksums = false, @@ -1631,7 +1634,7 @@ static int geneve_newlink(struct net *net, struct net_device *dev, if (err) return err; - err = geneve_configure(net, dev, extack, &cfg); + err = geneve_configure(link_net, dev, extack, &cfg); if (err) return err; diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index 89a996ad8cd0..3eb1bc3ac124 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -1460,10 +1460,11 @@ static int gtp_create_sockets(struct gtp_dev *gtp, const struct nlattr *nla, #define GTP_TH_MAXLEN (sizeof(struct udphdr) + sizeof(struct gtp0_header)) #define GTP_IPV6_MAXLEN (sizeof(struct ipv6hdr) + GTP_TH_MAXLEN) -static int gtp_newlink(struct net *src_net, struct net_device 
*dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int gtp_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *link_net = rtnl_newlink_link_net(params); unsigned int role = GTP_ROLE_GGSN; struct gtp_dev *gtp; struct gtp_net *gn; @@ -1494,7 +1495,7 @@ static int gtp_newlink(struct net *src_net, struct net_device *dev, gtp->restart_count = nla_get_u8_default(data[IFLA_GTP_RESTART_COUNT], 0); - gtp->net = src_net; + gtp->net = link_net; err = gtp_hashtable_new(gtp, hashsize); if (err < 0) diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h index 025e0c19ec25..beff25a1d6f0 100644 --- a/drivers/net/ipvlan/ipvlan.h +++ b/drivers/net/ipvlan/ipvlan.h @@ -166,9 +166,7 @@ struct ipvl_addr *ipvlan_addr_lookup(struct ipvl_port *port, void *lyr3h, void *ipvlan_get_L3_hdr(struct ipvl_port *port, struct sk_buff *skb, int *type); void ipvlan_count_rx(const struct ipvl_dev *ipvlan, unsigned int len, bool success, bool mcast); -int ipvlan_link_new(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack); +int ipvlan_link_new(struct rtnl_newlink_params *params); void ipvlan_link_delete(struct net_device *dev, struct list_head *head); void ipvlan_link_setup(struct net_device *dev); int ipvlan_link_register(struct rtnl_link_ops *ops); diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index ee2c3cf4df36..53860e9d08b1 100644 --- a/drivers/net/ipvlan/ipvlan_main.c +++ b/drivers/net/ipvlan/ipvlan_main.c @@ -532,10 +532,13 @@ static int ipvlan_nl_fillinfo(struct sk_buff *skb, return ret; } -int ipvlan_link_new(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +int ipvlan_link_new(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr 
**tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *link_net = rtnl_newlink_link_net(params); struct ipvl_dev *ipvlan = netdev_priv(dev); struct ipvl_port *port; struct net_device *phy_dev; @@ -545,7 +548,7 @@ int ipvlan_link_new(struct net *src_net, struct net_device *dev, if (!tb[IFLA_LINK]) return -EINVAL; - phy_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + phy_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!phy_dev) return -ENODEV; diff --git a/drivers/net/ipvlan/ipvtap.c b/drivers/net/ipvlan/ipvtap.c index 1afc4c47be73..69e7456a48ca 100644 --- a/drivers/net/ipvlan/ipvtap.c +++ b/drivers/net/ipvlan/ipvtap.c @@ -73,10 +73,9 @@ static void ipvtap_update_features(struct tap_dev *tap, netdev_update_features(vlan->dev); } -static int ipvtap_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipvtap_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; struct ipvtap_dev *vlantap = netdev_priv(dev); int err; @@ -97,7 +96,7 @@ static int ipvtap_newlink(struct net *src_net, struct net_device *dev, /* Don't put anything that may fail after macvlan_common_newlink * because we can't undo what it does. 
*/ - err = ipvlan_link_new(src_net, dev, tb, data, extack); + err = ipvlan_link_new(params); if (err) { netdev_rx_handler_unregister(dev); return err; diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 1bc1e5993f56..e8b147fe4fce 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -4141,10 +4141,13 @@ static int macsec_add_dev(struct net_device *dev, sci_t sci, u8 icv_len) static struct lock_class_key macsec_netdev_addr_lock_key; -static int macsec_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int macsec_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *link_net = rtnl_newlink_link_net(params); struct macsec_dev *macsec = macsec_priv(dev); rx_handler_func_t *rx_handler; u8 icv_len = MACSEC_DEFAULT_ICV_LEN; @@ -4154,7 +4157,7 @@ static int macsec_newlink(struct net *net, struct net_device *dev, if (!tb[IFLA_LINK]) return -EINVAL; - real_dev = __dev_get_by_index(net, nla_get_u32(tb[IFLA_LINK])); + real_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) return -ENODEV; if (real_dev->type != ARPHRD_ETHER) diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index fed4fe2a4748..7050a061b2b9 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -1565,11 +1565,11 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev, } EXPORT_SYMBOL_GPL(macvlan_common_newlink); -static int macvlan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int macvlan_newlink(struct rtnl_newlink_params *params) { - return macvlan_common_newlink(src_net, dev, tb, data, extack); + return macvlan_common_newlink(rtnl_newlink_link_net(params), + params->dev, 
params->tb, params->data, + params->extack); } void macvlan_dellink(struct net_device *dev, struct list_head *head) diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c index 29a5929d48e5..213a16719c5a 100644 --- a/drivers/net/macvtap.c +++ b/drivers/net/macvtap.c @@ -77,10 +77,9 @@ static void macvtap_update_features(struct tap_dev *tap, netdev_update_features(vlan->dev); } -static int macvtap_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int macvtap_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; struct macvtap_dev *vlantap = netdev_priv(dev); int err; @@ -105,7 +104,8 @@ static int macvtap_newlink(struct net *src_net, struct net_device *dev, /* Don't put anything that may fail after macvlan_common_newlink * because we can't undo what it does. */ - err = macvlan_common_newlink(src_net, dev, tb, data, extack); + err = macvlan_common_newlink(rtnl_newlink_link_net(params), dev, + params->tb, params->data, params->extack); if (err) { netdev_rx_handler_unregister(dev); return err; diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c index c1d881dc6409..607d3b141f8c 100644 --- a/drivers/net/netkit.c +++ b/drivers/net/netkit.c @@ -327,10 +327,13 @@ static int netkit_validate(struct nlattr *tb[], struct nlattr *data[], static struct rtnl_link_ops netkit_link_ops; -static int netkit_new_link(struct net *peer_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int netkit_new_link(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *peer_net = rtnl_newlink_peer_net(params); struct nlattr *peer_tb[IFLA_MAX + 1], **tbp = tb, *attr; enum netkit_action policy_prim = NETKIT_PASS; enum netkit_action policy_peer = 
NETKIT_PASS; diff --git a/drivers/net/pfcp.c b/drivers/net/pfcp.c index 69434fd13f96..8576d5117233 100644 --- a/drivers/net/pfcp.c +++ b/drivers/net/pfcp.c @@ -184,15 +184,15 @@ static int pfcp_add_sock(struct pfcp_dev *pfcp) return PTR_ERR_OR_ZERO(pfcp->sock); } -static int pfcp_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int pfcp_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct net *link_net = rtnl_newlink_link_net(params); struct pfcp_dev *pfcp = netdev_priv(dev); struct pfcp_net *pn; int err; - pfcp->net = net; + pfcp->net = link_net; err = pfcp_add_sock(pfcp); if (err) { diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index 4583e15ad03a..a0ace8aa5b5d 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -1303,10 +1303,12 @@ static int ppp_nl_validate(struct nlattr *tb[], struct nlattr *data[], return 0; } -static int ppp_nl_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ppp_nl_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct net *link_net = rtnl_newlink_link_net(params); struct ppp_config conf = { .unit = -1, .ifname_is_set = true, @@ -1343,7 +1345,7 @@ static int ppp_nl_newlink(struct net *src_net, struct net_device *dev, if (!tb[IFLA_IFNAME] || !nla_len(tb[IFLA_IFNAME]) || !*(char *)nla_data(tb[IFLA_IFNAME])) conf.ifname_is_set = false; - err = ppp_dev_configure(src_net, dev, &conf); + err = ppp_dev_configure(link_net, dev, &conf); out_unlock: mutex_unlock(&ppp_mutex); diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c index a1b27b69f010..c9ee70030517 100644 --- a/drivers/net/team/team_core.c +++ b/drivers/net/team/team_core.c @@ 
-2206,10 +2206,11 @@ static void team_setup(struct net_device *dev) dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX; } -static int team_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int team_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + if (tb[IFLA_ADDRESS] == NULL) eth_hw_addr_random(dev); diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 07ebb800edf1..e9818f6b666b 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -1765,10 +1765,13 @@ static int veth_init_queues(struct net_device *dev, struct nlattr *tb[]) return 0; } -static int veth_newlink(struct net *peer_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int veth_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *peer_net = rtnl_newlink_peer_net(params); int err; struct net_device *peer; struct veth_priv *priv; diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c index ca81b212a246..ed1d47a473e2 100644 --- a/drivers/net/vrf.c +++ b/drivers/net/vrf.c @@ -1677,10 +1677,11 @@ static void vrf_dellink(struct net_device *dev, struct list_head *head) unregister_netdevice_queue(dev, head); } -static int vrf_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vrf_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; struct net_vrf *vrf = netdev_priv(dev); struct netns_vrf *nn_vrf; bool *add_fib_rules; diff --git a/drivers/net/vxlan/vxlan_core.c 
b/drivers/net/vxlan/vxlan_core.c index b46a799bd390..2e27b19a557e 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -4351,10 +4351,13 @@ static int vxlan_nl2conf(struct nlattr *tb[], struct nlattr *data[], return 0; } -static int vxlan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vxlan_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *link_net = rtnl_newlink_link_net(params); struct vxlan_config conf; int err; @@ -4362,7 +4365,7 @@ static int vxlan_newlink(struct net *src_net, struct net_device *dev, if (err) return err; - return __vxlan_dev_create(src_net, dev, &conf, extack); + return __vxlan_dev_create(link_net, dev, &conf, extack); } static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[], diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c index 6cf173a008e7..a27f844cfb7f 100644 --- a/drivers/net/wireguard/device.c +++ b/drivers/net/wireguard/device.c @@ -307,14 +307,14 @@ static void wg_setup(struct net_device *dev) wg->dev = dev; } -static int wg_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int wg_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct net *link_net = rtnl_newlink_link_net(params); struct wg_device *wg = netdev_priv(dev); int ret = -ENOMEM; - rcu_assign_pointer(wg->creating_net, src_net); + rcu_assign_pointer(wg->creating_net, link_net); init_rwsem(&wg->static_identity.lock); mutex_init(&wg->socket_update_lock); mutex_init(&wg->device_update_lock); diff --git a/drivers/net/wireless/virtual/virt_wifi.c b/drivers/net/wireless/virtual/virt_wifi.c index 
4ee374080466..107dc503b4f2 100644 --- a/drivers/net/wireless/virtual/virt_wifi.c +++ b/drivers/net/wireless/virtual/virt_wifi.c @@ -519,10 +519,12 @@ static rx_handler_result_t virt_wifi_rx_handler(struct sk_buff **pskb) } /* Called with rtnl lock held. */ -static int virt_wifi_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int virt_wifi_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct netlink_ext_ack *extack = params->extack; + struct net *link_net = rtnl_newlink_link_net(params); struct virt_wifi_netdev_priv *priv = netdev_priv(dev); int err; @@ -532,7 +534,7 @@ static int virt_wifi_newlink(struct net *src_net, struct net_device *dev, netif_carrier_off(dev); priv->upperdev = dev; - priv->lowerdev = __dev_get_by_index(src_net, + priv->lowerdev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!priv->lowerdev) diff --git a/drivers/net/wwan/wwan_core.c b/drivers/net/wwan/wwan_core.c index a51e2755991a..450cf2e253e4 100644 --- a/drivers/net/wwan/wwan_core.c +++ b/drivers/net/wwan/wwan_core.c @@ -967,10 +967,11 @@ static struct net_device *wwan_rtnl_alloc(struct nlattr *tb[], return dev; } -static int wwan_rtnl_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int wwan_rtnl_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; struct wwan_device *wwandev = wwan_dev_get_by_parent(dev->dev.parent); u32 link_id = nla_get_u32(data[IFLA_WWAN_LINK_ID]); struct wwan_netdev_priv *priv = netdev_priv(dev); @@ -1064,6 +1065,11 @@ static void wwan_create_default_link(struct wwan_device *wwandev, struct net_device *dev; struct nlmsghdr *nlh; struct sk_buff *msg; + struct 
rtnl_newlink_params params = { + .src_net = &init_net, + .tb = tb, + .data = data, + }; /* Forge attributes required to create a WWAN netdev. We first * build a netlink message and then parse it. This looks @@ -1105,7 +1111,8 @@ static void wwan_create_default_link(struct wwan_device *wwandev, if (WARN_ON(IS_ERR(dev))) goto unlock; - if (WARN_ON(wwan_rtnl_newlink(&init_net, dev, tb, data, NULL))) { + params.dev = dev; + if (WARN_ON(wwan_rtnl_newlink(&params))) { free_netdev(dev); goto unlock; } diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index 1aa31bdb2b31..ae1f2dda4533 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -406,8 +406,9 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb, bool log_ecn_error); int ip_tunnel_changelink(struct net_device *dev, struct nlattr *tb[], struct ip_tunnel_parm_kern *p, __u32 fwmark); -int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[], - struct ip_tunnel_parm_kern *p, __u32 fwmark); +int ip_tunnel_newlink(struct net *net, struct net_device *dev, + struct nlattr *tb[], struct ip_tunnel_parm_kern *p, + __u32 fwmark); void ip_tunnel_setup(struct net_device *dev, unsigned int net_id); bool ip_tunnel_netlink_encap_parms(struct nlattr *data[], diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h index bc0069a8b6ea..04fc0e91af42 100644 --- a/include/net/rtnetlink.h +++ b/include/net/rtnetlink.h @@ -69,6 +69,44 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh) return AF_UNSPEC; } +/** + * struct rtnl_newlink_params - parameters of rtnl_link_ops::newlink() + * + * @src_net: Source netns of rtnetlink socket + * @link_net: Link netns by IFLA_LINK_NETNSID, NULL if not specified + * @peer_net: Peer netns + * @dev: The net_device being created + * @tb: IFLA_* attributes + * @data: IFLA_INFO_DATA attributes + * @extack: Netlink extended ACK + */ +struct rtnl_newlink_params { + struct net *src_net; + struct net *link_net; + struct net *peer_net; +
struct net_device *dev; + struct nlattr **tb; + struct nlattr **data; + struct netlink_ext_ack *extack; +}; + +/* Get effective link netns from newlink params. Generally, this is link_net + * and falls back to src_net. But for compatibility, a driver may choose to + * use dev_net(dev) instead. + */ +static inline struct net *rtnl_newlink_link_net(struct rtnl_newlink_params *p) +{ + return p->link_net ? : p->src_net; +} + +/* Get peer netns from newlink params. Fallback to link netns if peer netns is + * not specified explicitly. + */ +static inline struct net *rtnl_newlink_peer_net(struct rtnl_newlink_params *p) +{ + return p->peer_net ? : rtnl_newlink_link_net(p); +} + /** * struct rtnl_link_ops - rtnetlink link operations * @@ -125,11 +163,7 @@ struct rtnl_link_ops { struct nlattr *data[], struct netlink_ext_ack *extack); - int (*newlink)(struct net *src_net, - struct net_device *dev, - struct nlattr *tb[], - struct nlattr *data[], - struct netlink_ext_ack *extack); + int (*newlink)(struct rtnl_newlink_params *params); int (*changelink)(struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c index 134419667d59..603b2ea1e2b4 100644 --- a/net/8021q/vlan_netlink.c +++ b/net/8021q/vlan_netlink.c @@ -135,10 +135,13 @@ static int vlan_changelink(struct net_device *dev, struct nlattr *tb[], return 0; } -static int vlan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vlan_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *link_net = rtnl_newlink_link_net(params); struct vlan_dev_priv *vlan = vlan_dev_priv(dev); struct net_device *real_dev; unsigned int max_mtu; @@ -155,7 +158,7 @@ static int vlan_newlink(struct net *src_net,
struct net_device *dev, return -EINVAL; } - real_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + real_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) { NL_SET_ERR_MSG_MOD(extack, "link does not exist"); return -ENODEV; diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c index 2758aba47a2f..c411b8857095 100644 --- a/net/batman-adv/soft-interface.c +++ b/net/batman-adv/soft-interface.c @@ -1063,18 +1063,14 @@ static int batadv_softif_validate(struct nlattr *tb[], struct nlattr *data[], /** * batadv_softif_newlink() - pre-initialize and register new batadv link - * @src_net: the applicable net namespace - * @dev: network device to register - * @tb: IFLA_INFO_DATA netlink attributes - * @data: enum batadv_ifla_attrs attributes - * @extack: extended ACK report struct + * @params: rtnl newlink parameters * * Return: 0 if successful or error otherwise. */ -static int batadv_softif_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int batadv_softif_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; struct batadv_priv *bat_priv = netdev_priv(dev); const char *algo_name; int err; diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 3e0f47203f2a..ccce5119b28d 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -1553,10 +1553,12 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[], return 0; } -static int br_dev_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int br_dev_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; struct net_bridge 
*br = netdev_priv(dev); int err; diff --git a/net/caif/chnl_net.c b/net/caif/chnl_net.c index 94ad09e36df2..748e38908709 100644 --- a/net/caif/chnl_net.c +++ b/net/caif/chnl_net.c @@ -438,10 +438,10 @@ static void caif_netlink_parms(struct nlattr *data[], } } -static int ipcaif_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipcaif_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; int ret; struct chnl_net *caifdev; ASSERT_RTNL(); diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 7855f81c917b..3ea63722d0fd 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3757,6 +3757,14 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, struct net_device *dev; char ifname[IFNAMSIZ]; int err; + struct rtnl_newlink_params params = { + .src_net = net, + .link_net = link_net, + .peer_net = peer_net, + .tb = tb, + .data = data, + .extack = extack, + }; if (!ops->alloc && !ops->setup) return -EOPNOTSUPP; @@ -3768,22 +3776,18 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, name_assign_type = NET_NAME_ENUM; } - dev = rtnl_create_link(link_net ? 
: tgt_net, ifname, - name_assign_type, ops, tb, extack); + dev = rtnl_create_link(tgt_net, ifname, name_assign_type, ops, tb, + extack); if (IS_ERR(dev)) { err = PTR_ERR(dev); goto out; } dev->ifindex = ifm->ifi_index; - - if (link_net) - net = link_net; - if (peer_net) - net = peer_net; + params.dev = dev; if (ops->newlink) - err = ops->newlink(net, dev, tb, data, extack); + err = ops->newlink(&params); else err = register_netdevice(dev); if (err < 0) { @@ -3794,11 +3798,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, err = rtnl_configure_link(dev, ifm, portid, nlh); if (err < 0) goto out_unregister; - if (link_net) { - err = dev_change_net_namespace(dev, tgt_net, ifname); - if (err < 0) - goto out_unregister; - } if (tb[IFLA_MASTER]) { err = do_set_master(dev, nla_get_u32(tb[IFLA_MASTER]), extack); if (err) diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c index b68f2f71d0e1..694392222637 100644 --- a/net/hsr/hsr_netlink.c +++ b/net/hsr/hsr_netlink.c @@ -29,10 +29,12 @@ static const struct nla_policy hsr_policy[IFLA_HSR_MAX + 1] = { /* Here, it seems a netdevice has already been allocated for us, and the * hsr_dev_setup routine has been executed. Nice!
*/ -static int hsr_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int hsr_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *link_net = rtnl_newlink_link_net(params); enum hsr_version proto_version; unsigned char multicast_spec; u8 proto = HSR_PROTOCOL_HSR; @@ -46,7 +48,7 @@ static int hsr_newlink(struct net *src_net, struct net_device *dev, NL_SET_ERR_MSG_MOD(extack, "Slave1 device not specified"); return -EINVAL; } - link[0] = __dev_get_by_index(src_net, + link[0] = __dev_get_by_index(link_net, nla_get_u32(data[IFLA_HSR_SLAVE1])); if (!link[0]) { NL_SET_ERR_MSG_MOD(extack, "Slave1 does not exist"); @@ -56,7 +58,7 @@ static int hsr_newlink(struct net *src_net, struct net_device *dev, NL_SET_ERR_MSG_MOD(extack, "Slave2 device not specified"); return -EINVAL; } - link[1] = __dev_get_by_index(src_net, + link[1] = __dev_get_by_index(link_net, nla_get_u32(data[IFLA_HSR_SLAVE2])); if (!link[1]) { NL_SET_ERR_MSG_MOD(extack, "Slave2 does not exist"); @@ -69,7 +71,7 @@ static int hsr_newlink(struct net *src_net, struct net_device *dev, } if (data[IFLA_HSR_INTERLINK]) - interlink = __dev_get_by_index(src_net, + interlink = __dev_get_by_index(link_net, nla_get_u32(data[IFLA_HSR_INTERLINK])); if (interlink && interlink == link[0]) { diff --git a/net/ieee802154/6lowpan/core.c b/net/ieee802154/6lowpan/core.c index 175efd860f7b..65a5c61cf38c 100644 --- a/net/ieee802154/6lowpan/core.c +++ b/net/ieee802154/6lowpan/core.c @@ -129,10 +129,10 @@ static int lowpan_validate(struct nlattr *tb[], struct nlattr *data[], return 0; } -static int lowpan_newlink(struct net *src_net, struct net_device *ldev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int lowpan_newlink(struct rtnl_newlink_params *params) { + struct 
net_device *ldev = params->dev; + struct nlattr **tb = params->tb; struct net_device *wdev; int ret; @@ -143,7 +143,8 @@ static int lowpan_newlink(struct net *src_net, struct net_device *ldev, if (!tb[IFLA_LINK]) return -EINVAL; /* find and hold wpan device */ - wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK])); + wdev = dev_get_by_index(params->link_net ? : dev_net(ldev), + nla_get_u32(tb[IFLA_LINK])); if (!wdev) return -ENODEV; if (wdev->type != ARPHRD_IEEE802154) { diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index f1f31ebfc793..4a3f8e450ef5 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -1389,10 +1389,12 @@ ipgre_newlink_encap_setup(struct net_device *dev, struct nlattr *data[]) return 0; } -static int ipgre_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipgre_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct net *net = params->link_net ? : dev_net(dev); struct ip_tunnel_parm_kern p; __u32 fwmark = 0; int err; @@ -1404,13 +1406,15 @@ static int ipgre_newlink(struct net *src_net, struct net_device *dev, err = ipgre_netlink_parms(dev, data, tb, &p, &fwmark); if (err < 0) return err; - return ip_tunnel_newlink(dev, tb, &p, fwmark); + return ip_tunnel_newlink(net, dev, tb, &p, fwmark); } -static int erspan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int erspan_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct net *net = params->link_net ? 
: dev_net(dev); struct ip_tunnel_parm_kern p; __u32 fwmark = 0; int err; @@ -1422,7 +1426,7 @@ static int erspan_newlink(struct net *src_net, struct net_device *dev, err = erspan_netlink_parms(dev, data, tb, &p, &fwmark); if (err) return err; - return ip_tunnel_newlink(dev, tb, &p, fwmark); + return ip_tunnel_newlink(net, dev, tb, &p, fwmark); } static int ipgre_changelink(struct net_device *dev, struct nlattr *tb[], @@ -1695,6 +1699,10 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name, LIST_HEAD(list_kill); struct ip_tunnel *t; int err; + struct rtnl_newlink_params params = { + .src_net = net, + .tb = tb, + }; memset(&tb, 0, sizeof(tb)); @@ -1707,7 +1715,8 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name, t = netdev_priv(dev); t->collect_md = true; - err = ipgre_newlink(net, dev, tb, NULL, NULL); + params.dev = dev; + err = ipgre_newlink(&params); if (err < 0) { free_netdev(dev); return ERR_PTR(err); diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 09b73acf037a..618a50d5c0c2 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -1213,11 +1213,11 @@ void ip_tunnel_delete_nets(struct list_head *net_list, unsigned int id, } EXPORT_SYMBOL_GPL(ip_tunnel_delete_nets); -int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[], - struct ip_tunnel_parm_kern *p, __u32 fwmark) +int ip_tunnel_newlink(struct net *net, struct net_device *dev, + struct nlattr *tb[], struct ip_tunnel_parm_kern *p, + __u32 fwmark) { struct ip_tunnel *nt; - struct net *net = dev_net(dev); struct ip_tunnel_net *itn; int mtu; int err; @@ -1326,7 +1326,9 @@ int ip_tunnel_init(struct net_device *dev) } tunnel->dev = dev; - tunnel->net = dev_net(dev); + if (!tunnel->net) + tunnel->net = dev_net(dev); + strscpy(tunnel->parms.name, dev->name); iph->version = 4; iph->ihl = 5; diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c index f0b4419cef34..b567e2375302 100644 --- a/net/ipv4/ip_vti.c +++ b/net/ipv4/ip_vti.c @@ -575,15
+575,17 @@ static void vti_netlink_parms(struct nlattr *data[], *fwmark = nla_get_u32(data[IFLA_VTI_FWMARK]); } -static int vti_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vti_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; struct ip_tunnel_parm_kern parms; __u32 fwmark = 0; vti_netlink_parms(data, &parms, &fwmark); - return ip_tunnel_newlink(dev, tb, &parms, fwmark); + return ip_tunnel_newlink(params->link_net ? : dev_net(dev), dev, tb, + &parms, fwmark); } static int vti_changelink(struct net_device *dev, struct nlattr *tb[], diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c index dc0db5895e0e..9dccaa0d6ba7 100644 --- a/net/ipv4/ipip.c +++ b/net/ipv4/ipip.c @@ -436,10 +436,11 @@ static void ipip_netlink_parms(struct nlattr *data[], *fwmark = nla_get_u32(data[IFLA_IPTUN_FWMARK]); } -static int ipip_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipip_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; struct ip_tunnel *t = netdev_priv(dev); struct ip_tunnel_encap ipencap; struct ip_tunnel_parm_kern p; @@ -453,7 +454,8 @@ static int ipip_newlink(struct net *src_net, struct net_device *dev, } ipip_netlink_parms(data, &p, &t->collect_md, &fwmark); - return ip_tunnel_newlink(dev, tb, &p, fwmark); + return ip_tunnel_newlink(params->link_net ? 
: dev_net(dev), dev, tb, &p, + fwmark); } static int ipip_changelink(struct net_device *dev, struct nlattr *tb[], diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 235808cfec70..7d6d3db200a1 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -1971,7 +1971,7 @@ static bool ip6gre_netlink_encap_parms(struct nlattr *data[], return ret; } -static int ip6gre_newlink_common(struct net *src_net, struct net_device *dev, +static int ip6gre_newlink_common(struct net *link_net, struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) { @@ -1992,7 +1992,7 @@ static int ip6gre_newlink_common(struct net *src_net, struct net_device *dev, eth_hw_addr_random(dev); nt->dev = dev; - nt->net = dev_net(dev); + nt->net = link_net; err = register_netdevice(dev); if (err) @@ -2005,12 +2005,14 @@ static int ip6gre_newlink_common(struct net *src_net, struct net_device *dev, return err; } -static int ip6gre_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ip6gre_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; struct ip6_tnl *nt = netdev_priv(dev); - struct net *net = dev_net(dev); + struct net *net = params->link_net ? 
: dev_net(dev); struct ip6gre_net *ign; int err; @@ -2025,7 +2027,7 @@ static int ip6gre_newlink(struct net *src_net, struct net_device *dev, return -EEXIST; } - err = ip6gre_newlink_common(src_net, dev, tb, data, extack); + err = ip6gre_newlink_common(net, dev, tb, data, extack); if (!err) { ip6gre_tnl_link_config(nt, !tb[IFLA_MTU]); ip6gre_tunnel_link_md(ign, nt); @@ -2241,12 +2243,14 @@ static void ip6erspan_tap_setup(struct net_device *dev) netif_keep_dst(dev); } -static int ip6erspan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ip6erspan_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; struct ip6_tnl *nt = netdev_priv(dev); - struct net *net = dev_net(dev); + struct net *net = params->link_net ? : dev_net(dev); struct ip6gre_net *ign; int err; @@ -2262,7 +2266,7 @@ static int ip6erspan_newlink(struct net *src_net, struct net_device *dev, return -EEXIST; } - err = ip6gre_newlink_common(src_net, dev, tb, data, extack); + err = ip6gre_newlink_common(net, dev, tb, data, extack); if (!err) { ip6erspan_tnl_link_config(nt, !tb[IFLA_MTU]); ip6erspan_tunnel_link_md(ign, nt); diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 48fd53b98972..33a58c3c9ebe 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -250,10 +250,9 @@ static void ip6_dev_free(struct net_device *dev) dst_cache_destroy(&t->dst_cache); } -static int ip6_tnl_create2(struct net_device *dev) +static int ip6_tnl_create2(struct net *net, struct net_device *dev) { struct ip6_tnl *t = netdev_priv(dev); - struct net *net = dev_net(dev); struct ip6_tnl_net *ip6n = net_generic(net, ip6_tnl_net_id); int err; @@ -308,7 +307,7 @@ static struct ip6_tnl *ip6_tnl_create(struct net *net, struct __ip6_tnl_parm *p) t = netdev_priv(dev); 
t->parms = *p; t->net = dev_net(dev); - err = ip6_tnl_create2(dev); + err = ip6_tnl_create2(net, dev); if (err < 0) goto failed_free; @@ -2002,11 +2001,12 @@ static void ip6_tnl_netlink_parms(struct nlattr *data[], parms->fwmark = nla_get_u32(data[IFLA_IPTUN_FWMARK]); } -static int ip6_tnl_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ip6_tnl_newlink(struct rtnl_newlink_params *params) { - struct net *net = dev_net(dev); + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct net *net = params->link_net ? : dev_net(dev); struct ip6_tnl_net *ip6n = net_generic(net, ip6_tnl_net_id); struct ip_tunnel_encap ipencap; struct ip6_tnl *nt, *t; @@ -2031,7 +2031,7 @@ static int ip6_tnl_newlink(struct net *src_net, struct net_device *dev, return -EEXIST; } - err = ip6_tnl_create2(dev); + err = ip6_tnl_create2(net, dev); if (!err && tb[IFLA_MTU]) ip6_tnl_change_mtu(dev, nla_get_u32(tb[IFLA_MTU])); diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c index 590737c27537..ff9dc74819c5 100644 --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -174,10 +174,9 @@ vti6_tnl_unlink(struct vti6_net *ip6n, struct ip6_tnl *t) } } -static int vti6_tnl_create2(struct net_device *dev) +static int vti6_tnl_create2(struct net *net, struct net_device *dev) { struct ip6_tnl *t = netdev_priv(dev); - struct net *net = dev_net(dev); struct vti6_net *ip6n = net_generic(net, vti6_net_id); int err; @@ -221,7 +220,7 @@ static struct ip6_tnl *vti6_tnl_create(struct net *net, struct __ip6_tnl_parm *p t->parms = *p; t->net = dev_net(dev); - err = vti6_tnl_create2(dev); + err = vti6_tnl_create2(net, dev); if (err < 0) goto failed_free; @@ -997,11 +996,11 @@ static void vti6_netlink_parms(struct nlattr *data[], parms->fwmark = nla_get_u32(data[IFLA_VTI_FWMARK]); } -static int vti6_newlink(struct net *src_net, struct net_device *dev, - struct 
nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vti6_newlink(struct rtnl_newlink_params *params) { - struct net *net = dev_net(dev); + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *net = params->link_net ? : dev_net(dev); struct ip6_tnl *nt; nt = netdev_priv(dev); @@ -1012,7 +1011,7 @@ static int vti6_newlink(struct net *src_net, struct net_device *dev, if (vti6_locate(net, &nt->parms, 0)) return -EEXIST; - return vti6_tnl_create2(dev); + return vti6_tnl_create2(net, dev); } static void vti6_dellink(struct net_device *dev, struct list_head *head) diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index 39bd8951bfca..cbcaccbfc3c9 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -198,10 +198,9 @@ static void ipip6_tunnel_clone_6rd(struct net_device *dev, struct sit_net *sitn) #endif } -static int ipip6_tunnel_create(struct net_device *dev) +static int ipip6_tunnel_create(struct net *net, struct net_device *dev) { struct ip_tunnel *t = netdev_priv(dev); - struct net *net = dev_net(dev); struct sit_net *sitn = net_generic(net, sit_net_id); int err; @@ -270,7 +269,7 @@ static struct ip_tunnel *ipip6_tunnel_locate(struct net *net, nt = netdev_priv(dev); nt->parms = *parms; - if (ipip6_tunnel_create(dev) < 0) + if (ipip6_tunnel_create(net, dev) < 0) goto failed_free; if (!parms->name[0]) @@ -1550,11 +1549,12 @@ static bool ipip6_netlink_6rd_parms(struct nlattr *data[], } #endif -static int ipip6_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipip6_newlink(struct rtnl_newlink_params *params) { - struct net *net = dev_net(dev); + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + struct nlattr **data = params->data; + struct net *net = params->link_net ? 
: dev_net(dev); struct ip_tunnel *nt; struct ip_tunnel_encap ipencap; #ifdef CONFIG_IPV6_SIT_6RD @@ -1575,7 +1575,7 @@ static int ipip6_newlink(struct net *src_net, struct net_device *dev, if (ipip6_tunnel_locate(net, &nt->parms, 0)) return -EEXIST; - err = ipip6_tunnel_create(dev); + err = ipip6_tunnel_create(net, dev); if (err < 0) return err; diff --git a/net/xfrm/xfrm_interface_core.c b/net/xfrm/xfrm_interface_core.c index 98f1e2b67c76..d1f2674a98c8 100644 --- a/net/xfrm/xfrm_interface_core.c +++ b/net/xfrm/xfrm_interface_core.c @@ -242,10 +242,9 @@ static void xfrmi_dev_free(struct net_device *dev) gro_cells_destroy(&xi->gro_cells); } -static int xfrmi_create(struct net_device *dev) +static int xfrmi_create(struct net *net, struct net_device *dev) { struct xfrm_if *xi = netdev_priv(dev); - struct net *net = dev_net(dev); struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id); int err; @@ -814,11 +813,12 @@ static void xfrmi_netlink_parms(struct nlattr *data[], parms->collect_md = true; } -static int xfrmi_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int xfrmi_newlink(struct rtnl_newlink_params *params) { - struct net *net = dev_net(dev); + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct netlink_ext_ack *extack = params->extack; + struct net *net = params->link_net ? 
: dev_net(dev); struct xfrm_if_parms p = {}; struct xfrm_if *xi; int err; @@ -851,7 +851,7 @@ static int xfrmi_newlink(struct net *src_net, struct net_device *dev, xi->net = net; xi->dev = dev; - err = xfrmi_create(dev); + err = xfrmi_create(net, dev); return err; } -- 2.47.1 From shaw.leon at gmail.com Mon Dec 9 14:06:08 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Mon, 09 Dec 2024 14:06:08 -0000 Subject: [PATCH net-next v5 4/5] selftests: net: Add python context manager for netns entering In-Reply-To: <20241209140151.231257-1-shaw.leon@gmail.com> References: <20241209140151.231257-1-shaw.leon@gmail.com> Message-ID: <20241209140151.231257-5-shaw.leon@gmail.com> Change netns of current thread and switch back on context exit. For example: with NetNSEnter("ns1"): ip("link add dummy0 type dummy") The command will be executed in netns "ns1". Signed-off-by: Xiao Liang --- tools/testing/selftests/net/lib/py/__init__.py | 2 +- tools/testing/selftests/net/lib/py/netns.py | 18 ++++++++++++++++++ 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/lib/py/__init__.py b/tools/testing/selftests/net/lib/py/__init__.py index 54d8f5eba810..e2d6c7b63019 100644 --- a/tools/testing/selftests/net/lib/py/__init__.py +++ b/tools/testing/selftests/net/lib/py/__init__.py @@ -2,7 +2,7 @@ from .consts import KSRC from .ksft import * -from .netns import NetNS +from .netns import NetNS, NetNSEnter from .nsim import * from .utils import * from .ynl import NlError, YnlFamily, EthtoolFamily, NetdevFamily, RtnlFamily diff --git a/tools/testing/selftests/net/lib/py/netns.py b/tools/testing/selftests/net/lib/py/netns.py index ecff85f9074f..8e9317044eef 100644 --- a/tools/testing/selftests/net/lib/py/netns.py +++ b/tools/testing/selftests/net/lib/py/netns.py @@ -1,9 +1,12 @@ # SPDX-License-Identifier: GPL-2.0 from .utils import ip +import ctypes import random import string +libc = ctypes.cdll.LoadLibrary('libc.so.6') + class NetNS: def __init__(self,
name=None): @@ -29,3 +32,18 @@ class NetNS: def __repr__(self): return f"NetNS({self.name})" + + +class NetNSEnter: + def __init__(self, ns_name): + self.ns_path = f"/run/netns/{ns_name}" + + def __enter__(self): + self.saved = open("/proc/thread-self/ns/net") + with open(self.ns_path) as ns_file: + libc.setns(ns_file.fileno(), 0) + return self + + def __exit__(self, exc_type, exc_value, traceback): + libc.setns(self.saved.fileno(), 0) + self.saved.close() -- 2.47.1 From shaw.leon at gmail.com Mon Dec 9 14:06:16 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Mon, 09 Dec 2024 14:06:16 -0000 Subject: [PATCH net-next v5 5/5] selftests: net: Add two test cases for link netns In-Reply-To: <20241209140151.231257-1-shaw.leon@gmail.com> References: <20241209140151.231257-1-shaw.leon@gmail.com> Message-ID: <20241209140151.231257-6-shaw.leon@gmail.com> - Add test for creating link in another netns when a link of the same name and ifindex exists in current netns. - Add test for link netns atomicity - create link directly in target netns, and no notifications should be generated in current netns. 
Signed-off-by: Xiao Liang --- tools/testing/selftests/net/Makefile | 1 + tools/testing/selftests/net/netns-name.sh | 10 ++++++ tools/testing/selftests/net/netns_atomic.py | 39 +++++++++++++++++++++ 3 files changed, 50 insertions(+) create mode 100755 tools/testing/selftests/net/netns_atomic.py diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index cb2fc601de66..f9f7a765d645 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -34,6 +34,7 @@ TEST_PROGS += gre_gso.sh TEST_PROGS += cmsg_so_mark.sh TEST_PROGS += cmsg_time.sh cmsg_ipv6.sh TEST_PROGS += netns-name.sh +TEST_PROGS += netns_atomic.py TEST_PROGS += nl_netdev.py TEST_PROGS += srv6_end_dt46_l3vpn_test.sh TEST_PROGS += srv6_end_dt4_l3vpn_test.sh diff --git a/tools/testing/selftests/net/netns-name.sh b/tools/testing/selftests/net/netns-name.sh index 6974474c26f3..0be1905d1f2f 100755 --- a/tools/testing/selftests/net/netns-name.sh +++ b/tools/testing/selftests/net/netns-name.sh @@ -78,6 +78,16 @@ ip -netns $NS link show dev $ALT_NAME 2> /dev/null && fail "Can still find alt-name after move" ip -netns $test_ns link del $DEV || fail +# +# Test no conflict of the same name/ifindex in different netns +# +ip -netns $NS link add name $DEV index 100 type dummy || fail +ip -netns $NS link add netns $test_ns name $DEV index 100 type dummy || + fail "Can create in netns without moving" +ip -netns $test_ns link show dev $DEV >> /dev/null || fail "Device not found" +ip -netns $NS link del $DEV || fail +ip -netns $test_ns link del $DEV || fail + echo -ne "$(basename $0) \t\t\t\t" if [ $RET_CODE -eq 0 ]; then echo "[ OK ]" diff --git a/tools/testing/selftests/net/netns_atomic.py b/tools/testing/selftests/net/netns_atomic.py new file mode 100755 index 000000000000..d350a3fc0a91 --- /dev/null +++ b/tools/testing/selftests/net/netns_atomic.py @@ -0,0 +1,39 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +import time + +from lib.py 
import ksft_run, ksft_exit, ksft_true +from lib.py import ip +from lib.py import NetNS, NetNSEnter +from lib.py import RtnlFamily + + +def test_event(ns1, ns2) -> None: + with NetNSEnter(str(ns1)): + rtnl = RtnlFamily() + + rtnl.ntf_subscribe("rtnlgrp-link") + + ip(f"netns set {ns1} 0", ns=str(ns2)) + + ip(f"link add netns {ns2} link-netnsid 0 dummy1 type dummy") + ip(f"link add netns {ns2} dummy2 type dummy", ns=str(ns1)) + + ip("link del dummy1", ns=str(ns2)) + ip("link del dummy2", ns=str(ns2)) + + time.sleep(1) + rtnl.check_ntf() + ksft_true(rtnl.async_msg_queue.empty(), + "Received unexpected link notification") + + +def main() -> None: + with NetNS() as ns1, NetNS() as ns2: + ksft_run([test_event], args=(ns1, ns2)) + ksft_exit() + + +if __name__ == "__main__": + main() -- 2.47.1 From pabeni at redhat.com Thu Dec 12 09:27:55 2024 From: pabeni at redhat.com (Paolo Abeni) Date: Thu, 12 Dec 2024 09:27:55 -0000 Subject: [PATCH net-next v5 3/5] rtnetlink: Decouple net namespaces in rtnl_newlink_create() In-Reply-To: <20241209140151.231257-4-shaw.leon@gmail.com> References: <20241209140151.231257-1-shaw.leon@gmail.com> <20241209140151.231257-4-shaw.leon@gmail.com> Message-ID: <2b89667d-ccd6-40b7-b355-1c71e159d14f@redhat.com> On 12/9/24 15:01, Xiao Liang wrote: > There are 4 net namespaces involved when creating links: > > - source netns - where the netlink socket resides, > - target netns - where to put the device being created, > - link netns - netns associated with the device (backend), > - peer netns - netns of peer device. > > Currently, two nets are passed to newlink() callback - "src_net" > parameter and "dev_net" (implicitly in net_device). They are set as > follows, depending on netlink attributes. 
>
> +------------+-------------------+---------+---------+
> | peer netns | IFLA_LINK_NETNSID | src_net | dev_net |
> +------------+-------------------+---------+---------+
> |            | absent            | source  | target  |
> | absent     +-------------------+---------+---------+
> |            | present           | link    | link    |
> +------------+-------------------+---------+---------+
> |            | absent            | peer    | target  |
> | present    +-------------------+---------+---------+
> |            | present           | peer    | link    |
> +------------+-------------------+---------+---------+
>
> When IFLA_LINK_NETNSID is present, the device is created in link netns
> first. This has some side effects, including extra ifindex allocation,
> ifname validation and link notifications. There's also an extra step to
> move the device to target netns. These could be avoided if we create it
> in target netns at the beginning.
>
> On the other hand, the meaning of src_net is ambiguous. It varies
> depending on how parameters are passed. It is the effective link or peer
> netns by design, but some drivers ignore it and use dev_net instead.
>
> This patch refactors netns handling by packing newlink() parameters into
> a struct, and passing source, link and peer netns as is through this
> struct. Fallback logic is implemented in helper functions -
> rtnl_newlink_link_net() and rtnl_newlink_peer_net(). If not set, peer
> netns falls back to link netns, and link netns falls back to source netns.
> rtnl_newlink_create() now creates devices in target netns directly,
> so dev_net is always target netns.
>
> For drivers that use dev_net as fallback of link_netns, current behavior
> is kept for compatibility.
>
> Signed-off-by: Xiao Liang

I must admit this patch is way too huge for me to do any reasonable
review, except to note that it has the potential of breaking a lot of
things.

I think it should be split to make it more palatable; i.e.
- a patch that just adds the params struct with no semantic changes.
- a patch making the dev_change_net_namespace() conditional on net != tge_net[1] - many per-device patches creating directly the device in the target namespace. - a patch reverting [1] Other may have different opinions, I'd love to hear them. > diff --git a/drivers/net/amt.c b/drivers/net/amt.c > index 98c6205ed19f..2f7bf50e05d2 100644 > --- a/drivers/net/amt.c > +++ b/drivers/net/amt.c > @@ -3161,14 +3161,17 @@ static int amt_validate(struct nlattr *tb[], struct nlattr *data[], > return 0; > } > > -static int amt_newlink(struct net *net, struct net_device *dev, > - struct nlattr *tb[], struct nlattr *data[], > - struct netlink_ext_ack *extack) > +static int amt_newlink(struct rtnl_newlink_params *params) > { > + struct net_device *dev = params->dev; > + struct nlattr **tb = params->tb; > + struct nlattr **data = params->data; > + struct netlink_ext_ack *extack = params->extack; > + struct net *link_net = rtnl_newlink_link_net(params); > struct amt_dev *amt = netdev_priv(dev); > int err = -EINVAL; Minor nit: here and and many other places, please respect the reverse xmas tree order. Thanks, Paolo From pabeni at redhat.com Thu Dec 12 09:41:00 2024 From: pabeni at redhat.com (Paolo Abeni) Date: Thu, 12 Dec 2024 09:41:00 -0000 Subject: [PATCH net-next v5 5/5] selftests: net: Add two test cases for link netns In-Reply-To: <20241209140151.231257-6-shaw.leon@gmail.com> References: <20241209140151.231257-1-shaw.leon@gmail.com> <20241209140151.231257-6-shaw.leon@gmail.com> Message-ID: <4a2fe99a-772d-4df1-a8ef-14338682b69e@redhat.com> On 12/9/24 15:01, Xiao Liang wrote: > - Add test for creating link in another netns when a link of the same > name and ifindex exists in current netns. > - Add test for link netns atomicity - create link directly in target > netns, and no notifications should be generated in current netns. 
> > Signed-off-by: Xiao Liang > --- > tools/testing/selftests/net/Makefile | 1 + > tools/testing/selftests/net/netns-name.sh | 10 ++++++ > tools/testing/selftests/net/netns_atomic.py | 39 +++++++++++++++++++++ > 3 files changed, 50 insertions(+) > create mode 100755 tools/testing/selftests/net/netns_atomic.py > > diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile > index cb2fc601de66..f9f7a765d645 100644 > --- a/tools/testing/selftests/net/Makefile > +++ b/tools/testing/selftests/net/Makefile > @@ -34,6 +34,7 @@ TEST_PROGS += gre_gso.sh > TEST_PROGS += cmsg_so_mark.sh > TEST_PROGS += cmsg_time.sh cmsg_ipv6.sh > TEST_PROGS += netns-name.sh > +TEST_PROGS += netns_atomic.py > TEST_PROGS += nl_netdev.py > TEST_PROGS += srv6_end_dt46_l3vpn_test.sh > TEST_PROGS += srv6_end_dt4_l3vpn_test.sh > diff --git a/tools/testing/selftests/net/netns-name.sh b/tools/testing/selftests/net/netns-name.sh > index 6974474c26f3..0be1905d1f2f 100755 > --- a/tools/testing/selftests/net/netns-name.sh > +++ b/tools/testing/selftests/net/netns-name.sh > @@ -78,6 +78,16 @@ ip -netns $NS link show dev $ALT_NAME 2> /dev/null && > fail "Can still find alt-name after move" > ip -netns $test_ns link del $DEV || fail > > +# > +# Test no conflict of the same name/ifindex in different netns > +# > +ip -netns $NS link add name $DEV index 100 type dummy || fail > +ip -netns $NS link add netns $test_ns name $DEV index 100 type dummy || > + fail "Can create in netns without moving" > +ip -netns $test_ns link show dev $DEV >> /dev/null || fail "Device not found" > +ip -netns $NS link del $DEV || fail > +ip -netns $test_ns link del $DEV || fail > + > echo -ne "$(basename $0) \t\t\t\t" > if [ $RET_CODE -eq 0 ]; then > echo "[ OK ]" > diff --git a/tools/testing/selftests/net/netns_atomic.py b/tools/testing/selftests/net/netns_atomic.py > new file mode 100755 > index 000000000000..d350a3fc0a91 > --- /dev/null > +++ b/tools/testing/selftests/net/netns_atomic.py > @@ -0,0 
+1,39 @@ > +#!/usr/bin/env python3 > +# SPDX-License-Identifier: GPL-2.0 > + > +import time > + > +from lib.py import ksft_run, ksft_exit, ksft_true > +from lib.py import ip > +from lib.py import NetNS, NetNSEnter > +from lib.py import RtnlFamily > + > + > +def test_event(ns1, ns2) -> None: > + with NetNSEnter(str(ns1)): > + rtnl = RtnlFamily() > + > + rtnl.ntf_subscribe("rtnlgrp-link") > + > + ip(f"netns set {ns1} 0", ns=str(ns2)) > + > + ip(f"link add netns {ns2} link-netnsid 0 dummy1 type dummy") > + ip(f"link add netns {ns2} dummy2 type dummy", ns=str(ns1)) > + > + ip("link del dummy1", ns=str(ns2)) > + ip("link del dummy2", ns=str(ns2)) > + > + time.sleep(1) > + rtnl.check_ntf() > + ksft_true(rtnl.async_msg_queue.empty(), > + "Received unexpected link notification") I think we need a much larger coverage here, possibly testing all the update drivers and more 'netns', 'link-netnsid', 'peer netns' permutations for the devices that allow them. Thanks, Paolo From shaw.leon at gmail.com Thu Dec 12 12:41:44 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Thu, 12 Dec 2024 12:41:44 -0000 Subject: [PATCH net-next v5 3/5] rtnetlink: Decouple net namespaces in rtnl_newlink_create() In-Reply-To: <2b89667d-ccd6-40b7-b355-1c71e159d14f@redhat.com> References: <20241209140151.231257-1-shaw.leon@gmail.com> <20241209140151.231257-4-shaw.leon@gmail.com> <2b89667d-ccd6-40b7-b355-1c71e159d14f@redhat.com> Message-ID: On Thu, Dec 12, 2024 at 5:27 PM Paolo Abeni wrote: > > On 12/9/24 15:01, Xiao Liang wrote: > > There are 4 net namespaces involved when creating links: > > > > - source netns - where the netlink socket resides, > > - target netns - where to put the device being created, > > - link netns - netns associated with the device (backend), > > - peer netns - netns of peer device. > > > > Currently, two nets are passed to newlink() callback - "src_net" > > parameter and "dev_net" (implicitly in net_device). They are set as > > follows, depending on netlink attributes.
> > > > +------------+-------------------+---------+---------+ > > | peer netns | IFLA_LINK_NETNSID | src_net | dev_net | > > +------------+-------------------+---------+---------+ > > | | absent | source | target | > > | absent +-------------------+---------+---------+ > > | | present | link | link | > > +------------+-------------------+---------+---------+ > > | | absent | peer | target | > > | present +-------------------+---------+---------+ > > | | present | peer | link | > > +------------+-------------------+---------+---------+ > > > > When IFLA_LINK_NETNSID is present, the device is created in link netns > > first. This has some side effects, including extra ifindex allocation, > > ifname validation and link notifications. There's also an extra step to > > move the device to target netns. These could be avoided if we create it > > in target netns at the beginning. > > > > On the other hand, the meaning of src_net is ambiguous. It varies > > depending on how parameters are passed. It is the effective link or peer > > netns by design, but some drivers ignore it and use dev_net instead. > > > > This patch refactors netns handling by packing newlink() parameters into > > a struct, and passing source, link and peer netns as is through this > > struct. Fallback logic is implemented in helper functions - > > rtnl_newlink_link_net() and rtnl_newlink_peer_net(). If is not set, peer > > netns falls back to link netns, and link netns falls back to source netns. > > rtnl_newlink_create() now creates devices in target netns directly, > > so dev_net is always target netns. > > > > For drivers that use dev_net as fallback of link_netns, current behavior > > is kept for compatibility. > > > > Signed-off-by: Xiao Liang > > I must admit this patch is way too huge for me to allow any reasonable > review except that this has the potential of breaking a lot of things. > > I think you should be splitted to make it more palatable; i.e. 
> - a patch just add the params struct with no semantic changes.
> - a patch making the dev_change_net_namespace() conditional on net !=
> tge_net[1]
> - many per-device patches creating directly the device in the target
> namespace.
> - a patch reverting [1]
>
> Other may have different opinions, I'd love to hear them.

Thanks. I understand your concern. Since the device is created in common
code, how about splitting the patch this way:

 1) make the params struct contain both current src_net and other netns:
        struct rtnl_newlink_params {
                struct net *net;        // renamed from current src_net
                struct net *src_net;    // real src_net
                struct net *link_net;
                ...
        };
 2) convert each driver to use the accurate netns,
 3) remove "net", which is not used now, from params struct,
 4) change rtnl_newlink_create() to create device in target netns directly.

So 1) will be a big one but has no semantic changes.
And I will send Patch 1 in this series to the net tree instead.

>
> > diff --git a/drivers/net/amt.c b/drivers/net/amt.c
> > index 98c6205ed19f..2f7bf50e05d2 100644
> > --- a/drivers/net/amt.c
> > +++ b/drivers/net/amt.c
> > @@ -3161,14 +3161,17 @@ static int amt_validate(struct nlattr *tb[], struct nlattr *data[],
> >       return 0;
> >  }
> >
> > -static int amt_newlink(struct net *net, struct net_device *dev,
> > -                      struct nlattr *tb[], struct nlattr *data[],
> > -                      struct netlink_ext_ack *extack)
> > +static int amt_newlink(struct rtnl_newlink_params *params)
> >  {
> > +       struct net_device *dev = params->dev;
> > +       struct nlattr **tb = params->tb;
> > +       struct nlattr **data = params->data;
> > +       struct netlink_ext_ack *extack = params->extack;
> > +       struct net *link_net = rtnl_newlink_link_net(params);
> >         struct amt_dev *amt = netdev_priv(dev);
> >         int err = -EINVAL;
>
> Minor nit: here and and many other places, please respect the reverse
> xmas tree order.

Will fix this.
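
For readers following the thread, the netns selection discussed above can be
modelled in a few lines. This is only a sketch in Python, not kernel code:
NewlinkNets, effective_link_net(), effective_peer_net() and legacy_nets() are
hypothetical names made up for illustration, and plain strings stand in for
struct net pointers. It assumes that the quoted table and the fallback rule
from the commit message (peer netns falls back to link netns, link netns falls
back to source netns) describe the behaviour completely.

```python
from typing import NamedTuple, Optional, Tuple


class NewlinkNets(NamedTuple):
    """Hypothetical model of the netns fields carried by the params struct."""
    src_net: str                     # netns where the netlink socket resides
    link_net: Optional[str] = None   # set only if IFLA_LINK_NETNSID is present
    peer_net: Optional[str] = None   # set only if a peer netns is given


def effective_link_net(p: NewlinkNets) -> str:
    """Link netns, falling back to the source netns when unset."""
    return p.link_net if p.link_net is not None else p.src_net


def effective_peer_net(p: NewlinkNets) -> str:
    """Peer netns, falling back to the effective link netns when unset."""
    return p.peer_net if p.peer_net is not None else effective_link_net(p)


def legacy_nets(p: NewlinkNets, tgt_net: str) -> Tuple[str, str]:
    """(src_net, dev_net) as a driver sees them today, per the quoted table."""
    # src_net is "the effective link or peer netns": peer, else link, else source.
    src = effective_peer_net(p)
    # dev_net is the link netns when IFLA_LINK_NETNSID is present, else the target.
    dev = p.link_net if p.link_net is not None else tgt_net
    return src, dev


if __name__ == "__main__":
    # Walk the four rows of the table.
    for peer in (None, "peer"):
        for link in (None, "link"):
            nets = NewlinkNets("source", link_net=link, peer_net=peer)
            print(peer, link, legacy_nets(nets, "target"))
```

Under the proposed split, a driver would call the link or peer helper
explicitly instead of interpreting an overloaded src_net, and dev_net would
always be the target netns.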
From shaw.leon at gmail.com Thu Dec 12 13:06:40 2024
From: shaw.leon at gmail.com (Xiao Liang)
Date: Thu, 12 Dec 2024 13:06:40 -0000
Subject: [PATCH net-next v5 5/5] selftests: net: Add two test cases for link netns
In-Reply-To: <4a2fe99a-772d-4df1-a8ef-14338682b69e@redhat.com>
References: <20241209140151.231257-1-shaw.leon@gmail.com> <20241209140151.231257-6-shaw.leon@gmail.com> <4a2fe99a-772d-4df1-a8ef-14338682b69e@redhat.com>
Message-ID:

On Thu, Dec 12, 2024 at 5:40 PM Paolo Abeni wrote:
>
> On 12/9/24 15:01, Xiao Liang wrote:
> > - Add test for creating link in another netns when a link of the same
> > name and ifindex exists in current netns.
> > - Add test for link netns atomicity - create link directly in target
> > netns, and no notifications should be generated in current netns.
> >
> > Signed-off-by: Xiao Liang
> > ---
> > tools/testing/selftests/net/Makefile | 1 +
> > tools/testing/selftests/net/netns-name.sh | 10 ++++++
> > tools/testing/selftests/net/netns_atomic.py | 39 +++++++++++++++++++++
> > 3 files changed, 50 insertions(+)
> > create mode 100755 tools/testing/selftests/net/netns_atomic.py
> >
> > diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
> > index cb2fc601de66..f9f7a765d645 100644
> > --- a/tools/testing/selftests/net/Makefile
> > +++ b/tools/testing/selftests/net/Makefile
> > @@ -34,6 +34,7 @@ TEST_PROGS += gre_gso.sh
> >  TEST_PROGS += cmsg_so_mark.sh
> >  TEST_PROGS += cmsg_time.sh cmsg_ipv6.sh
> >  TEST_PROGS += netns-name.sh
> > +TEST_PROGS += netns_atomic.py
> >  TEST_PROGS += nl_netdev.py
> >  TEST_PROGS += srv6_end_dt46_l3vpn_test.sh
> >  TEST_PROGS += srv6_end_dt4_l3vpn_test.sh
> > diff --git a/tools/testing/selftests/net/netns-name.sh b/tools/testing/selftests/net/netns-name.sh
> > index 6974474c26f3..0be1905d1f2f 100755
> > --- a/tools/testing/selftests/net/netns-name.sh
> > +++ b/tools/testing/selftests/net/netns-name.sh
> > @@ -78,6 +78,16 @@ ip -netns $NS link show dev $ALT_NAME 2> /dev/null
&& > > fail "Can still find alt-name after move" > > ip -netns $test_ns link del $DEV || fail > > > > +# > > +# Test no conflict of the same name/ifindex in different netns > > +# > > +ip -netns $NS link add name $DEV index 100 type dummy || fail > > +ip -netns $NS link add netns $test_ns name $DEV index 100 type dummy || > > + fail "Can create in netns without moving" > > +ip -netns $test_ns link show dev $DEV >> /dev/null || fail "Device not found" > > +ip -netns $NS link del $DEV || fail > > +ip -netns $test_ns link del $DEV || fail > > + > > echo -ne "$(basename $0) \t\t\t\t" > > if [ $RET_CODE -eq 0 ]; then > > echo "[ OK ]" > > diff --git a/tools/testing/selftests/net/netns_atomic.py b/tools/testing/selftests/net/netns_atomic.py > > new file mode 100755 > > index 000000000000..d350a3fc0a91 > > --- /dev/null > > +++ b/tools/testing/selftests/net/netns_atomic.py > > @@ -0,0 +1,39 @@ > > +#!/usr/bin/env python3 > > +# SPDX-License-Identifier: GPL-2.0 > > + > > +import time > > + > > +from lib.py import ksft_run, ksft_exit, ksft_true > > +from lib.py import ip > > +from lib.py import NetNS, NetNSEnter > > +from lib.py import RtnlFamily > > + > > + > > +def test_event(ns1, ns2) -> None: > > + with NetNSEnter(str(ns1)): > > + rtnl = RtnlFamily() > > + > > + rtnl.ntf_subscribe("rtnlgrp-link") > > + > > + ip(f"netns set {ns1} 0", ns=str(ns2)) > > + > > + ip(f"link add netns {ns2} link-netnsid 0 dummy1 type dummy") > > + ip(f"link add netns {ns2} dummy2 type dummy", ns=str(ns1)) > > + > > + ip("link del dummy1", ns=str(ns2)) > > + ip("link del dummy2", ns=str(ns2)) > > + > > + time.sleep(1) > > + rtnl.check_ntf() > > + ksft_true(rtnl.async_msg_queue.empty(), > > + "Received unexpected link notification") > > I think we need a much larger coverage here, possibly testing all the > update drivers and more 'netns', 'link-netnsid', 'peer netns' > permutations for the devices that allow them. OK, I will add more cases. 
But I'm afraid I don't know how to build valid parameters for all of them,
and some seem to require hardware.

>
> Thanks,
>
> Paolo
>

From liuhangbin at gmail.com Fri Dec 13 03:08:37 2024
From: liuhangbin at gmail.com (Hangbin Liu)
Date: Fri, 13 Dec 2024 03:08:37 -0000
Subject: [PATCHv3 net-next 0/2] selftests: wireguards: use nftables for testing
Message-ID: <20241213030819.49987-1-liuhangbin@gmail.com>

This patch set converts iptables to nftables for wireguard testing, as
iptables is deprecated and nftables is the default framework of most
releases.

v3: drop iptables directly (Jason A. Donenfeld)
    Also convert to using nft for qemu testing (Jason A. Donenfeld)
v2: use one nft table for testing (Phil Sutter)

Hangbin Liu (2):
  selftests: wireguards: convert iptables to nft
  selftests: wireguard: update to using nft for qemu test

 tools/testing/selftests/wireguard/netns.sh    | 29 +++++++++-----
 .../testing/selftests/wireguard/qemu/Makefile | 40 ++++++++++++++-----
 .../selftests/wireguard/qemu/kernel.config    |  7 ++--
 3 files changed, 53 insertions(+), 23 deletions(-)

-- 
2.39.5 (Apple Git-154)

From liuhangbin at gmail.com Fri Dec 13 03:08:43 2024
From: liuhangbin at gmail.com (Hangbin Liu)
Date: Fri, 13 Dec 2024 03:08:43 -0000
Subject: [PATCHv3 net-next 1/2] selftests: wireguards: convert iptables to nft
In-Reply-To: <20241213030819.49987-1-liuhangbin@gmail.com>
References: <20241213030819.49987-1-liuhangbin@gmail.com>
Message-ID: <20241213030819.49987-2-liuhangbin@gmail.com>

Convert iptables to nft, as nft is the replacement for iptables and is
used by default in most releases.
Signed-off-by: Hangbin Liu --- tools/testing/selftests/wireguard/netns.sh | 29 ++++++++++++++-------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh index 55500f901fbc..4032384e6747 100755 --- a/tools/testing/selftests/wireguard/netns.sh +++ b/tools/testing/selftests/wireguard/netns.sh @@ -75,6 +75,11 @@ pp ip netns add $netns1 pp ip netns add $netns2 ip0 link set up dev lo +# init nft tables +n0 nft add table ip wgtest +n1 nft add table ip wgtest +n2 nft add table ip wgtest + ip0 link add dev wg0 type wireguard ip0 link set wg0 netns $netns1 ip0 link add dev wg0 type wireguard @@ -196,13 +201,14 @@ ip1 link set wg0 mtu 1300 ip2 link set wg0 mtu 1300 n1 wg set wg0 peer "$pub2" endpoint 127.0.0.1:2 n2 wg set wg0 peer "$pub1" endpoint 127.0.0.1:1 -n0 iptables -A INPUT -m length --length 1360 -j DROP +n0 nft add chain ip wgtest INPUT { type filter hook input priority filter \; policy accept \; } +n0 nft add rule ip wgtest INPUT meta length 1360 counter drop n1 ip route add 192.168.241.2/32 dev wg0 mtu 1299 n2 ip route add 192.168.241.1/32 dev wg0 mtu 1299 n2 ping -c 1 -W 1 -s 1269 192.168.241.1 n2 ip route delete 192.168.241.1/32 dev wg0 mtu 1299 n1 ip route delete 192.168.241.2/32 dev wg0 mtu 1299 -n0 iptables -F INPUT +n0 nft flush table ip wgtest ip1 link set wg0 mtu $orig_mtu ip2 link set wg0 mtu $orig_mtu @@ -335,7 +341,8 @@ n0 bash -c 'printf 1 > /proc/sys/net/ipv4/ip_forward' [[ -e /proc/sys/net/netfilter/nf_conntrack_udp_timeout ]] || modprobe nf_conntrack n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout' n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout_stream' -n0 iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -d 10.0.0.0/24 -j SNAT --to 10.0.0.1 +n0 nft add chain ip wgtest POSTROUTING { type nat hook postrouting priority srcnat\; policy accept \; } +n0 nft add rule ip wgtest POSTROUTING ip saddr 
192.168.1.0/24 ip daddr 10.0.0.0/24 counter snat to 10.0.0.1 n1 wg set wg0 peer "$pub2" endpoint 10.0.0.100:2 persistent-keepalive 1 n1 ping -W 1 -c 1 192.168.241.2 @@ -349,10 +356,11 @@ n1 wg set wg0 peer "$pub2" persistent-keepalive 0 # Test that sk_bound_dev_if works n1 ping -I wg0 -c 1 -W 1 192.168.241.2 # What about when the mark changes and the packet must be rerouted? -n1 iptables -t mangle -I OUTPUT -j MARK --set-xmark 1 +n1 nft add chain ip wgtest OUTPUT { type route hook output priority mangle\; policy accept \; } +n1 nft add rule ip wgtest OUTPUT counter meta mark set 0x1 n1 ping -c 1 -W 1 192.168.241.2 # First the boring case n1 ping -I wg0 -c 1 -W 1 192.168.241.2 # Then the sk_bound_dev_if case -n1 iptables -t mangle -D OUTPUT -j MARK --set-xmark 1 +n1 nft flush table ip wgtest # Test that onion routing works, even when it loops n1 wg set wg0 peer "$pub3" allowed-ips 192.168.242.2/32 endpoint 192.168.241.2:5 @@ -386,16 +394,17 @@ n1 ping -W 1 -c 100 -f 192.168.99.7 n1 ping -W 1 -c 100 -f abab::1111 # Have ns2 NAT into wg0 packets from ns0, but return an icmp error along the right route. -n2 iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -d 192.168.241.0/24 -j SNAT --to 192.168.241.2 -n0 iptables -t filter -A INPUT \! -s 10.0.0.0/24 -i vethrs -j DROP # Manual rpfilter just to be explicit. +n2 nft add chain ip wgtest POSTROUTING { type nat hook postrouting priority srcnat\; policy accept \; } +n2 nft add rule ip wgtest POSTROUTING ip saddr 10.0.0.0/24 ip daddr 192.168.241.0/24 counter snat to 192.168.241.2 +n0 nft add chain ip wgtest INPUT { type filter hook input priority filter \; policy accept \; } +n0 nft add rule ip wgtest INPUT iifname "vethrs" ip saddr != 10.0.0.0/24 counter drop n2 bash -c 'printf 1 > /proc/sys/net/ipv4/ip_forward' ip0 -4 route add 192.168.241.1 via 10.0.0.100 n2 wg set wg0 peer "$pub1" remove [[ $(! 
n0 ping -W 1 -c 1 192.168.241.1 || false) == *"From 10.0.0.100 icmp_seq=1 Destination Host Unreachable"* ]] -n0 iptables -t nat -F -n0 iptables -t filter -F -n2 iptables -t nat -F +n0 nft flush table ip wgtest +n2 nft flush table ip wgtest ip0 link del vethrc ip0 link del vethrs ip1 link del wg0 -- 2.39.5 (Apple Git-154) From liuhangbin at gmail.com Fri Dec 13 03:08:49 2024 From: liuhangbin at gmail.com (Hangbin Liu) Date: Fri, 13 Dec 2024 03:08:49 -0000 Subject: [PATCHv3 net-next 2/2] selftests: wireguard: update to using nft for qemu test In-Reply-To: <20241213030819.49987-1-liuhangbin@gmail.com> References: <20241213030819.49987-1-liuhangbin@gmail.com> Message-ID: <20241213030819.49987-3-liuhangbin@gmail.com> Since we will replace iptables with nft for wireguard netns testing, let's also convert the qemu test to use nft at the same time. Co-developed-by: Phil Sutter Signed-off-by: Phil Sutter Signed-off-by: Hangbin Liu --- .../testing/selftests/wireguard/qemu/Makefile | 40 ++++++++++++++----- .../selftests/wireguard/qemu/kernel.config | 7 ++-- 2 files changed, 34 insertions(+), 13 deletions(-) diff --git a/tools/testing/selftests/wireguard/qemu/Makefile b/tools/testing/selftests/wireguard/qemu/Makefile index 35856b11c143..10e79449fefa 100644 --- a/tools/testing/selftests/wireguard/qemu/Makefile +++ b/tools/testing/selftests/wireguard/qemu/Makefile @@ -40,7 +40,9 @@ endef $(eval $(call tar_download,IPERF,iperf,3.11,.tar.gz,https://downloads.es.net/pub/iperf/,de8cb409fad61a0574f4cb07eb19ce1159707403ac2dc01b5d175e91240b7e5f)) $(eval $(call tar_download,BASH,bash,5.1.16,.tar.gz,https://ftp.gnu.org/gnu/bash/,5bac17218d3911834520dad13cd1f85ab944e1c09ae1aba55906be1f8192f558)) $(eval $(call tar_download,IPROUTE2,iproute2,5.17.0,.tar.gz,https://www.kernel.org/pub/linux/utils/net/iproute2/,bda331d5c4606138892f23a565d78fca18919b4d508a0b7ca8391c2da2db68b9)) -$(eval $(call 
tar_download,IPTABLES,iptables,1.8.7,.tar.bz2,https://www.netfilter.org/projects/iptables/files/,c109c96bb04998cd44156622d36f8e04b140701ec60531a10668cfdff5e8d8f0)) +$(eval $(call tar_download,LIBMNL,libmnl,1.0.5,.tar.bz2,https://www.netfilter.org/projects/libmnl/files/,274b9b919ef3152bfb3da3a13c950dd60d6e2bcd54230ffeca298d03b40d0525)) +$(eval $(call tar_download,LIBNFTNL,libnftnl,1.2.8,.tar.xz,https://www.netfilter.org/projects/libnftnl/files/,37fea5d6b5c9b08de7920d298de3cdc942e7ae64b1a3e8b880b2d390ae67ad95)) +$(eval $(call tar_download,NFTABLES,nftables,1.1.1,.tar.xz,https://www.netfilter.org/projects/nftables/files/,6358830f3a64f31e39b0ad421d7dadcd240b72343ded48d8ef13b8faf204865a)) $(eval $(call tar_download,NMAP,nmap,7.92,.tgz,https://nmap.org/dist/,064183ea642dc4c12b1ab3b5358ce1cef7d2e7e11ffa2849f16d339f5b717117)) $(eval $(call tar_download,IPUTILS,iputils,s20190709,.tar.gz,https://github.com/iputils/iputils/archive/s20190709.tar.gz/#,a15720dd741d7538dd2645f9f516d193636ae4300ff7dbc8bfca757bf166490a)) $(eval $(call tar_download,WIREGUARD_TOOLS,wireguard-tools,1.0.20210914,.tar.xz,https://git.zx2c4.com/wireguard-tools/snapshot/,97ff31489217bb265b7ae850d3d0f335ab07d2652ba1feec88b734bc96bd05ac)) @@ -322,11 +324,12 @@ $(BUILD_PATH)/init-cpio-spec.txt: $(TOOLCHAIN_PATH)/.installed $(BUILD_PATH)/ini echo "file /bin/ss $(IPROUTE2_PATH)/misc/ss 755 0 0" >> $@ echo "file /bin/ping $(IPUTILS_PATH)/ping 755 0 0" >> $@ echo "file /bin/ncat $(NMAP_PATH)/ncat/ncat 755 0 0" >> $@ - echo "file /bin/xtables-legacy-multi $(IPTABLES_PATH)/iptables/xtables-legacy-multi 755 0 0" >> $@ - echo "slink /bin/iptables xtables-legacy-multi 777 0 0" >> $@ + echo "file /bin/nft $(NFTABLES_PATH)/src/nft 755 0 0" >> $@ echo "slink /bin/ping6 ping 777 0 0" >> $@ echo "dir /lib 755 0 0" >> $@ echo "file /lib/libc.so $(TOOLCHAIN_PATH)/$(CHOST)/lib/libc.so 755 0 0" >> $@ + echo "file /lib/libmnl.so.0 $(TOOLCHAIN_PATH)/lib/libmnl.so.0 755 0 0" >> $@ + echo "file /lib/libnftnl.so.11 
$(TOOLCHAIN_PATH)/lib/libnftnl.so.11 755 0 0" >> $@ echo "slink $$($(CHOST)-readelf -p .interp '$(BUILD_PATH)/init'| grep -o '/lib/.*') libc.so 777 0 0" >> $@ $(KERNEL_BUILD_PATH)/.config: $(TOOLCHAIN_PATH)/.installed kernel.config arch/$(ARCH).config @@ -338,7 +341,7 @@ $(KERNEL_BUILD_PATH)/.config: $(TOOLCHAIN_PATH)/.installed kernel.config arch/$( cd $(KERNEL_BUILD_PATH) && ARCH=$(KERNEL_ARCH) $(KERNEL_PATH)/scripts/kconfig/merge_config.sh -n $(KERNEL_BUILD_PATH)/.config $(KERNEL_BUILD_PATH)/minimal.config $(if $(findstring yes,$(DEBUG_KERNEL)),cp debug.config $(KERNEL_BUILD_PATH) && cd $(KERNEL_BUILD_PATH) && ARCH=$(KERNEL_ARCH) $(KERNEL_PATH)/scripts/kconfig/merge_config.sh -n $(KERNEL_BUILD_PATH)/.config debug.config,) -$(KERNEL_BZIMAGE): $(TOOLCHAIN_PATH)/.installed $(KERNEL_BUILD_PATH)/.config $(BUILD_PATH)/init-cpio-spec.txt $(IPERF_PATH)/src/iperf3 $(IPUTILS_PATH)/ping $(BASH_PATH)/bash $(IPROUTE2_PATH)/misc/ss $(IPROUTE2_PATH)/ip/ip $(IPTABLES_PATH)/iptables/xtables-legacy-multi $(NMAP_PATH)/ncat/ncat $(WIREGUARD_TOOLS_PATH)/src/wg $(BUILD_PATH)/init +$(KERNEL_BZIMAGE): $(TOOLCHAIN_PATH)/.installed $(KERNEL_BUILD_PATH)/.config $(BUILD_PATH)/init-cpio-spec.txt $(IPERF_PATH)/src/iperf3 $(IPUTILS_PATH)/ping $(BASH_PATH)/bash $(IPROUTE2_PATH)/misc/ss $(IPROUTE2_PATH)/ip/ip $(LIBMNL_PATH)/libmnl $(LIBNFTNL_PATH)/libnftnl $(NFTABLES_PATH)/src/nft $(NMAP_PATH)/ncat/ncat $(WIREGUARD_TOOLS_PATH)/src/wg $(BUILD_PATH)/init $(MAKE) -C $(KERNEL_PATH) O=$(KERNEL_BUILD_PATH) ARCH=$(KERNEL_ARCH) CROSS_COMPILE=$(CROSS_COMPILE) .PHONY: $(KERNEL_BZIMAGE) @@ -421,15 +424,34 @@ $(IPROUTE2_PATH)/misc/ss: | $(IPROUTE2_PATH)/.installed $(USERSPACE_DEPS) $(MAKE) -C $(IPROUTE2_PATH) PREFIX=/ misc/ss $(STRIP) -s $@ -$(IPTABLES_PATH)/.installed: $(IPTABLES_TAR) +$(LIBMNL_PATH)/.installed: $(LIBMNL_TAR) mkdir -p $(BUILD_PATH) flock -s $<.lock tar -C $(BUILD_PATH) -xf $< - sed -i -e "/nfnetlink=[01]/s:=[01]:=0:" -e "/nfconntrack=[01]/s:=[01]:=0:" $(IPTABLES_PATH)/configure touch $@ 
-$(IPTABLES_PATH)/iptables/xtables-legacy-multi: | $(IPTABLES_PATH)/.installed $(USERSPACE_DEPS) - cd $(IPTABLES_PATH) && ./configure --prefix=/ $(CROSS_COMPILE_FLAG) --enable-static --disable-shared --disable-nftables --disable-bpf-compiler --disable-nfsynproxy --disable-libipq --disable-connlabel --with-kernel=$(BUILD_PATH)/include - $(MAKE) -C $(IPTABLES_PATH) +$(LIBMNL_PATH)/libmnl: | $(LIBMNL_PATH)/.installed $(USERSPACE_DEPS) + cd $(LIBMNL_PATH) && ./configure --prefix=$(TOOLCHAIN_PATH) $(CROSS_COMPILE_FLAG) + $(MAKE) -C $(LIBMNL_PATH) install + $(STRIP) -s $(TOOLCHAIN_PATH)/lib/libmnl.so.0 + +$(LIBNFTNL_PATH)/.installed: $(LIBNFTNL_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + touch $@ + +$(LIBNFTNL_PATH)/libnftnl: | $(LIBNFTNL_PATH)/.installed $(USERSPACE_DEPS) + cd $(LIBNFTNL_PATH) && PKG_CONFIG_PATH="$(TOOLCHAIN_PATH)/lib/pkgconfig" ./configure --prefix=$(TOOLCHAIN_PATH) $(CROSS_COMPILE_FLAG) + $(MAKE) -C $(LIBNFTNL_PATH) install + $(STRIP) -s $(TOOLCHAIN_PATH)/lib/libnftnl.so.11 + +$(NFTABLES_PATH)/.installed: $(NFTABLES_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + touch $@ + +$(NFTABLES_PATH)/src/nft: | $(NFTABLES_PATH)/.installed $(USERSPACE_DEPS) + cd $(NFTABLES_PATH) && PKG_CONFIG_PATH="$(TOOLCHAIN_PATH)/lib/pkgconfig" ./configure --prefix=/ $(CROSS_COMPILE_FLAG) --enable-static --disable-shared --disable-debug --disable-man-doc --with-mini-gmp --without-cli + $(MAKE) -C $(NFTABLES_PATH) PREFIX=/ $(STRIP) -s $@ $(NMAP_PATH)/.installed: $(NMAP_TAR) diff --git a/tools/testing/selftests/wireguard/qemu/kernel.config b/tools/testing/selftests/wireguard/qemu/kernel.config index f314d3789f17..9930116ecd81 100644 --- a/tools/testing/selftests/wireguard/qemu/kernel.config +++ b/tools/testing/selftests/wireguard/qemu/kernel.config @@ -19,10 +19,9 @@ CONFIG_NETFILTER_XTABLES=y CONFIG_NETFILTER_XT_NAT=y CONFIG_NETFILTER_XT_MATCH_LENGTH=y CONFIG_NETFILTER_XT_MARK=y -CONFIG_IP_NF_IPTABLES=y 
-CONFIG_IP_NF_FILTER=y
-CONFIG_IP_NF_MANGLE=y
-CONFIG_IP_NF_NAT=y
+CONFIG_NF_TABLES=m
+CONFIG_NF_TABLES_INET=y
+CONFIG_NFT_NAT=y
 CONFIG_IP_ADVANCED_ROUTER=y
 CONFIG_IP_MULTIPLE_TABLES=y
 CONFIG_IPV6_MULTIPLE_TABLES=y
-- 
2.39.5 (Apple Git-154)

From shaw.leon at gmail.com Wed Dec 18 13:09:27 2024
From: shaw.leon at gmail.com (Xiao Liang)
Date: Wed, 18 Dec 2024 13:09:27 -0000
Subject: [PATCH net-next v6 00/11] net: Improve netns handling in rtnetlink
Message-ID: <20241218130909.2173-1-shaw.leon@gmail.com>

This patch series includes some netns-related improvements and fixes for
rtnetlink, to make link creation more intuitive:

 1) Creating link in another net namespace doesn't conflict with link
    names in current one.
 2) Refactor rtnetlink link creation. Create link in target namespace
    directly.

So that

  # ip link add netns ns1 link-netns ns2 tun0 type gre ...

will create tun0 in ns1, rather than create it in ns2 and move to ns1.
It also won't conflict with another interface named "tun0" in current netns.

Patch 01 serves 1), avoiding link name conflicts in different netns.

To achieve 2), there are mainly 3 steps:

 - Patch 02 packs newlink() parameters into a struct, including
   the original "src_net" along with more netns context.
 - Patches 03 ~ 07 convert device drivers to use the explicit netns
   extracted from params.
 - Patches 08 ~ 09 remove the old netns parameter, and convert
   rtnetlink to create device in target netns directly.

Patches 10 ~ 11 add some tests for link name and link netns.

BTW please note there are some issues found in current code:

- In amt_newlink() in drivers/net/amt.c:

    amt->net = net;
    ...
    amt->stream_dev = dev_get_by_index(net, ...

  Uses net, but amt_lookup_upper_dev() only searches in dev_net.
  So the AMT device may not be properly deleted if it's in a different
  netns from lower dev.

- In gtp_newlink() in drivers/net/gtp.c:

    gtp->net = src_net;
    ...
    gn = net_generic(dev_net(dev), gtp_net_id);
    list_add_rcu(&gtp->list, &gn->gtp_dev_list);

  Uses src_net, but priv is linked to list in dev_net. So it may not be
  properly deleted on removal of link netns.

- In pfcp_newlink() in drivers/net/pfcp.c:

    pfcp->net = net;
    ...
    pn = net_generic(dev_net(dev), pfcp_net_id);
    list_add_rcu(&pfcp->list, &pn->pfcp_dev_list);

  Same as above.

- In lowpan_newlink() in net/ieee802154/6lowpan/core.c:

    wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK]));

  Looks for IFLA_LINK in dev_net, but in theory the ifindex is defined
  in link netns.

---

v6:
 - Split prototype, driver and rtnetlink changes.
 - Add more tests for link netns.
 - Fix IPv6 tunnel net overwritten in ndo_init().
 - Reorder variable declarations.
 - Exclude an ip_tunnel-specific patch.

v5:
 link: https://lore.kernel.org/all/20241209140151.231257-1-shaw.leon@gmail.com/
 - Fix function doc in batman-adv.
 - Include peer_net in rtnl newlink parameters.

v4:
 link: https://lore.kernel.org/all/20241118143244.1773-1-shaw.leon@gmail.com/
 - Pack newlink() parameters to a single struct.
 - Use ynl async_msg_queue.empty() in selftest.

v3:
 link: https://lore.kernel.org/all/20241113125715.150201-1-shaw.leon@gmail.com/
 - Drop "netns_atomic" flag and module parameter. Add netns parameter to
   newlink() instead, and convert drivers accordingly.
 - Move python NetNSEnter helper to net selftest lib.

v2:
 link: https://lore.kernel.org/all/20241107133004.7469-1-shaw.leon@gmail.com/
 - Check NLM_F_EXCL to ensure only link creation is affected.
 - Add self tests for link name/ifindex conflict and notifications
   in different netns.
 - Changes in dummy driver and ynl in order to add the test case.
v1: link: https://lore.kernel.org/all/20241023023146.372653-1-shaw.leon at gmail.com/ Xiao Liang (11): rtnetlink: Lookup device in target netns when creating link rtnetlink: Pack newlink() params into struct net: Use link netns in newlink() of rtnl_link_ops ieee802154: 6lowpan: Use link netns in newlink() of rtnl_link_ops net: ip_tunnel: Use link netns in newlink() of rtnl_link_ops net: ipv6: Use link netns in newlink() of rtnl_link_ops net: xfrm: Use link netns in newlink() of rtnl_link_ops rtnetlink: Remove "net" from newlink params rtnetlink: Create link directly in target net namespace selftests: net: Add python context manager for netns entering selftests: net: Add test cases for link and peer netns drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 11 +- drivers/net/amt.c | 16 +- drivers/net/bareudp.c | 11 +- drivers/net/bonding/bond_netlink.c | 8 +- drivers/net/can/dev/netlink.c | 4 +- drivers/net/can/vxcan.c | 9 +- .../ethernet/qualcomm/rmnet/rmnet_config.c | 11 +- drivers/net/geneve.c | 11 +- drivers/net/gtp.c | 9 +- drivers/net/ipvlan/ipvlan.h | 4 +- drivers/net/ipvlan/ipvlan_main.c | 15 +- drivers/net/ipvlan/ipvtap.c | 10 +- drivers/net/macsec.c | 15 +- drivers/net/macvlan.c | 8 +- drivers/net/macvtap.c | 11 +- drivers/net/netkit.c | 9 +- drivers/net/pfcp.c | 11 +- drivers/net/ppp/ppp_generic.c | 10 +- drivers/net/team/team_core.c | 7 +- drivers/net/veth.c | 9 +- drivers/net/vrf.c | 11 +- drivers/net/vxlan/vxlan_core.c | 11 +- drivers/net/wireguard/device.c | 11 +- drivers/net/wireless/virtual/virt_wifi.c | 14 +- drivers/net/wwan/wwan_core.c | 25 ++- include/net/ip_tunnels.h | 5 +- include/net/rtnetlink.h | 44 +++++- net/8021q/vlan_netlink.c | 15 +- net/batman-adv/soft-interface.c | 16 +- net/bridge/br_netlink.c | 12 +- net/caif/chnl_net.c | 6 +- net/core/rtnetlink.c | 35 +++-- net/hsr/hsr_netlink.c | 14 +- net/ieee802154/6lowpan/core.c | 9 +- net/ipv4/ip_gre.c | 27 ++-- net/ipv4/ip_tunnel.c | 10 +- net/ipv4/ip_vti.c | 10 +- net/ipv4/ipip.c | 14 +- 
net/ipv6/ip6_gre.c | 42 ++++-- net/ipv6/ip6_tunnel.c | 20 ++- net/ipv6/ip6_vti.c | 16 +- net/ipv6/sit.c | 18 ++- net/xfrm/xfrm_interface_core.c | 15 +- tools/testing/selftests/net/Makefile | 1 + .../testing/selftests/net/lib/py/__init__.py | 2 +- tools/testing/selftests/net/lib/py/netns.py | 18 +++ tools/testing/selftests/net/link_netns.py | 142 ++++++++++++++++++ tools/testing/selftests/net/netns-name.sh | 10 ++ 48 files changed, 546 insertions(+), 226 deletions(-) create mode 100755 tools/testing/selftests/net/link_netns.py -- 2.47.1 From shaw.leon at gmail.com Wed Dec 18 13:09:35 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Wed, 18 Dec 2024 13:09:35 -0000 Subject: [PATCH net-next v6 01/11] rtnetlink: Lookup device in target netns when creating link In-Reply-To: <20241218130909.2173-1-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> Message-ID: <20241218130909.2173-2-shaw.leon@gmail.com> When creating link, lookup for existing device in target net namespace instead of current one. For example, two links created by: # ip link add dummy1 type dummy # ip link add netns ns1 dummy1 type dummy should have no conflict since they are in different namespaces. Signed-off-by: Xiao Liang --- net/core/rtnetlink.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index d9e363d9fa31..6a2fa030c8e0 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3851,20 +3851,26 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, { struct nlattr ** const tb = tbs->tb; struct net *net = sock_net(skb->sk); + struct net *device_net; struct net_device *dev; struct ifinfomsg *ifm; bool link_specified; + /* When creating, lookup for existing device in target net namespace */ + device_net = (nlh->nlmsg_flags & NLM_F_CREATE) && + (nlh->nlmsg_flags & NLM_F_EXCL) ? 
+ tgt_net : net; + ifm = nlmsg_data(nlh); if (ifm->ifi_index > 0) { link_specified = true; - dev = __dev_get_by_index(net, ifm->ifi_index); + dev = __dev_get_by_index(device_net, ifm->ifi_index); } else if (ifm->ifi_index < 0) { NL_SET_ERR_MSG(extack, "ifindex can't be negative"); return -EINVAL; } else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME]) { link_specified = true; - dev = rtnl_dev_get(net, tb); + dev = rtnl_dev_get(device_net, tb); } else { link_specified = false; dev = NULL; -- 2.47.1 From shaw.leon at gmail.com Wed Dec 18 13:09:44 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Wed, 18 Dec 2024 13:09:44 -0000 Subject: [PATCH net-next v6 02/11] rtnetlink: Pack newlink() params into struct In-Reply-To: <20241218130909.2173-1-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> Message-ID: <20241218130909.2173-3-shaw.leon@gmail.com>

There are 4 net namespaces involved when creating links:

 - source netns - where the netlink socket resides,
 - target netns - where to put the device being created,
 - link netns - netns associated with the device (backend),
 - peer netns - netns of peer device.

Currently, two nets are passed to newlink() callback - "src_net" parameter and "dev_net" (implicitly in net_device). They are set as follows, depending on netlink attributes in the request.

 +------------+-------------------+---------+---------+
 | peer netns | IFLA_LINK_NETNSID | src_net | dev_net |
 +------------+-------------------+---------+---------+
 |            | absent            | source  | target  |
 | absent     +-------------------+---------+---------+
 |            | present           | link    | link    |
 +------------+-------------------+---------+---------+
 |            | absent            | peer    | target  |
 | present    +-------------------+---------+---------+
 |            | present           | peer    | link    |
 +------------+-------------------+---------+---------+

When IFLA_LINK_NETNSID is present, the device is created in link netns first and then moved to target netns.
This has some side effects, including extra ifindex allocation, ifname validation and link events. These could be avoided if we create it in target netns from the beginning.

On the other hand, the meaning of the src_net parameter is ambiguous. It varies depending on how parameters are passed. It is the effective link netns (or peer netns) by design, but some drivers ignore it and use dev_net instead.

This patch packs existing newlink() parameters, along with the source netns, link netns and peer netns, into a struct. The old "src_net" is renamed to "net" to avoid confusion with real source netns, and will be deprecated later. The uses of src_net are converted to params->net trivially.

To make the semantics clearer, two helper functions - rtnl_newlink_link_net() and rtnl_newlink_peer_net() - are provided for netns fallback logic. Peer netns falls back to link netns, and link netns falls back to source netns.

In the following patches, to prepare for creating link in target netns directly:

 - For device drivers that are aware of the old "src_net", its uses are replaced with one of the two helper functions.
 - And for those that take dev_net() as link netns, we try params->link_net and then dev_net(), in order to maintain compatibility with the old behavior.

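To illustrate the fallback order described above, here is a minimal userspace sketch, not kernel code. It assumes a stand-in "struct net" and a trimmed-down "struct rtnl_newlink_params" holding only the netns members; legacy_net() and newlink_netns_demo() are hypothetical names introduced for this example. The sketch checks that, for each absent/present combination in the table, the legacy "net" value equals what rtnl_newlink_peer_net() would return.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct net; only identity matters here. */
struct net {
	const char *name;
};

/* Userspace mock: only the netns members of the struct in this patch. */
struct rtnl_newlink_params {
	struct net *net;      /* legacy "netns of interest" (old src_net) */
	struct net *src_net;  /* netns of the rtnetlink socket */
	struct net *link_net; /* from IFLA_LINK_NETNSID, NULL if absent */
	struct net *peer_net; /* peer netns, NULL if absent */
};

/* Link netns falls back to source netns. */
static struct net *rtnl_newlink_link_net(const struct rtnl_newlink_params *p)
{
	return p->link_net ? p->link_net : p->src_net;
}

/* Peer netns falls back to link netns. */
static struct net *rtnl_newlink_peer_net(const struct rtnl_newlink_params *p)
{
	return p->peer_net ? p->peer_net : rtnl_newlink_link_net(p);
}

/* Hypothetical helper modelling how params->net is chosen in
 * rtnl_newlink_create(): source by default, overridden by link netns,
 * then by peer netns.
 */
static struct net *legacy_net(struct net *src, struct net *link,
			      struct net *peer)
{
	struct net *net = src;

	if (link)
		net = link;
	if (peer)
		net = peer;
	return net;
}

/* Hypothetical demo: returns 0 if, for all four absent/present
 * combinations, the helpers agree with the legacy selection.
 */
static int newlink_netns_demo(void)
{
	struct net src = { "source" }, link = { "link" }, peer = { "peer" };
	struct net *links[] = { NULL, &link };
	struct net *peers[] = { NULL, &peer };
	int i, j;

	for (i = 0; i < 2; i++) {
		for (j = 0; j < 2; j++) {
			struct rtnl_newlink_params p = {
				.src_net = &src,
				.link_net = links[i],
				.peer_net = peers[j],
			};

			p.net = legacy_net(&src, links[i], peers[j]);
			if (rtnl_newlink_peer_net(&p) != p.net)
				return -1;
			if (rtnl_newlink_link_net(&p) !=
			    (links[i] ? &link : &src))
				return -1;
		}
	}
	return 0;
}
```

Calling newlink_netns_demo() from a test harness should return 0. The mock leaves out dev, tb, data and extack because they play no part in the netns fallback.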
Signed-off-by: Xiao Liang --- drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 9 ++-- drivers/net/amt.c | 12 +++-- drivers/net/bareudp.c | 9 ++-- drivers/net/bonding/bond_netlink.c | 8 ++-- drivers/net/can/dev/netlink.c | 4 +- drivers/net/can/vxcan.c | 9 ++-- .../ethernet/qualcomm/rmnet/rmnet_config.c | 9 ++-- drivers/net/geneve.c | 9 ++-- drivers/net/gtp.c | 7 +-- drivers/net/ipvlan/ipvlan.h | 4 +- drivers/net/ipvlan/ipvlan_main.c | 13 ++++-- drivers/net/ipvlan/ipvtap.c | 10 ++-- drivers/net/macsec.c | 13 ++++-- drivers/net/macvlan.c | 7 ++- drivers/net/macvtap.c | 11 +++-- drivers/net/netkit.c | 9 ++-- drivers/net/pfcp.c | 9 ++-- drivers/net/ppp/ppp_generic.c | 8 ++-- drivers/net/team/team_core.c | 7 +-- drivers/net/veth.c | 9 ++-- drivers/net/vrf.c | 11 +++-- drivers/net/vxlan/vxlan_core.c | 9 ++-- drivers/net/wireguard/device.c | 9 ++-- drivers/net/wireless/virtual/virt_wifi.c | 12 +++-- drivers/net/wwan/wwan_core.c | 25 +++++++--- include/net/rtnetlink.h | 46 +++++++++++++++++-- net/8021q/vlan_netlink.c | 13 ++++-- net/batman-adv/soft-interface.c | 16 +++---- net/bridge/br_netlink.c | 12 +++-- net/caif/chnl_net.c | 6 +-- net/core/rtnetlink.c | 16 +++++-- net/hsr/hsr_netlink.c | 8 ++-- net/ieee802154/6lowpan/core.c | 6 +-- net/ipv4/ip_gre.c | 21 ++++++--- net/ipv4/ip_vti.c | 7 +-- net/ipv4/ipip.c | 11 +++-- net/ipv6/ip6_gre.c | 24 ++++++---- net/ipv6/ip6_tunnel.c | 7 +-- net/ipv6/ip6_vti.c | 6 +-- net/ipv6/sit.c | 7 +-- net/xfrm/xfrm_interface_core.c | 7 +-- 41 files changed, 296 insertions(+), 159 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c index 9ad8d9856275..61f2457aab77 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c @@ -97,10 +97,13 @@ static int ipoib_changelink(struct net_device *dev, struct nlattr *tb[], return ret; } -static int ipoib_new_child_link(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], 
struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipoib_new_child_link(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct net_device *pdev; struct ipoib_dev_priv *ppriv; u16 child_pkey; diff --git a/drivers/net/amt.c b/drivers/net/amt.c index 98c6205ed19f..85878abb51d2 100644 --- a/drivers/net/amt.c +++ b/drivers/net/amt.c @@ -3161,13 +3161,17 @@ static int amt_validate(struct nlattr *tb[], struct nlattr *data[], return 0; } -static int amt_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int amt_newlink(struct rtnl_newlink_params *params) { - struct amt_dev *amt = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; + struct net *net = params->net; + struct amt_dev *amt; int err = -EINVAL; + amt = netdev_priv(dev); amt->net = net; amt->mode = nla_get_u32(data[IFLA_AMT_MODE]); diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index 70814303aab8..4c2a50bbf7c0 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -698,10 +698,13 @@ static void bareudp_dellink(struct net_device *dev, struct list_head *head) unregister_netdevice_queue(dev, head); } -static int bareudp_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int bareudp_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; + struct net *net = params->net; struct bareudp_conf conf; int err; diff --git 
a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c index 2a6a424806aa..39708a778285 100644 --- a/drivers/net/bonding/bond_netlink.c +++ b/drivers/net/bonding/bond_netlink.c @@ -564,10 +564,12 @@ static int bond_changelink(struct net_device *bond_dev, struct nlattr *tb[], return 0; } -static int bond_newlink(struct net *src_net, struct net_device *bond_dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int bond_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *bond_dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; int err; err = bond_changelink(bond_dev, tb, data, extack); diff --git a/drivers/net/can/dev/netlink.c b/drivers/net/can/dev/netlink.c index 01aacdcda260..52dae0e94858 100644 --- a/drivers/net/can/dev/netlink.c +++ b/drivers/net/can/dev/netlink.c @@ -624,9 +624,7 @@ static int can_fill_xstats(struct sk_buff *skb, const struct net_device *dev) return -EMSGSIZE; } -static int can_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int can_newlink(struct rtnl_newlink_params *params) { return -EOPNOTSUPP; } diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c index ca8811941085..5d7717c22fab 100644 --- a/drivers/net/can/vxcan.c +++ b/drivers/net/can/vxcan.c @@ -172,10 +172,13 @@ static void vxcan_setup(struct net_device *dev) /* forward declaration for rtnl_create_link() */ static struct rtnl_link_ops vxcan_link_ops; -static int vxcan_newlink(struct net *peer_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vxcan_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *peer_net = 
params->net; + struct nlattr **tb = params->tb; struct vxcan_priv *priv; struct net_device *peer; diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index f3bea196a8f9..b4834651c693 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -117,11 +117,14 @@ static void rmnet_unregister_bridge(struct rmnet_port *port) rmnet_unregister_real_device(bridge_dev); } -static int rmnet_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int rmnet_newlink(struct rtnl_newlink_params *params) { u32 data_format = RMNET_FLAGS_INGRESS_DEAGGREGATION; + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct net_device *real_dev; int mode = RMNET_EPMODE_VND; struct rmnet_endpoint *ep; diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index 642155cb8315..ea0a98a513ed 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -1614,10 +1614,13 @@ static void geneve_link_config(struct net_device *dev, geneve_change_mtu(dev, ldev_mtu - info->options_len); } -static int geneve_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int geneve_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; + struct net *net = params->net; struct geneve_config cfg = { .df = GENEVE_DF_UNSET, .use_udp6_rx_checksums = false, diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index 89a996ad8cd0..46d5734da7f3 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -1460,10 +1460,11 @@ 
static int gtp_create_sockets(struct gtp_dev *gtp, const struct nlattr *nla, #define GTP_TH_MAXLEN (sizeof(struct udphdr) + sizeof(struct gtp0_header)) #define GTP_IPV6_MAXLEN (sizeof(struct ipv6hdr) + GTP_TH_MAXLEN) -static int gtp_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int gtp_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; unsigned int role = GTP_ROLE_GGSN; struct gtp_dev *gtp; struct gtp_net *gn; diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h index 025e0c19ec25..beff25a1d6f0 100644 --- a/drivers/net/ipvlan/ipvlan.h +++ b/drivers/net/ipvlan/ipvlan.h @@ -166,9 +166,7 @@ struct ipvl_addr *ipvlan_addr_lookup(struct ipvl_port *port, void *lyr3h, void *ipvlan_get_L3_hdr(struct ipvl_port *port, struct sk_buff *skb, int *type); void ipvlan_count_rx(const struct ipvl_dev *ipvlan, unsigned int len, bool success, bool mcast); -int ipvlan_link_new(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack); +int ipvlan_link_new(struct rtnl_newlink_params *params); void ipvlan_link_delete(struct net_device *dev, struct list_head *head); void ipvlan_link_setup(struct net_device *dev); int ipvlan_link_register(struct rtnl_link_ops *ops); diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index ee2c3cf4df36..a994fd54ada4 100644 --- a/drivers/net/ipvlan/ipvlan_main.c +++ b/drivers/net/ipvlan/ipvlan_main.c @@ -532,16 +532,21 @@ static int ipvlan_nl_fillinfo(struct sk_buff *skb, return ret; } -int ipvlan_link_new(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +int ipvlan_link_new(struct rtnl_newlink_params *params) { - struct ipvl_dev *ipvlan = netdev_priv(dev); + struct 
netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; + struct ipvl_dev *ipvlan; struct ipvl_port *port; struct net_device *phy_dev; int err; u16 mode = IPVLAN_MODE_L3; + ipvlan = netdev_priv(dev); + if (!tb[IFLA_LINK]) return -EINVAL; diff --git a/drivers/net/ipvlan/ipvtap.c b/drivers/net/ipvlan/ipvtap.c index 1afc4c47be73..0b0c65390066 100644 --- a/drivers/net/ipvlan/ipvtap.c +++ b/drivers/net/ipvlan/ipvtap.c @@ -73,13 +73,13 @@ static void ipvtap_update_features(struct tap_dev *tap, netdev_update_features(vlan->dev); } -static int ipvtap_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipvtap_newlink(struct rtnl_newlink_params *params) { - struct ipvtap_dev *vlantap = netdev_priv(dev); + struct net_device *dev = params->dev; + struct ipvtap_dev *vlantap; int err; + vlantap = netdev_priv(dev); INIT_LIST_HEAD(&vlantap->tap.queue_list); /* Since macvlan supports all offloads by default, make @@ -97,7 +97,7 @@ static int ipvtap_newlink(struct net *src_net, struct net_device *dev, /* Don't put anything that may fail after macvlan_common_newlink * because we can't undo what it does. 
*/ - err = ipvlan_link_new(src_net, dev, tb, data, extack); + err = ipvlan_link_new(params); if (err) { netdev_rx_handler_unregister(dev); return err; diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 1bc1e5993f56..9da111a6629c 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -4141,17 +4141,22 @@ static int macsec_add_dev(struct net_device *dev, sci_t sci, u8 icv_len) static struct lock_class_key macsec_netdev_addr_lock_key; -static int macsec_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int macsec_newlink(struct rtnl_newlink_params *params) { - struct macsec_dev *macsec = macsec_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; + struct net *net = params->net; rx_handler_func_t *rx_handler; u8 icv_len = MACSEC_DEFAULT_ICV_LEN; struct net_device *real_dev; + struct macsec_dev *macsec; int err, mtu; sci_t sci; + macsec = macsec_priv(dev); + if (!tb[IFLA_LINK]) return -EINVAL; real_dev = __dev_get_by_index(net, nla_get_u32(tb[IFLA_LINK])); diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index fed4fe2a4748..1915f54bd35a 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -1565,11 +1565,10 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev, } EXPORT_SYMBOL_GPL(macvlan_common_newlink); -static int macvlan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int macvlan_newlink(struct rtnl_newlink_params *params) { - return macvlan_common_newlink(src_net, dev, tb, data, extack); + return macvlan_common_newlink(params->net, params->dev, params->tb, + params->data, params->extack); } void macvlan_dellink(struct net_device *dev, struct list_head *head) diff --git a/drivers/net/macvtap.c 
b/drivers/net/macvtap.c index 29a5929d48e5..e5fd8a147310 100644 --- a/drivers/net/macvtap.c +++ b/drivers/net/macvtap.c @@ -77,13 +77,13 @@ static void macvtap_update_features(struct tap_dev *tap, netdev_update_features(vlan->dev); } -static int macvtap_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int macvtap_newlink(struct rtnl_newlink_params *params) { - struct macvtap_dev *vlantap = netdev_priv(dev); + struct net_device *dev = params->dev; + struct macvtap_dev *vlantap; int err; + vlantap = netdev_priv(dev); INIT_LIST_HEAD(&vlantap->tap.queue_list); /* Since macvlan supports all offloads by default, make @@ -105,7 +105,8 @@ static int macvtap_newlink(struct net *src_net, struct net_device *dev, /* Don't put anything that may fail after macvlan_common_newlink * because we can't undo what it does. */ - err = macvlan_common_newlink(src_net, dev, tb, data, extack); + err = macvlan_common_newlink(params->net, dev, params->tb, params->data, + params->extack); if (err) { netdev_rx_handler_unregister(dev); return err; diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c index c1d881dc6409..f5527bb533ab 100644 --- a/drivers/net/netkit.c +++ b/drivers/net/netkit.c @@ -327,10 +327,13 @@ static int netkit_validate(struct nlattr *tb[], struct nlattr *data[], static struct rtnl_link_ops netkit_link_ops; -static int netkit_new_link(struct net *peer_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int netkit_new_link(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *peer_net = params->net; + struct nlattr **tb = params->tb; struct nlattr *peer_tb[IFLA_MAX + 1], **tbp = tb, *attr; enum netkit_action policy_prim = NETKIT_PASS; enum netkit_action policy_peer = NETKIT_PASS; diff --git 
a/drivers/net/pfcp.c b/drivers/net/pfcp.c index 69434fd13f96..cb936da99674 100644 --- a/drivers/net/pfcp.c +++ b/drivers/net/pfcp.c @@ -184,14 +184,15 @@ static int pfcp_add_sock(struct pfcp_dev *pfcp) return PTR_ERR_OR_ZERO(pfcp->sock); } -static int pfcp_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int pfcp_newlink(struct rtnl_newlink_params *params) { - struct pfcp_dev *pfcp = netdev_priv(dev); + struct net_device *dev = params->dev; + struct net *net = params->net; + struct pfcp_dev *pfcp; struct pfcp_net *pn; int err; + pfcp = netdev_priv(dev); pfcp->net = net; err = pfcp_add_sock(pfcp); diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index 4583e15ad03a..5b58e7bb4e7b 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -1303,10 +1303,12 @@ static int ppp_nl_validate(struct nlattr *tb[], struct nlattr *data[], return 0; } -static int ppp_nl_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ppp_nl_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct ppp_config conf = { .unit = -1, .ifname_is_set = true, diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c index 69ea2c3c76bf..820c655249f5 100644 --- a/drivers/net/team/team_core.c +++ b/drivers/net/team/team_core.c @@ -2207,10 +2207,11 @@ static void team_setup(struct net_device *dev) dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX; } -static int team_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int team_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + 
struct nlattr **tb = params->tb; + if (tb[IFLA_ADDRESS] == NULL) eth_hw_addr_random(dev); diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 01251868a9c2..04229c07023d 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -1765,10 +1765,13 @@ static int veth_init_queues(struct net_device *dev, struct nlattr *tb[]) return 0; } -static int veth_newlink(struct net *peer_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int veth_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *peer_net = params->net; + struct nlattr **tb = params->tb; int err; struct net_device *peer; struct veth_priv *priv; diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c index ca81b212a246..9a21bfc5bcc7 100644 --- a/drivers/net/vrf.c +++ b/drivers/net/vrf.c @@ -1677,16 +1677,19 @@ static void vrf_dellink(struct net_device *dev, struct list_head *head) unregister_netdevice_queue(dev, head); } -static int vrf_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vrf_newlink(struct rtnl_newlink_params *params) { - struct net_vrf *vrf = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; struct netns_vrf *nn_vrf; + struct net_vrf *vrf; bool *add_fib_rules; struct net *net; int err; + vrf = netdev_priv(dev); + if (!data || !data[IFLA_VRF_TABLE]) { NL_SET_ERR_MSG(extack, "VRF table id is missing"); return -EINVAL; diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index 0c356e0a61ef..b084adb6d319 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -4393,10 +4393,13 @@ static int vxlan_nl2conf(struct nlattr *tb[], struct nlattr *data[], return 0; 
} -static int vxlan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vxlan_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct vxlan_config conf; int err; diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c index 6cf173a008e7..92aac080d2b5 100644 --- a/drivers/net/wireguard/device.c +++ b/drivers/net/wireguard/device.c @@ -307,13 +307,14 @@ static void wg_setup(struct net_device *dev) wg->dev = dev; } -static int wg_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int wg_newlink(struct rtnl_newlink_params *params) { - struct wg_device *wg = netdev_priv(dev); + struct net_device *dev = params->dev; + struct net *src_net = params->net; + struct wg_device *wg; int ret = -ENOMEM; + wg = netdev_priv(dev); rcu_assign_pointer(wg->creating_net, src_net); init_rwsem(&wg->static_identity.lock); mutex_init(&wg->socket_update_lock); diff --git a/drivers/net/wireless/virtual/virt_wifi.c b/drivers/net/wireless/virtual/virt_wifi.c index 4ee374080466..d64eb03e0ac8 100644 --- a/drivers/net/wireless/virtual/virt_wifi.c +++ b/drivers/net/wireless/virtual/virt_wifi.c @@ -519,13 +519,17 @@ static rx_handler_result_t virt_wifi_rx_handler(struct sk_buff **pskb) } /* Called with rtnl lock held. 
*/ -static int virt_wifi_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int virt_wifi_newlink(struct rtnl_newlink_params *params) { - struct virt_wifi_netdev_priv *priv = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct virt_wifi_netdev_priv *priv; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; int err; + priv = netdev_priv(dev); + if (!tb[IFLA_LINK]) return -EINVAL; diff --git a/drivers/net/wwan/wwan_core.c b/drivers/net/wwan/wwan_core.c index a51e2755991a..908a3db61477 100644 --- a/drivers/net/wwan/wwan_core.c +++ b/drivers/net/wwan/wwan_core.c @@ -967,15 +967,20 @@ static struct net_device *wwan_rtnl_alloc(struct nlattr *tb[], return dev; } -static int wwan_rtnl_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int wwan_rtnl_newlink(struct rtnl_newlink_params *params) { - struct wwan_device *wwandev = wwan_dev_get_by_parent(dev->dev.parent); - u32 link_id = nla_get_u32(data[IFLA_WWAN_LINK_ID]); - struct wwan_netdev_priv *priv = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct wwan_netdev_priv *priv; + struct wwan_device *wwandev; + u32 link_id; int ret; + wwandev = wwan_dev_get_by_parent(dev->dev.parent); + link_id = nla_get_u32(data[IFLA_WWAN_LINK_ID]); + priv = netdev_priv(dev); + if (IS_ERR(wwandev)) return PTR_ERR(wwandev); @@ -1064,6 +1069,11 @@ static void wwan_create_default_link(struct wwan_device *wwandev, struct net_device *dev; struct nlmsghdr *nlh; struct sk_buff *msg; + struct rtnl_newlink_params params = { + .net = &init_net, + .tb = tb, + .data = data, + }; /* Forge attributes required to create a WWAN netdev. We first * build a netlink message and then parse it. 
This looks @@ -1105,7 +1115,8 @@ static void wwan_create_default_link(struct wwan_device *wwandev, if (WARN_ON(IS_ERR(dev))) goto unlock; - if (WARN_ON(wwan_rtnl_newlink(&init_net, dev, tb, data, NULL))) { + params.dev = dev; + if (WARN_ON(wwan_rtnl_newlink(&params))) { free_netdev(dev); goto unlock; } diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h index bc0069a8b6ea..ed970b4568d1 100644 --- a/include/net/rtnetlink.h +++ b/include/net/rtnetlink.h @@ -69,6 +69,46 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh) return AF_UNSPEC; } +/** + * struct rtnl_newlink_params - parameters of rtnl_link_ops::newlink() + * + * @net: Netns of interest + * @src_net: Source netns of rtnetlink socket + * @link_net: Link netns by IFLA_LINK_NETNSID, NULL if not specified + * @peer_net: Peer netns + * @dev: The net_device being created + * @tb: IFLA_* attributes + * @data: IFLA_INFO_DATA attributes + * @extack: Netlink extended ACK + */ +struct rtnl_newlink_params { + struct net *net; + struct net *src_net; + struct net *link_net; + struct net *peer_net; + struct net_device *dev; + struct nlattr **tb; + struct nlattr **data; + struct netlink_ext_ack *extack; +}; + +/* Get effective link netns from newlink params. Generally, this is link_net + * and falls back to src_net. But for compatibility, a driver may choose to + * use dev_net(dev) instead. + */ +static inline struct net *rtnl_newlink_link_net(struct rtnl_newlink_params *p) +{ + return p->link_net ? : p->src_net; +} + +/* Get peer netns from newlink params. Fallback to link netns if peer netns is + * not specified explicitly. + */ +static inline struct net *rtnl_newlink_peer_net(struct rtnl_newlink_params *p) +{ + return p->peer_net ? 
: rtnl_newlink_link_net(p); +} + /** * struct rtnl_link_ops - rtnetlink link operations * @@ -125,11 +165,7 @@ struct rtnl_link_ops { struct nlattr *data[], struct netlink_ext_ack *extack); - int (*newlink)(struct net *src_net, - struct net_device *dev, - struct nlattr *tb[], - struct nlattr *data[], - struct netlink_ext_ack *extack); + int (*newlink)(struct rtnl_newlink_params *params); int (*changelink)(struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c index 134419667d59..26a0f0a2ce27 100644 --- a/net/8021q/vlan_netlink.c +++ b/net/8021q/vlan_netlink.c @@ -135,16 +135,21 @@ static int vlan_changelink(struct net_device *dev, struct nlattr *tb[], return 0; } -static int vlan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vlan_newlink(struct rtnl_newlink_params *params) { - struct vlan_dev_priv *vlan = vlan_dev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct net_device *real_dev; + struct vlan_dev_priv *vlan; unsigned int max_mtu; __be16 proto; int err; + vlan = vlan_dev_priv(dev); + if (!data[IFLA_VLAN_ID]) { NL_SET_ERR_MSG_MOD(extack, "VLAN id not specified"); return -EINVAL; diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c index 2758aba47a2f..5f92a25d6b26 100644 --- a/net/batman-adv/soft-interface.c +++ b/net/batman-adv/soft-interface.c @@ -1063,22 +1063,20 @@ static int batadv_softif_validate(struct nlattr *tb[], struct nlattr *data[], /** * batadv_softif_newlink() - pre-initialize and register new batadv link - * @src_net: the applicable net namespace - * @dev: network device to register - * @tb: IFLA_INFO_DATA netlink attributes - * @data: enum batadv_ifla_attrs attributes - * 
@extack: extended ACK report struct + * @params: rtnl newlink parameters * * Return: 0 if successful or error otherwise. */ -static int batadv_softif_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int batadv_softif_newlink(struct rtnl_newlink_params *params) { - struct batadv_priv *bat_priv = netdev_priv(dev); + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct batadv_priv *bat_priv; const char *algo_name; int err; + bat_priv = netdev_priv(dev); + if (data && data[IFLA_BATADV_ALGO_NAME]) { algo_name = nla_data(data[IFLA_BATADV_ALGO_NAME]); err = batadv_algo_select(bat_priv, algo_name); diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 3e0f47203f2a..362ca10607ba 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -1553,13 +1553,17 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[], return 0; } -static int br_dev_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int br_dev_newlink(struct rtnl_newlink_params *params) { - struct net_bridge *br = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; + struct net_bridge *br; int err; + br = netdev_priv(dev); + err = register_netdevice(dev); if (err) return err; diff --git a/net/caif/chnl_net.c b/net/caif/chnl_net.c index 94ad09e36df2..748e38908709 100644 --- a/net/caif/chnl_net.c +++ b/net/caif/chnl_net.c @@ -438,10 +438,10 @@ static void caif_netlink_parms(struct nlattr *data[], } } -static int ipcaif_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipcaif_newlink(struct rtnl_newlink_params *params) { + struct net_device 
*dev = params->dev; + struct nlattr **data = params->data; int ret; struct chnl_net *caifdev; ASSERT_RTNL(); diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 6a2fa030c8e0..f7c176a2f1a0 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3757,6 +3757,15 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, struct net_device *dev; char ifname[IFNAMSIZ]; int err; + struct rtnl_newlink_params params = { + .net = net, + .src_net = net, + .link_net = link_net, + .peer_net = peer_net, + .tb = tb, + .data = data, + .extack = extack, + }; if (!ops->alloc && !ops->setup) return -EOPNOTSUPP; @@ -3776,14 +3785,15 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, } dev->ifindex = ifm->ifi_index; + params.dev = dev; if (link_net) - net = link_net; + params.net = link_net; if (peer_net) - net = peer_net; + params.net = peer_net; if (ops->newlink) - err = ops->newlink(net, dev, tb, data, extack); + err = ops->newlink(&params); else err = register_netdevice(dev); if (err < 0) { diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c index b68f2f71d0e1..08d38e2e2962 100644 --- a/net/hsr/hsr_netlink.c +++ b/net/hsr/hsr_netlink.c @@ -29,10 +29,12 @@ static const struct nla_policy hsr_policy[IFLA_HSR_MAX + 1] = { /* Here, it seems a netdevice has already been allocated for us, and the * hsr_dev_setup routine has been executed. Nice!
*/ -static int hsr_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int hsr_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; enum hsr_version proto_version; unsigned char multicast_spec; u8 proto = HSR_PROTOCOL_HSR; diff --git a/net/ieee802154/6lowpan/core.c b/net/ieee802154/6lowpan/core.c index 175efd860f7b..c16c14807d87 100644 --- a/net/ieee802154/6lowpan/core.c +++ b/net/ieee802154/6lowpan/core.c @@ -129,10 +129,10 @@ static int lowpan_validate(struct nlattr *tb[], struct nlattr *data[], return 0; } -static int lowpan_newlink(struct net *src_net, struct net_device *ldev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int lowpan_newlink(struct rtnl_newlink_params *params) { + struct net_device *ldev = params->dev; + struct nlattr **tb = params->tb; struct net_device *wdev; int ret; diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index f1f31ebfc793..ecad1d88dd26 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -1389,10 +1389,11 @@ ipgre_newlink_encap_setup(struct net_device *dev, struct nlattr *data[]) return 0; } -static int ipgre_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipgre_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; struct ip_tunnel_parm_kern p; __u32 fwmark = 0; int err; @@ -1407,10 +1408,11 @@ static int ipgre_newlink(struct net *src_net, struct net_device *dev, return ip_tunnel_newlink(dev, tb, &p, fwmark); } -static int erspan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct 
netlink_ext_ack *extack) +static int erspan_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; struct ip_tunnel_parm_kern p; __u32 fwmark = 0; int err; @@ -1695,6 +1697,10 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name, LIST_HEAD(list_kill); struct ip_tunnel *t; int err; + struct rtnl_newlink_params params = { + .net = net, + .tb = tb, + }; memset(&tb, 0, sizeof(tb)); @@ -1707,7 +1713,8 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name, t = netdev_priv(dev); t->collect_md = true; - err = ipgre_newlink(net, dev, tb, NULL, NULL); + params.dev = dev; + err = ipgre_newlink(&params); if (err < 0) { free_netdev(dev); return ERR_PTR(err); diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c index f0b4419cef34..12ccbf34fb6c 100644 --- a/net/ipv4/ip_vti.c +++ b/net/ipv4/ip_vti.c @@ -575,11 +575,12 @@ static void vti_netlink_parms(struct nlattr *data[], *fwmark = nla_get_u32(data[IFLA_VTI_FWMARK]); } -static int vti_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vti_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; struct ip_tunnel_parm_kern parms; + struct nlattr **tb = params->tb; __u32 fwmark = 0; vti_netlink_parms(data, &parms, &fwmark); diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c index dc0db5895e0e..3a737ea3c2e5 100644 --- a/net/ipv4/ipip.c +++ b/net/ipv4/ipip.c @@ -436,15 +436,18 @@ static void ipip_netlink_parms(struct nlattr *data[], *fwmark = nla_get_u32(data[IFLA_IPTUN_FWMARK]); } -static int ipip_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipip_newlink(struct rtnl_newlink_params *params) { - struct ip_tunnel *t = netdev_priv(dev); +
struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; struct ip_tunnel_encap ipencap; struct ip_tunnel_parm_kern p; + struct ip_tunnel *t; __u32 fwmark = 0; + t = netdev_priv(dev); + if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { int err = ip_tunnel_encap_setup(t, &ipencap); diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 235808cfec70..3efd51f0d7d2 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -2005,15 +2005,19 @@ static int ip6gre_newlink_common(struct net *src_net, struct net_device *dev, return err; } -static int ip6gre_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ip6gre_newlink(struct rtnl_newlink_params *params) { - struct ip6_tnl *nt = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct net *net = dev_net(dev); struct ip6gre_net *ign; + struct ip6_tnl *nt; int err; + nt = netdev_priv(dev); ip6gre_netlink_parms(data, &nt->parms); ign = net_generic(net, ip6gre_net_id); @@ -2241,15 +2245,19 @@ static void ip6erspan_tap_setup(struct net_device *dev) netif_keep_dst(dev); } -static int ip6erspan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ip6erspan_newlink(struct rtnl_newlink_params *params) { - struct ip6_tnl *nt = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct net *net = dev_net(dev); struct ip6gre_net *ign; + struct ip6_tnl *nt; int err; + nt = netdev_priv(dev); ip6gre_netlink_parms(data, &nt->parms); ip6erspan_set_version(data, 
&nt->parms); ign = net_generic(net, ip6gre_net_id); diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 48fd53b98972..f4bdbabc3246 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -2002,10 +2002,11 @@ static void ip6_tnl_netlink_parms(struct nlattr *data[], parms->fwmark = nla_get_u32(data[IFLA_IPTUN_FWMARK]); } -static int ip6_tnl_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ip6_tnl_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; struct net *net = dev_net(dev); struct ip6_tnl_net *ip6n = net_generic(net, ip6_tnl_net_id); struct ip_tunnel_encap ipencap; diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c index 590737c27537..79e601e629d2 100644 --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -997,10 +997,10 @@ static void vti6_netlink_parms(struct nlattr *data[], parms->fwmark = nla_get_u32(data[IFLA_VTI_FWMARK]); } -static int vti6_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vti6_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; struct net *net = dev_net(dev); struct ip6_tnl *nt; diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index 39bd8951bfca..4dd1309d1eb3 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -1550,10 +1550,11 @@ static bool ipip6_netlink_6rd_parms(struct nlattr *data[], } #endif -static int ipip6_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipip6_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; struct net *net = dev_net(dev); 
struct ip_tunnel *nt; struct ip_tunnel_encap ipencap; diff --git a/net/xfrm/xfrm_interface_core.c b/net/xfrm/xfrm_interface_core.c index 98f1e2b67c76..77d50d4af4a1 100644 --- a/net/xfrm/xfrm_interface_core.c +++ b/net/xfrm/xfrm_interface_core.c @@ -814,10 +814,11 @@ static void xfrmi_netlink_parms(struct nlattr *data[], parms->collect_md = true; } -static int xfrmi_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int xfrmi_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; struct net *net = dev_net(dev); struct xfrm_if_parms p = {}; struct xfrm_if *xi; -- 2.47.1 From shaw.leon at gmail.com Wed Dec 18 13:09:51 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Wed, 18 Dec 2024 13:09:51 -0000 Subject: [PATCH net-next v6 03/11] net: Use link netns in newlink() of rtnl_link_ops In-Reply-To: <20241218130909.2173-1-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> Message-ID: <20241218130909.2173-4-shaw.leon@gmail.com> These netdevice drivers already use the netns parameter in the newlink() callback. Convert them to use rtnl_newlink_link_net() or rtnl_newlink_peer_net() for clarity and deprecate params->net.
Signed-off-by: Xiao Liang --- drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 4 ++-- drivers/net/amt.c | 6 +++--- drivers/net/bareudp.c | 4 ++-- drivers/net/can/vxcan.c | 2 +- drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 4 ++-- drivers/net/geneve.c | 4 ++-- drivers/net/gtp.c | 4 ++-- drivers/net/ipvlan/ipvlan_main.c | 4 ++-- drivers/net/macsec.c | 4 ++-- drivers/net/macvlan.c | 5 +++-- drivers/net/macvtap.c | 4 ++-- drivers/net/netkit.c | 2 +- drivers/net/pfcp.c | 4 ++-- drivers/net/ppp/ppp_generic.c | 4 ++-- drivers/net/veth.c | 2 +- drivers/net/vxlan/vxlan_core.c | 4 ++-- drivers/net/wireguard/device.c | 4 ++-- drivers/net/wireless/virtual/virt_wifi.c | 4 ++-- drivers/net/wwan/wwan_core.c | 2 +- net/8021q/vlan_netlink.c | 4 ++-- net/hsr/hsr_netlink.c | 8 ++++---- 21 files changed, 42 insertions(+), 41 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c index 61f2457aab77..ac01650b0ac2 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c @@ -99,10 +99,10 @@ static int ipoib_changelink(struct net_device *dev, struct nlattr *tb[], static int ipoib_new_child_link(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; struct net_device *pdev; struct ipoib_dev_priv *ppriv; @@ -112,7 +112,7 @@ static int ipoib_new_child_link(struct rtnl_newlink_params *params) if (!tb[IFLA_LINK]) return -EINVAL; - pdev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + pdev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!pdev || pdev->type != ARPHRD_INFINIBAND) return -ENODEV; diff --git a/drivers/net/amt.c b/drivers/net/amt.c index 85878abb51d2..de4ea1a3f3d3 100644 --- a/drivers/net/amt.c +++ 
b/drivers/net/amt.c @@ -3163,16 +3163,16 @@ static int amt_validate(struct nlattr *tb[], struct nlattr *data[], static int amt_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; struct nlattr **tb = params->tb; - struct net *net = params->net; struct amt_dev *amt; int err = -EINVAL; amt = netdev_priv(dev); - amt->net = net; + amt->net = link_net; amt->mode = nla_get_u32(data[IFLA_AMT_MODE]); if (data[IFLA_AMT_MAX_TUNNELS] && @@ -3187,7 +3187,7 @@ static int amt_newlink(struct rtnl_newlink_params *params) amt->hash_buckets = AMT_HSIZE; amt->nr_tunnels = 0; get_random_bytes(&amt->hash_seed, sizeof(amt->hash_seed)); - amt->stream_dev = dev_get_by_index(net, + amt->stream_dev = dev_get_by_index(link_net, nla_get_u32(data[IFLA_AMT_LINK])); if (!amt->stream_dev) { NL_SET_ERR_MSG_ATTR(extack, tb[IFLA_AMT_LINK], diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index 4c2a50bbf7c0..1fe5dcae38f5 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -700,11 +700,11 @@ static void bareudp_dellink(struct net_device *dev, struct list_head *head) static int bareudp_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; struct nlattr **tb = params->tb; - struct net *net = params->net; struct bareudp_conf conf; int err; @@ -712,7 +712,7 @@ static int bareudp_newlink(struct rtnl_newlink_params *params) if (err) return err; - err = bareudp_configure(net, dev, &conf, extack); + err = bareudp_configure(link_net, dev, &conf, extack); if (err) return err; diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c index 5d7717c22fab..e3c52c191086 100644 --- a/drivers/net/can/vxcan.c +++ b/drivers/net/can/vxcan.c @@ -174,10 
+174,10 @@ static struct rtnl_link_ops vxcan_link_ops; static int vxcan_newlink(struct rtnl_newlink_params *params) { + struct net *peer_net = rtnl_newlink_peer_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *peer_net = params->net; struct nlattr **tb = params->tb; struct vxcan_priv *priv; struct net_device *peer; diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index b4834651c693..7a6b746a3b15 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -120,10 +120,10 @@ static void rmnet_unregister_bridge(struct rmnet_port *port) static int rmnet_newlink(struct rtnl_newlink_params *params) { u32 data_format = RMNET_FLAGS_INGRESS_DEAGGREGATION; + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; struct net_device *real_dev; int mode = RMNET_EPMODE_VND; @@ -137,7 +137,7 @@ static int rmnet_newlink(struct rtnl_newlink_params *params) return -EINVAL; } - real_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + real_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) { NL_SET_ERR_MSG_MOD(extack, "link does not exist"); return -ENODEV; diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index ea0a98a513ed..3dec3e5aae79 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -1616,11 +1616,11 @@ static void geneve_link_config(struct net_device *dev, static int geneve_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; 
struct nlattr **tb = params->tb; - struct net *net = params->net; struct geneve_config cfg = { .df = GENEVE_DF_UNSET, .use_udp6_rx_checksums = false, @@ -1634,7 +1634,7 @@ static int geneve_newlink(struct rtnl_newlink_params *params) if (err) return err; - err = geneve_configure(net, dev, extack, &cfg); + err = geneve_configure(link_net, dev, extack, &cfg); if (err) return err; diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index 46d5734da7f3..50f8a0cd1d4b 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -1462,9 +1462,9 @@ static int gtp_create_sockets(struct gtp_dev *gtp, const struct nlattr *nla, static int gtp_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; unsigned int role = GTP_ROLE_GGSN; struct gtp_dev *gtp; struct gtp_net *gn; @@ -1495,7 +1495,7 @@ static int gtp_newlink(struct rtnl_newlink_params *params) gtp->restart_count = nla_get_u8_default(data[IFLA_GTP_RESTART_COUNT], 0); - gtp->net = src_net; + gtp->net = link_net; err = gtp_hashtable_new(gtp, hashsize); if (err < 0) diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index a994fd54ada4..7d19771383c7 100644 --- a/drivers/net/ipvlan/ipvlan_main.c +++ b/drivers/net/ipvlan/ipvlan_main.c @@ -534,10 +534,10 @@ static int ipvlan_nl_fillinfo(struct sk_buff *skb, int ipvlan_link_new(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; struct ipvl_dev *ipvlan; struct ipvl_port *port; @@ -550,7 +550,7 @@ int ipvlan_link_new(struct rtnl_newlink_params *params) if (!tb[IFLA_LINK]) return -EINVAL; - phy_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + phy_dev = 
__dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!phy_dev) return -ENODEV; diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 9da111a6629c..ad53a67410dc 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -4143,11 +4143,11 @@ static struct lock_class_key macsec_netdev_addr_lock_key; static int macsec_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; struct nlattr **tb = params->tb; - struct net *net = params->net; rx_handler_func_t *rx_handler; u8 icv_len = MACSEC_DEFAULT_ICV_LEN; struct net_device *real_dev; @@ -4159,7 +4159,7 @@ static int macsec_newlink(struct rtnl_newlink_params *params) if (!tb[IFLA_LINK]) return -EINVAL; - real_dev = __dev_get_by_index(net, nla_get_u32(tb[IFLA_LINK])); + real_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) return -ENODEV; if (real_dev->type != ARPHRD_ETHER) diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index 1915f54bd35a..7050a061b2b9 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -1567,8 +1567,9 @@ EXPORT_SYMBOL_GPL(macvlan_common_newlink); static int macvlan_newlink(struct rtnl_newlink_params *params) { - return macvlan_common_newlink(params->net, params->dev, params->tb, - params->data, params->extack); + return macvlan_common_newlink(rtnl_newlink_link_net(params), + params->dev, params->tb, params->data, + params->extack); } void macvlan_dellink(struct net_device *dev, struct list_head *head) diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c index e5fd8a147310..01cf1efbe4c5 100644 --- a/drivers/net/macvtap.c +++ b/drivers/net/macvtap.c @@ -105,8 +105,8 @@ static int macvtap_newlink(struct rtnl_newlink_params *params) /* Don't put anything that may fail after macvlan_common_newlink * because we can't undo what it does. 
*/ - err = macvlan_common_newlink(params->net, dev, params->tb, params->data, - params->extack); + err = macvlan_common_newlink(rtnl_newlink_link_net(params), dev, + params->tb, params->data, params->extack); if (err) { netdev_rx_handler_unregister(dev); return err; diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c index f5527bb533ab..79a2c37990fd 100644 --- a/drivers/net/netkit.c +++ b/drivers/net/netkit.c @@ -329,10 +329,10 @@ static struct rtnl_link_ops netkit_link_ops; static int netkit_new_link(struct rtnl_newlink_params *params) { + struct net *peer_net = rtnl_newlink_peer_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *peer_net = params->net; struct nlattr **tb = params->tb; struct nlattr *peer_tb[IFLA_MAX + 1], **tbp = tb, *attr; enum netkit_action policy_prim = NETKIT_PASS; diff --git a/drivers/net/pfcp.c b/drivers/net/pfcp.c index cb936da99674..e98724a71c22 100644 --- a/drivers/net/pfcp.c +++ b/drivers/net/pfcp.c @@ -186,14 +186,14 @@ static int pfcp_add_sock(struct pfcp_dev *pfcp) static int pfcp_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct net_device *dev = params->dev; - struct net *net = params->net; struct pfcp_dev *pfcp; struct pfcp_net *pn; int err; pfcp = netdev_priv(dev); - pfcp->net = net; + pfcp->net = link_net; err = pfcp_add_sock(pfcp); if (err) { diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index 5b58e7bb4e7b..316b6d01436b 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -1305,9 +1305,9 @@ static int ppp_nl_validate(struct nlattr *tb[], struct nlattr *data[], static int ppp_nl_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = 
params->tb; struct ppp_config conf = { .unit = -1, @@ -1345,7 +1345,7 @@ static int ppp_nl_newlink(struct rtnl_newlink_params *params) if (!tb[IFLA_IFNAME] || !nla_len(tb[IFLA_IFNAME]) || !*(char *)nla_data(tb[IFLA_IFNAME])) conf.ifname_is_set = false; - err = ppp_dev_configure(src_net, dev, &conf); + err = ppp_dev_configure(link_net, dev, &conf); out_unlock: mutex_unlock(&ppp_mutex); diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 04229c07023d..11ee821edcd6 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -1767,10 +1767,10 @@ static int veth_init_queues(struct net_device *dev, struct nlattr *tb[]) static int veth_newlink(struct rtnl_newlink_params *params) { + struct net *peer_net = rtnl_newlink_peer_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *peer_net = params->net; struct nlattr **tb = params->tb; int err; struct net_device *peer; diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index b084adb6d319..751da726cf56 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -4395,10 +4395,10 @@ static int vxlan_nl2conf(struct nlattr *tb[], struct nlattr *data[], static int vxlan_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; struct vxlan_config conf; int err; @@ -4407,7 +4407,7 @@ static int vxlan_newlink(struct rtnl_newlink_params *params) if (err) return err; - return __vxlan_dev_create(src_net, dev, &conf, extack); + return __vxlan_dev_create(link_net, dev, &conf, extack); } static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[], diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c index 
92aac080d2b5..b2ba9d9c6ad3 100644 --- a/drivers/net/wireguard/device.c +++ b/drivers/net/wireguard/device.c @@ -309,13 +309,13 @@ static void wg_setup(struct net_device *dev) static int wg_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct net_device *dev = params->dev; - struct net *src_net = params->net; struct wg_device *wg; int ret = -ENOMEM; wg = netdev_priv(dev); - rcu_assign_pointer(wg->creating_net, src_net); + rcu_assign_pointer(wg->creating_net, link_net); init_rwsem(&wg->static_identity.lock); mutex_init(&wg->socket_update_lock); mutex_init(&wg->device_update_lock); diff --git a/drivers/net/wireless/virtual/virt_wifi.c b/drivers/net/wireless/virtual/virt_wifi.c index d64eb03e0ac8..5e7c7a1d7d5f 100644 --- a/drivers/net/wireless/virtual/virt_wifi.c +++ b/drivers/net/wireless/virtual/virt_wifi.c @@ -521,10 +521,10 @@ static rx_handler_result_t virt_wifi_rx_handler(struct sk_buff **pskb) /* Called with rtnl lock held. */ static int virt_wifi_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct virt_wifi_netdev_priv *priv; - struct net *src_net = params->net; struct nlattr **tb = params->tb; int err; @@ -536,7 +536,7 @@ static int virt_wifi_newlink(struct rtnl_newlink_params *params) netif_carrier_off(dev); priv->upperdev = dev; - priv->lowerdev = __dev_get_by_index(src_net, + priv->lowerdev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!priv->lowerdev) diff --git a/drivers/net/wwan/wwan_core.c b/drivers/net/wwan/wwan_core.c index 908a3db61477..06a2172d1856 100644 --- a/drivers/net/wwan/wwan_core.c +++ b/drivers/net/wwan/wwan_core.c @@ -1070,7 +1070,7 @@ static void wwan_create_default_link(struct wwan_device *wwandev, struct nlmsghdr *nlh; struct sk_buff *msg; struct rtnl_newlink_params params = { - .net = &init_net, + .src_net = &init_net, 
.tb = tb, .data = data, }; diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c index 26a0f0a2ce27..0a9930017bba 100644 --- a/net/8021q/vlan_netlink.c +++ b/net/8021q/vlan_netlink.c @@ -137,10 +137,10 @@ static int vlan_changelink(struct net_device *dev, struct nlattr *tb[], static int vlan_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; struct net_device *real_dev; struct vlan_dev_priv *vlan; @@ -160,7 +160,7 @@ static int vlan_newlink(struct rtnl_newlink_params *params) return -EINVAL; } - real_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + real_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) { NL_SET_ERR_MSG_MOD(extack, "link does not exist"); return -ENODEV; diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c index 08d38e2e2962..9bc564e81827 100644 --- a/net/hsr/hsr_netlink.c +++ b/net/hsr/hsr_netlink.c @@ -31,10 +31,10 @@ static const struct nla_policy hsr_policy[IFLA_HSR_MAX + 1] = { */ static int hsr_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; enum hsr_version proto_version; unsigned char multicast_spec; u8 proto = HSR_PROTOCOL_HSR; @@ -48,7 +48,7 @@ static int hsr_newlink(struct rtnl_newlink_params *params) NL_SET_ERR_MSG_MOD(extack, "Slave1 device not specified"); return -EINVAL; } - link[0] = __dev_get_by_index(src_net, + link[0] = __dev_get_by_index(link_net, nla_get_u32(data[IFLA_HSR_SLAVE1])); if (!link[0]) { NL_SET_ERR_MSG_MOD(extack, "Slave1 does not exist"); @@ -58,7 +58,7 @@ static int hsr_newlink(struct rtnl_newlink_params 
*params) NL_SET_ERR_MSG_MOD(extack, "Slave2 device not specified"); return -EINVAL; } - link[1] = __dev_get_by_index(src_net, + link[1] = __dev_get_by_index(link_net, nla_get_u32(data[IFLA_HSR_SLAVE2])); if (!link[1]) { NL_SET_ERR_MSG_MOD(extack, "Slave2 does not exist"); @@ -71,7 +71,7 @@ static int hsr_newlink(struct rtnl_newlink_params *params) } if (data[IFLA_HSR_INTERLINK]) - interlink = __dev_get_by_index(src_net, + interlink = __dev_get_by_index(link_net, nla_get_u32(data[IFLA_HSR_INTERLINK])); if (interlink && interlink == link[0]) { -- 2.47.1 From shaw.leon at gmail.com Wed Dec 18 13:09:59 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Wed, 18 Dec 2024 13:09:59 -0000 Subject: [PATCH net-next v6 04/11] ieee802154: 6lowpan: Use link netns in newlink() of rtnl_link_ops In-Reply-To: <20241218130909.2173-1-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> Message-ID: <20241218130909.2173-5-shaw.leon@gmail.com> When link_net is set, use it as link netns instead of dev_net(). This prepares for rtnetlink core to create device in target netns directly, in which case the two namespaces may be different. Signed-off-by: Xiao Liang --- net/ieee802154/6lowpan/core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/ieee802154/6lowpan/core.c b/net/ieee802154/6lowpan/core.c index c16c14807d87..65a5c61cf38c 100644 --- a/net/ieee802154/6lowpan/core.c +++ b/net/ieee802154/6lowpan/core.c @@ -143,7 +143,8 @@ static int lowpan_newlink(struct rtnl_newlink_params *params) if (!tb[IFLA_LINK]) return -EINVAL; /* find and hold wpan device */ - wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK])); + wdev = dev_get_by_index(params->link_net ? 
: dev_net(ldev), + nla_get_u32(tb[IFLA_LINK])); if (!wdev) return -ENODEV; if (wdev->type != ARPHRD_IEEE802154) { -- 2.47.1 From shaw.leon at gmail.com Wed Dec 18 13:10:08 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Wed, 18 Dec 2024 13:10:08 -0000 Subject: [PATCH net-next v6 05/11] net: ip_tunnel: Use link netns in newlink() of rtnl_link_ops In-Reply-To: <20241218130909.2173-1-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> Message-ID: <20241218130909.2173-6-shaw.leon@gmail.com> When link_net is set, use it as link netns instead of dev_net(). This prepares for rtnetlink core to create device in target netns directly, in which case the two namespaces may be different. Convert common ip_tunnel_newlink() to accept an extra link netns argument. Don't overwrite ip_tunnel.net in ip_tunnel_init(). Signed-off-by: Xiao Liang --- include/net/ip_tunnels.h | 5 +++-- net/ipv4/ip_gre.c | 8 +++++--- net/ipv4/ip_tunnel.c | 10 ++++++---- net/ipv4/ip_vti.c | 3 ++- net/ipv4/ipip.c | 3 ++- 5 files changed, 18 insertions(+), 11 deletions(-) diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index 1aa31bdb2b31..ae1f2dda4533 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -406,8 +406,9 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb, bool log_ecn_error); int ip_tunnel_changelink(struct net_device *dev, struct nlattr *tb[], struct ip_tunnel_parm_kern *p, __u32 fwmark); -int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[], - struct ip_tunnel_parm_kern *p, __u32 fwmark); +int ip_tunnel_newlink(struct net *net, struct net_device *dev, + struct nlattr *tb[], struct ip_tunnel_parm_kern *p, + __u32 fwmark); void ip_tunnel_setup(struct net_device *dev, unsigned int net_id); bool ip_tunnel_netlink_encap_parms(struct nlattr *data[], diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index ecad1d88dd26..bae80bb7839a 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -1405,7 +1405,8 @@ 
static int ipgre_newlink(struct rtnl_newlink_params *params) err = ipgre_netlink_parms(dev, data, tb, &p, &fwmark); if (err < 0) return err; - return ip_tunnel_newlink(dev, tb, &p, fwmark); + return ip_tunnel_newlink(params->link_net ? : dev_net(dev), dev, tb, &p, + fwmark); } static int erspan_newlink(struct rtnl_newlink_params *params) @@ -1424,7 +1425,8 @@ static int erspan_newlink(struct rtnl_newlink_params *params) err = erspan_netlink_parms(dev, data, tb, &p, &fwmark); if (err) return err; - return ip_tunnel_newlink(dev, tb, &p, fwmark); + return ip_tunnel_newlink(params->link_net ? : dev_net(dev), dev, tb, &p, + fwmark); } static int ipgre_changelink(struct net_device *dev, struct nlattr *tb[], @@ -1698,7 +1700,7 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name, struct ip_tunnel *t; int err; struct rtnl_newlink_params params = { - .net = net, + .src_net = net, .tb = tb, }; diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 25505f9b724c..952d2241c9b1 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -1213,11 +1213,11 @@ void ip_tunnel_delete_nets(struct list_head *net_list, unsigned int id, } EXPORT_SYMBOL_GPL(ip_tunnel_delete_nets); -int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[], - struct ip_tunnel_parm_kern *p, __u32 fwmark) +int ip_tunnel_newlink(struct net *net, struct net_device *dev, + struct nlattr *tb[], struct ip_tunnel_parm_kern *p, + __u32 fwmark) { struct ip_tunnel *nt; - struct net *net = dev_net(dev); struct ip_tunnel_net *itn; int mtu; int err; @@ -1326,7 +1326,9 @@ int ip_tunnel_init(struct net_device *dev) } tunnel->dev = dev; - tunnel->net = dev_net(dev); + if (!tunnel->net) + tunnel->net = dev_net(dev); + strscpy(tunnel->parms.name, dev->name); iph->version = 4; iph->ihl = 5; diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c index 12ccbf34fb6c..98752b4d28ad 100644 --- a/net/ipv4/ip_vti.c +++ b/net/ipv4/ip_vti.c @@ -584,7 +584,8 @@ static int vti_newlink(struct 
rtnl_newlink_params *params) __u32 fwmark = 0; vti_netlink_parms(data, &parms, &fwmark); - return ip_tunnel_newlink(dev, tb, &parms, fwmark); + return ip_tunnel_newlink(params->link_net ? : dev_net(dev), dev, tb, + &parms, fwmark); } static int vti_changelink(struct net_device *dev, struct nlattr *tb[], diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c index 3a737ea3c2e5..c65c8b0e838f 100644 --- a/net/ipv4/ipip.c +++ b/net/ipv4/ipip.c @@ -456,7 +456,8 @@ static int ipip_newlink(struct rtnl_newlink_params *params) } ipip_netlink_parms(data, &p, &t->collect_md, &fwmark); - return ip_tunnel_newlink(dev, tb, &p, fwmark); + return ip_tunnel_newlink(params->link_net ? : dev_net(dev), dev, tb, &p, + fwmark); } static int ipip_changelink(struct net_device *dev, struct nlattr *tb[], -- 2.47.1 From shaw.leon at gmail.com Wed Dec 18 13:10:17 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Wed, 18 Dec 2024 13:10:17 -0000 Subject: [PATCH net-next v6 06/11] net: ipv6: Use link netns in newlink() of rtnl_link_ops In-Reply-To: <20241218130909.2173-1-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> Message-ID: <20241218130909.2173-7-shaw.leon@gmail.com> When link_net is set, use it as link netns instead of dev_net(). This prepares for rtnetlink core to create device in target netns directly, in which case the two namespaces may be different. Set correct netns in priv before registering device, and avoid overwriting it in ndo_init() path. 
Signed-off-by: Xiao Liang --- net/ipv6/ip6_gre.c | 22 ++++++++++++---------- net/ipv6/ip6_tunnel.c | 13 ++++++++----- net/ipv6/ip6_vti.c | 10 ++++++---- net/ipv6/sit.c | 11 +++++++---- 4 files changed, 33 insertions(+), 23 deletions(-) diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 3efd51f0d7d2..1d47c229068d 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -1498,7 +1498,8 @@ static int ip6gre_tunnel_init_common(struct net_device *dev) tunnel = netdev_priv(dev); tunnel->dev = dev; - tunnel->net = dev_net(dev); + if (!tunnel->net) + tunnel->net = dev_net(dev); strcpy(tunnel->parms.name, dev->name); ret = dst_cache_init(&tunnel->dst_cache, GFP_KERNEL); @@ -1882,7 +1883,8 @@ static int ip6erspan_tap_init(struct net_device *dev) tunnel = netdev_priv(dev); tunnel->dev = dev; - tunnel->net = dev_net(dev); + if (!tunnel->net) + tunnel->net = dev_net(dev); strcpy(tunnel->parms.name, dev->name); ret = dst_cache_init(&tunnel->dst_cache, GFP_KERNEL); @@ -1971,7 +1973,7 @@ static bool ip6gre_netlink_encap_parms(struct nlattr *data[], return ret; } -static int ip6gre_newlink_common(struct net *src_net, struct net_device *dev, +static int ip6gre_newlink_common(struct net *link_net, struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) { @@ -1992,7 +1994,7 @@ static int ip6gre_newlink_common(struct net *src_net, struct net_device *dev, eth_hw_addr_random(dev); nt->dev = dev; - nt->net = dev_net(dev); + nt->net = link_net; err = register_netdevice(dev); if (err) @@ -2010,13 +2012,13 @@ static int ip6gre_newlink(struct rtnl_newlink_params *params) struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; - struct net *net = dev_net(dev); struct ip6gre_net *ign; struct ip6_tnl *nt; + struct net *net; int err; + net = params->link_net ? 
: dev_net(dev); nt = netdev_priv(dev); ip6gre_netlink_parms(data, &nt->parms); ign = net_generic(net, ip6gre_net_id); @@ -2029,7 +2031,7 @@ static int ip6gre_newlink(struct rtnl_newlink_params *params) return -EEXIST; } - err = ip6gre_newlink_common(src_net, dev, tb, data, extack); + err = ip6gre_newlink_common(net, dev, tb, data, extack); if (!err) { ip6gre_tnl_link_config(nt, !tb[IFLA_MTU]); ip6gre_tunnel_link_md(ign, nt); @@ -2250,13 +2252,13 @@ static int ip6erspan_newlink(struct rtnl_newlink_params *params) struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; - struct net *net = dev_net(dev); struct ip6gre_net *ign; struct ip6_tnl *nt; + struct net *net; int err; + net = params->link_net ? : dev_net(dev); nt = netdev_priv(dev); ip6gre_netlink_parms(data, &nt->parms); ip6erspan_set_version(data, &nt->parms); @@ -2270,7 +2272,7 @@ static int ip6erspan_newlink(struct rtnl_newlink_params *params) return -EEXIST; } - err = ip6gre_newlink_common(src_net, dev, tb, data, extack); + err = ip6gre_newlink_common(net, dev, tb, data, extack); if (!err) { ip6erspan_tnl_link_config(nt, !tb[IFLA_MTU]); ip6erspan_tunnel_link_md(ign, nt); diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index f4bdbabc3246..cb09cc878dee 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -253,8 +253,7 @@ static void ip6_dev_free(struct net_device *dev) static int ip6_tnl_create2(struct net_device *dev) { struct ip6_tnl *t = netdev_priv(dev); - struct net *net = dev_net(dev); - struct ip6_tnl_net *ip6n = net_generic(net, ip6_tnl_net_id); + struct ip6_tnl_net *ip6n = net_generic(t->net, ip6_tnl_net_id); int err; dev->rtnl_link_ops = &ip6_link_ops; @@ -1878,7 +1877,8 @@ ip6_tnl_dev_init_gen(struct net_device *dev) int t_hlen; t->dev = dev; - t->net = dev_net(dev); + if (!t->net) + t->net = dev_net(dev); ret = 
dst_cache_init(&t->dst_cache, GFP_KERNEL); if (ret) @@ -2007,13 +2007,16 @@ static int ip6_tnl_newlink(struct rtnl_newlink_params *params) struct net_device *dev = params->dev; struct nlattr **data = params->data; struct nlattr **tb = params->tb; - struct net *net = dev_net(dev); - struct ip6_tnl_net *ip6n = net_generic(net, ip6_tnl_net_id); struct ip_tunnel_encap ipencap; + struct ip6_tnl_net *ip6n; struct ip6_tnl *nt, *t; + struct net *net; int err; + net = params->link_net ? : dev_net(dev); + ip6n = net_generic(net, ip6_tnl_net_id); nt = netdev_priv(dev); + nt->net = net; if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { err = ip6_tnl_encap_setup(nt, &ipencap); diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c index 79e601e629d2..a3108a7464c7 100644 --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -177,8 +177,7 @@ vti6_tnl_unlink(struct vti6_net *ip6n, struct ip6_tnl *t) static int vti6_tnl_create2(struct net_device *dev) { struct ip6_tnl *t = netdev_priv(dev); - struct net *net = dev_net(dev); - struct vti6_net *ip6n = net_generic(net, vti6_net_id); + struct vti6_net *ip6n = net_generic(t->net, vti6_net_id); int err; dev->rtnl_link_ops = &vti6_link_ops; @@ -925,7 +924,8 @@ static inline int vti6_dev_init_gen(struct net_device *dev) struct ip6_tnl *t = netdev_priv(dev); t->dev = dev; - t->net = dev_net(dev); + if (!t->net) + t->net = dev_net(dev); netdev_hold(dev, &t->dev_tracker, GFP_KERNEL); netdev_lockdep_set_classes(dev); return 0; @@ -1001,13 +1001,15 @@ static int vti6_newlink(struct rtnl_newlink_params *params) { struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *net = dev_net(dev); struct ip6_tnl *nt; + struct net *net; + net = params->link_net ? 
: dev_net(dev); nt = netdev_priv(dev); vti6_netlink_parms(data, &nt->parms); nt->parms.proto = IPPROTO_IPV6; + nt->net = net; if (vti6_locate(net, &nt->parms, 0)) return -EEXIST; diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index 4dd1309d1eb3..8888fc51fa0b 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -201,8 +201,7 @@ static void ipip6_tunnel_clone_6rd(struct net_device *dev, struct sit_net *sitn) static int ipip6_tunnel_create(struct net_device *dev) { struct ip_tunnel *t = netdev_priv(dev); - struct net *net = dev_net(dev); - struct sit_net *sitn = net_generic(net, sit_net_id); + struct sit_net *sitn = net_generic(t->net, sit_net_id); int err; __dev_addr_set(dev, &t->parms.iph.saddr, 4); @@ -270,6 +269,7 @@ static struct ip_tunnel *ipip6_tunnel_locate(struct net *net, nt = netdev_priv(dev); nt->parms = *parms; + nt->net = net; if (ipip6_tunnel_create(dev) < 0) goto failed_free; @@ -1449,7 +1449,8 @@ static int ipip6_tunnel_init(struct net_device *dev) int err; tunnel->dev = dev; - tunnel->net = dev_net(dev); + if (!tunnel->net) + tunnel->net = dev_net(dev); strcpy(tunnel->parms.name, dev->name); ipip6_tunnel_bind_dev(dev); @@ -1555,15 +1556,17 @@ static int ipip6_newlink(struct rtnl_newlink_params *params) struct net_device *dev = params->dev; struct nlattr **data = params->data; struct nlattr **tb = params->tb; - struct net *net = dev_net(dev); struct ip_tunnel *nt; struct ip_tunnel_encap ipencap; #ifdef CONFIG_IPV6_SIT_6RD struct ip_tunnel_6rd ip6rd; #endif + struct net *net; int err; + net = params->link_net ? 
: dev_net(dev); nt = netdev_priv(dev); + nt->net = net; if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { err = ip_tunnel_encap_setup(nt, &ipencap); -- 2.47.1 From shaw.leon at gmail.com Wed Dec 18 13:10:24 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Wed, 18 Dec 2024 13:10:24 -0000 Subject: [PATCH net-next v6 07/11] net: xfrm: Use link netns in newlink() of rtnl_link_ops In-Reply-To: <20241218130909.2173-1-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> Message-ID: <20241218130909.2173-8-shaw.leon@gmail.com> When link_net is set, use it as link netns instead of dev_net(). This prepares for rtnetlink core to create device in target netns directly, in which case the two namespaces may be different. Signed-off-by: Xiao Liang --- net/xfrm/xfrm_interface_core.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/net/xfrm/xfrm_interface_core.c b/net/xfrm/xfrm_interface_core.c index 77d50d4af4a1..d1198c63dd23 100644 --- a/net/xfrm/xfrm_interface_core.c +++ b/net/xfrm/xfrm_interface_core.c @@ -242,10 +242,9 @@ static void xfrmi_dev_free(struct net_device *dev) gro_cells_destroy(&xi->gro_cells); } -static int xfrmi_create(struct net_device *dev) +static int xfrmi_create(struct net *net, struct net_device *dev) { struct xfrm_if *xi = netdev_priv(dev); - struct net *net = dev_net(dev); struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id); int err; @@ -819,11 +818,12 @@ static int xfrmi_newlink(struct rtnl_newlink_params *params) struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *net = dev_net(dev); struct xfrm_if_parms p = {}; struct xfrm_if *xi; + struct net *net; int err; + net = params->link_net ? 
: dev_net(dev); xfrmi_netlink_parms(data, &p); if (p.collect_md) { struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id); @@ -852,7 +852,7 @@ static int xfrmi_newlink(struct rtnl_newlink_params *params) xi->net = net; xi->dev = dev; - err = xfrmi_create(dev); + err = xfrmi_create(net, dev); return err; } -- 2.47.1 From shaw.leon at gmail.com Wed Dec 18 13:10:32 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Wed, 18 Dec 2024 13:10:32 -0000 Subject: [PATCH net-next v6 08/11] rtnetlink: Remove "net" from newlink params In-Reply-To: <20241218130909.2173-1-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> Message-ID: <20241218130909.2173-9-shaw.leon@gmail.com> Now that devices have been converted to use the specific netns instead of ambiguous "net", let's remove it from newlink parameters. Signed-off-by: Xiao Liang --- include/net/rtnetlink.h | 2 -- net/core/rtnetlink.c | 6 ------ 2 files changed, 8 deletions(-) diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h index ed970b4568d1..04fc0e91af42 100644 --- a/include/net/rtnetlink.h +++ b/include/net/rtnetlink.h @@ -72,7 +72,6 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh) /** * struct rtnl_newlink_params - parameters of rtnl_link_ops::newlink() * - * @net: Netns of interest * @src_net: Source netns of rtnetlink socket * @link_net: Link netns by IFLA_LINK_NETNSID, NULL if not specified * @peer_net: Peer netns @@ -82,7 +81,6 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh) * @extack: Netlink extended ACK */ struct rtnl_newlink_params { - struct net *net; struct net *src_net; struct net *link_net; struct net *peer_net; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index f7c176a2f1a0..e33ef8a0a6d6 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3758,7 +3758,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, char ifname[IFNAMSIZ]; int err; struct rtnl_newlink_params params = { - .net = 
net, .src_net = net, .link_net = link_net, .peer_net = peer_net, @@ -3787,11 +3786,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, dev->ifindex = ifm->ifi_index; params.dev = dev; - if (link_net) - params.net = link_net; - if (peer_net) - params.net = peer_net; - if (ops->newlink) err = ops->newlink(&params); else -- 2.47.1 From shaw.leon at gmail.com Wed Dec 18 13:10:40 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Wed, 18 Dec 2024 13:10:40 -0000 Subject: [PATCH net-next v6 09/11] rtnetlink: Create link directly in target net namespace In-Reply-To: <20241218130909.2173-1-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> Message-ID: <20241218130909.2173-10-shaw.leon@gmail.com> Make rtnl_newlink_create() create device in target namespace directly. Avoid extra netns change when link netns is provided. Device drivers have been converted to be aware of link netns, that is, they no longer assume that device netns and link netns are the same when ops->newlink() is called. Signed-off-by: Xiao Liang --- net/core/rtnetlink.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index e33ef8a0a6d6..ce5bea096bac 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3776,8 +3776,8 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, name_assign_type = NET_NAME_ENUM; } - dev = rtnl_create_link(link_net ? 
: tgt_net, ifname, - name_assign_type, ops, tb, extack); + dev = rtnl_create_link(tgt_net, ifname, name_assign_type, ops, tb, + extack); if (IS_ERR(dev)) { err = PTR_ERR(dev); goto out; @@ -3798,11 +3798,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, err = rtnl_configure_link(dev, ifm, portid, nlh); if (err < 0) goto out_unregister; - if (link_net) { - err = dev_change_net_namespace(dev, tgt_net, ifname); - if (err < 0) - goto out_unregister; - } if (tb[IFLA_MASTER]) { err = do_set_master(dev, nla_get_u32(tb[IFLA_MASTER]), extack); if (err) -- 2.47.1 From shaw.leon at gmail.com Wed Dec 18 13:10:48 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Wed, 18 Dec 2024 13:10:48 -0000 Subject: [PATCH net-next v6 10/11] selftests: net: Add python context manager for netns entering In-Reply-To: <20241218130909.2173-1-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> Message-ID: <20241218130909.2173-11-shaw.leon@gmail.com> Change netns of current thread and switch back on context exit. For example: with NetNSEnter("ns1"): ip("link add dummy0 type dummy") The command will be executed in netns "ns1". 
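[Editorial sketch, not part of the posted patch] The enter/restore pattern described above can be illustrated as follows. This is a minimal sketch under stated assumptions: the class name CheckedNetNSEnter is hypothetical, and the error checking plus the CLONE_NEWNET argument are illustrative additions; the posted NetNSEnter calls setns() with nstype 0 and does not inspect the return value. Like the patch, it assumes named namespaces live under /run/netns/ and that the caller has the privileges setns() requires.

```python
import ctypes
import os

# Load the C library symbols of the running process; use_errno=True lets
# us read errno after a failed setns() call.
libc = ctypes.CDLL(None, use_errno=True)

# Namespace type flag for network namespaces (from <sched.h>).
CLONE_NEWNET = 0x40000000


class CheckedNetNSEnter:
    """Hypothetical variant of NetNSEnter that raises on setns() failure."""

    def __init__(self, ns_name):
        # Same path convention as the patch: namespaces created by
        # "ip netns add" are bind-mounted under /run/netns/.
        self.ns_path = f"/run/netns/{ns_name}"
        self.saved = None

    def _setns(self, fd):
        # setns() returns 0 on success and -1 on error with errno set.
        if libc.setns(fd, CLONE_NEWNET) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))

    def __enter__(self):
        # Keep a handle on the current thread's netns so it can be restored.
        self.saved = open("/proc/thread-self/ns/net")
        try:
            with open(self.ns_path) as ns_file:
                self._setns(ns_file.fileno())
        except OSError:
            # Do not leak the saved handle if entering failed.
            self.saved.close()
            raise
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        try:
            # Switch back to the namespace saved in __enter__.
            self._setns(self.saved.fileno())
        finally:
            self.saved.close()
```

Usage would mirror the example in the commit message, e.g. `with CheckedNetNSEnter("ns1"): ...`; a missing namespace or insufficient privileges then surfaces as an OSError instead of the body silently running in the original namespace.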
Signed-off-by: Xiao Liang --- tools/testing/selftests/net/lib/py/__init__.py | 2 +- tools/testing/selftests/net/lib/py/netns.py | 18 ++++++++++++++++++ 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/net/lib/py/__init__.py b/tools/testing/selftests/net/lib/py/__init__.py index 54d8f5eba810..e2d6c7b63019 100644 --- a/tools/testing/selftests/net/lib/py/__init__.py +++ b/tools/testing/selftests/net/lib/py/__init__.py @@ -2,7 +2,7 @@ from .consts import KSRC from .ksft import * -from .netns import NetNS +from .netns import NetNS, NetNSEnter from .nsim import * from .utils import * from .ynl import NlError, YnlFamily, EthtoolFamily, NetdevFamily, RtnlFamily diff --git a/tools/testing/selftests/net/lib/py/netns.py b/tools/testing/selftests/net/lib/py/netns.py index ecff85f9074f..8e9317044eef 100644 --- a/tools/testing/selftests/net/lib/py/netns.py +++ b/tools/testing/selftests/net/lib/py/netns.py @@ -1,9 +1,12 @@ # SPDX-License-Identifier: GPL-2.0 from .utils import ip +import ctypes import random import string +libc = ctypes.cdll.LoadLibrary('libc.so.6') + class NetNS: def __init__(self, name=None): @@ -29,3 +32,18 @@ class NetNS: def __repr__(self): return f"NetNS({self.name})" + + +class NetNSEnter: + def __init__(self, ns_name): + self.ns_path = f"/run/netns/{ns_name}" + + def __enter__(self): + self.saved = open("/proc/thread-self/ns/net") + with open(self.ns_path) as ns_file: + libc.setns(ns_file.fileno(), 0) + return self + + def __exit__(self, exc_type, exc_value, traceback): + libc.setns(self.saved.fileno(), 0) + self.saved.close() -- 2.47.1 From shaw.leon at gmail.com Wed Dec 18 13:10:57 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Wed, 18 Dec 2024 13:10:57 -0000 Subject: [PATCH net-next v6 11/11] selftests: net: Add test cases for link and peer netns In-Reply-To: <20241218130909.2173-1-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> Message-ID: 
<20241218130909.2173-12-shaw.leon@gmail.com> - Add test for creating link in another netns when a link of the same name and ifindex exists in current netns. - Add test to verify that link is created in target netns directly - no link new/del events should be generated in link netns or current netns. - Add test cases to verify that link-netns is set as expected for various drivers and combination of namespace-related parameters. Signed-off-by: Xiao Liang --- tools/testing/selftests/net/Makefile | 1 + tools/testing/selftests/net/link_netns.py | 142 ++++++++++++++++++++++ tools/testing/selftests/net/netns-name.sh | 10 ++ 3 files changed, 153 insertions(+) create mode 100755 tools/testing/selftests/net/link_netns.py diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index f09bd96cc978..cc6665212304 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -35,6 +35,7 @@ TEST_PROGS += cmsg_so_mark.sh TEST_PROGS += cmsg_so_priority.sh TEST_PROGS += cmsg_time.sh cmsg_ipv6.sh TEST_PROGS += netns-name.sh +TEST_PROGS += link_netns.py TEST_PROGS += nl_netdev.py TEST_PROGS += srv6_end_dt46_l3vpn_test.sh TEST_PROGS += srv6_end_dt4_l3vpn_test.sh diff --git a/tools/testing/selftests/net/link_netns.py b/tools/testing/selftests/net/link_netns.py new file mode 100755 index 000000000000..c4b2ddf201ff --- /dev/null +++ b/tools/testing/selftests/net/link_netns.py @@ -0,0 +1,142 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +import time + +from lib.py import ksft_run, ksft_exit, ksft_true +from lib.py import ip +from lib.py import NetNS, NetNSEnter +from lib.py import RtnlFamily + + +LINK_NETNSID = 100 + + +def test_event() -> None: + with NetNS() as ns1, NetNS() as ns2: + with NetNSEnter(str(ns2)): + rtnl = RtnlFamily() + + rtnl.ntf_subscribe("rtnlgrp-link") + + ip(f"netns set {ns2} {LINK_NETNSID}", ns=str(ns1)) + ip(f"link add netns {ns1} link-netnsid {LINK_NETNSID} dummy1 type dummy") + 
ip(f"link add netns {ns1} dummy2 type dummy", ns=str(ns2)) + + ip("link del dummy1", ns=str(ns1)) + ip("link del dummy2", ns=str(ns1)) + + time.sleep(1) + rtnl.check_ntf() + ksft_true(rtnl.async_msg_queue.empty(), + "Received unexpected link notification") + + +def validate_link_netns(netns, ifname, link_netnsid) -> bool: + link_info = ip(f"-d link show dev {ifname}", ns=netns, json=True) + if not link_info: + return False + return link_info[0].get("link_netnsid") == link_netnsid + + +def test_link_net() -> None: + configs = [ + # type, common args, type args, fallback to dev_net + ("ipvlan", "link dummy1", "", False), + ("macsec", "link dummy1", "", False), + ("macvlan", "link dummy1", "", False), + ("macvtap", "link dummy1", "", False), + ("vlan", "link dummy1", "id 100", False), + ("gre", "", "local 192.0.2.1", True), + ("vti", "", "local 192.0.2.1", True), + ("ipip", "", "local 192.0.2.1", True), + ("ip6gre", "", "local 2001:db8::1", True), + ("ip6gre", "", "local 2001:db8::1", True), + ("ip6tnl", "", "local 2001:db8::1", True), + ("vti6", "", "local 2001:db8::1", True), + ("sit", "", "local 192.0.2.1", True), + ("xfrm", "", "if_id 1", True), + ] + + with NetNS() as ns1, NetNS() as ns2, NetNS() as ns3: + net1, net2, net3 = str(ns1), str(ns2), str(ns3) + + # prepare link netnsid and a dummy link needed by certain drivers + ip(f"netns set {net3} {LINK_NETNSID}", ns=str(net2)) + ip("link add dummy1 type dummy", ns=net3) + + cases = [ + # source, "netns", "link-netns", expected link-netns + (net3, None, None, None, None), + (net3, net2, None, None, LINK_NETNSID), + (net2, None, net3, LINK_NETNSID, LINK_NETNSID), + (net1, net2, net3, LINK_NETNSID, LINK_NETNSID), + ] + + for src_net, netns, link_netns, exp1, exp2 in cases: + tgt_net = netns or src_net + for typ, cargs, targs, fb_dev_net in configs: + cmd = "link add" + if netns: + cmd += f" netns {netns}" + if link_netns: + cmd += f" link-netns {link_netns}" + cmd += f" {cargs} foo type {typ} {targs}" + ip(cmd, 
ns=src_net) + if fb_dev_net: + ksft_true(validate_link_netns(tgt_net, "foo", exp1), + f"{typ} link_netns validation failed") + else: + ksft_true(validate_link_netns(tgt_net, "foo", exp2), + f"{typ} link_netns validation failed") + ip(f"link del foo", ns=tgt_net) + + +def test_peer_net() -> None: + types = [ + "vxcan", + "netkit", + "veth", + ] + + with NetNS() as ns1, NetNS() as ns2, NetNS() as ns3, NetNS() as ns4: + net1, net2, net3, net4 = str(ns1), str(ns2), str(ns3), str(ns4) + + ip(f"netns set {net3} {LINK_NETNSID}", ns=str(net2)) + + cases = [ + # source, "netns", "link-netns", "peer netns", expected + (net1, None, None, None, None), + (net1, net2, None, None, None), + (net2, None, net3, None, LINK_NETNSID), + (net1, net2, net3, None, None), + (net2, None, None, net3, LINK_NETNSID), + (net1, net2, None, net3, LINK_NETNSID), + (net2, None, net2, net3, LINK_NETNSID), + (net1, net2, net4, net3, LINK_NETNSID), + ] + + for src_net, netns, link_netns, peer_netns, exp in cases: + tgt_net = netns or src_net + for typ in types: + cmd = "link add" + if netns: + cmd += f" netns {netns}" + if link_netns: + cmd += f" link-netns {link_netns}" + cmd += f" foo type {typ}" + if peer_netns: + cmd += f" peer netns {peer_netns}" + ip(cmd, ns=src_net) + ksft_true(validate_link_netns(tgt_net, "foo", exp), + f"{typ} peer_netns validation failed") + ip(f"link del foo", ns=tgt_net) + + +def main() -> None: + ksft_run([test_event, test_link_net, test_peer_net]) + ksft_exit() + + +if __name__ == "__main__": + main() diff --git a/tools/testing/selftests/net/netns-name.sh b/tools/testing/selftests/net/netns-name.sh index 6974474c26f3..0be1905d1f2f 100755 --- a/tools/testing/selftests/net/netns-name.sh +++ b/tools/testing/selftests/net/netns-name.sh @@ -78,6 +78,16 @@ ip -netns $NS link show dev $ALT_NAME 2> /dev/null && fail "Can still find alt-name after move" ip -netns $test_ns link del $DEV || fail +# +# Test no conflict of the same name/ifindex in different netns +# +ip -netns $NS 
link add name $DEV index 100 type dummy || fail +ip -netns $NS link add netns $test_ns name $DEV index 100 type dummy || + fail "Can create in netns without moving" +ip -netns $test_ns link show dev $DEV >> /dev/null || fail "Device not found" +ip -netns $NS link del $DEV || fail +ip -netns $test_ns link del $DEV || fail + echo -ne "$(basename $0) \t\t\t\t" if [ $RET_CODE -eq 0 ]; then echo "[ OK ]" -- 2.47.1 From kuba at kernel.org Wed Dec 18 23:38:03 2024 From: kuba at kernel.org (Jakub Kicinski) Date: Wed, 18 Dec 2024 23:38:03 -0000 Subject: [PATCH net-next v6 11/11] selftests: net: Add test cases for link and peer netns In-Reply-To: <20241218130909.2173-12-shaw.leon@gmail.com> References: <20241218130909.2173-1-shaw.leon@gmail.com> <20241218130909.2173-12-shaw.leon@gmail.com> Message-ID: <20241218153759.672b7014@kernel.org> On Wed, 18 Dec 2024 21:09:09 +0800 Xiao Liang wrote: > - Add test for creating link in another netns when a link of the same > name and ifindex exists in current netns. > - Add test to verify that link is created in target netns directly - > no link new/del events should be generated in link netns or current > netns. > - Add test cases to verify that link-netns is set as expected for > various drivers and combination of namespace-related parameters. Nice work! 
You need to make sure all the drivers the test is using are enabled by the selftest kernel config: tools/testing/selftests/net/config This may be helpful: https://github.com/linux-netdev/nipa/wiki/How-to-run-netdev-selftests-CI-style#how-to-build -- pw-bot: cr From shaw.leon at gmail.com Thu Dec 19 05:54:42 2024 From: shaw.leon at gmail.com (Xiao Liang) Date: Thu, 19 Dec 2024 05:54:42 -0000 Subject: [PATCH net-next v6 11/11] selftests: net: Add test cases for link and peer netns In-Reply-To: <20241218153759.672b7014@kernel.org> References: <20241218130909.2173-1-shaw.leon@gmail.com> <20241218130909.2173-12-shaw.leon@gmail.com> <20241218153759.672b7014@kernel.org> Message-ID: On Thu, Dec 19, 2024 at 7:38 AM Jakub Kicinski wrote: > > On Wed, 18 Dec 2024 21:09:09 +0800 Xiao Liang wrote: > > - Add test for creating link in another netns when a link of the same > > name and ifindex exists in current netns. > > - Add test to verify that link is created in target netns directly - > > no link new/del events should be generated in link netns or current > > netns. > > - Add test cases to verify that link-netns is set as expected for > > various drivers and combination of namespace-related parameters. > > Nice work! > > You need to make sure all the drivers the test is using are enabled by > the selftest kernel config: tools/testing/selftests/net/config > > This may be helpful: > https://github.com/linux-netdev/nipa/wiki/How-to-run-netdev-selftests-CI-style#how-to-build Thanks for pointing it out. And vng is really cool. I will add the missing config in the next version. > -- > pw-bot: cr From syzbot+c40f14a86aa820015153 at syzkaller.appspotmail.com Mon Dec 23 13:46:00 2024 From: syzbot+c40f14a86aa820015153 at syzkaller.appspotmail.com (syzbot) Date: Mon, 23 Dec 2024 13:46:00 -0000 Subject: [syzbot] [wireguard?] 
WARNING: locking bug in wg_packet_decrypt_worker (2) Message-ID: <676967ce.050a0220.2f3838.0097.GAE@google.com> Hello, syzbot found the following issue on: HEAD commit: eabcdba3ad40 Merge tag 'for-6.13-rc3-tag' of git://git.ker.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=10a41f44580000 kernel config: https://syzkaller.appspot.com/x/.config?x=6a2b862bf4a5409f dashboard link: https://syzkaller.appspot.com/bug?extid=c40f14a86aa820015153 compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 Unfortunately, I don't have any reproducer for this issue yet. Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/2be6996abd02/disk-eabcdba3.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/4e177f4e98c0/vmlinux-eabcdba3.xz kernel image: https://storage.googleapis.com/syzbot-assets/bbf1b6ecbf58/bzImage-eabcdba3.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+c40f14a86aa820015153 at syzkaller.appspotmail.com ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(1) WARNING: CPU: 0 PID: 5889 at kernel/locking/lockdep.c:232 hlock_class kernel/locking/lockdep.c:232 [inline] WARNING: CPU: 0 PID: 5889 at kernel/locking/lockdep.c:232 check_wait_context kernel/locking/lockdep.c:4850 [inline] WARNING: CPU: 0 PID: 5889 at kernel/locking/lockdep.c:232 __lock_acquire+0x564/0x2100 kernel/locking/lockdep.c:5176 Modules linked in: CPU: 0 UID: 0 PID: 5889 Comm: kworker/0:5 Not tainted 6.13.0-rc3-syzkaller-00073-geabcdba3ad40 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024 Workqueue: wg-crypt-wg2 wg_packet_decrypt_worker RIP: 0010:hlock_class kernel/locking/lockdep.c:232 [inline] RIP: 0010:check_wait_context kernel/locking/lockdep.c:4850 [inline] RIP: 0010:__lock_acquire+0x564/0x2100 kernel/locking/lockdep.c:5176 Code: 00 00 83 3d 21 f4 9e 0e 00 75 23 90 48 c7 c7 00 96 0a 8c 48 c7 c6 00 99 0a 8c e8 67 5d e5 
ff 48 ba 00 00 00 00 00 fc ff df 90 <0f> 0b 90 90 90 31 db 48 81 c3 c4 00 00 00 48 89 d8 48 c1 e8 03 0f RSP: 0018:ffffc9000488f450 EFLAGS: 00010046 RAX: d0f6e83a7c789700 RBX: 0000000000001201 RCX: ffff88802ede3c00 RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000041201 R08: ffffffff81601a42 R09: 1ffff110170c519a R10: dffffc0000000000 R11: ffffed10170c519b R12: ffff88802ede46c4 R13: 000000000000000a R14: 1ffff11005dbc8ea R15: ffff88802ede4750 FS: 0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b30112ff8 CR3: 000000000e736000 CR4: 0000000000350ef0 Call Trace: lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline] _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178 spin_lock_bh include/linux/spinlock.h:356 [inline] ptr_ring_consume_bh include/linux/ptr_ring.h:365 [inline] wg_packet_decrypt_worker+0xcf/0xd80 drivers/net/wireguard/receive.c:499 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa68/0x1840 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f2/0x390 kernel/kthread.c:389 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller at googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. 
If the report is already addressed, let syzbot know by replying with: #syz fix: exact-commit-title If you want to overwrite report's subsystems, reply with: #syz set subsystems: new-subsystem (See the list of subsystem names on the web dashboard) If the report is a duplicate of another one, reply with: #syz dup: exact-subject-of-another-report If you want to undo deduplication, reply with: #syz undup From syzbot+listbcd07c53ebf03869ffc2 at syzkaller.appspotmail.com Thu Dec 26 09:01:31 2024 From: syzbot+listbcd07c53ebf03869ffc2 at syzkaller.appspotmail.com (syzbot) Date: Thu, 26 Dec 2024 09:01:31 -0000 Subject: [syzbot] Monthly wireguard report (Dec 2024) Message-ID: <676d1b69.050a0220.226966.007e.GAE@google.com> Hello wireguard maintainers/developers, This is a 31-day syzbot report for the wireguard subsystem. All related reports/information can be found at: https://syzkaller.appspot.com/upstream/s/wireguard During the period, 1 new issues were detected and 0 were fixed. In total, 7 issues are still open and 19 have already been fixed. Some of the still happening issues: Ref Crashes Repro Title <1> 345 No INFO: task hung in wg_netns_pre_exit (5) https://syzkaller.appspot.com/bug?extid=f2fbf7478a35a94c8b7c <2> 98 No INFO: task hung in netdev_run_todo (4) https://syzkaller.appspot.com/bug?extid=894cca71fa925aabfdb2 <3> 2 No general protection fault in wg_packet_receive (2) https://syzkaller.appspot.com/bug?extid=c0f4a2553a2527b3fc1f --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller at googlegroups.com. To disable reminders for individual bugs, reply with the following command: #syz set no-reminders To change bug's subsystems, reply with: #syz set subsystems: new-subsystem You may send multiple commands in a single email message. 
From greearb at candelatech.com Tue Dec 3 18:25:36 2024
From: greearb at candelatech.com (Ben Greear)
Date: Tue, 03 Dec 2024 18:25:36 -0000
Subject: [PATCH] net: wireguard: Allow binding to specific ifindex
In-Reply-To: <20241203090927.GA9361@kernel.org>
References: <20241125212111.1533982-1-greearb@candelatech.com> <20241203090927.GA9361@kernel.org>
Message-ID: <0d30b5d3-d3ce-f959-e30d-d5ec57f2b2f1@candelatech.com>

On 12/3/24 01:09, Simon Horman wrote:
> On Mon, Nov 25, 2024 at 01:21:11PM -0800, greearb at candelatech.com wrote:
>> From: Ben Greear
>>
>> Which allows us to bind to VRF.
>>
>> Signed-off-by: Ben Greear
>> ---
>>
>> NOTE: Modified user-space to utilize this may be found here:
>> https://github.com/greearb/wireguard-tools-ct
>> Only the 'wg' part has been tested with this new feature as of today.
>
> ...
>
>> diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c
>> index 0414d7a6ce74..a7cb1c7c3112 100644
>> --- a/drivers/net/wireguard/socket.c
>> +++ b/drivers/net/wireguard/socket.c
>> @@ -25,7 +25,8 @@ static int send4(struct wg_device *wg, struct sk_buff *skb,
>>  		.daddr = endpoint->addr4.sin_addr.s_addr,
>>  		.fl4_dport = endpoint->addr4.sin_port,
>>  		.flowi4_mark = wg->fwmark,
>> -		.flowi4_proto = IPPROTO_UDP
>> +		.flowi4_proto = IPPROTO_UDP,
>> +		.flowi4_oif = wg->lowerdev,
>>  	};
>>  	struct rtable *rt = NULL;
>>  	struct sock *sock;
>> @@ -111,6 +112,9 @@ static int send6(struct wg_device *wg, struct sk_buff *skb,
>>  	struct sock *sock;
>>  	int ret = 0;
>>
>> +	if (wg->lowerdev)
>> +		fl.flowi6_oif = wg->lowerdev,
>
> Hi Ben,
>
> I think that the trailing ',' on the line above should be a ';'.
> As written, with a ',', the call to skb_mark_not_on_list()
> below will be included in the conditional block above.
> And this doesn't seem to be the intention of the code based on indentation.
>
> Flagged by clang-19 with -Wcomma

Thank you for noticing that, it was a bad copy-paste bug on my part. I'll submit a v2.

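For readers following the thread, the comma-versus-semicolon issue quoted above can be reproduced outside the kernel. The following is a minimal standalone sketch, not code from the patch: `mark_not_on_list()`, `send_with_comma()` and `send_with_semicolon()` are hypothetical stand-ins, with a counter replacing the real side effect of skb_mark_not_on_list().

```c
/* Hypothetical counter standing in for the effect of skb_mark_not_on_list(). */
static int mark_calls;

static void mark_not_on_list(void)
{
	mark_calls++;
}

/*
 * Buggy form: the trailing ',' is the comma operator, so the call on the
 * following line becomes part of the statement controlled by the 'if'.
 * Returns how many times mark_not_on_list() ran.
 */
static int send_with_comma(int lowerdev)
{
	int oif = 0;

	mark_calls = 0;
	if (lowerdev)
		oif = lowerdev,

	mark_not_on_list();
	(void)oif;
	return mark_calls;
}

/*
 * Fixed form: the ';' ends the conditional statement, so the call always
 * runs, which is what the indentation suggests.
 */
static int send_with_semicolon(int lowerdev)
{
	int oif = 0;

	mark_calls = 0;
	if (lowerdev)
		oif = lowerdev;

	mark_not_on_list();
	(void)oif;
	return mark_calls;
}
```

In this sketch, calling `send_with_comma(0)` skips the call entirely, while `send_with_semicolon(0)` still performs it. In the original patch that would correspond to skb_mark_not_on_list() being skipped whenever no lower device is configured.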
Thanks,
Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From greearb at candelatech.com Tue Dec 3 19:39:54 2024
From: greearb at candelatech.com (greearb at candelatech.com)
Date: Tue, 03 Dec 2024 19:39:54 -0000
Subject: [PATCH v2] net: wireguard: Allow binding to specific ifindex
Message-ID: <20241203193939.1953303-1-greearb@candelatech.com>

From: Ben Greear

Which allows us to bind to VRF.

Signed-off-by: Ben Greear
---

v2: Fix bad use of comma, semicolon now used instead.

 drivers/net/wireguard/device.h  |  1 +
 drivers/net/wireguard/netlink.c | 12 +++++++++++-
 drivers/net/wireguard/socket.c  |  8 +++++++-
 include/uapi/linux/wireguard.h  |  3 +++
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireguard/device.h b/drivers/net/wireguard/device.h
index 43c7cebbf50b..9698d9203915 100644
--- a/drivers/net/wireguard/device.h
+++ b/drivers/net/wireguard/device.h
@@ -53,6 +53,7 @@ struct wg_device {
 	atomic_t handshake_queue_len;
 	unsigned int num_peers, device_update_gen;
 	u32 fwmark;
+	int lowerdev; /* ifindex of lower level device to bind UDP transport */
 	u16 incoming_port;
 };

diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c
index f7055180ba4a..5de3d59a17b0 100644
--- a/drivers/net/wireguard/netlink.c
+++ b/drivers/net/wireguard/netlink.c
@@ -27,7 +27,8 @@ static const struct nla_policy device_policy[WGDEVICE_A_MAX + 1] = {
 	[WGDEVICE_A_FLAGS] = { .type = NLA_U32 },
 	[WGDEVICE_A_LISTEN_PORT] = { .type = NLA_U16 },
 	[WGDEVICE_A_FWMARK] = { .type = NLA_U32 },
-	[WGDEVICE_A_PEERS] = { .type = NLA_NESTED }
+	[WGDEVICE_A_PEERS] = { .type = NLA_NESTED },
+	[WGDEVICE_A_LOWERDEV] = { .type = NLA_U32 },
 };

 static const struct nla_policy peer_policy[WGPEER_A_MAX + 1] = {
@@ -232,6 +233,7 @@ static int wg_get_device_dump(struct sk_buff *skb, struct netlink_callback *cb)
 		if (nla_put_u16(skb, WGDEVICE_A_LISTEN_PORT,
 				wg->incoming_port) ||
 		    nla_put_u32(skb, WGDEVICE_A_FWMARK, wg->fwmark) ||
+		    nla_put_u32(skb, WGDEVICE_A_LOWERDEV, wg->lowerdev) ||
 		    nla_put_u32(skb, WGDEVICE_A_IFINDEX, wg->dev->ifindex) ||
 		    nla_put_string(skb, WGDEVICE_A_IFNAME, wg->dev->name))
 			goto out;
@@ -530,6 +532,14 @@ static int wg_set_device(struct sk_buff *skb, struct genl_info *info)
 			wg_socket_clear_peer_endpoint_src(peer);
 	}

+	if (info->attrs[WGDEVICE_A_LOWERDEV]) {
+		struct wg_peer *peer;
+
+		wg->lowerdev = nla_get_u32(info->attrs[WGDEVICE_A_LOWERDEV]);
+		list_for_each_entry(peer, &wg->peer_list, peer_list)
+			wg_socket_clear_peer_endpoint_src(peer);
+	}
+
 	if (info->attrs[WGDEVICE_A_LISTEN_PORT]) {
 		ret = set_port(wg,
 			nla_get_u16(info->attrs[WGDEVICE_A_LISTEN_PORT]));
diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c
index 0414d7a6ce74..7cef4b27f6ba 100644
--- a/drivers/net/wireguard/socket.c
+++ b/drivers/net/wireguard/socket.c
@@ -25,7 +25,8 @@ static int send4(struct wg_device *wg, struct sk_buff *skb,
 		.daddr = endpoint->addr4.sin_addr.s_addr,
 		.fl4_dport = endpoint->addr4.sin_port,
 		.flowi4_mark = wg->fwmark,
-		.flowi4_proto = IPPROTO_UDP
+		.flowi4_proto = IPPROTO_UDP,
+		.flowi4_oif = wg->lowerdev,
 	};
 	struct rtable *rt = NULL;
 	struct sock *sock;
@@ -111,6 +112,9 @@ static int send6(struct wg_device *wg, struct sk_buff *skb,
 	struct sock *sock;
 	int ret = 0;

+	if (wg->lowerdev)
+		fl.flowi6_oif = wg->lowerdev;
+
 	skb_mark_not_on_list(skb);
 	skb->dev = wg->dev;
 	skb->mark = wg->fwmark;
@@ -360,6 +364,7 @@ int wg_socket_init(struct wg_device *wg, u16 port)
 		.family = AF_INET,
 		.local_ip.s_addr = htonl(INADDR_ANY),
 		.local_udp_port = htons(port),
+		.bind_ifindex = wg->lowerdev,
 		.use_udp_checksums = true
 	};
 #if IS_ENABLED(CONFIG_IPV6)
@@ -369,6 +374,7 @@ int wg_socket_init(struct wg_device *wg, u16 port)
 		.local_ip6 = IN6ADDR_ANY_INIT,
 		.use_udp6_tx_checksums = true,
 		.use_udp6_rx_checksums = true,
+		.bind_ifindex = wg->lowerdev,
 		.ipv6_v6only = true
 	};
 #endif
diff --git a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h
index ae88be14c947..f3784885389a 100644
--- a/include/uapi/linux/wireguard.h
+++ b/include/uapi/linux/wireguard.h
@@ -29,6 +29,7 @@
  *    WGDEVICE_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN
  *    WGDEVICE_A_LISTEN_PORT: NLA_U16
  *    WGDEVICE_A_FWMARK: NLA_U32
+ *    WGDEVICE_A_LOWERDEV: NLA_U32
  *    WGDEVICE_A_PEERS: NLA_NESTED
  *        0: NLA_NESTED
  *            WGPEER_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN
@@ -83,6 +84,7 @@
  *    WGDEVICE_A_PRIVATE_KEY: len WG_KEY_LEN, all zeros to remove
  *    WGDEVICE_A_LISTEN_PORT: NLA_U16, 0 to choose randomly
  *    WGDEVICE_A_FWMARK: NLA_U32, 0 to disable
+ *    WGDEVICE_A_LOWERDEV: NLA_U32, ifindex to bind lower transport, 0 to disable
  *    WGDEVICE_A_PEERS: NLA_NESTED
  *        0: NLA_NESTED
  *            WGPEER_A_PUBLIC_KEY: len WG_KEY_LEN
@@ -157,6 +159,7 @@ enum wgdevice_attribute {
 	WGDEVICE_A_LISTEN_PORT,
 	WGDEVICE_A_FWMARK,
 	WGDEVICE_A_PEERS,
+	WGDEVICE_A_LOWERDEV,
 	__WGDEVICE_A_LAST
 };
 #define WGDEVICE_A_MAX (__WGDEVICE_A_LAST - 1)
--
2.42.0

From arvidjaar at gmail.com Fri Dec 20 17:16:47 2024
From: arvidjaar at gmail.com (Andrei Borzenkov)
Date: Fri, 20 Dec 2024 17:16:47 -0000
Subject: wg-quick fails with systemd resolvconf compatibility shim
Message-ID:

systemd resolvectl supports a resolvconf compatibility mode when invoked
as resolvconf. Ubuntu 24.04 apparently no longer provides the real
resolvconf:

bor@bor-Latitude-E5450:~/tmp$ dpkg-query -S /usr/sbin/resolvconf
systemd-resolved: /usr/sbin/resolvconf

This version requires the real interface and fails with the wg-quick hack:

bor@bor-Latitude-E5450:~/tmp$ sudo wg-quick up ./wg0.conf
[#] ip link add wg0 type wireguard
[#] wg setconf wg0 /dev/fd/63
[#] ip -4 address add 192.168.6.160/32 dev wg0
[#] ip link set mtu 1420 up dev wg0
[#] resolvconf -a tun.wg0 -m 0 -x
Failed to resolve interface "tun.wg0": No such device
[#] ip link delete dev wg0
bor@bor-Latitude-E5450:~/tmp$

So, wg-quick became unusable on Ubuntu. It worked fine in Ubuntu 22.04.

From fossdd at pwned.life Mon Dec 23 11:04:55 2024
From: fossdd at pwned.life (fossdd)
Date: Mon, 23 Dec 2024 11:04:55 -0000
Subject: [PATCH] wg-quick: use SUDO variable for a different sudo implementation like doas
Message-ID:

Signed-off-by: fossdd
---
 src/wg-quick/linux.bash | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/wg-quick/linux.bash b/src/wg-quick/linux.bash
index 4193ce5..7795c0b 100755
--- a/src/wg-quick/linux.bash
+++ b/src/wg-quick/linux.bash
@@ -82,7 +82,7 @@ read_bool() {
 }

 auto_su() {
-	[[ $UID == 0 ]] || exec sudo -p "$PROGRAM must be run as root. Please enter the password for %u to continue: " -- "$BASH" -- "$SELF" "${ARGS[@]}"
+	[[ $UID == 0 ]] || exec "${SUDO:-sudo}" "$BASH" -- "$SELF" "${ARGS[@]}"
 }

 add_if() {

base-commit: 13f4ac4cb74b5a833fa7f825ba785b1e5774e84f
--
2.47.1

From andreas.hasenack at canonical.com Mon Dec 23 19:46:04 2024
From: andreas.hasenack at canonical.com (Andreas Hasenack)
Date: Mon, 23 Dec 2024 19:46:04 -0000
Subject: Trying to route only IRC traffic through wireguard interface
Message-ID:

Hi,

I'm traveling, and this ISP that I'm using "on the road" decided to
block port 6697/tcp. I thought about using my existing wireguard VPN to
also route this traffic through it.

The problem is that there isn't just one IP to pick to add to
AllowedIPs, it's several, and they change according to what DNS is
resolving at that particular time.

So I thought to use policy routing. Something like:

iptables -t mangle -A OUTPUT -p tcp --dport 6697 -j MARK --set-mark 1
echo "100 wireguard" > /etc/iproute2/rt_tables.d/wireguard.conf
ip rule add fwmark 1 table wireguard
ip route add default via 10.10.12.1 dev wg0 table wireguard source 10.10.12.11

tcpdump shows this working on the local box, i.e., I see an outbound
connection to the IRC server on the wireguard interface, but it never
arrives anywhere. tcpdump on the other side of the wireguard tunnel
shows zero traffic.

I suspect wireguard locally is blocking it, because that IP is not in
AllowedIPs, but I can't confirm because this box has secure boot and I
can't enable debugfs to check the wireguard messages.

If that's the case, is the only solution to really add all IPs of this
IRC server to AllowedIPs, dynamically even perhaps?

I know I could just route everything through wireguard, but now my
interest is piqued by this particular case, and I wanted to be able to
use policy routing.
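
The suspicion above is consistent with WireGuard's documented cryptokey routing: on transmit, the destination address of the inner packet is looked up among the peers' AllowedIPs, and a packet with no matching entry is dropped instead of being sent through the tunnel. What follows is a minimal C sketch of that lookup under stated assumptions. The `allowed_ip` struct, `ipv4()`, `prefix_matches()`, `lookup_peer()` and all addresses are hypothetical illustrations, and the linear scan is a simplification, not the kernel's actual implementation in drivers/net/wireguard/allowedips.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one AllowedIPs entry (IPv4 only, host byte order). */
struct allowed_ip {
	uint32_t network;	/* network address, e.g. 10.10.12.0 */
	unsigned int cidr;	/* prefix length, 0..32 */
	int peer_id;		/* illustrative peer identifier */
};

/* Build a host-order IPv4 address from four octets. */
static uint32_t ipv4(unsigned int a, unsigned int b, unsigned int c,
		     unsigned int d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/* True when dst lies inside network/cidr; a /0 entry matches everything. */
static int prefix_matches(uint32_t dst, uint32_t network, unsigned int cidr)
{
	uint32_t mask = cidr == 0 ? 0 : 0xffffffffu << (32 - cidr);

	return (dst & mask) == (network & mask);
}

/*
 * Longest-prefix match over the table. Returns the peer id of the most
 * specific matching entry, or -1 when nothing matches, which models the
 * case where the outgoing packet is dropped.
 */
static int lookup_peer(const struct allowed_ip *table, size_t n, uint32_t dst)
{
	int best_peer = -1;
	int best_cidr = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (table[i].cidr > 32)
			continue; /* ignore malformed entries */
		if (prefix_matches(dst, table[i].network, table[i].cidr) &&
		    (int)table[i].cidr > best_cidr) {
			best_cidr = (int)table[i].cidr;
			best_peer = table[i].peer_id;
		}
	}
	return best_peer;
}
```

With only 10.10.12.0/24 in the table, a lookup for an address outside that range (203.0.113.7 is used here purely as a documentation address) returns -1, which matches the symptom of traffic leaving the local routing table but never reaching the other end. One possible direction, not verified here, is to widen the peer's AllowedIPs (for example to 0.0.0.0/0) while leaving the choice of which traffic enters the tunnel to the policy-routing rules.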