linux-rockchip

mirror of https://github.com/armbian/linux-rockchip.git synced 2026-01-06 11:08:10 -08:00

Author	SHA1	Message	Date
Tobias Waldekranz	2d5b4b3376	net: bridge: switchdev: Skip MDB replays of deferred events on offload [ Upstream commit dc489f86257cab5056e747344f17a164f63bff4b ] Before this change, generation of the list of MDB events to replay would race against the creation of new group memberships, either from the IGMP/MLD snooping logic or from user configuration. While new memberships are immediately visible to walkers of br->mdb_list, the notification of their existence to switchdev event subscribers is deferred until a later point in time. So if a replay list was generated during a time that overlapped with such a window, it would also contain a replay of the not-yet-delivered event. The driver would thus receive two copies of what the bridge internally considered to be one single event. On destruction of the bridge, only a single membership deletion event was therefore sent. As a consequence of this, drivers which reference count memberships (at least DSA), would be left with orphan groups in their hardware database when the bridge was destroyed. This is only an issue when replaying additions. While deletion events may still be pending on the deferred queue, they will already have been removed from br->mdb_list, so no duplicates can be generated in that scenario. To a user this meant that old group memberships, from a bridge in which a port was previously attached, could be reanimated (in hardware) when the port joined a new bridge, without the new bridge's knowledge. For example, on an mv88e6xxx system, create a snooping bridge and immediately add a port to it: root@infix-06-0b-00:~$ ip link add dev br0 up type bridge mcast_snooping 1 && \ > ip link set dev x3 up master br0 And then destroy the bridge: root@infix-06-0b-00:~$ ip link del dev br0 root@infix-06-0b-00:~$ mvls atu ADDRESS FID STATE Q F 0 1 2 3 4 5 6 7 8 9 a DEV:0 Marvell 88E6393X 33:33:00:00:00:6a 1 static - - 0 . . . . . . . . . . 33:33:ff:87:e4:3f 1 static - - 0 . . . . . . . . . . ff:ff:ff:ff:ff:ff 1 static - - 0 1 2 3 4 5 6 7 8 9 a root@infix-06-0b-00:~$ The two IPv6 groups remain in the hardware database because the port (x3) is notified of the host's membership twice: once via the original event and once via a replay. Since only a single delete notification is sent, the count remains at 1 when the bridge is destroyed. Then add the same port (or another port belonging to the same hardware domain) to a new bridge, this time with snooping disabled: root@infix-06-0b-00:~$ ip link add dev br1 up type bridge mcast_snooping 0 && \ > ip link set dev x3 up master br1 All multicast, including the two IPv6 groups from br0, should now be flooded, according to the policy of br1. But instead the old memberships are still active in the hardware database, causing the switch to only forward traffic to those groups towards the CPU (port 0). Eliminate the race in two steps: 1. Grab the write-side lock of the MDB while generating the replay list. This prevents new memberships from showing up while we are generating the replay list. But it leaves the scenario in which a deferred event was already generated, but not delivered, before we grabbed the lock. Therefore: 2. Make sure that no deferred version of a replay event is already enqueued to the switchdev deferred queue, before adding it to the replay list, when replaying additions. Fixes: `4f2673b3a2` ("net: bridge: add helper to replay port and host-joined mdb entries") Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-01 13:26:35 +01:00
Jakub Kicinski	d62607c3fe	net: rename reference+tracking helpers Netdev reference helpers have a dev_ prefix for historic reasons. Renaming the old helpers would be too much churn but we can rename the tracking ones which are relatively recent and should be the default for new code. Rename: dev_hold_track() -> netdev_hold() dev_put_track() -> netdev_put() dev_replace_track() -> netdev_ref_replace() Link: https://lore.kernel.org/r/20220608043955.919359-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-06-09 21:52:55 -07:00
Vladimir Oltean	ec638740fc	net: switchdev: remove lag_mod_cb from switchdev_handle_fdb_event_to_device When the switchdev_handle_fdb_event_to_device() event replication helper was created, my original thought was that FDB events on LAG interfaces should most likely be special-cased, not just replicated towards all switchdev ports beneath that LAG. So this replication helper currently does not recurse through switchdev lower interfaces of LAG bridge ports, but rather calls the lag_mod_cb() if that was provided. No switchdev driver uses this helper for FDB events on LAG interfaces yet, so that was an assumption which was yet to be tested. It is certainly usable for that purpose, as my RFC series shows: https://patchwork.kernel.org/project/netdevbpf/cover/20220210125201.2859463-1-vladimir.oltean@nxp.com/ however this approach is slightly convoluted because: - the switchdev driver gets a "dev" that isn't its own net device, but rather the LAG net device. It must call switchdev_lower_dev_find(dev) in order to get a handle of any of its own net devices (the ones that pass check_cb). - in order for FDB entries on LAG ports to be correctly refcounted per the number of switchdev ports beneath that LAG, we haven't escaped the need to iterate through the LAG's lower interfaces. Except that is now the responsibility of the switchdev driver, because the replication helper just stopped half-way. So, even though yes, FDB events on LAG bridge ports must be special-cased, in the end it's simpler to let switchdev_handle_fdb_* just iterate through the LAG port's switchdev lowers, and let the switchdev driver figure out that those physical ports are under a LAG. The switchdev_handle_fdb_event_to_device() helper takes a "foreign_dev_check" callback so it can figure out whether @dev can autonomously forward to @foreign_dev. DSA fills this method properly: if the LAG is offloaded by another port in the same tree as @dev, then it isn't foreign. If it is a software LAG, it is foreign - forwarding happens in software. Whether an interface is foreign or not decides whether the replication helper will go through the LAG's switchdev lowers or not. Since the lan966x doesn't properly fill this out, FDB events on software LAG uppers will get called. By changing lan966x_foreign_dev_check(), we can suppress them. Whereas DSA will now start receiving FDB events for its offloaded LAG uppers, so we need to return -EOPNOTSUPP, since we currently don't do the right thing for them. Cc: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 21:31:43 -08:00
Vladimir Oltean	acd8df5880	net: switchdev: avoid infinite recursion from LAG to bridge with port object handler The logic from switchdev_handle_port_obj_add_foreign() is directly adapted from switchdev_handle_fdb_event_to_device(), which already detects events on foreign interfaces and reoffloads them towards the switchdev neighbors. However, when we have a simple br0 <-> bond0 <-> swp0 topology and the switchdev_handle_port_obj_add_foreign() gets called on bond0, we get stuck into an infinite recursion: 1. bond0 does not pass check_cb(), so we attempt to find switchdev neighbor interfaces. For that, we recursively call __switchdev_handle_port_obj_add() for bond0's bridge, br0. 2. __switchdev_handle_port_obj_add() recurses through br0's lowers, essentially calling __switchdev_handle_port_obj_add() for bond0 3. Go to step 1. This happens because switchdev_handle_fdb_event_to_device() and switchdev_handle_port_obj_add_foreign() are not exactly the same. The FDB event helper special-cases LAG interfaces with its lag_mod_cb(), so this is why we don't end up in an infinite loop - because it doesn't attempt to treat LAG interfaces as potentially foreign bridge ports. The problem is solved by looking ahead through the bridge's lowers to see whether there is any switchdev interface that is foreign to the @dev we are currently processing. This stops the recursion described above at step 1: __switchdev_handle_port_obj_add(bond0) will not create another call to __switchdev_handle_port_obj_add(br0). Going one step upper should only happen when we're starting from a bridge port that has been determined to be "foreign" to the switchdev driver that passes the foreign_dev_check_cb(). Fixes: `c4076cdd21` ("net: switchdev: introduce switchdev_handle_port_obj_{add,del} for foreign interfaces") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-23 12:12:12 +00:00
Vladimir Oltean	c4076cdd21	net: switchdev: introduce switchdev_handle_port_obj_{add,del} for foreign interfaces The switchdev_handle_port_obj_add() helper is good for replicating a port object on the lower interfaces of @dev, if that object was emitted on a bridge, or on a bridge port that is a LAG. However, drivers that use this helper limit themselves to a box from which they can no longer intercept port objects notified on neighbor ports ("foreign interfaces"). One such driver is DSA, where software bridging with foreign interfaces such as standalone NICs or Wi-Fi APs is an important use case. There, a VLAN installed on a neighbor bridge port roughly corresponds to a forwarding VLAN installed on the DSA switch's CPU port. To support this use case while also making use of the benefits of the switchdev_handle_* replication helper for port objects, introduce a new variant of these functions that crawls through the neighbor ports of @dev, in search of potentially compatible switchdev ports that are interested in the event. The strategy is identical to switchdev_handle_fdb_event_to_device(): if @dev wasn't a switchdev interface, then go one step upper, and recursively call this function on the bridge that this port belongs to. At the next recursion step, __switchdev_handle_port_obj_add() will iterate through the bridge's lower interfaces. Among those, some will be switchdev interfaces, and one will be the original @dev that we came from. To prevent infinite recursion, we must suppress reentry into the original @dev, and just call the @add_cb for the switchdev_interfaces. It looks like this: br0 / \| \ / \| \ / \| \ swp0 swp1 eth0 1. __switchdev_handle_port_obj_add(eth0) -> check_cb(eth0) returns false -> eth0 has no lower interfaces -> eth0's bridge is br0 -> switchdev_lower_dev_find(br0, check_cb, foreign_dev_check_cb)) finds br0 2. __switchdev_handle_port_obj_add(br0) -> check_cb(br0) returns false -> netdev_for_each_lower_dev -> check_cb(swp0) returns true, so we don't skip this interface 3. __switchdev_handle_port_obj_add(swp0) -> check_cb(swp0) returns true, so we call add_cb(swp0) (back to netdev_for_each_lower_dev from 2) -> check_cb(swp1) returns true, so we don't skip this interface 4. __switchdev_handle_port_obj_add(swp1) -> check_cb(swp1) returns true, so we call add_cb(swp1) (back to netdev_for_each_lower_dev from 2) -> check_cb(eth0) returns false, so we skip this interface to avoid infinite recursion Note: eth0 could have been a LAG, and we don't want to suppress the recursion through its lowers if those exist, so when check_cb() returns false, we still call switchdev_lower_dev_find() to estimate whether there's anything worth a recursion beneath that LAG. Using check_cb() and foreign_dev_check_cb(), switchdev_lower_dev_find() not only figures out whether the lowers of the LAG are switchdev, but also whether they actively offload the LAG or not (whether the LAG is "foreign" to the switchdev interface or not). The port_obj_info->orig_dev is preserved across recursive calls, so switchdev drivers still know on which device was this notification originally emitted. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:04 +00:00
Vladimir Oltean	7b465f4cf3	net: switchdev: rename switchdev_lower_dev_find to switchdev_lower_dev_find_rcu switchdev_lower_dev_find() assumes RCU read-side critical section calling context, since it uses netdev_walk_all_lower_dev_rcu(). Rename it appropriately, in preparation of adding a similar iterator that assumes writer-side rtnl_mutex protection. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:04 +00:00
Minghao Chi (CGEL ZTE)	d8c2858181	net/switchdev: use struct_size over open coded arithmetic Replace zero-length array with flexible-array member and make use of the struct_size() helper in kmalloc(). For example: struct switchdev_deferred_item { ... unsigned long data[]; }; Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi (CGEL ZTE) <chi.minghao@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-10 15:37:47 +00:00
Eric Dumazet	4fc003fe03	net: switchdev: add net device refcount tracker Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-07 20:44:58 -08:00
Vladimir Oltean	716a30a97a	net: switchdev: merge switchdev_handle_fdb_{add,del}_to_device To reduce code churn, the same patch makes multiple changes, since they all touch the same lines: 1. The implementations for these two are identical, just with different function pointers. Reduce duplications and name the function pointers "mod_cb" instead of "add_cb" and "del_cb". Pass the event as argument. 2. Drop the "const" attribute from "orig_dev". If the driver needs to check whether orig_dev belongs to itself and then call_switchdev_notifiers(orig_dev, SWITCHDEV_FDB_OFFLOADED), it can't, because call_switchdev_notifiers takes a non-const struct net_device *. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-27 14:54:02 +01:00
Vladimir Oltean	957e2235e5	net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge With the introduction of explicit offloading API in switchdev in commit `2f5dc00f7a` ("net: bridge: switchdev: let drivers inform which bridge ports are offloaded"), we started having Ethernet switch drivers calling directly into a function exported by net/bridge/br_switchdev.c, which is a function exported by the bridge driver. This means that drivers that did not have an explicit dependency on the bridge before, like cpsw and am65-cpsw, now do - otherwise it is not possible to call a symbol exported by a driver that can be built as module unless you are a module too. There was an attempt to solve the dependency issue in the form of commit `b0e8181762` ("net: build all switchdev drivers as modules when the bridge is a module"). Grygorii Strashko, however, says about it: \| In my opinion, the problem is a bit bigger here than just fixing the \| build :( \| \| In case, of ^cpsw the switchdev mode is kinda optional and in many \| cases (especially for testing purposes, NFS) the multi-mac mode is \| still preferable mode. \| \| There were no such tight dependency between switchdev drivers and \| bridge core before and switchdev serviced as independent, notification \| based layer between them, so ^cpsw still can be "Y" and bridge can be \| "M". Now for mostly every kernel build configuration the CONFIG_BRIDGE \| will need to be set as "Y", or we will have to update drivers to \| support build with BRIDGE=n and maintain separate builds for \| networking vs non-networking testing. But is this enough? Wouldn't \| it cause 'chain reaction' required to add more and more "Y" options \| (like CONFIG_VLAN_8021Q)? \| \| PS. Just to be sure we on the same page - ARM builds will be forced \| (with this patch) to have CONFIG_TI_CPSW_SWITCHDEV=m and so all our \| automation testing will just fail with omap2plus_defconfig. In the light of this, it would be desirable for some configurations to avoid dependencies between switchdev drivers and the bridge, and have the switchdev mode as completely optional within the driver. Arnd Bergmann also tried to write a patch which better expressed the build time dependency for Ethernet switch drivers where the switchdev support is optional, like cpsw/am65-cpsw, and this made the drivers follow the bridge (compile as module if the bridge is a module) only if the optional switchdev support in the driver was enabled in the first place: https://patchwork.kernel.org/project/netdevbpf/patch/20210802144813.1152762-1-arnd@kernel.org/ but this still did not solve the fact that cpsw and am65-cpsw now must be built as modules when the bridge is a module - it just expressed correctly that optional dependency. But the new behavior is an apparent regression from Grygorii's perspective. So to support the use case where the Ethernet driver is built-in, NET_SWITCHDEV (a bool option) is enabled, and the bridge is a module, we need a framework that can handle the possible absence of the bridge from the running system, i.e. runtime bloatware as opposed to build-time bloatware. Luckily we already have this framework, since switchdev has been using it extensively. Events from the bridge side are transmitted to the driver side using notifier chains - this was originally done so that unrelated drivers could snoop for events emitted by the bridge towards ports that are implemented by other drivers (think of a switch driver with LAG offload that listens for switchdev events on a bonding/team interface that it offloads). There are also events which are transmitted from the driver side to the bridge side, which again are modeled using notifiers. SWITCHDEV_FDB_ADD_TO_BRIDGE is an example of this, and deals with notifying the bridge that a MAC address has been dynamically learned. So there is a precedent we can use for modeling the new framework. The difference compared to SWITCHDEV_FDB_ADD_TO_BRIDGE is that the work that the bridge needs to do when a port becomes offloaded is blocking in its nature: replay VLANs, MDBs etc. The calling context is indeed blocking (we are under rtnl_mutex), but the existing switchdev notification chain that the bridge is subscribed to is only the atomic one. So we need to subscribe the bridge to the blocking switchdev notification chain too. This patch: - keeps the driver-side perception of the switchdev_bridge_port_{,un}offload unchanged - moves the implementation of switchdev_bridge_port_{,un}offload from the bridge module into the switchdev module. - makes everybody that is subscribed to the switchdev blocking notifier chain "hear" offload & unoffload events - makes the bridge driver subscribe and handle those events - moves the bridge driver's handling of those events into 2 new functions called br_switchdev_port_{,un}offload. These functions contain in fact the core of the logic that was previously in switchdev_bridge_port_{,un}offload, just that now we go through an extra indirection layer to reach them. Unlike all the other switchdev notification structures, the structure used to carry the bridge port information, struct switchdev_notifier_brport_info, does not contain a "bool handled". This is because in the current usage pattern, we always know that a switchdev bridge port offloading event will be handled by the bridge, because the switchdev_bridge_port_offload() call was initiated by a NETDEV_CHANGEUPPER event in the first place, where info->upper_dev is a bridge. So if the bridge wasn't loaded, then the CHANGEUPPER event couldn't have happened. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 12:35:07 +01:00
Vladimir Oltean	2b0a568849	net: switchdev: fix FDB entries towards foreign ports not getting propagated to us The newly introduced switchdev_handle_fdb_{add,del}_to_device helpers solved a problem but introduced another one. They have a severe design bug: they do not propagate FDB events on foreign interfaces to us, i.e. this use case: br0 / \ / \ / \ / \ swp0 eno0 (switchdev) (foreign) when an address is learned on eno0, what is supposed to happen is that this event should also be propagated towards swp0. Somehow I managed to convince myself that this did work correctly, but obviously it does not. The trouble with foreign interfaces is that we must reach a switchdev net_device pointer through a foreign net_device that has no direct upper/lower relationship with it. So we need to do exploratory searching through the lower interfaces of the foreign net_device's bridge upper (to reach swp0 from eno0, we must check its upper, br0, for lower interfaces that pass the check_cb and foreign_dev_check_cb). This is something that the previous code did not do, it just assumed that "dev" will become a switchdev interface at some point, somehow, probably by magic. With this patch, assisted address learning on the CPU port works again in DSA: ip link add br0 type bridge ip link set swp0 master br0 ip link set eno0 master br0 ip link set br0 up [ 46.708929] mscc_felix 0000:00:00.5 swp0: Adding FDB entry towards eno0, addr 00:04:9f:05:f4:ab vid 0 as host address Fixes: `8ca07176ab` ("net: switchdev: introduce a fanout helper for SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE") Reported-by: Eric Woudstra <ericwouds@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-22 00:45:40 -07:00
Vladimir Oltean	71f4f89a03	net: switchdev: recurse into __switchdev_handle_fdb_del_to_device The difference between __switchdev_handle_fdb_del_to_device and switchdev_handle_del_to_device is that the former takes an extra orig_dev argument, while the latter starts with dev == orig_dev. We should recurse into the variant that does not lose the orig_dev along the way. This is relevant when deleting FDB entries pointing towards a bridge (dev changes to the lower interfaces, but orig_dev shouldn't). The addition helper already recurses properly, just the deletion one doesn't. Fixes: `8ca07176ab` ("net: switchdev: introduce a fanout helper for SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-21 08:00:16 -07:00
Vladimir Oltean	8ca07176ab	net: switchdev: introduce a fanout helper for SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE Currently DSA has an issue with FDB entries pointing towards the bridge in the presence of br_fdb_replay() being called at port join and leave time. In particular, each bridge port will ask for a replay for the FDB entries pointing towards the bridge when it joins, and for another replay when it leaves. This means that for example, a bridge with 4 switch ports will notify DSA 4 times of the bridge MAC address. But if the MAC address of the bridge changes during the normal runtime of the system, the bridge notifies switchdev [ once ] of the deletion of the old MAC address as a local FDB towards the bridge, and of the insertion [ again once ] of the new MAC address as a local FDB. This is a problem, because DSA keeps the old MAC address as a host FDB entry with refcount 4 (4 ports asked for it using br_fdb_replay). So the old MAC address will not be deleted. Additionally, the new MAC address will only be installed with refcount 1, and when the first switch port leaves the bridge (leaving 3 others as still members), it will delete with it the new MAC address of the bridge from the local FDB entries kept by DSA (because the br_fdb_replay call on deletion will bring the entry's refcount from 1 to 0). So the problem, really, is that the number of br_fdb_replay() calls is not matched with the refcount that a host FDB is offloaded to DSA during normal runtime. An elegant way to solve the problem would be to make the switchdev notification emitted by br_fdb_change_mac_address() result in a host FDB kept by DSA which has a refcount exactly equal to the number of ports under that bridge. Then, no matter how many DSA ports join or leave that bridge, the host FDB entry will always be deleted when there are exactly zero remaining DSA switch ports members of the bridge. To implement the proposed solution, we remember that the switchdev objects and port attributes have some helpers provided by switchdev, which can be optionally called by drivers: switchdev_handle_port_obj_{add,del} and switchdev_handle_port_attr_set. These helpers: - fan out a switchdev object/attribute emitted for the bridge towards all the lower interfaces that pass the check_cb(). - fan out a switchdev object/attribute emitted for a bridge port that is a LAG towards all the lower interfaces that pass the check_cb(). In other words, this is the model we need for the FDB events too: something that will keep an FDB entry emitted towards a physical port as it is, but translate an FDB entry emitted towards the bridge into N FDB entries, one per physical port. Of course, there are many differences between fanning out a switchdev object (VLAN) on 3 lower interfaces of a LAG and fanning out an FDB entry on 3 lower interfaces of a LAG. Intuitively, an FDB entry towards a LAG should be treated specially, because FDB entries are unicast, we can't just install the same address towards 3 destinations. It is imaginable that drivers might want to treat this case specifically, so create some methods for this case and do not recurse into the LAG lower ports, just the bridge ports. DSA also listens for FDB entries on "foreign" interfaces, aka interfaces bridged with us which are not part of our hardware domain: think an Ethernet switch bridged with a Wi-Fi AP. For those addresses, DSA installs host FDB entries. However, there we have the same problem (those host FDB entries are installed with a refcount of only 1) and an even bigger one which we did not have with FDB entries towards the bridge: br_fdb_replay() is currently not called for FDB entries on foreign interfaces, just for the physical port and for the bridge itself. So when DSA sniffs an address learned by the software bridge towards a foreign interface like an e1000 port, and then that e1000 leaves the bridge, DSA remains with the dangling host FDB address. That will be fixed separately by replaying all FDB entries and not just the ones towards the port and the bridge. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-20 07:04:27 -07:00
Vladimir Oltean	69bfac968a	net: switchdev: add a context void pointer to struct switchdev_notifier_info In the case where the driver asks for a replay of a certain type of event (port object or attribute) for a bridge port that is a LAG, it may do so because this port has just joined the LAG. But there might already be other switchdev ports in that LAG, and it is preferable that those preexisting switchdev ports do not act upon the replayed event. The solution is to add a context to switchdev events, which is NULL most of the time (when the bridge layer initiates the call) but which can be set to a value controlled by the switchdev driver when a replay is requested. The driver can then check the context to figure out if all ports within the LAG should act upon the switchdev event, or just the ones that match the context. We have to modify all switchdev_handle_* helper functions as well as the prototypes in the drivers that use these helpers too, because these helpers hide the underlying struct switchdev_notifier_info from us and there is no way to retrieve the context otherwise. The context structure will be populated and used in later patches. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-28 14:09:03 -07:00
Vladimir Oltean	dcbdf1350e	net: bridge: propagate extack through switchdev_port_attr_set The benefit is the ability to propagate errors from switchdev drivers for the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING and SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL attributes. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-14 17:38:11 -08:00
Vladimir Oltean	4c08c586ff	net: switchdev: propagate extack to port attributes When a struct switchdev_attr is notified through switchdev, there is no way to report informational messages, unlike for struct switchdev_obj. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:08:04 -08:00
Jakub Kicinski	c358f95205	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/net/can/dev.c `b552766c87` ("can: dev: prevent potential information leak in can_fill_info()") `3e77f70e73` ("can: dev: move driver related infrastructure into separate subdir") `0a042c6ec9` ("can: dev: move netlink related code into seperate file") Code move. drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c `57ac4a31c4` ("net/mlx5e: Correctly handle changing the number of queues when the interface is down") `214baf2287` ("net/mlx5e: Support HTB offload") Adjacent code changes net/switchdev/switchdev.c `20776b465c` ("net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP") `ffb68fc58e` ("net: switchdev: remove the transaction structure from port object notifiers") `bae33f2b5a` ("net: switchdev: remove the transaction structure from port attributes") Transaction parameter gets dropped otherwise keep the fix. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 17:09:31 -08:00
Masahiro Yamada	0cfd99b487	net: switchdev: use obj-$(CONFIG_NET_SWITCHDEV) form in net/Makefile CONFIG_NET_SWITCHDEV is a bool option. Change the ifeq conditional to the standard obj-$(CONFIG_NET_SWITCHDEV) form. Use obj-y in net/switchdev/Makefile because Kbuild visits this Makefile only when CONFIG_NET_SWITCHDEV=y. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20210125231659.106201-3-masahiroy@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-27 17:03:52 -08:00
Rasmus Villemoes	20776b465c	net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP It's not true that switchdev_port_obj_notify() only inspects the ->handled field of "struct switchdev_notifier_port_obj_info" if call_switchdev_blocking_notifiers() returns 0 - there's a WARN_ON() triggering for a non-zero return combined with ->handled not being true. But the real problem here is that -EOPNOTSUPP is not being properly handled. The wrapper functions switchdev_handle_port_obj_add() et al change a return value of -EOPNOTSUPP to 0, and the treatment of ->handled in switchdev_port_obj_notify() seems to be designed to change that back to -EOPNOTSUPP in case nobody actually acted on the notifier (i.e., everybody returned -EOPNOTSUPP). Currently, as soon as some device down the stack passes the check_cb() check, ->handled gets set to true, which means that switchdev_port_obj_notify() cannot actually ever return -EOPNOTSUPP. This, for example, means that the detection of hardware offload support in the MRP code is broken: switchdev_port_obj_add() used by br_mrp_switchdev_send_ring_test() always returns 0, so since the MRP code thinks the generation of MRP test frames has been offloaded, no such frames are actually put on the wire. Similarly, br_mrp_switchdev_set_ring_role() also always returns 0, causing mrp->ring_role_offloaded to be set to 1. To fix this, continue to set ->handled true if any callback returns success or any error distinct from -EOPNOTSUPP. But if all the callbacks return -EOPNOTSUPP, make sure that ->handled stays false, so the logic in switchdev_port_obj_notify() can propagate that information. Fixes: `9a9f26e8f7` ("bridge: mrp: Connect MRP API with the switchdev API") Fixes: `f30f0601eb` ("switchdev: Add helpers to aid traversal through lower devices") Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Link: https://lore.kernel.org/r/20210125124116.102928-1-rasmus.villemoes@prevas.dk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-27 16:41:59 -08:00
Vladimir Oltean	bae33f2b5a	net: switchdev: remove the transaction structure from port attributes Since the introduction of the switchdev API, port attributes were transmitted to drivers for offloading using a two-step transactional model, with a prepare phase that was supposed to catch all errors, and a commit phase that was supposed to never fail. Some classes of failures can never be avoided, like hardware access, or memory allocation. In the latter case, merely attempting to move the memory allocation to the preparation phase makes it impossible to avoid memory leaks, since commit `91cf8eceff` ("switchdev: Remove unused transaction item queue") which has removed the unused mechanism of passing on the allocated memory between one phase and another. It is time we admit that separating the preparation from the commit phase is something that is best left for the driver to decide, and not something that should be baked into the API, especially since there are no switchdev callers that depend on this. This patch removes the struct switchdev_trans member from switchdev port attribute notifier structures, and converts drivers to not look at this member. In part, this patch contains a revert of my previous commit `2e554a7a5d` ("net: dsa: propagate switchdev vlan_filtering prepare phase to drivers"). For the most part, the conversion was trivial except for: - Rocker's world implementation based on Broadcom OF-DPA had an odd implementation of ofdpa_port_attr_bridge_flags_set. The conversion was done mechanically, by pasting the implementation twice, then only keeping the code that would get executed during prepare phase on top, then only keeping the code that gets executed during the commit phase on bottom, then simplifying the resulting code until this was obtained. - DSA's offloading of STP state, bridge flags, VLAN filtering and multicast router could be converted right away. But the ageing time could not, so a shim was introduced and this was left for a further commit. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-11 16:00:57 -08:00
Vladimir Oltean	cf6def51ba	net: switchdev: delete switchdev_port_obj_add_now After the removal of the transactional model inside switchdev_port_obj_add_now, it has no added value and we can just call switchdev_port_obj_notify directly, bypassing this function. Let's delete it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-11 16:00:56 -08:00
Vladimir Oltean	ffb68fc58e	net: switchdev: remove the transaction structure from port object notifiers Since the introduction of the switchdev API, port objects were transmitted to drivers for offloading using a two-step transactional model, with a prepare phase that was supposed to catch all errors, and a commit phase that was supposed to never fail. Some classes of failures can never be avoided, like hardware access, or memory allocation. In the latter case, merely attempting to move the memory allocation to the preparation phase makes it impossible to avoid memory leaks, since commit `91cf8eceff` ("switchdev: Remove unused transaction item queue") which has removed the unused mechanism of passing on the allocated memory between one phase and another. It is time we admit that separating the preparation from the commit phase is something that is best left for the driver to decide, and not something that should be baked into the API, especially since there are no switchdev callers that depend on this. This patch removes the struct switchdev_trans member from switchdev port object notifier structures, and converts drivers to not look at this member. Where driver conversion is trivial (like in the case of the Marvell Prestera driver, NXP DPAA2 switch, TI CPSW, and Rocker drivers), it is done in this patch. Where driver conversion needs more attention (DSA, Mellanox Spectrum), the conversion is left for subsequent patches and here we only fake the prepare/commit phases at a lower level, just not in the switchdev notifier itself. Where the code has a natural structure that is best left alone as a preparation and a commit phase (as in the case of the Ocelot switch), that structure is left in place, just made to not depend upon the switchdev transactional model. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-11 16:00:56 -08:00
Tian Tao	ea6754aef2	net: switchdev: Fixed kerneldoc warning Update kernel-doc line comments to fix warnings reported by make W=1. net/switchdev/switchdev.c:413: warning: Function parameter or member 'extack' not described in 'call_switchdev_notifiers' Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-23 17:46:31 -07:00
Andrew Lunn	c8af73f0b2	net: switchdev: kerneldoc fixes Simple fixes which require no deep knowledge of the code. Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-13 17:20:40 -07:00
Masahiro Yamada	a7f7f6248d	treewide: replace '---help---' in Kconfig files with 'help' Since commit `84af7a6194` ("checkpatch: kconfig: prefer 'help' over '---help---'"), the number of '---help---' has been gradually decreasing, but there are still more than 2400 instances. This commit finishes the conversion. While I touched the lines, I also fixed the indentation. There are a variety of indentation styles found. a) 4 spaces + '---help---' b) 7 spaces + '---help---' c) 8 spaces + '---help---' d) 1 space + 1 tab + '---help---' e) 1 tab + '---help---' (correct indentation) f) 1 tab + 1 space + '---help---' g) 1 tab + 2 spaces + '---help---' In order to convert all of them to 1 tab + 'help', I ran the following commend: $ find . -name 'Kconfig' \| xargs sed -i 's/^[[:space:]]---help---/\thelp/' Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2020-06-14 01:57:21 +09:00

1 2 3 4 5 ...

144 Commits