Commit Graph

832 Commits

Author SHA1 Message Date
Vladimir Oltean
5df5661a13 net: dsa: stop overriding master's ndo_get_phys_port_name
The purpose of this override is to give the user an indication of what
the number of the CPU port is (in DSA, the CPU port is a hardware
implementation detail and not a network interface capable of traffic).

However, it has always failed (by design) at providing this information
to the user in a reliable fashion.

Prior to commit 3369afba1e ("net: Call into DSA netdevice_ops
wrappers"), the behavior was to only override this callback if it was
not provided by the DSA master.

That was its first failure: if the DSA master itself was a DSA port or a
switchdev, then the user would not see the number of the CPU port in
/sys/class/net/eth0/phys_port_name, but the number of the DSA master
port within its respective physical switch.

But that was actually ok in a way. The commit mentioned above changed
that behavior, and now overrides the master's ndo_get_phys_port_name
unconditionally. That comes with problems of its own, which are worse in
a way.

The idea is that it's typical for switchdev users to have udev rules for
consistent interface naming. These are based, among other things, on
the phys_port_name attribute. If we let the DSA switch at the bottom
to start randomly overriding ndo_get_phys_port_name with its own CPU
port, we basically lose any predictability in interface naming, or even
uniqueness, for that matter.

So, there are reasons to let DSA override the master's callback (to
provide a consistent interface, a number which has a clear meaning and
must not be interpreted according to context), and there are reasons to
not let DSA override it (it breaks udev matching for the DSA master).

But, there is an alternative method for users to retrieve the number of
the CPU port of each DSA switch in the system:

  $ devlink port
  pci/0000:00:00.5/0: type eth netdev swp0 flavour physical port 0
  pci/0000:00:00.5/2: type eth netdev swp2 flavour physical port 2
  pci/0000:00:00.5/4: type notset flavour cpu port 4
  spi/spi2.0/0: type eth netdev sw0p0 flavour physical port 0
  spi/spi2.0/1: type eth netdev sw0p1 flavour physical port 1
  spi/spi2.0/2: type eth netdev sw0p2 flavour physical port 2
  spi/spi2.0/4: type notset flavour cpu port 4
  spi/spi2.1/0: type eth netdev sw1p0 flavour physical port 0
  spi/spi2.1/1: type eth netdev sw1p1 flavour physical port 1
  spi/spi2.1/2: type eth netdev sw1p2 flavour physical port 2
  spi/spi2.1/3: type eth netdev sw1p3 flavour physical port 3
  spi/spi2.1/4: type notset flavour cpu port 4

So remove this duplicated, unreliable and troublesome method. From this
patch on, the phys_port_name attribute of the DSA master will only
contain information about itself (if at all). If the users need reliable
information about the CPU port they're probably using devlink anyway.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23 15:14:58 -07:00
Kurt Kanzenbach
85e05d263e net: dsa: of: Allow ethernet-ports as encapsulating node
Due to unified Ethernet Switch Device Tree Bindings allow for ethernet-ports as
encapsulating node as well.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 16:56:43 -07:00
Vladimir Oltean
71d4364abd net: dsa: use the ETH_MIN_MTU and ETH_DATA_LEN default values
Now that DSA supports MTU configuration, undo the effects of commit
8b1efc0f83 ("net: remove MTU limits on a few ether_setup callers") and
let DSA interfaces use the default min_mtu and max_mtu specified by
ether_setup(). This is more important for min_mtu: since DSA is
Ethernet, the minimum MTU is the same as of any other Ethernet
interface, and definitely not zero. For the max_mtu, we have a callback
through which drivers can override that, if they want to.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:35:04 -07:00
Florian Fainelli
9c0c7014f3 net: dsa: Setup dsa_netdev_ops
Now that we have all the infrastructure in place for calling into the
dsa_ptr->netdev_ops function pointers, install them when we configure
the DSA CPU/management interface and tear them down. The flow is
unchanged from before, but now we preserve equality of tests when
network device drivers do tests like dev->netdev_ops == &foo_ops which
was not the case before since we were allocating an entirely new
structure.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 16:48:22 -07:00
Vladimir Oltean
67c2404922 net: dsa: felix: create a template for the DSA tags on xmit
With this patch we try to kill 2 birds with 1 stone.

First of all, some switches that use tag_ocelot.c don't have the exact
same bitfield layout for the DSA tags. The destination ports field is
different for Seville VSC9953 for example. So the choices are to either
duplicate tag_ocelot.c into a new tag_seville.c (sub-optimal) or somehow
take into account a supposed ocelot->dest_ports_offset when packing this
field into the DSA injection header (again not ideal).

Secondly, tag_ocelot.c already needs to memset a 128-bit area to zero
and call some packing() functions of dubious performance in the
fastpath. And most of the values it needs to pack are pretty much
constant (BYPASS=1, SRC_PORT=CPU, DEST=port index). So it would be good
if we could improve that.

The proposed solution is to allocate a memory area per port at probe
time, initialize that with the statically defined bits as per chip
hardware revision, and just perform a simpler memcpy in the fastpath.

Other alternatives have been analyzed, such as:
- Create a separate tag_seville.c: too much code duplication for just 1
  bit field difference.
- Create a separate DSA_TAG_PROTO_SEVILLE under tag_ocelot.c, just like
  tag_brcm.c, which would have a separate .xmit function. Again, too
  much code duplication for just 1 bit field difference.
- Allocate the template from the init function of the tag_ocelot.c
  module, instead of from the driver: couldn't figure out a method of
  accessing the correct port template corresponding to the correct
  tagger in the .xmit function.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:40:01 -07:00
Danielle Ratson
71ad8d55f8 devlink: Replace devlink_port_attrs_set parameters with a struct
Currently, devlink_port_attrs_set accepts a long list of parameters,
that most of them are devlink port's attributes.

Use the devlink_port_attrs struct to replace the relevant parameters.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 13:15:29 -07:00
Linus Walleij
efd7fe68f0 net: dsa: tag_rtl4_a: Implement Realtek 4 byte A tag
This implements the known parts of the Realtek 4 byte
tag protocol version 0xA, as found in the RTL8366RB
DSA switch.

It is designated as protocol version 0xA as a
different Realtek 4 byte tag format with protocol
version 0x9 is known to exist in the Realtek RTL8306
chips.

The tag and switch chip lacks public documentation, so
the tag format has been reverse-engineered from
packet dumps. As only ingress traffic has been available
for analysis an egress tag has not been possible to
develop (even using educated guesses about bit fields)
so this is as far as it gets. It is not known if the
switch even supports egress tagging.

Excessive attempts to figure out the egress tag format
was made. When nothing else worked, I just tried all bit
combinations with 0xannp where a is protocol and p is
port. I looped through all values several times trying
to get a response from ping, without any positive
result.

Using just these ingress tags however, the switch
functionality is vastly improved and the packets find
their way into the destination port without any
tricky VLAN configuration. On the D-Link DIR-685 the
LAN ports now come up and respond to ping without
any command line configuration so this is a real
improvement for users.

Egress packets need to be restricted to the proper
target ports using VLAN, which the RTL8366RB DSA
switch driver already sets up.

Cc: DENG Qingfang <dqfext@gmail.com>
Cc: Mauri Sandberg <sandberg@mailfence.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:36:19 -07:00
Andrew Lunn
99bac53d06 net: dsa: tag_qca.c: Fix warning for __be16 vs u16
net/dsa/tag_qca.c:48:15: warning: incorrect type in assignment (different base types)
net/dsa/tag_qca.c:48:15:    expected unsigned short [usertype]
net/dsa/tag_qca.c:48:15:    got restricted __be16 [usertype]
net/dsa/tag_qca.c:68:13: warning: incorrect type in assignment (different base types)
net/dsa/tag_qca.c:68:13:    expected restricted __be16 [usertype] hdr
net/dsa/tag_qca.c:68:13:    got int
net/dsa/tag_qca.c:71:16: warning: restricted __be16 degrades to integer
net/dsa/tag_qca.c:81:17: warning: restricted __be16 degrades to integer

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-05 15:31:58 -07:00
Andrew Lunn
04d63f9d91 net: dsa: tag_mtk: Fix warnings for __be16
net/dsa/tag_mtk.c:84:13: warning: incorrect type in assignment (different base types)
net/dsa/tag_mtk.c:84:13:    expected restricted __be16 [usertype] hdr
net/dsa/tag_mtk.c:84:13:    got int
net/dsa/tag_mtk.c:94:17: warning: restricted __be16 degrades to integer

The result of a ntohs() is not __be16, but u16.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-05 15:31:58 -07:00
Andrew Lunn
802734ad77 net: dsa: tag_lan9303: Fix __be16 warnings
net/dsa/tag_lan9303.c:76:24: warning: incorrect type in assignment (different base types)
net/dsa/tag_lan9303.c:76:24:    expected unsigned short [usertype]
net/dsa/tag_lan9303.c:76:24:    got restricted __be16 [usertype]
net/dsa/tag_lan9303.c:80:24: warning: incorrect type in assignment (different base types)
net/dsa/tag_lan9303.c:80:24:    expected unsigned short [usertype]
net/dsa/tag_lan9303.c:80:24:    got restricted __be16 [usertype]
net/dsa/tag_lan9303.c:106:31: warning: restricted __be16 degrades to integer
net/dsa/tag_lan9303.c:111:24: warning: cast to restricted __be16
net/dsa/tag_lan9303.c:111:24: warning: cast to restricted __be16
net/dsa/tag_lan9303.c:111:24: warning: cast to restricted __be16
net/dsa/tag_lan9303.c:111:24: warning: cast to restricted __be16

Make use of __be16 where appropriate to fix these warnings.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-05 15:31:58 -07:00
Andrew Lunn
ed6444ea03 net: dsa: tag_ksz: Fix __be16 warnings
cpu_to_be16 returns a __be16 value. So what it is assigned to needs to
have the same type to avoid warnings.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-05 15:31:58 -07:00
Andrew Lunn
a61bf20831 net: dsa: Add __percpu property to prevent warnings
net/dsa/slave.c:505:13: warning: incorrect type in initializer (different address spaces)
net/dsa/slave.c:505:13:    expected void const [noderef] <asn:3> *__vpp_verify
net/dsa/slave.c:505:13:    got struct pcpu_sw_netstats *

Add the needed _percpu property to prevent this warning.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-05 15:31:58 -07:00
Florian Fainelli
65951a9eb6 net: dsa: Improve subordinate PHY error message
It is not very informative to know the DSA master device when a
subordinate network device fails to get its PHY setup. Provide the
device name and capitalize PHY while we are it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 12:44:20 -07:00
Daniel Mack
1ed9ec9b08 dsa: Allow forwarding of redirected IGMP traffic
The driver for Marvell switches puts all ports in IGMP snooping mode
which results in all IGMP/MLD frames that ingress on the ports to be
forwarded to the CPU only.

The bridge code in the kernel can then interpret these frames and act
upon them, for instance by updating the mdb in the switch to reflect
multicast memberships of stations connected to the ports. However,
the IGMP/MLD frames must then also be forwarded to other ports of the
bridge so external IGMP queriers can track membership reports, and
external multicast clients can receive query reports from foreign IGMP
queriers.

Currently, this is impossible as the EDSA tagger sets offload_fwd_mark
on the skb when it unwraps the tagged frames, and that will make the
switchdev layer prevent the skb from egressing on any other port of
the same switch.

To fix that, look at the To_CPU code in the DSA header and make
forwarding of the frame possible for trapped IGMP packets.

Introduce some #defines for the frame types to make the code a bit more
comprehensive.

This was tested on a Marvell 88E6352 variant.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24 14:39:43 -07:00
Linus Torvalds
96144c58ab Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix cfg80211 deadlock, from Johannes Berg.

 2) RXRPC fails to send norigications, from David Howells.

 3) MPTCP RM_ADDR parsing has an off by one pointer error, fix from
    Geliang Tang.

 4) Fix crash when using MSG_PEEK with sockmap, from Anny Hu.

 5) The ucc_geth driver needs __netdev_watchdog_up exported, from
    Valentin Longchamp.

 6) Fix hashtable memory leak in dccp, from Wang Hai.

 7) Fix how nexthops are marked as FDB nexthops, from David Ahern.

 8) Fix mptcp races between shutdown and recvmsg, from Paolo Abeni.

 9) Fix crashes in tipc_disc_rcv(), from Tuong Lien.

10) Fix link speed reporting in iavf driver, from Brett Creeley.

11) When a channel is used for XSK and then reused again later for XSK,
    we forget to clear out the relevant data structures in mlx5 which
    causes all kinds of problems. Fix from Maxim Mikityanskiy.

12) Fix memory leak in genetlink, from Cong Wang.

13) Disallow sockmap attachments to UDP sockets, it simply won't work.
    From Lorenz Bauer.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
  net: ethernet: ti: ale: fix allmulti for nu type ale
  net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init
  net: atm: Remove the error message according to the atomic context
  bpf: Undo internal BPF_PROBE_MEM in BPF insns dump
  libbpf: Support pre-initializing .bss global variables
  tools/bpftool: Fix skeleton codegen
  bpf: Fix memlock accounting for sock_hash
  bpf: sockmap: Don't attach programs to UDP sockets
  bpf: tcp: Recv() should return 0 when the peer socket is closed
  ibmvnic: Flush existing work items before device removal
  genetlink: clean up family attributes allocations
  net: ipa: header pad field only valid for AP->modem endpoint
  net: ipa: program upper nibbles of sequencer type
  net: ipa: fix modem LAN RX endpoint id
  net: ipa: program metadata mask differently
  ionic: add pcie_print_link_status
  rxrpc: Fix race between incoming ACK parser and retransmitter
  net/mlx5: E-Switch, Fix some error pointer dereferences
  net/mlx5: Don't fail driver on failure to create debugfs
  net/mlx5e: CT: Fix ipv6 nat header rewrite actions
  ...
2020-06-13 16:27:13 -07:00
Masahiro Yamada
a7f7f6248d treewide: replace '---help---' in Kconfig files with 'help'
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.

This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.

There are a variety of indentation styles found.

  a) 4 spaces + '---help---'
  b) 7 spaces + '---help---'
  c) 8 spaces + '---help---'
  d) 1 space + 1 tab + '---help---'
  e) 1 tab + '---help---'    (correct indentation)
  f) 1 tab + 1 space + '---help---'
  g) 1 tab + 2 spaces + '---help---'

In order to convert all of them to 1 tab + 'help', I ran the
following commend:

  $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-14 01:57:21 +09:00
Cong Wang
845e0ebb44 net: change addr_list_lock back to static key
The dynamic key update for addr_list_lock still causes troubles,
for example the following race condition still exists:

CPU 0:				CPU 1:
(RCU read lock)			(RTNL lock)
dev_mc_seq_show()		netdev_update_lockdep_key()
				  -> lockdep_unregister_key()
 -> netif_addr_lock_bh()

because lockdep doesn't provide an API to update it atomically.
Therefore, we have to move it back to static keys and use subclass
for nest locking like before.

In commit 1a33e10e4a ("net: partially revert dynamic lockdep key
changes"), I already reverted most parts of commit ab92d68fc2
("net: core: add generic lockdep keys").

This patch reverts the rest and also part of commit f3b0a18bb6
("net: remove unnecessary variables and callback"). After this
patch, addr_list_lock changes back to using static keys and
subclasses to satisfy lockdep. Thanks to dev->lower_level, we do
not have to change back to ->ndo_get_lock_subclass().

And hopefully this reduces some syzbot lockdep noises too.

Reported-by: syzbot+f3a0e80c34b3fc28ac5e@syzkaller.appspotmail.com
Cc: Taehee Yoo <ap420073@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-09 12:59:45 -07:00
David S. Miller
1806c13dc2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
xdp_umem.c had overlapping changes between the 64-bit math fix
for the calculation of npgs and the removal of the zerocopy
memory type which got rid of the chunk_size_nohdr member.

The mlx5 Kconfig conflict is a case where we just take the
net-next copy of the Kconfig entry dependency as it takes on
the ESWITCH dependency by one level of indirection which is
what the 'net' conflicting change is trying to ensure.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-31 17:48:46 -07:00
Vladimir Oltean
04198499b2 net: dsa: tag_8021q: stop restoring VLANs from bridge
Right now, our only tag_8021q user, sja1105, has the ability to restore
bridge VLANs on its own, so this logic is unnecessary.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 16:45:46 -07:00
Vladimir Oltean
2b86cb8299 net: dsa: declare lockless TX feature for slave ports
Be there a platform with the following layout:

      Regular NIC
       |
       +----> DSA master for switch port
               |
               +----> DSA master for another switch port

After changing DSA back to static lockdep class keys in commit
1a33e10e4a ("net: partially revert dynamic lockdep key changes"), this
kernel splat can be seen:

[   13.361198] ============================================
[   13.366524] WARNING: possible recursive locking detected
[   13.371851] 5.7.0-rc4-02121-gc32a05ecd7af-dirty #988 Not tainted
[   13.377874] --------------------------------------------
[   13.383201] swapper/0/0 is trying to acquire lock:
[   13.388004] ffff0000668ff298 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.397879]
[   13.397879] but task is already holding lock:
[   13.403727] ffff0000661a1698 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.413593]
[   13.413593] other info that might help us debug this:
[   13.420140]  Possible unsafe locking scenario:
[   13.420140]
[   13.426075]        CPU0
[   13.428523]        ----
[   13.430969]   lock(&dsa_slave_netdev_xmit_lock_key);
[   13.435946]   lock(&dsa_slave_netdev_xmit_lock_key);
[   13.440924]
[   13.440924]  *** DEADLOCK ***
[   13.440924]
[   13.446860]  May be due to missing lock nesting notation
[   13.446860]
[   13.453668] 6 locks held by swapper/0/0:
[   13.457598]  #0: ffff800010003de0 ((&idev->mc_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0x0/0x400
[   13.466593]  #1: ffffd4d3fb478700 (rcu_read_lock){....}-{1:2}, at: mld_sendpack+0x0/0x560
[   13.474803]  #2: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: ip6_finish_output2+0x64/0xb10
[   13.483886]  #3: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x6c/0xbe0
[   13.492793]  #4: ffff0000661a1698 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.503094]  #5: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x6c/0xbe0
[   13.512000]
[   13.512000] stack backtrace:
[   13.516369] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-rc4-02121-gc32a05ecd7af-dirty #988
[   13.530421] Call trace:
[   13.532871]  dump_backtrace+0x0/0x1d8
[   13.536539]  show_stack+0x24/0x30
[   13.539862]  dump_stack+0xe8/0x150
[   13.543271]  __lock_acquire+0x1030/0x1678
[   13.547290]  lock_acquire+0xf8/0x458
[   13.550873]  _raw_spin_lock+0x44/0x58
[   13.554543]  __dev_queue_xmit+0x84c/0xbe0
[   13.558562]  dev_queue_xmit+0x24/0x30
[   13.562232]  dsa_slave_xmit+0xe0/0x128
[   13.565988]  dev_hard_start_xmit+0xf4/0x448
[   13.570182]  __dev_queue_xmit+0x808/0xbe0
[   13.574200]  dev_queue_xmit+0x24/0x30
[   13.577869]  neigh_resolve_output+0x15c/0x220
[   13.582237]  ip6_finish_output2+0x244/0xb10
[   13.586430]  __ip6_finish_output+0x1dc/0x298
[   13.590709]  ip6_output+0x84/0x358
[   13.594116]  mld_sendpack+0x2bc/0x560
[   13.597786]  mld_ifc_timer_expire+0x210/0x390
[   13.602153]  call_timer_fn+0xcc/0x400
[   13.605822]  run_timer_softirq+0x588/0x6e0
[   13.609927]  __do_softirq+0x118/0x590
[   13.613597]  irq_exit+0x13c/0x148
[   13.616918]  __handle_domain_irq+0x6c/0xc0
[   13.621023]  gic_handle_irq+0x6c/0x160
[   13.624779]  el1_irq+0xbc/0x180
[   13.627927]  cpuidle_enter_state+0xb4/0x4d0
[   13.632120]  cpuidle_enter+0x3c/0x50
[   13.635703]  call_cpuidle+0x44/0x78
[   13.639199]  do_idle+0x228/0x2c8
[   13.642433]  cpu_startup_entry+0x2c/0x48
[   13.646363]  rest_init+0x1ac/0x280
[   13.649773]  arch_call_rest_init+0x14/0x1c
[   13.653878]  start_kernel+0x490/0x4bc

Lockdep keys themselves were added in commit ab92d68fc2 ("net: core:
add generic lockdep keys"), and it's very likely that this splat existed
since then, but I have no real way to check, since this stacked platform
wasn't supported by mainline back then.

>From Taehee's own words:

  This patch was considered that all stackable devices have LLTX flag.
  But the dsa doesn't have LLTX, so this splat happened.
  After this patch, dsa shares the same lockdep class key.
  On the nested dsa interface architecture, which you illustrated,
  the same lockdep class key will be used in __dev_queue_xmit() because
  dsa doesn't have LLTX.
  So that lockdep detects deadlock because the same lockdep class key is
  used recursively although actually the different locks are used.
  There are some ways to fix this problem.

  1. using NETIF_F_LLTX flag.
  If possible, using the LLTX flag is a very clear way for it.
  But I'm so sorry I don't know whether the dsa could have LLTX or not.

  2. using dynamic lockdep again.
  It means that each interface uses a separate lockdep class key.
  So, lockdep will not detect recursive locking.
  But this way has a problem that it could consume lockdep class key
  too many.
  Currently, lockdep can have 8192 lockdep class keys.
   - you can see this number with the following command.
     cat /proc/lockdep_stats
     lock-classes:                         1251 [max: 8192]
     ...
     The [max: 8192] means that the maximum number of lockdep class keys.
  If too many lockdep class keys are registered, lockdep stops to work.
  So, using a dynamic(separated) lockdep class key should be considered
  carefully.
  In addition, updating lockdep class key routine might have to be existing.
  (lockdep_register_key(), lockdep_set_class(), lockdep_unregister_key())

  3. Using lockdep subclass.
  A lockdep class key could have 8 subclasses.
  The different subclass is considered different locks by lockdep
  infrastructure.
  But "lock-classes" is not counted by subclasses.
  So, it could avoid stopping lockdep infrastructure by an overflow of
  lockdep class keys.
  This approach should also have an updating lockdep class key routine.
  (lockdep_set_subclass())

  4. Using nonvalidate lockdep class key.
  The lockdep infrastructure supports nonvalidate lockdep class key type.
  It means this lockdep is not validated by lockdep infrastructure.
  So, the splat will not happen but lockdep couldn't detect real deadlock
  case because lockdep really doesn't validate it.
  I think this should be used for really special cases.
  (lockdep_set_novalidate_class())

Further discussion here:
https://patchwork.ozlabs.org/project/netdev/patch/20200503052220.4536-2-xiyou.wangcong@gmail.com/

There appears to be no negative side-effect to declaring lockless TX for
the DSA virtual interfaces, which means they handle their own locking.
So that's what we do to make the splat go away.

Patch tested in a wide variety of cases: unicast, multicast, PTP, etc.

Fixes: ab92d68fc2 ("net: core: add generic lockdep keys")
Suggested-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 15:09:28 -07:00
David S. Miller
13209a8f73 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The MSCC bug fix in 'net' had to be slightly adjusted because the
register accesses are done slightly differently in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 13:47:27 -07:00
DENG Qingfang
5e5502e012 net: dsa: mt7530: fix roaming from DSA user ports
When a client moves from a DSA user port to a software port in a bridge,
it cannot reach any other clients that connected to the DSA user ports.
That is because SA learning on the CPU port is disabled, so the switch
ignores the client's frames from the CPU port and still thinks it is at
the user port.

Fix it by enabling SA learning on the CPU port.

To prevent the switch from learning from flooding frames from the CPU
port, set skb->offload_fwd_mark to 1 for unicast and broadcast frames,
and let the switch flood them instead of trapping to the CPU port.
Multicast frames still need to be trapped to the CPU port for snooping,
so set the SA_DIS bit of the MTK tag to 1 when transmitting those frames
to disable SA learning.

Fixes: b8f126a8d5 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 13:49:28 -07:00
Vladimir Oltean
fb9f2e9286 net: dsa: tag_sja1105: appease sparse checks for ethertype accessors
A comparison between a value from the packet and an integer constant
value needs to be done by converting the value from the packet from
net->host, or the constant from host->net. Not the other way around.
Even though it makes no practical difference, correct that.

Fixes: 38b5beeae7 ("net: dsa: sja1105: prepare tagger for handling DSA tags and VLAN simultaneously")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 18:02:42 -07:00
Vladimir Oltean
84eeb5d460 net: dsa: tag_sja1105: implement sub-VLAN decoding
Create a subvlan_map as part of each port's tagger private structure.
This keeps reverse mappings of bridge-to-dsa_8021q VLAN retagging rules.

Note that as of this patch, this piece of code is never engaged, due to
the fact that the driver hasn't installed any retagging rule, so we'll
always see packets with a subvlan code of 0 (untagged).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:08 -07:00
Vladimir Oltean
3eaae1d05f net: dsa: tag_8021q: support up to 8 VLANs per port using sub-VLANs
For switches that support VLAN retagging, such as sja1105, we extend
dsa_8021q by encoding a "sub-VLAN" into the remaining 3 free bits in the
dsa_8021q tag.

A sub-VLAN is nothing more than a number in the range 0-7, which serves
as an index into a per-port driver lookup table. The sub-VLAN value of
zero means that traffic is untagged (this is also backwards-compatible
with dsa_8021q without retagging).

The switch should be configured to retag VLAN-tagged traffic that gets
transmitted towards the CPU port (and towards the CPU only). Example:

bridge vlan add dev sw1p0 vid 100

The switch retags frames received on port 0, going to the CPU, and
having VID 100, to the VID of 1104 (0x0450). In dsa_8021q language:

 | 11  | 10  |  9  |  8  |  7  |  6  |  5  |  4  |  3  |  2  |  1  |  0  |
 +-----------+-----+-----------------+-----------+-----------------------+
 |    DIR    | SVL |    SWITCH_ID    |  SUBVLAN  |          PORT         |
 +-----------+-----+-----------------+-----------+-----------------------+

0x0450 means:
 - DIR = 0b01: this is an RX VLAN
 - SUBVLAN = 0b001: this is subvlan #1
 - SWITCH_ID = 0b001: this is switch 1 (see the name "sw1p0")
 - PORT = 0b0000: this is port 0 (see the name "sw1p0")

The driver also remembers the "1 -> 100" mapping. In the hotpath, if the
sub-VLAN from the tag encodes a non-untagged frame, this mapping is used
to create a VLAN hwaccel tag, with the value of 100.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:08 -07:00