Transactions are always allocated one at a time. The maximum number
of them we could ever need occurs if each TRE is assigned to a
transaction. So a channel requires no more transactions than the
number of TREs in its transfer ring. That number is known to be a
power-of-2 less than 65536.
The transaction pool abstraction is used for other things, but for
transactions we can use a simple array of transaction structures,
and use a free index to indicate which entry in the array is the
next one free for allocation.
By having the number of elements in the array be a power-of-2, we
can use an ever-incrementing 16-bit free index, and use it modulo
the array size. Distinguish a "trans_id" (whose value can exceed
the number of entries in the transaction array) from a "trans_index"
(which is less than the number of entries).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no dedicated reset for just the switch core. The reset which
is used up until now, is more of a global reset, resetting almost the
whole SoC and cause spurious errors by doing so. Make it possible to
handle the reset elsewhere and make the reset optional.
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the reset line optional. It turns out, there is no dedicated reset
for the switch. Instead, the reset which was used up until now, was kind
of a global reset. This is now handled elsewhere, thus don't require a
reset.
Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a followup of commit c67b85558f ("ipv6: tcp: send consistent
autoflowlabel in TIME_WAIT state"), but for SYN_RECV state.
In some cases, TCP sends a challenge ACK on behalf of a SYN_RECV request.
WHen this happens, we want to use the flow label that was used when
the prior SYNACK packet was sent, instead of another one.
After his patch, following packetdrill passes:
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+.2 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0 > (flowlabel 0x11) S. 0:0(0) ack 1 <...>
// Test if a challenge ack is properly sent (same flowlabel than prior SYNACK)
+.01 < . 4000000000:4000000000(0) ack 1 win 320
+0 > (flowlabel 0x11) . 1:1(0) ack 1
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220831203729.458000-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This has been validated on the Ocelot/Felix switch family (NXP LS1028A)
and should be relevant to any switch driver that offloads the tc-flower
and/or tc-matchall actions trap, drop, accept, mirred, for which DSA has
operations.
TEST: gact drop and ok (skip_hw) [ OK ]
TEST: mirred egress flower redirect (skip_hw) [ OK ]
TEST: mirred egress flower mirror (skip_hw) [ OK ]
TEST: mirred egress matchall mirror (skip_hw) [ OK ]
TEST: mirred_egress_to_ingress (skip_hw) [ OK ]
TEST: gact drop and ok (skip_sw) [ OK ]
TEST: mirred egress flower redirect (skip_sw) [ OK ]
TEST: mirred egress flower mirror (skip_sw) [ OK ]
TEST: mirred egress matchall mirror (skip_sw) [ OK ]
TEST: trap (skip_sw) [ OK ]
TEST: mirred_egress_to_ingress (skip_sw) [ OK ]
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220831170839.931184-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull networking fixes from Paolo Abeni:
"Including fixes from bluetooth, bpf and wireless.
Current release - regressions:
- bpf:
- fix wrong last sg check in sk_msg_recvmsg()
- fix kernel BUG in purge_effective_progs()
- mac80211:
- fix possible leak in ieee80211_tx_control_port()
- potential NULL dereference in ieee80211_tx_control_port()
Current release - new code bugs:
- nfp: fix the access to management firmware hanging
Previous releases - regressions:
- ip: fix triggering of 'icmp redirect'
- sched: tbf: don't call qdisc_put() while holding tree lock
- bpf: fix corrupted packets for XDP_SHARED_UMEM
- bluetooth: hci_sync: fix suspend performance regression
- micrel: fix probe failure
Previous releases - always broken:
- tcp: make global challenge ack rate limitation per net-ns and
default disabled
- tg3: fix potential hang-up on system reboot
- mac802154: fix reception for no-daddr packets
Misc:
- r8152: add PID for the lenovo onelink+ dock"
* tag 'net-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (56 commits)
net/smc: Remove redundant refcount increase
Revert "sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb"
tcp: make global challenge ack rate limitation per net-ns and default disabled
tcp: annotate data-race around challenge_timestamp
net: dsa: hellcreek: Print warning only once
ip: fix triggering of 'icmp redirect'
sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb
selftests: net: sort .gitignore file
Documentation: networking: correct possessive "its"
kcm: fix strp_init() order and cleanup
mlxbf_gige: compute MDIO period based on i1clk
ethernet: rocker: fix sleep in atomic context bug in neigh_timer_handler
net: lan966x: improve error handle in lan966x_fdma_rx_get_frame()
nfp: fix the access to management firmware hanging
net: phy: micrel: Make the GPIO to be non-exclusive
net: virtio_net: fix notification coalescing comments
net/sched: fix netdevice reference leaks in attach_default_qdiscs()
net: sched: tbf: don't call qdisc_put() while holding tree lock
net: Use u64_stats_fetch_begin_irq() for stats fetch.
net: dsa: xrs700x: Use irqsave variant for u64 stats update
...
Pull slab fix from Vlastimil Babka:
- A fix from Waiman Long to avoid a theoretical deadlock reported by
lockdep.
* tag 'slab-for-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab_common: Deleting kobject in kmem_cache_destroy() without holding slab_mutex/cpu_hotplug_lock
Pull sound fixes from Takashi Iwai:
"Just handful changes at this time. The only major change is the
regression fix about the x86 WC-page buffer allocation.
The rest are trivial data-race fixes for ALSA sequencer core, the
possible out-of-bounds access fixes in the new ALSA control hash code,
and a few device-specific workarounds and fixes"
* tag 'sound-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Add quirk for LH Labs Geek Out HD Audio 1V5
ALSA: hda/realtek: Add speaker AMP init for Samsung laptops with ALC298
ALSA: control: Re-order bounds checking in get_ctl_id_hash()
ALSA: control: Fix an out-of-bounds bug in get_ctl_id_hash()
ALSA: hda: intel-nhlt: Correct the handling of fmt_config flexible array
ALSA: seq: Fix data-race at module auto-loading
ALSA: seq: oss: Fix data-race for max_midi_devs access
ALSA: memalloc: Revive x86-specific WC page allocations again
A circular locking problem is reported by lockdep due to the following
circular locking dependency.
+--> cpu_hotplug_lock --> slab_mutex --> kn->active --+
| |
+-----------------------------------------------------+
The forward cpu_hotplug_lock ==> slab_mutex ==> kn->active dependency
happens in
kmem_cache_destroy(): cpus_read_lock(); mutex_lock(&slab_mutex);
==> sysfs_slab_unlink()
==> kobject_del()
==> kernfs_remove()
==> __kernfs_remove()
==> kernfs_drain(): rwsem_acquire(&kn->dep_map, ...);
The backward kn->active ==> cpu_hotplug_lock dependency happens in
kernfs_fop_write_iter(): kernfs_get_active();
==> slab_attr_store()
==> cpu_partial_store()
==> flush_all(): cpus_read_lock()
One way to break this circular locking chain is to avoid holding
cpu_hotplug_lock and slab_mutex while deleting the kobject in
sysfs_slab_unlink() which should be equivalent to doing a write_lock
and write_unlock pair of the kn->active virtual lock.
Since the kobject structures are not protected by slab_mutex or the
cpu_hotplug_lock, we can certainly release those locks before doing
the delete operation.
Move sysfs_slab_unlink() and sysfs_slab_release() to the newly
created kmem_cache_release() and call it outside the slab_mutex &
cpu_hotplug_lock critical sections. There will be a slight delay
in the deletion of sysfs files if kmem_cache_release() is called
indirectly from a work function.
Fixes: 5a836bf6b0 ("mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context")
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: David Rientjes <rientjes@google.com>
Link: https://lore.kernel.org/all/YwOImVd+nRUsSAga@hyeyoo/
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Sebastian Reichel says:
====================
RK3588 Ethernet Support
This adds ethernet support for the RK3588(s) SoCs.
Changes since PATCHv2:
* Rebased to v6.0-rc1
* Wrap DT bindings additions at 80 characters
* Add Acked-by from Krzysztof
Changes since PATCHv1:
* Drop first patch (not required) and rebase second patch accordingly
* Rename DT property rockchip,php_grf -> rockchip,php-grf
====================
Link: https://lore.kernel.org/r/20220830154559.61506-1-sebastian.reichel@collabora.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add constants and callback functions for the dwmac on RK3588 soc.
As can be seen, the base structure is the same, only registers
and the bits in them moved slightly.
Signed-off-by: David Wu <david.wu@rock-chips.com>
[rebase, squash fixes]
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
As of now all transmit queues transmit packets out of same scheduler
queue hierarchy. Due to this PFC frames sent by peer are not handled
properly, either all transmit queues are backpressured or none.
To fix this when user enables PFC for a given priority map relavant
transmit queue to a different scheduler queue hierarcy, so that
backpressure is applied only to the traffic egressing out of that TXQ.
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Link: https://lore.kernel.org/r/20220830120304.158060-1-sumang@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently, the change function can be called by two ways. The one way is
that qdisc_change() will call it. Before calling change function,
qdisc_change() ensures tca[TCA_OPTIONS] is not empty. The other way is
that .init() will call it. The opt parameter is also checked before
calling change function in .init(). Therefore, it's no need to check the
input parameter opt in change function.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://lore.kernel.org/r/20220829071219.208646-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet says:
====================
tcp: tcp challenge ack fixes
syzbot found a typical data-race addressed in the first patch.
While we are at it, second patch makes the global rate limit
per net-ns and disabled by default.
====================
Link: https://lore.kernel.org/r/20220830185656.268523-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Because per host rate limiting has been proven problematic (side channel
attacks can be based on it), per host rate limiting of challenge acks ideally
should be per netns and turned off by default.
This is a long due followup of following commits:
083ae30828 ("tcp: enable per-socket rate limiting of all 'challenge acks'")
f2b2c582e8 ("tcp: mitigate ACK loops for connections as tcp_sock")
75ff39ccc1 ("tcp: make challenge acks less predictable")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Baron <jbaron@akamai.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
challenge_timestamp can be read an written by concurrent threads.
This was expected, but we need to annotate the race to avoid potential issues.
Following patch moves challenge_timestamp and challenge_count
to per-netns storage to provide better isolation.
Fixes: 354e4aa391 ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>