The "fs->location" is a u32 that comes from the user in ethtool_set_rxnfc().
We can't pass unclamped values to test_bit() or it results in an out of
bounds access beyond the end of the bitmap.
Fixes: 7318166cac ("net: dsa: bcm_sf2: Add support for ethtool::rxnfc")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In rxrpc_destroy_all_calls(), there are two phases: (1) make sure the
->calls list is empty, emitting error messages if not, and (2) wait for the
RCU cleanup to happen on outstanding calls (ie. ->nr_calls becomes 0).
To avoid taking the call_lock, the function prechecks ->calls and if empty,
it returns to avoid taking the lock - this is wrong, however: it still
needs to go and do the second phase and wait for ->nr_calls to become 0.
Without this, the rxrpc_net struct may get deallocated before we get to the
RCU cleanup for the last calls. This can lead to:
Slab corruption (Not tainted): kmalloc-16k start=ffff88802b178000, len=16384
050: 6b 6b 6b 6b 6b 6b 6b 6b 61 6b 6b 6b 6b 6b 6b 6b kkkkkkkkakkkkkkk
Note the "61" at offset 0x58. This corresponds to the ->nr_calls member of
struct rxrpc_net (which is >9k in size, and thus allocated out of the 16k
slab).
Fix this by flipping the condition on the if-statement, putting the locked
section inside the if-body and dropping the return from there. The
function will then always go on to wait for the RCU cleanup on outstanding
calls.
Fixes: 2baec2c3f8 ("rxrpc: Support network namespacing")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2019-04-30
1) Fix an out-of-bound array accesses in __xfrm_policy_unlink.
From YueHaibing.
2) Reset the secpath on failure in the ESP GRO handlers
to avoid dereferencing an invalid pointer on error.
From Myungho Jung.
3) Add and revert a patch that tried to add rcu annotations
to netns_xfrm. From Su Yanjun.
4) Wait for rcu callbacks before freeing xfrm6_tunnel_spi_kmem.
From Su Yanjun.
5) Fix forgotten vti4 ipip tunnel deregistration.
From Jeremy Sowden:
6) Remove some duplicated log messages in vti4.
From Jeremy Sowden.
7) Don't use IPSEC_PROTO_ANY when flushing states because
this will flush only IPsec portocol speciffic states.
IPPROTO_ROUTING states may remain in the lists when
doing net exit. Fix this by replacing IPSEC_PROTO_ANY
with zero. From Cong Wang.
8) Add length check for UDP encapsulation to fix "Oversized IP packet"
warnings on receive side. From Sabrina Dubroca.
9) Fix xfrm interface lookup when the interface is associated to
a vrf layer 3 master device. From Martin Willi.
10) Reload header pointers after pskb_may_pull() in _decode_session4(),
otherwise we may read from uninitialized memory.
11) Update the documentation about xfrm[46]_gc_thresh, it
is not used anymore after the flowcache removal.
From Nicolas Dichtel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When there is no route to an IPv6 dest addr, skb_dst(skb) points
to loopback dev in the case of that the IP6CB(skb)->iif is
enslaved to a vrf. This causes Ip6InNoRoutes to be incremented on the
loopback dev. This also causes the lookup to fail on icmpv6_send() and
the dest unreachable to not sent and Ip6OutNoRoutes gets incremented on
the loopback dev.
To reproduce:
* Gateway configuration:
ip link add dev vrf_258 type vrf table 258
ip link set dev enp0s9 master vrf_258
ip addr add 66:1/64 dev enp0s9
ip -6 route add unreachable default metric 8192 table 258
sysctl -w net.ipv6.conf.all.forwarding=1
sysctl -w net.ipv6.conf.enp0s9.forwarding=1
* Sender configuration:
ip addr add 66::2/64 dev enp0s9
ip -6 route add default via 66::1
and ping 67::1 for example from the sender.
Fix this by counting on the original netdev and reset the skb dst to
force a fresh lookup.
v2: Fix typo of destination address in the repro steps.
v3: Simplify the loopback check (per David Ahern) and use reverse
Christmas tree format (per David Miller).
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard and Bruno both reported that my commit added a bug,
and Bruno was able to determine the problem came when a segment
wih a FIN packet was coalesced to a prior one in tcp backlog queue.
It turns out the header prediction in tcp_rcv_established()
looks back to TCP headers in the packet, not in the metadata
(aka TCP_SKB_CB(skb)->tcp_flags)
The fast path in tcp_rcv_established() is not supposed to
handle a FIN flag (it does not call tcp_fin())
Therefore we need to make sure to propagate the FIN flag,
so that the coalesced packet does not go through the fast path,
the same than a GRO packet carrying a FIN flag.
While we are at it, make sure we do not coalesce packets with
RST or SYN, or if they do not have ACK set.
Many thanks to Richard and Bruno for pinpointing the bad commit,
and to Richard for providing a first version of the fix.
Fixes: 4f693b55c3 ("tcp: implement coalescing on backlog queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Reported-by: Bruno Prémont <bonbons@sysophe.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
A request for a flowlabel fails in process or user exclusive mode must
fail if the caller pid or uid does not match. Invert the test.
Previously, the test was unsafe wrt PID recycling, but indeed tested
for inequality: fl1->owner != fl->owner
Fixes: 4f82f45730 ("net ip6 flowlabel: Make owner a union of struct pid* and kuid_t")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Schmidt says:
====================
ieee802154 for net 2019-04-25
An update from ieee802154 for your *net* tree.
Another fix from Kangjie Lu to ensure better checking regmap updates in the
mcr20a driver. Nothing else I have pending for the final release.
If there are any problems let me know.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
* fix use-after-free in mac80211 TXQs
* fix RX STBC byte order
* fix debugfs rename crashing due to ERR_PTR()
* fix missing regulatory notification
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the UDP GRO code path does bad things on some edge
conditions - Aggregation can happen even on packet with different
lengths.
Fix the above by rewriting the 'complete' condition for GRO
packets. While at it, note explicitly that we allow merging the
first packet per burst below gso_size.
Reported-by: Sean Tong <seantong114@gmail.com>
Fixes: e20cf8d3f1 ("udp: implement GRO for plain UDP sockets.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
net/tls: fix data copies in tls_device_reencrypt()
This series fixes the tls_device_reencrypt() which is broken
if record starts in the frags of the message skb.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fragments may contain data from other records so we have to account
for that when we calculate the destination and max length of copy we
can perform. Note that 'offset' is the offset within the message,
so it can't be passed as offset within the frag..
Here skb_store_bits() would have realised the call is wrong and
simply not copy data.
Fixes: 4799ac81e5 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no guarantee the record starts before the skb frags.
If we don't check for this condition copy amount will get
negative, leading to reads and writes to random memory locations.
Familiar hilarity ensues.
Fixes: 4799ac81e5 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Misc. bug fixes.
6 miscellaneous bug fixes covering several issues in error code paths,
a setup issue for statistics DMA, and an improvement for setting up
multicast address filters.
Please queue these for stable as well.
Patch #5 (bnxt_en: Fix statistics context reservation logic) is for the
most recent 5.0 stable only. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In bnxt_rx_pkt(), if the driver encounters BD errors, it will recycle
the buffers and jump to the end where the uninitailized variable "len"
is referenced. Fix it by adding a new jump label that will skip
the length update. This is the most correct fix since the length
may not be valid when we get this type of error.
Fixes: 6a8788f256 ("bnxt_en: add support for software dynamic interrupt moderation")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In an earlier commit that fixes the number of stats contexts to
reserve for the RDMA driver, we added a function parameter to pass in
the number of stats contexts to all the relevant functions. The passed
in parameter should have been used to set the enables field of the
firmware message.
Fixes: 780baad44f ("bnxt_en: Reserve 1 stat_ctx for RDMA driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If driver determines that extended TX port statistics are not supported
or allocation of the data structure fails, make sure to pass 0 TX stats
size to firmware to disable it. The firmware returned TX stats size should
also be set to 0 for consistency. This will prevent
bnxt_get_ethtool_stats() from accessing the NULL TX stats pointer in
case there is mismatch between firmware and driver.
Fixes: 36e53349b6 ("bnxt_en: Add additional extended port statistics.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we encounter errors during open and proceed to clean up,
bnxt_hwrm_ring_free() may crash if the rings we try to free have never
been allocated. bnxt_cp_ring_for_rx() or bnxt_cp_ring_for_tx()
may reference pointers that have not been allocated.
Fix it by checking for valid fw_ring_id first before calling
bnxt_cp_ring_for_rx() or bnxt_cp_ring_for_tx().
Fixes: 2c61d2117e ("bnxt_en: Add helper functions to get firmware CP ring ID.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the bnxt_init_one() error path, short FW command request memory
is not freed. This patch fixes it.
Fixes: e605db801b ("bnxt_en: Support for Short Firmware Message")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver builds a list of multicast addresses and sends it to the
firmware when the driver's ndo_set_rx_mode() is called. In rare
cases, the firmware can fail this call if internal resources to
add multicast addresses are exhausted. In that case, we should
try the call again by setting the ALL_MCAST flag which is more
guaranteed to succeed.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
marvell_get_sset_count() returns how many statistics counters there
are. If the PHY supports fibre, there are 3, otherwise two.
marvell_get_strings() does not make this distinction, and always
returns 3 strings. This then often results in writing past the end
of the buffer for the strings.
Fixes: 2170fef78a ("Marvell phy: add field to get errors from fiber link.")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When allocating the next family->id it makes more sense to use
idr_alloc_cyclic to avoid re-using a previously used family->id as much
as possible.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Canonical way to fetch sk_user_data from an encap_rcv() handler called
from UDP stack in rcu protected section is to use rcu_dereference_sk_user_data(),
otherwise compiler might read it multiple times.
Fixes: d00fa9adc5 ("il2tp: fix races with tunnel socket close")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>