IPI's from the victim cpu are not handled in dev_cpu_callback.
So these pending IPI's would be sent to the remote cpu only when
NET_RX is scheduled on the victim cpu and since this trigger is
unpredictable it would result in packet latencies on the remote cpu.
This patch add support to send the pending ipi's of victim cpu.
Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
1.) Bugfix of function stmmac_get_tx_hwtstamp.
Corrected the tx timestamp available check (same as 4.8 and older)
Change printout from info syslevel to debug.
2.) Bugfix of function stmmac_get_rx_hwtstamp.
Corrected the rx timestamp available check (same as 4.8 and older)
Change printout from info syslevel to debug.
Fixes: ba1ffd74df ("stmmac: fix PTP support for GMAC4")
Signed-off-by: Mario Molitor <mario_molitor@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
According the CYCLON V documention only the bit 16 of snaptypesel should
set.
(more information see Table 17-20 (cv_5v4.pdf) :
Timestamp Snapshot Dependency on Register Bits)
Fixes: d2042052a0 ("stmmac: update the PTP header file")
Signed-off-by: Mario Molitor <mario_molitor@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like this:
Message from syslogd@flamingo at Apr 26 00:45:00 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4
They seem to coincide with net namespace teardown.
The message is emitted by netdev_wait_allrefs().
Forced a kdump in netdev_run_todo, but found that the refcount on the lo
device was already 0 at the time we got to the panic.
Used bcc to check the blocking in netdev_run_todo. The only places
where we're off cpu there are in the rcu_barrier() and msleep() calls.
That behavior is expected. The msleep time coincides with the amount of
time we spend waiting for the refcount to reach zero; the rcu_barrier()
wait times are not excessive.
After looking through the list of callbacks that the netdevice notifiers
invoke in this path, it appears that the dst_dev_event is the most
interesting. The dst_ifdown path places a hold on the loopback_dev as
part of releasing the dev associated with the original dst cache entry.
Most of our notifier callbacks are straight-forward, but this one a)
looks complex, and b) places a hold on the network interface in
question.
I constructed a new bcc script that watches various events in the
liftime of a dst cache entry. Note that dst_ifdown will take a hold on
the loopback device until the invalidated dst entry gets freed.
[ __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183
__dst_free
rcu_nocb_kthread
kthread
ret_from_fork
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
adds the l3mdev FIB rule the first time a VRF device is created. However,
it only creates the rule once and only in the namespace the first device
is created - which may not be init_net. Fix by using the net_generic
capability to make the add_fib_rules flag per network namespace.
Fixes: 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
Reported-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I noticed that test_l4lb was failing in selftests:
# ./test_progs
test_pkt_access:PASS:ipv4 77 nsec
test_pkt_access:PASS:ipv6 44 nsec
test_xdp:PASS:ipv4 2933 nsec
test_xdp:PASS:ipv6 1500 nsec
test_l4lb:PASS:ipv4 377 nsec
test_l4lb:PASS:ipv6 544 nsec
test_l4lb:FAIL:stats 6297600000 200000
test_tcp_estats:PASS: 0 nsec
Summary: 7 PASSED, 1 FAILED
Tracking down the issue actually revealed that endianness selection
in bpf_endian.h is broken when compiled with clang with bpf target.
test_pkt_access.c, test_l4lb.c is compiled with __BYTE_ORDER as
__BIG_ENDIAN, test_xdp.c as __LITTLE_ENDIAN! test_l4lb noticeably
fails, because the test accounts bytes via bpf_ntohs(ip6h->payload_len)
and bpf_ntohs(iph->tot_len), and compares them against a defined
value and given a wrong endianness, the test outcome is different,
of course.
Turns out that there are actually two bugs: i) when we do __BYTE_ORDER
comparison with __LITTLE_ENDIAN/__BIG_ENDIAN, then depending on the
include order we see different outcomes. Reason is that __BYTE_ORDER
is undefined due to missing endian.h include. Before we include the
asm/byteorder.h (e.g. through linux/in.h), then __BYTE_ORDER equals
__LITTLE_ENDIAN since both are undefined, after the include which
correctly pulls in linux/byteorder/little_endian.h, __LITTLE_ENDIAN
is defined, but given __BYTE_ORDER is still undefined, we match on
__BYTE_ORDER equals to __BIG_ENDIAN since __BIG_ENDIAN is also
undefined at that point, sigh. ii) But even that would be wrong,
since when compiling the test cases with clang, one can select between
bpfeb and bpfel targets for cross compilation. Hence, we can also not
rely on what the system's endian.h provides, but we need to look at
the compiler's defined endianness. The compiler defines __BYTE_ORDER__,
and we can match __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__,
which also reflects targets bpf (native), bpfel, bpfeb correctly,
thus really only rely on that. After patch:
# ./test_progs
test_pkt_access:PASS:ipv4 74 nsec
test_pkt_access:PASS:ipv6 42 nsec
test_xdp:PASS:ipv4 2340 nsec
test_xdp:PASS:ipv6 1461 nsec
test_l4lb:PASS:ipv4 400 nsec
test_l4lb:PASS:ipv6 530 nsec
test_tcp_estats:PASS: 0 nsec
Summary: 7 PASSED, 0 FAILED
Fixes: 43bcf707cc ("bpf: fix _htons occurences in test_progs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each time a new speed is added, the bonding 802.3ad isn't updated. Add a
comment to remind the developer to update this driver.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds 14 Gbps enum definition, and fixes
aggregated bandwidth calculation based on above slave links.
Fixes: 0d7e2d2166 ("IB/ipoib: add get_link_ksettings in ethtool")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds [5|50] Gbps enum definition, and fixes
aggregated bandwidth calculation based on above slave links.
Fixes: c9a70d4346 ("net-next: ethtool: Added port speed macros.")
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first netlink attribute (value 0) must always be defined
as none/unspec.
Because we cannot change an existing UAPI, I add a comment to point the
mistake and avoid to propagate it in a new ovs API in the future.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While discussing the possible merits of clang warning about unused initialized
functions, I found one function that was clearly meant to be called but
never actually is.
__ila_hash_secret_init() initializes the hash value for the ila locator,
apparently this is intended to prevent hash collision attacks, but this ends
up being a read-only zero constant since there is no caller. I could find
no indication of why it was never called, the earliest patch submission
for the module already was like this. If my interpretation is right, we
certainly want to backport the patch to stable kernels as well.
I considered adding it to the ila_xlat_init callback, but for best effect
the random data is read as late as possible, just before it is first used.
The underlying net_get_random_once() is already highly optimized to avoid
overhead when called frequently.
Fixes: 7f00feaf10 ("ila: Add generic ILA translation facility")
Cc: stable@vger.kernel.org
Link: https://www.spinics.net/lists/kernel/msg2527243.html
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c: In function ‘rtw_cfg80211_add_monitor_if’:
drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c:2670:10: error: ‘struct net_device’ has no member named ‘destructor’
mon_ndev->destructor = rtw_ndev_destructor;
^
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger says:
====================
netvsc: bug fixes
These are bugfixes for netvsc driver in 4.12.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The work queue and handling of network filter parameters should
be in rndis_device. This gets rid of warning from RCU checks,
eliminates a race and cleans up code.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ndo_poll_controller function needs to schedule NAPI to pick
up arriving packets and send completions. Otherwise no data
will ever be received. For simple case of netconsole, it also
will allow send completions to happen. Without this netpoll
will eventually get stuck.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool info command calls the netvsc get_sset_count with RTNL
but not with RCU. Which causes warning:
drivers/net/hyperv/netvsc_drv.c:1010 suspicious rcu_dereference_check() usage!
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa reported attempts to delete a bond device that is referenced in a
multipath route is hanging:
$ ifdown bond2 # ifupdown2 command that deletes virtual devices
unregister_netdevice: waiting for bond2 to become free. Usage count = 2
Steps to reproduce:
echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown
ip link add dev bond12 type bond
ip link add dev bond13 type bond
ip addr add 2001:db8:2::0/64 dev bond12
ip addr add 2001:db8:3::0/64 dev bond13
ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2
ip link del dev bond12
ip link del dev bond13
The root cause is the recent change to keep routes on a linkdown. Update
the check to detect when the device is unregistering and release the
route for that case.
Fixes: a1a22c1206 ("net: ipv6: Keep nexthop of multipath route on admin down")
Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the structure's fields are not initialized by the
rtnetlink. If driver doesn't set those in ndo_get_vf_config(),
they'd leak memory to user.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
CC: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Verify that the length of the socket buffer is sufficient to cover the
nlmsghdr structure before accessing the nlh->nlmsg_len field for further
input sanitization. If the client only supplies 1-3 bytes of data in
sk_buff, then nlh->nlmsg_len remains partially uninitialized and
contains leftover memory from the corresponding kernel allocation.
Operating on such data may result in indeterminate evaluation of the
nlmsg_len < sizeof(*nlh) expression.
The bug was discovered by a runtime instrumentation designed to detect
use of uninitialized memory in the kernel. The patch prevents this and
other similar tools (e.g. KMSAN) from flagging this behavior in the future.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 85eac2ba35.
There is an updated version of this fix which we should
use instead.
Signed-off-by: David S. Miller <davem@davemloft.net>
emac_mdio_read_link() was not copying the requested phy settings
back into the emac driver's own phy api. This has caused a link
speed mismatch issue for the AR8035 as the emac driver kept
trying to connect with 10/100MBps on a 1GBit/s link.
This patch also unifies shared code between emac_setup_aneg()
and emac_mdio_setup_forced(). And furthermore it removes
a chunk of emac_mdio_init_phy(), that was copying the same
data into itself.
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a problem where the AR8035 PHY can't be
detected on an Cisco Meraki MR24, if the ethernet cable is
not connected on boot.
Russell Senior provided steps to reproduce the issue:
|Disconnect ethernet cable, apply power, wait until device has booted,
|plug in ethernet, check for interfaces, no eth0 is listed.
|
|This appears to be a problem during probing of the AR8035 Phy chip.
|When ethernet has no link, the phy detection fails, and eth0 is not
|created. Plugging ethernet later has no effect, because there is no
|interface as far as the kernel is concerned. The relevant part of
|the boot log looks like this:
|this is the failing case:
|
|[ 0.876611] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.882532] /plb/opb/ethernet@ef600c00: reset timeout
|[ 0.888546] /plb/opb/ethernet@ef600c00: can't find PHY!
|and the succeeding case:
|
|[ 0.876672] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.883952] eth0: EMAC-0 /plb/opb/ethernet@ef600c00, MAC 00:01:..
|[ 0.890822] eth0: found Atheros 8035 Gigabit Ethernet PHY (0x01)
Based on the comment and the commit message of
commit 23fbb5a87c ("emac: Fix EMAC soft reset on 460EX/GT").
This is because the AR8035 PHY doesn't provide the TX Clock,
if the ethernet cable is not attached. This causes the reset
to timeout and the PHY detection code in emac_init_phy() is
unable to detect the AR8035 PHY. As a result, the emac driver
bails out early and the user left with no ethernet.
In order to stay compatible with existing configurations, the driver
tries the current reset approach at first. Only if the first attempt
timed out, it does perform one more retry with the clock temporarily
switched to the internal source for just the duration of the reset.
LEDE-Bug: #687 <https://bugs.lede-project.org/index.php?do=details&task_id=687>
Cc: Chris Blake <chrisrblake93@gmail.com>
Reported-by: Russell Senior <russell@personaltelco.net>
Fixes: 23fbb5a87c ("emac: Fix EMAC soft reset on 460EX/GT")
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Verify that the length of the socket buffer is sufficient to cover the
entire nlh->nlmsg_len field before accessing that field for further
input sanitization. If the client only supplies 1-3 bytes of data in
sk_buff, then nlh->nlmsg_len remains partially uninitialized and
contains leftover memory from the corresponding kernel allocation.
Operating on such data may result in indeterminate evaluation of the
nlmsg_len < sizeof(*nlh) expression.
The bug was discovered by a runtime instrumentation designed to detect
use of uninitialized memory in the kernel. The patch prevents this and
other similar tools (e.g. KMSAN) from flagging this behavior in the future.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>