rtnl_fdb_dump always expects an index to be returned by the ndo_fdb_dump op,
but when CONFIG_NET_SWITCHDEV is off, it returns an error.
Fix that by returning the given unmodified idx.
A similar fix was 0890cf6cb6 ("switchdev: fix return value of
switchdev_port_fdb_dump in case of error") but for the CONFIG_NET_SWITCHDEV=y
case.
Fixes: 45d4122ca7 ("switchdev: add support for fdb add/del/dump via switchdev_port_obj ops.")
Signed-off-by: Dragos Tatulea <dragos@endocode.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers like vxlan use the recently introduced
udp_tunnel_xmit_skb/udp_tunnel6_xmit_skb APIs. udp_tunnel6_xmit_skb
makes use of ip6tunnel_xmit, and ip6tunnel_xmit, after sending the
packet, updates the struct stats using the usual
u64_stats_update_begin/end calls on this_cpu_ptr(dev->tstats).
udp_tunnel_xmit_skb makes use of iptunnel_xmit, which doesn't touch
tstats, so drivers like vxlan, immediately after, call
iptunnel_xmit_stats, which does the same thing - calls
u64_stats_update_begin/end on this_cpu_ptr(dev->tstats).
While vxlan is probably fine (I don't know?), calling a similar function
from, say, an unbound workqueue, on a fully preemptable kernel causes
real issues:
[ 188.434537] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u8:0/6
[ 188.435579] caller is debug_smp_processor_id+0x17/0x20
[ 188.435583] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.2.6 #2
[ 188.435607] Call Trace:
[ 188.435611] [<ffffffff8234e936>] dump_stack+0x4f/0x7b
[ 188.435615] [<ffffffff81915f3d>] check_preemption_disabled+0x19d/0x1c0
[ 188.435619] [<ffffffff81915f77>] debug_smp_processor_id+0x17/0x20
The solution would be to protect the whole
this_cpu_ptr(dev->tstats)/u64_stats_update_begin/end blocks with
disabling preemption and then reenabling it.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some functions access TCP sockets without holding a lock and
might output non consistent data, depending on compiler and or
architecture.
tcp_diag_get_info(), tcp_get_info(), tcp_poll(), get_tcp4_sock() ...
Introduce sk_state_load() and sk_state_store() to fix the issues,
and more clearly document where this lack of locking is happening.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All DST_NOCACHE rt6_info used to have rt->dst.from set to
its parent.
After commit 8e3d5be736 ("ipv6: Avoid double dst_free"),
DST_NOCACHE is also set to rt6_info which does not have
a parent (i.e. rt->dst.from is NULL).
This patch catches the rt->dst.from == NULL case.
Fixes: 8e3d5be736 ("ipv6: Avoid double dst_free")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree. This
large batch that includes fixes for ipset, netfilter ingress, nf_tables
dynamic set instantiation and a longstanding Kconfig dependency problem.
More specifically, they are:
1) Add missing check for empty hook list at the ingress hook, from
Florian Westphal.
2) Input and output interface are swapped at the ingress hook,
reported by Patrick McHardy.
3) Resolve ipset extension alignment issues on ARM, patch from Jozsef
Kadlecsik.
4) Fix bit check on bitmap in ipset hash type, also from Jozsef.
5) Release buckets when all entries have expired in ipset hash type,
again from Jozsef.
6) Oneliner to initialize conntrack tuple object in the PPTP helper,
otherwise the conntrack lookup may fail due to random bits in the
structure holes, patch from Anthony Lineham.
7) Silence a bogus gcc warning in nfnetlink_log, from Arnd Bergmann.
8) Fix Kconfig dependency problems with TPROXY, socket and dup, also
from Arnd.
9) Add __netdev_alloc_pcpu_stats() to allow creating percpu counters
from atomic context, this is required by the follow up fix for
nf_tables.
10) Fix crash from the dynamic set expression, we have to add new clone
operation that should be defined when a simple memcpy is not enough.
This resolves a crash when using per-cpu counters with new Patrick
McHardy's flow table nft support.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix null deref in xt_TEE netfilter module, from Eric Dumazet.
2) Several spots need to get to the original listner for SYN-ACK
packets, most spots got this ok but some were not. Whilst covering
the remaining cases, create a helper to do this. From Eric Dumazet.
3) Missiing check of return value from alloc_netdev() in CAIF SPI code,
from Rasmus Villemoes.
4) Don't sleep while != TASK_RUNNING in macvtap, from Vlad Yasevich.
5) Use after free in mvneta driver, from Justin Maggard.
6) Fix race on dst->flags access in dst_release(), from Eric Dumazet.
7) Add missing ZLIB_INFLATE dependency for new qed driver. From Arnd
Bergmann.
8) Fix multicast getsockopt deadlock, from WANG Cong.
9) Fix deadlock in btusb, from Kuba Pawlak.
10) Some ipv6_add_dev() failure paths were not cleaning up the SNMP6
counter state. From Sabrina Dubroca.
11) Fix packet_bind() race, which can cause lost notifications, from
Francesco Ruggeri.
12) Fix MAC restoration in qlcnic driver during bonding mode changes,
from Jarod Wilson.
13) Revert bridging forward delay change which broke libvirt and other
userspace things, from Vlad Yasevich.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
Revert "bridge: Allow forward delay to be cfgd when STP enabled"
bpf_trace: Make dependent on PERF_EVENTS
qed: select ZLIB_INFLATE
net: fix a race in dst_release()
net: mvneta: Fix memory use after free.
net: Documentation: Fix default value tcp_limit_output_bytes
macvtap: Resolve possible __might_sleep warning in macvtap_do_read()
mvneta: add FIXED_PHY dependency
net: caif: check return value of alloc_netdev
net: hisilicon: NET_VENDOR_HISILICON should depend on HAS_DMA
drivers: net: xgene: fix RGMII 10/100Mb mode
netfilter: nft_meta: use skb_to_full_sk() helper
net_sched: em_meta: use skb_to_full_sk() helper
sched: cls_flow: use skb_to_full_sk() helper
netfilter: xt_owner: use skb_to_full_sk() helper
smack: use skb_to_full_sk() helper
net: add skb_to_full_sk() helper and use it in selinux_netlbl_skbuff_setsid()
bpf: doc: correct arch list for supported eBPF JIT
dwc_eth_qos: Delete an unnecessary check before the function call "of_node_put"
bonding: fix panic on non-ARPHRD_ETHER enslave failure
...
With the conversion of the counter expressions to make it percpu, we
need to clone the percpu memory area, otherwise we crash when using
counters from flow tables.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Generalize selinux_skb_sk() added in commit 212cd08953
("selinux: fix random read in selinux_ip_postroute_compat()")
so that we can use it other contexts.
Use it right away in selinux_netlbl_skbuff_setsid()
Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts. They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve". __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".
Over time, callers had a requirement to not block when fallback options
were available. Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.
This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative. High priority users continue to use
__GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim. __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.
This patch then converts a number of sites
o __GFP_ATOMIC is used by callers that are high priority and have memory
pools for those requests. GFP_ATOMIC uses this flag.
o Callers that have a limited mempool to guarantee forward progress clear
__GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
into this category where kswapd will still be woken but atomic reserves
are not used as there is a one-entry mempool to guarantee progress.
o Callers that are checking if they are non-blocking should use the
helper gfpflags_allow_blocking() where possible. This is because
checking for __GFP_WAIT as was done historically now can trigger false
positives. Some exceptions like dm-crypt.c exist where the code intent
is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
flag manipulations.
o Callers that built their own GFP flags instead of starting with GFP_KERNEL
and friends now also need to specify __GFP_KSWAPD_RECLAIM.
The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.
The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL. They may
now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johan Hedberg says:
====================
pull request: bluetooth 2015-11-05
The following set of Bluetooth patches would be good to get into 4.4-rc1
if possible:
- Fix for missing LE CoC parameter validity checks
- Fix for potential deadlock in btusb
- Fix for issuing unsupported commands during HCI init
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The core spec defines specific response codes for situations when the
received CID is incorrect. Add the defines for these and return them
as appropriate from the LE Connect Request handler function.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
In tun_dst_unclone() the return value of skb_metadata_dst() is checked
for being NULL after it is dereferenced. Fix this by moving the
dereference after the NULL check.
Found by the Coverity scanner (CID 1338068).
Fixes: fc4099f172 ("openvswitch: Fix egress tunnel info.")
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor overlapping changes in net/ipv4/ipmr.c, in 'net' we were
fixing the "BH-ness" of the counter bumps whilst in 'net-next'
the functions were modified to take an explicit 'net' parameter.
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Another set of fixes:
* remove a warning on a check that can trigger without any
errors having happened (Andrei)
* correctly handle deauth request while in the process of
associating (Andrei)
* fix TDLS HT operation (Arik)
* allow changing AID/listen interval during client setup (Ayala)
* be more forgiving with WMM parameters to get HT/VHT in case of
broken APs with bad WMM settings (Emmanuel, myself)
* a number of other fixes (some in documentation)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Channel context driver operations can sleep, so add might_sleep()
and document this.
Signed-off-by: Chaitanya T K <chaitanya.mgit@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The previous patch changed mac80211 to always report an event
after a CQM RSSI reconfiguration. Document that as expected
behaviour in both the cfg80211 and mac80211 API.
Currently, iwlmvm already implements that behaviour; the other
drivers implementing CQM RSSI events may have to be changed.
This behaviour lets userspace know what the current state is
without relying on querying the data which is racy.
Reviewed-by: Sharon, Sara <sara.sharon@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Old logic of updating state-machine is not required since
ad_update_actor_keys() does it implicitly. The only loss is
the notification differentiation between speed vs. duplex
change. Now only one unified notification is printed.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes following problems :
1) percpu_counter_init() can return an error, therefore
init_frag_mem_limit() must propagate this error so that
inet_frags_init_net() can do the same up to its callers.
2) If ip[46]_frags_ns_ctl_register() fail, we must unwind
properly and free the percpu_counter.
Without this fix, we leave freed object in percpu_counters
global list (if CONFIG_HOTPLUG_CPU) leading to crashes.
This bug was detected by KASAN and syzkaller tool
(http://github.com/google/syzkaller)
Fixes: 6d7b857d54 ("net: use lib/percpu_counter API for fragmentation mem accounting")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under low memory conditions, tcp_sk_init() and icmp_sk_init()
can both iterate on all possible cpus and call inet_ctl_sock_destroy(),
with eventual NULL pointer.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_set_owner_w() is called from various places that assume
skb->sk always point to a full blown socket (as it changes
sk->sk_wmem_alloc)
We'd like to attach skb to request sockets, and in the future
to timewait sockets as well. For these kind of pseudo sockets,
we need to take a traditional refcount and use sock_edemux()
as the destructor.
It is now time to un-inline skb_set_owner_w(), being too big.
Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Bisected-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
we should not delete the local routes if the local address
is still present. The confusion comes from the fact that both
fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
constant. Fix it by returning back the variable 'force'.
Steps to reproduce:
modprobe dummy
ifconfig dummy0 192.168.168.1 up
ifconfig dummy0 down
ip route list table local | grep dummy | grep host
local 192.168.168.1 dev dummy0 proto kernel scope host src 192.168.168.1
Fixes: 8a3d03166f ("net: track link-status of ipv4 nexthops")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simplify DSA by pushing the switchdev objects for VLAN add and delete
operations down to its drivers. Currently only mv88e6xxx is affected.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SS_LISTEN socket state is defined by both af_vsock.c and
vmci_transport.c. This is risky since the value could be changed in one
file and the other would be out of sync.
Rename from SS_LISTEN to VSOCK_SS_LISTEN since the constant is not part
of enum socket_state (SS_CONNECTED, ...). This way it is clear that the
constant is vsock-specific.
The big text reflow in af_vsock.c was necessary to keep to the maximum
line length. Text is unchanged except for s/SS_LISTEN/VSOCK_SS_LISTEN/.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Samuel Ortiz says:
====================
NFC 4.4 pull request
This is the NFC pull request for 4.4.
It's a bit bigger than usual, the 3 main culprits being:
- A new driver for Intel's Fields Peak NCI chipset. In order to
support this chipset we had to export a few NCI routines and
extend the driver NCI ops to not only support proprietary
commands but also core ones.
- Support for vendor commands for both STM drivers, st-nci
and st21nfca. Those vendor commands allow to run factory tests
through the NFC netlink interface.
- New i2c and SPI support for the Marvell driver, together with
firmware download support for this driver's core.
Besides that we also have:
- A few file renames in the STM drivers, to keep the naming
consistent between drivers.
- Some improvements and fixes on the NCI HCI layer, mostly to
properly reach a secure element over a legacy HCI link.
- A few fixes for the s3fwrn5 and trf7970a drivers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
pull request: bluetooth-next 2015-10-28
Here are a some more Bluetooth patches for 4.4 which collected up during
the past week. The most important ones are from Kuba Pawlak for fixing
locking issues with SCO sockets. There's also a fix from Alexander Aring
for 6lowpan, a memleak fix from Julia Lawall for the btmrvl driver and
some cleanup patches from Marcel.
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>