Commit Graph

59330 Commits

Author SHA1 Message Date
Linus Torvalds e59cd88028 Merge tag 'for-5.7/io_uring-2020-03-29' of git://git.kernel.dk/linux-block
Pull io_uring updates from Jens Axboe:
 "Here are the io_uring changes for this merge window. Light on new
  features this time around (just splice + buffer selection), lots of
  cleanups, fixes, and improvements to existing support. In particular,
  this contains:

   - Cleanup fixed file update handling for stack fallback (Hillf)

   - Re-work of how pollable async IO is handled, we no longer require
     thread offload to handle that. Instead we rely using poll to drive
     this, with task_work execution.

   - In conjunction with the above, allow expendable buffer selection,
     so that poll+recv (for example) no longer has to be a split
     operation.

   - Make sure we honor RLIMIT_FSIZE for buffered writes

   - Add support for splice (Pavel)

   - Linked work inheritance fixes and optimizations (Pavel)

   - Async work fixes and cleanups (Pavel)

   - Improve io-wq locking (Pavel)

   - Hashed link write improvements (Pavel)

   - SETUP_IOPOLL|SETUP_SQPOLL improvements (Xiaoguang)"

* tag 'for-5.7/io_uring-2020-03-29' of git://git.kernel.dk/linux-block: (54 commits)
  io_uring: cleanup io_alloc_async_ctx()
  io_uring: fix missing 'return' in comment
  io-wq: handle hashed writes in chains
  io-uring: drop 'free_pfile' in struct io_file_put
  io-uring: drop completion when removing file
  io_uring: Fix ->data corruption on re-enqueue
  io-wq: close cancel gap for hashed linked work
  io_uring: make spdxcheck.py happy
  io_uring: honor original task RLIMIT_FSIZE
  io-wq: hash dependent work
  io-wq: split hashing and enqueueing
  io-wq: don't resched if there is no work
  io-wq: remove duplicated cancel code
  io_uring: fix truncated async read/readv and write/writev retry
  io_uring: dual license io_uring.h uapi header
  io_uring: io_uring_enter(2) don't poll while SETUP_IOPOLL|SETUP_SQPOLL enabled
  io_uring: Fix unused function warnings
  io_uring: add end-of-bits marker and build time verify it
  io_uring: provide means of removing buffers
  io_uring: add IOSQE_BUFFER_SELECT support for IORING_OP_RECVMSG
  ...
2020-03-30 12:18:49 -07:00
Linus Torvalds e595dd9451 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix memory leak in vti6, from Torsten Hilbrich.

 2) Fix double free in xfrm_policy_timer, from YueHaibing.

 3) NL80211_ATTR_CHANNEL_WIDTH attribute is put with wrong type, from
    Johannes Berg.

 4) Wrong allocation failure check in qlcnic driver, from Xu Wang.

 5) Get ks8851-ml IO operations right, for real this time, from Marek
    Vasut.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (22 commits)
  r8169: fix PHY driver check on platforms w/o module softdeps
  net: ks8851-ml: Fix IO operations, again
  mlxsw: spectrum_mr: Fix list iteration in error path
  qlcnic: Fix bad kzalloc null test
  mac80211: set IEEE80211_TX_CTRL_PORT_CTRL_PROTO for nl80211 TX
  mac80211: mark station unauthorized before key removal
  mac80211: Check port authorization in the ieee80211_tx_dequeue() case
  cfg80211: Do not warn on same channel at the end of CSA
  mac80211: drop data frames without key on encrypted links
  ieee80211: fix HE SPR size calculation
  nl80211: fix NL80211_ATTR_CHANNEL_WIDTH attribute type
  xfrm: policy: Fix doulbe free in xfrm_policy_timer
  bpf: Explicitly memset some bpf info structures declared on the stack
  bpf: Explicitly memset the bpf_attr structure
  bpf: Sanitize the bpf_struct_ops tcp-cc name
  vti6: Fix memory leak of skb if input policy check fails
  esp: remove the skb from the chain when it's enqueued in cryptd_wq
  ipv6: xfrm6_tunnel.c: Use built-in RCU list checking
  xfrm: add the missing verify_sec_ctx_len check in xfrm_add_acquire
  xfrm: fix uctx len check in verify_sec_ctx_len
  ...
2020-03-28 18:55:15 -07:00
David S. Miller a0ba26f37e Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2020-03-27

The following pull-request contains BPF updates for your *net* tree.

We've added 3 non-merge commits during the last 4 day(s) which contain
a total of 4 files changed, 25 insertions(+), 20 deletions(-).

The main changes are:

1) Explicitly memset the bpf_attr structure on bpf() syscall to avoid
   having to rely on compiler to do so. Issues have been noticed on
   some compilers with padding and other oddities where the request was
   then unexpectedly rejected, from Greg Kroah-Hartman.

2) Sanitize the bpf_struct_ops TCP congestion control name in order to
   avoid problematic characters such as whitespaces, from Martin KaFai Lau.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-27 16:18:51 -07:00
David S. Miller e00dd941ff Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2020-03-27

1) Handle NETDEV_UNREGISTER for xfrm device to handle asynchronous
   unregister events cleanly. From Raed Salem.

2) Fix vti6 tunnel inter address family TX through bpf_redirect().
   From Nicolas Dichtel.

3) Fix lenght check in verify_sec_ctx_len() to avoid a
   slab-out-of-bounds. From Xin Long.

4) Add a missing verify_sec_ctx_len check in xfrm_add_acquire
   to avoid a possible out-of-bounds to access. From Xin Long.

5) Use built-in RCU list checking of hlist_for_each_entry_rcu
   to silence false lockdep warning in __xfrm6_tunnel_spi_lookup
   when CONFIG_PROVE_RCU_LIST is enabled. From Madhuparna Bhowmik.

6) Fix a panic on esp offload when crypto is done asynchronously.
   From Xin Long.

7) Fix a skb memory leak in an error path of vti6_rcv.
   From Torsten Hilbrich.

8) Fix a race that can lead to a doulbe free in xfrm_policy_timer.
   From Xin Long.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-27 14:56:55 -07:00
Linus Torvalds 60268940cd Merge tag 'ceph-for-5.6-rc8' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
 "A patch for a rather old regression in fullness handling and two
  memory leak fixes, marked for stable"

* tag 'ceph-for-5.6-rc8' of git://github.com/ceph/ceph-client:
  ceph: fix memory leak in ceph_cleanup_snapid_map()
  libceph: fix alloc_msg_with_page_vector() memory leaks
  ceph: check POOL_FLAG_FULL/NEARFULL in addition to OSDMAP_FULL/NEARFULL
2020-03-26 15:44:41 -07:00
David S. Miller 328f5bb993 Merge tag 'mac80211-for-net-2020-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
We have the following fixes:
 * drop data packets if there's no key for them anymore, after
   there had been one, to avoid sending them in clear when
   hostapd removes the key before it removes the station and
   the packets are still queued
 * check port authorization again after dequeue, to avoid
   sending packets if the station is no longer authorized
 * actually remove the authorization flag before the key so
   packets are also dropped properly because of this
 * fix nl80211 control port packet tagging to handle them as
   packets allowed to go out without encryption
 * fix NL80211_ATTR_CHANNEL_WIDTH outgoing netlink attribute
   width (should be 32 bits, not 8)
 * don't WARN in a CSA scenario that happens on some APs
 * fix HE spatial reuse element size calculation
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26 12:03:02 -07:00
Johannes Berg b95d2ccd2c mac80211: set IEEE80211_TX_CTRL_PORT_CTRL_PROTO for nl80211 TX
When a frame is transmitted via the nl80211 TX rather than as a
normal frame, IEEE80211_TX_CTRL_PORT_CTRL_PROTO wasn't set and
this will lead to wrong decisions (rate control etc.) being made
about the frame; fix this.

Fixes: 9118064914 ("mac80211: Add support for tx_control_port")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20200326155333.f183f52b02f0.I4054e2a8c11c2ddcb795a0103c87be3538690243@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-03-26 15:54:12 +01:00
Johannes Berg b16798f5b9 mac80211: mark station unauthorized before key removal
If a station is still marked as authorized, mark it as no longer
so before removing its keys. This allows frames transmitted to it
to be rejected, providing additional protection against leaking
plain text data during the disconnection flow.

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200326155133.ccb4fb0bb356.If48f0f0504efdcf16b8921f48c6d3bb2cb763c99@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-03-26 15:52:25 +01:00
Jouni Malinen ce2e1ca703 mac80211: Check port authorization in the ieee80211_tx_dequeue() case
mac80211 used to check port authorization in the Data frame enqueue case
when going through start_xmit(). However, that authorization status may
change while the frame is waiting in a queue. Add a similar check in the
dequeue case to avoid sending previously accepted frames after
authorization change. This provides additional protection against
potential leaking of frames after a station has been disconnected and
the keys for it are being removed.

Cc: stable@vger.kernel.org
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Link: https://lore.kernel.org/r/20200326155133.ced84317ea29.I34d4c47cd8cc8a4042b38a76f16a601fbcbfd9b3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-03-26 15:52:14 +01:00
Ilan Peer 05dcb8bb25 cfg80211: Do not warn on same channel at the end of CSA
When cfg80211_update_assoc_bss_entry() is called, there is a
verification that the BSS channel actually changed. As some APs use
CSA also for bandwidth changes, this would result with a kernel
warning.

Fix this by removing the WARN_ON().

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200326150855.96316ada0e8d.I6710376b1b4257e5f4712fc7ab16e2b638d512aa@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-03-26 15:50:10 +01:00
Johannes Berg a0761a3017 mac80211: drop data frames without key on encrypted links
If we know that we have an encrypted link (based on having had
a key configured for TX in the past) then drop all data frames
in the key selection handler if there's no key anymore.

This fixes an issue with mac80211 internal TXQs - there we can
buffer frames for an encrypted link, but then if the key is no
longer there when they're dequeued, the frames are sent without
encryption. This happens if a station is disconnected while the
frames are still on the TXQ.

Detecting that a link should be encrypted based on a first key
having been configured for TX is fine as there are no use cases
for a connection going from with encryption to no encryption.
With extended key IDs, however, there is a case of having a key
configured for only decryption, so we can't just trigger this
behaviour on a key being configured.

Cc: stable@vger.kernel.org
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200326150855.6865c7f28a14.I9fb1d911b064262d33e33dfba730cdeef83926ca@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-03-26 15:49:24 +01:00
Linus Torvalds 1b649e0bca Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix deadlock in bpf_send_signal() from Yonghong Song.

 2) Fix off by one in kTLS offload of mlx5, from Tariq Toukan.

 3) Add missing locking in iwlwifi mvm code, from Avraham Stern.

 4) Fix MSG_WAITALL handling in rxrpc, from David Howells.

 5) Need to hold RTNL mutex in tcindex_partial_destroy_work(), from Cong
    Wang.

 6) Fix producer race condition in AF_PACKET, from Willem de Bruijn.

 7) cls_route removes the wrong filter during change operations, from
    Cong Wang.

 8) Reject unrecognized request flags in ethtool netlink code, from
    Michal Kubecek.

 9) Need to keep MAC in reset until PHY is up in bcmgenet driver, from
    Doug Berger.

10) Don't leak ct zone template in act_ct during replace, from Paul
    Blakey.

11) Fix flushing of offloaded netfilter flowtable flows, also from Paul
    Blakey.

12) Fix throughput drop during tx backpressure in cxgb4, from Rahul
    Lakkireddy.

13) Don't let a non-NULL skb->dev leave the TCP stack, from Eric
    Dumazet.

14) TCP_QUEUE_SEQ socket option has to update tp->copied_seq as well,
    also from Eric Dumazet.

15) Restrict macsec to ethernet devices, from Willem de Bruijn.

16) Fix reference leak in some ethtool *_SET handlers, from Michal
    Kubecek.

17) Fix accidental disabling of MSI for some r8169 chips, from Heiner
    Kallweit.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits)
  net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build
  net: ena: Add PCI shutdown handler to allow safe kexec
  selftests/net/forwarding: define libs as TEST_PROGS_EXTENDED
  selftests/net: add missing tests to Makefile
  r8169: re-enable MSI on RTL8168c
  net: phy: mdio-bcm-unimac: Fix clock handling
  cxgb4/ptp: pass the sign of offset delta in FW CMD
  net: dsa: tag_8021q: replace dsa_8021q_remove_header with __skb_vlan_pop
  net: cbs: Fix software cbs to consider packet sending time
  net/mlx5e: Do not recover from a non-fatal syndrome
  net/mlx5e: Fix ICOSQ recovery flow with Striding RQ
  net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset
  net/mlx5e: Enhance ICOSQ WQE info fields
  net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure
  selftests: netfilter: add nfqueue test case
  netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress
  netfilter: nft_fwd_netdev: validate family and chain type
  netfilter: nft_set_rbtree: Detect partial overlaps on insertion
  netfilter: nft_set_rbtree: Introduce and use nft_rbtree_interval_start()
  netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion
  ...
2020-03-25 13:58:05 -07:00
Pablo Neira Ayuso 2c64605b59 net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build
net/netfilter/nft_fwd_netdev.c: In function ‘nft_fwd_netdev_eval’:
    net/netfilter/nft_fwd_netdev.c:32:10: error: ‘struct sk_buff’ has no member named ‘tc_redirected’
      pkt->skb->tc_redirected = 1;
              ^~
    net/netfilter/nft_fwd_netdev.c:33:10: error: ‘struct sk_buff’ has no member named ‘tc_from_ingress’
      pkt->skb->tc_from_ingress = 1;
              ^~

To avoid a direct dependency with tc actions from netfilter, wrap the
redirect bits around CONFIG_NET_REDIRECT and move helpers to
include/linux/skbuff.h. Turn on this toggle from the ifb driver, the
only existing client of these bits in the tree.

This patch adds skb_set_redirected() that sets on the redirected bit
on the skbuff, it specifies if the packet was redirect from ingress
and resets the timestamp (timestamp reset was originally missing in the
netfilter bugfix).

Fixes: bcfabee1af ("netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress")
Reported-by: noreply@ellerman.id.au
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:24:33 -07:00
Johannes Berg 0016d32017 nl80211: fix NL80211_ATTR_CHANNEL_WIDTH attribute type
The new opmode notification used this attribute with a u8, when
it's documented as a u32 and indeed used in userspace as such,
it just happens to work on little-endian systems since userspace
isn't doing any strict size validation, and the u8 goes into the
lower byte. Fix this.

Cc: stable@vger.kernel.org
Fixes: 466b9936bf ("cfg80211: Add support to notify station's opmode change to userspace")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20200325090531.be124f0a11c7.Iedbf4e197a85471ebd729b186d5365c0343bf7a8@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-03-25 09:58:43 +01:00
David S. Miller 6f000f9878 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) A new selftest for nf_queue, from Florian Westphal. This test
   covers two recent fixes: 07f8e4d0fd ("tcp: also NULL skb->dev
   when copy was needed") and b738a185be ("tcp: ensure skb->dev is
   NULL before leaving TCP stack").

2) The fwd action breaks with ifb. For safety in next extensions,
   make sure the fwd action only runs from ingress until it is extended
   to be used from a different hook.

3) The pipapo set type now reports EEXIST in case of subrange overlaps.
   Update the rbtree set to validate range overlaps, so far this
   validation is only done only from userspace. From Stefano Brivio.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 17:30:40 -07:00
Vladimir Oltean e80f40cbe4 net: dsa: tag_8021q: replace dsa_8021q_remove_header with __skb_vlan_pop
Not only did this wheel did not need reinventing, but there is also
an issue with it: It doesn't remove the VLAN header in a way that
preserves the L2 payload checksum when that is being provided by the DSA
master hw.  It should recalculate checksum both for the push, before
removing the header, and for the pull afterwards. But the current
implementation is quite dizzying, with pulls followed immediately
afterwards by pushes, the memmove is done before the push, etc.  This
makes a DSA master with RX checksumming offload to print stack traces
with the infamous 'hw csum failure' message.

So remove the dsa_8021q_remove_header function and replace it with
something that actually works with inet checksumming.

Fixes: d461933638 ("net: dsa: tag_8021q: Create helper function for removing VLAN header")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 16:19:01 -07:00
Zh-yuan Ye 961d0e5b32 net: cbs: Fix software cbs to consider packet sending time
Currently the software CBS does not consider the packet sending time
when depleting the credits. It caused the throughput to be
Idleslope[kbps] * (Port transmit rate[kbps] / |Sendslope[kbps]|) where
Idleslope * (Port transmit rate / (Idleslope + |Sendslope|)) = Idleslope
is expected. In order to fix the issue above, this patch takes the time
when the packet sending completes into account by moving the anchor time
variable "last" ahead to the send completion time upon transmission and
adding wait when the next dequeue request comes before the send
completion time of the previous packet.

changelog:
V2->V3:
 - remove unnecessary whitespace cleanup
 - add the checks if port_rate is 0 before division

V1->V2:
 - combine variable "send_completed" into "last"
 - add the comment for estimate of the packet sending

Fixes: 585d763af0 ("net/sched: Introduce Credit Based Shaper (CBS) qdisc")
Signed-off-by: Zh-yuan Ye <ye.zh-yuan@socionext.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 16:14:05 -07:00
Pablo Neira Ayuso bcfabee1af netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress
Set skb->tc_redirected to 1, otherwise the ifb driver drops the packet.
Set skb->tc_from_ingress to 1 to reinject the packet back to the ingress
path after leaving the ifb egress path.

This patch inconditionally sets on these two skb fields that are
meaningful to the ifb driver. The existing forward action is guaranteed
to run from ingress path.

Fixes: 39e6dea28a ("netfilter: nf_tables: add forward expression to the netdev family")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:59:39 +01:00
Pablo Neira Ayuso 76a109fac2 netfilter: nft_fwd_netdev: validate family and chain type
Make sure the forward action is only used from ingress.

Fixes: 39e6dea28a ("netfilter: nf_tables: add forward expression to the netdev family")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:59:38 +01:00
Stefano Brivio 7c84d41416 netfilter: nft_set_rbtree: Detect partial overlaps on insertion
...and return -ENOTEMPTY to the front-end in this case, instead of
proceeding. Currently, nft takes care of checking for these cases
and not sending them to the kernel, but if we drop the set_overlap()
call in nft we can end up in situations like:

 # nft add table t
 # nft add set t s '{ type inet_service ; flags interval ; }'
 # nft add element t s '{ 1 - 5 }'
 # nft add element t s '{ 6 - 10 }'
 # nft add element t s '{ 4 - 7 }'
 # nft list set t s
 table ip t {
 	set s {
 		type inet_service
 		flags interval
 		elements = { 1-3, 4-5, 6-7 }
 	}
 }

This change has the primary purpose of making the behaviour
consistent with nft_set_pipapo, but is also functional to avoid
inconsistent behaviour if userspace sends overlapping elements for
any reason.

v2: When we meet the same key data in the tree, as start element while
    inserting an end element, or as end element while inserting a start
    element, actually check that the existing element is active, before
    resetting the overlap flag (Pablo Neira Ayuso)

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:59:37 +01:00
Stefano Brivio 6f7c9caf01 netfilter: nft_set_rbtree: Introduce and use nft_rbtree_interval_start()
Replace negations of nft_rbtree_interval_end() with a new helper,
nft_rbtree_interval_start(), wherever this helps to visualise the
problem at hand, that is, for all the occurrences except for the
comparison against given flags in __nft_rbtree_get().

This gets especially useful in the next patch.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:59:30 +01:00
Stefano Brivio 0eb4b5ee33 netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion
...and return -ENOTEMPTY to the front-end on collision, -EEXIST if
an identical element already exists. Together with the previous patch,
element collision will now be returned to the user as -EEXIST.

Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:59:08 +01:00
Pablo Neira Ayuso 8c2d45b2b6 netfilter: nf_tables: Allow set back-ends to report partial overlaps on insertion
Currently, the -EEXIST return code of ->insert() callbacks is ambiguous: it
might indicate that a given element (including intervals) already exists as
such, or that the new element would clash with existing ones.

If identical elements already exist, the front-end is ignoring this without
returning error, in case NLM_F_EXCL is not set. However, if the new element
can't be inserted due an overlap, we should report this to the user.

To this purpose, allow set back-ends to return -ENOTEMPTY on collision with
existing elements, translate that to -EEXIST, and return that to userspace,
no matter if NLM_F_EXCL was set.

Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:58:57 +01:00
YueHaibing 4c59406ed0 xfrm: policy: Fix doulbe free in xfrm_policy_timer
After xfrm_add_policy add a policy, its ref is 2, then

                             xfrm_policy_timer
                               read_lock
                               xp->walk.dead is 0
                               ....
                               mod_timer()
xfrm_policy_kill
  policy->walk.dead = 1
  ....
  del_timer(&policy->timer)
    xfrm_pol_put //ref is 1
  xfrm_pol_put  //ref is 0
    xfrm_policy_destroy
      call_rcu
                                 xfrm_pol_hold //ref is 1
                               read_unlock
                               xfrm_pol_put //ref is 0
                                 xfrm_policy_destroy
                                  call_rcu

xfrm_policy_destroy is called twice, which may leads to
double free.

Call Trace:
RIP: 0010:refcount_warn_saturate+0x161/0x210
...
 xfrm_policy_timer+0x522/0x600
 call_timer_fn+0x1b3/0x5e0
 ? __xfrm_decode_session+0x2990/0x2990
 ? msleep+0xb0/0xb0
 ? _raw_spin_unlock_irq+0x24/0x40
 ? __xfrm_decode_session+0x2990/0x2990
 ? __xfrm_decode_session+0x2990/0x2990
 run_timer_softirq+0x5c5/0x10e0

Fix this by use write_lock_bh in xfrm_policy_kill.

Fixes: ea2dea9dac ("xfrm: remove policy lock when accessing policy->walk.dead")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-03-24 06:56:54 +01:00
Michal Kubecek 2f599ec422 ethtool: fix reference leak in some *_SET handlers
Andrew noticed that some handlers for *_SET commands leak a netdev
reference if required ethtool_ops callbacks do not exist. A simple
reproducer would be e.g.

  ip link add veth1 type veth peer name veth2
  ethtool -s veth1 wol g
  ip link del veth1

Make sure dev_put() is called when ethtool_ops check fails.

v2: add Fixes tags

Fixes: a53f3d41e4 ("ethtool: set link settings with LINKINFO_SET request")
Fixes: bfbcfe2032 ("ethtool: set link modes related data with LINKMODES_SET request")
Fixes: e54d04e3af ("ethtool: set message mask with DEBUG_SET request")
Fixes: 8d425b19b3 ("ethtool: set wake-on-lan settings with WOL_SET request")
Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23 21:50:14 -07:00