linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
Liping Zhang	12be15dd5a	netfilter: nfnetlink_acct: fix race between nfacct del and xt_nfacct destroy Suppose that we input the following commands at first: # nfacct add test # iptables -A INPUT -m nfacct --nfacct-name test And now "test" acct's refcnt is 2, but later when we try to delete the "test" nfacct and the related iptables rule at the same time, race maybe happen: CPU0 CPU1 nfnl_acct_try_del nfnl_acct_put atomic_dec_and_test //ref=1,testfail - - atomic_dec_and_test //ref=0,testok - kfree_rcu atomic_inc //ref=1 - So after the rcu grace period, nf_acct will be freed but it is still linked in the nfnl_acct_list, and we can access it later, then oops will happen. Convert atomic_dec_and_test and atomic_inc combinaiton to one atomic operation atomic_cmpxchg here to fix this problem. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-08-18 15:16:36 +02:00
Eric Dumazet	dcbe35909c	netfilter: tproxy: properly refcount tcp listeners inet_lookup_listener() and inet6_lookup_listener() no longer take a reference on the found listener. This minimal patch adds back the refcounting, but we might do this differently in net-next later. Fixes: `3b24d854cb` ("tcp/dccp: do not touch listener sk_refcnt under synflood") Reported-and-tested-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-08-18 00:51:13 +02:00
Liping Zhang	aca300183e	netfilter: nfnetlink_acct: report overquota to the right netns We should report the over quota message to the right net namespace instead of the init netns. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-08-18 00:38:23 +02:00
Liping Zhang	2497b84625	netfilter: nfnetlink_log: add "nf-logger-3-1" module alias name Otherwise, if nfnetlink_log.ko is not loaded, we cannot add rules to log packets to the userspace when we specify it with arp family, such as: # nft add rule arp filter input log group 0 <cmdline>:1:1-37: Error: Could not process rule: No such file or directory add rule arp filter input log group 0 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-08-17 17:44:53 +02:00
Liping Zhang	e77e6ff502	netfilter: conntrack: do not dump other netns's conntrack entries via proc We should skip the conntracks that belong to a different namespace, otherwise other unrelated netns's conntrack entries will be dumped via /proc/net/nf_conntrack. Fixes: `56d52d4892` ("netfilter: conntrack: use a single hashtable for all namespaces") Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-08-17 17:41:58 +02:00
David S. Miller	a1560dd7a4	Merge branch 'mediatek-fixes' Sean Wang says: ==================== mediatek: Fix warning and issue This patch set fixes the following warning and issues v1 -> v2: Fix message typos and add coverletter v2 -> v3: Split from the previous series for submitting bug fixes as a series targeting 'net' ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 23:02:45 -07:00
sean.wang@mediatek.com	55a4e77819	net: ethernet: mediatek: fix runtime warning raised by inconsistent struct device pointers passed to DMA API Runtime warning occurs if DMA-API debug feature is enabled that would be raised by pointers passed to DMA API as arguments to inconsistent struct device objects, so that the patch makes them usage aligned between DMA operations such as dma_map_() and dma_unmap_() to eliminate the warning. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 23:02:44 -07:00
sean.wang@mediatek.com	b2025c7cc9	net: ethernet: mediatek: fix flow control settings on GMAC0 is not being enabled properly Commit `08ef55c6f2` ("net-next: mediatek: fix gigabit and flow control advertisement") had supported proper flow control settings for GMAC1. But for GMAC0, 1.GMAC0 shares the common logic with GMAC1 inside mtk_phy_link_adjust() to adapt various settings for the target phy. 2.GMAC0 uses fixed-phy to connect to a builtin gigabit switch with fixed link speed as commit `0c72c50f6f` ("net-next: mediatek: add fixed-phy support") describes. 3.However, fixed-phy doesn't enable SUPPORTED_Pause & SUPPORTED_Asym_Pause supported flag on default that would cause mtk_phy_link_adjust() not to enable flow control setting on GMAC0 properly and cause packet dropped when high traffic. Due to these reasons, the patch adds SUPPORTED_Pause & SUPPORTED_Asym_Pause supported flags on fixed-phy used by the driver to have proper handling on the both GMAC with the shared common logic. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 23:02:44 -07:00
sean.wang@mediatek.com	8ca7f4fe07	net: ethernet: mediatek: fix RMII mode and add REVMII supported by GMAC The patch fixes up the incorrect setup of reduced MII (RMII) on GMAC and adds the supplement for the setup of reverse MII (REVMII) on GMAC , and rearranges the error handling for invalid PHY argument. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 23:02:44 -07:00
Vegard Nossum	d2fbdf76b8	tipc: fix NULL pointer dereference in shutdown() tipc_msg_create() can return a NULL skb and if so, we shouldn't try to call tipc_node_xmit_skb() on it. general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 3 PID: 30298 Comm: trinity-c0 Not tainted 4.7.0-rc7+ #19 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 task: ffff8800baf09980 ti: ffff8800595b8000 task.ti: ffff8800595b8000 RIP: 0010:[<ffffffff830bb46b>] [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140 RSP: 0018:ffff8800595bfce8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000003023b0e0 RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffffff83d12580 RBP: ffff8800595bfd78 R08: ffffed000b2b7f32 R09: 0000000000000000 R10: fffffbfff0759725 R11: 0000000000000000 R12: 1ffff1000b2b7f9f R13: ffff8800595bfd58 R14: ffffffff83d12580 R15: dffffc0000000000 FS: 00007fcdde242700(0000) GS:ffff88011af80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fcddde1db10 CR3: 000000006874b000 CR4: 00000000000006e0 DR0: 00007fcdde248000 DR1: 00007fcddd73d000 DR2: 00007fcdde248000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602 Stack: 0000000000000018 0000000000000018 0000000041b58ab3 ffffffff83954208 ffffffff830bb400 ffff8800595bfd30 ffffffff8309d767 0000000000000018 0000000000000018 ffff8800595bfd78 ffffffff8309da1a 00000000810ee611 Call Trace: [<ffffffff830c84a3>] tipc_shutdown+0x553/0x880 [<ffffffff825b4a3b>] SyS_shutdown+0x14b/0x170 [<ffffffff8100334c>] do_syscall_64+0x19c/0x410 [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25 Code: 90 00 b4 0b 83 c7 00 f1 f1 f1 f1 4c 8d 6d e0 c7 40 04 00 00 00 f4 c7 40 08 f3 f3 f3 f3 48 89 d8 48 c1 e8 03 c7 45 b4 00 00 00 00 <80> 3c 30 00 75 78 48 8d 7b 08 49 8d 75 c0 48 b8 00 00 00 00 00 RIP [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140 RSP <ffff8800595bfce8> ---[ end trace 57b0484e351e71f1 ]--- I feel like we should maybe return -ENOMEM or -ENOBUFS, but I'm not sure userspace is equipped to handle that. Anyway, this is better than a GPF and looks somewhat consistent with other tipc_msg_create() callers. Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 13:55:36 -07:00
David S. Miller	a8545b60a8	Merge branch 'hv_netvsc-VF-removal-fixes' Vitaly Kuznetsov says: ==================== hv_netvsc: fixes for VF removal path Kernel crash is reported after VF is removed and detached from netvsc device. Turns out we have multiple different (but related) issues on the VF removal path which I'm trying to address with PATCHes 2-5 of this series. PATCH1 is required to support the change. Changes since v1: - Re-arrange patches in the series to not introduce new issues [David Miller] - Add PATCH5 which fixes a new issue I discovered while testing. - Add Haiyang' A-b tags to PATCH1-4 With regards to Stephen's suggestion: I believe that switching to using RCU and eliminating vf_use_cnt/vf_inject is the right thing to do long-term, we can either put this on top of this series or do it later in net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 13:48:08 -07:00
Vitaly Kuznetsov	0dbff144a1	hv_netvsc: fix bonding devices check in netvsc_netdev_event() Bonding driver sets IFF_BONDING on both master (the bonding device) and slave (the real NIC) devices and in netvsc_netdev_event() we want to skip master devices only. Currently, there is an uncertainty when a slave interface is removed: if bonding module comes first in netdev_chain it clears IFF_BONDING flag on the netdev and netvsc_netdev_event() correctly handles NETDEV_UNREGISTER event, but in case netvsc comes first on the chain it sees the device with IFF_BONDING still attached and skips it. As we still hold vf_netdev pointer to the device we crash on the next inject. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 13:48:07 -07:00
Vitaly Kuznetsov	0f20d795f7	hv_netvsc: protect module refcount by checking net_device_ctx->vf_netdev We're not guaranteed to see NETDEV_REGISTER/NETDEV_UNREGISTER notifications only once per VF but we increase/decrease module refcount unconditionally. Check vf_netdev to make sure we don't take/release it twice. We presume that only one VF per netvsc device may exist. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 13:48:07 -07:00
Vitaly Kuznetsov	57c1826b99	hv_netvsc: reset vf_inject on VF removal We reset vf_inject on VF going down (netvsc_vf_down()) but we don't on VF removal (netvsc_unregister_vf()) so vf_inject stays 'true' while vf_netdev is already NULL and we're trying to inject packets into NULL net device in netvsc_recv_callback() causing kernel to crash. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 13:48:07 -07:00
Vitaly Kuznetsov	d072218f21	hv_netvsc: avoid deadlocks between rtnl lock and vf_use_cnt wait Here is a deadlock scenario: - netvsc_vf_up() schedules netvsc_notify_peers() work and quits. - netvsc_vf_down() runs before netvsc_notify_peers() gets executed. As it is being executed from netdev notifier chain we hold rtnl lock when we get here. - we enter while (atomic_read(&net_device_ctx->vf_use_cnt) != 0) loop and wait till netvsc_notify_peers() drops vf_use_cnt. - netvsc_notify_peers() starts on some other CPU but netdev_notify_peers() will hang on rtnl_lock(). - deadlock! Instead of introducing additional synchronization I suggest we drop gwrk.dwrk completely and call NETDEV_NOTIFY_PEERS directly. As we're acting under rtnl lock this is legitimate. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 13:48:07 -07:00
Vitaly Kuznetsov	f9a7da9130	hv_netvsc: don't lose VF information struct netvsc_device is not suitable for storing VF information as this structure is being destroyed on MTU change / set channel operation (see rndis_filter_device_remove()). Move all VF related stuff to struct net_device_context which is persistent. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 13:48:07 -07:00
Simon Horman	3d7b332092	gre: set inner_protocol on xmit Ensure that the inner_protocol is set on transmit so that GSO segmentation, which relies on that field, works correctly. This is achieved by setting the inner_protocol in gre_build_header rather than each caller of that function. It ensures that the inner_protocol is set when gre_fb_xmit() is used to transmit GRE which was not previously the case. I have observed this is not the case when OvS transmits GRE using lwtunnel metadata (which it always does). Fixes: `3872035241` ("gre: Use inner_proto to obtain inner header protocol") Cc: Pravin Shelar <pshelar@ovn.org> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 13:37:12 -07:00
Lorenzo Colitti	5e45789698	net: ipv6: Fix ping to link-local addresses. ping_v6_sendmsg does not set flowi6_oif in response to sin6_scope_id or sk_bound_dev_if, so it is not possible to use these APIs to ping an IPv6 address on a different interface. Instead, it sets flowi6_iif, which is incorrect but harmless. Stop setting flowi6_iif, and support various ways of setting oif in the same priority order used by udpv6_sendmsg. Tested: https://android-review.googlesource.com/#/c/254470/ Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 12:19:09 -07:00
Vegard Nossum	12311959ec	rhashtable: fix shift by 64 when shrinking I got this: ================================================================================ UBSAN: Undefined behaviour in ./include/linux/log2.h:63:13 shift exponent 64 is too large for 64-bit type 'long unsigned int' CPU: 1 PID: 721 Comm: kworker/1:1 Not tainted 4.8.0-rc1+ #87 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 Workqueue: events rht_deferred_worker 0000000000000000 ffff88011661f8d8 ffffffff82344f50 0000000041b58ab3 ffffffff84f98000 ffffffff82344ea4 ffff88011661f900 ffff88011661f8b0 0000000000000001 ffff88011661f6b8 dffffc0000000000 ffffffff867f7640 Call Trace: [<ffffffff82344f50>] dump_stack+0xac/0xfc [<ffffffff82344ea4>] ? _atomic_dec_and_lock+0xc4/0xc4 [<ffffffff8242f5b8>] ubsan_epilogue+0xd/0x8a [<ffffffff82430c41>] __ubsan_handle_shift_out_of_bounds+0x255/0x29a [<ffffffff824309ec>] ? __ubsan_handle_out_of_bounds+0x180/0x180 [<ffffffff84003436>] ? nl80211_req_set_reg+0x256/0x2f0 [<ffffffff812112ba>] ? print_context_stack+0x8a/0x160 [<ffffffff81200031>] ? amd_pmu_reset+0x341/0x380 [<ffffffff823af808>] rht_deferred_worker+0x1618/0x1790 [<ffffffff823af808>] ? rht_deferred_worker+0x1618/0x1790 [<ffffffff823ae1f0>] ? rhashtable_jhash2+0x370/0x370 [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970 [<ffffffff8134c1cf>] process_one_work+0x79f/0x1970 [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970 [<ffffffff8134ba30>] ? try_to_grab_pending+0x4c0/0x4c0 [<ffffffff8134d564>] ? worker_thread+0x1c4/0x1340 [<ffffffff8134d8ff>] worker_thread+0x55f/0x1340 [<ffffffff845e904f>] ? __schedule+0x4df/0x1d40 [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970 [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970 [<ffffffff813642f7>] kthread+0x237/0x390 [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280 [<ffffffff845f8c93>] ? _raw_spin_unlock_irq+0x33/0x50 [<ffffffff845f95df>] ret_from_fork+0x1f/0x40 [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280 ================================================================================ roundup_pow_of_two() is undefined when called with an argument of 0, so let's avoid the call and just fall back to ht->p.min_size (which should never be smaller than HASH_MIN_SIZE). Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-15 11:10:09 -07:00
Vincent	eb8fc32354	mlxsw: spectrum_router: Fix use after free In mlxsw_sp_router_fib4_add_info_destroy(), the fib_entry pointer is used after it has been freed by mlxsw_sp_fib_entry_destroy(). Use a temporary variable to fix this. Fixes: `61c503f976` ("mlxsw: spectrum_router: Implement fib4 add/del switchdev obj ops") Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net> Cc: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-14 21:32:05 -07:00
Florian Westphal	4cf0b354d9	rhashtable: avoid large lock-array allocations Sander reports following splat after netfilter nat bysrc table got converted to rhashtable: swapper/0: page allocation failure: order:3, mode:0x2084020(GFP_ATOMIC\|__GFP_COMP) CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc1 [..] [<ffffffff811633ed>] warn_alloc_failed+0xdd/0x140 [<ffffffff811638b1>] __alloc_pages_nodemask+0x3e1/0xcf0 [<ffffffff811a72ed>] alloc_pages_current+0x8d/0x110 [<ffffffff8117cb7f>] kmalloc_order+0x1f/0x70 [<ffffffff811aec19>] __kmalloc+0x129/0x140 [<ffffffff8146d561>] bucket_table_alloc+0xc1/0x1d0 [<ffffffff8146da1d>] rhashtable_insert_rehash+0x5d/0xe0 [<ffffffff819fcfff>] nf_nat_setup_info+0x2ef/0x400 The failure happens when allocating the spinlock array. Even with GFP_KERNEL its unlikely for such a large allocation to succeed. Thomas Graf pointed me at inet_ehash_locks_alloc(), so in addition to adding NOWARN for atomic allocations this also makes the bucket-array sizing more conservative. In commit `095dc8e0c3` ("tcp: fix/cleanup inet_ehash_locks_alloc()"), Eric Dumazet says: "Budget 2 cache lines per cpu worth of 'spinlocks'". IOW, consider size needed by a single spinlock when determining number of locks per cpu. So with 64 byte per cacheline and 4 byte per spinlock this gives 32 locks per cpu. Resulting size of the lock-array (sizeof(spinlock) == 4): cpus: 1 2 4 8 16 32 64 old: 1k 1k 4k 8k 16k 16k 16k new: 128 256 512 1k 2k 4k 8k 8k allocation should have decent chance of success even with GFP_ATOMIC, and should not fail with GFP_KERNEL. With 72-byte spinlock (LOCKDEP): cpus : 1 2 old: 9k 18k new: ~2k ~4k Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Suggested-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-14 21:12:57 -07:00
Sabrina Dubroca	952fcfd08c	net: remove type_check from dev_get_nest_level() The idea for type_check in dev_get_nest_level() was to count the number of nested devices of the same type (currently, only macvlan or vlan devices). This prevented the false positive lockdep warning on configurations such as: eth0 <--- macvlan0 <--- vlan0 <--- macvlan1 However, this doesn't prevent a warning on a configuration such as: eth0 <--- macvlan0 <--- vlan0 eth1 <--- vlan1 <--- macvlan1 In this case, all the locks end up with a nesting subclass of 1, so lockdep thinks that there is still a deadlock: - in the first case we have (macvlan_netdev_addr_lock_key, 1) and then take (vlan_netdev_xmit_lock_key, 1) - in the second case, we have (vlan_netdev_xmit_lock_key, 1) and then take (macvlan_netdev_addr_lock_key, 1) By removing the linktype check in dev_get_nest_level() and always incrementing the nesting depth, lockdep considers this configuration valid. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-13 15:15:54 -07:00
Sabrina Dubroca	e200387245	macsec: fix lockdep splats when nesting devices Currently, trying to setup a vlan over a macsec device, or other combinations of devices, triggers a lockdep warning. Use netdev_lockdep_set_classes and ndo_get_lock_subclass, similar to what macvlan does. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-13 15:15:54 -07:00
Mike Manning	bc561632dd	net: ipv6: Do not keep IPv6 addresses when IPv6 is disabled If IPv6 is disabled when the option is set to keep IPv6 addresses on link down, userspace is unaware of this as there is no such indication via netlink. The solution is to remove the IPv6 addresses in this case, which results in netlink messages indicating removal of addresses in the usual manner. This fix also makes the behavior consistent with the case of having IPv6 disabled first, which stops IPv6 addresses from being added. Fixes: `f1705ec197` ("net: ipv6: Make address flushing on ifdown optional") Signed-off-by: Mike Manning <mmanning@brocade.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-13 15:14:00 -07:00
Vegard Nossum	54236ab09e	net/sctp: always initialise sctp_ht_iter::start_fail sctp_transport_seq_start() does not currently clear iter->start_fail on success, but relies on it being zero when it is allocated (by seq_open_net()). This can be a problem in the following sequence: open() // allocates iter (and implicitly sets iter->start_fail = 0) read() - iter->start() // fails and sets iter->start_fail = 1 - iter->stop() // doesn't call sctp_transport_walk_stop() (correct) read() again - iter->start() // succeeds, but doesn't change iter->start_fail - iter->stop() // doesn't call sctp_transport_walk_stop() (wrong) We should initialize sctp_ht_iter::start_fail to zero if ->start() succeeds, otherwise it's possible that we leave an old value of 1 there, which will cause ->stop() to not call sctp_transport_walk_stop(), which causes all sorts of problems like not calling rcu_read_unlock() (and preempt_enable()), eventually leading to more warnings like this: BUG: sleeping function called from invalid context at mm/slab.h:388 in_atomic(): 0, irqs_disabled(): 0, pid: 16551, name: trinity-c2 Preemption disabled at:[<ffffffff819bceb6>] rhashtable_walk_start+0x46/0x150 [<ffffffff81149abb>] preempt_count_add+0x1fb/0x280 [<ffffffff83295892>] _raw_spin_lock+0x12/0x40 [<ffffffff819bceb6>] rhashtable_walk_start+0x46/0x150 [<ffffffff82ec665f>] sctp_transport_walk_start+0x2f/0x60 [<ffffffff82edda1d>] sctp_transport_seq_start+0x4d/0x150 [<ffffffff81439e50>] traverse+0x170/0x850 [<ffffffff8143aeec>] seq_read+0x7cc/0x1180 [<ffffffff814f996c>] proc_reg_read+0xbc/0x180 [<ffffffff813d0384>] do_loop_readv_writev+0x134/0x210 [<ffffffff813d2a95>] do_readv_writev+0x565/0x660 [<ffffffff813d6857>] vfs_readv+0x67/0xa0 [<ffffffff813d6c16>] do_preadv+0x126/0x170 [<ffffffff813d710c>] SyS_preadv+0xc/0x10 [<ffffffff8100334c>] do_syscall_64+0x19c/0x410 [<ffffffff83296225>] return_from_SYSCALL_64+0x0/0x6a [<ffffffffffffffff>] 0xffffffffffffffff Notice that this is a subtly different stacktrace from the one in commit `5fc382d875` ("net/sctp: terminate rhashtable walk correctly"). Cc: Xin Long <lucien.xin@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-By: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-13 15:10:16 -07:00

1 2 3 4 5 ...

615249 Commits