linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
David S. Miller	bb598c1b8c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Several cases of bug fixes in 'net' overlapping other changes in 'net-next-. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-15 10:54:36 -05:00
WANG Cong	00ffc1ba02	genetlink: fix a memory leak on error path In __genl_register_family(), when genl_validate_assign_mc_groups() fails, we forget to free the memory we possibly allocate for family->attrbuf. Note, some callers call genl_unregister_family() to clean up on error path, it doesn't work because the family is inserted to the global list in the nearly last step. Cc: Jakub Kicinski <kubakici@wp.pl> Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 16:52:29 -04:00
Eric Dumazet	93636d1f1f	netlink: netlink_diag_dump() runs without locks A recent commit removed locking from netlink_diag_dump() but forgot one error case. ===================================== [ BUG: bad unlock balance detected! ] 4.9.0-rc3+ #336 Not tainted ------------------------------------- syz-executor/4018 is trying to release lock ([ 36.220068] nl_table_lock ) at: [<ffffffff82dc8683>] netlink_diag_dump+0x1a3/0x250 net/netlink/diag.c:182 but there are no more locks to release! other info that might help us debug this: 3 locks held by syz-executor/4018: #0: [ 36.220068] ( sock_diag_mutex[ 36.220068] ){+.+.+.} , at: [ 36.220068] [<ffffffff82c3873b>] sock_diag_rcv+0x1b/0x40 #1: [ 36.220068] ( sock_diag_table_mutex[ 36.220068] ){+.+.+.} , at: [ 36.220068] [<ffffffff82c38e00>] sock_diag_rcv_msg+0x140/0x3a0 #2: [ 36.220068] ( nlk->cb_mutex[ 36.220068] ){+.+.+.} , at: [ 36.220068] [<ffffffff82db6600>] netlink_dump+0x50/0xac0 stack backtrace: CPU: 1 PID: 4018 Comm: syz-executor Not tainted 4.9.0-rc3+ #336 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 ffff8800645df688 ffffffff81b46934 ffffffff84eb3e78 ffff88006ad85800 ffffffff82dc8683 ffffffff84eb3e78 ffff8800645df6b8 ffffffff812043ca dffffc0000000000 ffff88006ad85ff8 ffff88006ad85fd0 00000000ffffffff Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [<ffffffff81b46934>] dump_stack+0xb3/0x10f lib/dump_stack.c:51 [<ffffffff812043ca>] print_unlock_imbalance_bug+0x17a/0x1a0 kernel/locking/lockdep.c:3388 [< inline >] __lock_release kernel/locking/lockdep.c:3512 [<ffffffff8120cfd8>] lock_release+0x8e8/0xc60 kernel/locking/lockdep.c:3765 [< inline >] __raw_read_unlock ./include/linux/rwlock_api_smp.h:225 [<ffffffff83fc001a>] _raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255 [<ffffffff82dc8683>] netlink_diag_dump+0x1a3/0x250 net/netlink/diag.c:182 [<ffffffff82db6947>] netlink_dump+0x397/0xac0 net/netlink/af_netlink.c:2110 Fixes: `ad20207432` ("netlink: Use rhashtable walk interface in diag dump") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-03 16:16:51 -04:00
Wei Yongjun	22ca904ad7	genetlink: fix error return code in genl_register_family() Fix to return a negative error code from the idr_alloc() error handling case instead of 0, as done elsewhere in this function. Also fix the return value check of idr_alloc() since idr_alloc return negative errors on failure, not zero. Fixes: `2ae0f17df1` ("genetlink: use idr to track families") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-01 12:13:13 -04:00
pravin shelar	0e82c76359	genetlink: Fix generic netlink family unregister This patch fixes a typo in unregister operation. Following crash is fixed by this patch. It can be easily reproduced by repeating modprobe and rmmod module that uses genetlink. [ 261.446686] BUG: unable to handle kernel paging request at ffffffffa0264088 [ 261.448921] IP: [<ffffffff813cb70e>] strcmp+0xe/0x30 [ 261.450494] PGD 1c09067 [ 261.451266] PUD 1c0a063 [ 261.452091] PMD 8068d5067 [ 261.452525] PTE 0 [ 261.453164] [ 261.453618] Oops: 0000 [#1] SMP [ 261.454577] Modules linked in: openvswitch(+) ... [ 261.480753] RIP: 0010:[<ffffffff813cb70e>] [<ffffffff813cb70e>] strcmp+0xe/0x30 [ 261.483069] RSP: 0018:ffffc90003c0bc28 EFLAGS: 00010282 [ 261.510145] Call Trace: [ 261.510896] [<ffffffff816f10ca>] genl_family_find_byname+0x5a/0x70 [ 261.512819] [<ffffffff816f2319>] genl_register_family+0xb9/0x630 [ 261.514805] [<ffffffffa02840bc>] dp_init+0xbc/0x120 [openvswitch] [ 261.518268] [<ffffffff8100217d>] do_one_initcall+0x3d/0x160 [ 261.525041] [<ffffffff811808a9>] do_init_module+0x60/0x1f1 [ 261.526754] [<ffffffff8110687f>] load_module+0x22af/0x2860 [ 261.530144] [<ffffffff81107026>] SYSC_finit_module+0x96/0xd0 [ 261.531901] [<ffffffff8110707e>] SyS_finit_module+0xe/0x10 [ 261.533605] [<ffffffff8100391e>] do_syscall_64+0x6e/0x180 [ 261.535284] [<ffffffff817c2faf>] entry_SYSCALL64_slow_path+0x25/0x25 [ 261.546512] RIP [<ffffffff813cb70e>] strcmp+0xe/0x30 [ 261.550198] ---[ end trace 76505a814dd68770 ]--- Fixes: `2ae0f17df1` ("genetlink: use idr to track families"). Reported-by: Jarno Rajahalme <jarno@ovn.org> CC: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 20:58:15 -04:00
Johannes Berg	56989f6d85	genetlink: mark families as __ro_after_init Now genl_register_family() is the only thing (other than the users themselves, perhaps, but I didn't find any doing that) writing to the family struct. In all families that I found, genl_register_family() is only called from __init functions (some indirectly, in which case I've add __init annotations to clarifly things), so all can actually be marked __ro_after_init. This protects the data structure from accidental corruption. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-27 16:16:09 -04:00
Johannes Berg	2ae0f17df1	genetlink: use idr to track families Since generic netlink family IDs are small integers, allocated densely, IDR is an ideal match for lookups. Replace the existing hand-written hash-table with IDR for allocation and lookup. This lets the families only be written to once, during register, since the list_head can be removed and removal of a family won't cause any writes. It also slightly reduces the code size (by about 1.3k on x86-64). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-27 16:16:09 -04:00
Johannes Berg	489111e5c2	genetlink: statically initialize families Instead of providing macros/inline functions to initialize the families, make all users initialize them statically and get rid of the macros. This reduces the kernel code size by about 1.6k on x86-64 (with allyesconfig). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-27 16:16:09 -04:00
Johannes Berg	a07ea4d994	genetlink: no longer support using static family IDs Static family IDs have never really been used, the only use case was the workaround I introduced for those users that assumed their family ID was also their multicast group ID. Additionally, because static family IDs would never be reserved by the generic netlink code, using a relatively low ID would only work for built-in families that can be registered immediately after generic netlink is started, which is basically only the control family (apart from the workaround code, which I also had to add code for so it would reserve those IDs) Thus, anything other than GENL_ID_GENERATE is flawed and luckily not used except in the cases I mentioned. Move those workarounds into a few lines of code, and then get rid of GENL_ID_GENERATE entirely, making it more robust. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-27 16:16:09 -04:00
Johannes Berg	c90c39dab3	genetlink: introduce and use genl_family_attrbuf() This helper function allows family implementations to access their family's attrbuf. This gets rid of the attrbuf usage in families, and also adds locking validation, since it's not valid to use the attrbuf with parallel_ops or outside of the dumpit callback. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-27 16:16:08 -04:00
Eric Dumazet	d35c99ff77	netlink: do not enter direct reclaim from netlink_dump() Since linux-3.15, netlink_dump() can use up to 16384 bytes skb allocations. Due to struct skb_shared_info ~320 bytes overhead, we end up using order-3 (on x86) page allocations, that might trigger direct reclaim and add stress. The intent was really to attempt a large allocation but immediately fallback to a smaller one (order-1 on x86) in case of memory stress. On recent kernels (linux-4.4), we can remove __GFP_DIRECT_RECLAIM to meet the goal. Old kernels would need to remove __GFP_WAIT While we are at it, since we do an order-3 allocation, allow to use all the allocated bytes instead of 16384 to reduce syscalls during large dumps. iproute2 already uses 32KB recvmsg() buffer sizes. Alexei provided an initial patch downsizing to SKB_WITH_OVERHEAD(16384) Fixes: `9063e21fb0` ("netlink: autosize skb lengthes") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Alexei Starovoitov <ast@kernel.org> Cc: Greg Thelen <gthelen@google.com> Reviewed-by: Greg Rose <grose@lightfleet.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-06 20:53:13 -04:00
Andrey Vagin	733ade23de	netlink: don't forget to release a rhashtable_iter structure This bug was detected by kmemleak: unreferenced object 0xffff8804269cc3c0 (size 64): comm "criu", pid 1042, jiffies 4294907360 (age 13.713s) hex dump (first 32 bytes): a0 32 cc 2c 04 88 ff ff 00 00 00 00 00 00 00 00 .2.,............ 00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de ................ backtrace: [<ffffffff8184dffa>] kmemleak_alloc+0x4a/0xa0 [<ffffffff8124720f>] kmem_cache_alloc_trace+0x10f/0x280 [<ffffffffa02864cc>] __netlink_diag_dump+0x26c/0x290 [netlink_diag] v2: don't remove a reference on a rhashtable_iter structure to release it from netlink_diag_dump_done Cc: Herbert Xu <herbert@gondor.apana.org.au> Fixes: `ad20207432` ("netlink: Use rhashtable walk interface in diag dump") Signed-off-by: Andrei Vagin <avagin@openvz.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-07 17:29:38 -07:00
stephen hemminger	12d8de6d95	net: make genetlink ctrl ops const Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-01 14:09:00 -07:00
Herbert Xu	ad20207432	netlink: Use rhashtable walk interface in diag dump This patch converts the diag dumping code to use the rhashtable walk code instead of going through rhashtable by hand. The lock nl_table_lock is now only taken while we process the multicast list as it's not needed for the rhashtable walk. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-19 14:40:25 -07:00
Fabien Siron	21aff3b905	net/netlink/af_netlink.h: Remove unused structure. Signed-off-by: Fabien Siron <fabien.siron@epita.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-09 22:26:24 -07:00
Herbert Xu	92964c79b3	netlink: Fix dump skb leak/double free When we free cb->skb after a dump, we do it after releasing the lock. This means that a new dump could have started in the time being and we'll end up freeing their skb instead of ours. This patch saves the skb and module before we unlock so we free the right memory. Fixes: `16b304f340` ("netlink: Eliminate kmalloc in netlink dump operation.") Reported-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-16 22:05:15 -04:00
David S. Miller	1602f49b58	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts were two cases of simple overlapping changes, nothing serious. In the UDP case, we need to add a hlist_add_tail_rcu() to linux/rculist.h, because we've moved UDP socket handling away from using nulls lists. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-23 18:51:33 -04:00
Dmitry Ivanov	e272602039	netlink: don't send NETLINK_URELEASE for unbound sockets All existing users of NETLINK_URELEASE use it to clean up resources that were previously allocated to a socket via some command. As a result, no users require getting this notification for unbound sockets. Sending it for unbound sockets, however, is a problem because any user (including unprivileged users) can create a socket that uses the same ID as an existing socket. Binding this new socket will fail, but if the NETLINK_URELEASE notification is generated for such sockets, the users thereof will be tricked into thinking the socket that they allocated the resources for is closed. In the nl80211 case, this will cause destruction of virtual interfaces that still belong to an existing hostapd process; this is the case that Dmitry noticed. In the NFC case, it will cause a poll abort. In the case of netlink log/queue it will cause them to stop reporting events, as if NFULNL_CFG_CMD_UNBIND/NFQNL_CFG_CMD_UNBIND had been called. Fix this problem by checking that the socket is bound before generating the NETLINK_URELEASE notification. Cc: stable@vger.kernel.org Signed-off-by: Dmitry Ivanov <dima@ubnt.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-10 23:32:23 -04:00
Bob Copeland	8f6fd83c6c	rhashtable: accept GFP flags in rhashtable_walk_init In certain cases, the 802.11 mesh pathtable code wants to iterate over all of the entries in the forwarding table from the receive path, which is inside an RCU read-side critical section. Enable walks inside atomic sections by allowing GFP_ATOMIC allocations for the walker state. Change all existing callsites to pass in GFP_KERNEL. Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Bob Copeland <me@bobcopeland.com> [also adjust gfs2/glock.c and rhashtable tests] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2016-04-05 10:56:32 +02:00
David Decotigny	025c68186e	netlink: add support for NIC driver ioctls By returning -ENOIOCTLCMD, sock_do_ioctl() falls back to calling dev_ioctl(), which provides support for NIC driver ioctls, which includes ethtool support. This is similar to the way ioctls are handled in udp.c or tcp.c. This removes the requirement that ethtool for example be tied to the support of a specific L3 protocol (ethtool uses an AF_INET socket today). Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-22 15:45:44 -04:00
Florian Westphal	c5b0db3263	nfnetlink: Revert "nfnetlink: add support for memory mapped netlink" reverts commit `3ab1f683bf` ("nfnetlink: add support for memory mapped netlink")' Like previous commits in the series, remove wrappers that are not needed after mmapped netlink removal. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-18 11:42:22 -05:00
Florian Westphal	263ea09084	Revert "genl: Add genlmsg_new_unicast() for unicast message allocation" This reverts commit `bb9b18fb55` ("genl: Add genlmsg_new_unicast() for unicast message allocation")'. Nothing wrong with it; its no longer needed since this was only for mmapped netlink support. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-18 11:42:19 -05:00
Florian Westphal	d1b4c689d4	netlink: remove mmapped netlink support mmapped netlink has a number of unresolved issues: - TX zerocopy support had to be disabled more than a year ago via commit `4682a03586` ("netlink: Always copy on mmap TX.") because the content of the mmapped area can change after netlink attribute validation but before message processing. - RX support was implemented mainly to speed up nfqueue dumping packet payload to userspace. However, since commit `ae08ce0021` ("netfilter: nfnetlink_queue: zero copy support") we avoid one copy with the socket-based interface too (via the skb_zerocopy helper). The other problem is that skbs attached to mmaped netlink socket behave different from normal skbs: - they don't have a shinfo area, so all functions that use skb_shinfo() (e.g. skb_clone) cannot be used. - reserving headroom prevents userspace from seeing the content as it expects message to start at skb->head. See for instance commit `aa3a022094` ("netlink: not trim skb for mmaped socket when dump"). - skbs handed e.g. to netlink_ack must have non-NULL skb->sk, else we crash because it needs the sk to check if a tx ring is attached. Also not obvious, leads to non-intuitive bug fixes such as `7c7bdf359` ("netfilter: nfnetlink: use original skbuff when acking batches"). mmaped netlink also didn't play nicely with the skb_zerocopy helper used by nfqueue and openvswitch. Daniel Borkmann fixed this via commit `6bb0fef489` ("netlink, mmap: fix edge-case leakages in nf queue zero-copy")' but at the cost of also needing to provide remaining length to the allocation function. nfqueue also has problems when used with mmaped rx netlink: - mmaped netlink doesn't allow use of nfqueue batch verdict messages. Problem is that in the mmap case, the allocation time also determines the ordering in which the frame will be seen by userspace (A allocating before B means that A is located in earlier ring slot, but this also means that B might get a lower sequence number then A since seqno is decided later. To fix this we would need to extend the spinlocked region to also cover the allocation and message setup which isn't desirable. - nfqueue can now be configured to queue large (GSO) skbs to userspace. Queing GSO packets is faster than having to force a software segmentation in the kernel, so this is a desirable option. However, with a mmap based ring one has to use 64kb per ring slot element, else mmap has to fall back to the socket path (NL_MMAP_STATUS_COPY) for all large packets. To use the mmap interface, userspace not only has to probe for mmap netlink support, it also has to implement a recv/socket receive path in order to handle messages that exceed the size of an rx ring element. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ken-ichirou MATSUZAWA <chamaken@gmail.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-18 11:42:18 -05:00
Tycho Andersen	4a92602aa1	openvswitch: allow management from inside user namespaces Operations with the GENL_ADMIN_PERM flag fail permissions checks because this flag means we call netlink_capable, which uses the init user ns. Instead, let's introduce a new flag, GENL_UNS_ADMIN_PERM for operations which should be allowed inside a user namespace. The motivation for this is to be able to run openvswitch in unprivileged containers. I've tested this and it seems to work, but I really have no idea about the security consequences of this patch, so thoughts would be much appreciated. v2: use the GENL_UNS_ADMIN_PERM flag instead of a check in each function v3: use separate ifs for UNS_ADMIN_PERM and ADMIN_PERM, instead of one massive one Reported-by: James Page <james.page@canonical.com> Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> CC: Eric Biederman <ebiederm@xmission.com> CC: Pravin Shelar <pshelar@ovn.org> CC: Justin Pettit <jpettit@nicira.com> CC: "David S. Miller" <davem@davemloft.net> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-11 09:53:19 -05:00
Ken-ichirou MATSUZAWA	aa3a022094	netlink: not trim skb for mmaped socket when dump We should not trim skb for mmaped socket since its buf size is fixed and userspace will read as frame which data equals head. mmaped socket will not call recvmsg, means max_recvmsg_len is 0, skb_reserve was not called before commit: `db65a3aaf2`. Fixes: `db65a3aaf2` (netlink: Trim skb to alloc size to avoid MSG_TRUNC) Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-29 20:25:17 -08:00

1 2 3 4 5 ...

472 Commits