kernel

mirror of https://github.com/ukui/kernel.git synced 2026-03-09 10:07:04 -07:00

Author	SHA1	Message	Date
Tejun Heo	783bd07d09	kernfs: Implement kernfs_show() Currently, kernfs nodes can be created hidden and activated later by calling kernfs_activate() to allow creation of multiple nodes to succeed or fail as a unit. This is an one-way one-time-only transition. This patch introduces kernfs_show() which can toggle visibility dynamically. As the currently proposed use - toggling the cgroup pressure files - only requires operating on leaf nodes, for the sake of simplicity, restrict it as such for now. Hiding uses the same mechanism as deactivation and likewise guarantees that there are no in-flight operations on completion. KERNFS_ACTIVATED and KERNFS_HIDDEN are used to manage the interactions between activations and show/hide operations. A node is visible iff both activated & !hidden. Cc: Chengming Zhou <zhouchengming@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Tested-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220828050440.734579-9-tj@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-01 18:08:44 +02:00
Tejun Heo	c25491747b	kernfs: Add KERNFS_REMOVING flags KERNFS_ACTIVATED tracks whether a given node has ever been activated. As a node was only deactivated on removal, this was used for 1. Drain optimization (removed by the previous patch). 2. To hide !activated nodes 3. To avoid double activations 4. Reject adding children to a node being removed 5. Skip activaing a node which is being removed. We want to decouple deactivation from removal so that nodes can be deactivated and hidden dynamically, which makes KERNFS_ACTIVATED useless for all of the above purposes. #1 is already gone. #2 and #3 can instead test whether the node is currently active. A new flag KERNFS_REMOVING is added to explicitly mark nodes which are being removed for #4 and #5. While this leaves KERNFS_ACTIVATED with no users, leave it be as it will be used in a following patch. Cc: Chengming Zhou <zhouchengming@bytedance.com> Tested-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220828050440.734579-7-tj@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-01 18:08:44 +02:00
Imran Khan	2fd26970cf	Revert "kernfs: Change kernfs_notify_list to llist." This reverts commit `b8f35fa118`. This is causing regression due to same kernfs_node getting added multiple times in kernfs_notify_list so revert it until safe way of using llist in this context is found. Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Michael Walle <michael@walle.cc> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Imran Khan <imran.f.khan@oracle.com> Cc: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220705201026.2487665-1-imran.f.khan@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-07-06 14:20:22 +02:00
Imran Khan	1d25b84e44	kernfs: Replace global kernfs_open_file_mutex with hashed mutexes. In current kernfs design a single mutex, kernfs_open_file_mutex, protects the list of kernfs_open_file instances corresponding to a sysfs attribute. So even if different tasks are opening or closing different sysfs files they can contend on osq_lock of this mutex. The contention is more apparent in large scale systems with few hundred CPUs where most of the CPUs have running tasks that are opening, accessing or closing sysfs files at any point of time. Using hashed mutexes in place of a single global mutex, can significantly reduce contention around global mutex and hence can provide better scalability. Moreover as these hashed mutexes are not part of kernfs_node objects we will not see any singnificant change in memory utilization of kernfs based file systems like sysfs, cgroupfs etc. Modify interface introduced in previous patch to make use of hashed mutexes. Use kernfs_node address as hashing key. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Imran Khan <imran.f.khan@oracle.com> Link: https://lore.kernel.org/r/20220615021059.862643-5-imran.f.khan@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-06-27 16:46:15 +02:00
Imran Khan	b8f35fa118	kernfs: Change kernfs_notify_list to llist. At present kernfs_notify_list is implemented as a singly linked list of kernfs_node(s), where last element points to itself and value of ->attr.next tells if node is present on the list or not. Both addition and deletion to list happen under kernfs_notify_lock. Change kernfs_notify_list to llist so that addition to list can heppen locklessly. Suggested by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Imran Khan <imran.f.khan@oracle.com> Link: https://lore.kernel.org/r/20220615021059.862643-3-imran.f.khan@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-06-27 16:46:15 +02:00
Imran Khan	086c00c71f	kernfs: make ->attr.open RCU protected. After removal of kernfs_open_node->refcnt in the previous patch, kernfs_open_node_lock can be removed as well by making ->attr.open RCU protected. kernfs_put_open_node can delegate freeing to ->attr.open to RCU and other readers of ->attr.open can do so under rcu_read_(un)lock. Suggested by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Imran Khan <imran.f.khan@oracle.com> Link: https://lore.kernel.org/r/20220615021059.862643-2-imran.f.khan@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-06-27 16:46:14 +02:00
Greg Kroah-Hartman	7a19006b60	kernfs: remove unneeded #if 0 guard Commit `f2eb478f2f` ("kernfs: move struct kernfs_root out of the public view.") moved kernfs_root out of kernfs.h, but my debugging code of a #if 0 was left in accidentally. Fix that up by removing the guards. Fixes: `f2eb478f2f` ("kernfs: move struct kernfs_root out of the public view.") Cc: Tejun Heo <tj@kernel.org> Reported-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20220318073452.1486568-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-18 09:47:04 +01:00
Greg Kroah-Hartman	f2eb478f2f	kernfs: move struct kernfs_root out of the public view. There is no need to have struct kernfs_root be part of kernfs.h for the whole kernel to see and poke around it. Move it internal to kernfs code and provide a helper function, kernfs_root_to_node(), to handle the one field that kernfs users were directly accessing from the structure. Cc: Imran Khan <imran.f.khan@oracle.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220222070713.3517679-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-23 15:46:34 +01:00
Andy Shevchenko	79f1c73042	kernfs: Replace kernel.h with the necessary inclusions When kernel.h is used in the headers it adds a lot into dependency hell, especially when there are circular dependencies are involved. Replace kernel.h inclusion with the list of what is really being used. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20211209123008.3391-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-21 10:34:39 +01:00
Minchan Kim	393c371408	kernfs: switch global kernfs_rwsem lock to per-fs lock The kernfs implementation has big lock granularity(kernfs_rwsem) so every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock. It makes trouble for some cases to wait the global lock for a long time even though they are totally independent contexts each other. A general example is process A goes under direct reclaim with holding the lock when it accessed the file in sysfs and process B is waiting the lock with exclusive mode and then process C is waiting the lock until process B could finish the job after it gets the lock from process A. This patch switches the global kernfs_rwsem to per-fs lock, which put the rwsem into kernfs_root. Suggested-by: Tejun Heo <tj@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Minchan Kim <minchan@kernel.org> Link: https://lore.kernel.org/r/20211118230008.2679780-1-minchan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-11-24 13:55:16 +01:00
Christoph Hellwig	eaf501e0d8	kernfs: remove the unused lockdep_key field in struct kernfs_ops Not actually used anywhere. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210913054121.616001-4-hch@lst.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-14 16:55:19 +02:00
Christoph Hellwig	2935662449	kernfs: remove kernfs_create_file and kernfs_create_file_ns All callers actually use __kernfs_create_file. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210913054121.616001-3-hch@lst.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-14 16:54:55 +02:00
Ian Kent	7ba0273b2f	kernfs: switch kernfs to use an rwsem The kernfs global lock restricts the ability to perform kernfs node lookup operations in parallel during path walks. Change the kernfs mutex to an rwsem so that, when opportunity arises, node searches can be done in parallel with path walk lookups. Reviewed-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Ian Kent <raven@themaw.net> Link: https://lore.kernel.org/r/162642770946.63632.2218304587223241374.stgit@web.messagingengine.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-07-27 09:29:15 +02:00
Ian Kent	895adbec30	kernfs: add a revision to identify directory node changes Add a revision counter to kernfs directory nodes so it can be used to detect if a directory node has changed during negative dentry revalidation. There's an assumption that sizeof(unsigned long) <= sizeof(pointer) on all architectures and as far as I know that assumption holds. So adding a revision counter to the struct kernfs_elem_dir variant of the kernfs_node type union won't increase the size of the kernfs_node struct. This is because struct kernfs_elem_dir is at least sizeof(pointer) smaller than the largest union variant. It's tempting to make the revision counter a u64 but that would increase the size of kernfs_node on archs where sizeof(pointer) is smaller than the revision counter. Reviewed-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Ian Kent <raven@themaw.net> Link: https://lore.kernel.org/r/162642769895.63632.8356662784964509867.stgit@web.messagingengine.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-07-27 09:29:14 +02:00
Willem de Bruijn	21774fd81a	kernfs: bring names in comments in line with code Fix two stragglers in the comments of the below rename operation. Fixes: `adc5e8b58f` ("kernfs: drop s_ prefix from kernfs_node members") Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20201015185726.1386868-1-willemdebruijn.kernel@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-11-09 18:12:39 +01:00
Daniel Xu	0c47383ba3	kernfs: Add option to enable user xattrs User extended attributes are useful as metadata storage for kernfs consumers like cgroups. Especially in the case of cgroups, it is useful to have a central metadata store that multiple processes/services can use to coordinate actions. A concrete example is for userspace out of memory killers. We want to let delegated cgroup subtree owners (running as non-root) to be able to say "please avoid killing this cgroup". This is especially important for desktop linux as delegated subtrees owners are less likely to run as root. This patch introduces a new flag, KERNFS_ROOT_SUPPORT_USER_XATTR, that lets kernfs consumers enable user xattr support. An initial limit of 128 entries or 128KB -- whichever is hit first -- is placed per cgroup because xattrs come from kernel memory and we don't want to let unprivileged users accidentally eat up too much kernel memory. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Acked-by: Chris Down <chris@chrisdown.name> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2020-03-16 15:53:47 -04:00
Tejun Heo	40430452fd	kernfs: use 64bit inos if ino_t is 64bit Each kernfs_node is identified with a 64bit ID. The low 32bit is exposed as ino and the high gen. While this already allows using inos as keys by looking up with wildcard generation number of 0, it's adding unnecessary complications for 64bit ino archs which can directly use kernfs_node IDs as inos to uniquely identify each cgroup instance. This patch exposes IDs directly as inos on 64bit ino archs. The conversion is mostly straight-forward. * 32bit ino archs behave the same as before. 64bit ino archs now use the whole 64bit ID as ino and the generation number is fixed at 1. * 64bit inos still use the same idr allocator which gurantees that the lower 32bits identify the current live instance uniquely and the high 32bits are incremented whenever the low bits wrap. As the upper 32bits are no longer used as gen and we don't wanna start ino allocation with 33rd bit set, the initial value for highbits allocation is changed to 0 on 64bit ino archs. * blktrace exposes two 32bit numbers - (INO,GEN) pair - to identify the issuing cgroup. Userland builds FILEID_INO32_GEN fids from these numbers to look up the cgroups. To remain compatible with the behavior, always output (LOW32,HIGH32) which will be constructed back to the original 64bit ID by __kernfs_fh_to_dentry(). Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Namhyung Kim <namhyung@kernel.org>	2019-11-12 08:18:04 -08:00
Tejun Heo	fe0f726c9f	kernfs: combine ino/id lookup functions into kernfs_find_and_get_node_by_id() kernfs_find_and_get_node_by_ino() looks the kernfs_node matching the specified ino. On top of that, kernfs_get_node_by_id() and kernfs_fh_get_inode() implement full ID matching by testing the rest of ID. On surface, confusingly, the two are slightly different in that the latter uses 0 gen as wildcard while the former doesn't - does it mean that the latter can't uniquely identify inodes w/ 0 gen? In practice, this is a distinction without a difference because generation number starts at 1. There are no actual IDs with 0 gen, so it can always safely used as wildcard. Let's simplify the code by renaming kernfs_find_and_get_node_by_ino() to kernfs_find_and_get_node_by_id(), moving all lookup logics into it, and removing now unnecessary kernfs_get_node_by_id(). Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-11-12 08:18:04 -08:00
Tejun Heo	67c0496e87	kernfs: convert kernfs_node->id from union kernfs_node_id to u64 kernfs_node->id is currently a union kernfs_node_id which represents either a 32bit (ino, gen) pair or u64 value. I can't see much value in the usage of the union - all that's needed is a 64bit ID which the current code is already limited to. Using a union makes the code unnecessarily complicated and prevents using 64bit ino without adding practical benefits. This patch drops union kernfs_node_id and makes kernfs_node->id a u64. ino is stored in the lower 32bits and gen upper. Accessors - kernfs[_id]_ino() and kernfs[_id]_gen() - are added to retrieve the ino and gen. This simplifies ID handling less cumbersome and will allow using 64bit inos on supported archs. This patch doesn't make any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Alexei Starovoitov <ast@kernel.org>	2019-11-12 08:18:03 -08:00
Tejun Heo	e23f568aa6	kernfs: fix ino wrap-around detection When the 32bit ino wraps around, kernfs increments the generation number to distinguish reused ino instances. The wrap-around detection tests whether the allocated ino is lower than what the cursor but the cursor is pointing to the next ino to allocate so the condition never triggers. Fix it by remembering the last ino and comparing against that. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: `4a3ef68aca` ("kernfs: implement i_generation") Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org # v4.14+	2019-11-12 08:18:03 -08:00
Thomas Gleixner	55716d2643	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 Based on 1 normalized pattern(s): this file is released under the gplv2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 68 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531190114.292346262@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:16 +02:00
Linus Torvalds	f72dae2089	Merge tag 'selinux-pr-20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "We've got a few SELinux patches for the v5.2 merge window, the highlights are below: - Add LSM hooks, and the SELinux implementation, for proper labeling of kernfs. While we are only including the SELinux implementation here, the rest of the LSM folks have given the hooks a thumbs-up. - Update the SELinux mdp (Make Dummy Policy) script to actually work on a modern system. - Disallow userspace to change the LSM credentials via /proc/self/attr when the task's credentials are already overridden. The change was made in procfs because all the LSM folks agreed this was the Right Thing To Do and duplicating it across each LSM was going to be annoying" * tag 'selinux-pr-20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: proc: prevent changes to overridden credentials selinux: Check address length before reading address family kernfs: fix xattr name handling in LSM helpers MAINTAINERS: update SELinux file patterns selinux: avoid uninitialized variable warning selinux: remove useless assignments LSM: lsm_hooks.h - fix missing colon in docstring selinux: Make selinux_kernfs_init_security static kernfs: initialize security of newly created nodes selinux: implement the kernfs_init_security hook LSM: add new hook for kernfs node initialization kernfs: use simple_xattrs for security attributes selinux: try security xattr after genfs for kernfs filesystems kernfs: do not alloc iattrs in kernfs_xattr_get kernfs: clean up struct kernfs_iattrs scripts/selinux: fix build selinux: use kernel linux/socket.h for genheaders and mdp scripts/selinux: modernize mdp	2019-05-07 18:48:09 -07:00
Christina Quast	0d1a393d61	fs: kernfs: Corrected spelling mistake flies => files Signed-off-by: Christina Quast <cquast@hanoverdisplays.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-04-25 21:41:35 +02:00
Ondrej Mosnacek	1537ad15c9	kernfs: fix xattr name handling in LSM helpers The implementation of kernfs_security_xattr_() helpers reuses the kernfs_node_xattr_() functions, which take the suffix of the xattr name and extract full xattr name from it using xattr_full_name(). However, this function relies on the fact that the suffix passed to xattr handlers from VFS is always constructed from the full name by just incerementing the pointer. This doesn't necessarily hold for the callers of kernfs_security_xattr_(), so their usage will easily lead to out-of-bounds access. Fix this by moving the xattr name reconstruction to the VFS xattr handlers and replacing the kernfs_security_xattr_() helpers with more general kernfs_xattr_*() helpers that take full xattr name and allow accessing all kernfs node's xattrs. Reported-by: kernel test robot <rong.a.chen@intel.com> Fixes: `b230d5aba2` ("LSM: add new hook for kernfs node initialization") Fixes: `ec882da5cd` ("selinux: implement the kernfs_init_security hook") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2019-04-04 09:00:27 -04:00
Ondrej Mosnacek	b230d5aba2	LSM: add new hook for kernfs node initialization This patch introduces a new security hook that is intended for initializing the security data for newly created kernfs nodes, which provide a way of storing a non-default security context, but need to operate independently from mounts (and therefore may not have an associated inode at the moment of creation). The main motivation is to allow kernfs nodes to inherit the context of the parent under SELinux, similar to the behavior of security_inode_init_security(). Other LSMs may implement their own logic for handling the creation of new nodes. This patch also adds helper functions to <linux/kernfs.h> for getting/setting security xattrs of a kernfs node so that LSMs hooks are able to do their job. Other important attributes should be accessible direcly in the kernfs_node fields (in case there is need for more, then new helpers should be added to kernfs.h along with the patch that needs them). Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> [PM: more manual merge fixes] Signed-off-by: Paul Moore <paul@paul-moore.com>	2019-03-20 22:01:02 -04:00

1 2 3 4 5

117 Commits