Pull sysctl fix from Al Viro:
"Another regression fix for sysctl changes this cycle..."
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Call sysctl_head_finish on error
This error path returned directly instead of calling sysctl_head_finish().
Fixes: ef9d965bc8 ("sysctl: reject gigantic reads/write to sysctl files")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull tracing fixes from Steven Rostedt:
- Have recordmcount work with > 64K sections (to support LTO)
- kprobe RCU fixes
- Correct a kprobe critical section with missing mutex
- Remove redundant arch_disarm_kprobe() call
- Fix lockup when kretprobe triggers within kprobe_flush_task()
- Fix memory leak in fetch_op_data operations
- Fix sleep in atomic in ftrace trace array sample code
- Free up memory on failure in sample trace array code
- Fix incorrect reporting of function_graph fields in format file
- Fix quote within quote parsing in bootconfig
- Fix return value of bootconfig tool
- Add testcases for bootconfig tool
- Fix maybe uninitialized warning in ftrace pid file code
- Remove unused variable in tracing_iter_reset()
- Fix some typos
* tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Fix maybe-uninitialized compiler warning
tools/bootconfig: Add testcase for show-command and quotes test
tools/bootconfig: Fix to return 0 if succeeded to show the bootconfig
tools/bootconfig: Fix to use correct quotes for value
proc/bootconfig: Fix to use correct quotes for value
tracing: Remove unused event variable in tracing_iter_reset
tracing/probe: Fix memleak in fetch_op_data operations
trace: Fix typo in allocate_ftrace_ops()'s comment
tracing: Make ftrace packed events have align of 1
sample-trace-array: Remove trace_array 'sample-instance'
sample-trace-array: Fix sleeping function called from invalid context
kretprobe: Prevent triggering kretprobe from within kprobe_flush_task
kprobes: Remove redundant arch_disarm_kprobe() call
kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex
kprobes: Use non RCU traversal APIs on kprobe_tables if possible
kprobes: Suppress the suspicious RCU warning on kprobes
recordmcount: support >64k sections
Fix /proc/bootconfig to select double or single quotes
corrctly according to the value.
If a bootconfig value includes a double quote character,
we must use single-quotes to quote that value.
This modifies if() condition and blocks for avoiding
double-quote in value check in 2 places. Anyway, since
xbc_array_for_each_value() can handle the array which
has a single node correctly.
Thus,
if (vnode && xbc_node_is_array(vnode)) {
xbc_array_for_each_value(vnode) /* vnode->next != NULL */
...
} else {
snprintf(val); /* val is an empty string if !vnode */
}
is equivalent to
if (vnode) {
xbc_array_for_each_value(vnode) /* vnode->next can be NULL */
...
} else {
snprintf(""); /* value is always empty */
}
Link: http://lkml.kernel.org/r/159230244786.65555.3763894451251622488.stgit@devnote2
Cc: stable@vger.kernel.org
Fixes: c1a3c36017 ("proc: bootconfig: Add /proc/bootconfig to show boot config list")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Pull more Kbuild updates from Masahiro Yamada:
- fix build rules in binderfs sample
- fix build errors when Kbuild recurses to the top Makefile
- covert '---help---' in Kconfig to 'help'
* tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
treewide: replace '---help---' in Kconfig files with 'help'
kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables
samples: binderfs: really compile this sample and fix build issues
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.
This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.
There are a variety of indentation styles found.
a) 4 spaces + '---help---'
b) 7 spaces + '---help---'
c) 8 spaces + '---help---'
d) 1 space + 1 tab + '---help---'
e) 1 tab + '---help---' (correct indentation)
f) 1 tab + 1 space + '---help---'
g) 1 tab + 2 spaces + '---help---'
In order to convert all of them to 1 tab + 'help', I ran the
following commend:
$ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Pull proc fix from Eric Biederman:
"Much to my surprise syzbot found a very old bug in proc that the
recent changes made easier to reproce. This bug is subtle enough it
looks like it fooled everyone who should know better"
* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc: Use new_inode not new_inode_pseudo
Recently syzbot reported that unmounting proc when there is an ongoing
inotify watch on the root directory of proc could result in a use
after free when the watch is removed after the unmount of proc
when the watcher exits.
Commit 69879c01a0 ("proc: Remove the now unnecessary internal mount
of proc") made it easier to unmount proc and allowed syzbot to see the
problem, but looking at the code it has been around for a long time.
Looking at the code the fsnotify watch should have been removed by
fsnotify_sb_delete in generic_shutdown_super. Unfortunately the inode
was allocated with new_inode_pseudo instead of new_inode so the inode
was not on the sb->s_inodes list. Which prevented
fsnotify_unmount_inodes from finding the inode and removing the watch
as well as made it so the "VFS: Busy inodes after unmount" warning
could not find the inodes to warn about them.
Make all of the inodes in proc visible to generic_shutdown_super,
and fsnotify_sb_delete by using new_inode instead of new_inode_pseudo.
The only functional difference is that new_inode places the inodes
on the sb->s_inodes list.
I wrote a small test program and I can verify that without changes it
can trigger this issue, and by replacing new_inode_pseudo with
new_inode the issues goes away.
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/000000000000d788c905a7dfa3f4@google.com
Reported-by: syzbot+7d2debdcdb3cb93c1e5e@syzkaller.appspotmail.com
Fixes: 0097875bd4 ("proc: Implement /proc/thread-self to point at the directory of the current thread")
Fixes: 021ada7dff ("procfs: switch /proc/self away from proc_dir_entry")
Fixes: 51f0885e54 ("vfs,proc: guarantee unique inodes in /proc")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Pull sysctl fixes from Al Viro:
"Fixups to regressions in sysctl series"
* 'work.sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
sysctl: reject gigantic reads/write to sysctl files
cdrom: fix an incorrect __user annotation on cdrom_sysctl_info
trace: fix an incorrect __user annotation on stack_trace_sysctl
random: fix an incorrect __user annotation on proc_do_entropy
net/sysctl: remove leftover __user annotations on neigh_proc_dointvec*
net/sysctl: use cpumask_parse in flow_limit_cpu_sysctl
Pull proc fix from Eric Biederman:
"Syzbot found a NULL pointer dereference if kzalloc of s_fs_info fails"
* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc: s_fs_info may be NULL when proc_kill_sb is called
Instead of triggering a WARN_ON deep down in the page allocator just
give up early on allocations that are way larger than the usual sysctl
values.
Fixes: 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Convert the last few remaining mmap_sem rwsem calls to use the new mmap
locking API. These were missed by coccinelle for some reason (I think
coccinelle does not support some of the preprocessor constructs in these
files ?)
[akpm@linux-foundation.org: convert linux-next leftovers]
[akpm@linux-foundation.org: more linux-next leftovers]
[akpm@linux-foundation.org: more linux-next leftovers]
Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-6-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge still more updates from Andrew Morton:
"Various trees. Mainly those parts of MM whose linux-next dependents
are now merged. I'm still sitting on ~160 patches which await merges
from -next.
Subsystems affected by this patch series: mm/proc, ipc, dynamic-debug,
panic, lib, sysctl, mm/gup, mm/pagemap"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (52 commits)
doc: cgroup: update note about conditions when oom killer is invoked
module: move the set_fs hack for flush_icache_range to m68k
nommu: use flush_icache_user_range in brk and mmap
binfmt_flat: use flush_icache_user_range
exec: use flush_icache_user_range in read_code
exec: only build read_code when needed
m68k: implement flush_icache_user_range
arm: rename flush_cache_user_range to flush_icache_user_range
xtensa: implement flush_icache_user_range
sh: implement flush_icache_user_range
asm-generic: add a flush_icache_user_range stub
mm: rename flush_icache_user_range to flush_icache_user_page
arm,sparc,unicore32: remove flush_icache_user_range
riscv: use asm-generic/cacheflush.h
powerpc: use asm-generic/cacheflush.h
openrisc: use asm-generic/cacheflush.h
m68knommu: use asm-generic/cacheflush.h
microblaze: use asm-generic/cacheflush.h
ia64: use asm-generic/cacheflush.h
hexagon: use asm-generic/cacheflush.h
...
After a recent change introduced by Vlastimil's series [0], kernel is
able now to handle sysctl parameters on kernel command line; also, the
series introduced a simple infrastructure to convert legacy boot
parameters (that duplicate sysctls) into sysctl aliases.
This patch converts the watchdog parameters softlockup_panic and
{hard,soft}lockup_all_cpu_backtrace to use the new alias infrastructure.
It fixes the documentation too, since the alias only accepts values 0 or
1, not the full range of integers.
We also took the opportunity here to improve the documentation of the
previously converted hung_task_panic (see the patch series [0]) and put
the alias table in alphabetical order.
[0] http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Link: http://lkml.kernel.org/r/20200507214624.21911-1-gpiccoli@canonical.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "support setting sysctl parameters from kernel command line", v3.
This series adds support for something that seems like many people
always wanted but nobody added it yet, so here's the ability to set
sysctl parameters via kernel command line options in the form of
sysctl.vm.something=1
The important part is Patch 1. The second, not so important part is an
attempt to clean up legacy one-off parameters that do the same thing as
a sysctl. I don't want to remove them completely for compatibility
reasons, but with generic sysctl support the idea is to remove the
one-off param handlers and treat the parameters as aliases for the
sysctl variants.
I have identified several parameters that mention sysctl counterparts in
Documentation/admin-guide/kernel-parameters.txt but there might be more.
The conversion also has varying level of success:
- numa_zonelist_order is converted in Patch 2 together with adding the
necessary infrastructure. It's easy as it doesn't really do anything
but warn on deprecated value these days.
- hung_task_panic is converted in Patch 3, but there's a downside that
now it only accepts 0 and 1, while previously it was any integer
value
- nmi_watchdog maps to two sysctls nmi_watchdog and hardlockup_panic,
so there's no straighforward conversion possible
- traceoff_on_warning is a flag without value and it would be required
to handle that somehow in the conversion infractructure, which seems
pointless for a single flag
This patch (of 5):
A recently proposed patch to add vm_swappiness command line parameter in
addition to existing sysctl [1] made me wonder why we don't have a
general support for passing sysctl parameters via command line.
Googling found only somebody else wondering the same [2], but I haven't
found any prior discussion with reasons why not to do this.
Settings the vm_swappiness issue aside (the underlying issue might be
solved in a different way), quick search of kernel-parameters.txt shows
there are already some that exist as both sysctl and kernel parameter -
hung_task_panic, nmi_watchdog, numa_zonelist_order, traceoff_on_warning.
A general mechanism would remove the need to add more of those one-offs
and might be handy in situations where configuration by e.g.
/etc/sysctl.d/ is impractical.
Hence, this patch adds a new parse_args() pass that looks for parameters
prefixed by 'sysctl.' and tries to interpret them as writes to the
corresponding sys/ files using an temporary in-kernel procfs mount.
This mechanism was suggested by Eric W. Biederman [3], as it handles
all dynamically registered sysctl tables, even though we don't handle
modular sysctls. Errors due to e.g. invalid parameter name or value
are reported in the kernel log.
The processing is hooked right before the init process is loaded, as
some handlers might be more complicated than simple setters and might
need some subsystems to be initialized. At the moment the init process
can be started and eventually execute a process writing to /proc/sys/
then it should be also fine to do that from the kernel.
Sysctls registered later on module load time are not set by this
mechanism - it's expected that in such scenarios, setting sysctl values
from userspace is practical enough.
[1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/
[2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter
[3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Link: http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz
Link: http://lkml.kernel.org/r/20200427180433.7029-2-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull apparmor updates from John Johansen:
"Features:
- Replace zero-length array with flexible-array
- add a valid state flags check
- add consistency check between state and dfa diff encode flags
- add apparmor subdir to proc attr interface
- fail unpack if profile mode is unknown
- add outofband transition and use it in xattr match
- ensure that dfa state tables have entries
Cleanups:
- Use true and false for bool variable
- Remove semicolon
- Clean code by removing redundant instructions
- Replace two seq_printf() calls by seq_puts() in aa_label_seq_xprint()
- remove duplicate check of xattrs on profile attachment
- remove useless aafs_create_symlink
Bug fixes:
- Fix memory leak of profile proxy
- fix introspection of of task mode for unconfined tasks
- fix nnp subset test for unconfined
- check/put label on apparmor_sk_clone_security()"
* tag 'apparmor-pr-2020-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: Fix memory leak of profile proxy
apparmor: fix introspection of of task mode for unconfined tasks
apparmor: check/put label on apparmor_sk_clone_security()
apparmor: Use true and false for bool variable
security/apparmor/label.c: Clean code by removing redundant instructions
apparmor: Replace zero-length array with flexible-array
apparmor: ensure that dfa state tables have entries
apparmor: remove duplicate check of xattrs on profile attachment.
apparmor: add outofband transition and use it in xattr match
apparmor: fail unpack if profile mode is unknown
apparmor: fix nnp subset test for unconfined
apparmor: remove useless aafs_create_symlink
apparmor: add proc subdir to attrs
apparmor: add consistency check between state and dfa diff encode flags
apparmor: add a valid state flags check
AppArmor: Remove semicolon
apparmor: Replace two seq_printf() calls by seq_puts() in aa_label_seq_xprint()
Merge yet more updates from Andrew Morton:
- More MM work. 100ish more to go. Mike Rapoport's "mm: remove
__ARCH_HAS_5LEVEL_HACK" series should fix the current ppc issue
- Various other little subsystems
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (127 commits)
lib/ubsan.c: fix gcc-10 warnings
tools/testing/selftests/vm: remove duplicate headers
selftests: vm: pkeys: fix multilib builds for x86
selftests: vm: pkeys: use the correct page size on powerpc
selftests/vm/pkeys: override access right definitions on powerpc
selftests/vm/pkeys: test correct behaviour of pkey-0
selftests/vm/pkeys: introduce a sub-page allocator
selftests/vm/pkeys: detect write violation on a mapped access-denied-key page
selftests/vm/pkeys: associate key on a mapped page and detect write violation
selftests/vm/pkeys: associate key on a mapped page and detect access violation
selftests/vm/pkeys: improve checks to determine pkey support
selftests/vm/pkeys: fix assertion in test_pkey_alloc_exhaust()
selftests/vm/pkeys: fix number of reserved powerpc pkeys
selftests/vm/pkeys: introduce powerpc support
selftests/vm/pkeys: introduce generic pkey abstractions
selftests: vm: pkeys: use the correct huge page size
selftests/vm/pkeys: fix alloc_random_pkey() to make it really random
selftests/vm/pkeys: fix assertion in pkey_disable_set/clear()
selftests/vm/pkeys: fix pkey_disable_clear()
selftests: vm: pkeys: add helpers for pkey bits
...