linux

mirror of https://github.com/armbian/linux.git synced 2026-01-06 10:13:00 -08:00

Author	SHA1	Message	Date
Serge E. Hallyn	6b550f9495	user namespace: make signal.c respect user namespaces ipc/mqueue.c: for __SI_MESQ, convert the uid being sent to recipient's user namespace. (new, thanks Oleg) __send_signal: convert current's uid to the recipient's user namespace for any siginfo which is not SI_FROMKERNEL (patch from Oleg, thanks again :) do_notify_parent and do_notify_parent_cldstop: map task's uid to parent's user namespace ptrace_signal maps parent's uid into current's user namespace before including in signal to current. IIUC Oleg has argued that this shouldn't matter as the debugger will play with it, but it seems like not converting the value currently being set is misleading. Changelog: Sep 20: Inspired by Oleg's suggestion, define map_cred_ns() helper to simplify callers and help make clear what we are translating (which uid into which namespace). Passing the target task would make callers even easier to read, but we pass in user_ns because current_user_ns() != task_cred_xxx(current, user_ns). Sep 20: As recommended by Oleg, also put task_pid_vnr() under rcu_read_lock in ptrace_signal(). Sep 23: In send_signal(), detect when (user) signal is coming from an ancestor or unrelated user namespace. Pass that on to __send_signal, which sets si_uid to 0 or overflowuid if needed. Oct 12: Base on Oleg's fixup_uid() patch. On top of that, handle all SI_FROMKERNEL cases at callers, because we can't assume sender is current in those cases. Nov 10: (mhelsley) rename fixup_uid to more meaningful usern_fixup_signal_uid Nov 10: (akpm) make the !CONFIG_USER_NS case clearer Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> From: Serge Hallyn <serge.hallyn@canonical.com> Subject: __send_signal: pass q->info, not info, to userns_fixup_signal_uid (v2) Eric Biederman pointed out that passing info is a bug and could lead to a NULL pointer deref to boot. 
A collection of signal, securebits, filecaps, cap_bounds, and a few other ltp tests passed with this kernel. Changelog: Nov 18: previous patch missed a leading '&' Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> From: Dan Carpenter <dan.carpenter@oracle.com> Subject: ipc/mqueue: lock() => unlock() typo There was a double lock typo introduced in b085f4bd6b21 "user namespace: make signal.c respect user namespaces" Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-10 16:30:54 -08:00
Al Viro	df0a42837b	switch mq_open() to umode_t	2012-01-03 22:55:16 -05:00
Al Viro	1b9d5ff764	mqueue: propagate umode_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:55:11 -05:00
Al Viro	4acdaf27eb	switch ->create() to umode_t vfs_create() ignores everything outside of 16bit subset of its mode argument; switching it to umode_t is obviously equivalent and it's the only caller of the method Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:54:53 -05:00
Al Viro	6b520e0565	vfs: fix the stupidity with i_dentry in inode destructors Seeing that just about every destructor got that INIT_LIST_HEAD() copied into it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once(); the cost of taking it into inode_init_always() will be negligible for pipes and sockets and negative for everything else. Not to mention the removal of boilerplate code from ->destroy_inode() instances... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:52:40 -05:00
Al Viro	6f686574cc	... and the same kind of leak for mqueue Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-12-09 00:40:21 -05:00
Manfred Spraul	e57940d719	ipc/sem.c: remove private structures from public header file include/linux/sem.h contains several structures that are only used within ipc/sem.c. The patch moves them into ipc/sem.c - there is no need to expose the structures to the whole kernel. No functional changes, only whitespace cleanups and 80-char per line fixes. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-11-02 16:07:01 -07:00
Manfred Spraul	0b0577f608	ipc/sem.c: handle spurious wakeups semtimedop() does not handle spurious wakeups, it returns -EINTR to user space. Most other schedule() users would just loop and not return to user space. The patch adds such a loop to semtimedop() Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-11-02 16:07:01 -07:00
Manfred Spraul	3c24783bb2	ipc/sem.c: fix return code race with semop vs. semop +semctl(IPC_RMID) sys_semtimedop() may return -EIDRM although the semaphore operation completed successfully: thread 1: thread 2: semtimedop(), sleeps semop(): * acquires sem_lock() semtimedop() woken up due to timeout sem_lock() loops * notices that thread 2 could be completed. * performs the operations that thread 2 is sleeping on. * marks the semaphore operation as IN_WAKEUP * drops sem_lock(), does wakeup, sets return code to 0 * thread delayed due to interrupt, whatever * returns to user space * thread still delayed semctl(IPC_RMID) * acquires sem_lock() * ipc_rmid(), ipcp->deleted=1 * drops sem_lock() * thread finally continues - but sem_lock() now fails due to ipcp->deleted == 1 * returns -EIDRM instead of 0 The fix is trivial: Always use the return code in queue.status. In real world, the race probably doesn't matter: If the semaphore array is destroyed, the app is probably not interested if the last operation succeeded or was already cancelled. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-11-02 16:07:01 -07:00
Wanlong Gao	32ea845d5b	ipc/mqueue.c: fix wrong use of schedule_hrtimeout_range_clock() Fix the wrong use of schedule_hrtimeout_range_clock() in wq_sleep(), although it is harmless for the syscall mq_timed* now. It was introduced by `9ca7d8e` ("mqueue: Convert message queue timeout to use hrtimers"). Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Cc: Carsten Emde <C.Emde@osadl.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-10-31 17:30:44 -07:00
Linus Torvalds	140d0b2108	Do 'shm_init_ns()' in an early pure_initcall This isn't really critical any more, since other patches (commit `298507d4d2`: "shm: optimize exit_shm()") have caused us to not actually need to touch the rw_mutex unless there are actual shm segments associated with the namespace, but we really should do the shm_init_ns() earlier than we do now. This, together with commit `288d5abec8` ("Boot up with usermodehelper disabled") will mean that we really do initialize the initial ipc namespace data structure before we run any tasks. Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-04 19:35:59 -10:00
Vasiliy Kulikov	298507d4d2	shm: optimize exit_shm() We may optimistically check .in_use == 0 without holding the rw_mutex: it's the common case, and if it's zero, there certainly won't be any segments associated with us. After taking the lock, the idr_for_each() will do the right thing, so we could now drop the re-check inside the lock without any real cost. But it won't hurt. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-03 14:45:55 -10:00
Vasiliy Kulikov	33a30ed4bd	shm: fix wrong tests Commit `4c677e2eef` ("shm: optimize locking and ipc_namespace getting") introduced a copy-paste bug. Due to the bug cycle optimizations were disabled. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-03 14:45:55 -10:00
Vasiliy Kulikov	4c677e2eef	shm: optimize locking and ipc_namespace getting shm_lock() does a lookup of shm segment in shm_ids(ns).ipcs_idr, which is redundant as we already know shmid_kernel address. An actual lock is also not required for reads until we really want to destroy the segment. exit_shm() and shm_destroy_orphaned() may avoid the loop by checking whether there is at least one segment in current ipc_namespace. The check of nsproxy and ipc_ns against NULL is redundant as exit_shm() is called from do_exit() before the call to exit_notify(), so the dereferencing current->nsproxy->ipc_ns is guaranteed to be safe. Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-30 08:44:20 -10:00
Vasiliy Kulikov	5774ed014f	shm: handle separate PID namespaces case shm_try_destroy_orphaned() and shm_try_destroy_current() didn't handle the case of separate PID namespaces, but a single IPC namespace. If there are tasks with the same PID values using the same shmem object, the wrong destroy decision could be reached. On shm segment creation store the pointer to the creator task in shmid_kernel->shm_creator field and zero it on task exit. Then use the ->shm_creator instead of shm_cprid in both functions. As shmid_kernel object is already locked at this stage, no additional locking is needed. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-30 08:44:19 -10:00
Vasiliy Kulikov	b34a6b1da3	ipc: introduce shm_rmid_forced sysctl Add support for the shm_rmid_forced sysctl. If set to 1, all shared memory objects in current ipc namespace will be automatically forced to use IPC_RMID. The POSIX way of handling shmem allows one to create shm objects and call shmdt(), leaving shm object associated with no process, thus consuming memory not counted via rlimits. With shm_rmid_forced=1 the shared memory object is counted at least for one process, so OOM killer may effectively kill the fat process holding the shared memory. It obviously breaks POSIX - some programs relying on the feature would stop working. So set shm_rmid_forced=1 only if you're sure nobody uses "orphaned" memory. Use shm_rmid_forced=0 by default for compatibility reasons. The feature was previously implemented in -ow as a configure option. [akpm@linux-foundation.org: fix documentation, per Randy] [akpm@linux-foundation.org: fix warning] [akpm@linux-foundation.org: readability/conventionality tweaks] [akpm@linux-foundation.org: fix shm_rmid_forced/shm_forced_rmid confusion, use standard comment layout] Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Serge E. Hallyn" <serge.hallyn@canonical.com> Cc: Daniel Lezcano <daniel.lezcano@free.fr> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Solar Designer <solar@openwall.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:44 -07:00
Jiri Slaby	d40dcdb017	ipc/mqueue.c: fix mq_open() return value We return ENOMEM from mqueue_get_inode even when we have enough memory. Namely in case the system rlimit of mqueue was reached. This error propagates to mq_queue and user sees the error unexpectedly. So fix this up to properly return EMFILE as described in the manpage: EMFILE The process already has the maximum number of files and message queues open. instead of: ENOMEM Insufficient memory. With the previous patch we just switch to ERR_PTR/PTR_ERR/IS_ERR error handling here. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:44 -07:00
Jiri Slaby	04715206c0	ipc/mqueue.c: refactor failure handling If new_inode fails to allocate an inode we need only to return with NULL. But now we test the opposite and have all the work in a nested block. So do the opposite to save one indentation level (and remove unnecessary line breaks). This is only a preparation/cleanup for the next patch where we fix up return values from mqueue_get_inode. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:44 -07:00
Manfred Spraul	d694ad62bf	ipc/sem.c: fix race with concurrent semtimedop() timeouts and IPC_RMID If a semaphore array is removed and in parallel a sleeping task is woken up (signal or timeout, does not matter), then the woken up task does not wait until wake_up_sem_queue_do() is completed. This will cause crashes, because wake_up_sem_queue_do() will read from a stale pointer. The fix is simple: Regardless of anything, always call get_queue_result(). This function waits until wake_up_sem_queue_do() has finished its task. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=27142 Reported-by: Yuriy Yevtukhov <yuriy@ucoz.com> Reported-by: Harald Laabs <kernel@dasr.de> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: <stable@kernel.org> [2.6.35+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-25 20:57:07 -07:00
Linus Torvalds	bbd9d6f7fb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits) vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp isofs: Remove global fs lock jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory fix IN_DELETE_SELF on overwriting rename() on ramfs et.al. mm/truncate.c: fix build for CONFIG_BLOCK not enabled fs:update the NOTE of the file_operations structure Remove dead code in dget_parent() AFS: Fix silly characters in a comment switch d_add_ci() to d_splice_alias() in "found negative" case as well simplify gfs2_lookup() jfs_lookup(): don't bother with . or .. get rid of useless dget_parent() in btrfs rename() and link() get rid of useless dget_parent() in fs/btrfs/ioctl.c fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers drivers: fix up various ->llseek() implementations fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek Ext4: handle SEEK_HOLE/SEEK_DATA generically Btrfs: implement our own ->llseek fs: add SEEK_HOLE and SEEK_DATA flags reiserfs: make reiserfs default to barrier=flush ... Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new shrinker callout for the inode cache, that clashed with the xfs code to start the periodic workers later.	2011-07-22 19:02:39 -07:00
Josef Bacik	02c24a8218	fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers Btrfs needs to be able to control how filemap_write_and_wait_range() is called in fsync to make it less of a painful operation, so push down taking i_mutex and the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some file systems can drop taking the i_mutex altogether it seems, like ext3 and ocfs2. For correctness sake I just pushed everything down in all cases to make sure that we keep the current behavior the same for everybody, and then each individual fs maintainer can make up their mind about what to do from there. Thanks, Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-07-20 20:47:59 -04:00
Lai Jiangshan	d4ee9aa33d	ipc,rcu: Convert call_rcu(ipc_immediate_free) to kfree_rcu() The rcu callback ipc_immediate_free() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(ipc_immediate_free). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-07-20 14:10:16 -07:00
Lai Jiangshan	693a8b6eec	ipc,rcu: Convert call_rcu(free_un) to kfree_rcu() The rcu callback free_un() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(free_un). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-07-20 14:10:16 -07:00
KOSAKI Motohiro	ca16d140af	mm: don't access vm_flags as 'int' The type of vma->vm_flags is 'unsigned long'. Neither 'int' nor 'unsigned int'. This patch fixes such misuse. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> [ Changed to use a typedef - we'll extend it to cover more cases later, since there has been discussion about making it a 64-bit type.. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-26 09:20:31 -07:00
Eric W. Biederman	a00eaf11a2	ns proc: Add support for the ipc namespace Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2011-05-10 14:35:47 -07:00


323 Commits