linux

mirror of https://github.com/armbian/linux.git synced 2026-01-06 10:13:00 -08:00

Author	SHA1	Message	Date
Dan Rosenberg	982f7c2b2e	sys_semctl: fix kernel stack leakage The semctl syscall has several code paths that lead to the leakage of uninitialized kernel stack memory (namely the IPC_INFO, SEM_INFO, IPC_STAT, and SEM_STAT commands) during the use of the older, obsolete version of the semid_ds struct. The copy_semid_to_user() function declares a semid_ds struct on the stack and copies it back to the user without initializing or zeroing the "sem_base", "sem_pending", "sem_pending_last", and "undo" pointers, allowing the leakage of 16 bytes of kernel stack memory. The code is still reachable on 32-bit systems - when calling semctl() newer glibc's automatically OR the IPC command with the IPC_64 flag, but invoking the syscall directly allows users to use the older versions of the struct. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-01 10:50:58 -07:00
Al Viro	6d8af64c1c	switch mqueue to ->evict_inode() ... and since the inodes are never hashed, we can use default ->drop_inode() just fine. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:47:58 -04:00
Manfred Spraul	c61284e991	ipc/sem.c: bugfix for semop() not reporting successful operation The last change to improve the scalability moved the actual wake-up out of the section that is protected by spin_lock(sma->sem_perm.lock). This means that IN_WAKEUP can be in queue.status even when the spinlock is acquired by the current task. Thus the same loop that is performed when queue.status is read without the spinlock acquired must be performed when the spinlock is acquired. Thanks to kamezawa.hiroyu@jp.fujitsu.com for noticing lack of the memory barrier. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16255 [akpm@linux-foundation.org: clean up kerneldoc, checkpatch warning and whitespace] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reported-by: Luca Tettamanti <kronos.it@gmail.com> Tested-by: Luca Tettamanti <kronos.it@gmail.com> Reported-by: Christoph Lameter <cl@linux-foundation.org> Cc: Maciej Rutecki <maciej.rutecki@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-20 16:25:40 -07:00
Al Viro	0abbb609ac	mqueue doesn't need make_bad_inode() It never hashes them anyway and does final iput() immediately afterwards. With ->drop_inode() being generic_delete_inode()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-06-04 17:16:27 -04:00
Christoph Hellwig	7ea8085910	drop unused dentry argument to ->fsync Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-27 22:05:02 -04:00
Julia Lawall	4de85cd6d6	ipc/sem.c: use ERR_CAST Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)). The former makes more clear what is the purpose of the operation, which otherwise looks like a no-op. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ type T; T x; identifier f; @@ T f (...) { <+... - ERR_PTR(PTR_ERR(x)) + x ...+> } @@ expression x; @@ - ERR_PTR(PTR_ERR(x)) + ERR_CAST(x) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:49 -07:00
Manfred Spraul	c5cf6359ad	ipc/sem.c: update description of the implementation ipc/sem.c begins with a 15 year old description about bugs in the initial implementation in Linux-1.0. The patch replaces that with a top level description of the current code. A TODO could be derived from this text: The opengroup man page for semop() does not mandate FIFO. Thus there is no need for a semaphore array list of pending operations. If - this list is removed - the per-semaphore array spinlock is removed (possible if there is no list to protect) - sem_otime is moved into the semaphores and calculated on demand during semctl() then the array would be read-mostly - which would significantly improve scaling for applications that use semaphore arrays with lots of entries. The price would be expensive semctl() calls: for(i=0;i<sma->sem_nsems;i++) spin_lock(sma->sem_lock); <do stuff> for(i=0;i<sma->sem_nsems;i++) spin_unlock(sma->sem_lock); I'm not sure if the complexity is worth the effort, thus here is the documentation of the current behavior first. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:49 -07:00
Manfred Spraul	0a2b9d4c79	ipc/sem.c: move wake_up_process out of the spinlock section The wake-up part of semtimedop() consists out of two steps: - the right tasks must be identified. - they must be woken up. Right now, both steps run while the array spinlock is held. This patch reorders the code and moves the actual wake_up_process() behind the point where the spinlock is dropped. The code also moves setting sem->sem_otime to one place: It does not make sense to set the last modify time multiple times. [akpm@linux-foundation.org: repair kerneldoc] [akpm@linux-foundation.org: fix uninitialised retval] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:49 -07:00
Manfred Spraul	fd5db42254	ipc/sem.c: optimize update_queue() for bulk wakeup calls The following series of patches tries to fix the spinlock contention reported by Chris Mason - his benchmark exposes problems of the current code: - In the worst case, the algorithm used by update_queue() is O(N^2). Bulk wake-up calls can enter this worst case. The patch series fix that. Note that the benchmark app doesn't expose the problem, it just should be fixed: Real world apps might do the wake-ups in another order than perfect FIFO. - The part of the code that runs within the semaphore array spinlock is significantly larger than necessary. The patch series fixes that. This change is responsible for the main improvement. - The cacheline with the spinlock is also used for a variable that is read in the hot path (sem_base) and for a variable that is unnecessarily written to multiple times (sem_otime). The last step of the series cacheline-aligns the spinlock. This patch: The SysV semaphore code allows to perform multiple operations on all semaphores in the array as atomic operations. After a modification, update_queue() checks which of the waiting tasks can complete. The algorithm that is used to identify the tasks is O(N^2) in the worst case. For some cases, it is simple to avoid the O(N^2). The patch adds a detection logic for some cases, especially for the case of an array where all sleeping tasks are single sembuf operations and a multi-sembuf operation is used to wake up multiple tasks. A big database application uses that approach. The patch fixes wakeup due to semctl(,,SETALL,) - the initial version of the patch breaks that. [akpm@linux-foundation.org: make do_smart_update() static] Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:49 -07:00
Alexey Dobriyan	4be929be34	kernel-wide: replace USHORT_MAX, SHORT_MAX and SHORT_MIN with USHRT_MAX, SHRT_MAX and SHRT_MIN - C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not USHORT_MAX/SHORT_MAX/SHORT_MIN. - Make SHRT_MIN of type s16, not int, for consistency. [akpm@linux-foundation.org: fix drivers/dma/timb_dma.c] [akpm@linux-foundation.org: fix security/keys/keyring.c] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-25 08:07:02 -07:00
Linus Torvalds	164d44fd92	Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: clocksource: Add clocksource_register_hz/khz interface posix-cpu-timers: Optimize run_posix_cpu_timers() time: Remove xtime_cache mqueue: Convert message queue timeout to use hrtimers hrtimers: Provide schedule_hrtimeout for CLOCK_REALTIME timers: Introduce the concept of timer slack for legacy timers ntp: Remove tickadj ntp: Make time_adjust static time: Add xtime, wall_to_monotonic to feature-removal-schedule timer: Try to survive timer callback preempt_count leak timer: Split out timer function call timer: Print function name for timer callbacks modifying preemption count time: Clean up warp_clock() cpu-timers: Avoid iterating over all threads in fastpath_timer_check() cpu-timers: Change SIGEV_NONE timer implementation cpu-timers: Return correct previous timer reload value cpu-timers: Cleanup arm_timer() cpu-timers: Simplify RLIMIT_CPU handling	2010-05-19 17:11:10 -07:00
André Goddard Rosa	a3ed2a1571	mqueue: fix kernel BUG caused by double free() on mq_open() In case of aborting because we reach the maximum amount of memory which can be allocated to message queues per user (RLIMIT_MSGQUEUE), we would try to free the message area twice when bailing out: first by the error handling code itself, and then later when cleaning up the inode through delete_inode(). Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-11 17:33:42 -07:00
Thomas Gleixner	dbb6be6d5e	Merge branch 'linus' into timers/core Reason: Further posix_cpu_timer patches depend on mainline changes Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-05-10 14:20:42 +02:00
Carsten Emde	9ca7d8e683	mqueue: Convert message queue timeout to use hrtimers The message queue functions mq_timedsend() and mq_timedreceive() have not yet been converted to use the hrtimer interface. This patch replaces the call to schedule_timeout() by a call to schedule_hrtimeout() and transforms the expiration time from timespec to ktime as required. [ tglx: Fixed whitespace wreckage ] Signed-off-by: Carsten Emde <C.Emde@osadl.org> Tested-by: Pradyumna Sampath <pradysam@gmail.com> Cc: Arjan van de Veen <arjan@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <20100402204331.715783034@osadl.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-04-06 21:50:03 +02:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Anton Blanchard	45575f5a42	ppc64 sys_ipc breakage in 2.6.34-rc2 I chased down a fail on ppc64 on 2.6.34-rc2 where an application that uses shared memory was getting a SEGV. Commit `baed7fc9b5` ("Add generic sys_ipc wrapper") changed the second argument from an unsigned long to an int. When we call shmget the system call wrappers for sys_ipc will sign extend second (ie the size) which truncates it. It took a while to track down because the call succeeds and strace shows the untruncated size :) The patch below changes second from an int to an unsigned long which fixes shmget on ppc64 (and I assume s390, sparc64 and mips64). Signed-off-by: Anton Blanchard <anton@samba.org> -- I assume the function prototypes for the other IPC methods would cause us to sign or zero extend second where appropriate (avoiding any security issues). Come to think of it, the syscall wrappers for each method should do that for us as well. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-22 09:57:19 -07:00
Jiri Slaby	f1eb1332b8	ipc: use rlimit helpers Make sure compiler won't do weird things with limits. E.g. fetching them twice may return 2 different values after writable limits are implemented. I.e. either use rlimit helpers added in `3e10e716ab` ("resource: add helpers for fetching rlimits") or ACCESS_ONCE if not applicable. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-12 15:52:39 -08:00
Christoph Hellwig	baed7fc9b5	Add generic sys_ipc wrapper Add a generic implementation of the ipc demultiplexer syscall. Except for s390 and sparc64 all implementations of the sys_ipc are nearly identical. There are slight differences in the types of the parameters, where mips and powerpc as the only 64-bit architectures with sys_ipc use unsigned long for the "third" argument as it gets casted to a pointer later, while it traditionally is an "int" like most other paramters. frv goes even further and uses unsigned long for all parameters execept for "ptr" which is a pointer type everywhere. The change from int to unsigned long for "third" and back to "int" for the others on frv should be fine due to the in-register calling conventions for syscalls (we already had a similar issue with the generic sys_ptrace), but I'd prefer to have the arch maintainers looks over this in details. Except for that h8300, m68k and m68knommu lack an impplementation of the semtimedop sub call which this patch adds, and various architectures have gets used - at least on i386 it seems superflous as the compat code on x86-64 and ia64 doesn't even bother to implement it. [akpm@linux-foundation.org: add sys_ipc to sys_ni.c] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Reviewed-by: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-12 15:52:32 -08:00
André Goddard Rosa	2329e392ac	mqueue: fix typo "failues" -> "failures" Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:48:00 -05:00
André Goddard Rosa	8d8ffefaaf	mqueue: only set error codes if they are really necessary ... postponing assignments until they're needed. Doesn't change code size. Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:48:00 -05:00
André Goddard Rosa	04db0dde0e	mqueue: simplify do_open() error handling It reduces code size: text data bss dec hex filename 9925 72 16 10013 271d ipc/mqueue-BEFORE.o 9885 72 16 9973 26f5 ipc/mqueue-AFTER.o Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:48:00 -05:00
André Goddard Rosa	8834cf796a	mqueue: apply mathematics distributivity on mq_bytes calculation Code size reduction: text data bss dec hex filename 9941 72 16 10029 272d ipc/mqueue-BEFORE.o 9925 72 16 10013 271d ipc/mqueue-AFTER.o Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:48:00 -05:00
André Goddard Rosa	c8308b1c91	mqueue: remove unneeded info->messages initialization ... and abort earlier if we couldn't allocate the message pointers array, avoiding the u->mq_bytes accounting logic. It reduces code size: text data bss dec hex filename 9949 72 16 10037 2735 ipc/mqueue-BEFORE.o 9941 72 16 10029 272d ipc/mqueue-AFTER.o Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:47:59 -05:00
André Goddard Rosa	4294a8eedb	mqueue: fix mq_open() file descriptor leak on user-space processes We leak fd on lookup_one_len() failure Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:46:05 -05:00
David Howells	ed5e5894b2	nommu: fix SYSV SHM for NOMMU Commit `c4caa77815` ("file ->get_unmapped_area() shouldn't duplicate work of get_unmapped_area()") broke SYSV SHM for NOMMU by taking away the pointer to shm_get_unmapped_area() from shm_file_operations. Put it back conditionally on CONFIG_MMU=n. file->f_ops->get_unmapped_area() is used to find out the base address for a mapping of a mappable chardev device or mappable memory-based file (such as a ramfs file). It needs to be called prior to file->f_ops->mmap() being called. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-16 12:15:39 -08:00

1 2 3 4 5 ...

284 Commits