linux

mirror of https://github.com/Dasharo/linux.git synced 2026-03-06 15:25:10 -08:00

Author	SHA1	Message	Date
Al Viro	bd2a31d522	get rid of fget_light() instead of returning the flags by reference, we can just have the low-level primitive return those in lower bits of unsigned long, with struct file * derived from the rest. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-03-10 11:44:42 -04:00
Eric W. Biederman	96c7a2ff21	fs/file.c:fdtable: avoid triggering OOMs from alloc_fdmem Recently due to a spike in connections per second memcached on 3 separate boxes triggered the OOM killer from accept. At the time the OOM killer was triggered there was 4GB out of 36GB free in zone 1. The problem was that alloc_fdtable was allocating an order 3 page (32KiB) to hold a bitmap, and there was sufficient fragmentation that the largest page available was 8KiB. I find the logic that PAGE_ALLOC_COSTLY_ORDER can't fail pretty dubious but I do agree that order 3 allocations are very likely to succeed. There are always pathologies where order > 0 allocations can fail when there are copious amounts of free memory available. Using the pigeon hole principle it is easy to show that it requires 1 page more than 50% of the pages being free to guarantee an order 1 (8KiB) allocation will succeed, 1 page more than 75% of the pages being free to guarantee an order 2 (16KiB) allocation will succeed and 1 page more than 87.5% of the pages being free to guarantee an order 3 allocate will succeed. A server churning memory with a lot of small requests and replies like memcached is a common case that if anything can will skew the odds against large pages being available. Therefore let's not give external applications a practical way to kill linux server applications, and specify __GFP_NORETRY to the kmalloc in alloc_fdmem. Unless I am misreading the code and by the time the code reaches should_alloc_retry in __alloc_pages_slowpath (where __GFP_NORETRY becomes signification). We have already tried everything reasonable to allocate a page and the only thing left to do is wait. So not waiting and falling back to vmalloc immediately seems like the reasonable thing to do even if there wasn't a chance of triggering the OOM killer. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Cong Wang <cwang@twopensource.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-02-10 16:01:41 -08:00
Oleg Nesterov	e6ff9a9fa4	fs: __fget_light() can use __fget() in slow path The slow path in __fget_light() can use __fget() to avoid the code duplication. Saves 232 bytes. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-01-25 03:14:38 -05:00
Oleg Nesterov	ad46183445	fs: factor out common code in fget_light() and fget_raw_light() Apart from FMODE_PATH check fget_light() and fget_raw_light() are identical, shift the code into the new helper, __fget_light(fd, mask). Saves 208 bytes. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-01-25 03:14:37 -05:00
Oleg Nesterov	1deb46e256	fs: factor out common code in fget() and fget_raw() Apart from FMODE_PATH check fget() and fget_raw() are identical, shift the code into the new simple helper, __fget(fd, mask). Saves 160 bytes. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-01-25 03:14:37 -05:00
Oleg Nesterov	ce08b62d18	change close_files() to use rcu_dereference_raw(files->fdt) put_files_struct() and close_files() do rcu_read_lock() to make rcu_dereference_check_fdtable() happy. This looks a bit ugly, files_fdtable() just reads the pointer, we can simply use rcu_dereference_raw() to avoid the warning. The patch also changes close_files() to return fdt, this avoids another rcu_read_lock()/files_fdtable() in put_files_struct(). I think close_files() needs more cleanups: - we do not need xchg() exactly because we are the last user of this files_struct - "if (file)" should be turned into WARN_ON(!file) Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-01-25 03:14:37 -05:00
Oleg Nesterov	a8d4b8345e	introduce __fcheck_files() to fix rcu_dereference_check_fdtable(), kill rcu_my_thread_group_empty() rcu_dereference_check_fdtable() looks very wrong, 1. rcu_my_thread_group_empty() was added by `844b9a8707` "vfs: fix RCU-lockdep false positive due to /proc" but it doesn't really fix the problem. A CLONE_THREAD (without CLONE_FILES) task can hit the same race with get_files_struct(). And otoh rcu_my_thread_group_empty() can suppress the correct warning if the caller is the CLONE_FILES (without CLONE_THREAD) task. 2. files->count == 1 check is not really right too. Even if this files_struct is not shared it is not safe to access it lockless unless the caller is the owner. Otoh, this check is sub-optimal. files->count == 0 always means it is safe to use it lockless even if files != current->files, but put_files_struct() has to take rcu_read_lock(). See the next patch. This patch removes the buggy checks and turns fcheck_files() into __fcheck_files() which uses rcu_dereference_raw(), the "unshared" callers, fget_light() and fget_raw_light(), can use it to avoid the warning from RCU-lockdep. fcheck_files() is trivially reimplemented as rcu_lockdep_assert() plus __fcheck_files(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-01-25 03:14:36 -05:00
Al Viro	ac3e3c5b11	don't bother with deferred freeing of fdtables Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-05-01 17:31:42 -04:00
Thomas Gleixner	eece09ec21	locking: Various static lock initializer fixes The static lock initializers want to be fed the proper name of the lock and not some random string. In mainline random strings are obfuscating the readability of debug output, but for RT they prevent the spinlock substitution. Fix it up. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2013-02-19 08:42:45 +01:00
Greg Kroah-Hartman	6ae141718e	misc: remove __dev* attributes. CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the last of the __dev* markings from the kernel from a variety of different, tiny, places. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-03 15:57:16 -08:00
Linus Torvalds	9977d9b379	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull big execve/kernel_thread/fork unification series from Al Viro: "All architectures are converted to new model. Quite a bit of that stuff is actually shared with architecture trees; in such cases it's literally shared branch pulled by both, not a cherry-pick. A lot of ugliness and black magic is gone (-3KLoC total in this one): - kernel_thread()/kernel_execve()/sys_execve() redesign. We don't do syscalls from kernel anymore for either kernel_thread() or kernel_execve(): kernel_thread() is essentially clone(2) with callback run before we return to userland, the callbacks either never return or do successful do_execve() before returning. kernel_execve() is a wrapper for do_execve() - it doesn't need to do transition to user mode anymore. As a result kernel_thread() and kernel_execve() are arch-independent now - they live in kernel/fork.c and fs/exec.c resp. sys_execve() is also in fs/exec.c and it's completely architecture-independent. - daemonize() is gone, along with its parts in fs/.c - struct pt_regs is no longer passed to do_fork/copy_process/ copy_thread/do_execve/search_binary_handler/->load_binary/do_coredump. - sys_fork()/sys_vfork()/sys_clone() unified; some architectures still need wrappers (ones with callee-saved registers not saved in pt_regs on syscall entry), but the main part of those suckers is in kernel/fork.c now." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (113 commits) do_coredump(): get rid of pt_regs argument print_fatal_signal(): get rid of pt_regs argument ptrace_signal(): get rid of unused arguments get rid of ptrace_signal_deliver() arguments new helper: signal_pt_regs() unify default ptrace_signal_deliver flagday: kill pt_regs argument of do_fork() death to idle_regs() don't pass regs to copy_process() flagday: don't pass regs to copy_thread() bfin: switch to generic vfork, get rid of pointless wrappers xtensa: switch to generic clone() openrisc: switch to use of generic fork and clone unicore32: switch to generic clone(2) score: switch to generic fork/vfork/clone c6x: sanitize copy_thread(), get rid of clone(2) wrapper, switch to generic clone() take sys_fork/sys_vfork/sys_clone prototypes to linux/syscalls.h mn10300: switch to generic fork/vfork/clone h8300: switch to generic fork/vfork/clone tile: switch to generic clone() ... Conflicts: arch/microblaze/include/asm/Kbuild	2012-12-12 12:22:13 -08:00
Al Viro	a77cfcb429	fix off-by-one in argument passed by iterate_fd() to callbacks Noticed by Pavel Roskin; the thing in his patch I disagree with was compensating for that shite in callbacks instead of fixing it once in the iterator itself. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-11-29 23:01:30 -05:00
Al Viro	c4144670fd	kill daemonize() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-11-28 21:49:02 -05:00
Linus Torvalds	8d938105e4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc VFS fixes from Al Viro: "Remove a bogus BUG_ON() that can trigger spuriously + alpha bits of do_mount() constification I'd missed during the merge window." This pull request came in a week ago, I missed it for some reason. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: kill bogus BUG_ON() in do_close_on_exec() missing const in alpha callers of do_mount()	2012-11-18 09:13:48 -10:00
Al Viro	5a8477660d	kill bogus BUG_ON() in do_close_on_exec() It can be legitimately triggered via procfs access. Now, at least 2 of 3 of get_files_struct() callers in procfs are useless, but when and if we get rid of those we can always add WARN_ON() here. BUG_ON() at that spot is simply wrong. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-11-12 01:19:02 -05:00
Al Viro	08f05c4974	Return the right error value when dup[23]() newfd argument is too large Jack Lin reports that the error return from dup3() for the RLIMIT_NOFILE case changed incorrectly after 3.6. The culprit is commit `f33ff9927f` ("take rlimit check to callers of expand_files()") which when it moved the "return -EMFILE" out to the caller, didn't notice that the dup3() had special code to turn the EMFILE return into EBADF. The replace_fd() helper that got added later then inherited the bug too. Reported-by: Jack Lin <linliangjie@huawei.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [ Noted more bugs, wrote proper changelog, fixed up typos - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-30 21:27:28 -07:00
Richard W.M. Jones	aed976475b	dup3: Return an error when oldfd == newfd. I have tested the attached patch to fix the dup3 regression. Rich. From 0944e30e12dec6544b3602626b60ff412375c78f Mon Sep 17 00:00:00 2001 From: "Richard W.M. Jones" <rjones@redhat.com> Date: Tue, 9 Oct 2012 14:42:45 +0100 Subject: [PATCH] dup3: Return an error when oldfd == newfd. The following commit: commit `fe17f22d7f` Author: Al Viro <viro@zeniv.linux.org.uk> Date: Tue Aug 21 11:48:11 2012 -0400 take purely descriptor-related stuff from fcntl.c to file.c was supposed to be just code motion, but it dropped the following two lines: if (unlikely(oldfd == newfd)) return -EINVAL; from the dup3 system call. dup3 is not specified by POSIX, so Linux can do what it likes. However the POSIX proposal for dup3 [1] states that it should return an error if oldfd == newfd. [1] http://austingroupbugs.net/view.php?id=411 Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-09 23:33:38 -04:00
Al Viro	4557c669ef	export fget_light Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:10:06 -04:00
Al Viro	864bdb3b6c	new helper: daemonize_descriptors() descriptor-related parts of daemonize, done right. As the result we simplify the locking rules for ->files - we hold task_lock in all cases when we modify ->files. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:10:00 -04:00
Al Viro	c3c073f808	new helper: iterate_fd() iterates through the opened files in given descriptor table, calling a supplied function; we stop once non-zero is returned. Callback gets struct file , descriptor number and const void argument passed to iterator. It is called with files->file_lock held, so it is not allowed to block. tty_io, netprio_cgroup and selinux flush_unauthorized_files() converted to its use. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:09:59 -04:00
Al Viro	ad47bd7252	make expand_files() and alloc_fd() static no callers outside of fs/file.c left Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:09:58 -04:00
Al Viro	b8318b01a8	take __{set,clear}_{open_fd,close_on_exec}() into fs/file.c nobody uses those outside anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:09:58 -04:00
Al Viro	8280d16172	new helper: replace_fd() analog of dup2(), except that it takes struct file * as source. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:09:57 -04:00
Al Viro	fe17f22d7f	take purely descriptor-related stuff from fcntl.c to file.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:09:57 -04:00
Al Viro	6a6d27de34	take close-on-exec logics to fs/file.c, clean it up a bit ... and add cond_resched() there, while we are at it. We can get large latencies as is... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:09:56 -04:00

1 2 3 4

76 Commits