When a task is attempting to violate the RLIMIT_NPROC limit we have a
check to see if the task is sufficiently priviledged. The check first
looks at CAP_SYS_ADMIN, then CAP_SYS_RESOURCE, then if the task is uid=0.
A result is that tasks which are allowed by the uid=0 check are first
checked against the security subsystem. This results in the security
subsystem auditting a denial for sys_admin and sys_resource and then the
task passing the uid=0 check.
This patch rearranges the code to first check uid=0, since if we pass that
we shouldn't hit the security system at all. We then check sys_resource,
since it is the smallest capability which will solve the problem. Lastly
we check the fallback everything cap_sysadmin. We don't want to give this
capability many places since it is so powerful.
This will eliminate many of the false positive/needless denial messages we
get when a root task tries to violate the nproc limit. (note that
kthreads count against root, so on a sufficiently large machine we can
actually get past the default limits before any userspace tasks are
launched.)
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move __set_special_pids() from exit.c to sys.c close to its single caller
and make it static.
And rename it to set_special_pids(), another helper with this name has
gone away.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Imho, "atomic_t call_count" is ugly and should die. It buys nothing and
in fact it can grow more than necessary, expand doesn't check if it was
already incremented by another task.
Kill it, and introduce "static int core_name_size" updated by
expand_corename(). This is obviously racy too but harmless, and
core_name_size never grows for no reason.
We do not bother to to calculate the "right" new size, we simply do
kmalloc(size_we_need) and use ksize() to rely on kmalloc_index's decision.
Finally change format_corename() to use expand_corename(), krealloc(NULL)
is fine.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Colin Walters <walters@verbum.org>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Lennart Poettering <mzxreary@0pointer.de>
Cc: Lucas De Marchi <lucas.de.marchi@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
call_usermodehelper_exec() does nothing but returns success if path[0] ==
0. The only user which needs this strange feature is request_module(), it
can check modprobe_path[0] itself like other users do if they want to
detect the "disabled by admin" case.
Kill it. Not only it looks strange, it can confuse other callers. And
this allows us to revert 264b83c0 ("usermodehelper: check
subprocess_info->path != NULL"), do_execve(NULL) is safe.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Lucas De Marchi <lucas.de.marchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
crtools uses a parasite code for dumping processes. The parasite code is
injected into a process with help PTRACE_SEIZE.
Currently crtools blocks signals from a parasite code. If a process has
pending signals, crtools wait while a process handles these signals.
This method is not suitable for stopped tasks. A stopped task can have a
few pending signals, when we will try to execute a parasite code, we will
need to drop SIGSTOP, but all other signals must remain pending, because a
state of processes must not be changed during checkpointing.
This patch adds two ptrace commands to set/get signal-blocked mask.
I think gdb can use this commands too.
[akpm@linux-foundation.org: be consistent with brace layout]
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A surprising number of newbies interpret this section to mean that only
one return statement is allowed per function. Part of the problem is that
the "one return statement per function" rule is an actual style guideline
that people are used to from other projects.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Eduardo Valentin <eduardo.valentin@ti.com>
Cc: Rob Landley <rob@landley.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The cp_inodes_count and cp_blocks_count are represented as __le64 type in
on-disk structure (struct nilfs_checkpoint). But analogous fields in
in-core structure (struct nilfs_root) are represented by atomic_t type.
This patch replaces atomic_t on atomic64_t type in representation of
inodes_count and blocks_count fields in struct nilfs_root.
Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Joern Engel <joern@logfs.org>
Cc: Clemens Eisserer <linuxhippy@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, NILFS2 returns 0 as free inodes count (f_ffree) and current
used inodes count as total file nodes in file system (f_files):
df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/loop0 2 2 0 100% /mnt/nilfs2
This patch implements real calculation of free inodes count. First of
all, it is calculated total file nodes in file system as
(desc_blocks_count * groups_per_desc_block * entries_per_group). Then, it
is calculated free inodes count as difference the total file nodes and
used inodes count. As a result, we have such output for NILFS2:
df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/loop0 4194304 2114701 2079603 51% /mnt/nilfs2
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On CSR SiRFprimaII/atlasVI, there is a programmable 16-bit divider
(RTC_DIV) that divides the input 32.768KHz clock to the frequency that
users need (E.g. 1 Hz). The divided real-time clock will be used to
drive a 32-bit counter (RTC_COUNTER) that provides users with the actual
time.
In each cycle of the divided real-time clock, there is a Hertz interrupt
generated to the RISC. Users can also configure an alarm (RTC_ALARM).
When RTC_COUNTER matches the alarm, there will be an alarm interrupt
generated to the RISC.
The system RTC can generate an alarm wake-up signal to notify the power
controller to wake up from power saving mode.
Signed-off-by: Xianglong Du <Xianglong.Du@csr.com>
Signed-off-by: Barry Song <Baohua.Song@csr.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
rtc-omap driver modules is used both by OMAP1/2, Davinci SoC platforms.
However, rtc wake support on OMAP1 is broken. Hence the
device_init_wakeup() was removed from rtc-omap driver and moved to
platform board files that supported it (DA850/OMAP-L138). [1]
However, recently [2] it was suggested that driver should always do a
device_init_wakeup(dev, true). Platforms that don't want/need
wakeup support can disable it from userspace via:
echo disabled > /sys/devices/.../power/wakeup
Also, with the new DT boot-up, board file doesn't exist and hence there
is no way to have device wakeup support rtc.
The fix for above issues, is to hard code device_init_wakeup() inside
driver and let platforms that don't need this, handle it through the
sysfs power entry.
[1]
https://patchwork.kernel.org/patch/136731/
[2]
http://www.mail-archive.com/davinci-linux-open-source@linux.
davincidsp.com/msg26077.html
Signed-off-by: Hebbar Gururaja <gururaja.hebbar@ti.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>