After trying to use this feature in Fedora we found the hard coding
policy like this into the kernel was a bad idea. Surprise surprise.
We ran into these problems because it was impossible to launch a
container as a logged in user and run a login daemon inside that container.
This reverts back to the old behavior before this option was added. The
option will be re-added in a userspace selectable manor such that
userspace can choose when it is and when it is not appropriate.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
This is just a code rework. It makes things more readable. It does not
make any functional changes.
It does change the log messages to include both the old session id as
well the new and it includes a new res field, which means we get
messages even when the user did not have permission to change the
loginuid.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
The audit_status structure was not designed with extensibility in mind.
Define a new AUDIT_SET_FEATURE message type which takes a new structure
of bits where things can be enabled/disabled/locked one at a time. This
structure should be able to grow in the future while maintaining forward
and backward compatibility (based loosly on the ideas from capabilities
and prctl)
This does not actually add any features, but is just infrastructure to
allow new on/off types of audit system features.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
SFR reported this 2013-05-15:
> After merging the final tree, today's linux-next build (i386 defconfig)
> produced this warning:
>
> kernel/auditfilter.c: In function 'audit_data_to_entry':
> kernel/auditfilter.c:426:3: warning: this decimal constant is unsigned only
> in ISO C90 [enabled by default]
>
> Introduced by commit 780a7654ce ("audit: Make testing for a valid
> loginuid explicit") from Linus' tree.
Replace this decimal constant in the code with a macro to make it more readable
(add to the unsigned cast to quiet the warning).
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
When the audit=1 kernel parameter is absent and auditd is not running,
AUDIT_USER_AVC messages are being silently discarded.
AUDIT_USER_AVC messages should be sent to userspace using printk(), as
mentioned in the commit message of 4a4cd633 ("AUDIT: Optimise the
audit-disabled case for discarding user messages").
When audit_enabled is 0, audit_receive_msg() discards all user messages
except for AUDIT_USER_AVC messages. However, audit_log_common_recv_msg()
refuses to allocate an audit_buffer if audit_enabled is 0. The fix is to
special case AUDIT_USER_AVC messages in both functions.
It looks like commit 50397bd1 ("[AUDIT] clean up audit_receive_msg()")
introduced this bug.
Cc: <stable@kernel.org> # v2.6.25+
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Cc: linux-audit@redhat.com
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
If audit_filter_task() nacks the new thread it makes sense
to clear TIF_SYSCALL_AUDIT which can be copied from parent
by dup_task_struct().
A wrong TIF_SYSCALL_AUDIT is not really bad but it triggers
the "slow" audit paths in entry.S to ensure the task can not
miss audit_syscall_*() calls, this is pointless if the task
has no ->audit_context.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Steve Grubb <sgrubb@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
A newline was accidentally added during session ID helper refactorization in
commit 4d3fb709. This needlessly uses up buffer space, messes up syslog
formatting and makes userspace processing less efficient. Remove it.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Conflicts:
drivers/net/ethernet/emulex/benet/be.h
drivers/net/netconsole.c
net/bridge/br_private.h
Three mostly trivial conflicts.
The net/bridge/br_private.h conflict was a function signature (argument
addition) change overlapping with the extern removals from Joe Perches.
In drivers/net/netconsole.c we had one change adjusting a printk message
whilst another changed "printk(KERN_INFO" into "pr_info(".
Lastly, the emulex change was a new inline function addition overlapping
with Joe Perches's extern removals.
Signed-off-by: David S. Miller <davem@davemloft.net>
Resolve cherry-picking conflicts:
Conflicts:
mm/huge_memory.c
mm/memory.c
mm/mprotect.c
See this upstream merge commit for more details:
52469b4fcd Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently check_hung_task() prints a warning if it detects the
problem, but it is not convenient to watch the system logs if
user-space wants to be notified about the hang.
Add the new trace_sched_process_hang() into check_hung_task(),
this way a user-space monitor can easily wait for the hang and
potentially resolve a problem.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Sullivan <dsulliva@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20131019161828.GA7439@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If a macro is only used within 2 times, and also its contents are
within 2 lines, recommend to expand it to shrink code line.
For our case, the macro is not portable either: some architectures'
assembler may use another character to mark newline in a macro (e.g.
'`' for arc), which will cause issue.
If still want to use macro and let it portable enough, it will also
need include additional header file (e.g "#include <linux/linkage.h>",
although it also need be fixed).
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Using a spinlock to atomically increase a counter sounds wrong -- we've
atomic_t for this!
Also move 'seq_nr' to a different cache line than 'lock' to reduce cache
line trashing. This has the nice side effect of decreasing the size of
struct parallel_data from 192 to 128 bytes for a x86-64 build, e.g.
occupying only two instead of three cache lines.
Those changes results in a 5% performance increase on an IPsec test run
using pcrypt.
Btw. the seq_lock spinlock was never explicitly initialized -- one more
reason to get rid of it.
Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
uprobe_copy_process() does nothing if the child shares ->mm with
the forking process, but there is a special case: CLONE_VFORK.
In this case it would be more correct to do dup_utask() but avoid
dup_xol(). This is not that important, the child should not unwind
its stack too much, this can corrupt the parent's stack, but at
least we need this to allow to ret-probe __vfork() itself.
Note: in theory, it would be better to check task_pt_regs(p)->sp
instead of CLONE_VFORK, we need to dup_utask() if and only if the
child can return from the function called by the parent. But this
needs the arch-dependant helper, and I think that nobody actually
does clone(same_stack, CLONE_VM).
Reported-by: Martin Cermak <mcermak@redhat.com>
Reported-by: David Smith <dsmith@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
This finally fixes the serious bug in uretprobes: a forked child
crashes if the parent called fork() with the pending ret probe.
Trivial test-case:
# perf probe -x /lib/libc.so.6 __fork%return
# perf record -e probe_libc:__fork perl -le 'fork || print "OK"'
(the child doesn't print "OK", it is killed by SIGSEGV)
If the child returns from the probed function it actually returns
to trampoline_vaddr, because it got the copy of parent's stack
mangled by prepare_uretprobe() when the parent entered this func.
It crashes because a) this address is not mapped and b) until the
previous change it doesn't have the proper->return_instances info.
This means that uprobe_copy_process() has to create xol_area which
has the trampoline slot, and its vaddr should be equal to parent's
xol_area->vaddr.
Unfortunately, uprobe_copy_process() can not simply do
__create_xol_area(child, xol_area->vaddr). This could actually work
but perf_event_mmap() doesn't expect the usage of foreign ->mm. So
we offload this to task_work_run(), and pass the argument via not
yet used utask->vaddr.
We know that this vaddr is fine for install_special_mapping(), the
necessary hole was recently "created" by dup_mmap() which skips the
parent's VM_DONTCOPY area, and nobody else could use the new mm.
Unfortunately, this also means that we can not handle the errors
properly, we obviously can not abort the already completed fork().
So we simply print the warning if GFP_KERNEL allocation (the only
possible reason) fails.
Reported-by: Martin Cermak <mcermak@redhat.com>
Reported-by: David Smith <dsmith@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
uprobe_copy_process() assumes that the new child doesn't need
->utask, it should be allocated by demand.
But this is not true if the forking task has the pending ret-
probes, the child should report them as well and thus it needs
the copy of parent's ->return_instances chain. Otherwise the
child crashes when it returns from the probed function.
Alternatively we could cleanup the child's stack, but this needs
per-arch changes and this is not what we want. At least systemtap
expects a .return in the child too.
Note: this change alone doesn't fix the problem, see the next
change.
Reported-by: Martin Cermak <mcermak@redhat.com>
Reported-by: David Smith <dsmith@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Currently xol_add_vma() uses get_unmapped_area() for area->vaddr,
but the next patches need to use the fixed address. So this patch
adds the new "vaddr" argument to __create_xol_area() which should
be used as area->vaddr if it is nonzero.
xol_add_vma() doesn't bother to verify that the predefined addr is
not used, insert_vm_struct() should fail if find_vma_links() detects
the overlap with the existing vma.
Also, __create_xol_area() doesn't need __GFP_ZERO to allocate area.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
No functional changes, preparation.
Extract the code which actually allocates/installs the new area
into the new helper, __create_xol_area().
While at it remove the unnecessary "ret = ENOMEM" and "ret = 0"
in xol_add_vma(), they both have no effect.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Preparation for the next patches.
Move the callsite of uprobe_copy_process() in copy_process() down
to the succesfull return. We do not care if copy_process() fails,
uprobe_free_utask() won't be called in this case so the wrong
->utask != NULL doesn't matter.
OTOH, with this change we know that copy_process() can't fail when
uprobe_copy_process() is called, the new task should either return
to user-mode or call do_exit(). This way uprobe_copy_process() can:
1. setup p->utask != NULL if necessary
2. setup uprobes_state.xol_area
3. use task_work_add(p)
Also, move the definition of uprobe_copy_process() down so that it
can see get_utask().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
perf_event_mmap_event() does kzalloc(PATH_MAX + sizeof(u64)) to
ensure we can align the size later. However this means that we
actually allocate PAGE_SIZE * 2 buffer, seems too much.
Change this code to allocate PATH_MAX==PAGE_SIZE bytes, but tell
d_path() to not use the last sizeof(u64) bytes.
Note: it is not clear why do we need __GFP_ZERO, see the next patch.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131016201004.GC23214@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>