Commit Graph

48 Commits

Author SHA1 Message Date
Oleg Nesterov 2f38d70fb4 shift current_cred() from __f_setown() to f_modown()
Shift current_cred() from __f_setown() to f_modown(). This reduces
the number of arguments and saves 48 bytes from fs/fcntl.o.

[ Note: this doesn't clear euid/uid when pid is set to NULL.  But if
  f_owner.pid == NULL we never use f_owner.uid/euid.  Otherwise we'd
  have a bug anyway: we must not send signals if pid was reset to NULL.  ]

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-16 14:19:00 -07:00
Jeff Mahoney 2b79bc4f7e dup2: Fix return value with oldfd == newfd and invalid fd
The return value of dup2 when oldfd == newfd and the fd isn't valid is
not getting properly sign extended.  We end up with 4294967287 instead
of -EBADF.

I've reproduced this on SLE11 (2.6.27.21), openSUSE Factory
(2.6.29-rc5), and Ubuntu 9.04 (2.6.28).

This patch uses a signed int for the error value so it is properly
extended.

Commit 6c5d0512a0 introduced this
regression.

Reported-by: Jiri Dluhos <jdluhos@novell.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-11 12:18:06 -07:00
Jonathan Corbet 4a6a449969 Fix a lockdep warning in fasync_helper()
Lockdep gripes if file->f_lock is taken in a no-IRQ situation, since that
is not always the case.  We don't really want to disable IRQs for every
acquisition of f_lock; instead, just move it outside of fasync_lock.

Reported-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-30 08:00:24 -06:00
Jonathan Corbet 60aa49243d Rationalize fasync return values
Most fasync implementations do something like:

     return fasync_helper(...);

But fasync_helper() will return a positive value at times - a feature used
in at least one place.  Thus, a number of other drivers do:

     err = fasync_helper(...);
     if (err < 0)
             return err;
     return 0;

In the interests of consistency and more concise code, it makes sense to
map positive return values onto zero where ->fasync() is called.

Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-16 08:34:35 -06:00
Jonathan Corbet 76398425bb Move FASYNC bit handling to f_op->fasync()
Removing the BKL from FASYNC handling ran into the challenge of keeping the
setting of the FASYNC bit in filp->f_flags atomic with regard to calls to
the underlying fasync() function.  Andi Kleen suggested moving the handling
of that bit into fasync(); this patch does exactly that.  As a result, we
have a couple of internal API changes: fasync() must now manage the FASYNC
bit, and it will be called without the BKL held.

As it happens, every fasync() implementation in the kernel with one
exception calls fasync_helper().  So, if we make fasync_helper() set the
FASYNC bit, we can avoid making any changes to the other fasync()
functions - as long as those functions, themselves, have proper locking.
Most fasync() implementations do nothing but call fasync_helper() - which
has its own lock - so they are easily verified as correct.  The BKL had
already been pushed down into the rest.

The networking code has its own version of fasync_helper(), so that code
has been augmented with explicit FASYNC bit handling.

Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Miller <davem@davemloft.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-16 08:32:27 -06:00
Jonathan Corbet db1dd4d376 Use f_lock to protect f_flags
Traditionally, changes to struct file->f_flags have been done under BKL
protection, or with no protection at all.  This patch causes all f_flags
changes after file open/creation time to be done under protection of
f_lock.  This allows the removal of some BKL usage and fixes a number of
longstanding (if microscopic) races.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-16 08:32:27 -06:00
Heiko Carstens a26eab2400 [CVE-2009-0029] System call wrappers part 15
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14 14:15:24 +01:00
James Morris cbacc2c7f0 Merge branch 'next' into for-linus 2008-12-25 11:40:09 +11:00
Jonathan Corbet 218d11a8b0 Fix a race condition in FASYNC handling
Changeset a238b790d5 (Call fasync()
functions without the BKL) introduced a race which could leave
file->f_flags in a state inconsistent with what the underlying
driver/filesystem believes.  Revert that change, and also fix the same
races in ioctl_fioasync() and ioctl_fionbio().

This is a minimal, short-term fix; the real fix will not involve the
BKL.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-05 15:35:10 -08:00
David Howells c69e8d9c01 CRED: Use RCU to access another task's creds and to release a task's own creds
Use RCU to access another task's creds and to release a task's own creds.
This means that it will be possible for the credentials of a task to be
replaced without another task (a) requiring a full lock to read them, and (b)
seeing deallocated memory.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:19 +11:00
David Howells 86a264abe5 CRED: Wrap current->cred and a few other accessors
Wrap current->cred and a few other accessors to hide their actual
implementation.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:18 +11:00
David Howells b6dff3ec5e CRED: Separate task security context from task_struct
Separate the task security context from task_struct.  At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:16 +11:00
David Howells da9592edeb CRED: Wrap task credential accesses in the filesystem subsystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:05 +11:00
Al Viro 1b7e190b47 [PATCH] clean dup2() up a bit
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-01 11:25:24 -04:00
Al Viro 1027abe882 [PATCH] merge locate_fd() and get_unused_fd()
New primitive: alloc_fd(start, flags).  get_unused_fd() and
get_unused_fd_flags() become wrappers on top of it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-01 11:25:23 -04:00
Al Viro 4e1e018ecc [PATCH] fix RLIM_NOFILE handling
* dup2() should return -EBADF on exceeded sysctl_nr_open
* dup() should *not* return -EINVAL even if you have rlimit set to 0;
  it should get -EMFILE instead.

Check for orig_start exceeding rlimit taken to sys_fcntl().
Failing expand_files() in dup{2,3}() now gets -EMFILE remapped to -EBADF.
Consequently, remaining checks for rlimit are taken to expand_files().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:45 -04:00
Al Viro 6c5d0512a0 [PATCH] get rid of corner case in dup3() entirely
Since Ulrich is OK with getting rid of dup3(fd, fd, flags) completely,
to hell the damn thing goes.  Corner case for dup2() is handled in
sys_dup2() (complete with -EBADF if dup2(fd, fd) is called with fd
that is not open), the rest is done in dup3().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:44 -04:00
Ulrich Drepper 3c333937ee [PATCH] dup3 fix
Al Viro notice one cornercase that the new dup3() code.  The dup2()
function, as a special case, handles dup-ing to the same file
descriptor.  In this case the current dup3() code does nothing at
all.  I.e., it ingnores the flags parameter.  This shouldn't happen,
the close-on-exec flag should be set if requested.

In case the O_CLOEXEC bit in the flags parameter is not set the
dup3() function should behave in this respect identical to dup2().
This means dup3(fd, fd, 0) should not actively reset the c-o-e
flag.

The patch below implements this minor change.

[AV: credits to Artur Grabowski for bringing that up as potential subtle point
in dup2() behaviour]

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:39 -04:00
Ulrich Drepper 336dd1f70f flag parameters: dup2
This patch adds the new dup3 syscall.  It extends the old dup2 syscall by one
parameter which is meant to hold a flag value.  Support for the O_CLOEXEC flag
is added in this patch.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_dup3
# ifdef __x86_64__
#  define __NR_dup3 292
# elif defined __i386__
#  define __NR_dup3 330
# else
#  error "need __NR_dup3"
# endif
#endif

int
main (void)
{
  int fd = syscall (__NR_dup3, 1, 4, 0);
  if (fd == -1)
    {
      puts ("dup3(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("dup3(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_dup3, 1, 4, O_CLOEXEC);
  if (fd == -1)
    {
      puts ("dup3(O_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("dup3(O_CLOEXEC) set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:28 -07:00
Jonathan Corbet a238b790d5 Call fasync() functions without the BKL
lock_kernel() calls have been pushed down into code which needs it, so
there is no need to take the BKL at this level anymore.

This work inspired and aided by Andi Kleen's unlocked_fasync() patches.

Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2008-07-02 15:06:28 -06:00
Al Viro 9f3acc3140 [PATCH] split linux/file.h
Initial splitoff of the low-level stuff; taken to fdtable.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-05-01 13:08:16 -04:00
Al Viro f8f95702f0 [PATCH] sanitize locate_fd()
* 'file' argument is unused; lose it.
* move setting flags from the caller (dupfd()) to locate_fd();
  pass cloexec flag as new argument.  Note that files_fdtable()
  that used to be in dupfd() isn't needed in the place in
  locate_fd() where the moved code ends up - we know that ->file_lock
  hadn't been dropped since the last time we calculated fdt because
  we can get there only if expand_files() returns 0 and it doesn't
  drop/reacquire in that case.
* move getting/dropping ->file_lock into locate_fd().  Now the caller
  doesn't need to do anything with files_struct *files anymore and
  we can move that inside locate_fd() as well, killing the
  struct files_struct * argument.

At that point locate_fd() is extremely similar to get_unused_fd_flags()
and the next patches will merge those two.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-25 09:24:05 -04:00
Harvey Harrison fc9b52cd8f fs: remove fastcall, it is always empty
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:31 -08:00
Pavel Emelyanov 6c5f3e7b43 Pidns: make full use of xxx_vnr() calls
Some time ago the xxx_vnr() calls (e.g.  pid_vnr or find_task_by_vpid) were
_all_ converted to operate on the current pid namespace.  After this each call
like xxx_nr_ns(foo, current->nsproxy->pid_ns) is nothing but a xxx_vnr(foo)
one.

Switch all the xxx_nr_ns() callers to use the xxx_vnr() calls where
appropriate.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:29 -08:00
Pavel Emelyanov b488893a39 pid namespaces: changes to show virtual ids to user
This is the largest patch in the set. Make all (I hope) the places where
the pid is shown to or get from user operate on the virtual pids.

The idea is:
 - all in-kernel data structures must store either struct pid itself
   or the pid's global nr, obtained with pid_nr() call;
 - when seeking the task from kernel code with the stored id one
   should use find_task_by_pid() call that works with global pids;
 - when showing pid's numerical value to the user the virtual one
   should be used, but however when one shows task's pid outside this
   task's namespace the global one is to be used;
 - when getting the pid from userspace one need to consider this as
   the virtual one and use appropriate task/pid-searching functions.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: nuther build fix]
[akpm@linux-foundation.org: yet nuther build fix]
[akpm@linux-foundation.org: remove unneeded casts]
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Paul Menage <menage@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:40 -07:00