Commit Graph

234 Commits

Author SHA1 Message Date
Johannes Weiner
232086b199 ipc: unbreak 32-bit shmctl/semctl/msgctl
31a985f "ipc: use __ARCH_WANT_IPC_PARSE_VERSION in ipc/util.h" would
choose the implementation of ipc_parse_version() based on a symbol
defined in <asm/unistd.h>.

But it failed to also include this header and thus broke
IPC_64-passing 32-bit userspace because the flag wasn't masked out
properly anymore and the command not understood.

Include <linux/unistd.h> to give the architecture a chance to ask for
the non-no-op ipc_parse_version().

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-21 12:48:43 -07:00
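
The failure mode described in the commit above can be sketched in user space. This is a minimal illustration, not the kernel source: the constants match the Linux IPC UAPI headers, but `parse_version_masking()`, `parse_version_noop()`, and `command_understood()` are hypothetical names introduced here. The masking variant stands in for what an architecture gets when __ARCH_WANT_IPC_PARSE_VERSION is visible; the no-op variant stands in for what is silently selected when the defining header is not included.

```c
/* Values as found in the Linux IPC UAPI headers. */
#define IPC_OLD  0       /* old-style structure layout */
#define IPC_64   0x0100  /* flag OR-ed into cmd to request the newer layout */
#define IPC_STAT 2

/*
 * Masking variant (hypothetical name): strip IPC_64 from the command
 * and report which structure layout the caller asked for.
 */
static int parse_version_masking(int *cmd)
{
    if (*cmd & IPC_64) {
        *cmd ^= IPC_64;
        return IPC_64;
    }
    return IPC_OLD;
}

/*
 * No-op variant (hypothetical name): assumes user space never passes
 * IPC_64, so the command is left untouched.
 */
static int parse_version_noop(int *cmd)
{
    (void)cmd;
    return IPC_64;
}

/* Hypothetical stand-in for the switch statement in shmctl() and friends. */
static int command_understood(int cmd)
{
    return cmd == IPC_STAT;
}
```

With the no-op variant, a 32-bit caller passing `IPC_STAT | IPC_64` leaves the flag bit set in the command, so the dispatcher no longer recognizes it; with the masking variant the command is reduced to plain `IPC_STAT`.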
Alexey Dobriyan
665c7741fb ipcns: move free_ipcs() proto
The function is really private to ipc/, and moving its prototype there
avoids a forward declaration of struct kern_ipc_perm.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18 13:03:56 -07:00
Alexey Dobriyan
b4188def44 ipcns: make free_ipc_ns() static
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18 13:03:56 -07:00
Alexey Dobriyan
612ce478fa ipcns: extract create_ipc_ns()
clone_ipc_ns() is misnamed: it doesn't clone anything and doesn't use
the passed parameter.  Rename it.

create_ipc_ns() will be used by C/R to create fresh ipcns.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18 13:03:55 -07:00
Alexey Dobriyan
64424289dd ipcns: remove useless get/put while CLONE_NEWIPC
copy_ipcs() doesn't actually copy anything. If a new ipcns is created, it
is created from scratch, so in that case get/put on the old ipcns isn't
needed.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18 13:03:55 -07:00
Arnd Bergmann
31a985fbb1 ipc: use __ARCH_WANT_IPC_PARSE_VERSION in ipc/util.h
The definition of ipc_parse_version depends on
__ARCH_WANT_IPC_PARSE_VERSION, but the header file declares it
conditionally based on the architecture.

Use the macro consistently to make it easier to add new architectures.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18 13:03:54 -07:00
Linus Torvalds
3296ca27f5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (44 commits)
  nommu: Provide mmap_min_addr definition.
  TOMOYO: Add description of lists and structures.
  TOMOYO: Remove unused field.
  integrity: ima audit dentry_open failure
  TOMOYO: Remove unused parameter.
  security: use mmap_min_addr indepedently of security models
  TOMOYO: Simplify policy reader.
  TOMOYO: Remove redundant markers.
  SELinux: define audit permissions for audit tree netlink messages
  TOMOYO: Remove unused mutex.
  tomoyo: avoid get+put of task_struct
  smack: Remove redundant initialization.
  integrity: nfsd imbalance bug fix
  rootplug: Remove redundant initialization.
  smack: do not beyond ARRAY_SIZE of data
  integrity: move ima_counts_get
  integrity: path_check update
  IMA: Add __init notation to ima functions
  IMA: Minimal IMA policy and boot param for TCB IMA policy
  selinux: remove obsolete read buffer limit from sel_read_bool
  ...
2009-06-11 10:01:41 -07:00
Linus Torvalds
20f3f3ca49 Merge branch 'rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rcu: rcu_sched_grace_period(): kill the bogus flush_signals()
  rculist: use list_entry_rcu in places where it's appropriate
  rculist.h: introduce list_entry_rcu() and list_first_entry_rcu()
  rcu: Update RCU tracing documentation for __rcu_pending
  rcu: Add __rcu_pending tracing to hierarchical RCU
  RCU: make treercu be default
2009-06-10 19:50:03 -07:00
James Morris
73fbad283c Merge branch 'next' into for-linus
2009-06-11 11:03:14 +10:00
Mike Frysinger
586c7e6a28 shm: fix unused warnings on nommu
The massive nommu update (8feae131) resulted in these warnings:
ipc/shm.c: In function `sys_shmdt':
ipc/shm.c:974: warning: unused variable `size'
ipc/shm.c:972: warning: unused variable `next'

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-09 16:59:03 -07:00
Mimi Zohar
c9d9ac525a integrity: move ima_counts_get
Based on discussion on lkml (Andrew Morton and Eric Paris),
move ima_counts_get down a layer into shmem_file_setup() and
hugetlb_file_setup().
Resolves drm shmem_file_setup() usage case as well.

HD comment:
  I still think you're doing this at the wrong level, but recognize
  that you probably won't be persuaded until a few more users of
  alloc_file() emerge, all wanting your ima_counts_get().

  Resolving GEM's shmem_file_setup() is an improvement, so I'll say

Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-05-22 09:45:33 +10:00
Mimi Zohar
b9fc745db8 integrity: path_check update
- Add support in ima_path_check() for integrity checking without
incrementing the counts. (Required for nfsd.)
- rename and export opencount_get to ima_counts_get
- replace ima_shm_check calls with ima_counts_get
- export ima_path_check

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-05-22 09:43:41 +10:00
Jiri Pirko
05725f7eb4 rculist: use list_entry_rcu in places where it's appropriate
Use previously introduced list_entry_rcu instead of an open-coded
list_entry + rcu_dereference combination.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: dipankar@in.ibm.com
LKML-Reference: <20090414181715.GA3634@psychotron.englab.brq.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-15 12:05:25 +02:00
Geert Uytterhoeven
f26ec5baa6 namespaces: move get_mq() inside #ifdef CONFIG_SYSCTL
| ipc/mq_sysctl.c:26: warning: 'get_mq' defined but not used

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-13 15:04:29 -07:00
Serge E. Hallyn
bdc8e5f85f namespaces: mqueue namespace: adapt sysctl
Largely inspired from ipc/ipc_sysctl.c.  This patch isolates the mqueue
sysctl stuff in its own file.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:09 -07:00
Serge E. Hallyn
7eafd7c74c namespaces: ipc namespaces: implement support for posix msqueues
Implement multiple mounts of the mqueue file system, and link it to usage
of CLONE_NEWIPC.

Each ipc ns has a corresponding mqueuefs superblock.  When a user does
clone(CLONE_NEWIPC) or unshare(CLONE_NEWIPC), the unshare will cause an
internal mount of a new mqueuefs sb linked to the new ipc ns.

When a user does 'mount -t mqueue mqueue /dev/mqueue', he mounts the
mqueuefs superblock.

Posix message queues can be worked with both through the mq_* system calls
(see mq_overview(7)), and through the VFS through the mqueue mount.  Any
usage of mq_open() and friends will work with the acting task's ipc
namespace.  Any actions through the VFS will work with the mqueuefs in
which the file was created.  So if a user doesn't remount mqueuefs after
unshare(CLONE_NEWIPC), mq_open("/ab") will not be reflected in "ls
/dev/mqueue".

If task a mounts mqueue for ipc_ns:1, then clones task b with a new ipcns,
ipcns:2, and then task a is the last task in ipc_ns:1 to exit, then (1)
ipc_ns:1 will be freed, (2) its superblock will live on until task b
umounts the corresponding mqueuefs, and vfs actions will continue to
succeed, but (3) sb->s_fs_info will be NULL for the sb corresponding to
the deceased ipc_ns:1.

To make this happen, we must protect the ipc reference count when

a) a task exits and drops its ipcns->count, since it might be dropping
   it to 0 and freeing the ipcns

b) a task accesses the ipcns through its mqueuefs interface, since it
   bumps the ipcns refcount and might race with the last task in the ipcns
   exiting.

So the kref is changed to an atomic_t so we can use
atomic_dec_and_lock(&ns->count,mq_lock), and every access to the ipcns
through ns = mqueuefs_sb->s_fs_info is protected by the same lock.

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:09 -07:00
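
The refcounting rule from the commit above (drop the last reference and clear the superblock back-pointer under one lock) can be sketched in user space. This is only an illustration under stated assumptions: C11 atomics and a pthread mutex stand in for the kernel's atomic_t and spinlock, and `dec_and_lock()`, `struct demo_ns`, `struct demo_sb`, and the `demo_ns_*` helpers are hypothetical names, not kernel APIs.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_ns;
struct demo_sb { struct demo_ns *s_fs_info; };              /* stand-in superblock */
struct demo_ns { atomic_int count; struct demo_sb *sb; };   /* stand-in ipc namespace */

static pthread_mutex_t mq_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Decrement *cnt. Returns true, with the lock held, only if the count
 * reached zero; otherwise returns false with the lock not held.
 */
static bool dec_and_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
    int old = atomic_load(cnt);

    /* Fast path: we are not the last holder, so no lock is needed. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(cnt, &old, old - 1))
            return false;
    }

    /* Slow path: we may be the last holder; decide under the lock. */
    pthread_mutex_lock(lock);
    if (atomic_fetch_sub(cnt, 1) == 1)
        return true;
    pthread_mutex_unlock(lock);
    return false;
}

/* Create a namespace with one reference and link it to its superblock. */
static struct demo_ns *demo_ns_create(struct demo_sb *sb)
{
    struct demo_ns *ns = malloc(sizeof(*ns));
    if (!ns)
        return NULL;
    atomic_init(&ns->count, 1);
    ns->sb = sb;
    sb->s_fs_info = ns;
    return ns;
}

/* Case (a): a task drops its reference, possibly the last one. */
static void demo_ns_put(struct demo_ns *ns)
{
    if (dec_and_lock(&ns->count, &mq_lock)) {
        ns->sb->s_fs_info = NULL;   /* the superblock outlives the namespace */
        pthread_mutex_unlock(&mq_lock);
        free(ns);
    }
}

/* Case (b): access through the superblock; take a reference under the lock. */
static struct demo_ns *demo_ns_get_from_sb(struct demo_sb *sb)
{
    struct demo_ns *ns;

    pthread_mutex_lock(&mq_lock);
    ns = sb->s_fs_info;
    if (ns)
        atomic_fetch_add(&ns->count, 1);
    pthread_mutex_unlock(&mq_lock);
    return ns;
}
```

Because the final decrement and the lookup through `s_fs_info` are serialized by the same lock, a lookup either sees a live namespace and bumps its count before it can be freed, or sees NULL.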
Serge E. Hallyn
614b84cf4e namespaces: mqueue ns: move mqueue_mnt into struct ipc_namespace
Move mqueue vfsmount plus a few tunables into the ipc_namespace struct.
The CONFIG_IPC_NS boolean and the ipc_namespace struct will serve both the
posix message queue namespaces and the SYSV ipc namespaces.

The sysctl code will be fixed separately in patch 3.  After just this
patch, making a change to posix mqueue tunables always changes the values
in the initial ipc namespace.

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:09 -07:00
Linus Torvalds
8fe74cf053 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  Remove two unneeded exports and make two symbols static in fs/mpage.c
  Cleanup after commit 585d3bc06f
  Trim includes of fdtable.h
  Don't crap into descriptor table in binfmt_som
  Trim includes in binfmt_elf
  Don't mess with descriptor table in load_elf_binary()
  Get rid of indirect include of fs_struct.h
  New helper - current_umask()
  check_unsafe_exec() doesn't care about signal handlers sharing
  New locking/refcounting for fs_struct
  Take fs_struct handling to new file (fs/fs_struct.c)
  Get rid of bumping fs_struct refcount in pivot_root(2)
  Kill unsharing fs_struct in __set_personality()
2009-04-02 21:09:10 -07:00
Serge E. Hallyn
11dea19009 proc_sysctl: use CONFIG_PROC_SYSCTL around ipc and utsname proc_handlers
As pointed out by Cedric Le Goater (in response to Alexey's original
comment wrt mqns), ipc_sysctl.c and utsname_sysctl.c are using
CONFIG_PROC_FS, not CONFIG_PROC_SYSCTL, to determine whether to define
the proc_handlers.  Change that.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:01 -07:00
Tony Battersby
e562aebc6c ipc: make shm_get_stat() more robust
shm_get_stat() assumes idr_find(&shm_ids(ns).ipcs_idr) returns "struct
shmid_kernel *"; all other callers assume that it returns "struct
kern_ipc_perm *".  This works because "struct kern_ipc_perm" is currently
the first member of "struct shmid_kernel", but it would be better to use
container_of() to prevent future breakage.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Cc: Jiri Olsa <olsajiri@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:01 -07:00
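
Why container_of() is more robust than a plain cast can be shown with a small user-space sketch. The structures below are hypothetical stand-ins (not the kernel's struct shmid_kernel or struct kern_ipc_perm), and the embedded member is deliberately placed second so that a cast of the member pointer would yield the wrong address.

```c
#include <stddef.h>

/* Simplified form of the kernel macro, without its type checking. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for the embedded permission structure. */
struct demo_perm {
    int id;
};

/* Hypothetical stand-in for the outer object; perm is NOT the first member. */
struct demo_segment {
    unsigned long size;
    struct demo_perm perm;
};

/* Recover the outer object from a pointer to its embedded member. */
static struct demo_segment *segment_from_perm(struct demo_perm *p)
{
    return container_of(p, struct demo_segment, perm);
}
```

A direct cast from `struct demo_perm *` to `struct demo_segment *` is only correct while the member sits at offset zero; the macro subtracts the real offset, so the code keeps working if the layout changes.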
Al Viro
ce3b0f8d5c New helper - current_umask()
Reading current->fs->umask is what most fs_struct users do.
Put that into a helper function.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-31 23:00:26 -04:00
Linus Torvalds
8e9d208972 Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6
* 'bkl-removal' of git://git.lwn.net/linux-2.6:
  Rationalize fasync return values
  Move FASYNC bit handling to f_op->fasync()
  Use f_lock to protect f_flags
  Rename struct file->f_ep_lock
2009-03-26 16:14:02 -07:00
James Morris
703a3cd728 Merge branch 'master' into next
2009-03-24 10:52:46 +11:00
Jonathan Corbet
db1dd4d376 Use f_lock to protect f_flags
Traditionally, changes to struct file->f_flags have been done under BKL
protection, or with no protection at all.  This patch causes all f_flags
changes after file open/creation time to be done under protection of
f_lock.  This allows the removal of some BKL usage and fixes a number of
longstanding (if microscopic) races.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-16 08:32:27 -06:00
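
The rule introduced by the commit above, that every read-modify-write of f_flags after open happens under f_lock, can be sketched in user space. This is an illustration only: a pthread mutex stands in for the kernel spinlock, and `struct demo_file`, the `DEMO_*` flag values, and the helper functions are hypothetical.

```c
#include <pthread.h>

/* Hypothetical flag bits, for illustration only. */
#define DEMO_NONBLOCK 0x0800u
#define DEMO_FASYNC   0x2000u

/* Hypothetical stand-in for struct file: a lock next to the flags it guards. */
struct demo_file {
    pthread_mutex_t f_lock;
    unsigned int f_flags;
};

/* Setting flags at creation time needs no lock: nobody else sees f yet. */
static void demo_file_init(struct demo_file *f, unsigned int flags)
{
    pthread_mutex_init(&f->f_lock, NULL);
    f->f_flags = flags;
}

/*
 * Later changes are read-modify-write sequences, so two unlocked
 * updaters could lose each other's bits. Taking f_lock around the
 * update makes each change atomic with respect to the others.
 */
static void demo_set_flag(struct demo_file *f, unsigned int flag, int on)
{
    pthread_mutex_lock(&f->f_lock);
    if (on)
        f->f_flags |= flag;
    else
        f->f_flags &= ~flag;
    pthread_mutex_unlock(&f->f_lock);
}
```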
Mel Gorman
5a6fe12595 Do not account for the address space used by hugetlbfs using VM_ACCOUNT
When overcommit is disabled, the core VM accounts for pages used by anonymous
shared mappings, private mappings and special mappings. It keeps track of
VMAs that should be accounted for with VM_ACCOUNT and VMAs that never had a
reserve with VM_NORESERVE.

Overcommit for hugetlbfs is much riskier than overcommit for base pages
due to contiguity requirements. hugetlbfs avoids overcommitting on both
shared and private mappings using reservation counters that are checked and
updated during mmap(). This ensures (within limits) that hugepages exist in
the future when faults occur; otherwise it is too easy for applications to
be SIGKILLed.

As hugetlbfs makes its own reservations of a different unit to the base page
size, VM_ACCOUNT should never be set. Even if the units were correct, we would
double account for the usage in the core VM and hugetlbfs. VM_NORESERVE may
be set because an application can request no reserves be made for hugetlbfs
at the risk of getting killed later.

With commit fc8744adc8, VM_NORESERVE and VM_ACCOUNT are getting
unconditionally set for hugetlbfs-backed mappings. This breaks the accounting
for both the core VM and hugetlbfs: it can trigger an OOM storm when hugepage
pools are too small, and can otherwise cause lockups and corrupted counters.
This patch brings hugetlbfs more in line with how the core VM treats
VM_NORESERVE but prevents VM_ACCOUNT being set.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-10 10:48:42 -08:00