Commit Graph

3334 Commits

Author SHA1 Message Date
Abhishek Sagar
f47cd9b553 kprobes: kretprobe user entry-handler
Provide support to add an optional user defined callback to be run at
function entry of a kretprobe'd function.  Also modify the kprobe smoke
tests to include an entry-handler during the kretprobe sanity test.

Signed-off-by: Abhishek Sagar <sagar.abhishek@gmail.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Acked-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:11 -08:00
Andrew Morton
ec03d70739 speed up jiffies conversion functions if HZ==USER_HZ
Avoid calling do_div(x, 1) in this case.

Cc: David Fries <david@fries.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:10 -08:00
David Fries
6ffc787a44 system timer: fix crash in <100Hz system timer
The kernel has a divide by zero crash when trying to run the system timer
less than 100Hz.  The problem is x/(HZ/USER_HZ) and related.  Now
x*(USER_HZ/HZ) will be used if HZ<USER_HZ.

I'm running the Linux kernel under qemu and went to run a slower system
timer to take less CPU (and battery) on the host.  I found that the kernel
paniced under emulation because of a divide by zero in three places.  Here
is the patch.  The base git was updated today 01-05-2008.  I went for a
20Hz system time by adding config HZ_20 etc to kernel/Kconfig.hz.  With
this patch I verified the system timer by looking at /proc/interrupts.

[akpm@linux-foundation.org: partially clean up the macro maze]
Signed-off-by: David Fries <david@fries.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:10 -08:00
Eric Dumazet
1bf47346d7 kernel/sys.c: get rid of expensive divides in groups_sort()
groups_sort() can be quite long if user loads a large gid table.

This is because GROUP_AT(group_info, some_integer) uses an integer divide.
So having to do XXX thousand divides during one syscall can lead to very
high latencies.  (NGROUPS_MAX=65536)

In the past (25 Mar 2006), an analog problem was found in groups_search()
(commit d74beb9f33 ) and at that time I
changed some variables to unsigned int.

I believe that a more generic fix is to make sure NGROUPS_PER_BLOCK is
unsigned.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:09 -08:00
Adrian Bunk
6b2fb3c658 idle_regs() must be __cpuinit
Fix the following section mismatch with CONFIG_HOTPLUG=n,
CONFIG_HOTPLUG_CPU=y:

WARNING: vmlinux.o(.text+0x399a6): Section mismatch: reference to .init.text.5:idle_regs (between 'fork_idle' and 'get_task_mm')

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:08 -08:00
Richard Knutsson
eb38a996eb kernel/params.c: remove sparse-warning (different signedness)
Fixing:
  CHECK   kernel/params.c
kernel/params.c:329:41: warning: incorrect type in argument 8 (different signedness)
kernel/params.c:329:41:    expected int *num
kernel/params.c:329:41:    got unsigned int *

Signed-off-by: Richard Knutsson <ricknu-0@student.ltu.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:08 -08:00
Daniel Walker
6c6080f74c stopmachine: semaphore to mutex
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Daniel Walker <dwalker@mvista.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:08 -08:00
Roland McGrath
1a669c2f16 Add arch_ptrace_stop
This adds support to allow asm/ptrace.h to define two new macros,
arch_ptrace_stop_needed and arch_ptrace_stop.  These control special
machine-specific actions to be done before a ptrace stop.  The new code
compiles away to nothing when the new macros are not defined.  This is the
case on all machines to begin with.

On ia64, these macros will be defined to solve the long-standing issue of
ptrace vs register backing store.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:07 -08:00
Nick Piggin
a1e096129b relay: nopage
Convert relay from nopage to fault.
Remove redundant vma range checks.
Switch from OOM to SIGBUS if the resource is not available.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Tom Zanussi <zanussi@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:07 -08:00
Eric Dumazet
9cfe015aa4 get rid of NR_OPEN and introduce a sysctl_nr_open
NR_OPEN (historically set to 1024*1024) actually forbids processes to open
more than 1024*1024 handles.

Unfortunatly some production servers hit the not so 'ridiculously high
value' of 1024*1024 file descriptors per process.

Changing NR_OPEN is not considered safe because of vmalloc space potential
exhaust.

This patch introduces a new sysctl (/proc/sys/fs/nr_open) wich defaults to
1024*1024, so that admins can decide to change this limit if their workload
needs it.

[akpm@linux-foundation.org: export it for sparc64]
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:06 -08:00
Oleg Nesterov
0a76fe8e50 do_wait: remove one "else if" branch
Minor cleanup. We can remove one "else if" branch.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:04 -08:00
Denys Vlasenko
eed4a2aba7 printk.c: use unsigned ints instead of longs for logbuf index
Stop using unsigned _longs_ for printk buffer indexes.  Log buffer is way
smaller than 2 gigabytes and unsigned ints will work too .  Indeed, they do
work nicely on all 32-bit platforms where longs and ints are the same.

With this patch, we have following size savings on amd64:

   text    data     bss     dec     hex filename
   5997     313   17736   24046    5dee 2.6.23.1.t64/kernel/printk.o
   5858     313   17700   23871    5d3f 2.6.23.1.printk.t64/kernel/printk.o

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:04 -08:00
Miao Xie
5e2cb1018a time: fix sysfs_show_{available,current}_clocksources() buffer overflow problem
I found that there is a buffer overflow problem in the following code.

Version:	2.6.24-rc2,
File:		kernel/time/clocksource.c:417-432
--------------------------------------------------------------------
static ssize_t
sysfs_show_available_clocksources(struct sys_device *dev, char *buf)
{
	struct clocksource *src;
	char *curr = buf;

	spin_lock_irq(&clocksource_lock);
	list_for_each_entry(src, &clocksource_list, list) {
		curr += sprintf(curr, "%s ", src->name);
	}
	spin_unlock_irq(&clocksource_lock);

	curr += sprintf(curr, "\n");

	return curr - buf;
}
-----------------------------------------------------------------------

sysfs_show_current_clocksources() also has the same problem though in practice
the size of current clocksource's name won't exceed PAGE_SIZE.

I fix the bug by using snprintf according to the specification of the kernel
(Version:2.6.24-rc2,File:Documentation/filesystems/sysfs.txt)

Fix sysfs_show_available_clocksources() and sysfs_show_current_clocksources()
buffer overflow problem with snprintf().

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:03 -08:00
Adrian Bunk
c166f23cb5 kernel/notifier.c should #include <linux/reboot.h>
Every file should include the headers containing the prototypes for its global
functions (in this case {,un}register_reboot_notifier()).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:02 -08:00
Adrian Bunk
bb695170d8 make srcu_readers_active() static
Make the needlessly global srcu_readers_active() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:02 -08:00
Adrian Bunk
f17d30a803 kernel/ptrace.c should #include <linux/syscalls.h>
Every file should include the headers containing the prototypes for its global
functions (in this case sys_ptrace()).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:02 -08:00
Robin Getz
a3b81113fb remove support for un-needed _extratext section
When passing a zero address to kallsyms_lookup(), the kernel thought it was
a valid kernel address, even if it is not.  This is because is_ksym_addr()
called is_kernel_extratext() and checked against labels that don't exist on
many archs (which default as zero).  Since PPC was the only kernel which
defines _extra_text, (in 2005), and no longer needs it, this patch removes
_extra_text support.

For some history (provided by Jon):
 http://ozlabs.org/pipermail/linuxppc-dev/2005-September/019734.html
 http://ozlabs.org/pipermail/linuxppc-dev/2005-September/019736.html
 http://ozlabs.org/pipermail/linuxppc-dev/2005-September/019751.html

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jon Loeliger <jdl@freescale.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:01 -08:00
Oleg Nesterov
d9ae90ac4b use __set_task_state() for TRACED/STOPPED tasks
1. It is much easier to grep for ->state change if __set_task_state() is used
   instead of the direct assignment.

2. ptrace_stop() and handle_group_stop() use set_task_state() which adds the
   unneeded mb() (btw even if we use mb() it is still possible that do_wait()
   sees the new ->state but not ->exit_code, but this is ok).

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:00 -08:00
Michael Neuling
06b8e878a9 taskstats scaled time cleanup
This moves the ability to scale cputime into generic code.  This allows us
to fix the issue in kernel/timer.c (noticed by Balbir) where we could only
add an unscaled value to the scaled utime/stime.

This adds a cputime_to_scaled function.  As before, the POWERPC version
does the scaling based on the last SPURR/PURR ratio calculated.  The
generic and s390 (only other arch to implement asm/cputime.h) versions are
both NOPs.

Also moves the SPURR and PURR snapshots closer.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: Jay Lan <jlan@engr.sgi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:00 -08:00
Mark Gross
f011e2e2df latency.c: use QoS infrastructure
Replace latency.c use with pm_qos_params use.

Signed-off-by: mark gross <mgross@linux.intel.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:22 -08:00
Mark Gross
d82b35186e pm qos infrastructure and interface
The following patch is a generalization of the latency.c implementation done
by Arjan last year.  It provides infrastructure for more than one parameter,
and exposes a user mode interface for processes to register pm_qos
expectations of processes.

This interface provides a kernel and user mode interface for registering
performance expectations by drivers, subsystems and user space applications on
one of the parameters.

Currently we have {cpu_dma_latency, network_latency, network_throughput} as
the initial set of pm_qos parameters.

The infrastructure exposes multiple misc device nodes one per implemented
parameter.  The set of parameters implement is defined by pm_qos_power_init()
and pm_qos_params.h.  This is done because having the available parameters
being runtime configurable or changeable from a driver was seen as too easy to
abuse.

For each parameter a list of performance requirements is maintained along with
an aggregated target value.  The aggregated target value is updated with
changes to the requirement list or elements of the list.  Typically the
aggregated target value is simply the max or min of the requirement values
held in the parameter list elements.

>From kernel mode the use of this interface is simple:

pm_qos_add_requirement(param_id, name, target_value):

  Will insert a named element in the list for that identified PM_QOS
  parameter with the target value.  Upon change to this list the new target is
  recomputed and any registered notifiers are called only if the target value
  is now different.

pm_qos_update_requirement(param_id, name, new_target_value):

  Will search the list identified by the param_id for the named list element
  and then update its target value, calling the notification tree if the
  aggregated target is changed.  with that name is already registered.

pm_qos_remove_requirement(param_id, name):

  Will search the identified list for the named element and remove it, after
  removal it will update the aggregate target and call the notification tree
  if the target was changed as a result of removing the named requirement.

>From user mode:

  Only processes can register a pm_qos requirement.  To provide for
  automatic cleanup for process the interface requires the process to register
  its parameter requirements in the following way:

  To register the default pm_qos target for the specific parameter, the
  process must open one of /dev/[cpu_dma_latency, network_latency,
  network_throughput]

  As long as the device node is held open that process has a registered
  requirement on the parameter.  The name of the requirement is
  "process_<PID>" derived from the current->pid from within the open system
  call.

  To change the requested target value the process needs to write a s32
  value to the open device node.  This translates to a
  pm_qos_update_requirement call.

  To remove the user mode request for a target value simply close the device
  node.

[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: fix build again]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: mark gross <mgross@linux.intel.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Adam Belay <abelay@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:22 -08:00
Adrian Bunk
4ef7229ffa make kernel_shutdown_prepare() static
kernel_shutdown_prepare() can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:22 -08:00
Adrian Bunk
47a460d5a3 kernel/power/disk.c: make code static
resume_file[] and create_image() can become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:22 -08:00
Serge E. Hallyn
3b7391de67 capabilities: introduce per-process capability bounding set
The capability bounding set is a set beyond which capabilities cannot grow.
 Currently cap_bset is per-system.  It can be manipulated through sysctl,
but only init can add capabilities.  Root can remove capabilities.  By
default it includes all caps except CAP_SETPCAP.

This patch makes the bounding set per-process when file capabilities are
enabled.  It is inherited at fork from parent.  Noone can add elements,
CAP_SETPCAP is required to remove them.

One example use of this is to start a safer container.  For instance, until
device namespaces or per-container device whitelists are introduced, it is
best to take CAP_MKNOD away from a container.

The bounding set will not affect pP and pE immediately.  It will only
affect pP' and pE' after subsequent exec()s.  It also does not affect pI,
and exec() does not constrain pI'.  So to really start a shell with no way
of regain CAP_MKNOD, you would do

	prctl(PR_CAPBSET_DROP, CAP_MKNOD);
	cap_t cap = cap_get_proc();
	cap_value_t caparray[1];
	caparray[0] = CAP_MKNOD;
	cap_set_flag(cap, CAP_INHERITABLE, 1, caparray, CAP_DROP);
	cap_set_proc(cap);
	cap_free(cap);

The following test program will get and set the bounding
set (but not pI).  For instance

	./bset get
		(lists capabilities in bset)
	./bset drop cap_net_raw
		(starts shell with new bset)
		(use capset, setuid binary, or binary with
		file capabilities to try to increase caps)

************************************************************
cap_bound.c
************************************************************
 #include <sys/prctl.h>
 #include <linux/capability.h>
 #include <sys/types.h>
 #include <unistd.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>

 #ifndef PR_CAPBSET_READ
 #define PR_CAPBSET_READ 23
 #endif

 #ifndef PR_CAPBSET_DROP
 #define PR_CAPBSET_DROP 24
 #endif

int usage(char *me)
{
	printf("Usage: %s get\n", me);
	printf("       %s drop <capability>\n", me);
	return 1;
}

 #define numcaps 32
char *captable[numcaps] = {
	"cap_chown",
	"cap_dac_override",
	"cap_dac_read_search",
	"cap_fowner",
	"cap_fsetid",
	"cap_kill",
	"cap_setgid",
	"cap_setuid",
	"cap_setpcap",
	"cap_linux_immutable",
	"cap_net_bind_service",
	"cap_net_broadcast",
	"cap_net_admin",
	"cap_net_raw",
	"cap_ipc_lock",
	"cap_ipc_owner",
	"cap_sys_module",
	"cap_sys_rawio",
	"cap_sys_chroot",
	"cap_sys_ptrace",
	"cap_sys_pacct",
	"cap_sys_admin",
	"cap_sys_boot",
	"cap_sys_nice",
	"cap_sys_resource",
	"cap_sys_time",
	"cap_sys_tty_config",
	"cap_mknod",
	"cap_lease",
	"cap_audit_write",
	"cap_audit_control",
	"cap_setfcap"
};

int getbcap(void)
{
	int comma=0;
	unsigned long i;
	int ret;

	printf("i know of %d capabilities\n", numcaps);
	printf("capability bounding set:");
	for (i=0; i<numcaps; i++) {
		ret = prctl(PR_CAPBSET_READ, i);
		if (ret < 0)
			perror("prctl");
		else if (ret==1)
			printf("%s%s", (comma++) ? ", " : " ", captable[i]);
	}
	printf("\n");
	return 0;
}

int capdrop(char *str)
{
	unsigned long i;

	int found=0;
	for (i=0; i<numcaps; i++) {
		if (strcmp(captable[i], str) == 0) {
			found=1;
			break;
		}
	}
	if (!found)
		return 1;
	if (prctl(PR_CAPBSET_DROP, i)) {
		perror("prctl");
		return 1;
	}
	return 0;
}

int main(int argc, char *argv[])
{
	if (argc<2)
		return usage(argv[0]);
	if (strcmp(argv[1], "get")==0)
		return getbcap();
	if (strcmp(argv[1], "drop")!=0 || argc<3)
		return usage(argv[0]);
	if (capdrop(argv[2])) {
		printf("unknown capability\n");
		return 1;
	}
	return execl("/bin/bash", "/bin/bash", NULL);
}
************************************************************

[serue@us.ibm.com: fix typo]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>a
Signed-off-by: "Serge E. Hallyn" <serue@us.ibm.com>
Tested-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:20 -08:00
Andrew Morgan
e338d263a7 Add 64-bit capability support to the kernel
The patch supports legacy (32-bit) capability userspace, and where possible
translates 32-bit capabilities to/from userspace and the VFS to 64-bit
kernel space capabilities.  If a capability set cannot be compressed into
32-bits for consumption by user space, the system call fails, with -ERANGE.

FWIW libcap-2.00 supports this change (and earlier capability formats)

 http://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.6/

[akpm@linux-foundation.org: coding-syle fixes]
[akpm@linux-foundation.org: use get_task_comm()]
[ezk@cs.sunysb.edu: build fix]
[akpm@linux-foundation.org: do not initialise statics to 0 or NULL]
[akpm@linux-foundation.org: unused var]
[serue@us.ibm.com: export __cap_ symbols]
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:20 -08:00