Commit Graph

1226 Commits

Author SHA1 Message Date
Srivatsa Vaddagiri
15f0b676a4 [PATCH] sched_domai: Use kmalloc_node
The sched group structures used to represent various nodes need to be
allocated from respective nodes (as suggested here also:

	http://uwsg.ucs.indiana.edu/hypermail/linux/kernel/0603.3/0051.html)

Signed-off-by: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:45 -07:00
Srivatsa Vaddagiri
d3a5aa9858 [PATCH] sched_domai: Don't use GFP_ATOMIC
Replace GFP_ATOMIC allocation for sched_group_nodes with GFP_KERNEL based
allocation.

Signed-off-by: Srivatsa Vaddagiri <vatsa@in.ibm.com
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:45 -07:00
Srivatsa Vaddagiri
51888ca25a [PATCH] sched_domain: handle kmalloc failure
Try to handle mem allocation failures in build_sched_domains by bailing out
and cleaning up thus-far allocated memory.  The patch has a direct consequence
that we disable load balancing completely (even at sibling level) upon *any*
memory allocation failure.

[Lee.Schermerhorn@hp.com: bugfix]
Signed-off-by: Srivatsa Vaddagir <vatsa@in.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:45 -07:00
Peter Williams
615052dc3b [PATCH] sched: Avoid unnecessarily moving highest priority task move_tasks()
Problem:

To help distribute high priority tasks evenly across the available CPUs
move_tasks() does not, under some circumstances, skip tasks whose load
weight is bigger than the designated amount.  Because the highest priority
task on the busiest queue may be on the expired array it may be moved as a
result of this mechanism.  Apart from not being the most desirable way to
redistribute the high priority tasks (we'd rather move the second highest
priority task), there is a risk that this could set up a loop with this
task bouncing backwards and forwards between the two queues.  (This latter
possibility can be demonstrated by running a nice==-20 CPU bound task on an
otherwise quiet 2 CPU system.)

Solution:

Modify the mechanism so that it does not override skip for the highest
priority task on the CPU.  Of course, if there are more than one tasks at
the highest priority then it will allow the override for one of them as
this is a desirable redistribution of high priority tasks.

Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:44 -07:00
Peter Williams
50ddd96917 [PATCH] sched: modify move_tasks() to improve load balancing outcomes
Problem:

The move_tasks() function is designed to move UP TO the amount of load it
is asked to move and in doing this it skips over tasks looking for ones
whose load weights are less than or equal to the remaining load to be
moved.  This is (in general) a good thing but it has the unfortunate result
of breaking one of the original load balancer's good points: namely, that
(within the limits imposed by the active/expired array model and the fact
the expired is processed first) it moves high priority tasks before low
priority ones and this means there's a good chance (see active/expired
problem for why it's only a chance) that the highest priority task on the
queue but not actually on the CPU will be moved to the other CPU where (as
a high priority task) it may preempt the current task.

Solution:

Modify move_tasks() so that high priority tasks are not skipped when moving
them will make them the highest priority task on their new run queue.

Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:44 -07:00
Peter Williams
2dd73a4f09 [PATCH] sched: implement smpnice
Problem:

The introduction of separate run queues per CPU has brought with it "nice"
enforcement problems that are best described by a simple example.

For the sake of argument suppose that on a single CPU machine with a
nice==19 hard spinner and a nice==0 hard spinner running that the nice==0
task gets 95% of the CPU and the nice==19 task gets 5% of the CPU.  Now
suppose that there is a system with 2 CPUs and 2 nice==19 hard spinners and
2 nice==0 hard spinners running.  The user of this system would be entitled
to expect that the nice==0 tasks each get 95% of a CPU and the nice==19
tasks only get 5% each.  However, whether this expectation is met is pretty
much down to luck as there are four equally likely distributions of the
tasks to the CPUs that the load balancing code will consider to be balanced
with loads of 2.0 for each CPU.  Two of these distributions involve one
nice==0 and one nice==19 task per CPU and in these circumstances the users
expectations will be met.  The other two distributions both involve both
nice==0 tasks being on one CPU and both nice==19 being on the other CPU and
each task will get 50% of a CPU and the user's expectations will not be
met.

Solution:

The solution to this problem that is implemented in the attached patch is
to use weighted loads when determining if the system is balanced and, when
an imbalance is detected, to move an amount of weighted load between run
queues (as opposed to a number of tasks) to restore the balance.  Once
again, the easiest way to explain why both of these measures are necessary
is to use a simple example.  Suppose that (in a slight variation of the
above example) that we have a two CPU system with 4 nice==0 and 4 nice=19
hard spinning tasks running and that the 4 nice==0 tasks are on one CPU and
the 4 nice==19 tasks are on the other CPU.  The weighted loads for the two
CPUs would be 4.0 and 0.2 respectively and the load balancing code would
move 2 tasks resulting in one CPU with a load of 2.0 and the other with
load of 2.2.  If this was considered to be a big enough imbalance to
justify moving a task and that task was moved using the current
move_tasks() then it would move the highest priority task that it found and
this would result in one CPU with a load of 3.0 and the other with a load
of 1.2 which would result in the movement of a task in the opposite
direction and so on -- infinite loop.  If, on the other hand, an amount of
load to be moved is calculated from the imbalance (in this case 0.1) and
move_tasks() skips tasks until it find ones whose contributions to the
weighted load are less than this amount it would move two of the nice==19
tasks resulting in a system with 2 nice==0 and 2 nice=19 on each CPU with
loads of 2.1 for each CPU.

One of the advantages of this mechanism is that on a system where all tasks
have nice==0 the load balancing calculations would be mathematically
identical to the current load balancing code.

Notes:

struct task_struct:

has a new field load_weight which (in a trade off of space for speed)
stores the contribution that this task makes to a CPU's weighted load when
it is runnable.

struct runqueue:

has a new field raw_weighted_load which is the sum of the load_weight
values for the currently runnable tasks on this run queue.  This field
always needs to be updated when nr_running is updated so two new inline
functions inc_nr_running() and dec_nr_running() have been created to make
sure that this happens.  This also offers a convenient way to optimize away
this part of the smpnice mechanism when CONFIG_SMP is not defined.

int try_to_wake_up():

in this function the value SCHED_LOAD_BALANCE is used to represent the load
contribution of a single task in various calculations in the code that
decides which CPU to put the waking task on.  While this would be a valid
on a system where the nice values for the runnable tasks were distributed
evenly around zero it will lead to anomalous load balancing if the
distribution is skewed in either direction.  To overcome this problem
SCHED_LOAD_SCALE has been replaced by the load_weight for the relevant task
or by the average load_weight per task for the queue in question (as
appropriate).

int move_tasks():

The modifications to this function were complicated by the fact that
active_load_balance() uses it to move exactly one task without checking
whether an imbalance actually exists.  This precluded the simple
overloading of max_nr_move with max_load_move and necessitated the addition
of the latter as an extra argument to the function.  The internal
implementation is then modified to move up to max_nr_move tasks and
max_load_move of weighted load.  This slightly complicates the code where
move_tasks() is called and if ever active_load_balance() is changed to not
use move_tasks() the implementation of move_tasks() should be simplified
accordingly.

struct sched_group *find_busiest_group():

Similar to try_to_wake_up(), there are places in this function where
SCHED_LOAD_SCALE is used to represent the load contribution of a single
task and the same issues are created.  A similar solution is adopted except
that it is now the average per task contribution to a group's load (as
opposed to a run queue) that is required.  As this value is not directly
available from the group it is calculated on the fly as the queues in the
groups are visited when determining the busiest group.

A key change to this function is that it is no longer to scale down
*imbalance on exit as move_tasks() uses the load in its scaled form.

void set_user_nice():

has been modified to update the task's load_weight field when it's nice
value and also to ensure that its run queue's raw_weighted_load field is
updated if it was runnable.

From: "Siddha, Suresh B" <suresh.b.siddha@intel.com>

With smpnice, sched groups with highest priority tasks can mask the imbalance
between the other sched groups with in the same domain.  This patch fixes some
of the listed down scenarios by not considering the sched groups which are
lightly loaded.

a) on a simple 4-way MP system, if we have one high priority and 4 normal
   priority tasks, with smpnice we would like to see the high priority task
   scheduled on one cpu, two other cpus getting one normal task each and the
   fourth cpu getting the remaining two normal tasks.  but with current
   smpnice extra normal priority task keeps jumping from one cpu to another
   cpu having the normal priority task.  This is because of the
   busiest_has_loaded_cpus, nr_loaded_cpus logic..  We are not including the
   cpu with high priority task in max_load calculations but including that in
   total and avg_load calcuations..  leading to max_load < avg_load and load
   balance between cpus running normal priority tasks(2 Vs 1) will always show
   imbalanace as one normal priority and the extra normal priority task will
   keep moving from one cpu to another cpu having normal priority task..

b) 4-way system with HT (8 logical processors).  Package-P0 T0 has a
   highest priority task, T1 is idle.  Package-P1 Both T0 and T1 have 1 normal
   priority task each..  P2 and P3 are idle.  With this patch, one of the
   normal priority tasks on P1 will be moved to P2 or P3..

c) With the current weighted smp nice calculations, it doesn't always make
   sense to look at the highest weighted runqueue in the busy group..
   Consider a load balance scenario on a DP with HT system, with Package-0
   containing one high priority and one low priority, Package-1 containing one
   low priority(with other thread being idle)..  Package-1 thinks that it need
   to take the low priority thread from Package-0.  And find_busiest_queue()
   returns the cpu thread with highest priority task..  And ultimately(with
   help of active load balance) we move high priority task to Package-1.  And
   same continues with Package-0 now, moving high priority task from package-1
   to package-0..  Even without the presence of active load balance, load
   balance will fail to balance the above scenario..  Fix find_busiest_queue
   to use "imbalance" when it is lightly loaded.

[kernel@kolivas.org: sched: store weighted load on up]
[kernel@kolivas.org: sched: add discrete weighted cpu load function]
[suresh.b.siddha@intel.com: sched: remove dead code]
Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Cc: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:44 -07:00
Kirill Korotaev
efc30814a8 [PATCH] sched: CPU hotplug race vs. set_cpus_allowed()
There is a race between set_cpus_allowed() and move_task_off_dead_cpu().
__migrate_task() doesn't report any err code, so task can be left on its
runqueue if its cpus_allowed mask changed so that dest_cpu is not longer a
possible target.  Also, chaning cpus_allowed mask requires rq->lock being
held.

Signed-off-by: Kirill Korotaev <dev@openvz.org>
Acked-By: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:44 -07:00
Steven Rostedt
cc94abfcbc [PATCH] unnecessary long index i in sched
Unless we expect to have more than 2G CPUs, there's no reason to have 'i'
as a long long here.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:44 -07:00
Con Kolivas
72d2854d4e [PATCH] sched: fix interactive ceiling code
The relationship between INTERACTIVE_SLEEP and the ceiling is not perfect
and not explicit enough.  The sleep boost is not supposed to be any larger
than without this code and the comment is not clear enough about what
exactly it does, just the reason it does it.  Fix it.

There is a ceiling to the priority beyond which tasks that only ever sleep
for very long periods cannot surpass.  Fix it.

Prevent the on-runqueue bonus logic from defeating the idle sleep logic.

Opportunity to micro-optimise.

Signed-off-by: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:44 -07:00
Steven Rostedt
d444886e14 [PATCH] sched: simplify bitmap definition
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:44 -07:00
Chen, Kenneth W
c96d145e71 [PATCH] sched: fix smt nice lock contention and optimization
Initial report and lock contention fix from Chris Mason:

Recent benchmarks showed some performance regressions between 2.6.16 and
2.6.5.  We tracked down one of the regressions to lock contention in
schedule heavy workloads (~70,000 context switches per second)

kernel/sched.c:dependent_sleeper() was responsible for most of the lock
contention, hammering on the run queue locks.  The patch below is more of a
discussion point than a suggested fix (although it does reduce lock
contention significantly).  The dependent_sleeper code looks very expensive
to me, especially for using a spinlock to bounce control between two
different siblings in the same cpu.

It is further optimized:

* perform dependent_sleeper check after next task is determined
* convert wake_sleeping_dependent to use trylock
* skip smt runqueue check if trylock fails
* optimize double_rq_lock now that smt nice is converted to trylock
* early exit in searching first SD_SHARE_CPUPOWER domain
* speedup fast path of dependent_sleeper

[akpm@osdl.org: cleanup]
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Chris Mason <mason@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:44 -07:00
Chandra Seetharaman
26c2143b63 [PATCH] cpu hotplug: make cpu_notifier related notifier calls __cpuinit only
Make notifier_calls associated with cpu_notifier as __cpuinit.

__cpuinit makes sure that the function is init time only unless
CONFIG_HOTPLUG_CPU is defined.

[akpm@osdl.org: section fix]
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:41 -07:00
Chandra Seetharaman
65edc68c34 [PATCH] cpu hotplug: make [un]register_cpu_notifier init time only
CPUs come online only at init time (unless CONFIG_HOTPLUG_CPU is defined).
So, cpu_notifier functionality need to be available only at init time.

This patch makes register_cpu_notifier() available only at init time, unless
CONFIG_HOTPLUG_CPU is defined.

This patch exports register_cpu_notifier() and unregister_cpu_notifier() only
if CONFIG_HOTPLUG_CPU is defined.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:41 -07:00
Chandra Seetharaman
054cc8a2d8 [PATCH] cpu hotplug: revert initdata patch submitted for 2.6.17
This patch reverts notifier_block changes made in 2.6.17

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:41 -07:00
Chandra Seetharaman
9c7b216d23 [PATCH] cpu hotplug: revert init patch submitted for 2.6.17
In 2.6.17, there was a problem with cpu_notifiers and XFS.  I provided a
band-aid solution to solve that problem.  In the process, i undid all the
changes you both were making to ensure that these notifiers were available
only at init time (unless CONFIG_HOTPLUG_CPU is defined).

We deferred the real fix to 2.6.18.  Here is a set of patches that fixes the
XFS problem cleanly and makes the cpu notifiers available only at init time
(unless CONFIG_HOTPLUG_CPU is defined).

If CONFIG_HOTPLUG_CPU is defined then cpu notifiers are available at run
time.

This patch reverts the notifier_call changes made in 2.6.17

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:40 -07:00
Paul E. McKenney
c32e066057 [PATCH] rcutorture: add call_rcu_bh() operations
Add operations for the call_rcu_bh() variant of RCU.  Also add an
rcu_batches_completed_bh() function, which is needed by rcutorture.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:40 -07:00
Paul E. McKenney
72e9bb5492 [PATCH] rcutorture: add ops vector and Classic RCU ops
Add an ops vector to rcutorture, and add the ops for Classic RCU.  Update
the rcutorture documentation to reflect slight change to the dmesg formats.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:40 -07:00
Paul E. McKenney
29766f1eb3 [PATCH] rcutorture: catchup doc fixes for idle-hz tests
This just catches the RCU torture documentation up with the recent fixes
that test RCU for architectures that turn of the scheduling-clock interrupt
for idle CPUs and the addition of a SUCCESS/FAILURE indication, fixing up
an obsolete comment as well.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:40 -07:00
Randy Dunlap
1dbe83c344 [PATCH] fix kernel-doc in kernel/ dir
Fix kernel-doc parameters in kernel/

Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1376): No description found for parameter 'u_abs_timeout'
Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1420): No description found for parameter 'u_msg_prio'
Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1420): No description found for parameter 'u_abs_timeout'
Warning(/var/linsrc/linux-2617-g9//kernel/acct.c:526): No description found for parameter 'pacct'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:39 -07:00
Ingo Molnar
34af946a22 [PATCH] spin/rwlock init cleanups
locking init cleanups:

 - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK()
 - convert rwlocks in a similar manner

this patch was generated automatically.

Motivation:

 - cleanliness
 - lockdep needs control of lock initialization, which the open-coded
   variants do not give
 - it's also useful for -rt and for lock debugging in general

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:39 -07:00
Randy Dunlap
a7807a32bb [PATCH] poison: add & use more constants
Add more poison values to include/linux/poison.h.  It's not clear to me
whether some others should be added or not, so I haven't added any of
these:

./include/linux/libata.h:#define ATA_TAG_POISON		0xfafbfcfdU
./arch/ppc/8260_io/fcc_enet.c:1918:	memset((char *)(&(immap->im_dprambase[(mem_addr+64)])), 0x88, 32);
./drivers/usb/mon/mon_text.c:429:	memset(mem, 0xe5, sizeof(struct mon_event_text));
./drivers/char/ftape/lowlevel/ftape-ctl.c:738:		memset(ft_buffer[i]->address, 0xAA, FT_BUFF_SIZE);
./drivers/block/sx8.c:/* 0xf is just arbitrary, non-zero noise; this is sorta like poisoning */

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:38 -07:00
Ingo Molnar
e6e5494cb2 [PATCH] vdso: randomize the i386 vDSO by moving it into a vma
Move the i386 VDSO down into a vma and thus randomize it.

Besides the security implications, this feature also helps debuggers, which
can COW a vma-backed VDSO just like a normal DSO and can thus do
single-stepping and other debugging features.

It's good for hypervisors (Xen, VMWare) too, which typically live in the same
high-mapped address space as the VDSO, hence whenever the VDSO is used, they
get lots of guest pagefaults and have to fix such guest accesses up - which
slows things down instead of speeding things up (the primary purpose of the
VDSO).

There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support
for older glibcs that still rely on a prelinked high-mapped VDSO.  Newer
distributions (using glibc 2.3.3 or later) can turn this option off.  Turning
it off is also recommended for security reasons: attackers cannot use the
predictable high-mapped VDSO page as syscall trampoline anymore.

There is a new vdso=[0|1] boot option as well, and a runtime
/proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned
on/off.

(This version of the VDSO-randomization patch also has working ELF
coredumping, the previous patch crashed in the coredumping code.)

This code is a combined work of the exec-shield VDSO randomization
code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell
started this patch and i completed it.

[akpm@osdl.org: cleanups]
[akpm@osdl.org: compile fix]
[akpm@osdl.org: compile fix 2]
[akpm@osdl.org: compile fix 3]
[akpm@osdl.org: revernt MAXMEM change]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:38 -07:00
KAMEZAWA Hiroyuki
2842f11419 [PATCH] catch valid mem range at onlining memory
This patch allows hot-add memory which is not aligned to section.

Now, hot-added memory has to be aligned to section size.  Considering big
section sized archs, this is not useful.

When hot-added memory is registerd as iomem resoruce by iomem resource
patch, we can make use of that information to detect valid memory range.

Note: With this, not-aligned memory can be registerd. To allow hot-add
      memory with holes, we have to do more work around add_memory().
      (It doesn't allows add memory to already existing mem section.)

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:36 -07:00
Andrew Morton
5c31f2738a [PATCH] pm_trace is dangerous
CONFIG_PM_TRACES scrogs your RTC.  Mark it as experimental, and defaulting to
`off'.

Also beef up the help message a bit.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:35 -07:00
Randy Dunlap
7f32a25f63 [PATCH] kernel/acct: fix function definition
kernel/acct.c:579:19: warning: non-ANSI function declaration of function 'acct_process'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:35 -07:00