From: Christoph Lameter <clameter@sgi.com>
Looks like a comma was left from the conversion from a struct to an
assignment.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Few of the notifier_chain_register() callers use __init in the definition
of notifier_call. It is incorrect as the function definition should be
available after the initializations (they do not unregister them during
initializations).
This patch fixes all such usages to _not_ have the notifier_call __init
section.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A couple of places are forgetting to take it.
The kswapd case is probably unimportant. keventd_create_kthread() was racy.
The whole thing is a bit flakey: you start a kernel thread, get its pid from
kernel_thread() then look up its task_struct.
a) It assumes that pid recycling takes a "long" time.
b) We get a task_struct but no reference was taken on it. The owner of the
kswapd and kthread task_struct*'s must assume that the new thread won't
exit unexpectedly. Because if it does, they're left holding dead memory
and any attempt to control or stop that task will crash.
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Centralize the page migration functions in anticipation of additional
tinkering. Creates a new file mm/migrate.c
1. Extract buffer_migrate_page() from fs/buffer.c
2. Extract central migration code from vmscan.c
3. Extract some components from mempolicy.c
4. Export pageout() and remove_from_swap() from vmscan.c
5. Make it possible to configure NUMA systems without page migration
and non-NUMA systems with page migration.
I had to so some #ifdeffing in mempolicy.c that may need a cleanup.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make shrink_all_memory() repeat the attempts to free more memory if there
seems to be no pages to free.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As suggested by Marcelo:
1. The optimization introduced recently for not calling
page_referenced() during zone reclaim makes two additional checks in
shrink_list unnecessary.
2. The if (unlikely(sc->may_swap)) in refill_inactive_zone is optimized
for the zone_reclaim case. However, most peoples system only does swap.
Undo that.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove __put_page from outside the core mm/. It is dangerous because it does
not handle compound pages nicely, and misses 1->0 transitions. If a user
later appears that really needs the extra speed we can reevaluate.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In shrink_inactive_list(), nr_scan is not accounted when nr_taken is 0.
But 0 pages taken does not mean 0 pages scanned.
Move the goto statement below the accounting code to fix it.
Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In isolate_lru_pages(), *scanned reports one more scan because the scan
counter is increased one more time on exit of the while-loop.
Change the while-loop to for-loop to fix it.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add some comments to explain how zone reclaim works. And it fixes the
following issues:
- PF_SWAPWRITE needs to be set for RECLAIM_SWAP to be able to write
out pages to swap. Currently RECLAIM_SWAP may not do that.
- remove setting nr_reclaimed pages after slab reclaim since the slab shrinking
code does not use that and the nr_reclaimed pages is just right for the
intended follow up action.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We have:
try_to_free_pages
->shrink_caches(struct zone **zones, ..)
->shrink_zone(struct zone *, ...)
->shrink_cache(struct zone *, ...)
->shrink_list(struct list_head *, ...)
->refill_inactive_list((struct zone *, ...)
which is fairly irrational.
Rename things so that we have
try_to_free_pages
->shrink_zones(struct zone **zones, ..)
->shrink_zone(struct zone *, ...)
->shrink_inactive_list(struct zone *, ...)
->shrink_page_list(struct list_head *, ...)
->shrink_active_list(struct zone *, ...)
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Change all the vmscan functions to retunr the number-of-reclaimed pages and
remove scan_conrtol.nr_reclaimed.
Saves ten-odd bytes of text and makes things clearer and more consistent.
The patch also changes the behaviour of zone_reclaim() when it falls back to slab shrinking. Christoph says
"Setting this to one means that we will rescan and shrink the slab for
each allocation if we are out of zone memory and RECLAIM_SLAB is set. Plus
if we do an order 0 allocation we do not go off node as intended.
"We better set this to zero. This means the allocation will go offnode
despite us having potentially freed lots of memory on the zone. Future
allocations can then again be done from this zone."
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Turn basically everything in vmscan.c into `unsigned long'. This is to avoid
the possibility that some piece of code in there might decide to operate upon
more than 4G (or even 2G) of pages in one hit.
This might be silly, but we'll need it one day.
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Initialise as much of scan_control as possible at the declaration site. This
tidies things up a bit and assures us that all unmentioned fields are zeroed
out.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make nr_to_scan and priority a parameter instead of putting it into scan
control. This allows various small optimizations and IMHO makes the code
easier to read.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The VM has an interesting race where a page refcount can drop to zero, but it
is still on the LRU lists for a short time. This was solved by testing a 0->1
refcount transition when picking up pages from the LRU, and dropping the
refcount in that case.
Instead, use atomic_add_unless to ensure we never pick up a 0 refcount page
from the LRU, thus a 0 refcount page will never have its refcount elevated
until it is allocated again.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
PG_active is protected by zone->lru_lock, it does not need TestSet/TestClear
operations.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
PG_lru is protected by zone->lru_lock. It does not need TestSet/TestClear
operations.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If vmscan finds a zero refcount page on the lru list, never ClearPageLRU
it. This means the release code need not hold ->lru_lock to stabilise
PageLRU, so that lock may be skipped entirely when releasing !PageLRU pages
(because we know PageLRU won't have been temporarily cleared by vmscan,
which was previously guaranteed by holding the lock to synchronise against
vmscan).
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
page migration currently simply retries a couple of times if try_to_unmap()
fails without inspecting the return code.
However, SWAP_FAIL indicates that the page is in a vma that has the
VM_LOCKED flag set (if ignore_refs ==1). We can check for that return code
and avoid retrying the migration.
migrate_page_remove_references() now needs to return a reason why the
failure occured. So switch migrate_page_remove_references to use -Exx
style error messages.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If the process has already set PF_MALLOC and is already using
current->reclaim_state then do not try to reclaim memory from the zone.
This is set by kswapd and/or synchrononous global reclaim which will not
take it lightly if we zap the reclaim_state.
Signed-off-by: Christoph Lameter <clameter@sig.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- PF_SWAPWRITE needs to be set for RECLAIM_SWAP to be able to write
out pages to swap. Currently RECLAIM_SWAP may not do that.
- remove setting nr_reclaimed pages after slab reclaim since the slab shrinking
code does not use that and the nr_reclaimed pages is just right for the
intended follow up action.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This puts the variables and the way to get to reclaim_mapped in one block.
And allows zone_reclaim or other things to skip the determination (maybe
this whole block of code does not belong into refill_inactive_zone()?)
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
shrink_zone() already increments reclaim_in_progress. No need to do it in
balance_pgdat.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>