Originally __free_pages_bulk used the relative page number within a zone to
define its buddies. This meant that to maintain the "maximally aligned"
requirements (that an allocation of size N will be aligned at least to N
physically) zones had to also be aligned to 1<<MAX_ORDER pages. When
__free_pages_bulk was updated to use the relative page frame numbers of the
free'd pages to pair buddies this released the alignment constraint on the
'left' edge of the zone. This allows _either_ edge of the zone to contain
partial MAX_ORDER sized buddies. These simply never will have matching
buddies and thus will never make it to the 'top' of the pyramid.
The patch below removes a now redundant check ensuring that the mem_map was
aligned to MAX_ORDER.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The madvise() system call returns -EBADF for areas which does not map to
files, only for *behaviour* request MADV_WILLNEED.
According to man pages, madvise returns :
EBADF - the map exists, but the area maps something that isn't a file.
Fixes bug 2995.
Signed-off-by: Suzuki K P <suzuki@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix bug identifued by Richard Purdie <rpurdie@rpsys.net>.
oprofile calls check_user_page_readable() from interrupt context, so we
deadlock over various VFS locks.
But check_user_page_readable() doesn't imply either a read or a write of the
page's contents. Change __follow_page() so that check_user_page_readable()
can tell __follow_page() that we're not accessing the page's contents, and use
that info to avoid the troublesome lock-takings.
Also, make follow_page() inline for the single callsite in memory.c to save a
bit of stack space.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch includes feedback from Andrew and Christoph. Thanks for
taking time to review.
Use of empty_zero_page was eliminated to fix compilation for architectures
that don't have it.
This patch removes setting pages up-to-date in ext2_get_xip_page and all
bug checks to verify that the page is indeed up to date. Setting the page
state on mapping to userland is bogus. None of the code patchs involved
with these pages in mm cares about the page state.
still on my ToDo list: identify a place outside second extended where
__inode_direct_access should reside
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
mm/filemap_xip.c: In function `__xip_unmap':
mm/filemap_xip.c:194: request for member `pte' in something not a structure or union
Apparently pte_pfn() takes a pte_t, not a pointer to a pte_t. From looking
at asm/page.h, it seems to be the same on ia32 or ppc (iff
STRICT_MM_TYPECHECKS is enabled, which is disabled by default on ppc).
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We now print statistics when invoking the OOM killer, however this
information is not rate limited and you can get into situations where the
console is continually spammed.
For example, when a task is exiting the OOM killer will simply return
(waiting for that task to exit and clear up memory). If the VM continually
calls back into the OOM killer we get thousands of copies of show_mem() on
the console.
Use printk_ratelimit() to quieten it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove completly bogus comment from did_some_progress != 0 handling (that
same comment is a few lines below on did_some_progress = 0 case, where it
belongs).
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch used to be in Andrew's tree before the NUMA slab allocator went
in. Either this patch or the NUMA slab allocator is needed in order for
kmalloc_node to work correctly.
pcibus_to_node may be used to generate the node information passed to
kmalloc_node. pcibus_to_node returns -1 if it was not able to determine
on which node a pcibus is located. For that case kmalloc_node must
work like kmalloc.
Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I spotted this issue while in memmap_init last week. I can't say the
change has any test coverage by me. start_pfn was formerly used in main
"for" loop. The fix is replace start_pfn with pfn.
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
1. Establish a simple API for process freezing defined in linux/include/sched.h:
frozen(process) Check for frozen process
freezing(process) Check if a process is being frozen
freeze(process) Tell a process to freeze (go to refrigerator)
thaw_process(process) Restart process
frozen_process(process) Process is frozen now
2. Remove all references to PF_FREEZE and PF_FROZEN from all
kernel sources except sched.h
3. Fix numerous locations where try_to_freeze is manually done by a driver
4. Remove the argument that is no longer necessary from two function calls.
5. Some whitespace cleanup
6. Clear potential race in refrigerator (provides an open window of PF_FREEZE
cleared before setting PF_FROZEN, recalc_sigpending does not check
PF_FROZEN).
This patch does not address the problem of freeze_processes() violating the rule
that a task may only modify its own flags by setting PF_FREEZE. This is not clean
in an SMP environment. freeze(process) is therefore not SMP safe!
Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch makes use of ALIGN() to remove duplicate round-up code.
Signed-off-by: Nick Wilson <njw@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch retrieves the max_pfn being used by previous kernel and stores it
in a safe location (saved_max_pfn) before it is overwritten due to user
defined memory map. This pfn is used to make sure that user does not try to
read the physical memory beyond saved_max_pfn.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Here is the fix for the problem described in
http://bugzilla.kernel.org/show_bug.cgi?id=4721
Basically, problem is generic_file_buffered_write() is accessing beyond end
of the iov[] vector after handling the last vector. If we happen to cross
page boundary, we get a fault.
I think this simple patch is good enough. If we really don't want to
depend on the "count", then we need pass nr_segs to
filemap_set_next_iovec() and decrement it and check it.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
CONFIG_PM_DISK is long gone, but it still managed to survived at few
places.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Out-of-tree user of remap_pfn_range hit kernel BUG at mm/memory.c:1112! It
passes an unrounded size to remap_pfn_range, which was okay before 2.6.12,
but misses remap_pte_range's new end condition. An audit of all the other
ptwalks confirms that this is the only one so exposed.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix a bug on error handling in the direct I/O function.
Currently, if a file is opened with the O_DIRECT|O_SYNC flag, the write()
syscall cannot receive the EIO error after an I/O error (SCSI cable is
disconnected etc.).
Return values of other points that call generic_osync_inode() are treated
appropriately.
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch reworks filemap_xip.c with the goal to reduce code duplication
from mm/filemap.c. It applies agains 2.6.12-rc6-mm1. Instead of
implementing the aio functions, this one implements the synchronous
read/write functions only. For readv and writev, the generic fallback is
used. For aio, we rely on the application doing the fallback. Since our
"synchronous" function does memcpy immediately anyway, there is no
performance difference between using the fallbacks or implementing each
operation.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- generic_file* file operations do no longer have a xip/non-xip split
- filemap_xip.c implements a new set of fops that require get_xip_page
aop to work proper. all new fops are exported GPL-only (don't like to
see whatever code use those except GPL modules)
- __xip_unmap now uses page_check_address, which is no longer static
in rmap.c, and defined in linux/rmap.h
- mm/filemap.h is now much more clean, plainly having just Linus'
inline funcs moved here from filemap.c
- fix includes in filemap_xip to make it build cleanly on i386
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch updates some comments to match code changes.
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>