This is for use with slab users that pass a dynamically allocated slab name in
kmem_cache_create, so that before destroying the slab one can retrieve the name
and free its memory.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
fault_in_pages_readable() is being passed an incorrect `end' address, which
can result in writes accidentally faulting in pages which will not be affected
by the write() call.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
try_to_unmap_cluster() does:
for (pte = pte_offset_map(pmd, address);
address < end; pte++, address += PAGE_SIZE) {
...
}
pte_unmap(pte);
It may take a little staring to notice, but pte can actually fall off the
end of the pte page in this iteration, which makes life difficult for
kmap_atomic() and the users not expecting it to BUG(). Of course, we're
somewhat lucky in that arithmetic elsewhere in the function guarantees that
at least one iteration is made, lest this force larger rearrangements to be
made. This issue and patch also apply to non-mm mainline and with trivial
adjustments, at least two related kernels.
Discovered during internal testing at Oracle.
Signed-off-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I came across the following problem while running ltp-aiodio testcases from
ltp-full-20050405 on linux-2.6.12-rc3-mm3. I tried running the tests with
EXT3 as well as JFS filesystems.
One or two fsx-linux testcases were hung after some time. These testcases
were hanging at wait_for_all_aios().
Debugging shows that there were some iocbs which were not getting completed
eventhough the last retry for those returned -EIOCBQUEUED. Also all such
pending iocbs represented READ operation.
Further debugging revealed that all such iocbs hit EOF in the DIO layer.
To be more precise, the "pos" from which they were trying to read was
greater than the "size" of the file. So the generic_file_direct_IO
returned 0.
This happens rarely as there is already a check in
__generic_file_aio_read(), for whether "pos" < "size" before calling direct
IO routine.
>size = i_size_read(inode);
>if (pos < size) {
> retval = generic_file_direct_IO(READ, iocb,
> iov, pos, nr_segs);
But for READ, we are taking the inode->i_sem only in the DIO layer. So it
is possible that some other process can change the size of the file before
we take the i_sem. In such a case ( when "pos" > "size"), the
__generic_file_aio_read() would return -EIOCBQUEUED even though there were
no I/O requests submitted by the DIO layer. This would cause the AIO layer
to expect aio_complete() for THE iocb, which doesnot happen. And thus the
test hangs forever, waiting for an I/O completion, where there are no
requests submitted at all.
The following patch makes __generic_file_aio_read() return 0 (instead of
returning -EIOCBQUEUED), on getting 0 from generic_file_direct_IO(), so
that the AIO layer does the aio_complete().
Testing:
I have tested the patch on a SMP machine(with 2 Pentium 4 (HT)) running
linux-2.6.12-rc3-mm3. I ran the ltp-aiodio testcases and none of the
fsx-linux tests hung. Also the aio-stress tests ran without any problem.
Signed-off-by: Suzuki K P <suzuki@in.ibm.com>
Signed-off-by: Suparna Bhattacharya <suparna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Caused oopses again. Also fix potential mismatch in checking if
change_page_attr was needed.
To do it without races I needed to change mm/vmalloc.c to export a
__remove_vm_area that does not take vmlist lock.
Noticed by Terence Ripperda and based on a patch of his.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As noted by Chris Wright, we need to do the full range of tests regardless
of whether MAP_FIXED is set or not, so re-organize get_unmapped_area()
slightly to do the sanity checks unconditionally.
Prevent the topdown allocator from allocating mmap areas all the way
down to address zero.
We still allow a MAP_FIXED mapping of page 0 (needed for various things,
ranging from Wine and DOSEMU to people who want to allow speculative
loads off a NULL pointer).
Tested by Chris Wright.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is a bug in do_swap_page(): when swap page happens to be unreadable,
page filled with random data is mapped into user address space. The fix is
to check for PageUptodate and send SIGBUS in case of error.
Signed-Off-By: Kirill Korotaev <dev@sw.ru>
Signed-Off-By: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix OOPS when swapping on a device that doesn't have an unplug_io_fn defined
(eg, ATA Over Ethernet)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus changed the second argument of __vmalloc from int to unsigned int
breaking the compilation for CONFIG_MMU=n configurations (since he only
changed vmalloc.c but not nommu.c).
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes mm->total_vm and mm->locked_vm acctounting in case when
move_page_tables() fails inside move_vma().
Signed-Off-By: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes a bug introduced by the "mm counter operations through
macros" patch, which replaced a decrement operation in with an increment
macro in try_to_unmap_one().
Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Export node_online_map and node_possible_map so that kernel modules can use
the nodemask macros, like, for_each_node() and for_each_online_node().
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Some KernelDoc descriptions are updated to match the current code.
No code changes.
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I have recompiled Linux kernel 2.6.11.5 documentation for me and our
university students again. The documentation could be extended for more
sources which are equipped by structured comments for recent 2.6 kernels. I
have tried to proceed with that task. I have done that more times from 2.6.0
time and it gets boring to do same changes again and again. Linux kernel
compiles after changes for i386 and ARM targets. I have added references to
some more files into kernel-api book, I have added some section names as well.
So please, check that changes do not break something and that categories are
not too much skewed.
I have changed kernel-doc to accept "fastcall" and "asmlinkage" words reserved
by kernel convention. Most of the other changes are modifications in the
comments to make kernel-doc happy, accept some parameters description and do
not bail out on errors. Changed <pid> to @pid in the description, moved some
#ifdef before comments to correct function to comments bindings, etc.
You can see result of the modified documentation build at
http://cmp.felk.cvut.cz/~pisa/linux/lkdb-2.6.11.tar.gz
Some more sources are ready to be included into kernel-doc generated
documentation. Sources has been added into kernel-api for now. Some more
section names added and probably some more chaos introduced as result of quick
cleanup work.
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch changes calls to synchronize_kernel(), deprecated in the earlier
"Deprecate synchronize_kernel, GPL replacement" patch to instead call the new
synchronize_rcu() and synchronize_sched() APIs.
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove PAGE_BUG - repalce it with BUG and BUG_ON.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace a number of memory barriers with smp_ variants. This means we won't
take the unnecessary hit on UP machines.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The smp_mb() is becaus sync_page() doesn't have PG_locked while it accesses
page_mapping(page). The comments in the patch (the entire patch is the
addition of this comment) try to explain further how and why smp_mb() is
used.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Always use page counts when doing RLIMIT_MEMLOCK checking to avoid possible
overflow.
Signed-off-by: Chris Wright <chrisw@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a patch for counting the number of pages for bounce buffers. It's
shown in /proc/vmstat.
Currently, the number of bounce pages are not counted anywhere. So, if
there are many bounce pages, it seems that there are leaked pages. And
it's difficult for a user to imagine the usage of bounce pages. So, it's
meaningful to show # of bouce pages.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the new __GFP_NOMEMALLOC to simplify the previous handling of
PF_MEMALLOC.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Mempool is pretty clever. Looks too clever for its own good :) It
shouldn't really know so much about page reclaim internals.
- don't guess about what effective page reclaim might involve.
- don't randomly flush out all dirty data if some unlikely thing
happens (alloc returns NULL). page reclaim can (sort of :P) handle
it.
I think the main motivation is trying to avoid pool->lock at all costs.
However the first allocation is attempted with __GFP_WAIT cleared, so it
will be 'can_try_harder' if it hits the page allocator. So if allocation
still fails, then we can probably afford to hit the pool->lock - and what's
the alternative? Try page reclaim and hit zone->lru_lock?
A nice upshot is that we don't need to do any fancy memory barriers or do
(intentionally) racy access to pool-> fields outside the lock.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>