Commit Graph

3139 Commits

Author SHA1 Message Date
Hugh Dickins
f449952bc8 [PATCH] mm: mm_struct hiwaters moved
Slight and timid rearrangement of mm_struct: hiwater_rss and hiwater_vm were
tacked on the end, but it seems better to keep them near _file_rss, _anon_rss
and total_vm, in the same cacheline on those arches verified.

There are likely to be more profitable rearrangements, but less obvious (is it
good or bad that saved_auxv[AT_VECTOR_SIZE] isolates cpu_vm_mask and context
from many others?), needing serious instrumentation.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Hugh Dickins
365e9c87a9 [PATCH] mm: update_hiwaters just in time
update_mem_hiwater has attracted various criticisms, in particular from those
concerned with mm scalability.  Originally it was called whenever rss or
total_vm got raised.  Then many of those callsites were replaced by a timer
tick call from account_system_time.  Now Frank van Maarseveen reports that to
be found inadequate.  How about this?  Works for Frank.

Replace update_mem_hiwater, a poor combination of two unrelated ops, by macros
update_hiwater_rss and update_hiwater_vm.  Don't attempt to keep
mm->hiwater_rss up to date at timer tick, nor every time we raise rss (usually
by 1): those are hot paths.  Do the opposite, update only when about to lower
rss (usually by many), or just before final accounting in do_exit.  Handle
mm->hiwater_vm in the same way, though it's much less of an issue.  Demand
that whoever collects these hiwater statistics do the work of taking the
maximum with rss or total_vm.

And there has been no collector of these hiwater statistics in the tree.  The
new convention needs an example, so match Frank's usage by adding a VmPeak
line above VmSize to /proc/<pid>/status, and also a VmHWM line above VmRSS
(High-Water-Mark or High-Water-Memory).

There was a particular anomaly during mremap move, that hiwater_vm might be
captured too high.  A fleeting such anomaly remains, but it's quickly
corrected now, whereas before it would stick.

What locking?  None: if the app is racy then these statistics will be racy,
it's not worth any overhead to make them exact.  But whenever it suits,
hiwater_vm is updated under exclusive mmap_sem, and hiwater_rss under
page_table_lock (for now) or with preemption disabled (later on): without
going to any trouble, minimize the time between reading current values and
updating, to minimize those occasions when a racing thread bumps a count up
and back down in between.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Nick Piggin
b5810039a5 [PATCH] core remove PageReserved
Remove PageReserved() calls from core code by tightening VM_RESERVED
handling in mm/ to cover PageReserved functionality.

PageReserved special casing is removed from get_page and put_page.

All setting and clearing of PageReserved is retained, and it is now flagged
in the page_alloc checks to help ensure we don't introduce any refcount
based freeing of Reserved pages.

MAP_PRIVATE, PROT_WRITE of VM_RESERVED regions is tentatively being
deprecated.  We never completely handled it correctly anyway, and is be
reintroduced in future if required (Hugh has a proof of concept).

Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
be trivially removed.

Last real user of PageReserved is swsusp, which uses PageReserved to
determine whether a struct page points to valid memory or not.  This still
needs to be addressed (a generic page_is_ram() should work).

A last caveat: the ZERO_PAGE is now refcounted and managed with rmap (and
thus mapcounted and count towards shared rss).  These writes to the struct
page could cause excessive cacheline bouncing on big systems.  There are a
number of ways this could be addressed if it is an issue.

Signed-off-by: Nick Piggin <npiggin@suse.de>

Refcount bug fix for filemap_xip.c

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Hugh Dickins
4294621f41 [PATCH] mm: rss = file_rss + anon_rss
I was lazy when we added anon_rss, and chose to change as few places as
possible.  So currently each anonymous page has to be counted twice, in rss
and in anon_rss.  Which won't be so good if those are atomic counts in some
configurations.

Change that around: keep file_rss and anon_rss separately, and add them
together (with get_mm_rss macro) when the total is needed - reading two
atomics is much cheaper than updating two atomics.  And update anon_rss
upfront, typically in memory.c, not tucked away in page_add_anon_rmap.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:38 -07:00
Hugh Dickins
a8fb5618da [PATCH] mm: unlink_file_vma, remove_vma
Divide remove_vm_struct into two parts: first anon_vma_unlink plus
unlink_file_vma, to unlink the vma from the list and tree by which rmap or
vmtruncate might find it; then remove_vma to close, fput and free.

The intention here is to do the anon_vma_unlink and unlink_file_vma earlier,
in free_pgtables before freeing any page tables: so we can be sure that any
page tables traversed by rmap and vmtruncate are stable (and other, ordinary
cases are stabilized by holding mmap_sem).

This will be crucial to traversing pgd,pud,pmd without page_table_lock.  But
testing the split-out patch showed that lifting the page_table_lock is
symbiotically necessary to make this change - the lock ordering is wrong to
move those unlinks into free_pgtables while it's under ptlock.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:37 -07:00
Hugh Dickins
ab50b8ed81 [PATCH] mm: vm_stat_account unshackled
The original vm_stat_account has fallen into disuse, with only one user, and
only one user of vm_stat_unaccount.  It's easier to keep track if we convert
them all to __vm_stat_account, then free it from its __shackles.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:37 -07:00
Andi Kleen
dfcd3c0dc4 [PATCH] Convert mempolicies to nodemask_t
The NUMA policy code predated nodemask_t so it used open coded bitmaps.
Convert everything to nodemask_t.  Big patch, but shouldn't have any actual
behaviour changes (except I removed one unnecessary check against
node_online_map and one unnecessary BUG_ON)

Signed-off-by: "Andi Kleen" <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:35 -07:00
Rik Van Riel
eb92f4ef32 [PATCH] add sem_is_read/write_locked()
Add sem_is_read/write_locked functions to the read/write semaphores, along the
same lines of the *_is_locked spinlock functions.  The swap token tuning patch
uses sem_is_read_locked; sem_is_write_locked is added for completeness.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:35 -07:00
Christoph Lameter
930fc45a49 [PATCH] vmalloc_node
This patch adds

vmalloc_node(size, node)	-> Allocate necessary memory on the specified node

and

get_vm_area_node(size, flags, node)

and the other functions that it depends on.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:35 -07:00
Jeff Garzik
0169e284f6 [libata] remove ata_chk_err(), ->check_err() hook.
We now depend on ->tf_read() to provide us with the contents
of the Error shadow register.
2005-10-29 21:25:10 -04:00
Herbert Xu
d32311fed7 [PATCH] Introduce sg_set_buf
sg_init_one is a nice tool for the block layer.  However, users
of struct scatterlist in other subsystems don't usually need the
DMA attributes.  For them it's a waste of time and space to
initialise the whole struct scatterlist structure.

Therefore this patch adds a new function sg_set_buf to initialise
a scatterlist without zeroing the DMA attributes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-30 11:14:39 +11:00
Jeff Garzik
b0c4e148bd Merge branch 'master' 2005-10-29 17:49:12 -04:00
Russell King
bbbf508d64 [DRIVER MODEL] Add missing platform_device.h header.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-10-29 22:17:58 +01:00
Linus Torvalds
e9d52234e3 Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus 2005-10-29 12:19:15 -07:00
Pete Popov
26a940e217 Cleaned up AMD Au1200 IDE driver:
- converted to platform bus
- removed pci dependencies
- removed virt_to_phys/phys_to_virt calls
    
System now can root off of a disk.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

diff --git a/Documentation/mips/AU1xxx_IDE.README b/Documentation/mips/AU1xxx_IDE.README
new file mode 100644
2005-10-29 19:32:20 +01:00
Pete Popov
bdf21b18b4 Philips PNX8550 support: MIPS32-like core with 2 Trimedias on it.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:31:54 +01:00
Linus Torvalds
62d3af1b5f Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-10-29 11:25:16 -07:00
Russell King
d052d1beff Create platform_device.h to contain all the platform device details.
Convert everyone who uses platform_bus_type to include
linux/platform_device.h.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-29 19:07:23 +01:00
Arnaldo Carvalho de Melo
fc228a04a4 Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 2005-10-29 03:10:35 -02:00
Olaf Hering
146c98782b [PATCH] ppc64 boot: remove include from include/linux/zutil.h
zutil.h does not need errno.h

Signed-off-by: Olaf Hering <olh@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-29 15:04:02 +10:00
Andy Fleming
b37665e0ba [PATCH] ppc32: 85xx PHY Platform Update
This patch updates the 85xx platform code to support the new PHY Layer.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Kumar Gala <Kumar.gala@freescale.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-29 14:42:28 +10:00
Stephen Hemminger
360ac8e2f1 [ETH]: ether address compare
Expose faster ether compare for use by protocols and other
driver. And change name to be more consistent with other ether
address manipulation routines in same file

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-29 02:23:58 -02:00
Jeff Garzik
5615ca7906 Merge branch 'upstream' 2005-10-28 21:32:01 -04:00
Andrew Morton
9a7834d06d [PATCH] USB: fix pm patches with CONFIG_PM off part 2
With CONFIG_PM=n:

drivers/built-in.o(.text+0x1098c): In function `hub_thread':
drivers/usb/core/hub.c:2673: undefined reference to `.dpm_runtime_resume'
drivers/built-in.o(.text+0x10998):drivers/usb/core/hub.c:2674: undefined reference to `.dpm_runtime_resume'

Please, never ever ever put extern decls into .c files.  Use the darn header
files :(

Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-28 16:47:52 -07:00
Alan Stern
4f62efe67f [PATCH] usbcore: Fix handling of sysfs strings and other attributes
This patch (as592) makes a few small improvements to the way device
strings are handled, and it fixes some bugs in a couple of other sysfs
attribute routines.  (Look at show_configuration_string() to see what I
mean.)

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-28 16:47:51 -07:00