Commit Graph

231 Commits

Author SHA1 Message Date
David Gibson 8e561e7eda [POWERPC] Kill typedef-ed structs for hash PTEs and BATs
Using typedefs to rename structure types if frowned on by CodingStyle.
However, we do so for the hash PTE structure on both ppc32 (where it's
called "PTE") and ppc64 (where it's called "hpte_t").  On ppc32 we
also have such a typedef for the BATs ("BAT").

This removes this unhelpful use of typedefs, in the process
bringing ppc32 and ppc64 closer together, by using the name "struct
hash_pte" in both cases.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-14 22:30:16 +10:00
David Gibson c0770f686c [POWERPC] Remove a couple of unused definitions from pgtable_32.c
In arch/powerpc/mm/pgtable_32.c, the variable io_bat_index and the
macro is_power_of_4() no longer have any users.  This removes them.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-14 22:30:15 +10:00
David Gibson f21f49ea63 [POWERPC] Remove the dregs of APUS support from arch/powerpc
APUS (the Amiga Power-Up System) is not supported under arch/powerpc
and it's unlikely it ever will be.  Therefore, this patch removes the
fragments of APUS support code from arch/powerpc which have been
copied from arch/ppc.

A few APUS references are left in asm-powerpc in .h files which are
still used from arch/ppc.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-14 22:30:15 +10:00
David Gibson 90ac19a8b2 [POWERPC] Abolish iopa(), mm_ptov(), io_block_mapping() from arch/powerpc
These old-fashioned IO mapping functions no longer have any callers in
code which remains relevant on arch/powerpc.  Therefore, this removes
them from arch/powerpc.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-14 22:30:15 +10:00
will schmidt effe24bdd4 [POWERPC] During VM oom condition, kill all threads in process group
We have had complaints where a threaded application is left in a bad state
after one of it's threads is killed when we hit a VM: out_of_memory
condition.

Killing just one of the process threads can leave the application in a
bad state, whereas killing the entire process group would allow for
the application to restart, or be otherwise handled, and makes it very
obvious that something has gone wrong.

This change allows the entire process group to be taken down, rather than
just the one thread.

lightly tested on powerpc

Signed-off-by: Will <will_schmidt@vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-14 22:29:59 +10:00
Benjamin Herrenschmidt 3d5134ee83 [POWERPC] Rewrite IO allocation & mapping on powerpc64
This rewrites pretty much from scratch the handling of MMIO and PIO
space allocations on powerpc64.  The main goals are:

 - Get rid of imalloc and use more common code where possible
 - Simplify the current mess so that PIO space is allocated and
   mapped in a single place for PCI bridges
 - Handle allocation constraints of PIO for all bridges including
   hot plugged ones within the 2GB space reserved for IO ports,
   so that devices on hotplugged busses will now work with drivers
   that assume IO ports fit in an int.
 - Cleanup and separate tracking of the ISA space in the reserved
   low 64K of IO space. No ISA -> Nothing mapped there.

I booted a cell blade with IDE on PIO and MMIO and a dual G5 so
far, that's it :-)

With this patch, all allocations are done using the code in
mm/vmalloc.c, though we use the low level __get_vm_area with
explicit start/stop constraints in order to manage separate
areas for vmalloc/vmap, ioremap, and PCI IOs.

This greatly simplifies a lot of things, as you can see in the
diffstat of that patch :-)

A new pair of functions pcibios_map/unmap_io_space() now replace
all of the previous code that used to manipulate PCI IOs space.
The allocation is done at mapping time, which is now called from
scan_phb's, just before the devices are probed (instead of after,
which is by itself a bug fix). The only other caller is the PCI
hotplug code for hot adding PCI-PCI bridges (slots).

imalloc is gone, as is the "sub-allocation" thing, but I do beleive
that hotplug should still work in the sense that the space allocation
is always done by the PHB, but if you unmap a child bus of this PHB
(which seems to be possible), then the code should properly tear
down all the HPTE mappings for that area of the PHB allocated IO space.

I now always reserve the first 64K of IO space for the bridge with
the ISA bus on it. I have moved the code for tracking ISA in a separate
file which should also make it smarter if we ever are capable of
hot unplugging or re-plugging an ISA bridge.

This should have a side effect on platforms like powermac where VGA IOs
will no longer work. This is done on purpose though as they would have
worked semi-randomly before. The idea at this point is to isolate drivers
that might need to access those and fix them by providing a proper
function to obtain an offset to the legacy IOs of a given bus.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-14 22:29:56 +10:00
Benjamin Herrenschmidt c19c03fc74 [POWERPC] unmap_vm_area becomes unmap_kernel_range for the public
This makes unmap_vm_area static and a wrapper around a new
exported unmap_kernel_range that takes an explicit range instead
of a vm_area struct.

This makes it more versatile for code that wants to play with kernel
page tables outside of the standard vmalloc area.

(One example is some rework of the PowerPC PCI IO space mapping
code that depends on that patch and removes some code duplication
and horrible abuse of forged struct vm_struct).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-14 22:29:56 +10:00
Jon Tollefson 3f1df7a260 [POWERPC] Move common code out of if/else
Move common code out of if/else.

Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
----

hash_native_64.c |    3 +--
 1 files changed, 1 insertion(+), 2 deletions(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-14 22:29:55 +10:00
Kumar Gala f1aed92464 [POWERPC] Fix modpost warning
Mark pte_alloc_one_kernel as __init_refok to fix the following warning:

WARNING: arch/powerpc/mm/built-in.o(.text+0x1068): Section mismatch: reference to .init.text:early_get_page (between 'pte_alloc_one_kernel' and 'steal_context')

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2007-05-23 07:49:37 -05:00
Benjamin Herrenschmidt 5453e7723b [POWERPC] Fix warning in 32-bit builds with CONFIG_HIGHMEM
Some missing fixup for the removal of 4 level fixup header.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-22 20:20:57 +10:00
Alexey Dobriyan e8edc6e03a Detach sched.h from mm.h
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.

This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
   getting them indirectly

Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
   they don't need sched.h
b) sched.h stops being dependency for significant number of files:
   on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
   after patch it's only 3744 (-8.3%).

Cross-compile tested on

	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
	alpha alpha-up
	arm
	i386 i386-up i386-defconfig i386-allnoconfig
	ia64 ia64-up
	m68k
	mips
	parisc parisc-up
	powerpc powerpc-up
	s390 s390-up
	sparc sparc-up
	sparc64 sparc64-up
	um-x86_64
	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21 09:18:19 -07:00
Jon Tollefson 5b82583185 [POWERPC] Correct #endif comment
Fix up comment on two #endifs to match their #ifs.

Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
----

 hash_utils_64.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-17 21:11:19 +10:00
Benjamin Herrenschmidt 017e3c53f1 [POWERPC] Add spinlock to request_phb_iospace()
request_phb_iospace() can be called from different CPUs at init
time (at least with my next patch) and thus needs a spinlock.
As for the next patch, this is a temporary workaround for 2.6.22
issues until my rewrite of IO mappings is ready (for 2.6.23)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-17 21:11:14 +10:00
Kumar Gala 991eb43af9 [POWERPC] Fix COMMON symbol warnings
We get the following warnings in various ARCH=powerpc builds:

WARNING: "ee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol
WARNING: "fee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol
WARNING: "htab_hash_searches" [arch/powerpc/mm/built-in] is COMMON symbol
WARNING: "next_slot" [arch/powerpc/mm/built-in] is COMMON symbol
WARNING: "mmu_hash_lock" [arch/powerpc/mm/built-in] is COMMON symbol
WARNING: "primary_pteg_full" [arch/powerpc/mm/built-in] is COMMON symbol
WARNING: "global_dbcr0" [arch/powerpc/kernel/built-in] is COMMON symbol

Switch to moving local symbols (except mmu_hash_lock which is global) and
space directive instead.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2007-05-17 21:10:15 +10:00
Stephen Rothwell 8980ae8677 [POWERPC] Remove unused variable in hpte_decode()
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-12 11:32:47 +10:00
Stephen Rothwell 0c12fe5697 [POWERPC] Assign correct variable in hpte_decode()
This case will never be hit, but it should be corrected anyway.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-12 11:32:47 +10:00
Paul Mackerras 2454c7e98c [POWERPC] Fix warning in hpte_decode(), and generalize it
This adds the necessary support to hpte_decode() to handle 1TB
segments and 16GB pages, and removes an uninitialized value
warning on avpn.

We don't have any code to generate HPTEs for 1TB segments or 16GB
pages yet, so this is mostly for completeness, and to fix the
warning.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2007-05-10 21:28:13 +10:00
Linus Torvalds aabded9c3a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  [POWERPC] Further fixes for the removal of 4level-fixup hack from ppc32
  [POWERPC] EEH: log all PCI-X and PCI-E AER registers
  [POWERPC] EEH: capture and log pci state on error
  [POWERPC] EEH: Split up long error msg
  [POWERPC] EEH: log error only after driver notification.
  [POWERPC] fsl_soc: Make mac_addr const in fs_enet_of_init().
  [POWERPC] Don't use SLAB/SLUB for PTE pages
  [POWERPC] Spufs support for 64K LS mappings on 4K kernels
  [POWERPC] Add ability to 4K kernel to hash in 64K pages
  [POWERPC] Introduce address space "slices"
  [POWERPC] Small fixes & cleanups in segment page size demotion
  [POWERPC] iSeries: Make HVC_ISERIES the default
  [POWERPC] iSeries: suppress build warning in lparmap.c
  [POWERPC] Mark pages that don't exist as nosave
  [POWERPC] swsusp: Introduce register_nosave_region_late
2007-05-09 12:56:01 -07:00
Rafael J. Wysocki 8bb7844286 Add suspend-related notifications for CPU hotplug
Since nonboot CPUs are now disabled after tasks and devices have been
frozen and the CPU hotplug infrastructure is used for this purpose, we need
special CPU hotplug notifications that will help the CPU-hotplug-aware
subsystems distinguish normal CPU hotplug events from CPU hotplug events
related to a system-wide suspend or resume operation in progress.  This
patch introduces such notifications and causes them to be used during
suspend and resume transitions.  It also changes all of the
CPU-hotplug-aware subsystems to take these notifications into consideration
(for now they are handled in the same way as the corresponding "normal"
ones).

[oleg@tv-sign.ru: cleanups]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:56 -07:00
David Gibson f1a1eb299a [POWERPC] Further fixes for the removal of 4level-fixup hack from ppc32
Commit d1953c8888 removed the use of
4level-fixup.h for 32-bit systems under arch/powerpc.  However, I
missed a few things activated on some configurations, resulting in
some warnings (at least with STRICT_MM_TYPECHECKS enabled) and build
errors in some circumstances.  This fixes it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-09 16:35:01 +10:00
Hugh Dickins 517e22638c [POWERPC] Don't use SLAB/SLUB for PTE pages
The SLUB allocator relies on struct page fields first_page and slab,
overwritten by ptl when SPLIT_PTLOCK: so the SLUB allocator cannot then
be used for the lowest level of pagetable pages.  This was obstructing
SLUB on PowerPC, which uses kmem_caches for its pagetables.  So convert
its pte level to use normal gfp pages (whereas pmd, pud and 64k-page pgd
want partpages, so continue to use kmem_caches for pmd, pud and pgd).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-09 16:35:00 +10:00
Benjamin Herrenschmidt 16c2d47623 [POWERPC] Add ability to 4K kernel to hash in 64K pages
This adds the ability for a kernel compiled with 4K page size
to have special slices containing 64K pages and hash the right type
of hash PTEs.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-09 16:35:00 +10:00
Benjamin Herrenschmidt d0f13e3c20 [POWERPC] Introduce address space "slices"
The basic issue is to be able to do what hugetlbfs does but with
different page sizes for some other special filesystems; more
specifically, my need is:

 - Huge pages

 - SPE local store mappings using 64K pages on a 4K base page size
kernel on Cell

 - Some special 4K segments in 64K-page kernels for mapping a dodgy
type of powerpc-specific infiniband hardware that requires 4K MMU
mappings for various reasons I won't explain here.

The main issues are:

 - To maintain/keep track of the page size per "segment" (as we can
only have one page size per segment on powerpc, which are 256MB
divisions of the address space).

 - To make sure special mappings stay within their allotted
"segments" (including MAP_FIXED crap)

 - To make sure everybody else doesn't mmap/brk/grow_stack into a
"segment" that is used for a special mapping

Some of the necessary mechanisms to handle that were present in the
hugetlbfs code, but mostly in ways not suitable for anything else.

The patch relies on some changes to the generic get_unmapped_area()
that just got merged.  It still hijacks hugetlb callbacks here or
there as the generic code hasn't been entirely cleaned up yet but
that shouldn't be a problem.

So what is a slice ?  Well, I re-used the mechanism used formerly by our
hugetlbfs implementation which divides the address space in
"meta-segments" which I called "slices".  The division is done using
256MB slices below 4G, and 1T slices above.  Thus the address space is
divided currently into 16 "low" slices and 16 "high" slices.  (Special
case: high slice 0 is the area between 4G and 1T).

Doing so simplifies significantly the tracking of segments and avoids
having to keep track of all the 256MB segments in the address space.

While I used the "concepts" of hugetlbfs, I mostly re-implemented
everything in a more generic way and "ported" hugetlbfs to it.

Slices can have an associated page size, which is encoded in the mmu
context and used by the SLB miss handler to set the segment sizes.  The
hash code currently doesn't care, it has a specific check for hugepages,
though I might add a mechanism to provide per-slice hash mapping
functions in the future.

The slice code provide a pair of "generic" get_unmapped_area() (bottomup
and topdown) functions that should work with any slice size.  There is
some trickiness here so I would appreciate people to have a look at the
implementation of these and let me know if I got something wrong.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-09 16:35:00 +10:00
Benjamin Herrenschmidt 16f1c74675 [POWERPC] Small fixes & cleanups in segment page size demotion
The code for demoting segments to 4K had some issues, like for example,
when using _PAGE_4K_PFN flag, the first CPU to hit it would do the
demotion, but other CPUs hitting the same page wouldn't properly flush
their SLBs if mmu_ci_restriction isn't set.  There are also potential
issues with hash_preload not handling _PAGE_4K_PFN.  All of these are
non issues on current hardware but might bite us in the future.

This patch thus fixes it by:

 - Taking the test comparing the mm and current CPU context page
sizes to decide to flush SLBs out of the mmu_ci_restrictions test
since that can also be triggered by _PAGE_4K_PFN pages

 - Due to the above being done all the time, demote_segment_4k
doesn't need update the context and flush the SLB

 - demote_segment_4k can be static and doesn't need an EXPORT_SYMBOL

 - Making hash_preload ignore anything that has either _PAGE_4K_PFN
or _PAGE_NO_CACHE set, thus avoiding duplication of the complicated
logic in hash_page() (and possibly making hash_preload a little bit
faster for the normal case).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-09 16:35:00 +10:00
Johannes Berg 4e8ad3e816 [POWERPC] Mark pages that don't exist as nosave
On some powerpc architectures (notably 64-bit powermac) there is a memory
hole, for example on powermacs between 2G and 4G. Since we use the flat
memory model regardless, these pages must be marked as nosave (for suspend
to disk.)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-09 16:34:56 +10:00