Commit Graph

844 Commits

Author SHA1 Message Date
Paul Mackerras
e05b3b4adb powerpc/32: Restore previous version of 32-bit PCI code
When I removed the powermac support from arch/ppc/kernel/pci.c,
I overlooked the fact that that file is used in 32-bit ARCH=powerpc
builds.  To prevent problems in future, restore the original version
of that file as arch/powerpc/kernel/pci_32.c, and use that.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-15 22:05:47 +11:00
Paul Mackerras
a7fdd90bc4 [PATCH] ppc: Remove powermac support from ARCH=ppc
This makes it possible to build kernels for PReP and/or CHRP
with ARCH=ppc by removing the (non-building) powermac support.
It's now also possible to select PReP and CHRP independently.
Powermac users should now build with ARCH=powerpc instead of
ARCH=ppc.  (This does mean that it is no longer possible to
build a 32-bit kernel for a G5.)

Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-15 17:30:44 +11:00
Linus Torvalds
3e2b32b693 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 2006-01-14 10:42:40 -08:00
Adrian Bunk
3824ba7df9 [PATCH] remove unused tmp_buf_sem's
tmp_buf_sem sems to be a common name for something completely unused...

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de> ("usb portion")
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-14 10:41:42 -08:00
Paul Mackerras
91f62a2491 ppc: Remove duplicate export of get_wchan
The arch/powerpc version of process.c exports get_wchan itself.  When
I moved ARCH=ppc over to using arch/powerpc/kernel/process.c the
get_wchan export in arch/ppc/kernel/ppc_ksyms.c became redundant, so
remove it.

Signed-off-by: Paul Mackerras <paulus@samba.org>
(cherry picked from 9871166ad692121d6b944159ef3f053570158ea8 commit)
2006-01-14 11:48:29 +11:00
Marcelo Tosatti
8f069b1a90 [PATCH] powerpc/8xx: Use 8MB D-TLB's for kernel static mapping faults
The following implements support for instantiation of 8MB D-TLB
entries for the kernel direct virtual mapping on 8xx, thus reducing TLB
space consumed for the kernel.

Test used: writing 40MB from /dev/zero to file in ext2fs over 
RAMDISK.

$ time dd if=/dev/zero of=file bs=4k count=10000 

VANILLA			8MB kernel data pages

real    0m11.485s	real    0m11.267s
user    0m0.218s        user    0m0.250s
sys     0m8.939s	sys     0m9.108s

real    0m11.518s	real    0m10.978s
user    0m0.203s 	user    0m0.222s
sys     0m9.585s	sys     0m9.138s

real    0m11.554s	real    0m10.967s
user    0m0.228s    	user    0m0.222s
sys     0m9.497s	sys     0m9.127s

real    0m11.633s	real	0m11.286s
user    0m0.214s	user    0m0.196s
sys     0m9.529s	sys     0m9.134s

and averages for both:

real	11.54750	real 11.12450

Which is a 3.6% improvement in execution time. More improvement is
expected for loads with larger kernel data footprint (real workloads).

Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-14 11:14:27 +11:00
Russell King
91fb53866d [PATCH] Add ocp_bus_type probe and remove methods
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-13 11:26:06 -08:00
Kumar Gala
7e78e5e502 [PATCH] powerpc: Updated platforms that use gianfar to match driver
The gianfar driver changed how it required MDIO bus and phy id's
to be passed to it.  Also, it no longer passes the physical address
of the MDIO bus.  Instead we have a proper platform device.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-13 21:16:18 +11:00
Linus Torvalds
45bfe98bd7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-merge
Fix up delete/modify conflict of arch/ppc/kernel/process.c by hand (it's
gone, gone, gone).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12 10:21:22 -08:00
Al Viro
639074354b [PATCH] m68k: kill mach_floppy_setup, convert to proper __setup() in drivers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12 09:09:05 -08:00
Al Viro
b4290a23cf [PATCH] m68k: namespace pollution fix (custom->amiga_custom)
in amigahw.h custom renamed to amiga_custom, in drivers with few instances the
same replacement, in the rest - #define custom amiga_custom in driver itself

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12 09:09:00 -08:00
Al Viro
0cec6fd137 [PATCH] powerpc: task_stack_page()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12 09:08:57 -08:00
Al Viro
b5e2fc1c62 [PATCH] powerpc: task_thread_info()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12 09:08:57 -08:00
akpm@osdl.org
198e2f1811 [PATCH] scheduler cache-hot-autodetect
)

From: Ingo Molnar <mingo@elte.hu>

This is the latest version of the scheduler cache-hot-auto-tune patch.

The first problem was that detection time scaled with O(N^2), which is
unacceptable on larger SMP and NUMA systems. To solve this:

- I've added a 'domain distance' function, which is used to cache
  measurement results. Each distance is only measured once. This means
  that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT
  distances 0 and 1, and on SMP distance 0 is measured. The code walks
  the domain tree to determine the distance, so it automatically follows
  whatever hierarchy an architecture sets up. This cuts down on the boot
  time significantly and removes the O(N^2) limit. The only assumption
  is that migration costs can be expressed as a function of domain
  distance - this covers the overwhelming majority of existing systems,
  and is a good guess even for more assymetric systems.

  [ People hacking systems that have assymetries that break this
    assumption (e.g. different CPU speeds) should experiment a bit with
    the cpu_distance() function. Adding a ->migration_distance factor to
    the domain structure would be one possible solution - but lets first
    see the problem systems, if they exist at all. Lets not overdesign. ]

Another problem was that only a single cache-size was used for measuring
the cost of migration, and most architectures didnt set that variable
up. Furthermore, a single cache-size does not fit NUMA hierarchies with
L3 caches and does not fit HT setups, where different CPUs will often
have different 'effective cache sizes'. To solve this problem:

- Instead of relying on a single cache-size provided by the platform and
  sticking to it, the code now auto-detects the 'effective migration
  cost' between two measured CPUs, via iterating through a wide range of
  cachesizes. The code searches for the maximum migration cost, which
  occurs when the working set of the test-workload falls just below the
  'effective cache size'. I.e. real-life optimized search is done for
  the maximum migration cost, between two real CPUs.

  This, amongst other things, has the positive effect hat if e.g. two
  CPUs share a L2/L3 cache, a different (and accurate) migration cost
  will be found than between two CPUs on the same system that dont share
  any caches.

(The reliable measurement of migration costs is tricky - see the source
for details.)

Furthermore i've added various boot-time options to override/tune
migration behavior.

Firstly, there's a blanket override for autodetection:

	migration_cost=1000,2000,3000

will override the depth 0/1/2 values with 1msec/2msec/3msec values.

Secondly, there's a global factor that can be used to increase (or
decrease) the autodetected values:

	migration_factor=120

will increase the autodetected values by 20%. This option is useful to
tune things in a workload-dependent way - e.g. if a workload is
cache-insensitive then CPU utilization can be maximized by specifying
migration_factor=0.

I've tested the autodetection code quite extensively on x86, on 3
P3/Xeon/2MB, and the autodetected values look pretty good:

Dual Celeron (128K L2 cache):

 ---------------------
 migration cost matrix (max_cache_size: 131072, cpu: 467 MHz):
 ---------------------
           [00]    [01]
 [00]:     -     1.7(1)
 [01]:   1.7(1)    -
 ---------------------
 cacheflush times [2]: 0.0 (0) 1.7 (1784008)
 ---------------------

Here the slow memory subsystem dominates system performance, and even
though caches are small, the migration cost is 1.7 msecs.

Dual HT P4 (512K L2 cache):

 ---------------------
 migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz):
 ---------------------
           [00]    [01]    [02]    [03]
 [00]:     -     0.4(1)  0.0(0)  0.4(1)
 [01]:   0.4(1)    -     0.4(1)  0.0(0)
 [02]:   0.0(0)  0.4(1)    -     0.4(1)
 [03]:   0.4(1)  0.0(0)  0.4(1)    -
 ---------------------
 cacheflush times [2]: 0.0 (33900) 0.4 (448514)
 ---------------------

Here it can be seen that there is no migration cost between two HT
siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory
system makes inter-physical-CPU migration pretty cheap: 0.4 msecs.

8-way P3/Xeon [2MB L2 cache]:

 ---------------------
 migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz):
 ---------------------
           [00]    [01]    [02]    [03]    [04]    [05]    [06]    [07]
 [00]:     -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
 [01]:  19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
 [02]:  19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
 [03]:  19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1)
 [04]:  19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1)
 [05]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1)
 [06]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1)
 [07]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -
 ---------------------
 cacheflush times [2]: 0.0 (0) 19.2 (19281756)
 ---------------------

This one has huge caches and a relatively slow memory subsystem - so the
migration cost is 19 msecs.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: <wilder@us.ibm.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12 09:08:50 -08:00
Paul Mackerras
624cee31bc powerpc: make ARCH=ppc use arch/powerpc/kernel/process.c
Commit 5388fb1025 made signal_32.c
use discard_lazy_cpu_state, which broke ARCH=ppc because that
uses the common signal_32.c but has its own process.c.  Make ARCH=ppc
use the common process.c to fix this and to reduce the amount
of duplicated code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-12 21:22:34 +11:00
Randy Dunlap
a941564458 [PATCH] capable/capability.h (arch/)
arch: Use <linux/capability.h> where capable() is used.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 18:42:14 -08:00
Kumar Gala
08264cbc9f [PATCH] powerpc: Updated Kconfig and Makefiles for 83xx support
Updated Kconfig & Makefiles in prep for adding support for the Freescale
MPC83xx family of processors to arch/powerpc.  Moved around some config
options that are more globally applicable to other PowerPC processors.
Added a temporary config option (83xx) to match existing arch/ppc support
for the MPC83xx line.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-11 15:35:29 +11:00
Linus Torvalds
ab396e91bf Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/sam/kbuild
Fix up some trivial conflicts in {i386|ia64}/Makefile
2006-01-10 08:21:33 -08:00
Vivek Goyal
cc57165874 [PATCH] kdump: dynamic per cpu allocation of memory for saving cpu registers
- In case of system crash, current state of cpu registers is saved in memory
  in elf note format.  So far memory for storing elf notes was being allocated
  statically for NR_CPUS.

- This patch introduces dynamic allocation of memory for storing elf notes.
  It uses alloc_percpu() interface.  This should lead to better memory usage.

- Introduced based on Andi Kleen's and Eric W. Biederman's suggestions.

- This patch also moves memory allocation for elf notes from architecture
  dependent portion to architecture independent portion.  Now crash_notes is
  architecture independent.  The whole idea is that size of memory to be
  allocated per cpu (MAX_NOTE_BYTES) can be architecture dependent and
  allocation of this memory can be architecture independent.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:26 -08:00
Jiri Slaby
48d6877362 [PATCH] PCI: pci_find_device remove (ppc/platforms/85xx/mpc85xx_cds_common.c)
Signed-off-by: Jiri Slaby <xslaby@fi.muni.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-09 12:13:14 -08:00
Jiri Slaby
cee0295381 [PATCH] PCI: pci_find_device remove (ppc/kernel/pci.c)
Signed-off-by: Jiri Slaby <xslaby@fi.muni.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
 arch/ppc/kernel/pci.c |   21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)
2006-01-09 12:13:14 -08:00
Linus Torvalds
6150c32589 Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-merge 2006-01-09 10:03:44 -08:00
Paul Janzen
aed9c6ccb8 [PATCH] ppc32: Put cache flush routines back into .relocate_code section
In 2.6.14, we had the following definition of _GLOBAL() in
include/asm-ppc/processor.h:

#define _GLOBAL(n)\
        .stabs __stringify(n:F-1),N_FUN,0,0,n;\
        .globl n;\
n:

In 2.6.15, as part of the great powerpc merge, we moved this definition to
include/asm-powerpc/ppc_asm.h, where it appears (to 32-bit code) as:

#define _GLOBAL(n)      \
        .text;          \
        .stabs __stringify(n:F-1),N_FUN,0,0,n;\
        .globl n;       \
n:

Mostly, this is fine.  However, we also have the following, in
arch/ppc/boot/common/util.S:

        .section ".relocate_code","xa"
[...]
_GLOBAL(flush_instruction_cache)
[...]
_GLOBAL(flush_data_cache)
[...]

The addition of the .text section definition in the definition of
_GLOBAL overrides the .relocate_code section definition.  As a result,
these two functions don't end up in .relocate_code, so they don't get
relocated correctly, and the boot fails.

There's another suspicious-looking usage at kernel/swsusp.S:37 that
someone should look into.  I did not exhaustively search the source
tree, though.

The following is the minimal patch that fixes the immediate problem.
I could easily be convinced that the _GLOBAL definition should be
modified to remove the ".text;" line either instead of, or in addition
to, this fix.

Signed-off-by: Paul Janzen <pcj@linux.sez.to>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 15:47:08 +11:00
Matt Mackall
e585e47031 [PATCH] tiny: Make *[ug]id16 support optional
Configurable 16-bit UID and friends support

This allows turning off the legacy 16 bit UID interfaces on embedded platforms.

   text    data     bss     dec     hex filename
3330172  529036  190556 4049764  3dcb64 vmlinux-baseline
3328268  529040  190556 4047864  3dc3f8 vmlinux

From: Adrian Bunk <bunk@stusta.de>

    UID16 was accidentially disabled for !EMBEDDED.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:14:11 -08:00
Kumar Gala
e5cd040409 [PATCH] powerpc: Fix compile problem in pci.c for ppc32
pci_address_to_pio is missing a closing curly brace

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 15:05:59 +11:00