This patch adds code, linker script and makefile support to allow
building the zImage wrapper around the kernel as a position independent
executable. This results in an ET_DYN instead of an ET_EXEC ELF output
file, which can be loaded at any location by the firmware and will
process its own relocations to work correctly at the loaded address.
This is of interest particularly since the standard ePAPR image format
must be an ET_DYN (although this patch alone is not sufficient to
produce a fully ePAPR compliant boot image).
Note for now we don't enable building with -pie for anything.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On Book3E, MMU_NO_CONTEXT != 0, but the slice_mm_new_context()
macro assumes that it is. This means that the map of the
page sizes for each slice is always initialized to zeroes
(which happens to be 4k pages), rather than to the correct
default base page size value - which might be 64k.
This patch corrects the problem.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use MMU_NO_CONTEXT as the initialiser for mm_context.id on
nohash and hash64.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We need to do that to guarantee they see any code change done by
dynamic patching during boot.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Wakeup comes from the system reset handler with a potential loss of
the non-hypervisor CPU state. We save the non-volatile state on the
stack and a pointer to it in the PACA, which the system reset handler
uses to restore things
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We need to wait a bit for them to have done their CPU setup
or we might end up with translation and EE on with different
LPCR values between threads
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We do it before we loop on the PACA start flag. This way, we get a
chance to set critical SPRs on all CPUs before Linux tries to start
them up, which avoids problems when changing some bits such as LPCR
bits that need to be identical on all threads of a core or similar
things like that. Ideally, some of that should also be done before
the MMU is enabled, but that's a separate issue which would require
moving some of the SMP startup code earlier, let's not get there
for now, it works with that change alone.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This sets the default data stream prefetch size for operating
systems that don't set their own value in DSCR. We use 4 which
is "medium".
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This uses feature sections to arrange that we always use HSPRG1
as the scratch register in the interrupt entry code rather than
SPRG2 when we're running in hypervisor mode on POWER7. This will
ensure that we don't trash the guest's SPRG2 when we are running
KVM guests. To simplify the code, we define GET_SCRATCH0() and
SET_SCRATCH0() macros like the GET_PACA/SET_PACA macros.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rework exception macros a bit to split offset from vector and add
some basic support for HDEC, HDSI, HISI and a few more.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Pass the register type to the prolog, also provides alternate "HV"
version of hardware interrupt (0x500) and adjust LPES accordingly
We tag those interrupts by setting bit 0x2 in the trap number
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When running in Hypervisor mode (arch 2.06 or later), we store the PACA
in HSPRG0 instead of SPRG1. The architecture specifies that SPRGs may be
lost during a "nap" power management operation (though they aren't
currently on POWER7) and this enables use of SPRG1 by KVM guests.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This bit indicates that we are operating in hypervisor mode on a CPU
compliant to architecture 2.06 or later (currently server only).
We set it on POWER7 and have a boot-time CPU setup function that
clears it if MSR:HV isn't set (booting under a hypervisor).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Even when nothing is specified in the device tree, and despite the
fact that we don't setup links properly yet, we still need a reasonable
value in there or some interrupts won't be setup properly to point to
an existing processor.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This adds more SPR definitions used on newer processors when running
in hypervisor mode. Along with some other P7 specific bits and pieces
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is a significant rework of the XICS driver, too significant to
conveniently break it up into a series of smaller patches to be honest.
The driver is moved to a more generic location to allow new platforms
to use it, and is broken up into separate ICP and ICS "backends". For
now we have the native and "hypervisor" ICP backends and one common
RTAS ICS backend.
The driver supports one ICP backend instanciation, and many ICS ones,
in order to accomodate future platforms with multiple possibly different
interrupt "sources" mechanisms.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: xen-kbdfront - fix mouse getting stuck after save/restore
Input: estimate number of events per packet
Input: evdev - indicate buffer overrun with SYN_DROPPED
Input: document event types and codes and their intended use
Input: add KEY_IMAGES specifically for AL Image Browser
Input: twl4030_keypad - fix potential NULL dereference in twl4030_kp_probe()
Input: h3600_ts - fix error handling at connect
Input: twl4030_keypad - avoid potential NULL-pointer dereference
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: add blk_run_queue_async
block: blk_delay_queue() should use kblockd workqueue
md: fix up raid1/raid10 unplugging.
md: incorporate new plugging into raid5.
md: provide generic support for handling unplug callbacks.
md - remove old plugging code.
md/dm - remove remains of plug_fn callback.
md: use new plugging interface for RAID IO.
block: drop queue lock before calling __blk_run_queue() for kblockd punt
Revert "block: add callback function for unplug notification"
block: Enhance new plugging support to support general callbacks
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/powermac: Build fix with SMP and CPU hotplug
powerpc/perf_event: Skip updating kernel counters if register value shrinks
powerpc: Don't write protect kernel text with CONFIG_DYNAMIC_FTRACE enabled
powerpc: Fix oops if scan_dispatch_log is called too early
powerpc/pseries: Use a kmem cache for DTL buffers
powerpc/kexec: Fix regression causing compile failure on UP
powerpc/85xx: disable Suspend support if SMP enabled
powerpc/e500mc: Remove CPU_FTR_MAYBE_CAN_NAP/CPU_FTR_MAYBE_CAN_DOZE
powerpc/book3e: Fix CPU feature handling on 64-bit e5500
powerpc: Check device status before adding serial device
powerpc/85xx: Don't add disabled PCIe devices
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (24 commits)
Btrfs: fix free space cache leak
Btrfs: avoid taking the chunk_mutex in do_chunk_alloc
Btrfs end_bio_extent_readpage should look for locked bits
Btrfs: don't force chunk allocation in find_free_extent
Btrfs: Check validity before setting an acl
Btrfs: Fix incorrect inode nlink in btrfs_link()
Btrfs: Check if btrfs_next_leaf() returns error in btrfs_real_readdir()
Btrfs: Check if btrfs_next_leaf() returns error in btrfs_listxattr()
Btrfs: make uncache_state unconditional
btrfs: using cached extent_state in set/unlock combinations
Btrfs: avoid taking the trans_mutex in btrfs_end_transaction
Btrfs: fix subvolume mount by name problem when default mount subvolume is set
fix user annotation in ioctl.c
Btrfs: check for duplicate iov_base's when doing dio reads
btrfs: properly handle overlapping areas in memmove_extent_buffer
Btrfs: fix memory leaks in btrfs_new_inode()
Btrfs: check for duplicate iov_base's when doing dio reads
Btrfs: reuse the extent_map we found when calling btrfs_get_extent
Btrfs: do not use async submit for small DIO io's
Btrfs: don't split dio bios if we don't have to
...
Rather than pass in some random truncated offset to the pid-related
functions, check that the offset is in range up-front.
This is just cleanup, the previous commit fixed the real problem.
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>