On a large machine I noticed the columns of /proc/interrupts failed to line up
with the header after CPU9. At sufficiently large numbers of CPUs it becomes
impossible to line up the CPU number with the counts.
While fixing this I noticed x86 has a number of updates that we may as well
pull in. On PowerPC we currently omit an interrupt completely if there is no
active handler, whereas on x86 it is printed if there is a non zero count.
The x86 code also spaces the first column correctly based on nr_irqs.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
PowerPC is currently using asm-generic/hardirq.h which statically allocates an
NR_CPUS irq_stat array. Switch to an arch specific implementation which uses
per cpu data:
On a kernel with NR_CPUS=1024, this saves quite a lot of memory:
text data bss dec hex filename
8767938 2944132 1636796 13348866 cbb002 vmlinux.baseline
8767779 2944260 1505724 13217763 c9afe3 vmlinux.irq_cpustat
A saving of around 128kB.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The clockevent multiplier and shift is useful information, but we
only need to print it once.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
RTAS should never cause an exception but if it does (for example accessing
outside our RMO) then we might go a long way through the kernel before
oopsing. If we unset MSR_RI we should at least stop things on exception
exit.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We use firmware_has_feature quite a lot these days, so it's worth putting
powerpc_firmware_features into __read_mostly.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add printout of last accessed sysfs file, added to x86 in
ae87221d3c (sysfs: crash debugging)
Also add the notify_die hook that allows us to print out the ftrace
buffer on oops. This is useful in conjunction with ftrace function_graph:
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=128 NUMA pSeries
last sysfs file: /sys/class/net/tunl0/type
Dumping ftrace buffer:
...
0) | .sysrq_handle_crash() {
0) 0.476 us | .hash_page();
0) 0.488 us | .xmon_fault_handler();
0) | .bad_page_fault() {
0) | .search_exception_tables() {
0) 0.590 us | .search_module_extables();
0) 2.546 us | }
0) | .printk() {
0) | .vprintk() {
0) 0.488 us | ._raw_spin_lock();
0) 0.572 us | .emit_log_char();
Showing the function graph of a sysrq-c crash.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
String constants that are continued on subsequent lines with \
are not good.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Updated variant of a patch by Joel Schopp.
The field containing the number of supported cores which we pass to
firmware via the ibm,client-architecture call was set by a previous
patch statically as high as is possible (NR_CPUS).
However, that value isn't quite right for a system that supports
multiple threads per core, thus permitting the firmware to assign
more cores to a Linux partition than it can really cope with.
This patch improves it by using the device-tree to determine the
number of threads supported by the processors in order to adjust
the value passed to firmware.
Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch adds 2 fields to the ibm_architecture_vec array.
The first of these fields indicates the number of cores which Linux can
boot. It does not account for SMT, so it may result in cpus assigned to
Linux which cannot be booted. A second patch follows that dynamically
updates this for SMT.
The second field just indicates that our OS is Linux, and not another
OS. The system may or may not use this hint to performance tune
settings for Linux.
Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Using perf to trace L1 dcache misses and dumping data addresses I found a few
variables taking a lot of misses. Since they are almost never written, they
should go into the __read_mostly section.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The cputime code has a few places that do per_cpu(, smp_processor_id()).
Replace them with __get_cpu_var().
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Here are the powerpc bits to remove TIF_ABI_PENDING now that
set_personality() is called at the appropriate place in exec.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add missing call to pci_fixup_device(pci_fixup_early, ...) when
building the pci_dev from scratch off the Open Firmware device-tree
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add missing hookup to existing pci_slot when building the pci_dev from
scratch off the Open Firmware device-tree
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We are missing these when building the pci_dev from scratch off
the Open Firmware device-tree
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Move the defintion and lock helper routines for the cpu hotplug driver
lock from pseries to powerpc code to avoid build breaks for platforms
other than pseries that use cpu hotplug.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The newly added fixup for buggy dcbX insn's has
a bug that always trigger a kernel TLB walk so a user space
dcbX insn will cause a Kernel Machine Check if it hits DTLB error.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We noticed that recent kernels didn't boot on our 1GHz Canyonlands 460EX
boards anymore. As it seems, patch 8d165db1 [powerpc: Improve
decrementer accuracy] introduced this problem. The routine div_sc()
overflows with shift = 32 resulting in this incorrect setup:
time_init: decrementer frequency = 1000.000012 MHz
time_init: processor frequency = 1000.000012 MHz
clocksource: timebase mult[400000] shift[22] registered
clockevent: decrementer mult[33] shift[32] cpu[0]
This patch now introduces a local div_dc64() version of this function
so that this overflow doesn't happen anymore.
Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Detlev Zundel <dzu@denx.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It seems there is a thinko in the TLB invalidation code that makes the
tlbie in the loop executed just once. The intended check was probably
'gt', not 'lt'.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Various kernel asm modifies SRR0/SRR1 just before executing
a rfi. If such code crosses a page boundary you risk a TLB miss
which will clobber SRR0/SRR1. Avoid this by always pinning
kernel instruction TLB space.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI/cardbus: Add a fixup hook and fix powerpc
PCI: change PCI nomenclature in drivers/pci/ (non-comment changes)
PCI: change PCI nomenclature in drivers/pci/ (comment changes)
PCI: fix section mismatch on update_res()
PCI: add Intel 82599 Virtual Function specific reset method
PCI: add Intel USB specific reset method
PCI: support device-specific reset methods
PCI: Handle case when no pci device can provide cache line size hint
PCI/PM: Propagate wake-up enable for PCIe devices too
vgaarbiter: fix a typo in the vgaarbiter Documentation
This patch fixes the handling of VSX alignment faults in little-endian
mode (the current code assumes the processor is in big-endian mode).
The patch also makes the handlers clear the top 8 bytes of the register
when handling an 8 byte VSX load.
This is based on 2.6.32.
Signed-off-by: Neil Campbell <neilc@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The cardbus code creates PCI devices without ever going through the
necessary fixup bits and pieces that normal PCI devices go through.
There's in fact a commented out call to pcibios_fixup_bus() in there,
it's commented because ... it doesn't work.
I could make pcibios_fixup_bus() do the right thing on powerpc easily
but I felt it cleaner instead to provide a specific hook pci_fixup_cardbus
for which a weak empty implementation is provided by the PCI core.
This fixes cardbus on powerbooks and probably all other PowerPC
platforms which was broken completely for ever on some platforms and
since 2.6.31 on others such as PowerBooks when we made the DMA ops
mandatory (since those are setup by the fixups).
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>