The commit 543c043cba ("powerpc/fsl_msi: change the irq handler from
chained to normal") changes the msi cascade handler from chained to
normal. Since cascade handler must run in hard interrupt context, this
will cause kernel panic if we force threading of all the interrupt
handler via kernel command parameter 'threadirqs'. So mark the irq
handler IRQF_NO_THREAD explicitly.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Pull more powerpc updates from Michael Ellerman:
"Here's some more updates for powerpc for 3.18.
They are a bit late I know, though must are actually bug fixes. In my
defence I nearly cut the top of my finger off last weekend in a
gruesome bike maintenance accident, so I spent a good part of the week
waiting around for doctors. True story, I can send photos if you like :)
Probably the most interesting fix is the sys_call_table one, which
enables syscall tracing for powerpc. There's a fix for HMI handling
for old firmware, more endian fixes for firmware interfaces, more EEH
fixes, Anton fixed our routine that gets the current stack pointer,
and a few other misc bits"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (22 commits)
powerpc: Only do dynamic DMA zone limits on platforms that need it
powerpc: sync pseries_le_defconfig with pseries_defconfig
powerpc: Add printk levels to setup_system output
powerpc/vphn: NUMA node code expects big-endian
powerpc/msi: Use WARN_ON() in msi bitmap selftests
powerpc/msi: Fix the msi bitmap alignment tests
powerpc/eeh: Block CFG upon frozen Shiner adapter
powerpc/eeh: Don't collect logs on PE with blocked config space
powerpc/eeh: Block PCI config access upon frozen PE
powerpc/pseries: Drop config requests in EEH accessors
powerpc/powernv: Drop config requests in EEH accessors
powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED
powerpc/eeh: Fix condition for isolated state
powerpc/pseries: Make CPU hotplug path endian safe
powerpc/pseries: Use dump_stack instead of show_stack
powerpc: Rename __get_SP() to current_stack_pointer()
powerpc: Reimplement __get_SP() as a function not a define
powerpc/numa: Add ability to disable and debug topology updates
powerpc/numa: check error return from proc_create
powerpc/powernv: Fallback to old HMI handling behavior for old firmware
...
As demonstrated in the previous commit, the failure message from the msi
bitmap selftests is a bit subtle, it's easy to miss a failure in a busy
boot log.
So drop our check() macro and use WARN_ON() instead. This necessitates
inverting all the conditions as well.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When we added the alignment tests recently we failed to check they were
actually passing - oops.
They weren't passing, because the bitmap was full. We should also be a
bit more careful when checking the return code, a negative error return
could by divisible by our alignment value.
Fixes: b0345bbc6d ("powerpc/msi: Improve IRQ bitmap allocator")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull powerpc updates from Michael Ellerman:
"Here's a first pull request for powerpc updates for 3.18.
The bulk of the additions are for the "cxl" driver, for IBM's Coherent
Accelerator Processor Interface (CAPI). Most of it's in drivers/misc,
which Greg & Arnd maintain, Greg said he was happy for us to take it
through our tree.
There's the usual minor cleanups and fixes, including a bit of noise
in drivers from some of those. A bunch of updates to our EEH code,
which has been getting more testing. Several nice speedups from
Anton, including 20% in clear_page().
And a bunch of updates for freescale from Scott"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (130 commits)
cxl: Fix afu_read() not doing finish_wait() on signal or non-blocking
cxl: Add documentation for userspace APIs
cxl: Add driver to Kbuild and Makefiles
cxl: Add userspace header file
cxl: Driver code for powernv PCIe based cards for userspace access
cxl: Add base builtin support
powerpc/mm: Add hooks for cxl
powerpc/opal: Add PHB to cxl mode call
powerpc/mm: Add new hash_page_mm()
powerpc/powerpc: Add new PCIe functions for allocating cxl interrupts
cxl: Add new header for call backs and structs
powerpc/powernv: Split out set MSI IRQ chip code
powerpc/mm: Export mmu_kernel_ssize and mmu_linear_psize
powerpc/msi: Improve IRQ bitmap allocator
powerpc/cell: Make spu_flush_all_slbs() generic
powerpc/cell: Move data segment faulting code out of cell platform
powerpc/cell: Move spu_handle_mm_fault() out of cell platform
powerpc/pseries: Use new defines when calling H_SET_MODE
powerpc: Update contact info in Documentation files
powerpc/perf/hv-24x7: Simplify catalog_read()
...
Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
to the nearest power of 2. eg. ask for 5 IRQs and you'll get 8. This wastes a
lot of IRQs which can be a scarce resource.
For cxl we may require multiple IRQs for every context that is attached to the
accelerator. There may be 1000s of contexts attached, hence we can easily run
out of IRQs, especially if we are needlessly wasting them.
This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
of IRQs, hence avoiding this wastage. It keeps the natural alignment
requirement though.
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Freescale updates from Scott (27 commits):
"Highlights include DMA32 zone support (SATA, USB, etc now works on 64-bit
FSL kernels), MSI changes, 8xx optimizations and cleanup, t104x board
support, and PrPMC PCI enumeration."
Move MSI checks from arch_msi_check_device() to arch_setup_msi_irqs().
This makes the code more compact and allows removing
arch_msi_check_device() from generic MSI code.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
On PowerNV platforms, when a CPU is offline, we put it into nap mode.
It's possible that the CPU wakes up from nap mode while it is still
offline due to a stray IPI. A misdirected device interrupt could also
potentially cause it to wake up. In that circumstance, we need to clear
the interrupt so that the CPU can go back to nap mode.
In the past the clearing of the interrupt was accomplished by briefly
enabling interrupts and allowing the normal interrupt handling code
(do_IRQ() etc.) to handle the interrupt. This has the problem that
this code calls irq_enter() and irq_exit(), which call functions such
as account_system_vtime() which use RCU internally. Use of RCU is not
permitted on offline CPUs and will trigger errors if RCU checking is
enabled.
To avoid calling into any generic code which might use RCU, we adopt
a different method of clearing interrupts on offline CPUs. Since we
are on the PowerNV platform, we know that the system interrupt
controller is a XICS being driven directly (i.e. not via hcalls) by
the kernel. Hence this adds a new icp_native_flush_interrupt()
function to the native-mode XICS driver and arranges to call that
when an offline CPU is woken from nap. This new function reads the
interrupt from the XICS. If it is an IPI, it clears the IPI; if it
is a device interrupt, it prints a warning and disables the source.
Then it does the end-of-interrupt processing for the interrupt.
The other thing that briefly enabling interrupts did was to check and
clear the irq_happened flag in this CPU's PACA. Therefore, after
flushing the interrupt from the XICS, we also clear all bits except
the PACA_IRQ_HARD_DIS (interrupts are hard disabled) bit from the
irq_happened flag. The PACA_IRQ_HARD_DIS flag is set by power7_nap()
and is left set to indicate that interrupts are hard disabled. This
means we then have to ignore that flag in power7_nap(), which is
reasonable since it doesn't indicate that any interrupt event needs
servicing.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
of_device_ids (i.e. compatible strings and the respective data) are not
supposed to change at runtime. All functions working with of_device_ids
provided by <linux/of.h> work with const of_device_ids. This allows to
mark all struct of_device_id const, too.
While touching these line also put the __init annotation at the right
position where necessary.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This reverts commit c822e73731.
This commit conflicted with a bitmap allocator change that partially
accomplishes the same thing, but which does so more correctly. Revert
this one until it can be respun on top of the correct change.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Allocate msis such that each time a new interrupt is requested,
the SRS (MSIR register select) to be used is allocated in a
round-robin fashion.
The end result is that the msi interrupts will be spread across
distinct MSIRs with the main benefit that now users can set
affinity to each msi int through the mpic irq backing up the
MSIR register.
This is achieved with the help of a newly introduced msi bitmap
api that allows specifying the starting point when searching
for a free msi interrupt.
Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Rename the irq controller associated with a MSI
interrupt to fsl-msi-<V>, where <V> is the virq
of the cascade irq backing up this MSI interrupt.
This way, one can set the affinity of a MSI
through the cascade irq associated with said MSI
interrupt.
Given this example /proc/interrupts snippet:
CPU0 CPU1 CPU2 CPU3
16: 0 0 0 0 OpenPIC 16 Edge mpic-error-int
17: 0 4 0 0 fsl-msi-224 0 Edge eth0-rx-0
18: 0 5 0 0 fsl-msi-225 1 Edge eth0-tx-0
19: 0 2 0 0 fsl-msi-226 2 Edge eth0
[...]
224: 0 11 0 0 OpenPIC 224 Edge fsl-msi-cascade
225: 0 0 0 0 OpenPIC 225 Edge fsl-msi-cascade
226: 0 0 0 0 OpenPIC 226 Edge fsl-msi-cascade
[...]
To change the affinity of MSI interrupt 17
(having the irq controller named "fsl-msi-224")
instead of writing /proc/irq/17/smp_affinity, use
the associated MSI cascade irq, in this case,
interrupt 224, e.g.:
echo 6 > /proc/irq/224/smp_affinity
Note that a MSI cascade irq covers several MSI
interrupts, so changing the affinity on the
cascade will impact all of the associated MSI
interrupts.
Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
As we do for other fsl-mpic related cascaded irqchips
(e.g. error ints, mpic timers), use a normal irq handler
for msi irqs too.
This brings some advantages such as mask/unmask/ack/eoi
and irq state taken care behind the scenes, kstats
updates a.s.o plus access to features provided by mpic,
such as affinity.
Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Store cascade_data in an array inside the driver
data for later use.
Get rid of the msi_virq array since now we can
encapsulate the virqs in the cascade_data
directly and access them through the array
mentioned earlier.
Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
The following commit prevents the MPC8548E on the XPedite5200 PrPMC
module from enumerating its PCI/PCI-X bus:
powerpc/fsl-pci: use 'Header Type' to identify PCIE mode
The previous patch prevents any Freescale PCI-X bridge from enumerating
the bus, if it is hardware strapped into Agent mode.
In PCI-X, the Host is responsible for driving the PCI-X initialization
pattern to devices on the bus, so that they know whether to operate in
conventional PCI or PCI-X mode as well as what the bus timing will be.
For a PCI-X PrPMC, the pattern is driven by the mezzanine carrier it is
installed onto. Therefore, PrPMCs are PCI-X Agents, but one per system
may still enumerate the bus.
This patch causes the device node of any PCI/PCI-X bridge strapped into
Agent mode to be checked for the fsl,pci-agent-force-enum property. If
the property is present in the node, the bridge will be allowed to
enumerate the bus.
Cc: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
The new MSI block in MPIC 4.3 added the MSIIR1 register,
with a different layout, in order to support 16 MSIR
registers. The msi binding was also updated so that
the "reg" reflects the newly introduced MSIIR1 register.
Virtual machines advertise these msi nodes by using the
compatible "fsl,vmpic-msi-v4.3" so add support for it.
Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Scott writes:
Highlights include e6500 hardware threading support, an e6500 TLB erratum
workaround, corenet error reporting, support for a new board, and some
minor fixes.
In commit ae91d60ba8, a bug was fixed that
involved converting !x & y to !(x & y). The code below shows the same
pattern, and thus should perhaps be fixed in the same way.
This is not tested and clearly changes the semantics, so it is only
something to consider.
The Coccinelle semantic patch that makes this change is as follows:
// <smpl>
@@ expression E1,E2; @@
(
!E1 & !E2
|
- !E1 & E2
+ !(E1 & E2)
)
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
mpic_msgrs has type struct mpic_msgr **, not struct mpic_msgr *, so the
elements of the array should have pointer type, not structure type.
The advantage of kcalloc is, that will prevent integer overflows which
could result from the multiplication of number of elements and size and
it is also a bit nicer to read.
The Coccinelle semantic patch that makes the first change is as follows:
// <smpl>
@disable sizeof_type_expr@
type T;
T **x;
@@
x =
<+...sizeof(
- T
+ *x
)...+>
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
The DART table allocation is registered to kmemleak via the
memblock_alloc_base() call. However, the DART table is later unmapped
and dart_tablebase VA no longer accessible. This patch tells kmemleak
not to scan this block and avoid an unhandled paging request.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>