Commit Graph

469419 Commits

Author SHA1 Message Date
Michael Ellerman
edcee77fef powerpc/kdump: crash_dump.c needs to include io.h
For __ioremap().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03 18:03:35 +10:00
Michael Ellerman
d3b94e4b3b powerpc: Don't build powernv for other platform defconfigs
Because powernv arrived after these other platforms, the defconfigs
didn't have PPC_POWERNV disabled, and being default y it gets turned on.

If we're going to bother having defconfigs for the specific platforms
then they should only build the code required for those platforms.

The grab bag of everything config is ppc64_defconfig.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03 18:03:11 +10:00
Wei Yang
8abf29f829 powerpc/pci: remove duplicate declaration of pci_bus_find_capability
pci_bus_find_capability() is decleared in pci.h, so it is not necessary to do
it again.

This patch removes it.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03 17:27:38 +10:00
Alexey Kardashevskiy
9410e0185e powerpc/iommu/ddw: Fix endianness
rtas_call() accepts and returns values in CPU endianness.
The ddw_query_response and ddw_create_response structs members are
defined and treated as BE but as they are passed to rtas_call() as
(u32 *) and they get byteswapped automatically, the data is CPU-endian.
This fixes ddw_query_response and ddw_create_response definitions and use.

of_read_number() is designed to work with device tree cells - it assumes
the input is big-endian and returns data in CPU-endian. However due
to the ddw_create_response struct fix, create.addr_hi/lo are already
CPU-endian so do not byteswap them.

ddw_avail is a pointer to the "ibm,ddw-applicable" property which contains
3 cells which are big-endian as it is a device tree. rtas_call() accepts
a RTAS token in CPU-endian. This makes use of of_property_read_u32_array
to byte swap and avoid the need for a number of be32_to_cpu calls.

Cc: stable@vger.kernel.org # v3.13+
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: folded Anton's patch with of_property_read_u32_array]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03 14:22:34 +10:00
Anton Blanchard
a7696b36c0 powerpc: Add printk levels to powerpc code
Add printk levels to some places in the powerpc port.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:33:55 +10:00
Anton Blanchard
9a4f5cd0a5 powerpc: Add printk levels to powernv platform code
Add printk levels to powernv platform code, and convert to
pr_err() etc while here.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:33:55 +10:00
Anton Blanchard
3e47d1474c powerpc: Remove powerpc specific cmd_line
There is no need for yet another copy of the command line, just
use boot_command_line like everyone else.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:33:55 +10:00
Anton Blanchard
c7d1f6afe0 powerpc: Use pr_fmt in module loader code
Use pr_fmt to give some context to the error messages in the
module code, and convert open coded debug printk to pr_debug.

Use pr_err for error messages.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:33:54 +10:00
Anton Blanchard
9d57472f61 powerpc: Fill in si_addr_lsb siginfo field
Fill in the si_addr_lsb siginfo field so the hwpoison code can
pass to userspace the length of memory that has been corrupted.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:15:17 +10:00
Anton Blanchard
3913fdd7a2 powerpc: Add VM_FAULT_HWPOISON handling to powerpc page fault handler
do_page_fault was missing knowledge of HWPOISON, and we would oops
if userspace tried to access a poisoned page:

kernel BUG at arch/powerpc/mm/fault.c:180!

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:15:13 +10:00
Anton Blanchard
63af52629a powerpc: Simplify do_sigbus
Exit out early for a kernel fault, avoiding indenting of
most of the function.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:15:09 +10:00
Anton Blanchard
e35735b9a5 powerpc: Speed up clear_page by unrolling it
Unroll clear_page 8 times. A simple microbenchmark which
allocates and frees a zeroed page:

for (i = 0; i < iterations; i++) {
	unsigned long p = __get_free_page(GFP_KERNEL | __GFP_ZERO);
	free_page(p);
}

improves 20% on POWER8.

This assumes cacheline sizes won't grow beyond 512 bytes or
page sizes wont drop below 1kB, which is unlikely, but we could
add a runtime check during early init if it makes people nervous.

Michael found that some versions of gcc produce quite bad code
(all multiplies), so we give gcc a hand by using shifts and adds.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 16:04:21 +10:00
Gavin Shan
2013add4ce powerpc/eeh: Show hex prefix for PE state sysfs
As Michael suggested, the hex prefix for the output of EEH PE
state sysfs entry (/sys/bus/pci/devices/xxx/eeh_pe_state) is
always informative to users.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-01 22:23:34 +10:00
Gavin Shan
fe7e85c6f5 powerpc/powernv: Override dma_get_required_mask()
The dma_get_required_mask() function is used by some drivers to
query the platform about what DMA mask is needed to cover all of
memory. This is a bit of a strange semantic when we have to choose
between IOMMU translation or bypass, but essentially what it means
is "what DMA mask will give best performances".

Currently, our IOMMU backend always returns a 32-bit mask here, we
don't do anything special to it when we have bypass available. This
causes some drivers to choose a 32-bit mask, thus losing the ability
to use the bypass window, thinking this is more efficient. The problem
was reported from the driver of following device:

0004:03:00.0 0107: 1000:0087 (rev 05)
0004:03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios \
             Logic SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)

This patch adds an override of that function in order to, instead,
return a 64-bit mask whenever a bypass window is available in order
for drivers to prefer this configuration.

Reported-by: Murali N. Iyer <mniyer@us.ibm.com>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:20 +10:00
Gavin Shan
372fb80db9 powerpc/powernv: Fetch frozen PE on top level
It should have been part of commit 1ad7a72c5 ("powerpc/eeh: Report
frozen parent PE prior to child PE"). There are 2 ways to report
EEH errors: proactively polling because of 0xFF's returned from
PCI config or IO read, or interrupt driven event. We missed to
report and handle parent frozen PE prior to child frozen PE for
the later case on PowerNV platform.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:20 +10:00
Gavin Shan
f2e0be5e76 powerpc/eeh: Dump PCI config space for all child devices
The PEs can be organized as nested. Current implementation doesn't
dump PCI config space for subordinate devices of child PEs. However,
the frozen PE could be caused by those subordinate devices of its
child PEs.

The patch dumps PCI config space for all subordinate devices of the
problematic PE.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:19 +10:00
Gavin Shan
5cfb20b96f powerpc/eeh: Emulate EEH recovery for VFIO devices
When enabling EEH functionality on passed through devices (PE)
with VFIO, the devices in the PE would be removed permanently
from guest side. In that case, the PE remains frozen state.
When returning PE to host, or restarting the guest again, we
had mechanism unfreezing the PE by clearing PESTA/B frozen
bits. However, that's not enough for some adapters, which are
indicated as following "lspci" shows. Those adapters require
hot reset on the parent bus to bring their firmware back to
workable state. Otherwise, those adaptrs won't be operative
and the host (for returning case) or the guest will fail to
load the drivers for those adapters without exception.

0000:01:00.0 Ethernet controller: Emulex Corporation OneConnect \
             10Gb NIC (be3) (rev 02)
0000:01:00.0 0200: 19a2:0710 (rev 02)
0001:03:00.0 Ethernet controller: Emulex Corporation OneConnect \
             NIC (Lancer) (rev 10)
0001:03:00.0 0200: 10df:e220 (rev 10)

The patch adds mechanism to emulate EEH recovery (for hot reset
on parent PCI bus) on 3 gates to fix the issue: open/release one
adapter of the PE, enable EEH functionality on one adapter of the
PE.

Reported-by:  Murilo Fossa Vicentini <muvic@br.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:18 +10:00
Gavin Shan
93e8b36d7b powerpc/eeh: Tag reset state for user owned PE
PE would be owned by userland, which probably request PE reset
done in host side. During the reset, we should drop the PCI
config accesses to the PE with help of flag EEH_PE_RESET.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:17 +10:00
Gavin Shan
d1a85eee35 powerpc/powernv: Sync OpalPciResetScope with firmware
The names of PCI reset scopes aren't sychronized with firmware.
The patch fixes it.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:17 +10:00
Gavin Shan
4ba5a0fc64 powerpc/pseries: Decrease message level on EEH initialization
As Anton suggested, the patch decreases the message level on EEH
initialization to avoid unnecessary messages if required. Also,
we have unified hint if any of needful RTAS calls is missed, and
then we can check /proc/device-tree to figure out the missed RTAS
calls.

Suggested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:16 +10:00
Gavin Shan
9372dddb18 powerpc/eeh: Block PCI config access during reset
Function pcibios_set_pcie_reset_state() can be used to do PCI
reset. PCI config access during the reset usually causes EEH
errors unexpectedly. In order to avoid the EEH error, the patch
blocks PCI config access during reset with the help of flag
EEH_PE_RESET, which is similar to what we did in EEH PE reset
path.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:15 +10:00
Gavin Shan
c9dd014397 powerpc/eeh: Use eeh_unfreeze_pe()
The patch uses eeh_unfreeze_pe() to replace the logic clearing
frozen IO and DMA, in order to simplify the code.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:14 +10:00
Gavin Shan
4eeeff0ebc powerpc/eeh: Unfreeze PE on enabling EEH functionality
When passing through PE to guest, that's possibly in frozen
state. The driver for the pass-through devices on guest side
can't be loaded successfully as reported. We already had one
gate in eeh_dev_open() to clear PE frozen state accordingly,
but that's not enough because the function is only called at
QEMU startup for once.

The patch adds another gate in eeh_pe_set_option() so that the
PE frozen state can be cleared at QEMU restart time.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:14 +10:00
Gavin Shan
4d4f577e4b powerpc/eeh: Fix improper condition in eeh_pci_enable()
The function eeh_pci_enable() is called to apply various requests
to one particular PE: Enabling EEH, Disabling EEH, Enabling IO,
Enabling DMA, Freezing PE. When enabling IO or DMA on one specific
PE, we need check that IO or DMA isn't enabled previously. But
the condition used to do the check isn't completely correct because
one PE would be in DMA frozen state with workable IO path, or vice
versa.

The patch fixes the improper condition.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:13 +10:00
Gavin Shan
22fca17924 powerpc/eeh: Clear frozen device state in time
The problem was reported by Carol: In the scenario of passing mlx4
adapter to guest, EEH error could be recovered successfully. When
returning the device back to host, the driver (mlx4_core.ko)
couldn't be loaded successfully because of error number -5 (-EIO)
returned from mlx4_get_ownership(), which hits offlined PCI device.
The root cause is that we missed to put the affected devices into
normal state on clearing PE isolated state right after PE reset.

The patch fixes above issue by putting the affected devices to
normal state when clearing PE isolated state in eeh_pe_state_clear().

Cc: stable@vger.kernel.org
Reported-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:12 +10:00