There is a couple of commented debug prints which still use
IOMMU_PAGE_SHIFT() which is not defined for POWERPC anymore, replace
them with it_page_shift.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The PCI config accessors check for PE frozen state and clear it if
EEH isn't functional. The patch handles compound PE in config accessors
if PHB supports it. For consistency, all PEs will be put into frozen
state if any one in compound group gets frozen by hardware.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch handles compound PE for EEH backend. If one specific
PE in compound group has been frozen, we enforces to freeze
all PEs in the group. If we're enable DMA or MMIO for one PE
in compound group, DMA or MMIO of all PEs in the group will be
enabled.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch introduces 3 PHB callbacks: compound PE state retrieval,
force freezing and unfreezing compound PE. The PCI config accessors
and PowerNV EEH backend can use them in subsequent patches.
We don't export the capability of compound PE to EEH core, which
helps avoiding more complexity to EEH core.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Function ioda_eeh_get_state() is used to fetch EEH state for PHB
or PE. We're going to support compound PE and the function becomes
more complicated with that. The patch splits the function into two
functions for PHB and PE cases separately to improve readability.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch synchronizes header file with firmware to have new OPAL
API opal_pci_eeh_freeze_set(), which is used to freeze the specified
PE in order to support "compound" PE.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch enables M64 aperatus for PHB3.
We already had platform hook (ppc_md.pcibios_window_alignment) to affect
the PCI resource assignment done in PCI core so that each PE's M32 resource
was built on basis of M32 segment size. Similarly, we're using that for
M64 assignment on basis of M64 segment size.
* We're using last M64 BAR to cover M64 aperatus, and it's shared by all
256 PEs.
* We don't support P7IOC yet. However, some function callbacks are added
to (struct pnv_phb) so that we can reuse them on P7IOC in future.
* PE, corresponding to PCI bus with large M64 BAR device attached, might
span multiple M64 segments. We introduce "compound" PE to cover the case.
The compound PE is a list of PEs and the master PE is used as before.
The slave PEs are just for MMIO isolation.
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch allows PE (struct eeh_pe) instance to have auxillary data,
whose size is configurable on basis of platform. For PowerNV, the
auxillary data will be used to cache PHB diag-data for that PE
(frozen PE or fenced PHB). In turn, we can retrieve the diag-data
at any later points.
It's useful for the case of VFIO PCI devices where the error log
should be cached, and then be retrieved by the guest at later point.
Also, it can avoid PHB diag-data overwritting if another frozen PE
reported and the previous diag-data isn't fetched by guest.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
pr_warn() is equal to pr_warning(), but the former is a bit more
formal according to commit fc62f2f ("kernel.h: add pr_warn for
symmetry to dev_warn, netdev_warn").
The patch replaces pr_warning() with pr_warn().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch prints 4 PCIE or AER config registers each line, which
is part of the EEH log so that it looks a bit more compact.
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
According to the experiment I did, PCI config access is blocked
on P7IOC frozen PE by hardware, but PHB3 doesn't do that. That
means we always get 0xFF's while dumping PCI config space of the
frozen PE on P7IOC. We don't have the problem on PHB3. So we have
to enable I/O prioir to collecting error log. Otherwise, meaningless
0xFF's are always returned.
The patch fixes it by EEH flag (EEH_ENABLE_IO_FOR_LOG), which is
selectively set to indicate the case for: P7IOC on PowerNV platform,
pSeries platform.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There are multiple global EEH flags. Almost each flag has its own
accessor, which doesn't make sense. The patch refactors EEH flag
accessors so that they look unified:
eeh_add_flag(): Add EEH flag
eeh_clear_flag(): Clear EEH flag
eeh_has_flag(): Check if one specific flag has been set
eeh_enabled(): Check if EEH functionality has been enabled
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Function eeh_iommu_group_to_pe() iterates each PCI device to check
the binding IOMMU group with get_iommu_table_base(), which possibly
fetches pdev->dev.archdata.dma_data.dma_offset. It's (0x1 << 59)
for "bypass" cases.
The patch fixes the issue by iterating devices hooked to the IOMMU
group and fetch IOMMU table there.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On PHB3, PCI devices can bypass IOMMU for DMA access. If we pass
through one PCI device, whose hose driver ever enable the bypass
mode, pdev->dev.archdata.dma_data.iommu_table_base isn't IOMMU
table. However, EEH needs access the IOMMU table when the device
is owned by guest.
The patch fixes pdev->dev.archdata.dma_data.iommu_table when
passing through the device to guest in pnv_pci_ioda2_set_bypass().
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
pci_get_slot() is called with hold of PCI bus semaphore and it's not
safe to be called in interrupt context. However, we possibly checks
EEH error and calls the function in interrupt context. To avoid using
pci_get_slot(), we turn into device tree for fetching location code.
Otherwise, we might run into WARN_ON() as following messages indicate:
WARNING: at drivers/pci/search.c:223
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-rc3+ #72
task: c000000001367af0 ti: c000000001444000 task.ti: c000000001444000
NIP: c000000000497b70 LR: c000000000037530 CTR: 000000003003d114
REGS: c000000001446fa0 TRAP: 0700 Not tainted (3.16.0-rc3+)
MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 48002422 XER: 20000000
CFAR: c00000000003752c SOFTE: 0
:
NIP [c000000000497b70] .pci_get_slot+0x40/0x110
LR [c000000000037530] .eeh_pe_loc_get+0x150/0x190
Call Trace:
.of_get_property+0x30/0x60 (unreliable)
.eeh_pe_loc_get+0x150/0x190
.eeh_dev_check_failure+0x1b4/0x550
.eeh_check_failure+0x90/0xf0
.lpfc_sli_check_eratt+0x504/0x7c0 [lpfc]
.lpfc_poll_eratt+0x64/0x100 [lpfc]
.call_timer_fn+0x64/0x190
.run_timer_softirq+0x2cc/0x3e0
Cc: stable@vger.kernel.org
Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit bcdde7e made __sysfs_remove_dir() recursive and introduced a BUG_ON
during PHB removal while attempting to delete the power managment attribute
group of the bus. This is a result of tearing the bridge and bus devices down
out of order in remove_phb_dynamic. Since, the the bus resides below the bridge
in the sysfs device tree it should be torn down first.
This patch simply moves the device_unregister call for the PHB bridge device
after the device_unregister call for the PHB bus.
Fixes: bcdde7e221 ("sysfs: make __sysfs_remove_dir() recursive")
Cc: stable@vger.kernel.org
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc defines various machine-specific routines for handling
pci_set_dma_mask(). The routines for machine "PowerNV" may neglect
to set dev->dma_mask. This could confuse anyone (e.g. drivers) that
consult dev->dma_mask to find the current mask. Set the dma_mask in
the PowerNV leaf routine.
Signed-off-by: Brian W. Hart <hartb@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This silences a section mismatch warning. early_alloc_pgtable() is
called from map_kernel_page() which cannot be __init, but only when
slab_is_available() returns false which can only happen during early
boot.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cpufreq depends on platform firmware to implement PStates. In case of
platform firmware failure, cpufreq should not panic host kernel with
BUG_ON(). Less severe pr_warn() will suffice.
Add firmware_has_feature(FW_FEATURE_OPALv3) check to
skip probing for device-tree on non-powernv platforms.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The sysfs entries are lost because of commit 2213fb1 ("powerpc/eeh:
Skip eeh sysfs when eeh is disabled"). That commit added condition
to create sysfs entries with EEH_ENABLED, which isn't populated
when trying to create sysfs entries on PowerNV platform during system
boot time. The patch fixes the issue by:
* Reoder EEH initialization functions so that they're same on
PowerNV/pSeries.
* Cache PE's primary bus by PowerNV platform instead of EEH core
to avoid kernel crash caused by the function reorder. Another
benefit with this is to avoid one eeh_probe_mode_dev() in EEH
core.
Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch adds new IOCTL commands for sPAPR VFIO container device
to support EEH functionality for PCI devices, which have been passed
through from host to somebody else via VFIO.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>