PCIe ERR_FATAL errors mean the Link is unreliable. Components on the Link
may need to be reset to return to reliable operation (PCIe r4.0, sec
6.2.2). We previously handled these errors much differently depending on
whether the platform supports Downstream Port Containment (DPC) (PCIe r4.0,
sec 6.2.10) or not.
The AER driver has historically logged the error details, called
driver-supplied pci_error_handlers callbacks, and reset the Link. This
reset downstream devices, but did not remove them from the PCI subsystem,
re-enumerate them, or call their driver .remove() or .probe() methods.
DPC is different because the hardware automatically disables the Link when
it detects ERR_FATAL, which resets downstream devices. There's no
opportunity for pci_error_handlers callbacks before resetting the Link.
The DPC driver removes affected devices (which calls their driver .remove()
methods), brings the Link back up, and re-enumerates (which calls driver
.probe() methods).
Align AER ERR_FATAL handling with DPC by resetting the Link in software,
skipping the driver pci_error_handlers callbacks, removing the devices from
the PCI subsystem, and re-enumerating. The idea is that drivers and
devices should see the same behavior for ERR_FATAL events, regardless of
whether they're handled by AER or DPC.
Here are the basic ERR_FATAL recovery steps, showing the previous AER
behavior, the AER behavior after this patch, and the DPC behavior:
AER AER DPC
previous new behavior
-------- --- --------
Log error yes yes yes (minimal)
drv.error_detected() yes no no
Reset Link yes yes yes
drv.mmio_enabled() yes no no
drv.slot_reset() yes no no
drv.resume() yes no no
Remove PCI devices no yes yes
(calls drv.remove())
Re-enumerate no yes yes
(calls drv.probe())
N.B. With DPC, the Link reset happens before the driver .remove() calls,
while with AER, the reset happens *after* the .remove() calls. The goal is
to eventually do the reset before .remove() for AER as well.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog, squash doc patch into this, remove unused
"result_data"]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
DocBook is mentioned several times at the documentation. Update
the obsolete references from it at the DocBook.
Acked-by: SeongJae Park <sj38.park@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Pull PCI updates from Bjorn Helgaas:
- add framework for supporting PCIe devices in Endpoint mode (Kishon
Vijay Abraham I)
- use non-postable PCI config space mappings when possible (Lorenzo
Pieralisi)
- clean up and unify mmap of PCI BARs (David Woodhouse)
- export and unify Function Level Reset support (Christoph Hellwig)
- avoid FLR for Intel 82579 NICs (Sasha Neftin)
- add pci_request_irq() and pci_free_irq() helpers (Christoph Hellwig)
- short-circuit config access failures for disconnected devices (Keith
Busch)
- remove D3 sleep delay when possible (Adrian Hunter)
- freeze PME scan before suspending devices (Lukas Wunner)
- stop disabling MSI/MSI-X in pci_device_shutdown() (Prarit Bhargava)
- disable boot interrupt quirk for ASUS M2N-LR (Stefan Assmann)
- add arch-specific alignment control to improve device passthrough by
avoiding multiple BARs in a page (Yongji Xie)
- add sysfs sriov_drivers_autoprobe to control VF driver binding
(Bodong Wang)
- allow slots below PCI-to-PCIe "reverse bridges" (Bjorn Helgaas)
- fix crashes when unbinding host controllers that don't support
removal (Brian Norris)
- add driver for MicroSemi Switchtec management interface (Logan
Gunthorpe)
- add driver for Faraday Technology FTPCI100 host bridge (Linus
Walleij)
- add i.MX7D support (Andrey Smirnov)
- use generic MSI support for Aardvark (Thomas Petazzoni)
- make Rockchip driver modular (Brian Norris)
- advertise 128-byte Read Completion Boundary support for Rockchip
(Shawn Lin)
- advertise PCI_EXP_LNKSTA_SLC for Rockchip root port (Shawn Lin)
- convert atomic_t to refcount_t in HV driver (Elena Reshetova)
- add CPU IRQ affinity in HV driver (K. Y. Srinivasan)
- fix PCI bus removal in HV driver (Long Li)
- add support for ThunderX2 DMA alias topology (Jayachandran C)
- add ThunderX pass2.x 2nd node MCFG quirk (Tomasz Nowicki)
- add ITE 8893 bridge DMA alias quirk (Jarod Wilson)
- restrict Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices
(Manish Jaggi)
* tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (146 commits)
PCI: Don't allow unbinding host controllers that aren't prepared
ARM: DRA7: clockdomain: Change the CLKTRCTRL of CM_PCIE_CLKSTCTRL to SW_WKUP
MAINTAINERS: Add PCI Endpoint maintainer
Documentation: PCI: Add userguide for PCI endpoint test function
tools: PCI: Add sample test script to invoke pcitest
tools: PCI: Add a userspace tool to test PCI endpoint
Documentation: misc-devices: Add Documentation for pci-endpoint-test driver
misc: Add host side PCI driver for PCI test function device
PCI: Add device IDs for DRA74x and DRA72x
dt-bindings: PCI: dra7xx: Add DT bindings to enable unaligned access
PCI: dwc: dra7xx: Workaround for errata id i870
dt-bindings: PCI: dra7xx: Add DT bindings for PCI dra7xx EP mode
PCI: dwc: dra7xx: Add EP mode support
PCI: dwc: dra7xx: Facilitate wrapper and MSI interrupts to be enabled independently
dt-bindings: PCI: Add DT bindings for PCI designware EP mode
PCI: dwc: designware: Add EP mode support
Documentation: PCI: Add binding documentation for pci-test endpoint function
ixgbe: Use pcie_flr() instead of duplicating it
IB/hfi1: Use pcie_flr() instead of duplicating it
PCI: imx6: Fix spelling mistake: "contol" -> "control"
...
* pci/virtualization:
ixgbe: Use pcie_flr() instead of duplicating it
IB/hfi1: Use pcie_flr() instead of duplicating it
PCI: Call pcie_flr() from reset_chelsio_generic_dev()
PCI: Call pcie_flr() from reset_intel_82599_sfp_virtfn()
PCI: Export pcie_flr()
PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding
PCI: Avoid FLR for Intel 82579 NICs
Conflicts:
include/linux/pci.h
Add documentation to help users use pci-epf-test function driver and
pci_endpoint_test host driver for testing PCI.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add binding documentation for pci-test endpoint function that helps in
adding and configuring pci-test endpoint function.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Sometimes it is not desirable to bind SR-IOV VFs to drivers. This can save
host side resource usage by VF instances that will be assigned to VMs.
Add a new PCI sysfs interface "sriov_drivers_autoprobe" to control that
from the PF. To modify it, echo 0/n/N (disable probe) or 1/y/Y (enable
probe) to:
/sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_drivers_autoprobe
Note that this must be done before enabling VFs. The change will not take
effect if VFs are already enabled. Simply, one can disable VFs by setting
sriov_numvfs to 0, choose whether to probe or not, and then re-enable the
VFs by restoring sriov_numvfs.
[bhelgaas: changelog, ABI doc]
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Add specification for the *PCI test* virtual function device. The endpoint
function driver and the host PCI driver should be created based on this
specification.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add Documentation to help users use PCI endpoint to configure PCI endpoint
function and to bind the endpoint function with endpoint controller.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-By: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add Documentation to help users use endpoint library to enable endpoint
mode in the PCI controller and add new PCI endpoint functions.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-By: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Include whitespace shooting; correction; typo fix; superfluous word
dropping.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Pull documentation fixes from Jonathan Corbet:
"A few fixes for the docs tree, including one for a 4.11 build
regression"
* tag 'docs-4.11-fixes' of git://git.lwn.net/linux:
Documentation/sphinx: fix primary_domain configuration
docs: Fix htmldocs build failure
doc/ko_KR/memory-barriers: Update control-dependencies section
pcieaer doc: update the link
Documentation: Update path to sysrq.txt
* pci/msi:
PCI/MSI: Update MSI/MSI-X bits in PCIEBUS-HOWTO
PCI/MSI: Document pci_alloc_irq_vectors(), deprecate pci_enable_msi()
PCI/MSI: Return -ENOSPC if pci_enable_msi_range() can't get enough vectors
PCI/portdrv: Use pci_irq_alloc_vectors()
PCI/MSI: Check that we have a legacy interrupt line before using it
PCI/MSI: Remove pci_msi_domain_{alloc,free}_irqs()
PCI/MSI: Remove unused pci_msi_create_default_irq_domain()
PCI/MSI: Return failure when msix_setup_entries() fails
PCI/MSI: Remove pci_enable_msi_{exact,range}()
amd-xgbe: Update PCI support to use new IRQ functions
[media] cobalt: use pci_irq_allocate_vectors()
PCI/MSI: Fix msi_capability_init() kernel-doc warnings
Update the MSI/MSI-X bits in PCIEBUS-HOWTO. Stop talking about low-level
details that mention deprecated APIs and concentrate on what service
drivers should do and why.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Document pci_alloc_irq_vectors() instead of the deprecated pci_enable_msi()
and pci_enable_msix() APIs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
No hardware seems to actually call .link_reset(), and no driver implements
it as more than a nop stub.
Drop mentions of the callback from everywhere. It's dropped from the
documentation as well, but the doc really needs to be updated to reflect
reality better (e.g., on PCIe, slot reset is the link reset). This will be
done in a later patch.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
All multi-MSI allocations are now done through pci_irq_alloc_vectors(), so
remove the old pci_enable_msi_range() and pci_enable_msi_exact()
interfaces.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/host-vmd:
x86/PCI: VMD: Move VMD driver to drivers/pci/host
x86/PCI: VMD: Synchronize with RCU freeing MSI IRQ descs
x86/PCI: VMD: Eliminate index member from IRQ list
x86/PCI: VMD: Eliminate vmd_vector member from list type
x86/PCI: VMD: Convert to use pci_alloc_irq_vectors() API
x86/PCI: VMD: Allocate IRQ lists with correct MSI-X count
PCI: Use positive flags in pci_alloc_irq_vectors()
PCI: Update "pci=resource_alignment" documentation
Conflicts:
drivers/pci/host/Kconfig
drivers/pci/host/Makefile
Per the PCI Firmware spec, r3.0, sec 4.5.1, on ACPI systems, the OS must
not use AER unless _OSC is present and _OSC grants AER control to the OS.
The aerdriver.forceload kernel parameter was a way to enable Linux AER
support on ACPI systems that lack _OSC or fail to grant control the the OS.
Enabling Linux AER support when the firmware doesn't want us to is a recipe
for problems, e.g., the firmware might be handling AER itself.
Remove the aerdriver.forceload kernel parameter and related supporting
code.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The aerdriver.nosourceid kernel parameter was intended for working around
broken chipsets don't supply the source ID for AER events. We recently
added PCI_BUS_FLAGS_NO_AERSID, which can be set by quirks for the same
purpose.
Remove the aerdriver.nosourceid kernel parameter. For anything other than
debugging, asking users to find and use kernel parameters is a poor user
experience. Instead, we should add PCI_BUS_FLAGS_NO_AERSID quirks for any
hardware that needs it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Instead of passing negative flags like PCI_IRQ_NOMSI to prevent use of
certain interrupt types, pass positive flags like PCI_IRQ_LEGACY,
PCI_IRQ_MSI, etc., to specify the acceptable interrupt types.
This is based on a number of pending driver conversions that just happend
to be a whole more obvious to read this way, and given that we have no
users in the tree yet it can still easily be done.
I've also added a PCI_IRQ_ALL_TYPES catchall to keep the case of accepting
all interrupt types very simple.
[bhelgaas: changelog, fix PCI_IRQ_AFFINITY doc typo, remove mention of
PCI_IRQ_NOLEGACY]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alexander Gordeev <agordeev@redhat.com>