linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
Ian Munsie	79384e4b71	cxl: Add kernel APIs to get & set the max irqs per context These APIs will be used by the Mellanox CX4 support. While they function standalone to configure existing behaviour, their primary purpose is to allow the Mellanox driver to inform the cxl driver of a hardware limitation, which will be used in a future patch. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:26:57 +10:00
Ian Munsie	317f5ef1b3	cxl: Add support for using the kernel API with a real PHB This hooks up support for using the kernel API with a real PHB. After the AFU initialisation has completed it calls into the PHB code to pass it the AFU that will be used by other peer physical functions on the adapter. The cxl_pci_to_afu API is extended to work with peer PCI devices, retrieving the peer AFU from the PHB. This API may also now return an error if it is called on a PCI device that is not associated with either a cxl vPHB or a peer PCI device to an AFU, and this error is propagated down. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:26:56 +10:00
Ian Munsie	e4f5fc001a	cxl: Do not create vPHB if there are no AFU configuration records The vPHB model of the cxl kernel API is a hierarchy where the AFU is represented by the vPHB, and it's AFU configuration records are exposed as functions under that vPHB. If there are no AFU configuration records we will create a vPHB with nothing under it, which is a waste of resources and will opt us into EEH handling despite not having anything special to handle. This also does not make sense for cards using the peer model of the cxl kernel API, where the other functions of the device are exposed via additional peer physical functions rather than AFU configuration records. This model will also not work with the existing EEH handling in the cxl driver, as that is designed around the vPHB model. Skip creating the vPHB for AFUs without any AFU configuration records, and opt out of EEH handling for them. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:26:53 +10:00
Ian Munsie	a19bd79e31	cxl: Allow a default context to be associated with an external pci_dev The cxl kernel API has a concept of a default context associated with each PCI device under the virtual PHB. The Mellanox CX4 will also use the cxl kernel API, but it does not use a virtual PHB - rather, the AFU appears as a physical function as a peer to the networking functions. In order to allow the kernel API to work with those networking functions, we will need to associate a default context with them as well. To this end, refactor the corresponding code to do this in vphb.c and export it so that it can be called from the PHB code. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:26:52 +10:00
Ian Munsie	62ccf2d2ef	cxl: Move cxl_afu_get / cxl_afu_put to base The Mellanox CX4 uses a model where the AFU is one physical function of the device, and is used by other peer physical functions of the same device. This will require those other devices to grab a reference on the AFU when they are initialised to make sure that it does not go away during their lifetime. Move the AFU refcount functions to base.c so they can be called from the PHB code. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:26:36 +10:00
Ian Munsie	48b3adf334	cxl: Enable bus mastering for devices using CAPP DMA mode Devices that use CAPP DMA mode (such as the Mellanox CX4) require bus master to be enabled in order for the CAPI traffic to flow. This should be harmless to enable for other cxl devices, so unconditionally enable it in the adapter init flow. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:26:35 +10:00
Ian Munsie	4e56f858bd	cxl: Add cxl_slot_is_supported API This extends the check that the adapter is in a CAPI capable slot so that it may be called by external users in the kernel API. This will be used by the upcoming Mellanox CX4 support, which needs to know ahead of time if the card can be switched to cxl mode so that it can leave it in PCI mode if it is not. This API takes a parameter to check if CAPP DMA mode is supported, which it currently only allows on P8NVL systems, since that mode currently has issues accessing memory < 4GB on P8, and we cannot realistically avoid that. This API does not currently check if a CAPP unit is available (i.e. not already assigned to another PHB) on P8. Doing so would be racy since it is assigned on a first come first serve basis, and so long as CAPP DMA mode is not supported on P8 we don't need this, since the only anticipated user of this API requires CAPP DMA mode. Cc: Philippe Bergheaud <felix@linux.vnet.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:26:33 +10:00
Wei Yongjun	fc9f75ef2f	cxl: Use for_each_compatible_node() macro Use for_each_compatible_node() macro instead of open coding it. Generated by Coccinelle. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:26:25 +10:00
Philippe Bergheaud	3b3dcd61fa	cxl: Ignore CAPI adapters misplaced in switched slots One should not attempt to switch a PHB into CAPI mode if there is a switch between the PHB and the adapter. This patch modifies the cxl driver to ignore CAPI adapters misplaced in switched slots. Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-08 22:23:57 +10:00
Paul Gortmaker	e00878be3f	cxl: make base more explicitly non-modular The Kconfig/Makefile currently controlling compilation of this code is: drivers/misc/cxl/Kconfig:config CXL_BASE drivers/misc/cxl/Kconfig: bool drivers/misc/cxl/Makefile:obj-$(CONFIG_CXL_BASE) += base.o ...meaning that it currently is not being built as a module by anyone. Lets convert the one module_init into device_initcall so that when reading the driver it more clear that it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We don't replace module.h with init.h since the file is doing other modular stuff (module_get/put) even though it is built-in. Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-08 22:22:58 +10:00
Philippe Bergheaud	6e0c50f9e8	cxl: Refine slice error debug messages The PSL Slice Error Register (PSL_SERR_An) reports implementation dependent AFU errors, in the form of a bitmap. The PSL_SERR_An register content is printed in the form of hex dump debug message. This patch decodes the PSL_ERR_An register contents, and prints a specific error message for each possible error bit. It also dumps the secondary registers AFU_ERR_An and PSL_DSISR_An, that may contain extra debug information. This patch also removes the large WARN message that used to report the cxl slice error interrupt, and replaces it by a short informative message, that draws attention to AFU implementation errors. Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-08 22:22:03 +10:00
Ian Munsie	f5c9df9a44	cxl: Fix NULL pointer dereference on kernel contexts with no AFU interrupts If a kernel context is initialised and does not have any AFU interrupts allocated it will cause a NULL pointer dereference when the context is detached since the irq_names list will not have been initialised. Move the initialisation of the irq_names list into the cxl_context_init routine so that it will be valid for the entire lifetime of the context and will not cause a NULL pointer dereference. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-08 22:13:34 +10:00
Ian Munsie	2a4f667aad	cxl: Workaround XSL bug that does not clear the RA bit after a reset An issue was noted in our debug logs where the XSL would leave the RA bit asserted after an AFU reset operation, which would effectively prevent further AFU reset operations from working. Workaround the issue by clearing the RA bit with an MMIO write if it is still asserted after any AFU control operation. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-08 22:13:06 +10:00
Ian Munsie	5e7823c9bc	cxl: Fix bug where AFU disable operation had no effect The AFU disable operation has a bug where it will not clear the enable bit and therefore will have no effect. To date this has likely been masked by fact that we perform an AFU reset before the disable, which also has the effect of clearing the enable bit, making the following disable operation effectively a noop on most hardware. This patch modifies the afu_control function to take a parameter to clear from the AFU control register so that the disable operation can clear the appropriate bit. This bug was uncovered on the Mellanox CX4, which uses an XSL rather than a PSL. On the XSL the reset operation will not complete while the AFU is enabled, meaning the enable bit was still set at the start of the disable and as a result this bug was hit and the disable also timed out. Because of this difference in behaviour between the PSL and XSL, this patch now makes the reset dependent on the card using a PSL to avoid waiting for a timeout on the XSL. It is entirely possible that we may be able to drop the reset altogether if it turns out we only ever needed it due to this bug - however I am not willing to drop it without further regression testing and have added comments to the code explaining the background. This also fixes a small issue where the AFU_Cntl register was read outside of the lock that protects it. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-08 22:11:18 +10:00
Ian Munsie	2224b6719b	cxl: Fix allocating a minimum of 2 pages for the SPA The Scheduled Process Area is allocated dynamically with enough pages to fit at least as many processes as the AFU descriptor indicated. Since the calculation is non-trivial, it does this by calculating how many processes could fit in an allocation of a given order, and increasing that order until it can fit enough processes or hits the maximum supported size. Currently, it will start this search using a SPA of 2 pages instead of 1. This can waste a page of memory if the AFU's maximum number of supported processes was small enough to fit in one page. Fix the algorithm to start the search at 1 page. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-08 22:10:44 +10:00
Ian Munsie	49e9c99f47	cxl: Fix allowing bogus AFU descriptors with 0 maximum processes If the AFU descriptor of an AFU directed AFU indicates that it supports 0 maximum processes, we will accept that value and attempt to use it. The SPA will still be allocated (with 2 pages due to another minor bug and room for 958 processes), and when a context is allocated we will pass the value of 0 to idr_alloc as the maximum. However, idr_alloc will treat that as meaning no maximum and will allocate a context number and we return a valid context. Conceivably, this could lead to a buffer overflow of the SPA if more than 958 contexts were allocated, however this is mitigated by the fact that there are no known AFUs in the wild with a bogus AFU descriptor like this, and that only the root user is allowed to flash an AFU image to a card. Add a check when validating the AFU descriptor to reject any with 0 maximum processes. We do still allow a dedicated process only AFU to indicate that it supports 0 contexts even though that is forbidden in the architecture, as in that case we ignore the value and use 1 instead. This is just on the off-chance that such a dedicated process AFU may exist (not that I am aware of any), since their developers are less likely to have cared about this value at all. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-08 22:10:42 +10:00
Michael Neuling	ad42de859f	cxl: Add set and get private data to context struct This provides AFU drivers a means to associate private data with a cxl context. This is particularly intended for make the new callbacks for driver specific events easier for AFU drivers to use, as they can easily get back to any private data structures they may use. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-28 18:35:08 +10:00
Philippe Bergheaud	b810253bd9	cxl: Add mechanism for delivering AFU driver specific events This adds an afu_driver_ops structure with fetch_event() and event_delivered() callbacks. An AFU driver such as cxlflash can fill this out and associate it with a context to enable passing custom AFU specific events to userspace. This also adds a new kernel API function cxl_context_pending_events(), that the AFU driver can use to notify the cxl driver that new specific events are ready to be delivered, and wake up anyone waiting on the context wait queue. The current count of AFU driver specific events is stored in the field afu_driver_events of the context structure. The cxl driver checks the afu_driver_events count during poll, select, read, etc. calls to check if an AFU driver specific event is pending, and calls fetch_event() to obtain and deliver that event. This way, the cxl driver takes care of all the usual locking semantics around these calls and handles all the generic cxl events, so that the AFU driver only needs to worry about it's own events. fetch_event() return a struct cxl_event_afu_driver_reserved, allocated by the AFU driver, and filled in with the specific event information and size. Total event size (header + data) should not be greater than CXL_READ_MIN_SIZE (4K). Th cxl driver prepends an appropriate cxl event header, copies the event to userspace, and finally calls event_delivered() to return the status of the operation to the AFU driver. The event is identified by the context and cxl_event_afu_driver_reserved pointers. Since AFU drivers provide their own means for userspace to obtain the AFU file descriptor (i.e. cxlflash uses an ioctl on their scsi file descriptor to obtain the AFU file descriptor) and the generic cxl driver will never use this event, the ABI of the event is up to each individual AFU driver. Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-28 18:34:56 +10:00
Frederic Barrat	a430739009	cxl: Make vPHB device node match adapter's On bare-metal, when a device is attached to the cxl card, lsvpd shows a location code such as (with cxlflash): # lsvpd -l sg22 ... YL U78CB.001.WZS0073-P1-C33-B0-T0-L0 which makes it hard to easily identify the cxl adapter owning the flash device, since in this example C33 refers to a P8 processor. lsvpd looks in the parent devices until it finds a location code, so the device node for the vPHB ends up being used. By reusing the device node of the adapter for the vPHB, lsvpd shows: # lsvpd -l sg16 ... YL U78C9.001.WZS09XA-P1-C7-B1-T0-L3 where C7 is the PCI slot of the cxl adapter. On powerVM, the vPHB was already using the adapter device node, so there's no change there. Tested by cxlflash on bare-metal and powerVM. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 23:11:30 +10:00
Ian Munsie	b385c9e971	cxl: Add support for CAPP DMA mode This adds support for using CAPP DMA mode, which is required for XSL based cards such as the Mellanox CX4 to function. This is currently an RFC as it depends on the corresponding support to be merged into skiboot first, which was submitted here: http://patchwork.ozlabs.org/patch/625582/ In the event that the skiboot on the system does not have the above support, it will indicate as such in the kernel log and abort the init process. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 23:10:26 +10:00
Frederic Barrat	6d382616ac	cxl: Abstract the differences between the PSL and XSL The XSL (Translation Service Layer) is a stripped down version of the PSL (Power Service Layer) used in some cards such as the Mellanox CX4. Like the PSL, it implements the CAIA architecture, but has a number of differences, mostly in it's implementation dependent registers. This adds an ops structure to abstract these differences to bring initial support for XSL CAPI devices. The XSL does not implement the optional architected SERR register, however while it treats it as a reserved register and should work with no special treatment, attempting to access it will cause the XSL_FEC (First Error Capture) register to be filled out, preventing it from capturing any subsequent errors. Therefore, this patch also prevents the kernel from trying to set up the SERR register so that the FEC register may still be useful, and to save one interrupt. The XSL also uses a special DMA cxl mode, which uses a slightly different init sequence for the CAPP and PHB. The kernel support for this will be in a future patch once the corresponding support has been merged into skiboot. Co-authored-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 23:08:54 +10:00
Ian Munsie	292841b096	cxl: Update process element after allocating interrupts In the kernel API, it is possible to attempt to allocate AFU interrupts after already starting a context. Since the process element structure used by the hardware is only filled out at the time the context is started, it will not be updated with the interrupt numbers that have just been allocated and therefore AFU interrupts will not work unless they were allocated prior to starting the context. This can present some difficulties as each CAPI enabled PCI device in the kernel API has a default context, which may need to be started very early to enable translations, potentially before interrupts can easily be set up. This patch makes the API more flexible to allow interrupts to be allocated after a context has already been started and takes care of updating the PE structure used by the hardware and notifying it to discard any cached copy it may have. The update is currently performed via a terminate/remove/add sequence. This is necessary on some hardware such as the XSL that does not properly support the update LLCMD. Note that this is only supported on powernv at present - attempting to perform this ordering on PowerVM will raise a warning. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 23:08:49 +10:00
Andrew Donnellan	64417a3989	cxl: static-ify variables to fix sparse warnings Make a couple more variables static. Found by sparse. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: fbarrat@linux.vnet.ibm.com Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 22:49:27 +10:00
Linus Torvalds	c04a588029	Merge tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V - Live patching support for ppc64le (also merged via livepatching.git) Various cleanups & minor fixes from: - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin Rothberg, Vipin K Parashar. General: - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot - Fix branching to OOL handlers in relocatable kernel from Hari Bathini - Add support for userspace Power9 copy/paste from Chris Smart - Always use STRICT_MM_TYPECHECKS from Michael Ellerman - Add mask of possible MMU features from Michael Ellerman PCI: - Enable pass through of NVLink to guests from Alexey Kardashevskiy - Cleanups in preparation for powernv PCI hotplug from Gavin Shan - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G Piccoli - Remove the dependency on EEH struct in DDW mechanism from Guilherme G Piccoli selftests: - Test cp_abort during context switch from Chris Smart - Add several tests for transactional memory support from Rashmica Gupta perf: - Add support for sampling interrupt register state from Anju T - Add support for unwinding perf-stackdump from Chandan Kumar cxl: - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud - Allow initialization on timebase sync failures from Frederic Barrat - Increase timeout for detection of AFU mmio hang from Frederic Barrat - Handle num_of_processes larger than can fit in the SPA from Ian Munsie - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie - Check periodically the coherent platform function's state from Christophe Lombard Freescale: - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum workaround, and a kconfig dependency fix." * tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits) powerpc/86xx: Fix PCI interrupt map definition powerpc/86xx: Move pci1 definition to the include file powerpc/fsl: Fix build of the dtb embedded kernel images powerpc/fsl: Fix rcpm compatible string powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC powerpc/fsl-pci: Add a workaround for PCI 5 errata powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb powerpc/powernv/npu: Add PE to PHB's list powerpc/powernv: Fix insufficient memory allocation powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner() powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover() powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover() powerpc/eeh: Don't report error in eeh_pe_reset_and_recover() Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()" powerpc/powernv/npu: Enable NVLink pass through powerpc/powernv/npu: Rework TCE Kill handling powerpc/powernv/npu: Add set/unset window helpers powerpc/powernv/ioda2: Export debug helper pe_level_printk() ...	2016-05-20 10:12:41 -07:00
Christophe Lombard	266eab8f32	cxl: Check periodically the coherent platform function's state In the PowerVM environment, the PHYP CoherentAccel component manages the state of the Coherent Accelerator Processor Interface adapter and virtualizes CAPI resources, handles CAPP, PSL, PSL Slice errors - and interrupts - and provides a new set of hcalls for the OS APIs to utilize Accelerator Function Unit (AFU). During the course of operation, a coherent platform function can encounter errors. Some possible reason for errors are: • Hardware recoverable and unrecoverable errors • Transient and over-threshold correctable errors PHYP implements its own state model for the coherent platform function. The state of the AFU is available through a hcall. The current implementation of the cxl driver, for the PowerVM environment, checks this state of the AFU only when an action is requested - open a device, ioctl command, memory map, attach/detach a process - from an external driver - cxlflash, libcxl. If an error is detected the cxl driver handles the error according the content of the Power Architecture Platform Requirements document. But in case of low-level troubles (or error injection), the PHYP component may reset the card and change the AFU state. The PHYP interface doesn't provide any way to be notified when that happens thus implies that the cxl driver: • cannot handle immediatly the state change of the AFU. • cannot notify other drivers (cxlflash, ...) The purpose of this patch is to wake up the cpu periodically to check the current state of each AFU and to see if we need to enter an error recovery path. Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:11 +10:00

1 2 3 4 5 ...

171 Commits