commit d5f8e166c25750adc147b0adf64a62a91653438a upstream.
pm_runtime_autosuspend can take synchronous or asynchronous
paths, Because we are calling pm_runtime_mark_last_busy just before
this most of the cases it takes the asynchronous way. However,
when the FW or driver resets during already running runtime suspend,
the call will result in calling to the driver's rpm callback and results
in a deadlock on device_lock.
The simplest fix is to replace pm_runtime_autosuspend with
asynchronous pm_request_autosuspend.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is fix of the backported patch only, it places
KBL DIDs on correct place to easy on backporting of
further DIDs.
Fixes: 5c99f32c46 ('mei: me: add kaby point device ids')
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c57cac1457f3125a5d13dc03635c0708c61bff0 upstream.
Sunrise Point PCH with SPS Firmware doesn't expose working
MEI interface, we need to quirk it out.
The SPS Firmware is identifiable only on the first PCI function
of the device.
Tested-by: Sujith Pandel <sujith_pandel@dell.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 582ab27a063a506ccb55fc48afcc325342a2deba upstream.
NFC version reply size checked against only header size, not against
full message size. That may lead potentially to uninitialized memory access
in version data.
That leads to warnings when version data is accessed:
drivers/misc/mei/bus-fixup.c: warning: '*((void *)&ver+11)' may be used uninitialized in this function [-Wuninitialized]: => 212:2
Reported in
Build regressions/improvements in v4.9-rc3
https://lkml.org/lkml/2016/10/30/57
Fixes: 59fcd7c63a (mei: nfc: Initial nfc implementation)
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7a7aeefbca2982586ba2c9fd7739b96416a6d1d upstream.
When interrupting an application which was allocating DMAable
memory, it was possible, that the DMA memory was deallocated
twice, leading to the error symptoms below.
Thanks to Gerald, who analyzed the problem and provided this
patch.
I agree with his analysis of the problem: ddcb_cmd_fixups() ->
genwqe_alloc_sync_sgl() (fails in f/lpage, but sgl->sgl != NULL
and f/lpage maybe also != NULL) -> ddcb_cmd_cleanup() ->
genwqe_free_sync_sgl() (double free, because sgl->sgl != NULL and
f/lpage maybe also != NULL)
In this scenario we would have exactly the kind of double free that
would explain the WARNING / Bad page state, and as expected it is
caused by broken error handling (cleanup).
Using the Ubuntu git source, tag Ubuntu-4.4.0-33.52, he was able to reproduce
the "Bad page state" issue, and with the patch on top he could not reproduce
it any more.
------------[ cut here ]------------
WARNING: at /build/linux-o03cxz/linux-4.4.0/arch/s390/include/asm/pci_dma.h:141
Modules linked in: qeth_l2 ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha256_s390 sha1_s390 sha_common genwqe_card qeth crc_itu_t qdio ccwgroup vmur dm_multipath dasd_eckd_mod dasd_mod
CPU: 2 PID: 3293 Comm: genwqe_gunzip Not tainted 4.4.0-33-generic #52-Ubuntu
task: 0000000032c7e270 ti: 00000000324e4000 task.ti: 00000000324e4000
Krnl PSW : 0404c00180000000 0000000000156346 (dma_update_cpu_trans+0x9e/0xa8)
R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
Krnl GPRS: 00000000324e7bcd 0000000000c3c34a 0000000027628298 000000003215b400
0000000000000400 0000000000001fff 0000000000000400 0000000116853000
07000000324e7b1e 0000000000000001 0000000000000001 0000000000000001
0000000000001000 0000000116854000 0000000000156402 00000000324e7a38
Krnl Code: 000000000015633a: 95001000 cli 0(%r1),0
000000000015633e: a774ffc3 brc 7,1562c4
#0000000000156342: a7f40001 brc 15,156344
>0000000000156346: 92011000 mvi 0(%r1),1
000000000015634a: a7f4ffbd brc 15,1562c4
000000000015634e: 0707 bcr 0,%r7
0000000000156350: c00400000000 brcl 0,156350
0000000000156356: eb7ff0500024 stmg %r7,%r15,80(%r15)
Call Trace:
([<00000000001563e0>] dma_update_trans+0x90/0x228)
[<00000000001565dc>] s390_dma_unmap_pages+0x64/0x160
[<00000000001567c2>] s390_dma_free+0x62/0x98
[<000003ff801310ce>] __genwqe_free_consistent+0x56/0x70 [genwqe_card]
[<000003ff801316d0>] genwqe_free_sync_sgl+0xf8/0x160 [genwqe_card]
[<000003ff8012bd6e>] ddcb_cmd_cleanup+0x86/0xa8 [genwqe_card]
[<000003ff8012c1c0>] do_execute_ddcb+0x110/0x348 [genwqe_card]
[<000003ff8012c914>] genwqe_ioctl+0x51c/0xc20 [genwqe_card]
[<000000000032513a>] do_vfs_ioctl+0x3b2/0x518
[<0000000000325344>] SyS_ioctl+0xa4/0xb8
[<00000000007b86c6>] system_call+0xd6/0x264
[<000003ff9e8e520a>] 0x3ff9e8e520a
Last Breaking-Event-Address:
[<0000000000156342>] dma_update_cpu_trans+0x9a/0xa8
---[ end trace 35996336235145c8 ]---
BUG: Bad page state in process jbd2/dasdb1-8 pfn:3215b
page:000003d100c856c0 count:-1 mapcount:0 mapping: (null) index:0x0
flags: 0x3fffc0000000000()
page dumped because: nonzero _count
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Frank Haverkamp <haver@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 43605e293eb13c07acb546c14f407a271837af17 upstream.
SEC registers are not accessible when the TXE device is in low power
state, hence the SEC interrupt cannot be processed if device is not
awake.
In some rare cases entrance to low power state (aliveness off) and input
ready bits can be signaled at the same time, resulting in communication
stall as input ready won't be signaled again after waking up. To resolve
this IPC_HHIER_SEC bit in HHISR_REG should not be cleaned if the
interrupt is not processed.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit aa09545589ceeff884421d8eb38d04963190afbe ]
GCC 4.6.3 does not support -Wno-unused-const-variable. Instead, use the
kbuild infrastructure that checks if this options exists.
Fixes: 2cd55c68c0 ("cxl: Fix build failure due to -Wunused-variable behaviour change")
Suggested-by: Michal Marek <mmarek@suse.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7b8ad495d59280b634a7b546f4cdf58cf4d65f61 ]
Presently when a user-space process issues CXL_IOCTL_START_WORK ioctl we
store the pid of the current task_struct and use it to get pointer to
the mm_struct of the process, while processing page or segment faults
from the capi card. However this causes issues when the thread that had
originally issued the start-work ioctl exits in which case the stored
pid is no more valid and the cxl driver is unable to handle faults as
the mm_struct corresponding to process is no more accessible.
This patch fixes this issue by using the mm_struct of the next alive
task in the thread group. This is done by iterating over all the tasks
in the thread group starting from thread group leader and calling
get_task_mm on each one of them. When a valid mm_struct is obtained the
pid of the associated task is stored in the context replacing the
exiting one for handling future faults.
The patch introduces a new function named get_mem_context that checks if
the current task pointed to by ctx->pid is dead? If yes it performs the
steps described above. Also a new variable cxl_context.glpid is
introduced which stores the pid of the thread group leader associated
with the context owning task.
Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reported-by: Frank Haverkamp <HAVERKAM@de.ibm.com>
Suggested-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1b5df59e50874b9034c0fa389cd52b65f1f93292 ]
An idr warning is reported when a context is release after the capi card
is unbound from the cxl driver via sysfs. Below are the steps to
reproduce:
1. Create multiple afu contexts in an user-space application using libcxl.
2. Unbind capi card from cxl using command of form
echo <capi-card-pci-addr> > /sys/bus/pci/drivers/cxl-pci/unbind
3. Exit/kill the application owning afu contexts.
After above steps a warning message is usually seen in the kernel logs
of the form "idr_remove called for id=<context-id> which is not
allocated."
This is caused by the function cxl_release_afu which destroys the
contexts_idr table. So when a context is release no entry for context pe
is found in the contexts_idr table and idr code prints this warning.
This patch fixes this issue by increasing & decreasing the ref-count on
the afu device when a context is initialized or when its freed
respectively. This prevents the afu from being released until all the
afu contexts have been released. The patch introduces two new functions
namely cxl_afu_get/put that manage the ref-count on the afu device.
Also the patch removes code inside cxl_dev_context_init that increases ref
on the afu device as its guaranteed to be alive during this function.
Reported-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc46b45a421a64a0895dd41a34d3d2086e1ac7f6 upstream.
Ensure that mei_cl_read_start is called under the device lock
also in the bus layer. The function updates global ctrl_wr_list
which should be locked.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d04ee11db7bf0d848266cbfd7db336097a0e239 upstream.
When a message is received and amthif client is not in reading state
the message is ignored and left dangling in the queue. This may happen
after one of the amthif host connections is closed w/o completing the
reading. Another client will pick up a wrong message on next read
attempt which will lead to link reset.
To prevent this the driver has to properly discard the message when
amthif client is not in reading state.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a8d648c8d1824117a9e9edb948ed1611fb013c0 upstream.
In the case when disconnection is initiated from the FW
the driver is flushing items from the write control list while
iterating over it:
mei_irq_write_handler()
list_for_each_entry_safe(ctrl_wr_list) <-- outer loop
mei_cl_irq_disconnect_rsp()
mei_cl_set_disconnected()
mei_io_list_flush(ctrl_wr_list) <-- destorying list
We move the list flushing to the completion routine.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f3df53e4d70b5736368a8fe8aa1bb70c1cb1f577 upstream.
Fix RDAC read back errors caused by a typo. Value must shift by 2.
Fixes: a4bd394956 ("drivers/misc/ad525x_dpot.c: new features")
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b64dbf849abdd7e769820e25120758f956a7f13 upstream.
Signed integer overflow is undefined. Also I added a check for
"(offset < 0)" in scif_unregister() because that makes it match the
other conditions and because I didn't want to subtract a negative.
Fixes: ba612aa8b4 ('misc: mic: SCIF memory registration and unregistration')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 50e6315dba721cbc24ccd6d7b299f1782f210a98 upstream.
Commit 985087dbcb 'misc: add support for bmp18x chips to the bmp085
driver' changed the BMP085 config symbol to a boolean. I see no
reason why the shared code cannot be built as a module, so change it
back to tristate.
Fixes: 985087dbcb ("misc: add support for bmp18x chips to the bmp085 driver")
Cc: Eric Andersson <eric.andersson@unixphere.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d6776bba44d9752f6cdf640046070e71ee4bba7b upstream.
Keep IRQ mappings on context teardown. This won't leak IRQs as if we
allocate the mapping again, the generic code will give the same
mapping used last time.
Doing this works around a race in the generic code. Masking the
interrupt introduces a race which can crash the kernel or result in
IRQ that is never EOIed. The lost of EOI results in all subsequent
mappings to the same HW IRQ never receiving an interrupt.
We've seen this race with cxl test cases which are doing heavy context
startup and teardown at the same time as heavy interrupt load.
A fix to the generic code is being investigated also.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15c13dfcad883a1e76b714480fb27be96247fd82 upstream.
The bus data transfer interface was missing the check if the device is
in enabled state, this may lead to stack corruption during link reset.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 923adb1646d5ba739d2a1e63ee20d60574d9da8e upstream.
The PSL timebase synchronization is seemingly failing for
configuration not including VIRT_CPU_ACCOUNTING_NATIVE. The driver
shows the following trace in dmesg:
PSL: Timebase sync: giving up!
The PSL timebase register is actually syncing correctly, but the cxl
driver is not detecting it. Fix is to use the proper timebase-to-time
conversion.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48f0f6b717e314a30be121b67e1d044f6d311d66 upstream.
When writing a value to config space, cxl_pcie_write_config() calls
cxl_pcie_config_info() to obtain a mask and shift value, shifts the new
value accordingly, then uses the mask to combine the shifted value with the
existing value at the address as part of a read-modify-write pattern.
Currently, we use a logical OR operator rather than a bitwise OR operator,
which means any use of this function results in an incorrect value being
written. Replace the logical OR operator with a bitwise OR operator so the
value is written correctly.
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Fixes: 6f7f0b3df6 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7326fffb712f09a315bc73cc1ee63843f59b8bd4 upstream.
This patch address a possible security issue:
The request field in client notify request ioctl comes from user space
as u32 and is downcasted to u8 with out validation.
Check request field to have approved values
MEI_HBM_NOTIFICATION_STAR/STOP
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>