There is a rare race when we remove an entry from the global list
hv_context.percpu_list[cpu] in hv_process_channel_removal() ->
percpu_channel_deq() -> list_del(): at this time, if vmbus_on_event() ->
process_chn_event() -> pcpu_relid2channel() is trying to query the list,
we can get the kernel fault.
Similarly, we also have the issue in the code path: vmbus_process_offer() ->
percpu_channel_enq().
We can resolve the issue by disabling the tasklet when updating the list.
The patch also moves vmbus_release_relid() to a later place where
the channel has been removed from the per-cpu and the global lists.
Reported-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Background: userspace daemons registration protocol for Hyper-V utilities
drivers has two steps:
1) daemon writes its own version to kernel
2) kernel reads it and replies with module version
at this point we consider the handshake procedure being completed and we
do hv_poll_channel() transitioning the utility device to HVUTIL_READY
state. At this point we're ready to handle messages from kernel.
When hvutil_transport is in HVUTIL_TRANSPORT_CHARDEV mode we have a
single buffer for outgoing message. hvutil_transport_send() puts to this
buffer and till the buffer is cleared with hvt_op_read() returns -EFAULT
to all consequent calls. Host<->guest protocol guarantees there is no more
than one request at a time and we will not get new requests till we reply
to the previous one so this single message buffer is enough.
Now to the race. When we finish negotiation procedure and send kernel
module version to userspace with hvutil_transport_send() it goes into the
above mentioned buffer and if the daemon is slow enough to read it from
there we can get a collision when a request from the host comes, we won't
be able to put anything to the buffer so the request will be lost. To
solve the issue we need to know when the negotiation is really done (when
the version message is read by the daemon) and transition to HVUTIL_READY
state after this happens. Implement a callback on read to support this.
Old style netlink communication is not affected by the change, we don't
really know when these messages are delivered but we don't have a single
message buffer there.
Reported-by: Barry Davis <barry_davis@stormagic.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vmbus_teardown_gpadl() can result in infinite wait when it is called on 5
second timeout in vmbus_open(). The issue is caused by the fact that gpadl
teardown operation won't ever succeed for an opened channel and the timeout
isn't always enough. As a guest, we can always trust the host to respond to
our request (and there is nothing we can do if it doesn't).
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In some cases create_gpadl_header() allocates submessages but we never
free them.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We use messagecount only once in vmbus_establish_gpadl() to check if
it is safe to iterate through the submsglist. We can just initialize
the list header in all cases in create_gpadl_header() instead.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we crash from NMI context (e.g. after NMI injection from host when
'sysctl -w kernel.unknown_nmi_panic=1' is set) we hit
kernel BUG at mm/vmalloc.c:1530!
as vfree() is denied. While the issue could be solved with in_nmi() check
instead I opted for skipping vfree on all sorts of crashes to reduce the
amount of work which can cause consequent crashes. We don't really need to
free anything on crash.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Replace explicit computation of vma page count by a call to
vma_pages()
Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When building with W=1, the __scif_rma_destroy_tcw function
causes a harmless warning about an argument variable that is
modified but not used:
drivers/misc/mic/scif/scif_dma.c: In function ‘__scif_rma_destroy_tcw’:
drivers/misc/mic/scif/scif_dma.c:118:27: error: parameter ‘ep’ set but not used [-Werror=unused-but-set-parameter]
In this case, we can just remove the argument, since all callers
are in the same file.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The device lock was unnecessary obtained in bus rescan work before the
amthif client search. That causes incorrect lock ordering and task
hang:
...
[88004.613213] INFO: task kworker/1:14:21832 blocked for more than 120 seconds.
...
[88004.645934] Workqueue: events mei_cl_bus_rescan_work
...
The correct lock order is
cl_bus_lock
device_lock
me_clients_rwsem
Move device_lock into amthif init function that called
after me_clients_rwsem is released.
This fixes regression introduced by commit:
commit 025fb792ba ("mei: split amthif client init from end of clients enumeration")
Cc: <stable@vger.kernel.org> # 4.6+
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mei_amthif_read have only one difference from mei_read, it is not
calling mei_read_start().
Make mei_read_start return immediately for amthif client and drop the
special mei_amthif_read function.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The FW supports only one pending read per host client, in order to
support issuing of consecutive reads the driver queues read requests
internally and send them to the firmware after pending one has
completed.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Enclose the boiler plate code of allocating a control/hbm command cb
and enqueueing it onto ctrl_wr.list in a convenient wrapper
mei_cl_enqueue_ctrl_wr_cb().
This is a preparatory patch for enabling consecutive reads.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With the introduction of the receive control flow credits prefixed with
rx_ we add tx_ prefix to the variables and function used for tracking
the transmit control flow credits.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use RX flow control counter in the host client structure to
track the number of simultaneous outstanding reads.
This eliminates search in queues and makes ground for
enabling for parallel read.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The read callbacks for the fixed address clients, that don't have flow
control are built now on the receive path. In order to have a single
allocation place we remove the allocation from the read request.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The read callback is always prepared with MTU-sized buffer and the FW
can't send more than the MTU in one message.
Checking for buffer existence and krealloc to increase receive buffer
size are redundant and may be safely discarded.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Open code mei_clear_lists into its only caller mei_amthif_releas
and drop unused parameter 'dev' form from mei_clear_list function.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Fixed address clients do not work with the flow control, and the
packet RX callback was allocated upon TX with anticipation of a
following RX. This won't work if the clients with unsolicited Rx. Rather
than preparing read callback upon a write we allocate one directly on
the reciev path if one doesn't exists.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Store the file associated with a client in the host client structure,
this enables dropping the special amthif client file pointer from struct
mei_device, and this is also a preparation for changing the way rx
packet allocation for fixed_address clients
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Move read cb to the completion queue if a read finds out that client
is not connected. This expedite user space reader wake on error
condition.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the course of the read flow we want to wait for read completion only
if the read queue is empty.
However the calling list_empty(&cl->rd_completed) is a duplication as the
same check was performed by mei_cl_read_cb() and the waiting is skipped
if it returns not NULL.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>