Michael S. Tsirkin noticed a race condition:
we reset device on freeze, but system WQ is still
running so it might try adding bufs to a VQ meanwhile.
To fix, switch to handling events from the freezable WQ.
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
virtio spec requires drivers to set DRIVER_OK before using VQs.
This is set automatically after restore returns, virtio scsi violated
this rule on restore by kicking event vq within restore.
To fix, call virtio_device_ready before using event queue.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently kick event within virtscsi_init,
before host is fully initialized.
This can in theory confuse guest if device
consumes the buffers immediately.
To fix, move virtscsi_kick_event_all out to scan/restore.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
change_queue_depth allows changing per-target queue depth via sysfs.
It also allows the SCSI midlayer to ramp down the number of concurrent
inflight requests in response to a SCSI BUSY status response and allows
the midlayer to ramp the count back up to the device maximum when the
BUSY condition has resolved.
Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The spinlock of tgt_lock is only for serializing read and write
req_vq, one lockless seqcount is enough for the purpose.
On one 16core VM with vhost-scsi backend, the patch can improve
IOPS with 3% on random read test.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
[Add initialization in virtscsi_target_alloc. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Even though the virtio-scsi spec guarantees that all requests related
to the TMF will have been completed by the time the TMF itself completes,
the request queue's callback might not have run yet. This causes requests
to be completed more than once, and as a result triggers a variety of
BUGs or oopses.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Calling the workqueue interface on uninitialized work items isn't a
good idea even if they're zeroed. It's not failing catastrophically only
through happy accidents.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Pull SCSI target updates from Nicholas Bellinger:
"The highlights this round include:
- Add support for T10 PI pass-through between vhost-scsi +
virtio-scsi (MST + Paolo + MKP + nab)
- Add support for T10 PI in qla2xxx target mode (Quinn + MKP + hch +
nab, merged through scsi.git)
- Add support for percpu-ida pre-allocation in qla2xxx target code
(Quinn + nab)
- A number of iser-target fixes related to hardening the network
portal shutdown path (Sagi + Slava)
- Fix response length residual handling for a number of control CDBs
(Roland + Christophe V.)
- Various iscsi RFC conformance fixes in the CHAP authentication path
(Tejas and Calsoft folks + nab)
- Return TASK_SET_FULL status for tcm_fc(FCoE) DataIn + Response
failures (Vasu + Jun + nab)
- Fix long-standing ABORT_TASK + session reset hang (nab)
- Convert iser-initiator + iser-target to include T10 bytes into EDTL
(Sagi + Or + MKP + Mike Christie)
- Fix NULL pointer dereference regression related to XCOPY introduced
in v3.15 + CC'ed to v3.12.y (nab)"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (34 commits)
target: Fix NULL pointer dereference for XCOPY in target_put_sess_cmd
vhost-scsi: Include prot_bytes into expected data transfer length
TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire
libiscsi, iser: Adjust data_length to include protection information
scsi_cmnd: Introduce scsi_transfer_length helper
target: Report correct response length for some commands
target/sbc: Check that the LBA and number of blocks are correct in VERIFY
target/sbc: Remove sbc_check_valid_sectors()
Target/iscsi: Fix sendtargets response pdu for iser transport
Target/iser: Fix a wrong dereference in case discovery session is over iser
iscsi-target: Fix ABORT_TASK + connection reset iscsi_queue_req memory leak
target: Use complete_all for se_cmd->t_transport_stop_comp
target: Set CMD_T_ACTIVE bit for Task Management Requests
target: cleanup some boolean tests
target/spc: Simplify INQUIRY EVPD=0x80
tcm_fc: Generate TASK_SET_FULL status for response failures
tcm_fc: Generate TASK_SET_FULL status for DataIN failures
iscsi-target: Reject mutual authentication with reflected CHAP_C
iscsi-target: Remove no-op from iscsit_tpg_del_portal_group
iscsi-target: Fix CHAP_A parameter list handling
...
Pull virtio updates from Rusty Russell:
"Main excitement is a virtio_scsi fix for alloc holding spinlock on the
abort path, which I refuse to CC stable since (1) I discovered it
myself, and (2) it's been there forever with no reports"
* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
virtio_scsi: don't call virtqueue_add_sgs(... GFP_NOIO) holding spinlock.
virtio-rng: fixes for device registration/unregistration
virtio-rng: fix boot with virtio-rng device
virtio-rng: support multiple virtio-rng devices
virtio_ccw: introduce device_lost in virtio_ccw_device
virtio: virtio_break_device() to mark all virtqueues broken.
This patch updates virtscsi_probe() to setup necessary Scsi_Host
level protection resources. (currently hardcoded to 1)
It changes virtscsi_add_cmd() to attach outgoing / incoming
protection SGLs preceeding the data payload, and is using the
new virtio_scsi_cmd_req_pi->pi_bytes[out,in] field to signal
to signal to vhost/scsi bytes to expect for protection data.
(Add missing #include <linux/blkdev.h> for blk_integrity - sfr + nab)
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Taken almost entirely from Nicholas Bellinger's scsi-mq conversion.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Access to tgt->req_vq is strictly serialized by spin_lock
of tgt->tgt_lock, so the ACCESS_ONCE() isn't necessary.
smp_read_barrier_depends() in virtscsi_req_done was introduced
to order reading req_vq and decreasing tgt->reqs, but it isn't
needed now because req_vq is read from
scsi->req_vqs[vq->index - VIRTIO_SCSI_VQ_BASE] instead of
tgt->req_vq, so remove the unnecessary barrier.
Also remove related comment about the barrier.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
vqs are freed in virtscsi_freeze but the hotcpu_notifier is not
unregistered. We will have a use-after-free usage when the notifier
callback is called after virtscsi_freeze.
Fixes: 285e71ea6f
("virtio-scsi: reset virtqueue affinity when doing cpu hotplug")
Cc: stable@vger.kernel.org
Signed-off-by: Asias He <asias.hejun@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If virtqueue_get_buf() returned with a NULL pointer avoid a possibly
endless loop by checking for a broken virtqueue.
Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This lets the transport do endian conversion if necessary, and insulates
the drivers from the difference.
Most drivers can use the simple helpers virtio_cread() and virtio_cwrite().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The freeze and restore functions defined in virtio drivers are used
for suspend and hibernate, so CONFIG_PM_SLEEP is more appropriate than
CONFIG_PM. This patch replace all CONFIG_PM with CONFIG_PM_SLEEP for
virtio drivers that implement freeze and restore callbacks.
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This patch adds queue steering to virtio-scsi. When a target is sent
multiple requests, we always drive them to the same queue so that FIFO
processing order is kept. However, if a target was idle, we can choose
a queue arbitrarily. In this case the queue is chosen according to the
current VCPU, so the driver expects the number of request queues to be
equal to the number of VCPUs. This makes it easy and fast to select
the queue, and also lets the driver optimize the IRQ affinity for the
virtqueues (each virtqueue's affinity is set to the CPU that "owns"
the queue).
The speedup comes from improving cache locality and giving CPU affinity
to the virtqueues, which is why this scheme was selected. Assuming that
the thread that is sending requests to the device is I/O-bound, it is
likely to be sleeping at the time the ISR is executed, and thus executing
the ISR on the same processor that sent the requests is cheap.
However, the kernel will not execute the ISR on the "best" processor
unless you explicitly set the affinity. This is because in practice
you will have many such I/O-bound processes and thus many otherwise
idle processors. Then the kernel will execute the ISR on a random
processor, rather than the one that is sending requests to the device.
The alternative to per-CPU virtqueues is per-target virtqueues. To
achieve the same locality, we could dynamically choose the virtqueue's
affinity based on the CPU of the last task that sent a request. This
is less appealing because we do not set the affinity directly---we only
provide a hint to the irqbalanced running in userspace. Dynamically
changing the affinity only works if the userspace applies the hint
fast enough.
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Tested-by: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
virtio_scsi_target_state is now empty. We will find new uses for it in
the next few patches, so this patch does not drop it completely.
And as James suggested, we use entries target_alloc and target_destroy
in the host template to allocate and destroy the virtio_scsi_target_state
of each target, attach this struct to scsi_target->hostdata. Now
we can get at it from the sdev with scsi_target(sdev)->hostdata.
No messing around with fixed size arrays and bulk memory allocation
and no need to pass in the maximum target size as a parameter because
everything should now happen dynamically.
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>