Avoid taking the host-wide host_lock to check the per-host queue limit.
Instead we do an atomic_inc_return early on to grab our slot in the queue,
and if necessary decrement it after finishing all checks.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Webb Scales <webbnh@hp.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Robert Elliott <elliott@hp.com>
Using dev_printk variants prefixes the logging message with
the originating device, which makes debugging easier.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
scsi_put_command() is either invoked before blk_start_request() or
after block layer processing has completed. scsi_cmnd.abort_work
is scheduled from inside the SCSI timeout handler. The block layer
guarantees that either the regular completion handler
(softirq_done_fn()) or the timeout handler (rq_timed_out_fn()) is
invoked but not both. This means that scsi_put_command() is never
invoked while abort_work is scheduled. Hence remove the
cancel_delayed_work() call from scsi_put_command().
Similarly, scsi_abort_command() is only invoked from the SCSI
timeout handler. If scsi_abort_command() is invoked for a SCSI
command with the SCSI_EH_ABORT_SCHEDULED flag set this means that
scmd_eh_abort_handler() has already invoked scsi_queue_insert() and
hence that scsi_cmnd.abort_work is no longer pending. Hence also
remove the cancel_delayed_work() call from scsi_abort_command().
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Any callbacks in scsi_timeout_out() might return BLK_EH_RESET_TIMER,
in which case we should leave the result alone and not set
DID_TIME_OUT, as the command didn't actually timeout.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
After scsi_try_to_abort_cmd returns, the eh_abort_handler may have
already found that the command has completed in the device, causing
the host_byte to be nonzero (e.g. it could be DID_ABORT). When
this happens, ORing DID_TIME_OUT into the host byte will corrupt
the result field and initiate an unwanted command retry.
Fix this by using set_host_byte instead, following the model of
commit 2082ebc45a.
Cc: stable@vger.kernel.org
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
[Fix all instances according to review comments. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Pull block layer fixes from Jens Axboe:
"Final small batch of fixes to be included before -rc1. Some general
cleanups in here as well, but some of the blk-mq fixes we need for the
NVMe conversion and/or scsi-mq. The pull request contains:
- Support for not merging across a specified "chunk size", if set by
the driver. Some NVMe devices perform poorly for IO that crosses
such a chunk, so we need to support it generically as part of
request merging avoid having to do complicated split logic. From
me.
- Bump max tag depth to 10Ki tags. Some scsi devices have a huge
shared tag space. Before we failed with EINVAL if a too large tag
depth was specified, now we truncate it and pass back the actual
value. From me.
- Various blk-mq rq init fixes from me and others.
- A fix for enter on a dying queue for blk-mq from Keith. This is
needed to prevent oopsing on hot device removal.
- Fixup for blk-mq timer addition from Ming Lei.
- Small round of performance fixes for mtip32xx from Sam Bradshaw.
- Minor stack leak fix from Rickard Strandqvist.
- Two __init annotations from Fabian Frederick"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: add __init to blkcg_policy_register
block: add __init to elv_register
block: ensure that bio_add_page() always accepts a page for an empty bio
blk-mq: add timer in blk_mq_start_request
blk-mq: always initialize request->start_time
block: blk-exec.c: Cleaning up local variable address returnd
mtip32xx: minor performance enhancements
blk-mq: ->timeout should be cleared in blk_mq_rq_ctx_init()
blk-mq: don't allow queue entering for a dying queue
blk-mq: bump max tag depth to 10K tags
block: add blk_rq_set_block_pc()
block: add notion of a chunk size for request merging
With the optimizations around not clearing the full request at alloc
time, we are leaving some of the needed init for REQ_TYPE_BLOCK_PC
up to the user allocating the request.
Add a blk_rq_set_block_pc() that sets the command type to
REQ_TYPE_BLOCK_PC, and properly initializes the members associated
with this type of request. Update callers to use this function instead
of manipulating rq->cmd_type directly.
Includes fixes from Christoph Hellwig <hch@lst.de> for my half-assed
attempt.
Signed-off-by: Jens Axboe <axboe@fb.com>
->queuecommand returns '0' for successful command submission,
so we need to set the correct SCSI midlayer return value
when calling scsi_log_completion().
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reported-by: Robert Elliott <elliott@hp.com>
Cc: Stephen Cameron <scameron@beardog.cce.hp.com>
Tested-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This patch fixes a corner case in the previous USB Deadlock fix patch (12023e7
[SCSI] Fix USB deadlock caused by SCSI error handling).
The scenario is abort command, set flag, abort completes, send TUR, TUR
doesn't return, so we now try to abort the TUR, but scsi_abort_eh_cmnd()
will skip the abort because the flag is set and move straight to reset.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
USB requires that every command be aborted first before we escalate to reset.
In particular, USB will deadlock if we try to reset first before aborting the
command.
Unfortunately, the flag we use to tell if a command has already been aborted:
SCSI_EH_ABORT_SCHEDULED is not cleared properly leading to cases where we can
requeue a command with the flag set and proceed immediately to reset if it
fails (thus causing USB to deadlock).
Fix by clearing the SCSI_EH_ABORT_SCHEDULED flag if it has been set. Which
means this will be the second time scsi_abort_command() has been called for
the same command. IE the first abort went out, did its thing, but now the
same command has timed out again.
So this flag gets cleared, and scsi_abort_command() returns FAILED, and _no_
asynchronous abort is being scheduled. scsi_times_out() will then proceed to
call scsi_eh_scmd_add(). But as we've cleared the SCSI_EH_ABORT_SCHEDULED
flag the SCSI_EH_CANCEL_CMD flag will continue to be set, and the command will
be aborted with the main SCSI EH routine.
Reported-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Andreas Reis <andreas.reis@gmail.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
We're seeing a case where the contents of scmd->result isn't being reset after
a SCSI command encounters an error, is resubmitted, times out and then gets
handled. The error handler acts on the stale result of the previous error
instead of the timeout. Fix this by properly zeroing the scmd->status before
the command is resubmitted.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
We unconditionally execute scsi_eh_get_sense() to make sure all failed
commands that should have sense attached, do. However, the routine forgets
that some commands, because of the way they fail, will not have any sense code
... we should not bother them with a REQUEST_SENSE command. Fix this by
testing to see if we actually got a CHECK_CONDITION return and skip asking for
sense if we don't.
Tested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Many callers won't need this and we can optimize them away. In addition
the handling in the __-prefixed variants was inconsistant to start with.
Based on an earlier patch from Bart Van Assche.
[jejb: fix kerneldoc probelm picked up by Fengguang Wu]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The former minimum valid value of 'eh_deadline' is 1s, which means
the earliest occasion to shorten EH is 1 second later since a
command is failed or timed out. But if we want to skip EH steps
ASAP, we have to wait until the first EH step is finished. If the
duration of the first EH step is long, this waiting time is
excruciating. So, it is necessary to accept 0 as the minimum valid
value for 'eh_deadline'.
According to my test, with Hannes' patchset 'New EH command timeout
handler' as well, the minimum IO time is improved from 73s
(eh_deadline = 1) to 43s(eh_deadline = 0) when commands are timed
out by disabling RSCN and target port.
Signed-off-by: Ren Mingxin <renmx@cn.fujitsu.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
32bit accesses are guaranteed to be atomic, so we can remove
the spinlock when checking for eh_deadline. We only need to
make sure to catch any updates which might happened during
the call to time_before(); if so we just recheck with the
correct value.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When a command runs into a timeout we need to send an 'ABORT TASK'
TMF. This is typically done by the 'eh_abort_handler' LLDD callback.
Conceptually, however, this function is a normal SCSI command, so
there is no need to enter the error handler.
This patch implements a new scsi_abort_command() function which
invokes an asynchronous function scsi_eh_abort_handler() to
abort the commands via the usual 'eh_abort_handler'.
If abort succeeds the command is either retried or terminated,
depending on the number of allowed retries. However, 'eh_eflags'
records the abort, so if the retry would fail again the
command is pushed onto the error handler without trying to
abort it (again); it'll be cleared up from SCSI EH.
[hare: smatch detected stray switch fixed]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Commit 18a4d0a22e
(Handle disk devices which can not process medium access commands)
was introduced to offline any device which cannot process medium
access commands.
However, commit 3eef6257de
(Reduce error recovery time by reducing use of TURs) reduced
the number of TURs by sending it only on the first failing
command, which might or might not be a medium access command.
So in combination this results in an erratic device offlining
during EH; if the command where the TUR was sent upon happens
to be a medium access command the device will be set offline,
if not everything proceeds as normal.
This patch moves the check to the final test, eliminating
this problem.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If a command abort fails there is a fair chance that all other
aborts will be failing, too.
So we should be calling LUN reset directly after the first failed
abort and skip aborting the remaining commands.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patchs adds an 'eh_deadline' sysfs attribute to the scsi
host which limits the overall runtime of the SCSI EH.
The 'eh_deadline' value is stored in the now obsolete field
'resetting'.
When a command is failed the start time of the EH is stored
in 'last_reset'. If the overall runtime of the SCSI EH is longer
than last_reset + eh_deadline, the EH is short-circuited and
falls through to issue a host reset only.
[jejb: add comments in Scsi_Host about new fields]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Generate a uevent when the following Unit Attention ASC/ASCQ
codes are received:
2A/01 MODE PARAMETERS CHANGED
2A/09 CAPACITY DATA HAS CHANGED
38/07 THIN PROVISIONING SOFT THRESHOLD REACHED
3F/03 INQUIRY DATA HAS CHANGED
3F/0E REPORTED LUNS DATA HAS CHANGED
Log kernel messages when the following Unit Attention ASC/ASCQ
codes are received that are not as specific as those above:
2A/xx PARAMETERS CHANGED
3F/xx TARGET OPERATING CONDITIONS HAVE CHANGED
Added logic to set expecting_lun_change for other LUNs on the target
after REPORTED LUNS DATA HAS CHANGED is received, so that duplicate
uevents are not generated, and clear expecting_lun_change when a
REPORT LUNS command completes, in accordance with the SPC-3
specification regarding reporting of the 3F 0E ASC/ASCQ UA.
[jejb: remove SPC3 test in scsi_report_lun_change and some docbook fixes and
unused variable fix, both reported by Fengguang Wu]
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When a medium error is detected the SCSI stack should return
ENODATA to the upper layers.
[jejb: fix whitespace error]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When the thin provisioning hard threshold is reached we
should return ENOSPC to inform upper layers about this fact.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
We should be modifying the host_byte status in scsi_check_sense()
directly; this saves us to introduce a special return code for
each and every condition.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Pull first round of SCSI updates from James Bottomley:
"The patch set is mostly driver updates (usf, zfcp, lpfc, mpt2sas,
megaraid_sas, bfa, ipr) and a few bug fixes. Also of note is that the
Buslogic driver has been rewritten to a better coding style and 64 bit
support added. We also removed the libsas limitation on 16 bytes for
the command size (currently no drivers make use of this)"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (101 commits)
[SCSI] megaraid: minor cut and paste error fixed.
[SCSI] ufshcd-pltfrm: remove unnecessary dma_set_coherent_mask() call
[SCSI] ufs: fix register address in UIC error interrupt handling
[SCSI] ufshcd-pltfrm: add missing empty slot in ufs_of_match[]
[SCSI] ufs: use devres functions for ufshcd
[SCSI] ufs: Fix the response UPIU length setting
[SCSI] ufs: rework link start-up process
[SCSI] ufs: remove version check before IS reg clear
[SCSI] ufs: amend interrupt configuration
[SCSI] ufs: wrap the i/o access operations
[SCSI] storvsc: Update the storage protocol to win8 level
[SCSI] storvsc: Increase the value of scsi timeout for storvsc devices
[SCSI] MAINTAINERS: Add myself as the maintainer for BusLogic SCSI driver
[SCSI] BusLogic: Port driver to 64-bit.
[SCSI] BusLogic: Fix style issues
[SCSI] libiscsi: Added new boot entries in the session sysfs
[SCSI] aacraid: Fix for arrays are going offline in the system. System hangs
[SCSI] ipr: IOA Status Code(IOASC) update
[SCSI] sd: Update WRITE SAME heuristics
[SCSI] fnic: potential dead lock in fnic_is_abts_pending()
...