Commit Graph

287701 Commits

Author SHA1 Message Date
Dan Williams a3a142524a [SCSI] libsas: prevent double completion of scmds from eh
We invoke task->task_done() to free the task in the eh case, but at this
point we are prepared for scsi_eh_flush_done_q() to finish off the scmd.

Introduce sas_end_task() to capture the final response status from the
lldd and free the task.

Also take the opportunity to kill this warning.
drivers/scsi/libsas/sas_scsi_host.c: In function ‘sas_end_task’:
drivers/scsi/libsas/sas_scsi_host.c:102:3: warning: case value ‘2’ not in enumerated type ‘enum exec_status’ [-Wswitch]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 14:04:52 -06:00
Dan Williams 3dff5721e4 [SCSI] libsas: close error handling vs sas_ata_task_done() race
Since sas_ata does not implement ->freeze(), completions for scmds and
internal commands can still arrive concurrent with
ata_scsi_cmd_error_handler() and sas_ata_post_internal() respectively.
By the time either of those is called libata has committed to completing
the qc, and the ATA_PFLAG_FROZEN flag tells sas_ata_task_done() it has
lost the race.

In the sas_ata_post_internal() case we take on the additional
responsibility of freeing the sas_task to close the race with
sas_ata_task_done() freeing the the task while sas_ata_post_internal()
is in the process of invoking ->lldd_abort_task().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:58:38 -06:00
Dan Williams e500a34b02 [SCSI] libsas: kill invocation of scsi_eh_finish_cmd from sas_ata_task_done
Prior to the conversion to the new-style libata-eh sas_ata_task_done()
may have been the last opportunity to clean up the scmd, but now
libata-eh explicitly handles this case.  It also races against sas-eh.
If a lldd completes a task after SAS_TASK_STATE_ABORTED is set it could
trigger a spurious decrement of shost->host_failed.  Current lldds have
the band-aid of checking SAS_TASK_STATE_ABORTED before calling
->task_done(), but better to just let the scmds escalate to libata for
race free cleanup.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:57:01 -06:00
Dan Williams b91bb29618 [SCSI] libsas: use ->set_dmamode to notify lldds of NCQ parameters
sas_discover_sata() notifies lldds of sata devices twice.  Once to allow
the 'identify' to be sent, and a second time to allow aic94xx (the only
libsas driver that cares about sata_dev.identify) to setup NCQ
parameters before the device becomes known to the midlayer.  Replace
this double notification and intervening 'identify' with an explicit
->lldd_ata_set_dmamode notification.  With this change all ata internal
commands are issued by libata, so we no longer need sas_issue_ata_cmd().

The data from the identify command only needs to be cached in one
location so ata_device.id replaces domain_device.sata_dev.identify.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:55:42 -06:00
Dan Williams 87c8331fcf [SCSI] libsas: prevent domain rediscovery competing with ata error handling
libata error handling provides for a timeout for link recovery.  libsas
must not rescan for previously known devices in this interval otherwise
it may remove a device that is simply waiting for its link to recover.
Let libata-eh make the determination of when the link is stable and
prevent libsas (host workqueue) from taking action while this
determination is pending.

Using a mutex (ha->disco_mutex) to flush and disable revalidation while
eh is running requires any discovery action that may block on eh be
moved to its own context outside the lock.  Probing ATA devices
explicitly waits on ata-eh and the cache-flush-io issued during device
removal may also pend awaiting eh completion.  Essentially any rphy
add/remove activity needs to run outside the lock.

This adds two new cleanup states for sas_unregister_domain_devices()
'allocated-but-not-probed', and 'flagged-for-destruction'.  In the
'allocated-but-not-probed' state  dev->rphy points to a rphy that is
known to have not been through a sas_rphy_add() event.  At domain
teardown check if this device is still pending probe and cleanup
accordingly.  Similarly if a device has already been queued for removal
then sas_unregister_domain_devices has nothing to do.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:52:34 -06:00
Dan Williams e139942d77 [SCSI] libsas: convert dev->gone to flags
In preparation for adding tracking of another device state "destroy".

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:51:23 -06:00
Dan Williams 312d3e5611 [SCSI] libsas: remove ata_port.lock management duties from lldds
Each libsas driver (mvsas, pm8001, and isci) has invented a different
method for managing the ap->lock.  The lock is held by the ata
->queuecommand() path.  mvsas drops it prior to acquiring any internal
locks which allows it to hold its internal lock across calls to
task->task_done().  This capability is important as it is the only way
the driver can flush task->task_done() instances to guarantee that it no
longer has any in-flight references to a domain_device at
->lldd_dev_gone() time.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:50:12 -06:00
Dan Williams b1124cd3ec [SCSI] libsas: introduce sas_drain_work()
When an lldd invokes ->notify_port_event() it can trigger a chain of libsas
events to:

  1/ form the port and find the direct attached device

  2/ if the attached device is an expander perform domain discovery

A call to flush_workqueue() will only flush the initial port formation work.
Currently libsas users need to call scsi_flush_work() up to the max depth of
chain (which will grow from 2 to 3 when ata discovery is moved to its own
discovery event).  Instead of open coding multiple calls switch to use
drain_workqueue() to flush sas work.

drain_workqueue() does not handle new work submitted during the drain so
libsas needs a bit of infrastructure to hold off unchained work submissions
while a drain is in flight.  A lldd ->notify() event is considered 'unchained'
while a sas_discover_event() is 'chained'.  As Tejun notes:

  "For now, I think it would be best to add private wrapper in libsas to
   support deferring unchained work items while draining."

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:48:51 -06:00
Dan Williams f8daa6e6d8 [SCSI] libsas: convert ha->state to flags
In preparation for adding new states (SAS_HA_DRAINING, SAS_HA_FROZEN),
convert ha->state into a set of flags.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:47:29 -06:00
Dan Williams b15ebe0b5d [SCSI] libsas: replace event locks with atomic bitops
The locks only served to make sure the pending event bitmask was updated
consistently.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:41:04 -06:00
Dan Williams 756f173fb5 [SCSI] libsas: fix leak of dev->sata_dev.identify_[packet_]device
These are never freed in the nominal path.  A domain_device has a
different lifetime than a sas_rphy we need a dev->rphy independent way
of identifying sata devices.

Reviewed-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:39:36 -06:00
Dan Williams 735f7d2fed [SCSI] libsas: fix domain_device leak
Arrange for the deallocation of a struct domain_device object when it no
longer has:
1/ any children
2/ references by any scsi_targets
3/ references by a lldd

The comment about domain_device lifetime in
Documentation/scsi/libsas.txt is stale as it appears mainline never had
a version of a struct domain_device that was registered as a kobject.
We now manage domain_device reference counts on behalf of external
agents.

Reviewed-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:37:47 -06:00
Dan Williams 6f4e75a49f [SCSI] libsas: kill sas_slave_destroy
Per commit 3e4ec344 "libata: kill ATA_FLAG_DISABLED" needing to set
ATA_DEV_NONE is a holdover from before libsas converted to the
"new-style" ata-eh.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:36:36 -06:00
Dan Williams 95ac7fd189 [SCSI] libsas: remove unused ata_task_resp fields
Commit 1e34c838 "[SCSI] libsas: remove spurious sata control register
read/write" removed the routines to fake the presence of the sata
control registers, now remove the unused data structure fields to kill
any remaining confusion.

Acked-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 13:32:33 -06:00
Martin K. Petersen 18a4d0a22e [SCSI] Handle disk devices which can not process medium access commands
We have experienced several devices which fail in a fashion we do not
currently handle gracefully in SCSI. After a failure these devices will
respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.)
but any command accessing the storage medium will time out.

The following patch adds an callback that can be used by upper level
drivers to inspect the results of an error handling command. This in
turn has been used to implement additional checking in the SCSI disk
driver.

If a medium access command fails twice but TEST UNIT READY succeeds both
times in the subsequent error handling we will offline the device. The
maximum number of failed commands required to take a device offline can
be tweaked in sysfs.

Also add a new error flag to scsi_debug which allows this scenario to be
easily reproduced.

[jejb: fix up integer parsing to use kstrtouint]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 10:14:52 -06:00
Andrew Morton a78e21dc5e [SCSI] mpt2sas: spell "primitive" correctly in function prototype
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 10:04:33 -06:00
Paolo Bonzini 4fe74b1cb0 [SCSI] virtio-scsi: SCSI driver for QEMU based virtual machines
The virtio-scsi HBA is the basis of an alternative storage stack
for QEMU-based virtual machines (including KVM).  Compared to
virtio-blk it is more scalable, because it supports many LUNs
on a single PCI slot), more powerful (it more easily supports
passthrough of host devices to the guest) and more easily
extensible (new SCSI features implemented by QEMU should not
require updating the driver in the guest).

Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 09:50:20 -06:00
Tomas Henzl 5a4f934e65 [SCSI] hpsa: add some older controllers to the kdump blacklist
Some other older controllers also do have problems to perform a kdump.
Adding controllers to this list means that the driver will signal
this non-ability via a resettable flag correctly.
The unsupported list was created after a consultation with HP.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 09:40:51 -06:00
Mike Snitzer 47ac56db13 [SCSI] scsi_error: classify some ILLEGAL_REQUEST sense as a permanent TARGET_ERROR
Permanent target failures are non-retryable and should be classified as
TARGET_ERROR; otherwise dm-multipath will retry an IO request that will
always fail at the target.

A SCSI command that fails with ILLEGAL_REQUEST sense and Additional
sense 0x20, 0x21, 0x24 or 0x26 represents a permanent TARGET_ERROR.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 09:39:59 -06:00
Martin K. Petersen 89730393f2 [SCSI] sd: Make sure provisioning mode is reported correctly
The provisioning_mode parameter in sysfs did not get updated in the
SD_LBP_DISABLE case. Make sure the provisioning mode is always set
correctly.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 09:39:09 -06:00
Martin K. Petersen 66a651aa7a [SCSI] Ensure discard failure gets treated as a target problem
The error reported up the stack for a discard failure did not clearly
indicate that the command was processed and subsequently failed by the
target device.

Return -EREMOTEIO so multipathing does not classify this condition as a
path failure.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 09:38:01 -06:00
Tomas Henzl c834b1c4ec [SCSI] mpt2sas: add missing allocation check
The __get_free_pages can fail, so the return value should be checked.
Spotted thanks to Stanislaw.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 09:37:10 -06:00
Vikas Chaudhary cf3059a129 [SCSI] qla4xxx: Update driver version to 5.02.00-k14
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 09:36:01 -06:00
Vikas Chaudhary c0b9d3f750 [SCSI] qla4xxx: Added ping support
Added ping support for network connection diagnostics.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 09:35:23 -06:00
Vikas Chaudhary ac20c7bf07 [SCSI] iscsi_transport: Added Ping support
Added ping support for iscsi adapter, application can use this
interface for diagnostic network connection.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 09:34:50 -06:00