Commit Graph

49 Commits

Author SHA1 Message Date
Dan Williams 2955b47d2c [SCSI] async: introduce 'async_domain' type
This is in preparation for teaching async_synchronize_full() to sync all
pending async work, and not just on the async_running domain.  This
conversion is functionally equivalent, just embedding the existing list
in a new async_domain type.

The .registered attribute is used in a later patch to distinguish
between domains that want to be flushed by async_synchronize_full()
versus those that only expect async_synchronize_{full|cookie}_domain to
be used for flushing.

[jejb: add async.h to scsi_priv.h for struct async_domain]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Tested-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 09:05:54 +01:00
Bart Van Assche 84feb1664e [SCSI] Change return type of scsi_queue_insert() into void
The return value of scsi_queue_insert() is ignored by all its
callers, hence change the return type of this function into
void.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:41 +01:00
Bart Van Assche 67bd941300 [SCSI] Fix device removal NULL pointer dereference
Use blk_queue_dead() to test whether the queue is dead instead
of !sdev. Since scsi_prep_fn() may be invoked concurrently with
__scsi_remove_device(), keep the queuedata (sdev) pointer in
__scsi_remove_device(). This patch fixes a kernel oops that
can be triggered by USB device removal. See also
http://www.spinics.net/lists/linux-scsi/msg56254.html.

Other changes included in this patch:
- Swap the blk_cleanup_queue() and kfree() calls in
  scsi_host_dev_release() to make that code easier to grasp.
- Remove the queue dead check from scsi_run_queue() since the
  queue state can change anyway at any point in that function
  where the queue lock is not held.
- Remove the queue dead check from the start of scsi_request_fn()
  since it is redundant with the scsi_device_online() check.

Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:40 +01:00
Mike Christie 5d9fb5cc1b [SCSI] core, classes, mpt2sas: have scsi_internal_device_unblock take new state
This has scsi_internal_device_unblock/scsi_target_unblock take
the new state to set the devices as an argument instead of
always setting to running. The patch also converts users of these
functions.

This allows the FC and iSCSI class to transition devices from blocked
to transport-offline, so that when fast_io_fail/replacement_timeout
has fired we do not set the devices back to running. Instead, we
set them to SDEV_TRANSPORT_OFFLINE.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:22 +01:00
Dan Williams a7a20d1039 [SCSI] sd: limit the scope of the async probe domain
sd injects and synchronizes probe work on the global kernel-wide domain.
This runs into conflict with PM that wants to perform resume actions in
async context:

[  494.237079] INFO: task kworker/u:3:554 blocked for more than 120 seconds.
[  494.294396] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  494.360809] kworker/u:3     D 0000000000000000     0   554      2 0x00000000
[  494.420739]  ffff88012e4d3af0 0000000000000046 ffff88013200c160 ffff88012e4d3fd8
[  494.484392]  ffff88012e4d3fd8 0000000000012500 ffff8801394ea0b0 ffff88013200c160
[  494.548038]  ffff88012e4d3ae0 00000000000001e3 ffffffff81a249e0 ffff8801321c5398
[  494.611685] Call Trace:
[  494.632649]  [<ffffffff8149dd25>] schedule+0x5a/0x5c
[  494.674687]  [<ffffffff8104b968>] async_synchronize_cookie_domain+0xb6/0x112
[  494.734177]  [<ffffffff810461ff>] ? __init_waitqueue_head+0x50/0x50
[  494.787134]  [<ffffffff8131a224>] ? scsi_remove_target+0x48/0x48
[  494.837900]  [<ffffffff8104b9d9>] async_synchronize_cookie+0x15/0x17
[  494.891567]  [<ffffffff8104ba49>] async_synchronize_full+0x54/0x70  <-- here we wait for async contexts to complete
[  494.943783]  [<ffffffff8104b9f5>] ? async_synchronize_full_domain+0x1a/0x1a
[  495.002547]  [<ffffffffa00114b1>] sd_remove+0x2c/0xa2 [sd_mod]
[  495.051861]  [<ffffffff812fe94f>] __device_release_driver+0x86/0xcf
[  495.104807]  [<ffffffff812fe9bd>] device_release_driver+0x25/0x32  <-- here we take device_lock()

[  853.511341] INFO: task kworker/u:4:549 blocked for more than 120 seconds.
[  853.568693] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  853.635119] kworker/u:4     D ffff88013097b5d0     0   549      2 0x00000000
[  853.695129]  ffff880132773c40 0000000000000046 ffff880130790000 ffff880132773fd8
[  853.758990]  ffff880132773fd8 0000000000012500 ffff88013288a0b0 ffff880130790000
[  853.822796]  0000000000000246 0000000000000040 ffff88013097b5c8 ffff880130790000
[  853.886633] Call Trace:
[  853.907631]  [<ffffffff8149dd25>] schedule+0x5a/0x5c
[  853.949670]  [<ffffffff8149cc44>] __mutex_lock_common+0x220/0x351
[  854.001225]  [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4
[  854.049082]  [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4
[  854.097011]  [<ffffffff8149ce48>] mutex_lock_nested+0x2f/0x36   <-- here we wait for device_lock()
[  854.145591]  [<ffffffff81304bd7>] device_resume+0x58/0x1c4
[  854.192066]  [<ffffffff81304d61>] async_resume+0x1e/0x45
[  854.237019]  [<ffffffff8104bc93>] async_run_entry_fn+0xc6/0x173  <-- ...while running in async context

Provide a 'scsi_sd_probe_domain' so that async probe actions actions can
be flushed without regard for the state of PM, and allow for the resume
path to handle devices that have transitioned from SDEV_QUIESCE to
SDEV_DEL prior to resume.

Acked-by: Alan Stern <stern@rowland.harvard.edu>
[alan: uplevel scsi_sd_probe_domain, clarify scsi_device_resume]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
[jejb: remove unneeded config guards in include file]
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-17 09:10:46 +01:00
Alan Stern fea6d607e1 [SCSI] scsi_pm: Fix bug in the SCSI power management handler
This patch (as1520) fixes a bug in the SCSI layer's power management
implementation.

LUN scanning can be carried out asynchronously in do_scan_async(), and
sd uses an asynchronous thread for the time-consuming parts of disk
probing in sd_probe_async().  Currently nothing coordinates these
async threads with system sleep transitions; they can and do attempt
to continue scanning/probing SCSI devices even after the host adapter
has been suspended.  As one might expect, the outcome is not ideal.

This is what the "prepare" stage of system suspend was created for.
After the prepare callback has been called for a host, target, or
device, drivers are not allowed to register any children underneath
them.  Currently the SCSI prepare callback is not implemented; this
patch rectifies that omission.

For SCSI hosts, the prepare routine calls scsi_complete_async_scans()
to wait until async scanning is finished.  It might be slightly more
efficient to wait only until the host in question has been scanned,
but there's currently no way to do that.  Besides, during a sleep
transition we will ultimately have to wait until all the host scanning
has finished anyway.

For SCSI devices, the prepare routine calls async_synchronize_full()
to wait until sd probing is finished.  The routine does nothing for
SCSI targets, because asynchronous target scanning is done only as
part of host scanning.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: <stable@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-18 08:54:19 -06:00
Moger, Babu 2b132577a0 [SCSI] scsi_dh: code cleanup and remove the references to scsi_dev_info
All the handlers have now implemented the match function so We don't need to
use scsi_dev_info any more for matching purposes.

Signed-off-by: Babu Moger <babu.moger@netapp.com>
Acked-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-12-15 10:55:00 +04:00
Linus Torvalds c55d267de2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (170 commits)
  [SCSI] scsi_dh_rdac: Add MD36xxf into device list
  [SCSI] scsi_debug: add consecutive medium errors
  [SCSI] libsas: fix ata list corruption issue
  [SCSI] hpsa: export resettable host attribute
  [SCSI] hpsa: move device attributes to avoid forward declarations
  [SCSI] scsi_debug: Logical Block Provisioning (SBC3r26)
  [SCSI] sd: Logical Block Provisioning update
  [SCSI] Include protection operation in SCSI command trace
  [SCSI] hpsa: fix incorrect PCI IDs and add two new ones (2nd try)
  [SCSI] target: Fix volume size misreporting for volumes > 2TB
  [SCSI] bnx2fc: Broadcom FCoE offload driver
  [SCSI] fcoe: fix broken fcoe interface reset
  [SCSI] fcoe: precedence bug in fcoe_filter_frames()
  [SCSI] libfcoe: Remove stale fcoe-netdev entries
  [SCSI] libfcoe: Move FCOE_MTU definition from fcoe.h to libfcoe.h
  [SCSI] libfc: introduce __fc_fill_fc_hdr that accepts fc_hdr as an argument
  [SCSI] fcoe, libfc: initialize EM anchors list and then update npiv EMs
  [SCSI] Revert "[SCSI] libfc: fix exchange being deleted when the abort itself is timed out"
  [SCSI] libfc: Fixing a memory leak when destroying an interface
  [SCSI] megaraid_sas: Version and Changelog update
  ...

Fix up trivial conflicts due to whitespace differences in
drivers/scsi/libsas/{sas_ata.c,sas_scsi_host.c}
2011-03-17 17:54:40 -07:00
Rafael J. Wysocki aa33860158 PM: Remove CONFIG_PM_OPS
After redefining CONFIG_PM to depend on (CONFIG_PM_SLEEP ||
CONFIG_PM_RUNTIME) the CONFIG_PM_OPS option is redundant and can be
replaced with CONFIG_PM.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-15 00:43:15 +01:00
Peter Jones 940d7faa48 [SCSI] scsi_dh: Use scsi_devinfo functions to do matching of device_handler tables.
Previously we were using strncmp in order to avoid having to include
whitespace in the devlist, but this means "HSV1000" matches a device
list entry that says "HSV100", which is wrong.  This patch changes
scsi_dh.c to use scsi_devinfo's matching functions instead, since they
handle these cases correctly.

Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-01-24 12:02:09 -06:00
Peter Jones 38a039be2e [SCSI] Add scsi_dev_info_list_del_keyed()
For scsi_dh.c to use devinfo lists, we have to be able to remove entries
before rmmod.

Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-01-24 12:01:07 -06:00
Alan Stern e6da54d84f SCSI: remove fake "address-of" expression
Fake "address-of" expressions that evaluate to NULL generally confuse
readers and can provoke compiler warnings.  This patch (as1411) removes
one such fake expression, using an "#ifdef" in its place.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-06 09:17:02 -07:00
Alan Stern bc4f24014d [SCSI] implement runtime Power Management
This patch (as1398b) adds runtime PM support to the SCSI layer.  Only
the machanism is provided; use of it is up to the various high-level
drivers, and the patch doesn't change any of them.  Except for sg --
the patch expicitly prevents a device from being runtime-suspended
while its sg device file is open.

The implementation is simplistic.  In general, hosts and targets are
automatically suspended when all their children are asleep, but for
them the runtime-suspend code doesn't actually do anything.  (A host's
runtime PM status is propagated up the device tree, though, so a
runtime-PM-aware lower-level driver could power down the host adapter
hardware at the appropriate times.)  There are comments indicating
where a transport class might be notified or some other hooks added.

LUNs are runtime-suspended by calling the drivers' existing suspend
handlers (and likewise for runtime-resume).  Somewhat arbitrarily, the
implementation delays for 100 ms before suspending an eligible LUN.
This is because there typically are occasions during bootup when the
same device file is opened and closed several times in quick
succession.

The way this all works is that the SCSI core increments a device's
PM-usage count when it is registered.  If a high-level driver does
nothing then the device will not be eligible for runtime-suspend
because of the elevated usage count.  If a high-level driver wants to
use runtime PM then it can call scsi_autopm_put_device() in its probe
routine to decrement the usage count and scsi_autopm_get_device() in
its remove routine to restore the original count.

Hosts, targets, and LUNs are not suspended while they are being probed
or removed, or while the error handler is running.  In fact, a fairly
large part of the patch consists of code to make sure that things
aren't suspended at such times.

[jejb: fix up compile issues in PM config variations]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:07:50 -05:00
Alan Stern db5bd1e0b5 [SCSI] convert to the new PM framework
This patch (as1397b) converts the SCSI midlayer to use the new PM
callbacks (struct dev_pm_ops).  A new source file, scsi_pm.c, is
created to hold the new callback routines, and the existing
suspend/resume code is moved there.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:07:49 -05:00
David Brownell a4dbd6740d driver model: constify attribute groups
Let attribute group vectors be declared "const".  We'd
like to let most attribute metadata live in read-only
sections... this is a start.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-15 09:50:47 -07:00
Hannes Reinecke b391277a56 sd, sr: fix Driver 'sd' needs updating message
If a SCSI ULD driver sets blk_queue_prep_rq(), it should clean it
up itself on remove(), and not from the bus callbacks. This
removes the need to hook into bus->remove(), which should not
be used at the same time as driver->remove().

[jejb: fix sdkp initialisation problem due to mismerge]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-21 12:01:27 -05:00
James Bottomley a9e0edb687 scsi_transport_spi: Blacklist Ultrium-3 tape for IU transfers
There have been several bug reports which identified the Ultrium-3
tape as just hanging up on the bus during certain types of IU
transfer.  The identified culpret is type 0x02 (MULTIPLE COMMAND)
transfers.  The only way to prevent this tape wedging is to prevent it
from using IU transfers at all.  So this patch uses the exported
blacklist matching technology to recognise the drive and force it not
to use IU transfers.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-21 10:52:46 -05:00
James Bottomley 598fa4b775 enhance device info matching for multiple tables
The current scsi_devinfo.c matching routines use a single table for
the global blacklist.  However, we're developing a need to blacklist
from specific transports too (notably some tape drives using SPI which
don't respond well to high speed protocols).  Instead of developing
separate blacklist matching for each transport class needing it,
enhance the current list matching to permit multiple lists.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-21 10:52:45 -05:00
Rafael J. Wysocki c751085943 PM/Hibernate: Wait for SCSI devices scan to complete during resume
There is a race between resume from hibernation and the asynchronous
scanning of SCSI devices and to prevent it from happening we need to
call scsi_complete_async_scans() during resume from hibernation.

In addition, if the resume from hibernation is userland-driven, it's
better to wait for all device probes in the kernel to complete before
attempting to open the resume device.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-13 11:37:07 -07:00
Mike Christie 4a27446f3e [SCSI] modify scsi to handle new fail fast flags.
This checks the errors the scsi-ml determined were retryable
and returns if we should fast fail it based on the request
fail fast flags.

Without the patch, drivers like lpfc, qla2xxx and fcoe would return
DID_ERROR for what it determines is a temporary communication problem.
There is no loss of connectivity at that time and the driver thinks
that it would be fast to retry at the driver level. SCSI-ml will however
sees fast fail on the request and DID_ERROR and will fast fail the io.
This will then cause dm-multipath to fail the path and possibley switch
target controllers when we should be retrying at the scsi layer.

We also were fast failing device errors to dm multiapth when
unless the scsi_dh modules think otherwis we want to retry at
the scsi layer because multipath can only retry the IO like scsi
should have done. multipath is a little dumber though because it
does not what the error was for and assumes that it should fail
the paths.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-10-13 09:28:52 -04:00
Jens Axboe 242f9dcb8b block: unify request timeout handling
Right now SCSI and others do their own command timeout handling.
Move those bits to the block layer.

Instead of having a timer per command, we try to be a bit more clever
and simply have one per-queue. This avoids the overhead of having to
tear down and setup a timer for each command, so it will result in a lot
less timer fiddling.

Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:13 +02:00
Martin K. Petersen 7027ad72a6 [SCSI] Support devices with protection information
Implement support for DMA of protection information for devices that
are data integrity capable.

 - Add support for mapping an extra scatter-gather list containing
   the protection information.

 - Allocate protection scsi_data_buffer if host is DIX (integrity DMA)
   capable.

 - Accessor function for checking whether a device has protection
   enabled.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:55 -04:00
Hannes Reinecke f7120a4f75 [SCSI] use default attributes for scsi_host
This patch removes the unused sysfs attibute overwriting logic for
the scsi host attibutes, and plugs them into the driver core default
attribute creation.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-22 15:16:31 -05:00
Linus Torvalds 7b3d9545f9 Revert "scsi: revert "[SCSI] Get rid of scsi_cmnd->done""
This reverts commit ac40532ef0, which gets
us back the original cleanup of 6f5391c283.

It turns out that the bug that was triggered by that commit was
apparently not actually triggered by that commit at all, and just the
testing conditions had changed enough to make it appear to be due to it.

The real problem seems to have been found by Peter Osterlund:

  "pktcdvd sets it [block device size] when opening the /dev/pktcdvd
   device, but when the drive is later opened as /dev/scd0, there is
   nothing that sets it back.  (Btw, 40944 is possible if the disk is a
   CDRW that was formatted with "cdrwtool -m 10236".)

   The problem is that pktcdvd opens the cd device in non-blocking mode
   when pktsetup is run, and doesn't close it again until pktsetup -d is
   run.  The effect is that if you meanwhile open the cd device,
   blkdev.c:do_open() doesn't call bd_set_size() because
   bdev->bd_openers is non-zero."

In particular, to repeat the bug (regardless of whether commit
6f5391c283 is applied or not):

  " 1. Start with an empty drive.
    2. pktsetup 0 /dev/scd0
    3. Insert a CD containing an isofs filesystem.
    4. mount /dev/pktcdvd/0 /mnt/tmp
    5. umount /mnt/tmp
    6. Press the eject button.
    7. Insert a DVD containing a non-writable filesystem.
    8. mount /dev/scd0 /mnt/tmp
    9. find /mnt/tmp -type f -print0 | xargs -0 sha1sum >/dev/null
    10. If the DVD contains data beyond the physical size of a CD, you
        get I/O errors in the terminal, and dmesg reports lots of
        "attempt to access beyond end of device" errors."

which in turn is because the nested open after the media change won't
cause the size to be set properly (because the original open still holds
the block device, and we only do the bd_set_size() when we don't have
other people holding the device open).

The proper fix for that is probably to just do something like

	bdev->bd_inode->i_size = (loff_t)get_capacity(disk)<<9;

in fs/block_dev.c:do_open() even for the cases where we're not the
original opener (but *not* call bd_set_size(), since that will also
change the block size of the device).

Cc: Peter Osterlund <petero2@telia.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-06 10:17:12 -08:00
Linus Torvalds 3a62b5f3cd Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] scsi_sysfs: restore prep_fn when ULD is removed
2008-01-03 11:59:27 -08:00