Some kernel transport drivers unconditionally disable
retrieval of the Caching mode page. One such for example is
the BBB/CBI transport over USB. Such a restraint is too
harsh as some devices do support the Caching mode
page. Unconditionally enabling the retrieval of this mode
page over those transports at their transport code level may
result in some devices failing and becoming unusable.
This patch implements a method of retrieving the Caching
mode page without unconditionally enabling it in the
transports which unconditionally disable it. The idea is to
ask for all supported pages, page code 0x3F, and then search
for the Caching mode page in the mode parameter data
returned. The sd driver already asks for all the mode pages
supported by the attached device by setting the page code to
0x3F in order to find out if the media is write protected by
reading the WP bit in the Device Specific Parameter
field. It then attempts to retrieve only the Caching mode
page by setting the page code to 8 and actually attempting
to retrieve it if and only if the transport allows it.
The method implemented here is that if the transport doesn't
allow retrieval of the Caching mode page and the device is
not RBC, then we ask for all pages supported by setting the
page code to 0x3F (similarly to how the WP bit is retrieved
above), and then we search for the Caching mode page in the
mode parameter data returned.
With this patch, devices over SATA, report this (no change):
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Smart devices report their Caching mode page. This is a
change where we'd previously see the kernel making
assumption about the device's cache being write-through:
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: Attached scsi generic sg2 type 0
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] 610472646 4096-byte logical blocks: (2.50 TB/2.27 TiB)
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Write Protect is off
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Mode Sense: 47 00 10 08
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
And "dumb" devices over BBB, are correctly shown not to
support reporting the Caching mode page:
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] 15663104 512-byte logical blocks: (8.01 GB/7.46 GiB)
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Write Protect is off
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Mode Sense: 23 00 00 00
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] No Caching mode page present
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Assuming drive cache: write through
Version 2 adds this:
Some devices don't support page code 0x3F, and others require a
fixed transfer length of 192 bytes. This single commit includes a
patch by Alan Stern which fixes this.
Reported-and-tested-by: Richard Senior <richard@r-senior.demon.co.uk>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
The block layer discard alignment is reported in bytes, not in units of
the logical block size.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
SBC3r26 contains many changes to the Logical Block Provisioning
interfaces (formerly known as Thin Provisioning ditto). This patch
implements support for both the old and new schemes using the same
heuristic as before (whether the LBP VPD page is present).
The new code also allows the provisioning mode (i.e. choice of command)
to be overridden on a per-device basis via sysfs. Two additional modes
are supported in this version:
- WRITE SAME(10) with the UNMAP bit set
- WRITE SAME(10) without the UNMAP bit set. This allows us to support
devices that predate the TP/LBP enhancements in SBC3 and which work
by way zero-detection
Switching between modes has been consolidated in a helper function that
also updates the block layer topology according to the limitations of
the chosen command.
I experimented with trying WRITE SAME(16) if UNMAP fails, WRITE SAME(10)
if WRITE SAME(16) fails, etc. but found several devices that got
cranky. So for now we'll disable discard if one of the commands
fail. The user still has the option of selecting a different mode in
sysfs.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
SDEV_MEDIA_CHANGE event was first added by commit a341cd0f (SCSI: add
asynchronous event notification API) for SATA AN support and then
extended to cover generic media change events by commit 285e9670
([SCSI] sr,sd: send media state change modification events).
This event was mapped to block device in userland with all properties
stripped to simulate CHANGE event on the block device, which, in turn,
was used to trigger further userspace action on media change.
The recent addition of disk event framework kept this event for
backward compatibility but it turns out to be unnecessary and causes
erratic and inefficient behavior. The new disk event generates proper
events on the block devices and the compat events are mapped to block
device with all properties stripped, so the block device ends up
generating multiple duplicate events for single actual event.
This patch removes the compat event generation from both sr and sd as
suggested by Kay Sievers. Both existing and newer versions of udev
and the associated tools will behave better with the removal of these
events as they from the beginning were expecting events on the block
devices.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Replace sd_media_change() with sd_check_events().
* Move media removed logic into set_media_not_present() and
media_not_present() and set sdev->changed iff an existing media is
removed or the device indicates UNIT_ATTENTION.
* Make sd_check_events() sets sdev->changed if previously missing
media becomes present.
* Event is reported only if sdev->changed is set.
This makes media presence event reported if scsi_disk->media_present
actually changed or the device indicated UNIT_ATTENTION. For backward
compatibility, SDEV_EVT_MEDIA_CHANGE is generated each time
sd_check_events() detects media change event.
[jejb: fix boot failure]
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block: (43 commits)
block: ensure that completion error gets properly traced
blktrace: add missing probe argument to block_bio_complete
block cfq: don't use atomic_t for cfq_group
block cfq: don't use atomic_t for cfq_queue
block: trace event block fix unassigned field
block: add internal hd part table references
block: fix accounting bug on cross partition merges
kref: add kref_test_and_get
bio-integrity: mark kintegrityd_wq highpri and CPU intensive
block: make kblockd_workqueue smarter
Revert "sd: implement sd_check_events()"
block: Clean up exit_io_context() source code.
Fix compile warnings due to missing removal of a 'ret' variable
fs/block: type signature of major_to_index(int) to major_to_index(unsigned)
block: convert !IS_ERR(p) && p to !IS_ERR_NOR_NULL(p)
cfq-iosched: don't check cfqg in choose_service_tree()
fs/splice: Pull buf->ops->confirm() from splice_from_pipe actors
cdrom: export cdrom_check_events()
sd: implement sd_check_events()
sr: implement sr_check_events()
...
Our current handling of medium error assumes that data is returned up
to the bad sector. This assumption holds good for all disk devices,
all DIF arrays and most ordinary arrays. However, an LSI array engine
was recently discovered which reports a medium error without returning
any data. This means that when we report good data up to the medium
error, we've reported junk originally in the buffer as good. Worse,
if the read consists of requested data plus a readahead, and the error
occurs in readahead, we'll just strip off the readahead and report
junk up to userspace as good data with no error.
The fix for this is to have the error position computation take into
account the amount of data returned by the driver using the scsi
residual data. Unfortunately, not every driver fills in this data,
but for those who don't, it's set to zero, which means we'll think a
full set of data was transferred and the behaviour will be identical
to the prior behaviour of the code (believe the buffer up to the error
sector). All modern drivers seem to set the residual, so that should
fix up the LSI failure/corruption case.
Reported-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This reverts commit c8d2e93735.
We run into merging problems with the SCSI tree, revert this one
so it can be handled by a postmerge tree there.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Some kernel transport drivers unconditionally disable
retrieval of the Caching mode page. One such for example is
the BBB/CBI transport over USB. Such a restraint is too
harsh as some devices do support the Caching mode
page. Unconditionally enabling the retrieval of this mode
page over those transports at their transport code level may
result in some devices failing and becoming unusable.
This patch implements a method of retrieving the Caching
mode page without unconditionally enabling it in the
transports which unconditionally disable it. The idea is to
ask for all supported pages, page code 0x3F, and then search
for the Caching mode page in the mode parameter data
returned. The sd driver already asks for all the mode pages
supported by the attached device by setting the page code to
0x3F in order to find out if the media is write protected by
reading the WP bit in the Device Specific Parameter
field. It then attempts to retrieve only the Caching mode
page by setting the page code to 8 and actually attempting
to retrieve it if and only if the transport allows it.
The method implemented here is that if the transport doesn't
allow retrieval of the Caching mode page and the device is
not RBC, then we ask for all pages supported by setting the
page code to 0x3F (similarly to how the WP bit is retrieved
above), and then we search for the Caching mode page in the
mode parameter data returned.
With this patch, devices over SATA, report this (no change):
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Smart devices report their Caching mode page. This is a
change where we'd previously see the kernel making
assumption about the device's cache being write-through:
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: Attached scsi generic sg2 type 0
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] 610472646 4096-byte logical blocks: (2.50 TB/2.27 TiB)
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Write Protect is off
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Mode Sense: 47 00 10 08
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
And "dumb" devices over BBB, are correctly shown not to
support reporting the Caching mode page:
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] 15663104 512-byte logical blocks: (8.01 GB/7.46 GiB)
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Write Protect is off
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Mode Sense: 23 00 00 00
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] No Caching mode page present
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Assuming drive cache: write through
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch (as1415) improves the formerly incomprehensible logic in
sd_media_changed() (the current code refers to "changed" as a state,
whereas in fact it is a relation between two states). It also adds a
big comment so that everyone can understand what is really going on.
The patch also improves efficiency by not reporting a media change
when no medium was ever present. If no medium was present the last
time we checked and there's still no medium, it's not necessary to
tell the caller that a change occurred. Doing so merely causes the
caller to attempt to revalidate a non-existent disk, which is a waste
of time.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Replace sd_media_change() with sd_check_events(). sd used to set the
changed state whenever the device is not ready, which can cause event
loop while the device is not ready. Media presence handling code is
changed such that the changed state is set iff the media presence
actually changes. UA still always sets the changed state and
NOT_READY always (at least where it used to set ->changed) clears
media presence, so no event is lost.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
The usage of TUR has been confusing involving several different
commits updating different parts over time. Currently, the only
differences between scsi_test_unit_ready() and sr_test_unit_ready()
are,
* scsi_test_unit_ready() also sets sdev->changed on NOT_READY.
* scsi_test_unit_ready() returns 0 if TUR ended with UNIT_ATTENTION or
NOT_READY.
Due to the above two differences, sr is using its own
sr_test_unit_ready(), but sd - the sole user of the above extra
handling - doesn't even need them.
Where scsi_test_unit_ready() is used in sd_media_changed(), the code
is looking for device ready w/ media present state which is true iff
TUR succeeds w/o sense data or UA, and when the device is not ready
for whatever reason sd_media_changed() explicitly marks media as
missing so there's no reason to set sdev->changed automatically from
scsi_test_unit_ready() on NOT_READY.
Drop both special handlings from scsi_test_unit_ready(), which makes
it equivalant to sr_test_unit_ready(), and replace
sr_test_unit_ready() with scsi_test_unit_ready(). Also, drop the
unnecessary explicit NOT_READY check from sd_media_changed().
Checking return value is enough for testing device readiness.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
The big kernel lock has been removed from all these files at some point,
leaving only the #include.
Remove this too as a cleanup.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Create a sysfs entry that reports the negotiated DIX/DIF protection mode
for a SCSI disk. This depends on the protection type the disk is
formatted with as well as the protection capabilities advertised by the
controller.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (141 commits)
USB: mct_u232: fix broken close
USB: gadget: amd5536udc.c: fix error path
USB: imx21-hcd - fix off by one resource size calculation
usb: gadget: fix Kconfig warning
usb: r8a66597-udc: Add processing when USB was removed.
mxc_udc: add workaround for ENGcm09152 for i.MX35
USB: ftdi_sio: add device ids for ScienceScope
USB: musb: AM35x: Workaround for fifo read issue
USB: musb: add musb support for AM35x
USB: AM35x: Add musb support
usb: Fix linker errors with CONFIG_PM=n
USB: ohci-sh - use resource_size instead of defining its own resource_len macro
USB: isp1362-hcd - use resource_size instead of defining its own resource_len macro
USB: isp116x-hcd - use resource_size instead of defining its own resource_len macro
USB: xhci: Fix compile error when CONFIG_PM=n
USB: accept some invalid ep0-maxpacket values
USB: xHCI: PCI power management implementation
USB: xHCI: bus power management implementation
USB: xHCI: port remote wakeup implementation
USB: xHCI: port power management implementation
...
Manually fix up (non-data) conflict: the SCSI merge gad renamed the
'hw_sector_size' member to 'physical_block_size', and the USB tree
brought a new use of it.
I seem to have a knack for digging up buggy usb devices which don't work
with Linux, and I'm crazy enough to try to make them work. So this time a
friend of mine asked me to get an mp4 player (an mp3 player which can play
videos on a small screen) to work with Linux.
It is based on the well known rockbox chipset for which we already have an
unusual devs entries to work around some of its bugs. But this model
comes with an additional twist.
This model chokes on read_capacity_16 calls. Now normally we don't make
those calls, but this model comes with an sdcard slot and when there is no
card in there (and shipped from the factory there is none), it reports a
size of 0. However this time the programmers actually got the
read_capacity_10 response right! So they substract one from the size as
stored internally in the mp3 player before reporting it back, resulting in
an answer of ... 0xffffffff sectors, causing sd.c to try a
read_capacity_16, on which the device crashes.
This patch adds a flag to scsi_device to indicate that a a device cannot
handle read_capacity_16, and when this flag is set if a device reports an
lba of 0xffffffff as answer to a read_capacity_10, assumes it tries to
report a size of 0.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The hw_sector_size variable could overflow if a device reported huge
physical blocks. Switch to the more accurate physical_block_size
terminology and make sure we use an unsigned int to match the range
permitted by READ CAPACITY(16).
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Following a site power outage which re-enabled all the ports on my FC
switches, my system subsequently booted with far too many luns! I had
let it run hoping it would make multi-user. It didn't. :( It hung solid
after exhausting the last sd device, sdzzz, and attempting to create sdaaaa
and beyond. I was unable to get a dump.
Discovered using a 2.6.32.13 based system.
correct this by detecting when the last index is utilized and failing
the sd probe of the device. Patch applies to scsi-misc-2.6.
Signed-off-by: Michael Reed <mdr@sgi.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add support for the Thin Provisioning VPD page and use the TPU and TPWS
bits to switch between UNMAP and WRITE SAME(16) for discards. If no TP
VPD page is present we fall back to old scheme where the max descriptor
count combined with the max lba count are used trigger UNMAP.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
requests. Deprecate barrier. All REQ_HARDBARRIERs are failed with
-EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
blk_queue_flush().
blk_queue_flush() takes combinations of REQ_FLUSH and FUA. If a
device has write cache and can flush it, it should set REQ_FLUSH. If
the device can handle FUA writes, it should also set REQ_FUA.
All blk_queue_ordered() users are converted.
* ORDERED_DRAIN is mapped to 0 which is the default value.
* ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
* ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Cc: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>