The current sas_scsi_clear_queue_lu() is wrongly checking for commands
which match the pointer to the one passed in. It should be checking for
commands which are on the same logical unit as the one passed in. Fix
this by checking target pointer and LUN for equality.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The clear nexus I_T and clear nexus I_T_L functions in the aic94xx
specify the SUSPEND_TX flag which causes the sequencer to be suspended
until it receives a RESUME_TX. Unfortunately, nothing ever sends the
resume, so the sequencer on the link is stopped forever, leading to
eventual timeouts and I/O errors.
Since clear nexus commands are only executed as part of error recovery,
it's perfectly fine to keep the sequencer running on the link ... as
soon as the recovery function is completed, we'll send it the commands
to retry.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Remove the now useless counting of adjacent pages from the debugging code in
to make it compile when DEBUG is set non-zero.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
stex_internal_copy copies an in-kernel buffer to a sg list by using
scsi_kmap_atomic_sg. Some functions calls stex_internal_copy with
sg_count in struct st_ccb, which is the value that dma_map_sg
returned. However it might be shorter than the actual number of sg
entries (if the IOMMU merged the sg entries).
scsi_kmap_atomic_sg doesn't see sg->dma_length so stex_internal_copy
should be called with the actual number of sg entries
(i.e. scsi_sg_count), because if the sg entries were merged,
stex_direct_copy wrongly think that the data length in the sg list is
shorter than the actual length.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
stex_direct_copy copies an in-kernel buffer to a sg list in order to
spoof some SCSI commands. stex_direct_copy calls dma_map_sg and then
stex_internal_copy with the value that dma_map_sg returned. It calls
scsi_kmap_atomic_sg to copy data.
scsi_kmap_atomic_sg doesn't see sg->dma_length so if dma_map_sg merges
sg entries, stex_internal_copy gets the smaller number of sg entries
than the acutual number, which means it wrongly think that the data
length in the sg list is shorter than the actual length.
stex_direct_copy shouldn't call dma_map_sg and it doesn't need since
this code path doesn't involve dma transfers. This patch removes
stex_direct_copy and simply calls stex_internal_copy with the actual
number of sg entries.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
the check in the residual case has an incorrect test of scsi_status
(the logic is reversed, it should be scsi_status != 0 instead of
!scsi_status. Since we checked a few lines above that scsi_status was
non-zero, just eliminate this test
Signed-off-by: David C Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The libsas error handler has two fairly fatal bugs
1. scsi_sas_task_done calls scsi_eh_finish_cmd() too early. This
happens if the task completes after it has been aborted but before
the error handler starts up. Because scsi_eh_finish_cmd()
decrements host_failed and adds the task to the done list, the
error handler start check (host_failed == host_busy) never passes
and the eh never starts.
2. The multiple task completion paths sas_scsi_clear_queue_... all
simply delete the task from the error queue. This causes it to
disappear into the ether, since a command must be placed on the
done queue to be finished off by the error handler. This behaviour
causes the HBA to hang on pending commands.
Fix 1. by moving the SAS_TASK_STATE_ABORTED check to an exit clause at
the top of the routine and calling ->scsi_done() unconditionally (it
is a nop if the timer has fired). This keeps the task in the error
handling queue until the eh starts.
Fix 2. by making sure every task goes through task complete followed
by scsi_eh_finish_cmd().
Tested this by firing resets across a disk running a hammer test (now
it actually survives without hanging the system)
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
arcmsr_iop_message_xfer() is called from atomic context under the
queuecommand scsi_host_template handler. James Bottomley pointed out
that the current GFP_KERNEL|GFP_DMA flags are wrong: firstly we are in
atomic context, secondly this memory is not used for DMA.
Also removed some unneeded casts.
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Cc: Nick Cheng <nick.cheng@areca.com.tw>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The spinlock is held over too large a region: pscratch is a permanent
address (it's allocated at boot time and never changes). All you need
the smp lock for is mediating the scratch in use flag, so fix this by
moving the spinlock into the case where we set the pscratch_busy flag
to false.
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
scsi/qla2xxx/qla_dfs.c: In function 'qla2x00_dfs_fce_show':
scsi/qla2xxx/qla_dfs.c:26: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The panic occurs if we get a MSGIN or MSGOUT for an unidentified SCB
(meaning we didn't identify the outstanding command it was for). For
MSGIN this is wrong because it could be an unsolicited negotiation
MSGIN from the target.
Still panic on unsolicited MSGOUT because this would represent a
mistake in the negotiation phases. However, we should fix this as
well. The specs say we should go to bus free for unexpected msgin.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
scsi_debug does at several places:
for_each_sg(sdb->table.sgl, sg, sdb->table.nents, k) {
kaddr = (unsigned char *)
kmap_atomic(sg_page(sg), KM_USER0);
We cannot do something like that with the clustering enabled (or we
can use scsi_kmap_atomic_sg).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Apparently the fix to [SCSI] fas216: Use scsi_eh API for REQUEST_SENSE
invocation didn't show up in the final version sent to linus.
Correct this omission.
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
From conversations with the maintainers the _p isn't needed so kill it.
That removes the last non ISA _p user from the SCSI layer to my knowledge.
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: "Yang, Bo" <Bo.Yang@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This driver has been failing under heavy load with
aic94xx: escb_tasklet_complete: REQ_TASK_ABORT, reason=0x6
aic94xx: escb_tasklet_complete: Can't find task (tc=4) to abort!
The second message is because the driver fails to identify the task
it's being asked to abort. On closer inpection, there's a thinko in
the for each task loop over pending tasks in both the REQ_TASK_ABORT
and REQ_DEVICE_RESET cases where it doesn't look at the task on the
pending list but at the one on the ESCB (which is always NULL).
Fix by looking at the right task. Also add a print for the case where
the pending SCB doesn't have a task attached.
Not sure if this will fix all the problems, but it's a definite first
step.
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
one system: initrd get courrupted:
RAMDISK: Compressed image found at block 0
RAMDISK: incomplete write (-28 != 2048) 134217728
crc error
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 388k freed
init_special_inode: bogus i_mode (177777)
Warning: unable to open an initial console.
init_special_inode: bogus i_mode (177777)
init_special_inode: bogus i_mode (177777)
Kernel panic - not syncing: No init found. Try passing init= option to kernel.
bisected to
commit 9927c68864
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date: Sun Feb 3 15:48:56 2008 -0600
[SCSI] ses: add new Enclosure ULD
changes:
1. change char to unsigned char to avoid type change later.
2. preserve len for page1
3. need to move desc_ptr even the entry is not enclosure_component_device/raid.
so keep desc_ptr on right position
4. record page7 len, and double check if desc_ptr out of boundary before touch.
5. fix typo in subenclosure checking: should use hdr_buf instead.
[jejb: style fixes]
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Fix compilation warning in gdth.c, which was using the deprecated
pci_find_device.
drivers/scsi/gdth.c:645: warning: 'pci_find_device' is deprecated (declared at include/linux/pci.h:495)
Changing it to use pci_get_device, instead.
Signed-off-by: Sergio Luis <sergio@larces.uece.br>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
It looks like there's been a bug in the module parameter setup forever.
The upshot doesn't really matter, because even if no parameters are ever
set, we just call sym53c416_setup() three times, but the zero values in
the arrays eventually cause nothing to happen. Unfortunately gcc has
started to notice this now too:
drivers/scsi/sym53c416.c: In function 'sym53c416_detect':
drivers/scsi/sym53c416.c:624: warning: the address of 'sym53c416' will always evaluate as 'true'
drivers/scsi/sym53c416.c:630: warning: the address of 'sym53c416_1' will always evaluate as 'true'
drivers/scsi/sym53c416.c:636: warning: the address of 'sym53c416_2' will always evaluate as 'true'
drivers/scsi/sym53c416.c:642: warning: the address of 'sym53c416_3' will always evaluate as 'true'
So fix this longstanding bug to keep gcc quiet.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>