Stall error handler if attempting resets/aborts while an rport is blocked.
This avoids device offline scenarios due to errors in the error handler.
Background:
Although the transport is using the scsi_timed_out functionality to
restart the timeout if the rport is blocked, if the timeout has already
fired before the block occurs, the eh handler still runs and can take
the device offline. Ultimately, this window cannot be resolved without
significant work in the error handler thread. Christoph noted the first
level of these issues when he noted the poor error response handling
by the error thread.
We found, under heavy load and error testing, that time window from when
the scsi_times_out() adds the io to the queue to when the scsi_error_handler
gets around to servicing it, can be in the several seconds range. In most
cases, these test conditions are highly unusual, but possible.
As a result, we're stalling the error handler in this race window so that
we can avoid the device_offline transitions.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Misc Bug Fixes:
- Cap MBX_DOWN_LINK command timeout to 60 seconds
- Fix double free of ndlp object
- Don't free mbox structures on error. The completion handlers expect to do so.
- Clear host attention work items when going offline
- Fixed discovery issues in multi-initiator environments.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch corrects a problem in mptfc which can result in targets
being removed after executing an "lsiutil 99" reset of the fibre
channel ports.
The last rescan event was being processed before the setup reset work
due to an inappropriate optimization in the event processing logic.
Every rescan event is now queued for execution and the setup reset
work now executes in the proper sequence.
Signed-off-by: Michael Reed <mdr@sgi.com>
Acked-by: Moore, Eric <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Based upon a conversation I had with LSI's fibre channel firmware guru,
this patch adds another condition under which the driver waits for the
firmware link initialization / target discovery to complete.
Signed-off-by: Michael Reed <mdr@sgi.com>
Acked-by: Moore, Eric <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The hptiop just got merged with a horrible amount of really bad ioctl
code that is against the standards for new scsi drivers. This patch
backs it out (and fixes a small bug where scsi_add_host is called to
early). We can re-add proper APIs once we agree on them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Conflicts:
arch/ia64/hp/sim/simscsi.c
Stylistic differences in two separate fixes for buffer->request_buffer
problem.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Bug fixes for zfcp's erp:
- trigger adapter reopen if do_QDIO fails
- avoid erp deadlock if registration of scsi target or remote port hang
- do not treat as error if exchange port data fails
- decrease timeout for target reset and aborts
- mark unit failed if slave_destroy is called
Additionally some code cleanup was done:
- made some functions void when retval is not of interest
- shortened initialization of zfcp's host_template
- corrected some comments
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Currently it is impossible to reset provided by Qlogic QLA2xxx driver
SCSI devices externally using corresponding sg devices, particularly via
sg_reset utility, because qla2xxx driver in qla2xxx_eh_device_reset()
function checks if the input scsi_cmnd has its private data (CMD_SP())
attached. Then the found pointer isn't used anywhere inside of
qla2xxx_eh_device_reset(). If the RESET request comes from sg device, it
doesn't have such private data.
The attached patch removes check for non-NULL CMD_SP() from
qla2xxx_eh_device_reset(), hence allows to reset QLA2xxx's devices using
corresponding sg devices.
AV: change applies to bus/host reset handlers as well.
Signed-off-by: Vladislav Bolkhovitin <vst@vlnb.net>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ID String and Message fixes
- Fix switch symbolic name registration to match cross-OS values
- Replace printk's with more standard lpfc_printf_log calls
- Make all lpfc_printf_log message numbers unique
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Short bug fixes:
- Fix iocbq list corruption due to missing list_del's in ct handling
- Missing unlock in lpfc_sli_next_iotag()
- Fix initialization of can_queue value
- Differentiate sysfs mailbox errors with different codes.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa:
[ALSA] Don't reject O_RDWR at opening PCM OSS with read/write-only device
[ALSA] snd-emu10k1: Implement support for Audigy 2 ZS [SB0353]
[ALSA] add MAINTAINERS entry for snd-aoa
[ALSA] aoa: platform function gpio: ignore errors from functions that don't exist
[ALSA] make snd-powermac load even when it can't bind the device
[ALSA] aoa: fix toonie codec
[ALSA] aoa: feature gpio layer: fix IRQ access
[ALSA] Conversions from kmalloc+memset to k(z|c)alloc
[ALSA] snd-emu10k1: Fixes ALSA bug#2190
While busy-waiting for completion, check the hardware after scheduling;
don't schedule and then immediately check the _timeout_. If the yield()
took a long time (as it does on my OLPC prototype board when it's busy),
we'd report a timeout even though the hardware was now ready.
This fixes it, and also switches the yield() for a cond_resched() because
we don't actually want to be _that_ nice about it. I see nice
tightly-packed SMBus transactions now, rather than waiting for milliseconds
between successive phases.
Actually, we shouldn't be busy-waiting here at all. We should be using
interrupts. That's an exercise for another day though.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: Christer Weinigel <wingel@nano-system.com>
Cc: <Jordan.Crouse@amd.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I saw an oops down this path when trying to create a new file on a UDF
filesystem which was internally marked as readonly, but mounted rw:
udf_create
udf_new_inode
new_inode
alloc_inode
udf_alloc_inode
udf_new_block
returns EIO due to readonlyness
iput (on error)
udf_put_inode
udf_discard_prealloc
udf_next_aext
udf_current_aext
udf_get_fileshortad
OOPS
the udf_discard_prealloc() path was examining uninitialized fields of the
udf inode.
udf_discard_prealloc() already has this code to short-circuit the discard
path if no extents are preallocated:
if (UDF_I_ALLOCTYPE(inode) == ICBTAG_FLAG_AD_IN_ICB ||
inode->i_size == UDF_I_LENEXTENTS(inode))
{
return;
}
so if we initialize UDF_I_LENEXTENTS(inode) = 0 earlier in udf_new_inode,
we won't try to free the (not) preallocated blocks, since this will match
the i_size = 0 set when the inode was initialized.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>