Commit Graph

5816 Commits

Author SHA1 Message Date
Anirban Chakraborty
a67093d46e [SCSI] qla2xxx: Obtain proper host structure during response-queue processing.
Original code incorrectly assumed only status-type-0
IOCBs would be queued to the response-queue, and thus all
entries would safely reference a VHA from the IOCB
'handle.'

Cc: stable@kernel.org
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-02-08 13:45:55 -06:00
Xiaotian Feng
0f19bc681e [SCSI] qla2xxx: make msix interrupt handler safe for irq
Yinghai has reported a lockdep warning on qla2xxx:

[   77.965784] WARNING: at kernel/lockdep.c:2332
trace_hardirqs_on_caller+0xc6/0x14b()
[   77.977492] Hardware name: Sun
[   77.979485] Modules linked in:
[   77.994337] Pid: 0, comm: swapper Not tainted
2.6.33-rc4-tip-yh-03949-g3a8e3f5-dirty #64
[   78.000120] Call Trace:
[   78.013298]  <IRQ>  [<ffffffff81076b54>] warn_slowpath_common+0x7c/0x94
[   78.017746]  [<ffffffff81cd712c>] ? _raw_spin_unlock_irq+0x30/0x36
[   78.035171]  [<ffffffff81076b80>] warn_slowpath_null+0x14/0x16
[   78.040152]  [<ffffffff810a2ae8>] trace_hardirqs_on_caller+0xc6/0x14b
[   78.055400]  [<ffffffff810a2b7a>] trace_hardirqs_on+0xd/0xf
[   78.058951]  [<ffffffff81cd712c>] _raw_spin_unlock_irq+0x30/0x36
[   78.074889]  [<ffffffff816461ef>] qla24xx_msix_default+0x243/0x281
[   78.091598]  [<ffffffff810a5752>] ? __lock_release+0xa5/0xae
[   78.096799]  [<ffffffff810c02ae>] handle_IRQ_event+0x53/0x113
[   78.111568]  [<ffffffff810c2061>] handle_edge_irq+0xf3/0x13b
[   78.116255]  [<ffffffff81035109>] handle_irq+0x24/0x2f
[   78.132063]  [<ffffffff81cdc4b4>] do_IRQ+0x5c/0xc3
[   78.134684]  [<ffffffff81cd7393>] ret_from_intr+0x0/0xf
[   78.137903]  <EOI>  [<ffffffff81039a56>] ? mwait_idle+0xaf/0xbb
[   78.155674]  [<ffffffff81039a4d>] ? mwait_idle+0xa6/0xbb
[   78.158600]  [<ffffffff81031c7c>] cpu_idle+0x61/0xa1
[   78.174333]  [<ffffffff81c85d7a>] rest_init+0x7e/0x80
[   78.178122]  [<ffffffff82832d1f>] start_kernel+0x316/0x31d
[   78.193623]  [<ffffffff82832297>] x86_64_start_reservations+0xa7/0xab
[   78.198924]  [<ffffffff8283237f>] x86_64_start_kernel+0xe4/0xeb
[   78.214540] ---[ end trace be4529f30a2e4ef5 ]---

This was happened when qla2xxx msix interrupt handler is trying to enable
IRQs by spin_unlock_irq(). We should make interrupt handler safe for IRQs,
use spin_lock_irqsave/spin_unlock_irqrestore, this will not break the IRQs
status in interrupt handler.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-02-08 13:40:18 -06:00
Hannes Reinecke
534ef056db [SCSI] aic79xx: check for non-NULL scb in ahd_handle_nonpkt_busfree
When removing several devices aic79xx will occasionally Oops
in ahd_handle_nonpkt_busfree during rescan. Looking at the
code I found that we're indeed not checking if the scb in
question is NULL. So check for it before accessing it.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-17 12:48:12 -06:00
Swen Schillig
b8f08645f8 [SCSI] scsi_transport_fc: Allow LLD to reset FC BSG timeout
The hardware used with zfcp cannot abort a currently pending CT or ELS
request. Therefore we need the option to postpone the timeout
triggered request abort within the fc layer, since there is nothing
zfcp can do to stop the request at this point.

Cc: James Smart <James.Smart@emulex.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-17 12:40:11 -06:00
Giridhar Malavali
22c24734ce [SCSI] qla2xxx: Update version number to 8.03.01-k10.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-17 12:36:35 -06:00
Andrew Vasquez
368bbe0777 [SCSI] qla2xxx: Perform fast mailbox read of flash regardless of size nor address alignment.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-17 12:36:31 -06:00
Andrew Vasquez
f08b7251c4 [SCSI] qla2xxx: Correct FCP2 recovery handling.
The driver did not account for non-tape devices needing to employ
proper FCP2 recovery.  Driver now checks the FCP2-capable flag
only, rather than using a midlayer-determined flag (TYPE_TAPE).

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-17 12:36:28 -06:00
Boaz Harrosh
63c43b0ec1 [SCSI] scsi_lib: Fix bug in completion of bidi commands
Because of the terrible structuring of scsi-bidi-commands
it breaks some of the life time rules of a scsi-command.
It is now not allowed to free up the block-request before
cleanup and partial deallocation of the scsi-command. (Which
is not so for none bidi commands)

The right fix to this problem would be to make bidi command
a first citizen by allocating a scsi_sdb pointer at scsi command
just like cmd->prot_sdb. The bidi sdb should be allocated/deallocated
as part of the get/put_command (Again like the prot_sdb) and the
current decoupling of scsi_cmnd and blk-request should be kept.

For now make sure scsi_release_buffers() is called before the
call to blk_end_request_all() which might cause the suicide of
the block requests. At best the leak of bidi buffers, at worse
a crash, as there is a race between the existence of the bidi_request
and the free of the associated bidi_sdb.

The reason this was never hit before is because only OSD has the potential
of doing asynchronous bidi commands. (So does bsg but it is never used)
And OSD clients just happen to do all their bidi commands synchronously, up
until recently.

CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-17 12:16:18 -06:00
Penchala Narasimha Reddy Chilakala, ERS-HCLTech
cacb6dc3d7 [SCSI] aacraid: fix File System going into read-only mode
These particular problems were reported by Cisco and SAP and customers
as well. Cisco reported on RHEL4 U6 and SAP reported on SLES9 SP4 and
SLES10 SP2. We added these fixes on RHEL4 U6 and gave a private build
to IBM and Cisco. Cisco and IBM tested it for more than 15 days and
they reported that they did not see the issue so far. Before the fix,
Cisco used to see the issue within 5 days. We generated a patch for
SLES9 SP4 and SLES10 SP2 and submitted to Novell. Novell applied the
patch and gave a test build to SAP. SAP tested and reported that the
build is working properly.

We also tested in our lab using the tools "dishogsync", which is IO
stress tool and the tool was provided by Cisco.

Issue1:  File System going into read-only mode

Root cause: The driver tends to not free the memory (FIB) when the
management request exits prematurely. The accumulation of such
un-freed memory causes the driver to fail to allocate anymore memory
(FIB) and hence return 0x70000 value to the upper layer, which puts
the file system into read only mode.

Fix details: The fix makes sure to free the memory (FIB) even if the
request exits prematurely hence ensuring the driver wouldn't run out
of memory (FIBs).


Issue2: False Raid Alert occurs

When the Physical Drives and Logical drives are reported as deleted or
added, even though there is no change done on the system

Root cause: Driver IOCTLs is signaled with EINTR while waiting on
response from the lower layers. Returning "EINTR" will never initiate
internal retry.

Fix details: The issue was fixed by replacing "EINTR" with
"ERESTARTSYS" for mid-layer retries.

Signed-off-by: Penchala Narasimha Reddy <ServeRAIDDriver@hcl.in>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-17 12:16:17 -06:00
James Bottomley
e6622df3bb [SCSI] lpfc: fix file permissions
lpfc_hbadisc.c and lpfc_hw4.h accidentally got set executable.

Reported-by: Thomas Backlund <tmb@mandriva.org>
Cc: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-17 12:14:03 -06:00
Bryn M. Reeves
bb7d3f24c7 [SCSI] megaraid_sas: remove sysfs poll_mode_io world writeable permissions
/sys/bus/pci/drivers/megaraid_sas/poll_mode_io defaults to being
world-writable, which seems bad (letting any user affect kernel driver
behavior).

This turns off group and user write permissions, so that on typical
production systems only root can write to it.

Signed-off-by: Bryn M. Reeves <bmr@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-12 21:12:36 -08:00
James Smart
500af638b3 [SCSI] lpfc 8.3.7: Update Driver version to 8.3.7
Update Driver version to 8.3.7

Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-04 11:39:47 -06:00
James Smart
9795724476 [SCSI] lpfc 8.3.7: Fix discovery failures.
Fix discovery failures:
- Move all accesses to the fc_flag field inside the host lock.
- Restore link state after going through linkdown processing for FCF DEAD event.

Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-04 11:39:46 -06:00
James Smart
aacc20e35e [SCSI] lpfc 8.3.7: Fix SCSI protocol related errors.
Fix SCSI protocol related errors:
- Avoid I/O failures during EEH and HBA/CNA reset by correcting when
  we block the targets on the adapter.

Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-04 11:39:45 -06:00
James Smart
def9c7a994 [SCSI] lpfc 8.3.7: Fix hardware/SLI relates issues
Fix hardware/SLI relates issues:
- Fix CNA uses more than one EQ when in INTx interrupt mode.
- Fix driver tries to process failed read FCF record mailbox request.
- Fix allocating single receive buffer breaks FCoE receive queue.
- Support new read FCF record mailbox error case.

Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-04 11:39:44 -06:00
James Smart
1987807d4a [SCSI] lpfc 8.3.7: Fix NPIV operation errors
Fix NPIV operation errors:
- Fix vport not logging out of fabric when being deleted
- Fix vport fails to discover targets after devloss timeout.

Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-04 11:39:43 -06:00
James Smart
eeead81152 [SCSI] lpfc 8.3.7: Fix FC protocol errors
Fix FC protocol errors:
- Fix multi-frame unsolicited sequences not queued properly
- Fix frames for unsolicited sequences not being associated with sequence.
- Fix unsolicited frame buffer sizes are not set properly
- Fix Sequence count for unsolicited frame headers not byte swapped.
- Fix Multi-frame sequence response frames go to wrong DID.

Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-04 11:39:42 -06:00
Ed Lin
91e6ecada7 [SCSI] stex: fix scan of nonexistent lun
During a manual scan, a user can send command to a nonexistent
lun, precisely at the point of max_lun. Normally it's possible
(but not required) that the firmware has the knowledge that it
is an invalid lun. In the particular case when max_lun is 256,
however, the nonexistent lun 256 will be confused with lun 0,
because the lun member in a request message is only u8, and 256
will become 0. So we need to fix the problem, at least, at the
driver level.

Signed-off-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-01-04 11:39:41 -06:00
Anil Ravindranath
a70757ba9a [SCSI] pmcraid: fix to avoid twice scsi_dma_unmap for a command
For a particular driver error condition, driver was doing double
scsi_dma_unmaps. Driver was calling scsi_dma_unmap in
pmcraid_error_handler and return 0. This pmcraid_error_handler is called
by pmcraid_io_done which will do scsi_dma_unmap again when it has
return 0 from pmcraid_error_handler.

Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-30 11:53:22 -06:00
Giridhar Malavali
3b9c212a5c [SCSI] qla2xxx: Update version number to 8.03.01-k9.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-30 11:09:53 -06:00
Duane Grigsby
ca79cf6648 [SCSI] qla2xxx: Added to EEH support.
Added fundamental reset and pci save state.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-30 11:09:50 -06:00
Andrew Vasquez
8588080193 [SCSI] qla2xxx: Extend base EEH support in qla2xxx.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-30 11:09:49 -06:00
Anirban Chakraborty
5c66f5d193 [SCSI] qla2xxx: Fix for a multiqueue bug in CPU affinity mode
Hold the hardware lock while do the response completion in work queue threads as
it involves sharing a common request queue among multiple threads.

Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-30 11:09:47 -06:00
Michael Hernandez
3064ff39b8 [SCSI] qla2xxx: Get the link data rate explicitly during device resync.
When the hba port gets logged out of the fabric, or other
such transitional state when the physical link is still present,
the driver doesn't receive a loop up asyn event (where the link
data rate currently gets set). Hence send a explicit mailbox command
to get the link rate in such conditions.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-30 11:09:45 -06:00
Rakesh Ranjan
44214ab474 [SCSI] cxgb3i: Fix a login over vlan issue
Fix a target login issue, when parent interface is vlan and we are
using cxgb3i sepecific private ip address in '/etc/iscsi/ifaces/'
iface file.

Signed-off-by: Rakesh Ranjan <rakesh@chelsio.com>
Acked-by: Karen Xie <kxie@chelsio.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-30 11:03:41 -06:00