198 Commits

Author SHA1 Message Date
Kleber Sacilotto de Souza a92fa25c63 [SCSI] ipr: fix eeh recovery for 64-bit adapters
In some scenarios, an EEH error can take a long time to be detected, since the
driver issues an MMIO read only after a device reset command times out and we
try to reset the adapter. This patch adds some code in ipr_cancel_op() to read
a hardware register so we detect the error earlier in case the op is being
aborted because of a timeout caused by a frozen adapter slot.

Another problem in such scenarios is that in __ipr_eh_host_reset() we change the
dump state flag from WAIT_FOR_DUMP to GET_DUMP, and the flag is later changed
from GET_DUMP to READ_DUMP in ipr_reset_restore_cfg_space(). However, if
ipr_reset_restore_cfg_space() has already been called by the PCI EEH code by
the time the SCSI error handling calls __ipr_eh_host_reset(), we end up with
the flag in an inconsistent state. This patch also prevents this problem.

Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-18 08:33:13 -06:00
Jan Kiszka fb51ccbf21 PCI: Rework config space blocking services
pci_block_user_cfg_access was designed for the use case that a single
context, the IPR driver, temporarily delays user space accesses to the
config space via sysfs. This assumption became invalid when
pci_dev_reset was added as a locking instance. Today, if you run two loops
in parallel that reset the same device via sysfs, you end up with a
kernel BUG as pci_block_user_cfg_access detects the broken assumption.

This reworks pci_block_user_cfg_access into a sleeping service,
pci_cfg_access_lock, and an atomic-compatible variant called
pci_cfg_access_trylock. The former not only blocks user space access as
before but also waits if access was already locked. The latter service
just returns false in this case, allowing the caller to resolve the
conflict instead of raising a BUG.

Adaptations of the ipr driver were originally written by Brian King.

Acked-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:33 -08:00
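The difference between a BUG on overlap and a trylock that reports the conflict (commit fb51ccbf21 above) can be illustrated with a small userspace model. This is only a sketch: `struct model_dev` and the `model_` functions are hypothetical stand-ins, not the kernel implementation, and the model is single-threaded, so it does not show the sleeping behaviour of pci_cfg_access_lock.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a device's config-access lock state.
 * This is not the kernel's struct pci_dev. */
struct model_dev {
    bool cfg_access_locked;
};

/* Models the trylock variant: it never waits and reports a conflict
 * through its return value instead of raising a BUG. */
static bool model_cfg_access_trylock(struct model_dev *dev)
{
    if (dev->cfg_access_locked)
        return false;               /* already locked: caller decides */
    dev->cfg_access_locked = true;
    return true;
}

/* Releases the lock taken by a successful trylock. */
static void model_cfg_access_unlock(struct model_dev *dev)
{
    dev->cfg_access_locked = false;
}
```

In this model a second trylock on the same device simply returns false, which is the point at which a caller can back off or retry.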
Wayne Boyer 5a918353ec [SCSI] ipr: add definitions for additional adapter
Add the appropriate definition and table entry for an additional adapter.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:12:50 +04:00
Brian King 4c647e909f [SCSI] ipr: Fix BUG on adapter dump timeout
If an adapter dump times out, the ipr driver will abort the
dump and proceed to reset and recover the adapter. When an
adapter dump completes, the work thread which is reading the
adapter dump will initiate an adapter reset to recover the
adapter. However, when the adapter dump gets aborted, the
work thread should not initiate an adapter reset, since an
adapter reset is already in progress. This fixes a case of
calling pci_block_user_cfg_access overlapped, which results
in a BUG.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-20 10:19:55 -05:00
Wayne Boyer 14ed9cc7e7 [SCSI] ipr: Add support to flash FPGA and flash back DRAM images
The write buffer command is used to download and burn new IOA FW images.
The same interface can now be used to flash FPGA and flash back DRAM images.
Downloading and flashing the new images takes more than 15 minutes, so increase
the write buffer command timeout to 30 minutes.

The FPGA and flash back DRAM images don't have the same card_type as the IOA FW
image. So, remove the sanity checking from the driver.  The adapter has sanity
checking and will only accept a valid image.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-16 10:56:27 -05:00
Brian King 41e9a69641 [SCSI] ipr: Stop reading adapter dump prematurely
When the ipr driver decides to dump the adapter, it changes the
sdt_state to GET_DUMP, then prepares the adapter so that the dump
can be read. However, if the ipr worker thread wakes up for some
reason before the driver has put the adapter in a state where it
can successfully dump the adapter, the driver will start dumping
the adapter too early, which can potentially trigger a BUG check
in the pci config blocking API. Fix this by adding a new
sdt_state to differentiate between the ipr driver wanting to dump
the adapter in the near future and wanting to dump the adapter now.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-09-22 15:30:28 +04:00
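The idea in commit 41e9a69641 above, separating "a dump is wanted soon" from "the dump can be read now", might look roughly like the following. The state names come from the ipr commit messages in this log; which of them is the newly added state, their ordering, and all `model_` helpers are assumptions made for illustration.

```c
#include <stdbool.h>

/* State names borrowed from the commit messages; everything else in
 * this enum (ordering, MODEL_ prefix) is an assumption. */
enum model_sdt_state {
    MODEL_INACTIVE,
    MODEL_WAIT_FOR_DUMP,
    MODEL_GET_DUMP,     /* dump wanted soon; adapter not yet prepared */
    MODEL_READ_DUMP     /* adapter prepared; dump may be read now */
};

/* The worker can be woken for unrelated reasons, so it checks that the
 * adapter is ready, not merely that a dump was requested. */
static bool model_worker_may_read_dump(enum model_sdt_state s)
{
    return s == MODEL_READ_DUMP;
}

/* Called once the adapter has been prepared for dumping; only a
 * pending request is promoted, other states are left unchanged. */
static enum model_sdt_state model_adapter_ready(enum model_sdt_state s)
{
    return s == MODEL_GET_DUMP ? MODEL_READ_DUMP : s;
}
```

With two distinct states, a spurious wakeup while the state is still the "wanted soon" one leaves the dump untouched.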
Anton Blanchard 5d7c20b7fa [SCSI] ipr: Always initiate hard reset in kdump kernel
During kdump testing I noticed timeouts when initialising each IPR
adapter. While the driver has logic to detect an adapter in an
indeterminate state, it wasn't triggering and each adapter went
through a 5 minute timeout before finally going operational.

Some analysis showed the needs_hard_reset flag wasn't getting set.
We can check the reset_devices kernel parameter which is set by
kdump and force a full reset. This fixes the problem.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-08-27 08:35:55 -06:00
Julia Lawall f170c684b5 [SCSI] ipr: reorder error handling code to include iounmap
The out_msi_disable label should be before cleanup_nomem to additionally
benefit from the call to iounmap.  Subsequent gotos are adjusted to go to
out_msi_disable instead of cleanup_nomem, which now follows it.  This is
safe because pci_disable_msi does nothing if pci_enable_msi was not called.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
expression e1,e2;
statement S;
@@

e1 = pci_ioremap_bar(...);
... when != e1 = e2
    when != iounmap(e1)
    when any
(
 if (<+...e1...+>) S
|
 if(...) { ... return 0; }
|
 if (...) { ... when != iounmap(e1)
                when != if (...) { ... iounmap(e1) ... }
* return ...;
 } else S
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-07-27 17:45:08 +04:00
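The label reordering described in commit f170c684b5 above follows the usual goto-unwind pattern, where a failure path falls through every cleanup step below its label. A minimal userspace sketch, assuming hypothetical counters in place of the real mapping and MSI state (none of the `model_` names exist in the driver):

```c
#include <stdbool.h>

/* Hypothetical flags standing in for real resources. */
static int model_mapped;        /* 1 while the register window is mapped */
static int model_msi_enabled;   /* 1 while MSI is enabled */

static void model_iounmap(void)     { model_mapped = 0; }
/* Harmless when MSI was never enabled, mirroring the commit's argument. */
static void model_disable_msi(void) { model_msi_enabled = 0; }

/* fail_step: 0 = succeed, 1 = fail after mapping, 2 = fail after MSI. */
static int model_probe(int fail_step)
{
    int rc = 0;

    model_mapped = 1;           /* stands in for pci_ioremap_bar() */
    if (fail_step == 1) {
        rc = -1;
        goto out_msi_disable;
    }

    model_msi_enabled = 1;
    if (fail_step == 2) {
        rc = -1;
        goto out_msi_disable;
    }

    return 0;

out_msi_disable:
    model_disable_msi();
    /* The unmap step follows the MSI step, so every failure after the
     * mapping also releases the mapping. */
    model_iounmap();
    return rc;
}
```

Because disabling MSI is a no-op when it was never enabled, both failure paths can share one label and neither leaks the mapping.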
Wayne Boyer a5442ba4a4 [SCSI] ipr: fix possible false positive detection of stuck interrupt
If the driver is getting flooded with interrupts, there's a possibility
that the interrupt service routine could falsely detect a stuck interrupt
condition and reset the adapter.

This patch changes the logic such that the routine will loop back into
the command processing code one more time after detecting the stuck
interrupt signature.  If there are no commands to process after that pass,
and the interrupt is still not cleared, then the driver will print the
"Error clearing HRRQ" message and reset the adapter.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:37:50 -04:00
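The retry logic described in commit a5442ba4a4 above can be sketched as a loop that tolerates one "still asserted" observation before giving up. The structure, field names, and the way late completions are modelled are all assumptions for illustration; the real interrupt handler is not shown in this log.

```c
#include <stdbool.h>

/* Hypothetical adapter model. */
struct model_adapter {
    int  pending_cmds;  /* completions visible on the current pass */
    int  late_cmds;     /* completions that arrive while the ISR runs */
    bool irq_stuck;     /* true if the interrupt never clears */
};

enum model_isr_result { MODEL_HANDLED, MODEL_RESET_ADAPTER };

/* Drains visible completions and returns how many were processed;
 * late arrivals become visible on the next pass. */
static int model_process_cmds(struct model_adapter *a)
{
    int n = a->pending_cmds;

    a->pending_cmds = a->late_cmds;
    a->late_cmds = 0;
    return n;
}

static bool model_irq_still_set(const struct model_adapter *a)
{
    return a->irq_stuck || a->pending_cmds > 0;
}

static enum model_isr_result model_isr(struct model_adapter *a)
{
    bool seen_stuck = false;

    for (;;) {
        int done = model_process_cmds(a);

        if (!model_irq_still_set(a))
            return MODEL_HANDLED;

        /* Interrupt still set. Only reset if this was already seen on
         * an earlier pass and this pass found nothing to process. */
        if (seen_stuck && done == 0)
            return MODEL_RESET_ADAPTER;   /* "Error clearing HRRQ" case */

        seen_stuck = true;
    }
}
```

Under a flood of completions the second pass finds work and the interrupt clears, so the model does not reset; with a genuinely stuck interrupt the second pass is empty and the reset path is taken.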
Anton Blanchard 51f52a4752 [SCSI] ipr: Rate limit DMA mapping errors
I noticed a stream of errors from the IPR driver while doing
IOMMU fault injection. Rate limit the errors so we don't clog
up the console and logfiles.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-17 11:17:07 +04:00
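Rate limiting of the kind mentioned in commit 51f52a4752 above is commonly done with a burst-per-interval counter. The kernel has its own helpers for this (for example printk_ratelimit()); the following is only a userspace sketch of the general idea, with a hypothetical `struct model_ratelimit` and a caller-supplied clock, and it does not claim to match what the patch does.

```c
#include <stdbool.h>

/* Hypothetical limiter: at most `burst` messages per `interval` ticks. */
struct model_ratelimit {
    long interval;      /* window length, in caller-defined ticks */
    int  burst;         /* messages allowed per window */
    long window_start;  /* tick at which the current window began */
    int  printed;       /* messages emitted in the current window */
};

/* Returns true if a message may be emitted at time `now`. */
static bool model_ratelimit_ok(struct model_ratelimit *rl, long now)
{
    if (now - rl->window_start >= rl->interval) {
        /* A new window starts: reset the budget. */
        rl->window_start = now;
        rl->printed = 0;
    }
    if (rl->printed >= rl->burst)
        return false;   /* budget spent: suppress this message */
    rl->printed++;
    return true;
}
```

A caller would wrap each error report in a check of this function, so a burst of mapping failures produces a bounded number of log lines per window.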
Kleber Sacilotto de Souza 4d4dd70655 [SCSI] ipr: increase the dump size for 64 bit adapters
Currently the size of the dump generated by the driver is limited
to 4MB, which is insufficient to gather much useful data from the
new 64 bit adapters.

This patch makes the needed changes to increase the dump limit
for the 64 bit adapters to 32MB, or even to a bigger value in the
future, while keeping the current limitations for the legacy 32 bit
adapters.

Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01 12:09:20 -05:00
Wayne Boyer 7dacb64f49 [SCSI] ipr: improve interrupt service routine performance
During performance testing on P7 machines it was observed that the interrupt
service routine was doing unnecessary MMIO operations.

This patch rearranges the logic of the routine and moves some of the code out
of the main routine.  The result is that there are now fewer MMIO operations in
the performance path of the code.

As a result of the above change, an existing condition was exposed where the
driver could get an "unexpected" hrrq interrupt.  The original code would flag
the interrupt as unexpected and then reset the adapter.  After further analysis
it was confirmed that this condition can occasionally occur and that the
interrupt can safely be ignored.  Additional code in this patch detects this
condition, clears the interrupt and allows the driver to continue without
resetting the adapter.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01 10:48:53 -05:00
Wayne Boyer 630ad8317f [SCSI] ipr: remove unneeded volatile declarations
This patch removes three volatile declarations based on some feedback and code
analysis.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01 10:44:18 -05:00
Wayne Boyer ab6c10b136 [SCSI] ipr: fix synchronous request flags for better performance
In testing it was noticed that the Extended Delay after Reset flag was being set
for gscsi and volume set devices.  This had a negative effect on performance
for volume sets.  The fix is to only set the flag for gscsi devices.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01 10:16:44 -05:00
Linus Torvalds c55d267de2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (170 commits)
  [SCSI] scsi_dh_rdac: Add MD36xxf into device list
  [SCSI] scsi_debug: add consecutive medium errors
  [SCSI] libsas: fix ata list corruption issue
  [SCSI] hpsa: export resettable host attribute
  [SCSI] hpsa: move device attributes to avoid forward declarations
  [SCSI] scsi_debug: Logical Block Provisioning (SBC3r26)
  [SCSI] sd: Logical Block Provisioning update
  [SCSI] Include protection operation in SCSI command trace
  [SCSI] hpsa: fix incorrect PCI IDs and add two new ones (2nd try)
  [SCSI] target: Fix volume size misreporting for volumes > 2TB
  [SCSI] bnx2fc: Broadcom FCoE offload driver
  [SCSI] fcoe: fix broken fcoe interface reset
  [SCSI] fcoe: precedence bug in fcoe_filter_frames()
  [SCSI] libfcoe: Remove stale fcoe-netdev entries
  [SCSI] libfcoe: Move FCOE_MTU definition from fcoe.h to libfcoe.h
  [SCSI] libfc: introduce __fc_fill_fc_hdr that accepts fc_hdr as an argument
  [SCSI] fcoe, libfc: initialize EM anchors list and then update npiv EMs
  [SCSI] Revert "[SCSI] libfc: fix exchange being deleted when the abort itself is timed out"
  [SCSI] libfc: Fixing a memory leak when destroying an interface
  [SCSI] megaraid_sas: Version and Changelog update
  ...

Fix up trivial conflicts due to whitespace differences in
drivers/scsi/libsas/{sas_ata.c,sas_scsi_host.c}
2011-03-17 17:54:40 -07:00
Sergei Shtylyov 9cbe056f6c libata: remove ATA_FLAG_NO_LEGACY
All checks of ATA_FLAG_NO_LEGACY have been removed by the commits
c791c30670 ([libata] minor PCI IDE probe fixes and cleanups) and
f0d36efdc6 (libata: update libata core layer to use devres), so I think
it's time to finally get rid of this flag...

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2011-03-02 02:36:46 -05:00
Sergei Shtylyov 3696df3099 libata: remove ATA_FLAG_MMIO
Commit 0d5ff56677 (libata: convert to iomap)
removed all checks of ATA_FLAG_MMIO but neglected to remove the flag itself.
Do it now, at last...

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2011-03-02 02:36:46 -05:00
Sergei Shtylyov c10f97b9d8 libata: remove ATA_FLAG_{SRST|SATA_RESET}
These flags are marked as obsolete and the checks for them have been removed
by commit 294440887b (libata-sff: kill unused
ata_bus_reset()), so I think it's time to finally get rid of them...

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2011-03-02 02:36:46 -05:00
Sergei Shtylyov 0f2e0330a8 ipr/sas_ata: use mode mask macros from <linux/ata.h>
Commit 14bdef982c ([libata] convert drivers to
use ata.h mode mask defines) didn't convert these two libata drivers outside
drivers/ata/...

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2011-03-02 02:36:46 -05:00
Kleber Sacilotto de Souza 5767a1c498 [SCSI] ipr: Fix a race on multiple configuration changes
In a multiple configuration change scenario a remove notification can be
followed by an immediate add notification for the same device, which
will cause the device to be removed but never added back. This patch
fixes the problem by ensuring that in such situations the device will be
added back.

Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-18 12:29:15 -06:00
Tejun Heo a684b8da35 [SCSI] remove flush_scheduled_work() usages
Simple conversions to drop flush_scheduled_work() usages in
drivers/scsi.  More involved ones will be done in separate patches.

* NCR5380, megaraid_sas: cancel_delayed_work() +
  flush_scheduled_work() -> cancel_delayed_work_sync().

* mpt2sas_scsih: drop unnecessary flush_scheduled_work().

* arcmsr_hba, ipr, pmcraid: flush the used work explicitly instead of
  using flush_scheduled_work().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 10:31:02 -06:00
Linus Torvalds d73b388459 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI/PM: Report wakeup events before resuming devices
  PCI/PM: Use pm_wakeup_event() directly for reporting wakeup events
  PCI: sysfs: Update ROM to include default owner write access
  x86/PCI: make Broadcom CNB20LE driver EMBEDDED and EXPERIMENTAL
  x86/PCI: don't use native Broadcom CNB20LE driver when ACPI is available
  PCI/ACPI: Request _OSC control once for each root bridge (v3)
  PCI: enable pci=bfsort by default on future Dell systems
  PCI/PCIe: Clear Root PME Status bits early during system resume
  PCI: pci-stub: ignore zero-length id parameters
  x86/PCI: irq and pci_ids patch for Intel Patsburg
  PCI: Skip id checking if no id is passed
  PCI: fix __pci_device_probe kernel-doc warning
  PCI: make pci_restore_state return void
  PCI: Disable ASPM if BIOS asks us to
  PCI: Add mask bit definition for MSI-X table
  PCI: MSI: Move MSI-X entry definition to pci_regs.h

Fix up trivial conflicts in drivers/net/{skge.c,sky2.c} that had in the
meantime been converted to not use legacy PCI power management, and thus
no longer use pci_restore_state() at all (and that caused trivial
conflicts with the "make pci_restore_state return void" patch)
2011-01-14 09:29:05 -08:00
Linus Torvalds 1542dec1c9 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  pata_platform: Remove CONFIG_HAVE_PATA_PLATFORM's dependencies.
  pata_hpt37x: actually limit HPT370 to UltraDMA/66
  pata_hpt3x2n: coding style cleanup
  pata_hpt37x: coding style cleanup
  pata_hpt366: coding style cleanup
  pata_hpt3x2n: calculate average f_CNT
  pata_hpt3x2n: clarify about HPT371N support
  pata_hpt{37x|3x2n}: SATA mode filtering
  [libata] avoid needlessly passing around ptr to SCSI completion func
  [libata] new driver acard_ahci, for ATP8620 host controller
2011-01-10 08:22:33 -08:00
Jeff Garzik b27dcfb067 [libata] avoid needlessly passing around ptr to SCSI completion func
It's stored in struct scsi_cmnd->scsi_done, making several 'done'
parameters to functions redundant.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2011-01-05 19:43:22 -05:00
Jon Mason 1d3c16a818 PCI: make pci_restore_state return void
pci_restore_state only ever returns 0, thus there is no benefit in
having it return any value.  Also, a large majority of the callers do
not check the return code of pci_restore_state.  Make the
pci_restore_state a void return and avoid the overhead.

Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-12-23 12:53:09 -08:00