Commit Graph

136 Commits

Author SHA1 Message Date
Jens Axboe 242f9dcb8b block: unify request timeout handling
Right now SCSI and others do their own command timeout handling.
Move those bits to the block layer.

Instead of having a timer per command, we try to be a bit more clever
and simply have one per-queue. This avoids the overhead of having to
tear down and setup a timer for each command, so it will result in a lot
less timer fiddling.

Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:13 +02:00
Tejun Heo da0e21d3fa libata: use ata_link_printk() when printing SError
SError belongs to link not port.  Use ata_link_printk() to print it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-08-22 02:19:44 -04:00
Tejun Heo 5dbfc9cb59 libata: always do follow-up SRST if hardreset returned -EAGAIN
As an optimization, follow-up SRST used to be skipped if
classification wasn't requested even when hardreset requested it via
-EAGAIN.  However, some hardresets can't wait for device readiness and
skipping SRST can cause timeout or other failures during revalidation.
Always perform follow-up SRST if hardreset returns -EAGAIN.  This
makes reset paths more predictable and thus less error-prone.

While at it, move hardreset error checking such that it's done right
after hardreset is finished.  This simplifies followup SRST condition
check a bit and makes the reset path easier to modify.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-08-22 02:19:41 -04:00
Tejun Heo a674050e06 libata: fix EH action overwriting in ata_eh_reset()
ehc->i.action got accidentally overwritten to ATA_EH_HARD/SOFTRESET in
ata_eh_reset().  The original intention was to clear reset action
which wasn't selected.  This can cause unexpected behavior when other
EH actions are scheduled together with reset.  Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-08-22 02:19:39 -04:00
Tejun Heo 05944bdf6f libata: implement no[hs]rst force params
Implement force params nohrst, nosrst and norst.  This is to work
around reset related problems and ease debugging.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-08-22 02:07:43 -04:00
Tejun Heo 3eabddb8ed libata-eh: update atapi_eh_request_sense() to take @dev instead of @qc
Update atapi_eh_request_sense() to take @dev, @sense_buf and
@dfl_sense_key instead of taking @qc and extracting information from
it.  This change is to make the function more generic and allow it to
be called from other places.

While at it, make cdb initialization use initializer.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-07-14 15:59:33 -04:00
Tejun Heo 87fbc5a060 libata: improve EH internal command timeout handling
ATA_TMOUT_INTERNAL which was 30secs were used for all internal
commands which is way too long when something goes wrong.  This patch
implements command type based stepped timeouts.  Different command
types can use different timeouts and each command type can use
different timeout values after timeouts.

ie. the initial timeout is set to a value which should cover most of
the cases but not too long so that run away cases don't delay things
too much.  After the first try times out, the second try can use
longer timeout and if that one times out too, it can go for full 30sec
timeout.

IDENTIFYs use 5s - 10s - 30s timeout and all other commands use 5s -
10s timeouts.

This patch significantly cuts down the needed time to handle failure
cases while still allowing libata to work with nut job devices through
retries.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-07-14 15:59:32 -04:00
Tejun Heo d8af0eb604 libata: use ULONG_MAX to terminate reset timeout table
This doesn't introduce any functional changes.  This is to make reset
timeout table consistent with to-be-added command timeout tables.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-07-14 15:59:32 -04:00
Tejun Heo 0a2c0f5615 libata: improve EH retry delay handling
EH retries were delayed by 5 seconds to ensure that resets don't occur
back-to-back.  However, this 5 second delay is superflous or excessive
in many cases.  For example, after IDENTIFY times out, there's no
reason to wait five more seconds before retrying.

This patch adds ehc->last_reset timestamp and record the timestamp for
the last reset trial or success and uses it to space resets by
ATA_EH_RESET_COOL_DOWN which is 5 secs and removes unconditional 5 sec
sleeps.

As this change makes inter-try waits often shorter and they're
redundant in nature, this patch also removes the "retrying..."
messages.

While at it, convert explicit rounding up division to DIV_ROUND_UP().

This change speeds up EH in many cases w/o sacrificing robustness.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-07-14 15:59:32 -04:00
Tejun Heo 341c2c958e libata: consistently use msecs for time durations
libata has been using mix of jiffies and msecs for time druations.
This is getting confusing.  As writing sub HZ values in jiffies is
PITA and msecs_to_jiffies() can't be used as initializer, unify unit
for all time durations to msecs.  So, durations are in msecs and
deadlines are in jiffies.  ata_deadline() is added to compute deadline
from a start time and duration in msecs.

While at it, drop now superflous _msec suffix from arguments and
rename @timeout to @deadline if it represents a fixed point in time
rather than duration.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-07-14 15:59:32 -04:00
Tejun Heo e0614db2a3 libata: ignore recovered PHY errors
No reason to get overzealous about recovered comm and data errors.
Some PHYs habitually sets them w/o no good reason and being draconian
about these soft error conditions doesn't seem to help anybody.

If need ever rises, we might need to add soft PHY error condition, say
AC_ERR_MAYBE_ATA_BUS and use it only to determine whether speed down
is necessary but I don't think that's very likely to happen.  It's far
more likely we'll get timeouts or fatal transmission errors if
recovered errors are so prominent that they hamper operation.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-19 17:51:47 -04:00
Tejun Heo f046519fc8 libata: kill hotplug related race condition
Originally, whole reset processing was done while the port is frozen
and SError was cleared during @postreset().  This had two race
conditions.  1: hotplug could occur after reset but before SError is
cleared and libata won't know about it.  2: hotplug could occur after
all the reset is complete but before the port is thawed.  As all
events are cleared on thaw, the hotplug event would be lost.

Commit ac371987a8 kills the first race
by clearing SError during link resume but before link onlineness test.
However, this doesn't fix race #2 and in some cases clearing SError
after SRST is a good idea.

This patch solves this problem by cross checking link onlineness with
classification result after SError is cleared and port is thawed.
Reset is retried if link is online but all devices attached to the
link are unknown.  As all devices will be revalidated, this one-way
check is enough to ensure that all devices are detected and
revalidated reliably.

This, luckily, also fixes the cases where host controller returns
bogus status while harddrive is spinning up after hotplug making
classification run before the device sends the first FIS and thus
causes misdetection.

Low level drivers can bypass the logic by setting class explicitly to
ATA_DEV_NONE if ever necessary (currently none requires this).

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-19 17:51:47 -04:00
Tejun Heo dc98c32cbe libata: move reset freeze/thaw handling into ata_eh_reset()
Previously reset freeze/thaw handling lived outside of ata_eh_reset()
mainly because the original PMP reset code needed the port frozen
while resetting all the fan-out ports, which is no longer the case.

This patch moves freeze/thaw handling into ata_eh_reset().
@prereset() and @postreset() are now called w/o freezing the port
although @prereset() an be called frozen if the port is frozen prior
to entering ata_eh_reset().

This makes code simpler and will help removing hotplug event related
races.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-19 17:51:47 -04:00
Tejun Heo 932648b007 libata: reorganize ata_eh_reset() no reset method path
Reorganize ata_eh_reset() such that @prereset() is called even when no
reset method is available and if block is used instead of goto to skip
actual reset.  This makes no reset case behave better (readiness wait)
and future changes easier.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-19 17:51:47 -04:00
Mark Lord 10acf3b0d3 libata: export ata_eh_analyze_ncq_error
Export ata_eh_analyze_ncq_error() for subsequent use by sata_mv,
as suggested by Tejun.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-06 11:37:58 -04:00
Mark Lord a6116c9e60 libata-eh set tf flags in NCQ EH result_tf
Fix mis-reporting of NCQ errors by ensuring that result_tf->flags
is properly initialized in libata-eh.  This allows ata_gen_ata_sense()
to report the failed block number correctly to SCSI after a media error
during NCQ.

This patch may also be a candidate for backporting to earlier kernels.
Without this fix, SCSI will fail I/O on the entire request rather
than just the bad sector.  That can be bad for a request that was
merged from many independent read reads from different tasks.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-25 01:11:37 -04:00
Tejun Heo 4f7faa3f2b libata: make EH fail gracefully if no reset method is available
When no reset method is available, libata currently oopses.  Although
the condition can't happen unless there's a bug in a low level driver,
oopsing isn't the best way to report the error condition.  Complain,
dump stack and fail reset instead.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-17 15:44:26 -04:00
Tejun Heo 45db2f6c95 libata: move link onlineness check out of softreset methods
Currently, SATA softresets should do link onlineness check before
actually performing SRST protocol but it doesn't really belong to
softreset.

This patch moves onlineness check in softreset to ata_eh_reset() and
ata_eh_followup_srst_needed() to clean up code and help future sata_mv
changes which need clear separation between SCR and TF accesses.

sata_fsl is peculiar in that its softreset really isn't softreset but
combination of hardreset and softreset.  This patch adds dummy private
->prereset to keep the current behavior but the driver really should
implement separate hard and soft resets and return -EAGAIN from
hardreset if it should be follwed by softreset.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-17 15:44:25 -04:00
Tejun Heo 2a0c15ca39 libata: kill dead code paths in reset path
Some code paths which had been made obsolete by recent reset
simplification were still around.  Kill them.

* ata_eh_reset() checked for ATA_DEV_UNKNOWN to determine
  classification failure.  This is no longer applicable.

* ata_do_reset() should convert ATA_DEV_UNKNOWN to ATA_DEV_NONE
  regardless of reset result (e.g. -EAGAIN).

* LLDs don't need to convert ATA_DEV_UNKNOWN to ATA_DEV_NONE.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-17 15:44:25 -04:00
Tejun Heo 071f44b1d2 libata: implement PMP helpers
Implement helpers to test whether PMP is supported, attached and
determine pmp number to use when issuing SRST to a link.  While at it,
move ata_is_host_link() so that it's together with the two new PMP
helpers.

This change simplifies LLDs and helps making PMP support optional.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:25 -04:00
Tejun Heo 305d2a1ab1 libata: unify mechanism to request follow-up SRST
Previously, there were two ways to trigger follow-up SRST from
hardreset method - returning -EAGAIN and leaving all device classes
unmodified.  Drivers never used the latter mechanism and the only use
case for the former was when hardreset couldn't classify.

Drop the latter mechanism and let -EAGAIN mean "perform follow-up SRST
if classification is required".  This change removes unnecessary
follow-up SRSTs and simplifies reset implementations.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:23 -04:00
Tejun Heo 5958e3025f libata: move PMP SCR access failure during reset to ata_eh_reset()
If PMP fan-out reset fails and SCR isn't accessible, PMP should be
reset.  This used to be tested by sata_pmp_std_hardreset() and
communicated to EH by -ERESTART.  However, this logic is generic and
doesn't really have much to do with specific hardreset implementation.

This patch moves SCR access failure detection logic to ata_eh_reset()
where it belongs.  As this makes sata_pmp_std_hardreset() identical to
sata_std_hardreset(), the function is killed and replaced with the
standard method.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:23 -04:00
Tejun Heo 57c9efdfb3 libata: implement and use sata_std_hardreset()
Implement sata_std_hardreset(), which simply wraps around
sata_link_hardreset().  sata_std_hardreset() becomes new standard
hardreset method for sata_port_ops and sata_sff_hardreset() moves from
ata_base_port_ops to ata_sff_port_ops, which is where it really
belongs.

ata_is_builtin_hardreset() is added so that both
ata_std_error_handler() and ata_sff_error_handler() skip both builtin
hardresets if SCR isn't accessible.

piix_sidpr_hardreset() in ata_piix.c is identical to
sata_std_hardreset() in functionality and got replaced with the
standard function.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:23 -04:00
Tejun Heo 9363c3825e libata: rename SFF functions
SFF functions have confusing names.  Some have sff prefix, some have
bmdma, some std, some pci and some none.  Unify the naming by...

* SFF functions which are common to both BMDMA and non-BMDMA are
  prefixed with ata_sff_.

* SFF functions which are specific to BMDMA are prefixed with
  ata_bmdma_.

* SFF functions which are specific to PCI but apply to both BMDMA and
  non-BMDMA are prefixed with ata_pci_sff_.

* SFF functions which are specific to PCI and BMDMA are prefixed with
  ata_pci_bmdma_.

* Drop generic prefixes from LLD specific routines.  For example,
  bfin_std_dev_select -> bfin_dev_select.

The following renames are noteworthy.

  ata_qc_issue_prot() -> ata_sff_qc_issue()
  ata_pci_default_filter() -> ata_bmdma_mode_filter()
  ata_dev_try_classify() -> ata_sff_dev_classify()

This rename is in preparation of separating SFF support out of libata
core layer.  This patch strictly renames functions and doesn't
introduce any behavior difference.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:21 -04:00
Tejun Heo 03faab7827 libata: implement ATA_QCFLAG_RETRY
Currently whether a command should be retried after failure is
determined inside ata_eh_finish().  Add ATA_QCFLAG_RETRY and move the
logic into ata_eh_autopsy().  This makes things clearer and helps
extending retry determination logic.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-17 15:44:20 -04:00