linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
Tejun Heo	fe06e5f9b7	libata-sff: separate out BMDMA EH Some of error handling logic in ata_sff_error_handler() and all of ata_sff_post_internal_cmd() are for BMDMA. Create ata_bmdma_error_handler() and ata_bmdma_post_internal_cmd() and move BMDMA part into those. While at it, change DMA protocol check to ata_is_dma(), fix post_internal_cmd to call ap->ops->bmdma_stop instead of directly calling ata_bmdma_stop() and open code hardreset selection so that ata_std_error_handler() doesn't have to know about sff hardreset. As these two functions are BMDMA specific, there's no reason to check for bmdma_addr before calling bmdma methods if the protocol of the failed command is DMA. sata_mv and pata_mpc52xx now don't need to set .post_internal_cmd to ATA_OP_NULL and pata_icside and sata_qstor don't need to set it to their bmdma_stop routines. ata_sff_post_internal_cmd() becomes noop and is removed. This fixes p3 described in clean-up-BMDMA-initialization patch. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2010-05-19 13:36:46 -04:00
Tejun Heo	c429137a67	libata-sff: port_task is SFF specific port_task is tightly bound to the standard SFF PIO HSM implementation. Using it for any other purpose would be error-prone and there's no such user and if some drivers need such feature, it would be much better off using its own. Move it inside CONFIG_ATA_SFF and rename it to sff_pio_task. The only function which is exposed to the core layer is ata_sff_flush_pio_task() which is renamed from ata_port_flush_task() and now also takes care of resetting hsm_task_state to HSM_ST_IDLE, which is possible as it's now specific to PIO HSM. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2010-05-19 13:35:49 -04:00
Jeff Garzik	a09bf4cd53	libata: ensure NCQ error result taskfile is fully initialized before returning it via qc->result_tf. Cc: stable@kernel.org Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2010-04-22 21:59:13 -04:00
Tejun Heo	fa41efdae7	libata: fix locking around blk_abort_request() blk_abort_request() expects queue lock to be held by the caller. Grab it before calling the function. Lack of this synchronization led to infinite loop on corrupt q->timeout_list. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: stable@kernel.org Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2010-04-22 21:47:52 -04:00
Tejun Heo	534ead7092	libata: retry FS IOs even if it has failed with AC_ERR_INVALID libata currently doesn't retry if a command fails with AC_ERR_INVALID assuming that retrying won't get it any further even if retried. However, a failure may be classified as invalid through hardware glitch (incorrect reading of the error register or firmware bug) and there isn't whole lot to gain by not retrying as actually invalid commands will be failed immediately. Also, commands serving FS IOs are extremely unlikely to be invalid. Retry FS IOs even if it's marked invalid. Transient and incorrect invalid failure was seen while debugging firmware related issue on Samsung n130 on bko#14314. http://bugzilla.kernel.org/show_bug.cgi?id=14314 Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Johannes Stezenbach <js@sig21.net> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2010-01-20 14:25:11 -05:00
Tejun Heo	6013efd886	libata: retry failed FLUSH if device didn't fail it If ATA device failed FLUSH, it means that the device failed to write out some amount of data and the error needs to be reported to upper layers. As retries can't recover the lost data, FLUSH failures need to be reported immediately in general. However, if FLUSH fails due to transmission errors, the FLUSH needs to be retried; otherwise, filesystems may switch to RO mode and/or raid array may drop a drive for a random transmission glitch. This condition can be rather easily reproduced on certain ahci controllers which go through a PHY event after powersave mode switch + ext4 combination. Powersave mode switch is often closely followed by flush from the filesystem failing the FLUSH with ATA bus error which makes the filesystem code believe that data is lost and drop to RO mode. This was reported in the following bugzilla bug. http://bugzilla.kernel.org/show_bug.cgi?id=14543 This patch makes libata EH retry FLUSH if it wasn't failed by the device. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Andrey Vihrov <andrey.vihrov@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-12-03 02:46:35 -05:00
Tejun Heo	4f7c287499	libata: fix PMP initialization Commit `842faa6c1a` fixed error handling during attach by not committing detected device class to dev->class while attaching a new device. However, this change missed the PMP class check in the configuration loop causing a new PMP device to go through ata_dev_configure() as if it were an ATA or ATAPI device. As PMP device doesn't have a regular IDENTIFY data, this makes ata_dev_configure() tries to configure a PMP device using an invalid data. For the most part, it wasn't too harmful and went unnoticed but this ends up clearing dev->flags which may have ATA_DFLAG_AN set by sata_pmp_attach(). This means that SATA_PMP_FEAT_NOTIFY ends up being disabled on PMPs and on PMPs which honor the flag breaks hotplug support. This problem was discovered and reported by Ethan Hsiao. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ethan Hsiao <ethanhsiao@jmicron.com> Cc: stable@kernel.org Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-10-16 06:21:54 -04:00
Tejun Heo	3b761d3d43	libata: fix incorrect link online check during probe While trying to work around spurious detection retries for non-existent devices on slave links, commit `816ab89782` incorrectly added link offline check logic before ata_eh_thaw() was called. This means that if an occupied link goes down briefly at the time that offline check was performed, device class will be cleared to ATA_DEV_NONE and libata wouldn't retry thus failing detection of the device. The offline check should be done after the port is thawed together with online check so that such link glitches can be detected by the interrupt handler and handled properly. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Tim Blechmann <tim@klingt.org> Cc: stable@kernel.org Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-10-06 20:58:18 -04:00
Robert Hancock	6521148c64	libata: add command name parsing for error output This patch improves libata's output for error/notification messages to allow easier comprehension and debugging: When ATAPI commands issued through the SCSI layer fail, use SCSI functions to print the CDB in human-readable form instead of just dumping out the CDB in hex. Print out the name of the failed command (as defined by the ATA specification) in error handling output along with the raw register contents. When reporting status of ACPI taskfile commands executed on resume, also output the names of the commands being executed (or not) in readable form. Since the extra data for printing command names increases kernel size slightly, a config option has been added to allow disabling command name output (as well as some of the error register parsing) for those highly sensitive to kernel text size. Signed-off-by: Robert Hancock <hancockrwd@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-09-01 19:47:20 -04:00
Tejun Heo	1e641060c4	libata: clear eh_info on reset completion Resets are done with port frozen but some controllers still issue interrupts during reset and they may end up recording error conditions in ehi leading to unnecessary EH retrials. This patch makes ata_eh_reset() clear ehi on reset completion. As reset is the most severe recovery action, there's nothing to lose by clearing ehi on its completion. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Zdenek Kaspar <zkaspar82@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-09-01 19:47:19 -04:00
Jeff Garzik	54c38444fa	[libata] EH: freeze port before aborting commands Call the ->freeze() hook before aborting qc's, because some hardware requires special handling prior to accessing the taskfile registers (for diagnosis/analysis/reset). Most notably, hardware may wish to disable the DMA engine or interrupts in the ->freeze() hook. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-09-01 19:47:19 -04:00
Bartlomiej Zolnierkiewicz	705d201414	libata: add missing NULL pointer check to ata_eh_reset() drivers/ata/libata-eh.c +2403 ata_eh_reset(80) warning: variable derefenced before check 'slave' Please note that this is _not_ a real bug at the moment since ata_eh_context structure is embedded into ata_list structure and the code always checks for 'slave' before accessing 'sehc'. Anyway let's add missing check and always have a valid 'sehc' pointer (which makes code easier to understand and prevents introducing some possible bugs in the future). Reported-by: Dan Carpenter <error27@gmail.com> Cc: corbet@lwn.net Cc: eteo@redhat.com Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-07-28 21:05:41 -04:00
Tejun Heo	fe2c4d018f	libata: fix follow-up SRST failure path ata_eh_reset() was missing error return handling after follow-up SRST allowing EH to continue the normal probing path after reset failure. This was discovered while testing new WD 2TB drives which take longer than 10 secs to spin up and cause the first follow-up SRST to time out. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-07-14 22:41:28 -04:00
Martin Olsson	98a1708de1	trivial: fix typos s/paramter/parameter/ and s/excute/execute/ in documentation and source comments. Signed-off-by: Martin Olsson <martin@minimum.se> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:46 +02:00
Tejun Heo	6f9c1ea2c1	libata: clear ering on resume Error timestamps are in jiffies which doesn't run while suspended and PHY events during resume isn't too uncommon. When the two are combined, it can lead to unnecessary speed downs if the machine is suspended and resumed repeatedly. Clear error history on resume. This was reported and verified in bnc#486803 by Vladimir Botka. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Vladimir Botka <vbotka@novell.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-05-11 14:30:59 -04:00
Tejun Heo	842faa6c1a	libata: fix attach error handling New device attach path in ata_eh_revalidate_and_attach() is divided into two separate loops because ATA requires IDENTIFY to be issued to slave first while the user expects to see device probe messages from the master device. new_mask is used to track which devices are the new ones between the first loop and the second. This usually works well but if an error occurs during configuration stage, ata_dev_revalidate_and_attach() returns with error code and forgets new_mask. On the retry run, dev->class is set and new_mask for the device is clear, so the device just gets revalidated and thus ends up skipping post-configuration procedure including scheduling of SCSI_HOTPLUG for the device. When this occurs, ATA part of probing works fine but SCSI probing usually doesn't happen and makes the device unreachable. The behavior has been around for a very long time but it has been uncovered with the recent addition of 1_5_GBPS horkage which uses -EAGAIN return value from ata_dev_configure() to restart the probing sequence after forcing cable speed. This can be fixed by making sure dev->class is permanently set only after all configurations are successfully complete. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Tim Connors <tconnors+linuxkml@astro.swin.edu.au> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-05-11 14:26:01 -04:00
Alan Cox	c96f1732e2	[libata] Improve timeout handling On a timeout call a device specific handler early in the recovery so that we can complete and process successful commands which timed out due to IRQ loss or the like rather more elegantly. [Revised to exclude the timeout handling on a few devices that inherit from SFF but are not SFF enough to use the default timeout handler] Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-03-24 22:52:39 -04:00
Tejun Heo	d6515e6ff4	libata: make sure port is thawed when skipping resets When SCR access is available and the link is offline, softreset is skipped as it only wastes time and some controllers don't respond very well. However, the skip path forgot to thaw the port, which not only blocks further event notification from the port but also causes repeated EH invocations on the same event on drivers which rely on ->thaw() to clear events if the IRQ is shared with another device or port. This problem has always been there but is uncovered by recent sata_nv nf2/3 change which dropped hardreset support while maintaining SCR access. nf2/3 doesn't clear hotplug event mask from the interrupt handler but relies on ->thaw() to clear them. When the hardreset was there, the reset action was never skipped and the port was always thawed but, with the hardreset gone, ->prereset() determines that there's no need for softreset and both ->softreset() and ->thaw() are skipped. This leads to stuck hotplug event in the IRQ status register triggering hotplug event whenever IRQ is delivered on the same IRQ. As the controller shares the same IRQ for both ports, this happens on every IO if one port is occupied and the other isn't. This patch fixes the problem by making sure that the port is thawed on reset-skip path. bko#11615 reports this problem. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Robert Hancock <hancockrwd@gmail.com> Reported-by: Dan Andresan <danyer@gmail.com> Reported-by: Arne Woerner <arne_woerner@yahoo.com> Reported-by: Stefan Lippers-Hollmann <s.L-H@gmx.de> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-03-05 07:25:43 -05:00
Tejun Heo	b535708146	libata: don't use on-stack sense buffer sense_buffer is used as DMA target and shouldn't be allocated on stack. Use ap->sector_buf instead. This problem is spotted by Chuck Ebbert. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-03-05 07:25:10 -05:00
Tejun Heo	cf9a590a9e	libata: add no penalty retry request for EH device handling routines Let -EAGAIN from EH device handling routines trigger EH retry without consuming its tries count. This will be used to implement link SPD horkage which requires hardreset to adjust SPD without affecting other EH decisions. As it bypasses the forward progress guarantee provided by the tries count, the requester is responsible for ensuring forward progress. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-02-02 23:04:19 -05:00
Tejun Heo	c2c7a89c5e	libata: improve probe failure handling When link is flaky at high speed, it isn't uncommon for a device to repeatedly fail probing sequence early after successfully negotiating high link speed. This often leads to consecutive hotplug events without successful probing. This patch improves libata EH such that it remembers probing trials and if there have been more than two unsuccessful trials in the past 60 seconds, slows down link speed to 1.5Gbps. As link speed negotiation is the duty of the PHY layer proper, the goal of this fallback mechanism is to provide the last resort when everything else fails, which unfortunately happens not too infrequently, so no fancy 6->3->1.5 speeding down or highest successful transmission speed seen kind of logics (yet). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-02-02 23:03:34 -05:00
Tejun Heo	a07d499b47	libata: add @spd_limit to sata_down_spd_limit() Add @spd_limit to sata_down_spd_limit() so that the caller can specify the SPD limit it wants. This parameter doesn't get in the way even when it's too low. The closest possible limit is applied. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-02-02 23:03:22 -05:00
Tejun Heo	99cf610aa4	libata: clear dev->ering in smarter way dev->ering used to be cleared together with the rest of ata_device in ata_dev_init() which is called whenever a probing event occurs. dev->ering is about to be used to track probing failures so it needs to remain persistent over multiple probing events. This patch achieves this by doing the following. * Instead of CLEAR_OFFSET, define CLEAR_BEGIN and CLEAR_END and only clear between BEGIN and END. ering is moved after END. The split of persistent area is to allow hotter items remain at the head. * ering is explicitly cleared on ata_dev_disable() and when device attach succeeds. So, ering is persistent through a device's life time (unless explicitly cleared of course) and also through periods in between disablement of an attached device and successful detection of the next one. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-02-02 23:03:17 -05:00
Tejun Heo	678afac678	libata: move ata_dev_disable() to libata-eh.c ata_dev_disable() is about to be more tightly integrated into EH logic. Move it to libata-eh.c. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-02-02 23:03:00 -05:00
Tejun Heo	d89293abd9	libata: fix EH device failure handling The dev->pio_mode > XFER_PIO_0 test is there to avoid unnecessary speed down warning messages but it accidentally disabled SATA link spd down during configuration phase after reset where PIO mode is always zero. This patch fixes the problem by moving the test where it belongs. This makes libata probing sequence behave better when the connection is flaky at higher link speeds which isn't too uncommon for eSATA devices. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-02-02 23:02:57 -05:00

173 Commits