linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
Jonathan Brassow	72f4b31410	dm raid1: handle write failures This patch gives mirror the ability to handle device failures during normal write operations. The 'write_callback' function is called when a write completes. If all the writes failed or succeeded, we report failure or success respectively. If some of the writes failed, we call fail_mirror; which increments the error count for the device, notes the type of error encountered (DM_RAID1_WRITE_ERROR), and selects a new primary (if necessary). Note that the primary device can never change while the mirror is not in-sync (IOW, while recovery is happening.) This means that the scenario where a failed write changes the primary and gives recovery_complete a chance to misread the primary never happens. The fact that the primary can change has necessitated the change to the default_mirror field. We need to protect against reading garbage while the primary changes. We then add the bio to a new list in the mirror set, 'failures'. For every bio in the 'failures' list, we call a new function, '__bio_mark_nosync', where we mark the region 'not-in-sync' in the log and properly set the region state as, RH_NOSYNC. Userspace must also be notified of the failure. This is done by 'raising an event' (dm_table_event()). If fail_mirror is called in process context the event can be raised right away. If in interrupt context, the event is deferred to the kmirrord thread - which raises the event if 'event_waiting' is set. Backwards compatibility is maintained by ignoring errors if the DM_FEATURES_HANDLE_ERRORS flag is not present. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2008-02-08 02:11:29 +00:00
Jonathan Brassow	aa5617c553	dm raid1: add mirror_set to struct mirror Store a pointer to the owning mirror_set structure within each mirror structure for a subsequent patch to use. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2007-10-20 02:01:22 +01:00
Jonathan Brassow	6b3df0d7a5	dm log: split suspend There are now two phases to a suspend in device-mapper - presuspend and postsuspend. This patch removes the single 'suspend' in the logging API and replaces it with 'presuspend' and 'postsuspend' functions to align it better with core device-mapper. A subsequent patch will make use of 'presuspend'. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2007-10-20 02:01:21 +01:00
vignesh babu	6f3c3f0afa	dm: use is_power_of_2 Replacing n & (n - 1) for power of 2 check by is_power_of_2(n) Signed-off-by: vignesh babu <vignesh.babu@wipro.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2007-10-20 02:01:06 +01:00
Dmitry Monakhov	a72cf737e0	dm raid1: fix leakage Add missing 'dm_io_client_destroy' to alloc_context error path. Reorganize mirror constructor error path in order to prevent workqueue leakage. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2007-10-20 02:01:01 +01:00
NeilBrown	6712ecf8f6	Drop 'size' argument from bio_endio and bi_end_io As bi_end_io is only called once when the reqeust is complete, the 'size' argument is now redundant. Remove it. Now there is no need for bio_endio to subtract the size completed from bi_size. So don't do that either. While we are at it, change bi_end_io to return void. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:57 +02:00
Yoann Padioleau	dd00cc486a	some kmalloc/memset ->kzalloc (tree wide) Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc). Here is a short excerpt of the semantic patch performing this transformation: @@ type T2; expression x; identifier f,fld; expression E; expression E1,E2; expression e1,e2,e3,y; statement S; @@ x = - kmalloc + kzalloc (E1,E2) ... when != \(x->fld=E;\\|y=f(...,x,...);\\|f(...,x,...);\\|x=E;\\|while(...) S\\|for(e1;e2;e3) S\) - memset((T2)x,0,E1); @@ expression E1,E2,E3; @@ - kzalloc(E1 * E2,E3) + kcalloc(E1,E2,E3) [akpm@linux-foundation.org: get kcalloc args the right way around] Signed-off-by: Yoann Padioleau <padator@wanadoo.fr> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Acked-by: Russell King <rmk@arm.linux.org.uk> Cc: Bryan Wu <bryan.wu@analog.com> Acked-by: Jiri Slaby <jirislaby@gmail.com> Cc: Dave Airlie <airlied@linux.ie> Acked-by: Roland Dreier <rolandd@cisco.com> Cc: Jiri Kosina <jkosina@suse.cz> Acked-by: Dmitry Torokhov <dtor@mail.ru> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Acked-by: Pierre Ossman <drzeus-list@drzeus.cx> Cc: Jeff Garzik <jeff@garzik.org> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Greg KH <greg@kroah.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:50 -07:00
Jonathan Brassow	fc1ff9588a	dm raid1: handle log failure When writing to a mirror, the log must be updated first. Failure to update the log could result in the log not properly reflecting the state of the mirror if the machine should crash. We change the return type of the rh_flush function to give us the ability to check if a log write was successful. If the log write was unsuccessful, we fail the writes to avoid the case where the log does not properly reflect the state of the mirror. A follow-up patch - which is dependent on the ability to requeue I/O's to core device-mapper - will requeue the I/O's for retry (allowing the mirror to be reconfigured.) Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-12 15:01:08 -07:00
Jonathan Brassow	f44db678ed	dm raid1: handle resync failures Device-mapper mirroring currently takes a best effort approach to recovery - failures during mirror synchronization are completely ignored. This means that regions are marked 'in-sync' and 'clean' and removed from the hash list. Future reads and writes that query the region will incorrectly interpret the region as in-sync. This patch handles failures during the recovery process. If a failure occurs, the region is marked as 'not-in-sync' (aka RH_NOSYNC) and added to a new list 'failed_recovered_regions'. Regions on the 'failed_recovered_regions' list are not marked as 'clean' upon removal from the list. Furthermore, if the DM_RAID1_HANDLE_ERRORS flag is set, the region is marked as 'not-in-sync'. This action prevents any future read-balancing from choosing an invalid device because of the 'not-in-sync' status. If "handle_errors" is not specified when creating a mirror (leaving the DM_RAID1_HANDLE_ERRORS flag unset), failures will be ignored exactly as they would be without this patch. This is to preserve backwards compatibility with user-space tools, such as 'pvmove'. However, since future read-balancing policies will rely on the correct sync status of a region, a user must choose "handle_errors" when using read-balancing. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-12 15:01:08 -07:00
Jonathan Brassow	943317efdb	dm raid1: clear region outside spinlock A clear_region function is permitted to block (in practice, rare) but gets called in rh_update_states() with a spinlock held. The bits being marked and cleared by the above functions are used to update the on-disk log, but are never read directly. We can perform these operations outside the spinlock since the bits are only changed within one thread viz. - mark_region in rh_inc() - clear_region in rh_update_states(). So, we grab the clean_regions list items via list_splice() within the spinlock and defer clear_region() until we iterate over the list for deletion - similar to how the recovered_regions list is already handled. We then move the flush() call down to ensure it encapsulates the changes which are done by the later calls to clear_region(). Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-12 15:01:08 -07:00
Milan Broz	c95bc206da	dm raid1: fix status Fix mirror status line broken in dm-log-report-fault-status.patch: - space missing between two words - placeholder ("0") required for compatibility with a subsequent patch - incorrect offset parameter Cc: stable@kernel.org Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-12 15:01:08 -07:00
Alasdair G Kergon	0cd3312434	dm: remove duplicate module name from error msgs Remove explicit module name from messages as the macro now includes it automatically. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-12 15:01:08 -07:00
Jonathan Brassow	b997b82d26	dm raid1: switch rh_in_sync to blocking in do_reads The call to rh_in_sync() in do_reads() should be allowed to block. It is in the mirror worker thread which already permits blocking operations. This will be needed to support clustered mirroring which will perform network operations. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:48 -07:00
Jonathan Brassow	f5353cd7c9	dm raid1: fix to commit pending clear region requests With the code as it is, it is possible for oustanding clear region requests never to get flushed when a mirror is deactivated or suspended. This means there will always be some resync work required when a mirror is activated, even though it may very well be in-sync. Always requesting the flush doesn't hurt us. This is because the log tracks whether any changes occurred and, if not, no flush is performed. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:48 -07:00
Milan Broz	88be163abb	dm raid1: update dm io interface This patch ports dm-raid1.c to the new dm-io interface. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:47 -07:00
Jonathan E Brassow	a8e6afa236	dm raid1: add handle_errors feature flag This patch adds the ability to specify desired features in the mirror constructor/mapping table. The first feature of interest is "handle_errors". Currently, mirroring will ignore any I/O errors from the devices. Subsequent patches will check for this flag and handle the errors. If flag/feature is not present, mirror will do nothing - maintaining backwards compatibility. Signed-off-by: Jonathan E Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:47 -07:00
Jonathan E Brassow	315dcc226f	dm log: report fault status This patch reports the status of the log device so that userspace can detect the error and take appropriate action. Signed-off-by: Jonathan E Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:47 -07:00
Holger Smolinski	6ad36fe2b4	dm raid1: one kmirrord per mirror This patch replaces the single instance of kmirrord by one instance per mirror set. This change is required to avoid a deadlock in kmirrord when the persistent dirty log of a mirror itself resides on a mirror. The single instance of kmirrord then issues a sync write to the dirty log in write_bits which gets deferred to kmirrord itself later in the call chain. But kmirrord never does the deferred work because it is still waiting for the sync write_bits. _mirror_sets is removed as it no longer needed, and we always flush the workqueue before destroying it to ensure all work is complete before destroying it. Signed-off-by: Holger Smolinski <smolinski@de.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:46 -07:00
Jonathan E Brassow	f3ee6b2f62	[PATCH] dm: log: rename complete_resync_work The complete_resync_work function only provides the ability to change an out-of-sync region to in-sync. This patch enhances the function to allow us to change the status from in-sync to out-of-sync as well, something that is needed when a mirror write to one of the devices or an initial resync on a given region fails. Signed-off-by: Jonathan E Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:09 -08:00
Kiyoshi Ueda	d2a7ad29a8	[PATCH] dm: map and endio symbolic return codes Update existing targets to use the new symbols for return values from target map and end_io functions. There is no effect on behaviour. Test results: Done build test without errors. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:09 -08:00
David Howells	c4028958b6	WorkStruct: make allyesconfig Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-11-22 14:57:56 +00:00
Jonathan E Brassow	33184048dc	[PATCH] dm: raid1: fix waiting for io on suspend All device-mapper targets must complete outstanding I/O before suspending. The mirror target generates I/O in its recovery phase and fails to wait for it. It needs to be tracked so we can ensure that it has completed before we suspend. [akpm@osdl.org: cleanup] Signed-off-by: Jonathan E Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: <dm-devel@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-11-08 18:29:23 -08:00
Jonathan Brassow	e52b8f6dbe	[PATCH] dm mirror: remove trailing space from table Remove trailing space from 'dmsetup table' output. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-03 08:04:15 -07:00
Daniel Kobras	c06aad854f	[PATCH] dm: Fix deadlock under high i/o load in raid1 setup. On an nForce4-equipped machine with two SATA disk in raid1 setup using dmraid, we experienced frequent deadlock of the system under high i/o load. 'cat /dev/zero > ~/zero' was the most reliable way to reproduce them: Randomly after a few GB, 'cp' would be left in 'D' state along with kjournald and kmirrord. The functions cp and kjournald were blocked in did vary, but kmirrord's wchan always pointed to 'mempool_alloc()'. We've seen this pattern on 2.6.15 and 2.6.17 kernels. http://lkml.org/lkml/2005/4/20/142 indicates that this problem has been around even before. So much for the facts, here's my interpretation: mempool_alloc() first tries to atomically allocate the requested memory, or falls back to hand out preallocated chunks from the mempool. If both fail, it puts the calling process (kmirrord in this case) on a private waitqueue until somebody refills the pool. Where the only 'somebody' is kmirrord itself, so we have a deadlock. I worked around this problem by falling back to a (blocking) kmalloc when before kmirrord would have ended up on the waitqueue. This defeats part of the benefits of using the mempool, but at least keeps the system running. And it could be done with a two-line change. Note that mempool_alloc() clears the GFP_NOIO flag internally, and only uses it to decide whether to wait or return an error if immediate allocation fails, so the attached patch doesn't change behaviour in the non-deadlocking case. Path is against current git (2.6.18-rc4), but should apply to earlier versions as well. I've tested on 2.6.15, where this patch makes the difference between random lockup and a stable system. Signed-off-by: Daniel Kobras <kobras@linux.de> Acked-by: Alasdair G Kergon <agk@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-27 11:01:28 -07:00
Alasdair G Kergon	72d9486169	[PATCH] dm: improve error message consistency Tidy device-mapper error messages to include context information automatically. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 09:58:36 -07:00

1 2

39 Commits