linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
NeilBrown	c6563a8c38	md: add possibility to change data-offset for devices. When reshaping we can avoid costly intermediate backup by changing the 'start' address of the array on the device (if there is enough room). So as a first step, allow such a change to be requested through sysfs, and recorded in v1.x metadata. (As we didn't previous check that all 'pad' fields were zero, we need a new FEATURE flag for this. A (belatedly) check that all remaining 'pad' fields are zero to avoid a repeat of this) The new data offset must be requested separately for each device. This allows each to have a different change in the data offset. This is not likely to be used often but as data_offset can be set per-device, new_data_offset should be too. This patch also removes the 'acknowledged' arg to rdev_set_badblocks as it is never used and never will be. At the same time we add a new arg ('in_new') which is currently always zero but will be used more soon. When a reshape finishes we will need to update the data_offset and rdev->sectors. So provide an exported function to do that. Signed-off-by: NeilBrown <neilb@suse.de>	2012-05-21 09:27:00 +10:00
NeilBrown	b0d634d568	md/raid10: fix transcription error in calc_sectors conversion. The old code was sector_div(stride, fc); the new code was sector_dir(size, conf->near_copies); 'size' is right (the stride various wasn't really needed), but 'fc' means 'far_copies', and that is an important difference. Signed-off-by: NeilBrown <neilb@suse.de>	2012-05-19 09:01:13 +10:00
NeilBrown	6508fdbf40	md/raid10: set dev_sectors properly when resizing devices in array. raid10 stores dev_sectors in 'conf' separately from the one in 'mddev' because it can have a very significant effect on block addressing and so need to be updated carefully. However raid10_resize isn't updating it at all! To update it correctly, we need to make sure it is a proper multiple of the chunksize taking various details of the layout in to account. This calculation is currently done in setup_conf. So split it out from there and call it from raid10_resize as well. Then set conf->dev_sectors properly. Signed-off-by: NeilBrown <neilb@suse.de>	2012-05-17 10:08:45 +10:00
majianpeng	f4380a9158	md/raid1,raid10: Fix calculation of 'vcnt' when processing error recovery. If r1bio->sectors % 8 != 0,then the memcmp and a later memcpy will omit the last bio_vec. This is suitable for any stable kernel since 3.1 when bad-block management was introduced. Cc: stable@vger.kernel.org Signed-off-by: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-12 16:04:47 +10:00
NeilBrown	5020ad7d14	md/raid1,raid10: don't compare excess byte during consistency check. When comparing two pages read from different legs of a mirror, only compare the bytes that were read, not the whole page. In most cases we read a whole page, but in some cases with bad blocks or odd sizes devices we might read fewer than that. This bug has been present "forever" but at worst it might cause a report of two many mismatches and generate a little bit extra resync IO, so there is no need to back-port to -stable kernels. Reported-by: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-04-03 15:39:23 +10:00
NeilBrown	006a09a0ae	md/raid10 - support resizing some RAID10 arrays. 'resizing' an array in this context means making use of extra space that has become available in component devices, not adding new devices. It also includes shrinking the array to take up less space of component devices. This is not supported for array with a 'far' layout. However for 'near' and 'offset' layout arrays, adding and removing space at the end of the devices is easy to support, and this patch provides that support. Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-19 12:46:40 +11:00
NeilBrown	050b66152f	md/raid10: handle merge_bvec_fn in member devices. Currently we don't honour merge_bvec_fn in member devices so if there is one, we force all requests to be single-page at most. This is not ideal. So enhance the raid10 merge_bvec_fn to check that function in children as well. This introduces a small problem. There is no locking around calls the ->merge_bvec_fn and subsequent calls to ->make_request. So a device added between these could end up getting a request which violates its merge_bvec_fn. Currently the best we can do is synchronize_sched(). This will work providing no preemption happens. If there is preemption, we just have to hope that new devices are largely consistent with old devices. Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-19 12:46:39 +11:00
NeilBrown	dafb20fa34	md: tidy up rdev_for_each usage. md.h has an 'rdev_for_each()' macro for iterating the rdevs in an mddev. However it uses the 'safe' version of list_for_each_entry, and so requires the extra variable, but doesn't include 'safe' in the name, which is useful documentation. Consequently some places use this safe version without needing it, and many use an explicity list_for_each entry. So: - rename rdev_for_each to rdev_for_each_safe - create a new rdev_for_each which uses the plain list_for_each_entry, - use the 'safe' version only where needed, and convert all other list_for_each_entry calls to use rdev_for_each. Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-19 12:46:39 +11:00
NeilBrown	d6b42dcb99	md/raid1,raid10: avoid deadlock during resync/recovery. If RAID1 or RAID10 is used under LVM or some other stacking block device, it is possible to enter a deadlock during resync or recovery. This can happen if the upper level block device creates two requests to the RAID1 or RAID10. The first request gets processed, blocks recovery and queue requests for underlying requests in current->bio_list. A resync request then starts which will wait for those requests and block new IO. But then the second request to the RAID1/10 will be attempted and it cannot progress until the resync request completes, which cannot progress until the underlying device requests complete, which are on a queue behind that second request. So allow that second request to proceed even though there is a resync request about to start. This is suitable for any -stable kernel. Cc: stable@vger.kernel.org Reported-by: Ray Morris <support@bettercgi.com> Tested-by: Ray Morris <support@bettercgi.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-19 12:46:38 +11:00
NeilBrown	dc10c643e8	md: allow re-add to failed arrays. When an array is failed (some data inaccessible) then there is no point attempting to add a spare as it could not possibly be recovered. However that may be value in re-adding a recently removed device. e.g. if there is a write-intent-bitmap and it is clear, then access to the data could be restored by this action. So don't reject a re-add to a failed array for RAID10 and RAID5 (the only arrays types that check for a failed array). Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-19 12:46:37 +11:00
NeilBrown	547414d19f	md/raid10: remove unnecessary smp_mb() from end_sync_write Recent commit `4ca40c2ce0` (md/raid10: Allow replacement device ...) added an smp_mb in end_sync_write. This was to close a possible race with raid10_remove_disk. However there is no such race as it is never attempted to remove a disk while resync (or recovery) is happening. so the smp_mb is just noise. Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-13 11:21:20 +11:00
NeilBrown	7a90484825	md/raid10: fix assembling of arrays with replacement devices. commit `56a2559bb6` (md/raid10: recognise replacements ...) changed 'run' to set ->replacement or ->rdev depending on the 'Replacement' status if the device, but it didn't remove the old unconditional setting of 'rdev'. So it was largely ineffective. So remove that now. Signed-off-by: NeilBrown <neilb@suse.de>	2012-03-06 10:12:45 +11:00
NeilBrown	fae8cc5ed0	md/raid10: fix handling of error on last working device in array. If we get a read error on the last working device in a RAID10 which contains the target block, then we don't fail the device (which is good) but we don't abort retries, which is wrong. We end up in an infinite loop retrying the read on the one device. This patch fixes the problem in two places: 1/ in raid10_end_read_request we don't even ask for a retry if this was the last usable device. This is efficient but a little racy and will sometimes retry when it should not. 2/ in handle_read_error we are careful to exclude any device from retry which we tried to mark as faulty (that might have failed if it was the last device). This is race-free but less efficient. Signed-off-by: NeilBrown <neilb@suse.de>	2012-02-14 11:10:10 +11:00
NeilBrown	b7044d41b5	md/raid10: If there is a spare and a want_replacement device, start replacement. When attempting to add a spare to a RAID10 array, also consider adding it as a replacement for a want_replacement device. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:17:56 +11:00
NeilBrown	56a2559bb6	md/raid10: recognise replacements when assembling array. If a Replacement is seen, file it as such. If we see two replacements (or two normal devices) for the one slot, abort. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:17:55 +11:00
NeilBrown	4ca40c2ce0	md/raid10: Allow replacement device to be replace old drive. When recovery finish and spare_active is called, check for a replace that might have just become fully synced and mark it as such, marking the original as failed. Then when the original is removed, move the replacement into its position. This means that 'replacement' and spontaneously become NULL in some situations. Make sure we check for those. It also means that 'rdev' and 'replacement' could appear to be identical - check for that too. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:17:55 +11:00
NeilBrown	24afd80d99	md/raid10: handle recovery of replacement devices. If there is a replacement device, then recover to it, reading from any drives - maybe the one being replaced, maybe not. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:17:55 +11:00
NeilBrown	9ad1aefc8a	md/raid10: Handle replacement devices during resync. If we need to resync an array which has replacement devices, we always write any block checked to every replacement. If the resync was bitmap-based resync we will then complete the replacement normally. If it was a full resync, we mark the replacements as fully recovered when the resync finishes so no further recovery is needed. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:17:55 +11:00
NeilBrown	475b0321a4	md/raid10: writes should get directed to replacement as well as original. When writing, we need to submit two writes, one to the original, and one to the replacements - if there is a replacement. If the write to the replacement results in a write error we just fail the device. We only try to record write errors to the original. This only handles writing new data. Writing for resync/recovery will come later. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:17:55 +11:00
NeilBrown	c8ab903ea9	md/raid10: allow removal of failed replacement devices. Enhance raid10_remove_disk to be able to remove ->replacement as well as ->rdev Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:17:54 +11:00
NeilBrown	abbf098e6e	md/raid10: preferentially read from replacement device if possible. When reading (for array reads, not for recovery etc) we read from the replacement device if it has recovered far enough. This requires storing the chosen rdev in the 'r10_bio' so we can make sure to drop the ref on the right device when the read finishes. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:17:54 +11:00
NeilBrown	96c3fd1f38	md/raid10: change read_balance to return an rdev It makes more sense to return an rdev than just an index as read_balance() gets a reference to the rdev and so returning the pointer make this more idiomatic. This will be needed in a future patch when we might return a 'replacement' rdev instead of the main rdev. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:17:54 +11:00
NeilBrown	69335ef3bc	md/raid10: prepare data structures for handling replacement. Allow each slot in the RAID10 to have 2 devices, the want_replacement and the replacement. Also an r10bio to have 2 bios, and for resync/recovery allocate the second bio if there are any replacement devices. Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:17:54 +11:00
NeilBrown	b8321b68d1	md: change hot_remove_disk to take an rdev rather than a number. Soon an array will be able to have multiple devices with the same raid_disk number (an original and a replacement). So removing a device based on the number won't work. So pass the actual device handle instead. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2011-12-23 10:17:51 +11:00
Linus Torvalds	32aaeffbd4	Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h	2011-11-06 19:44:47 -08:00

1 2 3 4 5 ...

231 Commits