The comment gives more details, but I didn't quite have the sequencing write,
so there was room for races to leave bits unset in the on-disk bitmap for
short periods of time.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When a device is unplugged, requests are moved from one or two (depending on
whether a bitmap is in use) queues to the main request queue.
So whenever requests are put on either of those queues, we should make sure
the raid5 array is 'plugged'. However we don't. We currently plug the raid5
queue just before putting requests on queues, so there is room for a race. If
something unplugs the queue at just the wrong time, requests will be left on
the queue and nothing will want to unplug them. Normally something else will
plug and unplug the queue fairly soon, but there is a risk that nothing will.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We introduced 'io_sectors' recently so we could count the sectors that causes
io during resync separate from sectors which didn't cause IO - there can be a
difference if a bitmap is being used to accelerate resync.
However when a speed is reported, we find the number of sectors processed
recently by subtracting an oldish io_sectors count from a current
'curr_resync' count. This is wrong because curr_resync counts all sectors,
not just io sectors.
So, add a field to mddev to store the curren io_sectors separately from
curr_resync, and use that in the calculations.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When an array is started we start one or two threads (two if there is a
reshape or recovery that needs to be completed).
We currently start these *before* the array is completely set up and in
particular before queue->queuedata is set. If the thread actually starts
very quickly on another CPU, we can end up dereferencing queue->queuedata
and oops.
This patch also makes sure we don't try to start a recovery if a reshape is
being restarted.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This has to be done in ->load_super, not ->validate_super
Without this, hot-adding devices to an array doesn't always
work right - though there is a work around in mdadm-2.5.2 to
make this less of an issue.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I have reports of a problem with raid5 which turns out to be because the raid5
device gets stuck in a 'plugged' state. This shouldn't be able to happen as
3msec after it gets plugged it should get unplugged. However it happens
none-the-less. This patch fixes the problem and is a reasonable thing to do,
though it might hurt performance slightly in some cases.
Until I can find the real problem, we should probably have this workaround in
place.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Teach special (recursive) locking code to the lock validator.
Effects on non-lockdep kernels:
- the introduction of the following function variants:
extern struct block_device *open_partition_by_devnum(dev_t, unsigned);
extern int blkdev_put_partition(struct block_device *);
static int
blkdev_get_whole(struct block_device *bdev, mode_t mode, unsigned flags);
which on non-lockdep are the same as open_by_devnum(), blkdev_put()
and blkdev_get().
- a subclass parameter to do_open(). [unused on non-lockdep]
- a subclass parameter to __blkdev_put(), which is a new internal
function for the main blkdev_put*() functions. [parameter unused
on non-lockdep kernels, except for two sanity check WARN_ON()s]
these functions carry no semantical difference - they only express
object dependencies towards the lockdep subsystem.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6: (22 commits)
[PATCH] devfs: Remove it from the feature_removal.txt file
[PATCH] devfs: Last little devfs cleanups throughout the kernel tree.
[PATCH] devfs: Rename TTY_DRIVER_NO_DEVFS to TTY_DRIVER_DYNAMIC_DEV
[PATCH] devfs: Remove the tty_driver devfs_name field as it's no longer needed
[PATCH] devfs: Remove the line_driver devfs_name field as it's no longer needed
[PATCH] devfs: Remove the videodevice devfs_name field as it's no longer needed
[PATCH] devfs: Remove the gendisk devfs_name field as it's no longer needed
[PATCH] devfs: Remove the miscdevice devfs_name field as it's no longer needed
[PATCH] devfs: Remove the devfs_fs_kernel.h file from the tree
[PATCH] devfs: Remove devfs_remove() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_cdev() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_bdev() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_symlink() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_dir() function from the kernel tree
[PATCH] devfs: Remove devfs_*_tape() functions from the kernel tree
[PATCH] devfs: Remove devfs support from the sound subsystem
[PATCH] devfs: Remove devfs support from the ide subsystem.
[PATCH] devfs: Remove devfs support from the serial subsystem
[PATCH] devfs: Remove devfs from the init code
[PATCH] devfs: Remove devfs from the partition code
...
It appears in /sys/mdX/md/dev-YYY/state
and can be set or cleared by writing 'writemostly' or '-writemostly'
respectively.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The md/dev-XXX/state file can now be written:
"faulty" simulates an error on the device
"remove" removes the device from the array (if it is not busy)
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This allows the state of an md/array to be directly controlled via sysfs and
adds the ability to stop and array without tearing it down.
Array states/settings:
clear
No devices, no size, no level
Equivalent to STOP_ARRAY ioctl
inactive
May have some settings, but array is not active
all IO results in error
When written, doesn't tear down array, but just stops it
suspended (not supported yet)
All IO requests will block. The array can be reconfigured.
Writing this, if accepted, will block until array is quiescent
readonly
no resync can happen. no superblocks get written.
write requests fail
read-auto
like readonly, but behaves like 'clean' on a write request.
clean - no pending writes, but otherwise active.
When written to inactive array, starts without resync
If a write request arrives then
if metadata is known, mark 'dirty' and switch to 'active'.
if not known, block and switch to write-pending
If written to an active array that has pending writes, then fails.
active
fully active: IO and resync can be happening.
When written to inactive array, starts with resync
write-pending (not supported yet)
clean, but writes are blocked waiting for 'active' to be written.
active-idle
like active, but no writes have been seen for a while (100msec).
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- record the 'event' count on each individual device (they
might sometimes be slightly different now)
- add a new value for 'sb_dirty': '3' means that the super
block only needs to be updated to record a clean<->dirty
transition.
- Prefer odd event numbers for dirty states and even numbers
for clean states
- Using all the above, don't update the superblock on
a spare device if the update is just doing a clean-dirty
transition. To accomodate this, a transition from
dirty back to clean might now decrement the events counter
if nothing else has changed.
The net effect of this is that spare drives will not see any IO requests
during normal running of the array, so they can go to sleep if that is what
they want to do.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When an array has a bitmap, a device can be removed and re-added and only
blocks changes since the removal (as recorded in the bitmap) will be resynced.
It should be possible to do a similar thing to arrays without bitmaps. i.e.
if a device is removed and re-added and *no* changes have been made in the
interim, then the add should not require a resync.
This patch allows that option. This means that when assembling an array one
device at a time (e.g. during device discovery) the array can be enabled
read-only as soon as enough devices are available, but extra devices can still
be added without causing a resync.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>