Commit Graph

751 Commits

Author SHA1 Message Date
NeilBrown
6c144d3164 md: move EXPORT_SYMBOL to after function in md.c
Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:29 +11:00
NeilBrown
2cbbca5e7c md: discard PRINT_RAID_DEBUG ioctl
All the interesting information printed by this ioctl
is provided in /proc/mdstat and/or sysfs.
So it isn't needed and isn't used and would be best if it didn't
exist.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:29 +11:00
NeilBrown
403df47888 md: remove MD_BUG()
Most of the places that call this are doing so pointlessly.
A couple of the others a best replaced with WARN_ON().

Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:29 +11:00
NeilBrown
3adc28d85f md: clean up 'exit' labels in md_ioctl().
There are 4 labels and we only really need two.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:29 +11:00
NeilBrown
326eb17d73 md: remove unnecessary test for MD_MAJOR in md_ioctl()
unknown ioctls no longer get this deep into md_ioctl since
md_ioctl_valid() was introduced in 3.14.
So remove the test and the misleading comment.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:29 +11:00
NeilBrown
e1960f8c5c md: don't allow "-sync" to be set for device in an active array.
If an array is active, devices can be marked 'faulty', but simply
removing the 'sync' flag is wrong.  That only makes sense
for an array which is not active (and is probably only useful
for testing anyway).

Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:29 +11:00
NeilBrown
f72ffdd686 md: remove unwanted white space from md.c
My editor shows much of this is RED.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:29 +11:00
NeilBrown
ac05f25669 md: don't start resync thread directly from md thread.
The main 'md' thread is needed for processing writes, so if it blocks
write requests could be delayed.

Starting a new thread requires some GFP_KERNEL allocations and so can
wait for writes to complete.  This can deadlock.

So instead, ask a workqueue to start the sync thread.
There is no particular rush for this to happen, so any work queue
will do.

MD_RECOVERY_RUNNING is used to ensure only one thread is started.

Reported-by: BillStuff <billstuff2001@sbcglobal.net>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:28 +11:00
NeilBrown
8b1afc3d67 md: Just use RCU when checking for overlap between arrays.
We don't really need the full mddev_lock here, and having to
drop it is messy.
RCU is enough to protect these lists.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:28 +11:00
Chao Yu
50bd377405 md: avoid potential long delay under pers_lock
printk may cause long time lapse if value of printk_delay in sysctl is
configured large by user. If register_md_personality takes long time to print in
spinlock pers_lock, we may encounter high CPU usage rate when there are other
pers_lock competitors who may be blocked to spin.
We can avoid this condition by moving printk out of coverage of pers_lock
spinlock.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:28 +11:00
NeilBrown
0638bb0e73 md: simplify export_array()
We don't really need that for_each loop, or those MD_BUGs.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:28 +11:00
NeilBrown
4878e9eb88 md: discard find_rdev_nr in favour of find_rdev_nr_rcu
Having both is a waste - just use the one.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:28 +11:00
NeilBrown
1967cd5616 md: use wait_event() to simplify md_super_wait()
md_super_wait is really just wait_event() open-coded.
So use the macro instead.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:28 +11:00
NeilBrown
9ba3b7f5d0 md: be more relaxed about stopping an array which isn't started.
In general we don't allow an array to be stopped if it is in use.
However if the array hasn't really been started yet, then any
apparent use is an anomily, probably due to 'udev' or similar
having a look to see what is there.

This means that if something goes wrong while assembling an array
it cannot reliably be un-assembled - STOP_ARRAY could fail.
There is no value here, so change do_md_stop() to succeed
despite concurrent opens if the array has not yet been
activated.  i.e. if ->pers is NULL.

Reported-by: "Baldysiak, Pawel" <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:28 +11:00
NeilBrown
d66b1b395a md: don't allow bitmap file to be added to raid0/linear.
An array can only accept a bitmap if it will call bitmap_daemon_work
periodically, which means it needs a thread running.

If there is no thread, don't allow a bitmap to be added.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-08-08 15:43:20 +10:00
Xiao Ni
ac7e50a383 md: Recovery speed is wrong
When we calculate the speed of recovery, the numerator that contains
the recovery done sectors.  It's need to subtract the sectors which
don't finish recovery.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-08-08 12:11:25 +10:00
NeilBrown
af5628f05d md: disable probing for md devices 512 and over.
The way md devices are traditionally created in the kernel
is simply to open the device with the desired major/minor number.

This can be problematic as some support tools, notably udev and
programs run by udev, can open a device just to see what is there, and
find that it has created something.  It is easy for a race to cause
udev to open an md device just after it was destroy, causing it to
suddenly re-appear.

For some time we have had an alternate way to create md devices
  echo md_somename > /sys/modules/md_mod/paramaters/new_array

This will always use a minor number of 512 or higher, which mdadm
normally avoids.
Using this makes the creation-by-opening unnecessary, but does
not disable it, so it is still there to cause problems.

This patch disable probing for devices with a major of 9 (MD_MAJOR)
and a minor of 512 and up.  This devices created by writing to
new_array cannot be re-created by opening the node in /dev.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-07-31 13:54:54 +10:00
NeilBrown
133d4527ea md: flush writes before starting a recovery.
When we write to a degraded array which has a bitmap, we
make sure the relevant bit in the bitmap remains set when
the write completes (so a 're-add' can quickly rebuilt a
temporarily-missing device).

If, immediately after such a write starts, we incorporate a spare,
commence recovery, and skip over the region where the write is
happening (because the 'needs recovery' flag isn't set yet),
then that write will not get to the new device.

Once the recovery finishes the new device will be trusted, but will
have incorrect data, leading to possible corruption.

We cannot set the 'needs recovery' flag when we start the write as we
do not know easily if the write will be "degraded" or not.  That
depends on details of the particular raid level and particular write
request.

This patch fixes a corruption issue of long standing and so it
suitable for any -stable kernel.  It applied correctly to 3.0 at
least and will minor editing to earlier kernels.

Reported-by: Bill <billstuff2001@sbcglobal.net>
Tested-by: Bill <billstuff2001@sbcglobal.net>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/53A518BB.60709@sbcglobal.net
Signed-off-by: NeilBrown <neilb@suse.de>
2014-07-03 10:44:45 +10:00
NeilBrown
9bd3592032 md: make sure GET_ARRAY_INFO ioctl reports correct "clean" status
If an array has a bitmap, the when we set the "has bitmap" flag we
incorrectly clear the "is clean" flag.

"is clean" isn't really important when a bitmap is present, but it is
best to get it right anyway.

Reported-by: George Duffield <forumscollective@gmail.com>
Link: http://lkml.kernel.org/CAG__1a4MRV6gJL38XLAurtoSiD3rLBTmWpcS5HYvPpSfPR88UQ@mail.gmail.com
Fixes: 36fa30636f (v2.6.14)
Signed-off-by: NeilBrown <neilb@suse.de>
2014-07-03 10:44:31 +10:00
NeilBrown
8b32bf5e37 md: md_clear_badblocks should return an error code on failure.
Julia Lawall and coccinelle report that md_clear_badblocks always
returns 0, despite appearing to have an error path.
The error path really should return an error code.  ENOSPC is
reasonably appropriate.

Reported-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-05-29 16:59:46 +10:00
NeilBrown
bd8839e03b md: refuse to change shape of array if it is active but read-only
read-only arrays should not be changed.  This includes changing
the level, layout, size, or number of devices.

So reject those changes for readonly arrays.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-05-29 16:59:30 +10:00
NeilBrown
2ac295a544 md: always set MD_RECOVERY_INTR when interrupting a reshape thread.
Commit 8313b8e57f
   md: fix problem when adding device to read-only array with bitmap.

added a called to md_reap_sync_thread() which cause a reshape thread
to be interrupted (in particular, it could cause md_thread() to never even
call md_do_sync()).
However it didn't set MD_RECOVERY_INTR so ->finish_reshape() would not
know that the reshape didn't complete.

This only happens when mddev->ro is set and normally reshape threads
don't run in that situation.  But raid5 and raid10 can start a reshape
thread during "run" is the array is in the middle of a reshape.
They do this even if ->ro is set.

So it is best to set MD_RECOVERY_INTR before abortingg the
sync thread, just in case.

Though it rare for this to trigger a problem it can cause data corruption
because the reshape isn't finished properly.
So it is suitable for any stable which the offending commit was applied to.
(3.2 or later)

Fixes: 8313b8e57f
Cc: stable@vger.kernel.org (3.2+)
Signed-off-by: NeilBrown <neilb@suse.de>
2014-05-29 16:59:06 +10:00
NeilBrown
3991b31ea0 md: always set MD_RECOVERY_INTR when aborting a reshape or other "resync".
If mddev->ro is set, md_to_sync will (correctly) abort.
However in that case MD_RECOVERY_INTR isn't set.

If a RESHAPE had been requested, then ->finish_reshape() will be
called and it will think the reshape was successful even though
nothing happened.

Normally a resync will not be requested if ->ro is set, but if an
array is stopped while a reshape is on-going, then when the array is
started, the reshape will be restarted.  If the array is also set
read-only at this point, the reshape will instantly appear to success,
resulting in data corruption.

Consequently, this patch is suitable for any -stable kernel.

Cc: stable@vger.kernel.org (any)
Signed-off-by: NeilBrown <neilb@suse.de>
2014-05-28 13:39:39 +10:00
NeilBrown
0f62fb220a md: avoid possible spinning md thread at shutdown.
If an md array with externally managed metadata (e.g. DDF or IMSM)
is in use, then we should not set safemode==2 at shutdown because:

1/ this is ineffective: user-space need to be involved in any 'safemode' handling,
2/ The safemode management code doesn't cope with safemode==2 on external metadata
   and md_check_recover enters an infinite loop.

Even at shutdown, an infinite-looping process can be problematic, so this
could cause shutdown to hang.

Cc: stable@vger.kernel.org (any kernel)
Signed-off-by: NeilBrown <neilb@suse.de>
2014-05-06 09:49:31 +10:00
NeilBrown
e2f23b606b md: avoid oops on unload if some process is in poll or select.
If md-mod is unloaded while some process is in poll() or select(),
then that process maintains a pointer to md_event_waiters, and when
the try to unlink from that list, they will oops.

The procfs infrastructure ensures that ->poll won't be called after
remove_proc_entry, but doesn't provide a wait_queue_head for us to
use, and the waitqueue code doesn't provide a way to remove all
listeners from a waitqueue.

So we need to:
 1/ make sure no further references to md_event_waiters are taken (by
    setting md_unloading)
 2/ wake up all processes currently waiting, and
 3/ wait until all those processes have disconnected from our
    wait_queue_head.

Reported-by: "majianpeng" <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2014-04-09 14:42:34 +10:00