Commit Graph

149 Commits

Author SHA1 Message Date
NeilBrown
915c420ddf md/bitmap: be more consistent when setting new bits in memory bitmap.
For each active region corresponding to a bit in the bitmap with have
a 14bit counter (and some flags).
This counts
   number of active writes + bit in the on-disk bitmap + delay-needed.

The "delay-needed" is because we always want a delay before clearing a
bit.  So the number here is normally number of active writes plus 2.
If there have been no writes for a while, we drop to 1.
If still no writes we clear the bit and drop to 0.

So for consistency, when setting bit from the on-disk bitmap or by
request from user-space it is best to set the counter to '2' to start
with.

In particular we might also set the NEEDED_MASK flag at this time, and
in all other cases NEEDED_MASK is only set when the counter is 2 or
more.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 10:17:51 +11:00
NeilBrown
2e61ebbcc4 md/bitmap: daemon_work cleanup.
We have a variable 'mddev' in this function, but repeatedly get the
same value by dereferencing bitmap->mddev.
There is room for simplification here...

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 10:17:50 +11:00
NeilBrown
961902c0f8 md/bitmap: It is OK to clear bits during recovery.
commit d0a4bb4927 introduced a
regression which is annoying but fairly harmless.

When writing to an array that is undergoing recovery (a spare
in being integrated into the array), writing to the array will
set bits in the bitmap, but they will not be cleared when the
write completes.

For bits covering areas that have not been recovered yet this is not a
problem as the recovery will clear the bits.  However bits set in
already-recovered region will stay set and never be cleared.
This doesn't risk data integrity.  The only negatives are:
 - next time there is a crash, more resyncing than necessary will
   be done.
 - the bitmap doesn't look clean, which is confusing.

While an array is recovering we don't want to update the
'events_cleared' setting in the bitmap but we do still want to clear
bits that have very recently been set - providing they were written to
the recovering device.

So split those two needs - which previously both depended on 'success'
and always clear the bit of the write went to all devices.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-12-23 09:57:48 +11:00
NeilBrown
7c8f424798 md/lock: ensure updates to page_attrs are properly locked.
Page attributes are set using __set_bit rather than set_bit as
it normally called under a spinlock so the extra atomicity is not
needed.

However there are two places where we might set or clear page
attributes without holding the spinlock.
So add the spinlock in those cases.

This might be the cause of occasional reports that bits a aren't
getting clear properly - theory is that BITMAP_PAGE_PENDING gets lost
when BITMAP_PAGE_NEEDWRITE is set or cleared.  This is an
inconvenience, not a threat to data safety.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-11-23 10:18:52 +11:00
NeilBrown
29d3247ea2 md/bitmap remove fault injection options.
These are too hard to use to be much more than noise.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-11 16:49:56 +11:00
NeilBrown
fd01b88c75 md: remove typedefs: mddev_t -> struct mddev
Having mddev_t and 'struct mddev_s' is ugly and not preferred

Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-11 16:47:53 +11:00
NeilBrown
3cb0300200 md: removing typedefs: mdk_rdev_t -> struct md_rdev
The typedefs are just annoying. 'mdk' probably refers to 'md_k.h'
which used to be an include file that defined this thing.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-11 16:45:26 +11:00
NeilBrown
36a4e1fe0f md: remove PRINTK and dprintk debugging and use pr_debug
Being able to dynamically enable these make them much more useful.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-10-07 14:23:17 +11:00
NeilBrown
2585f3ef8c md/bitmap: improve handling of 'allclean'.
The 'allclean' flag is used to cache the fact that there is nothing to
do, so we can avoid waking up and scanning the bitmap regularly.

The two sorts of pages that might need the attention of the bitmap
daemon are BITMAP_PAGE_PENDING and BITMAP_PAGE_NEEDWRITE pages.

So make sure allclean reflects exactly when there are none of those.
So:
  set it before scanning all pages with either bit set.
  clear it whenever these bits are set
  clear it when we desire not to clear one of these bits.
  don't clear it any other time.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-09-21 15:37:46 +10:00
NeilBrown
5a537df44d md/bitmap: rename and tidy up BITMAP_PAGE_CLEAN
The flag 'BITMAP_PAGE_CLEAN' has a confusing name as it doesn't mean
that the page is clean, but rather that there are counters in the page
which allow bits in the bitmap to be cleared - i.e. maybe cleaning can
happen.

So change it to BITMAP_PAGE_PENDING and fix some irregularities:
 - Don't set it in bitmap_init_from_disk as bitmap_set_memory_bits
   sets it when needed
 - in bitmap_daemon_work, if we find a counter that is '1', but
   need_sync is set, then set BITMAP_PAGE_PENDING again (it was
   recently cleared) to ensure we don't forget about this bit.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-09-21 15:37:46 +10:00
Jonathan Brassow
3520fa4db7 MD bitmap: Revert DM dirty log hooks
Revert most of commit e384e58549
  md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log.

MD should not need to use DM's dirty log - we decided to use md's
bitmaps instead.

Keeping the DIV_ROUND_UP clean-ups that were part of commit
e384e58549, however.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:37 +10:00
Akinobu Mita
a0a02a7ad6 md: use proper little-endian bitops
Using __test_and_{set,clear}_bit_le() with ignoring its return value
can be replaced with __{set,clear}_bit_le().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
Namhyung Kim
97b3d4aacf md/bitmap: remove unused fields from struct bitmap
Get rid of ->syncchunk and ->counter_bits since they're never used.

Also discard COUNTER_BYTE_RATIO which is unused.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-09 11:43:01 +10:00
Namhyung Kim
27d5ea04d0 md/bitmap: use proper accessor macro
Use COUNTER()/NEEDED() macro instead of open-coding them.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-09 11:42:57 +10:00
Jonathan Brassow
d744540cd3 MD: use is_power_of_2 macro
Make use of is_power_of_2 macro.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-09 11:42:36 +10:00
Jonathan Brassow
9c81075f43 MD: support initial bitmap creation in-kernel
Add bitmap support to the device-mapper specific metadata area.

This patch allows the creation of the bitmap metadata area upon
initial array creation via device-mapper.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-06-09 11:41:36 +10:00
NeilBrown
8258c53208 md/bitmap: fix saving of events_cleared and other state.
If a bitmap is found to be 'stale' the events_cleared value
is set to match 'events'.
However if the array is degraded this does not get stored on disk.
This can subsequently lead to incorrect behaviour.

So change bitmap_update_sb to always update events_cleared in the
superblock from the known events_cleared.
For neatness also set ->state from ->flags.
This requires updating ->state whenever we update ->flags, which makes
sense anyway.

This is suitable for any active -stable release.

cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
2011-05-11 14:26:30 +10:00
Linus Torvalds
6c51038900 Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
  Documentation/iostats.txt: bit-size reference etc.
  cfq-iosched: removing unnecessary think time checking
  cfq-iosched: Don't clear queue stats when preempt.
  blk-throttle: Reset group slice when limits are changed
  blk-cgroup: Only give unaccounted_time under debug
  cfq-iosched: Don't set active queue in preempt
  block: fix non-atomic access to genhd inflight structures
  block: attempt to merge with existing requests on plug flush
  block: NULL dereference on error path in __blkdev_get()
  cfq-iosched: Don't update group weights when on service tree
  fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
  block: Require subsystems to explicitly allocate bio_set integrity mempool
  jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  fs: make fsync_buffers_list() plug
  mm: make generic_writepages() use plugging
  blk-cgroup: Add unaccounted time to timeslice_used.
  block: fixup plugging stubs for !CONFIG_BLOCK
  block: remove obsolete comments for blkdev_issue_zeroout.
  blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
  ...

Fix up conflicts in fs/{aio.c,super.c}
2011-03-24 10:16:26 -07:00
Akinobu Mita
6b33aff368 md: use little-endian bitops
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h.  This converts ext2 non-atomic bit operations to
little-endian bit operations.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:46:20 -07:00
Jens Axboe
721a9602e6 block: kill off REQ_UNPLUG
With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:27 +01:00
Jens Axboe
7eaceaccab block: remove per-queue plugging
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:07 +01:00
NeilBrown
75d3da43cb md: Don't let implementation detail of curr_resync leak out through sysfs.
mddev->curr_resync has artificial values of '1' and '2' which are used
by the code which ensures only one resync is happening at a time on
any given device.

These values are internal and should never be exposed to user-space
(except when translated appropriately as in the 'pending' status in
/proc/mdstat).

Unfortunately they are as ->curr_resync is assigned to
->curr_resync_completed and that value is directly visible through
sysfs.

So change the assignments to ->curr_resync_completed to get the same
valued from elsewhere in a form that doesn't have the magic '1' or '2'
values.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-01-14 09:14:34 +11:00
Jonathan Brassow
a6ff7e089c md: separate meta and data devs
Allow the metadata to be on a separate device from the
data.

This doesn't mean the data and metadata will by on separate
physical devices - it simply gives device-mapper and userspace
tools more flexibility.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-01-14 09:14:34 +11:00
Jonathan Brassow
ccebd4c415 md-new-param-to_sync_page_io
Add new parameter to 'sync_page_io'.

The new parameter allows us to distinguish between metadata and data
operations.  This becomes important later when we add the ability to
use separate devices for data and metadata.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2011-01-14 09:14:33 +11:00
NeilBrown
be2a2656ee md: unplug writes to external bitmaps.
When writing to an 'external' bitmap we don't currently unplug the
device before waiting, so we can get a 3msec delay each time;
So use REQ_UNPLUG to force and unplug.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-10-29 16:40:32 +11:00