Revert most of commit e384e58549
md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log.
MD should not need to use DM's dirty log - we decided to use md's
bitmaps instead.
Keeping the DIV_ROUND_UP clean-ups that were part of commit
e384e58549, however.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Get rid of ->syncchunk and ->counter_bits since they're never used.
Also discard COUNTER_BYTE_RATIO which is unused.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Add bitmap support to the device-mapper specific metadata area.
This patch allows the creation of the bitmap metadata area upon
initial array creation via device-mapper.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If a bitmap is found to be 'stale' the events_cleared value
is set to match 'events'.
However if the array is degraded this does not get stored on disk.
This can subsequently lead to incorrect behaviour.
So change bitmap_update_sb to always update events_cleared in the
superblock from the known events_cleared.
For neatness also set ->state from ->flags.
This requires updating ->state whenever we update ->flags, which makes
sense anyway.
This is suitable for any active -stable release.
cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
Documentation/iostats.txt: bit-size reference etc.
cfq-iosched: removing unnecessary think time checking
cfq-iosched: Don't clear queue stats when preempt.
blk-throttle: Reset group slice when limits are changed
blk-cgroup: Only give unaccounted_time under debug
cfq-iosched: Don't set active queue in preempt
block: fix non-atomic access to genhd inflight structures
block: attempt to merge with existing requests on plug flush
block: NULL dereference on error path in __blkdev_get()
cfq-iosched: Don't update group weights when on service tree
fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
block: Require subsystems to explicitly allocate bio_set integrity mempool
jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
fs: make fsync_buffers_list() plug
mm: make generic_writepages() use plugging
blk-cgroup: Add unaccounted time to timeslice_used.
block: fixup plugging stubs for !CONFIG_BLOCK
block: remove obsolete comments for blkdev_issue_zeroout.
blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
...
Fix up conflicts in fs/{aio.c,super.c}
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
mddev->curr_resync has artificial values of '1' and '2' which are used
by the code which ensures only one resync is happening at a time on
any given device.
These values are internal and should never be exposed to user-space
(except when translated appropriately as in the 'pending' status in
/proc/mdstat).
Unfortunately they are as ->curr_resync is assigned to
->curr_resync_completed and that value is directly visible through
sysfs.
So change the assignments to ->curr_resync_completed to get the same
valued from elsewhere in a form that doesn't have the magic '1' or '2'
values.
Signed-off-by: NeilBrown <neilb@suse.de>
Allow the metadata to be on a separate device from the
data.
This doesn't mean the data and metadata will by on separate
physical devices - it simply gives device-mapper and userspace
tools more flexibility.
Signed-off-by: NeilBrown <neilb@suse.de>
Add new parameter to 'sync_page_io'.
The new parameter allows us to distinguish between metadata and data
operations. This becomes important later when we add the ability to
use separate devices for data and metadata.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
When writing to an 'external' bitmap we don't currently unplug the
device before waiting, so we can get a 3msec delay each time;
So use REQ_UNPLUG to force and unplug.
Signed-off-by: NeilBrown <neilb@suse.de>
Currently sync_page_io takes a 'bdev'.
Every caller passes 'rdev->bdev'.
We will soon want another field out of the rdev in sync_page_io,
So just pass the rdev instead of the bdev out of it.
Signed-off-by: NeilBrown <neilb@suse.de>
bitmap_get_counter returns the number of sectors covered
by the counter in a pass-by-reference variable.
In some cases this can be very large, so make it a sector_t
for safety.
Signed-off-by: NeilBrown <neilb@suse.de>
MD_CHANGE_CLEAN is used for two different purposes and this leads to
confusion.
One of the purposes is largely mirrored by MD_CHANGE_PENDING which is
not used for anything else, so have MD_CHANGE_PENDING take over that
purpose fully.
The two purposes are:
1/ tell md_update_sb that an update is needed and that it is just a
clean/dirty transition.
2/ tell user-space that an transition from clean to dirty is pending
(something wants to write), and tell te kernel (by clearin the
flag) that the transition is OK.
The first purpose remains wit MD_CHANGE_CLEAN, the second is moved
fully to MD_CHANGE_PENDING.
This means that various places which conditionally set or cleared
MD_CHANGE_CLEAN no longer need to be conditional.
Signed-off-by: NeilBrown <neilb@suse.de>
dm makes this distinction between ->ctr and ->resume, so we need to
too.
Also get the new bitmap_load to clear out the bitmap first, as this is
most consistent with the dm suspend/resume approach
Signed-off-by: NeilBrown <neilb@suse.de>
This allows md/raid5 to fully work as a dm target.
Normally md uses a 'filemap' which contains a list of pages of bits
each of which may be written separately.
dm-log uses and all-or-nothing approach to writing the log, so
when using a dm-log, ->filemap is NULL and the flags normally stored
in filemap_attr are stored in ->logattrs instead.
Signed-off-by: NeilBrown <neilb@suse.de>
A bitmap is stored as one page per 2048 bits.
If none of the bits are set, the page is not allocated.
When bitmap_get_counter finds that a page isn't allocate,
it just reports that one bit work of space isn't flagged,
rather than reporting that 2048 bits worth of space are
unflagged.
This can cause searches for flagged bits (e.g. bitmap_close_sync)
to do more work than is really necessary.
So change bitmap_get_counter (when creating) to report a number of
blocks that more accurately reports the range of the device for which
no counter currently exists.
Signed-off-by: NeilBrown <neilb@suse.de>
1/ use md_unplug in bitmap.c as we will soon be using bitmaps under
arrays with no queue attached.
2/ Don't bother plugging the queue when we set a bit in the bitmap.
The reason for this was to encourage as many bits as possible to
get set before we unplug and write stuff out.
However every personality already plugs the queue after
bitmap_startwrite either directly (raid1/raid10) or be setting
STRIPE_BIT_DELAY which causes the queue to be plugged later
(raid5).
Signed-off-by: NeilBrown <neilb@suse.de>
For dm-raid45 we will want to use bitmaps in dm-targets which don't
have entries in sysfs, so cope with the mddev not living in sysfs.
Signed-off-by: NeilBrown <neilb@suse.de>
Fixes some whitespace problems
Fixed some checkpatch.pl complaints.
Replaced kmalloc ... memset(0), with kzalloc
Fixed an unlikely memory leak on an error path.
Reformatted a number of 'if/else' sets, sometimes
replacing goto with an else clause.
Removed some old comments and commented-out code.
Signed-off-by: NeilBrown <neilb@suse.de>