This patch introduces a new field to the mirror_set (default_mirror) to store
the default mirror.
(A subsequent patch will allow us to change the default mirror in the event of
a failure.)
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use %llu not %Lu in sscanf/printf format strings.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch removes an unused #define.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
More snapshot metadata reading into separate function, to prepare for changing
the place it gets called from.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
After changing the name of a mapped device, trigger a dm event. (For
userspace multipath tools.)
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add dm_get_dev() to get a mapped device given its dev_t.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Abstract dm_find_md() from dm_get_mdptr() to allow use elsewhere.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ignore all files generated from *_shipped files, plus a few others.
Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
I had thought that keeping the reported tail level clearly different
from the module name was a good idea, but I've changed my mind.
'raid5' is better and probably less confusing than 'RAID-5'.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- export __blk_put_request and blk_execute_rq_nowait
needed for async REQ_BLOCK_PC requests
- seperate max_hw_sectors and max_sectors for block/scsi_ioctl.c and
SG_IO bio.c helpers per Jens's last comments. Since block/scsi_ioctl.c SG_IO was
already testing against max_sectors and SCSI-ml was setting max_sectors and
max_hw_sectors to the same value this does not change any scsi SG_IO behavior. It only
prepares ll_rw_blk.c, scsi_ioctl.c and bio.c for when SCSI-ml begins to set
a valid max_hw_sectors for all LLDs. Today if a LLD does not set it
SCSI-ml sets it to a safe default and some LLDs set it to a artificial low
value to overcome memory and feedback issues.
Note: Since we now cap max_sectors to BLK_DEF_MAX_SECTORS, which is 1024,
drivers that used to call blk_queue_max_sectors with a large value of
max_sectors will now see the fs requests capped to BLK_DEF_MAX_SECTORS.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The raid5 stripe cache was recently changed from fixed size (NR_STRIPES) to
variable size (conf->max_nr_stripes). However there are two places that still
use the constant and as a result, reducing the size of the stripe cache can
result in a deadlock.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Who would submit code with a FIXME like that in it !!!!
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If you have an array with a write-intent-bitmap, and you remove a device, then
re-add it, a full recovery isn't needed. We detect a re-add by looking at
saved_raid_disk. For raid1, it doesn't matter which disk it was, only whether
or not it was an active device. The old code being removed set a value of
'mirror' which was then ignored, so it can go. The changed code performs the
correct check.
For raid6, if there are two missing devices, make sure we chose the right slot
on --re-add rather than always the first slot.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If an array is created using set_array_info, default_bitmap_offset isn't set
properly meaning that an internal bitmap cannot be hot-added until the array
is stopped and re-assembled.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When doing a recovery, we need to know whether the array will still be
degraded after the recovery has finished, so we can know whether bits can be
clearred yet or not. This patch performs the required check.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
bitmap_unplug actually writes data (bits) to storage, so we shouldn't be
holding a spinlock...
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
raid10 has two different layouts. One uses near-copies (so multiple
copies of a block are at the same or similar offsets of different
devices) and the other uses far-copies (so multiple copies of a block
are stored a greatly different offsets on different devices). The point
of far-copies is that it allows the first section (normally first half)
to be layed out in normal raid0 style, and thus provide raid0 sequential
read performance.
Unfortunately, the read balancing in raid10 makes some poor decisions
for far-copies arrays and you don't get the desired performance. So
turn off that bad bit of read_balance for far-copies arrays.
With this patch, read speed of an 'f2' array is comparable with a raid0
with the same number of devices, though write speed is ofcourse still
very slow.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The spinlock region_lock is held while calling mark_region which can sleep.
Drop the spinlock before calling that function.
A region's state and inclusion in the clean list are altered by rh_inc and
rh_dec. The state variable is set to RH_CLEAN in rh_dec, but only if
'pending' is zero. It is set to RH_DIRTY in rh_inc, but not if it is already
so. The changes to 'pending', the state, and the region's inclusion in the
clean list need to be atomicly.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
bio_list_merge() should do nothing if the second list is empty - not oops.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
do_end_io() can be called without interrupts blocked.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The linux bitset operators (test_bit, set_bit etc) work on arrays of "unsigned
long". dm-log uses such bitsets but treats them as arrays of uint32_t, only
allocating and zeroing a multiple of 4 bytes (as 'clean_bits' is a uint32_t).
The patch below fixes this problem.
The problem is specific to 64-bit big endian machines such as s390x or ppc-64
and can prevent pvmove terminating.
In the simplest case, if "region_count" were (say) 30, then
bitset_size (below) would be 4 and bitset_uint32_count would be 1.
Thus the memory for this butset, after allocation and zeroing would
be
0 0 0 0 X X X X
On a bigendian 64bit machine, bit 0 for this bitset is in the 8th
byte! (and every bit that dm-log would use would be in the X area).
0 0 0 0 X X X X
^
here
which hasn't been cleared properly.
As the dm-raid1 code only syncs and counts regions which have a 0 in the
'sync_bits' bitset, and only finishes when it has counted high enough, a large
number of 1's among those 'X's will cause the sync to not complete.
It is worth noting that the code uses the same bitsets for in-memory and
on-disk logs. As these bitsets are host-endian and host-sized, this means
that they cannot safely be moved between computers with
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In some circumstances the LIST_VERSIONS output is truncated because the size
calculation forgets about a 'uint32_t' in each structure - but the inclusion
of the whole of ALIGN_MASK frequently compensates for the omission.
This is a quick workaround to use an upper bound. (The code ought to be fixed
to supply the actual size.)
Running 'dmsetup targets' may demonstrate the problem: when I run it, the last
line comes out as 'erro' instead of 'error'. Consequently, 'lvcreate --type
error' doesn't work.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
An error path in table_load() forgets to release a table that won't now be
referenced.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>