commit 7b6d91daee changed the behaviour
of a few variables in raid1 and raid10 from flags to bit-sets, but
left them as type 'bool' so they did not work.
Change them (back) to unsigned long.
(historical note: see 1ef04fefe2)
Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: Jiri Slaby <jslaby@suse.cz> and many others
md_check_recovery expects ->spare_active to return 'true' if any
spares were activated, but none of them do, so the consequent change
in 'degraded' is not notified through sysfs.
So count the number of spares activated, subtract it from 'degraded'
just once, and return it.
Reported-by: Adrian Drzewiecki <adriand@vmware.com>
Signed-off-by: NeilBrown <neilb@suse.de>
When RAID1 is done syncing disks, it'll update the state
of synced rdevs to In_sync. But it neglected to notify
sysfs that the attribute changed. So any programs that
are waiting for an rdev's state to change will not be
woken.
(raid5/raid10 added by neilb)
Signed-off-by: Adrian Drzewiecki <adriand@vmware.com>
Signed-off-by: NeilBrown <neilb@suse.de>
The update of ->recovery_offset in sync_sbs is appropriate even then external
metadata is in use. However sync_sbs is only called when native
metadata is used.
So move that update in to the top of md_update_sb (which is the only
caller of sync_sbs) before the test on ->external.
This moves the update out of ->write_lock protection, but those fields
only need ->reconfig_mutex protection which they still have.
Also move the test on ->persistent up to where ->external is set as
for metadata update purposes they are the same.
Clear MD_CHANGE_DEVS and MD_CHANGE_CLEAN as they can only be confusing
if ->external is set or ->persistent isn't.
Finally move the update of ->utime down as it is only relevent (like
the ->events update) for native metadata.
Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: "Kwolek, Adam" <adam.kwolek@intel.com>
Enable discard support in the DM multipath target.
This discard support depends on a few discard-specific fixes to the
block layer's request stacking driver methods.
Discard requests are optional so don't allow a failed discard to trigger
path failures. If there is a real problem with a given path the
barriers associated with the discard (either before or after the
discard) will cause path failure. That said, unconditionally passing
discard failures up the stack is not ideal. This must be fixed once DM
has more information about the nature of the underlying storage failure.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
The DM core will submit a discard bio to the stripe target for each
stripe in a striped DM device. The stripe target will determine
stripe-specific portions of the supplied bio to be remapped into
individual (at most 'num_discard_requests' extents). If a given
stripe-specific discard bio doesn't touch a particular stripe the bio
will be dropped.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Update __clone_and_map_discard to loop across all targets in a DM
device's table when it processes a discard bio. If a discard crosses a
target boundary it must be split accordingly.
Update __issue_target_requests and __issue_target_request to allow a
cloned discard bio to have a custom start sector and size.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Optimize sector division: If the number of stripes is a power of two,
we can do shift and mask instead of division.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Have the error target respond to a discard request with a hard -EIO
rather than fail the request with -EOPNOTSUPP.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Have the zero target silently drop a discard rather than fail the
request with -EOPNOTSUPP.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Split max_io_len_target_boundary out of max_io_len so that the discard
support can make use of it without duplicating max_io_len code.
Avoiding max_io_len's split_io logic enables DM's discard support to
submit the entire discard request to a target. But discards must still
be split on target boundaries.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Rename __flush_target to __issue_target_request now that it is used to
issue both flush and discard requests.
Introduce __issue_target_requests as a convenient wrapper to
__issue_target_request 'num_flush_requests' or 'num_discard_requests'
times per target.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Allow discards to be passed through to linear mappings if at least one
underlying device supports it. Discards will be forwarded only to
devices that support them.
A target that supports discards should set num_discard_requests to
indicate how many times each discard request must be submitted to it.
Verify table's underlying devices support discards prior to setting the
associated DM device as capable of discards (via QUEUE_FLAG_DISCARD).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Allocate cipher strings indpendently of struct crypt_config and move
cipher parsing and allocation into a separate function to prepare for
supporting the cryptoapi format e.g. "xts(aes)".
No functional change in this patch.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Use just one label and reuse common destructor for crypt target.
Parse remaining argv arguments in logic order.
Also do not ignore error values from IV init and set key functions.
No functional change in this patch except changed return codes
based on above.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Add devname:mapper/control and MAPPER_CTRL_MINOR module alias
to support dm-mod module autoloading.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Peter Rajnoha <prajnoha@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
'target_request_nr' is a more generic name that reflects the fact that
it will be used for both flush and discard support.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This change unifies the various checks and finalization that occurs on a
table prior to use. By doing so, it allows table construction without
traversing the dm-ioctl interface.
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Implement merge method for the snapshot origin to improve read
performance.
Without merge method, dm asks the upper layers to submit smallest possible
bios --- one page. Submitting such small bios impacts performance negatively
when reading or writing the origin device.
Without this patch, CPU consumption when reading the origin on lvm on md-raid0
was 6 to 12%, with this patch, it drops to 1 to 4%.
Note: in my testing, it actually degraded performance in some settings, I
traced it to Maxtor disks having problems with > 512-sector requests.
Reducing the number of sectors to /sys/block/sd*/queue/max_sectors_kb to
256 fixed the read performance. I think we don't have to care about weird
disks that actually degrade performance because of large requests being
sent to them.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Change bio-based mapped devices no longer to have a fully initialized
request_queue (request_fn, elevator, etc). This means bio-based DM
devices no longer register elevator sysfs attributes ('iosched/' tree
or 'scheduler' other than "none").
In contrast, a request-based DM device will continue to have a full
request_queue and will register elevator sysfs attributes. Therefore
a user can determine a DM device's type by checking if elevator sysfs
attributes exist.
First allocate a minimalist request_queue structure for a DM device
(needed for both bio and request-based DM).
Initialization of a full request_queue is deferred until it is known
that the DM device is request-based, at the end of the table load
sequence.
Factor DM device's request_queue initialization:
- common to both request-based and bio-based into dm_init_md_queue().
- specific to request-based into dm_init_request_based_queue().
The md->type_lock mutex is used to protect md->queue, in addition to
md->type, during table_load().
A DM device's first table_load will establish the immutable md->type.
But md->queue initialization, based on md->type, may fail at that time
(because blk_init_allocated_queue cannot allocate memory). Therefore
any subsequent table_load must (re)try dm_setup_md_queue independently of
establishing md->type.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Determine whether a mapped device is bio-based or request-based when
loading its first (inactive) table and don't allow that to be changed
later.
This patch performs different device initialisation in each of the two
cases. (We don't think it's necessary to add code to support changing
between the two types.)
Allowed md->type transitions:
DM_TYPE_NONE to DM_TYPE_BIO_BASED
DM_TYPE_NONE to DM_TYPE_REQUEST_BASED
We now prevent table_load from replacing the inactive table with a
conflicting type of table even after an explicit table_clear.
Introduce 'type_lock' into the struct mapped_device to protect md->type
and to prepare for the next patch that will change the queue
initialization and allocate memory while md->type_lock is held.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
drivers/md/dm-ioctl.c | 15 +++++++++++++++
drivers/md/dm.c | 37 ++++++++++++++++++++++++++++++-------
drivers/md/dm.h | 5 +++++
include/linux/dm-ioctl.h | 4 ++--
4 files changed, 52 insertions(+), 9 deletions(-)
When processing barriers, skip the second flush if processing the bio
failed with -EOPNOTSUPP. This can happen with discard+barrier requests.
If the device doesn't support discard, there would be two useless
SYNCHRONIZE CACHE commands. The first dm_flush cannot be so easily
optimized out, so we leave it there.
Previously, -EOPNOTSUPP could be received in dec_pending only with empty
barriers and we ignored that error, assuming the device not supporting
cache flushes has cache always consistent. With the addition of discard
barriers, this -EOPNOTSUPP can also be generated by discards and we
must record it in md->barrier_error for process_barrier.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>