Commit Graph

97 Commits

Author SHA1 Message Date
Mikulas Patocka fa34ce7307 dm kcopyd: return client directly and not through a pointer
Return client directly from dm_kcopyd_client_create, not through a
parameter, making it consistent with dm_io_client_create.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-05-29 13:03:13 +01:00
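
The calling-convention change in the dm kcopyd commit above can be sketched in userspace C. This is only an illustration, assuming errors are encoded in the returned pointer in the style of the kernel's ERR_PTR helpers; err_ptr(), ptr_err(), is_err(), struct demo_client and both demo_client_create variants are hypothetical stand-ins, not the dm kcopyd code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define DEMO_MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Hypothetical client object, for illustration only. */
struct demo_client {
	int nr_reserved_pages;
};

/* Old style: status code returned, object handed back through a pointer. */
int demo_client_create_old(int simulate_failure, struct demo_client **result)
{
	struct demo_client *c;

	if (simulate_failure)
		return -ENOMEM;

	c = calloc(1, sizeof(*c));
	if (!c)
		return -ENOMEM;

	*result = c;
	return 0;
}

/* New style: the object (or an encoded error) is the return value. */
struct demo_client *demo_client_create(int simulate_failure)
{
	struct demo_client *c;

	if (simulate_failure)
		return err_ptr(-ENOMEM);

	c = calloc(1, sizeof(*c));
	if (!c)
		return err_ptr(-ENOMEM);

	return c;
}
```

With the second form a caller checks is_err() on the returned pointer and needs no separate output variable.
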
Mikulas Patocka 5f43ba2950 dm kcopyd: reserve fewer pages
Reserve just the minimum number of pages needed to process one job.

Because we allocate pages from the page allocator, we don't need to reserve
a large number of pages.  The maximum job size is SUB_JOB_SIZE and we
calculate the number of reserved pages based on this.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-05-29 13:03:11 +01:00
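
The "reserve fewer pages" commit above derives the reservation from the maximum job size. A rough sketch of that arithmetic follows; the 512-byte sector shift, the example sub-job size of 128 sectors and all DEMO_ names are assumptions made for illustration, not values taken from the commit.

```c
/* Assumed: sectors are 512 bytes, so bytes = sectors << 9. */
#define DEMO_SECTOR_SHIFT 9

/* Round-up integer division. */
#define DEMO_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Number of pages needed to hold one sub-job of the given size.
 * sub_job_sectors plays the role of SUB_JOB_SIZE in the commit message.
 */
unsigned int demo_reserve_pages(unsigned int sub_job_sectors,
				unsigned int page_size)
{
	return DEMO_DIV_ROUND_UP(sub_job_sectors << DEMO_SECTOR_SHIFT,
				 page_size);
}
```

Under these assumptions a 128-sector sub-job is 65536 bytes, which is 16 pages of 4096 bytes.
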
Mikulas Patocka bda8efec5c dm io: use fixed initial mempool size
Replace the arbitrary calculation of an initial io struct mempool size
with a constant.

The code calculated the number of reserved structures based on the request
size and used a "magic" multiplication constant of 4.  This patch changes
it to reserve a fixed number - itself still chosen quite arbitrarily.
Further testing might show if there is a better number to choose.

Note that if there is no memory pressure, we can still allocate an
arbitrary number of "struct io" structures.  One structure is enough to
process the whole request.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-05-29 13:03:09 +01:00
Jens Axboe 7eaceaccab block: remove per-queue plugging
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:07 +01:00
Tejun Heo 9c4376de98 dm: use non reentrant workqueues if equivalent
kmirrord_wq, kcopyd_work and md->wq are created per dm instance and
serve only a single work item from the dm instance, so non-reentrant
workqueues would provide the same ordering guarantees as ordered ones
while allowing CPU affinity and use of the workqueues for other
purposes.  Switch them to non-reentrant workqueues.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-01-13 19:59:58 +00:00
Tejun Heo 4d4d66ab53 dm: convert workqueues to alloc_ordered
Convert all create[_singlethread]_workqueue() users to the new
alloc[_ordered]_workqueue().  This conversion is mechanical and
doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-01-13 19:59:57 +00:00
Tejun Heo d5ffa387e2 dm: dont use flush_scheduled_work
flush_scheduled_work() is being deprecated.  Flush the used work
directly instead.  In all dm targets, the only work which uses
system_wq is ->trigger_event.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-01-13 19:59:56 +00:00
Mike Snitzer 5fc2ffeabb dm raid1: support discard
Enable discard support in the DM mirror target.
Also change an existing use of 'bvec' to 'addr' in the union.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-01-13 19:59:48 +00:00
Tejun Heo d87f4c14f2 dm: implement REQ_FLUSH/FUA support for bio-based dm
This patch converts bio-based dm to support REQ_FLUSH/FUA instead of
now deprecated REQ_HARDBARRIER.

* -EOPNOTSUPP handling logic dropped.

* Preflush is handled as before but postflush is dropped and replaced
  with passing down REQ_FUA to member request_queues.  This replaces
  one array wide cache flush w/ member specific FUA writes.

* __split_and_process_bio() now calls __clone_and_map_flush() directly
  for flushes and guarantees all FLUSH bios going to targets are zero
  length.

* It's now guaranteed that all FLUSH bios which are passed onto dm
  targets are zero length.  bio_empty_barrier() tests are replaced
  with REQ_FLUSH tests.

* Empty WRITE_BARRIERs are replaced with WRITE_FLUSHes.

* Dropped unlikely() around REQ_FLUSH tests.  Flushes are not unlikely
  enough to be marked with unlikely().

* Block layer now filters out REQ_FLUSH/FUA bios if the request_queue
  doesn't support cache flushing.  Advertise REQ_FLUSH | REQ_FUA
  capability.

* Request based dm isn't converted yet.  dm_init_request_based_queue()
  resets flush support to 0 for now.  To avoid disturbing request
  based dm code, dm->flush_error is added for bio based dm while
  request based dm continues to use dm->barrier_error.

Lightly tested linear, stripe, raid1, snap and crypt targets.  Please
proceed with caution as I'm not familiar with the code base.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: dm-devel@redhat.com
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:38 +02:00
Alasdair G Kergon b441a262e7 dm: use dm_target_offset macro
Use new dm_target_offset() macro to avoid most references to ti->begin
in dm targets.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-08-12 04:14:11 +01:00
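
What a helper like dm_target_offset() buys a target can be sketched as follows. This is a userspace illustration under the assumption that the macro simply subtracts the target's start sector; struct demo_target, demo_target_offset() and demo_linear_map() are hypothetical names, not the kernel definitions.

```c
typedef unsigned long long demo_sector_t;

/* Hypothetical stand-in for the part of a dm target used here. */
struct demo_target {
	demo_sector_t begin;	/* first sector of the target in the table */
	demo_sector_t len;	/* length of the target in sectors */
};

/* Offset of a table sector relative to the start of its target. */
#define demo_target_offset(ti, sector) ((sector) - (ti)->begin)

/*
 * Linear-style remap: translate a sector in the mapped device to a
 * sector on the underlying device that starts at dev_start.
 */
demo_sector_t demo_linear_map(const struct demo_target *ti,
			      demo_sector_t dev_start,
			      demo_sector_t bi_sector)
{
	return dev_start + demo_target_offset(ti, bi_sector);
}
```

Targets written this way never touch the begin field directly, which is the point of the change described above.
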
Christoph Hellwig 7b6d91daee block: unify flags for struct bio and struct request
Remove the current bio flags and reuse the request flags for the bio, too.
This makes it easier to trace the type of I/O from the filesystem
down to the block driver.  There were two flags in the bio that were
missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD.  Also I've
renamed two request flags that had a superfluous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:20:39 +02:00
Takahiro Yasui f070304094 dm raid1: fix deadlock when suspending failed device
To prevent deadlock, bios in the hold list should be flushed before
dm_rh_stop_recovery() is called in mirror_presuspend().

The recovery can't start because there are pending bios and therefore
dm_rh_stop_recovery deadlocks.

When there are pending bios in the hold list, the recovery acquires
recovery_count and then waits for those bios to complete.  The
recovery_count is released only when the recovery finishes, but the
bios in the hold list are not processed until after
dm_rh_stop_recovery() is called in mirror_presuspend().  Because
dm_rh_stop_recovery() also acquires recovery_count, a deadlock occurs.

Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
2010-03-06 02:32:35 +00:00
Nikanth Karthikesan 8215d6ec5f dm table: remove unused dm_get_device range parameters
Remove unused parameters (start and len) of dm_get_device()
and fix the callers.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:32:27 +00:00
Mikulas Patocka ede5ea0b8b dm raid1: always return error if all legs fail
If all mirror legs fail, always return an error instead of holding the
bio, even if the handle_errors option was set.  At present it is the
responsibility of the driver underneath us to deal with retries,
multipath etc.

The patch adds the bio to the failures list instead of holding it
directly.  do_failures tests first if all legs failed and, if so,
returns the bio with -EIO.  If any leg is still alive and handle_errors
is set, do_failures calls hold_bio.

Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:32:22 +00:00
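
The do_failures decision described in the commit above can be modelled in a few lines of userspace C. This is a sketch of the three outcomes named in the commit messages (error when all legs failed, hold when a leg is alive and handle_errors is set, success otherwise); enum demo_outcome and demo_do_failure() are hypothetical and do not reproduce the dm-raid1 code.

```c
/* Possible fates of a failed write bio in this simplified model. */
enum demo_outcome {
	DEMO_COMPLETE_EIO,	/* complete the bio with -EIO */
	DEMO_HOLD,		/* hold the bio until userspace reconfigures */
	DEMO_COMPLETE_OK	/* complete the bio with success */
};

/*
 * valid_legs:      number of mirror legs that are still alive
 * errors_handled:  non-zero if the handle_errors option was set
 */
enum demo_outcome demo_do_failure(unsigned int valid_legs, int errors_handled)
{
	/* All legs failed: always return an error, never hold. */
	if (valid_legs == 0)
		return DEMO_COMPLETE_EIO;

	/* A leg is still alive and a daemon will handle the error. */
	if (errors_handled)
		return DEMO_HOLD;

	/* No daemon: the write succeeded on at least one leg. */
	return DEMO_COMPLETE_OK;
}
```

Checking for surviving legs first is what makes the handle_errors setting irrelevant once every leg has failed.
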
Mikulas Patocka 5528d17de1 dm raid1: fail writes if errors are not handled and log fails
If the mirror log fails when the handle_errors option was not selected
and there is no remaining valid mirror leg, writes return success even
though they weren't actually written to any device.  This patch
completes them with EIO instead.

This code path is taken:
do_writes:
	bio_list_merge(&ms->failures, &sync);
do_failures:
	if (!get_valid_mirror(ms)) (false)
	else if (errors_handled(ms)) (false)
	else bio_endio(bio, 0);

The logic in do_failures is based on presuming that the write was already
tried: if it succeeded on at least one leg (without handle_errors) it
is reported as success.

Reference: https://bugzilla.redhat.com/show_bug.cgi?id=555197

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-02-16 18:42:55 +00:00
Mikulas Patocka 5339fc2d47 dm raid1: explicitly initialise bio_lists
Explicitly initialize bio lists instead of relying on kzalloc.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:06 +00:00
Mikulas Patocka 929be8fcb4 dm raid1: hold all write bios when leg fails
Hold all write bios when a leg fails and errors are handled.

When using a userspace daemon such as dmeventd to handle errors, we must
delay completing bios until it has done its job.
This patch prevents the following race:
  - primary leg fails
  - write "1" fails, the write is held, secondary leg is set as default
  - write "2" goes straight to the secondary leg

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:06 +00:00
Mikulas Patocka 60f355ead3 dm raid1: hold write bios when errors are handled
Hold all write bios when errors are handled.

Previously the failures list was used only when handling errors with
a userspace daemon such as dmeventd.  Now, it is always used for all bios.
The regions where some writes failed must be marked as nosync. This can only
be done in process context (i.e. in raid1 workqueue), not in the
write_callback function.

Previously the write would succeed if writing to at least one leg
succeeded.  This is wrong because data from the failed leg may be
replicated to the correct leg.  Now, if using a userspace daemon, the
write with some failures will be held until the daemon has done its job
and reconfigured the array.  If not using a daemon, the write still
succeeds if at least one leg succeeds. This is bad, but it is consistent
with current behavior.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:05 +00:00
Mikulas Patocka c58098be97 dm raid1: remove bio_endio from dm_rh_mark_nosync
Move bio completion out of dm_rh_mark_nosync in preparation for the
next patch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:05 +00:00
Mikulas Patocka 87968ddd2f dm raid1: abstract get_valid_mirror function
Move the logic to get a valid mirror leg into a function for re-use
in a later patch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:04 +00:00
Mikulas Patocka 0f398a8403 dm raid1: use hold framework in do_failures
Use the hold framework in do_failures.

This patch doesn't change the bio processing logic, it just simplifies
failure handling and avoids periodically polling the failures list.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:04 +00:00
Mikulas Patocka 0478850768 dm raid1: add framework to hold bios during suspend
Add framework to delay bios until a suspend and then resubmit them with
either DM_ENDIO_REQUEUE (if the suspend was noflush) or complete them
with -EIO.  I/O barrier support will use this.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Tested-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:03 +00:00
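
How held bios might be released at suspend time, as the commit above describes, is sketched below. The model is deliberately minimal: struct demo_bio, demo_release_held() and the DEMO_ENDIO_REQUEUE value are hypothetical stand-ins, and only the choice between requeueing on a noflush suspend and failing with -EIO comes from the commit message.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for DM_ENDIO_REQUEUE; the value here is an assumption. */
#define DEMO_ENDIO_REQUEUE 2

/* Hypothetical bio record: just enough state to show the outcome. */
struct demo_bio {
	int result;	/* DEMO_ENDIO_REQUEUE or a negative errno */
	int done;	/* set once the bio has left the hold list */
};

/*
 * Called when the device is suspended: every held bio is either
 * requeued (noflush suspend) or completed with -EIO.
 */
void demo_release_held(struct demo_bio *held, size_t n, int noflush)
{
	for (size_t i = 0; i < n; i++) {
		held[i].result = noflush ? DEMO_ENDIO_REQUEUE : -EIO;
		held[i].done = 1;
	}
}
```
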
Mikulas Patocka 64b30c46e8 dm raid1: report flush errors separately in status
Report flush errors as 'F' instead of 'D' for log and mirror devices.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:02 +00:00
Mikulas Patocka c0da3748b9 dm raid1: implement mirror_flush
Implement flush callee. It uses dm_io to send a zero-size barrier synchronously
and concurrently to all the mirror legs.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:02 +00:00
Mikulas Patocka 87a8f240e9 dm log: add flush callback fn
Introduce a callback pointer from the log to dm-raid1 layer.

Before a region is set as "in-sync", we need to flush the hardware cache on
all the disks. But the log module doesn't have access to the mirror_set
structure, so it will use this callback.

So far the callback is unused; it will be used in further patches.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:01 +00:00