Commit Graph

1541 Commits

Author SHA1 Message Date
Mike Snitzer 8a2d528620 dm snapshot: merge consecutive chunks together
s->store->type->prepare_merge returns the number of chunks that can be
copied linearly working backwards from the returned chunk number.

For example, if it returns 3 chunks with old_chunk == 10 and new_chunk
== 20, then chunk 20 can be copied to 10, chunk 19 to 9 and 18 to 8.

Until now kcopyd only copied one chunk at a time.  This patch now copies
the full set at once.

Consequently, snapshot_merge_process() needs to delay the merging of all
chunks if any have writes in progress, not just the first chunk in the
region that is to be merged.

snapshot-merge's performance is now comparable to the original
snapshot-origin target.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:34 +00:00
Mikulas Patocka 73dfd078cf dm snapshot: trigger exceptions in remaining snapshots during merge
When there is one merging snapshot and other non-merging snapshots,
snapshot_merge_process() must make exceptions in the non-merging
snapshots.

Use a sequence count to resolve the race between I/O to chunks that are
about to be merged.  The count increases each time an exception
reallocation finishes.  Use wait_event() to wait until the count
changes.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:34 +00:00
Mikulas Patocka 17aa03326d dm snapshot: delay merging a chunk until writes to it complete
Track writes to chunks that are currently being merged and delay merging
a chunk until all writes to that chunk finish.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:33 +00:00
Mikulas Patocka 9fe8625488 dm snapshot: queue writes to chunks being merged
While a set of chunks is being merged, any overlapping writes need to be
queued.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:33 +00:00
Mikulas Patocka 1e03f97e43 dm snapshot: add merging
Merging is started when origin is resumed and it is stopped when
origin is suspended or when the merging snapshot is destroyed or
errors are detected.

Merging is not yet interlocked with writes: this will be handled in
subsequent patches.

The code relies on callbacks from a private kcopyd thread.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:32 +00:00
Mikulas Patocka 9d3b15c4c7 dm snapshot: permit only one merge at once
Merging more than one snapshot is not supported, so prevent
this happening.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:32 +00:00
Mike Snitzer 10b8106a70 dm snapshot: support barriers in snapshot merge target
Sets num_flush_requests=2 to support flushing both the origin and cow
devices used by the snapshot-merge target.

Also, snapshot_ctr() now gets the origin device using FMODE_WRITE if the
target is snapshot-merge (which writes to the origin device).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:31 +00:00
Mikulas Patocka 3452c2a1eb dm snapshot: avoid allocating exceptions in merge
The snapshot-merge target should not allocate new exceptions because the
intent is to merge all of its exceptions as quickly and safely as
possible.

This patch introduces the snapshot-merge mapping function and updates
__origin_write() so that it doesn't allocate exceptions on any snapshots
that are being merged.

If a write request to a merging snapshot device is to be dispatched
directly to the origin (because the chunk is not remapped or was already
merged), snapshot_merge_map() must make exceptions in other snapshots so
calls do_origin().

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:31 +00:00
Mikulas Patocka 515ad66cc4 dm snapshot: rework writing to origin
To track the completion of exceptions relating to the same location on
the device, the current code selects one exception as primary_pe, links
the other exceptions to it and uses reference counting to wait until all
the reallocations are complete.

It is considered too complicated to extend this code to handle the new
snapshot-merge target, where sets of non-overlapping chunks would also
need to become linked.

Instead, a simpler (but less efficient) approach is taken.  Bios are
linked to one exception.  When it completes, bios are simply retried,
and if other related exceptions are still outstanding, they'll get
queued again to wait for another one.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:30 +00:00
Mikulas Patocka d698aa4500 dm snapshot: add merge target
The snapshot-merge target allows a snapshot to be merged back into the
snapshot's origin device.

One anticipated use of snapshot merging is the rollback of filesystems
to back out problematic system upgrades.

This patch adds snapshot-merge target management to both
dm_snapshot_init() and dm_snapshot_exit().  As an initial place-holder,
snapshot-merge is identical to the snapshot target.  Documentation is
provided.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:30 +00:00
Mikulas Patocka 4454a6216f dm exception store: add merge specific methods
Add functions that decide how many consecutive chunks of snapshot to
merge back into the origin next and to update the metadata afterwards.

prepare_merge provides a pointer to the most recent still-to-be-merged
chunk and returns how many previous ones are consecutive and can be
processed together.

commit_merge removes the nr_merged most-recent chunks permanently from
the exception store.  The number must not exceed that returned by
prepare_merge.

Introduce NUM_SNAPSHOT_HDR_CHUNKS to show where the snapshot header
chunk is accounted for.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:29 +00:00
Mike Snitzer 615d1eb9ca dm snapshot: create function for chunk_is_tracked wait
Move the __chunk_is_tracked() loop into a separate function as we will
also need to call it from the write path in the rare case of conflicting
writes to the same chunk.

Originally introduced in commit a8d41b59f3
("dm snapshot: fix race during exception creation").

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:29 +00:00
Mikulas Patocka 9eaae8ffbc dm snapshot: make bio optional in __origin_write
To support the merging of snapshots back into their origin we need
to trigger exceptions in other snapshots not being merged without
any incoming bio on the origin device.  The bio parameter to
__origin_write() becomes optional and the sector needs supplying
separately.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:28 +00:00
Kiyoshi Ueda c2f3d24b78 dm mpath: reject messages when device is suspended
This patch rejects messages that can generate I/O while the device
itself is suspended.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:27 +00:00
Kiyoshi Ueda 64dbce580d dm: export suspended state to targets
This patch adds the exported dm_suspended() function so that targets
can check whether or not they are suspended.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:27 +00:00
Kiyoshi Ueda 4f186f8bbf dm: rename dm_suspended to dm_suspended_md
This patch renames dm_suspended() to dm_suspended_md() and
keeps it internal to dm.
No functional change.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:26 +00:00
Kiyoshi Ueda 4d4471cb5c dm: swap target postsuspend call and setting suspended flag
This patch moves DMF_SUSPENDED flag set before postsuspend.
No one should care about the ordering, because the flag set and
the postsuspend are protected by a single lock, md->suspend_lock,
and all strict flag-checkers take the lock.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:26 +00:00
Milan Broz 61afef614b dm crypt: add plain64 iv
The default plain IV is 32-bit only.

This plain64 IV provides a compatible mode for encrypted devices bigger
than 4TB.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:25 +00:00
Jun'ichi Nomura 6db4ccd635 dm: trace request based remapping
This patch adds a remapping trace to request-based dm.
BIO-based dm already has the equivalent tracepoint.

For example, under this dm stack (linear LV on multipath):
  # dmsetup ls --tree -o ascii
  vg-lv0 (253:1)
   `-mpath0 (253:0)
      |- (8:160)
      |- (66:80)
      |- (65:176)
      `- (65:160)

Trace of 'dd of=/dev/vg/lv0 bs=128k count=1 oflag=direct' looks like this:

without the patch:
  dd-6674  [000]   539.727384: block_bio_queue: 253,1 WS 0 + 256 [dd]
  dd-6674  [000]   539.727392: block_remap: 253,0 WS 384 + 256 <- (253,1) 0
  dd-6674  [000]   539.727394: block_bio_queue: 253,0 WS 384 + 256 [dd]
  dd-6674  [000]   539.727405: block_getrq: 253,0 WS 384 + 256 [dd]
  dd-6674  [000]   539.727409: block_plug: [dd]
  dd-6674  [000]   539.727410: block_rq_insert: 253,0 W 0 () 384 + 256 [dd]
  dd-6674  [000]   539.727416: block_rq_issue: 253,0 W 0 () 384 + 256 [dd]
  dd-6674  [000]   539.727426: block_rq_insert: 65,176 W 0 () 384 + 256 [dd]
  dd-6674  [000]   539.727427: block_rq_issue: 65,176 W 0 () 384 + 256 [dd]
  ...

and with the patch: (the line with '**' is the trace added by this patch)
  dd-6617  [002]   162.914301: block_bio_queue: 253,1 WS 0 + 256 [dd]
  dd-6617  [002]   162.914314: block_remap: 253,0 WS 384 + 256 <- (253,1) 0
  dd-6617  [002]   162.914316: block_bio_queue: 253,0 WS 384 + 256 [dd]
  dd-6617  [002]   162.914331: block_getrq: 253,0 WS 384 + 256 [dd]
  dd-6617  [002]   162.914335: block_plug: [dd]
  dd-6617  [002]   162.914337: block_rq_insert: 253,0 W 0 () 384 + 256 [dd]
  dd-6617  [002]   162.914347: block_rq_issue: 253,0 W 0 () 384 + 256 [dd]
**dd-6617  [002]   162.914356: block_rq_remap: 65,176 W 384 + 256 <- (253,0) 384
  dd-6617  [002]   162.914358: block_rq_insert: 65,176 W 0 () 384 + 256 [dd]
  dd-6617  [002]   162.914359: block_rq_issue: 65,176 W 0 () 384 + 256 [dd]
  ...

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:25 +00:00
Mike Snitzer c1f0c183f6 dm snapshot: allow live exception store handover between tables
Permit in-use snapshot exception data to be 'handed over' from one
snapshot instance to another.  This is a pre-requisite for patches
that allow the changes made in a snapshot device to be merged back into
its origin device and also allows device resizing.

The basic call sequence is:

  dmsetup load new_snapshot (referencing the existing in-use cow device)
     - the ctr code detects that the cow is already in use and allows the
       two snapshot target instances to be linked together
  dmsetup suspend original_snapshot
  dmsetup resume new_snapshot
     - the new_snapshot becomes live, and if anything now tries to access
       the original one it will receive -EIO
  dmsetup remove original_snapshot

(There can only be two snapshot targets referencing the same cow device
simultaneously.)

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:24 +00:00
Alasdair G Kergon 042d2a9bcd dm: keep old table until after resume succeeded
When swapping a new table into place, retain the old table until
its replacement is in place.

An old check for an empty table is removed because this is enforced
in populate_table().

__unbind() becomes redundant when followed by __bind().

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:24 +00:00
Alasdair G Kergon a794015597 dm: bind new table before destroying old
When replacing a mapped device's table during a 'resume', delay the
destruction of the old table until the new one is successfully in place.

This will make it easier for a later patch to transfer internal state
information from the old table to the new one (something we do not currently
support) while giving us more options for reversion if a later part
of the operation fails.

Devices are always in the suspended state during dm_swap_table().
This patch reinforces the requirement that all I/O must have been
flushed from the table targets while in this state (including any in
workqueues).  In the case of 'noflush' suspending, unprocessed
I/O should have been 'pushed back' to the dm core prior to this point,
for resubmission after the new table is in place.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:23 +00:00
Mike Snitzer 1d0f3ce832 dm ioctl: retrieve status from inactive table
Add the flag DM_QUERY_INACTIVE_TABLE_FLAG to the ioctls to return
infomation about the loaded-but-not-yet-active table instead of the live
table.  Prior to this patch it was impossible to obtain this information
until the device had been 'resumed'.

Userspace dmsetup and libdevmapper support the flag as of version 1.02.40.
e.g. dmsetup info --inactive vg1-lv1

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:22 +00:00
Mikulas Patocka 12fc0f49dc dm io: handle empty barriers
Accept empty barriers in dm-io.

dm-io will process empty write barrier requests just like the other
read/write requests.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:22 +00:00
Mike Anderson 67a46dad25 dm mpath: prevent io from work queue while suspended
Reject messages that can generate I/O while the device itself
is suspended.

Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Acked-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-12-10 23:52:21 +00:00