Merge tag 'for-4.16/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:

 - DM core fixes to ensure that bio submission follows a depth-first
   tree walk; this is critical to allow forward progress without the
   need to use the bioset's BIOSET_NEED_RESCUER.

 - Remove DM core's BIOSET_NEED_RESCUER based dm_offload infrastructure.

 - DM core cleanups and improvements to make bio-based DM more efficient
   (e.g. reduced memory footprint as well as leveraging per-bio-data more).

 - Introduce a new bio-based mode (DM_TYPE_NVME_BIO_BASED) that
   leverages the more direct IO submission path in the block layer;
   this mode is used by DM multipath and also optimizes targets like
   DM thin-pool that stack directly on an NVMe data device.

 - DM multipath improvements to factor out legacy SCSI-only
   (e.g. scsi_dh) code paths to allow for more optimized support for
   NVMe multipath.

 - A fix for the DM multipath path selectors (service-time and
   queue-length) to select paths in a more balanced way; largely
   academic but doesn't hurt.

 - Numerous DM raid target fixes and improvements.

 - Add a new DM "unstriped" target that enables Intel to work around
   firmware limitations in some NVMe drives that are striped internally
   (this target also works when stacked above the DM "striped" target).

 - Various Documentation fixes and improvements.

 - Misc cleanups and fixes across various DM infrastructure and targets
   (e.g. bufio, flakey, log-writes, snapshot).

* tag 'for-4.16/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (69 commits)
  dm cache: Documentation: update default migration_throttling value
  dm mpath selector: more evenly distribute ties
  dm unstripe: fix target length versus number of stripes size check
  dm thin: fix trailing semicolon in __remap_and_issue_shared_cell
  dm table: fix NVMe bio-based dm_table_determine_type() validation
  dm: various cleanups to md->queue initialization code
  dm mpath: delay the retry of a request if the target responded as busy
  dm mpath: return DM_MAPIO_DELAY_REQUEUE if QUEUE_IO or PG_INIT_REQUIRED
  dm mpath: return DM_MAPIO_REQUEUE on blk-mq rq allocation failure
  dm log writes: fix max length used for kstrndup
  dm: backfill missing calls to mutex_destroy()
  dm snapshot: use mutex instead of rw_semaphore
  dm flakey: check for null arg_name in parse_features()
  dm thin: extend thinpool status format string with omitted fields
  dm thin: fixes in thin-provisioning.txt
  dm thin: document representation of <highest mapped sector> when there is none
  dm thin: fix documentation relative to low water mark threshold
  dm cache: be consistent in specifying sectors and SI units in cache.txt
  dm cache: delete obsoleted paragraph in cache.txt
  dm cache: fix grammar in cache-policies.txt
  ...
Documentation/device-mapper/cache-policies.txt
@@ -60,7 +60,7 @@ Memory usage:
 The mq policy used a lot of memory; 88 bytes per cache block on a 64
 bit machine.
 
-smq uses 28bit indexes to implement it's data structures rather than
+smq uses 28bit indexes to implement its data structures rather than
 pointers. It avoids storing an explicit hit count for each block. It
 has a 'hotspot' queue, rather than a pre-cache, which uses a quarter of
 the entries (each hotspot block covers a larger area than a single
@@ -84,7 +84,7 @@ resulting in better promotion/demotion decisions.
 
 Adaptability:
 The mq policy maintained a hit count for each cache block. For a
-different block to get promoted to the cache it's hit count has to
+different block to get promoted to the cache its hit count has to
 exceed the lowest currently in the cache. This meant it could take a
 long time for the cache to adapt between varying IO patterns.
 
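For orientation: because the policy name is part of the cache's table line, moving a live device between mq and smq is done with a table reload. A minimal sketch; the device name, sector count, and backing devices are all hypothetical:

  # Sketch: switch an existing cache device from mq to smq via a table
  # reload. All names and sizes below are assumptions for illustration.
  dmsetup suspend my_cache
  dmsetup reload my_cache --table \
    '0 41943040 cache /dev/mapper/metadata /dev/mapper/ssd /dev/mapper/origin 512 0 smq 0'
  dmsetup resume my_cache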
Documentation/device-mapper/cache.txt
@@ -59,7 +59,7 @@ Fixed block size
 The origin is divided up into blocks of a fixed size. This block size
 is configurable when you first create the cache. Typically we've been
 using block sizes of 256KB - 1024KB. The block size must be between 64
-(32KB) and 2097152 (1GB) and a multiple of 64 (32KB).
+sectors (32KB) and 2097152 sectors (1GB) and a multiple of 64 sectors (32KB).
 
 Having a fixed block size simplifies the target a lot. But it is
 something of a compromise. For instance, a small part of a block may be
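To make those units concrete: 512 sectors is 256KB, a multiple of 64 sectors and well inside the documented range. A hedged sketch of creating a cache with that block size (all device names are placeholders):

  # Sketch: create a cache device with a 512-sector (256KB) block size.
  # /dev/mapper/{metadata,ssd,origin} are assumed to already exist.
  ORIGIN_SECTORS=$(blockdev --getsz /dev/mapper/origin)
  dmsetup create my_cache --table \
    "0 ${ORIGIN_SECTORS} cache /dev/mapper/metadata /dev/mapper/ssd /dev/mapper/origin 512 1 writeback default 0"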
@@ -119,7 +119,7 @@ doing here to avoid migrating during those peak io moments.
 
 For the time being, a message "migration_threshold <#sectors>"
 can be used to set the maximum number of sectors being migrated,
-the default being 204800 sectors (or 100MB).
+the default being 2048 sectors (1MB).
 
 Updating on-disk metadata
 -------------------------
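The message named here is delivered through the standard dmsetup message interface; a minimal sketch, assuming a cache device called my_cache:

  # Sketch: raise the migration threshold from the 2048-sector default
  # to 20480 sectors (10MB); "my_cache" is an assumed device name.
  dmsetup message my_cache 0 migration_threshold 20480
  dmsetup status my_cache   # the current threshold appears among the core args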
@@ -143,11 +143,6 @@ the policy how big this chunk is, but it should be kept small. Like the
 dirty flags this data is lost if there's a crash so a safe fallback
 value should always be possible.
 
-For instance, the 'mq' policy, which is currently the default policy,
-uses this facility to store the hit count of the cache blocks. If
-there's a crash this information will be lost, which means the cache
-may be less efficient until those hit counts are regenerated.
-
 Policy hints affect performance, not correctness.
 
 Policy messaging
Documentation/device-mapper/dm-raid.txt
@@ -343,5 +343,8 @@ Version History
 1.11.0 Fix table line argument order
        (wrong raid10_copies/raid10_format sequence)
 1.11.1 Add raid4/5/6 journal write-back support via journal_mode option
-1.12.1 fix for MD deadlock between mddev_suspend() and md_write_start() available
+1.12.1 Fix for MD deadlock between mddev_suspend() and md_write_start() available
+1.13.0 Fix dev_health status at end of "recover" (was 'a', now 'A')
+1.13.1 Fix deadlock caused by early md_stop_writes(). Also fix size and
+       state races.
+1.13.2 Fix raid redundancy validation and avoid keeping raid set frozen
Documentation/device-mapper/snapshot.txt
@@ -49,6 +49,10 @@ The difference between persistent and transient is with transient
 snapshots less metadata must be saved on disk - they can be kept in
 memory by the kernel.
 
+When loading or unloading the snapshot target, the corresponding
+snapshot-origin or snapshot-merge target must be suspended. A failure to
+suspend the origin target could result in data corruption.
+
 
 * snapshot-merge <origin> <COW device> <persistent> <chunksize>
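The new requirement translates into an ordering constraint on dmsetup calls; a sketch with hypothetical volume names:

  # Sketch: load a snapshot while its origin is suspended, then resume.
  # /dev/vg/base and /dev/vg/cow are assumed to exist; "base" is the
  # snapshot-origin device being protected from the corruption case above.
  dmsetup suspend base
  dmsetup create snap --table \
    "0 $(blockdev --getsz /dev/vg/base) snapshot /dev/vg/base /dev/vg/cow P 8"
  dmsetup resume base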
Documentation/device-mapper/thin-provisioning.txt
@@ -112,9 +112,11 @@ $low_water_mark is expressed in blocks of size $data_block_size. If
 free space on the data device drops below this level then a dm event
 will be triggered which a userspace daemon should catch allowing it to
 extend the pool device. Only one such event will be sent.
-Resuming a device with a new table itself triggers an event so the
-userspace daemon can use this to detect a situation where a new table
-already exceeds the threshold.
+
+No special event is triggered if a just resumed device's free space is below
+the low water mark. However, resuming a device always triggers an
+event; a userspace daemon should verify that free space exceeds the low
+water mark when handling this event.
 
 A low water mark for the metadata device is maintained in the kernel and
 will trigger a dm event if free space on the metadata device drops below
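Per the corrected text, a daemon cannot rely on the event alone after a resume; it should recompute free space itself. A rough sketch (the pool name, threshold, and field positions are assumptions based on the thin-pool status layout shown below):

  # Sketch: on a dm event for pool "pool", re-check free data blocks
  # against our own low-water threshold instead of trusting the event.
  LOW_WATER_BLOCKS=1024
  DATA=$(dmsetup status pool | awk '{print $7}')   # "<used>/<total>" data blocks
  USED=${DATA%/*}; TOTAL=${DATA#*/}
  if [ $((TOTAL - USED)) -lt ${LOW_WATER_BLOCKS} ]; then
      echo "thin pool low on space: extend the data device now"
  fi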
@@ -274,7 +276,8 @@ ii) Status
 
     <transaction id> <used metadata blocks>/<total metadata blocks>
     <used data blocks>/<total data blocks> <held metadata root>
-    [no_]discard_passdown ro|rw
+    ro|rw|out_of_data_space [no_]discard_passdown [error|queue]_if_no_space
+    needs_check|-
 
     transaction id:
        A 64-bit number used by userspace to help synchronise with metadata
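With the omitted fields added, a status line for a healthy read-write pool might read as follows (all values invented for illustration):

  # Sketch: query the pool; the trailing fields follow the extended
  # format: mode, discard passdown, no-space policy, then needs_check or "-".
  $ dmsetup status pool
  pool: 0 419430400 thin-pool 1 17/4096 350/102400 - rw discard_passdown queue_if_no_space -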
@@ -394,3 +397,6 @@ ii) Status
        If the pool has encountered device errors and failed, the status
        will just contain the string 'Fail'. The userspace recovery
        tools should then be used.
+
+       In the case where <nr mapped sectors> is 0, there is no highest
+       mapped sector and the value of <highest mapped sector> is unspecified.
Documentation/device-mapper/unstriped.txt (new file)
@@ -0,0 +1,124 @@
Introduction
============

The device-mapper "unstriped" target provides a transparent mechanism to
unstripe a device-mapper "striped" target to access the underlying disks
without having to touch the true backing block-device. It can also be
used to unstripe a hardware RAID-0 to access backing disks.

Parameters:
<number of stripes> <chunk size> <stripe #> <dev_path> <offset>

<number of stripes>
        The number of stripes in the RAID 0.

<chunk size>
        The number of 512B sectors in the chunk striping.

<dev_path>
        The block device you wish to unstripe.

<stripe #>
        The stripe number within the device that corresponds to the
        physical drive you wish to unstripe. This must be 0 indexed.


Why use this module?
====================

An example of undoing an existing dm-stripe
-------------------------------------------

This small bash script will set up 4 loop devices and use the existing
striped target to combine the 4 devices into one. It then will use
the unstriped target on top of the striped device to access the
individual backing loop devices. We write data to the newly exposed
unstriped devices and verify the data written matches the correct
underlying device on the striped array.

#!/bin/bash

MEMBER_SIZE=$((128 * 1024 * 1024))
NUM=4
SEQ_END=$((${NUM}-1))
CHUNK=256
BS=4096

RAID_SIZE=$((${MEMBER_SIZE}*${NUM}/512))
DM_PARMS="0 ${RAID_SIZE} striped ${NUM} ${CHUNK}"
COUNT=$((${MEMBER_SIZE} / ${BS}))

for i in $(seq 0 ${SEQ_END}); do
  dd if=/dev/zero of=member-${i} bs=${MEMBER_SIZE} count=1 oflag=direct
  losetup /dev/loop${i} member-${i}
  DM_PARMS+=" /dev/loop${i} 0"
done

echo $DM_PARMS | dmsetup create raid0
for i in $(seq 0 ${SEQ_END}); do
  echo "0 1 unstriped ${NUM} ${CHUNK} ${i} /dev/mapper/raid0 0" | dmsetup create set-${i}
done;

for i in $(seq 0 ${SEQ_END}); do
  dd if=/dev/urandom of=/dev/mapper/set-${i} bs=${BS} count=${COUNT} oflag=direct
  diff /dev/mapper/set-${i} member-${i}
done;

for i in $(seq 0 ${SEQ_END}); do
  dmsetup remove set-${i}
done

dmsetup remove raid0

for i in $(seq 0 ${SEQ_END}); do
  losetup -d /dev/loop${i}
  rm -f member-${i}
done

Another example
---------------

Intel NVMe drives contain two cores on the physical device.
Each core of the drive has segregated access to its LBA range.
The current LBA model has a RAID 0 128k chunk on each core, resulting
in a 256k stripe across the two cores:

   Core 0:       Core 1:
  __________    __________
  | LBA 512|    | LBA 768|
  | LBA 0  |    | LBA 256|
  ----------    ----------

The purpose of this unstriping is to provide better QoS in noisy
neighbor environments. When two partitions are created on the
aggregate drive without this unstriping, reads on one partition
can affect writes on another partition. This is because the partitions
are striped across the two cores. When we unstripe this hardware RAID 0
and make partitions on each newly exposed device, the two partitions are
now physically separated.

With the dm-unstriped target we're able to segregate an fio script that
has read and write jobs that are independent of each other. Compared to
when we run the test on a combined drive with partitions, we were able
to get a 92% reduction in read latency using this device mapper target.


Example dmsetup usage
=====================

unstriped on top of Intel NVMe device that has 2 cores
------------------------------------------------------
dmsetup create nvmset0 --table '0 512 unstriped 2 256 0 /dev/nvme0n1 0'
dmsetup create nvmset1 --table '0 512 unstriped 2 256 1 /dev/nvme0n1 0'

There will now be two devices that expose Intel NVMe core 0 and 1
respectively:
/dev/mapper/nvmset0
/dev/mapper/nvmset1

unstriped on top of striped with 4 drives using 128K chunk size
---------------------------------------------------------------
dmsetup create raid_disk0 --table '0 512 unstriped 4 256 0 /dev/mapper/striped 0'
dmsetup create raid_disk1 --table '0 512 unstriped 4 256 1 /dev/mapper/striped 0'
dmsetup create raid_disk2 --table '0 512 unstriped 4 256 2 /dev/mapper/striped 0'
dmsetup create raid_disk3 --table '0 512 unstriped 4 256 3 /dev/mapper/striped 0'
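A quick way to sanity-check the resulting mappings (a sketch; names come from the examples above):

  # Sketch: inspect one of the devices created above. With the example
  # tables each mapping is only 512 sectors long; in practice the length
  # would normally be (size of the striped device) / (number of stripes).
  dmsetup table raid_disk0                  # prints the unstriped mapping line
  blockdev --getsz /dev/mapper/raid_disk0   # 512 with the table above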
drivers/md/Kconfig
@@ -269,6 +269,13 @@ config DM_BIO_PRISON
 
 source "drivers/md/persistent-data/Kconfig"
 
+config DM_UNSTRIPED
+	tristate "Unstriped target"
+	depends on BLK_DEV_DM
+	---help---
+	  Unstripes I/O so it is issued solely on a single drive in a HW
+	  RAID0 or dm-striped target.
+
 config DM_CRYPT
 	tristate "Crypt target support"
 	depends on BLK_DEV_DM
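For completeness, the new option can be flipped on from an already-configured kernel tree; a sketch using the in-tree config helper (commands assume you are at the top of the source tree):

  # Sketch: enable dm-unstriped as a module, then rebuild drivers/md.
  ./scripts/config --module DM_UNSTRIPED
  make olddefconfig
  make drivers/md/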
drivers/md/Makefile
@@ -43,6 +43,7 @@ obj-$(CONFIG_BCACHE) += bcache/
 obj-$(CONFIG_BLK_DEV_MD) += md-mod.o
 obj-$(CONFIG_BLK_DEV_DM) += dm-mod.o
 obj-$(CONFIG_BLK_DEV_DM_BUILTIN) += dm-builtin.o
+obj-$(CONFIG_DM_UNSTRIPED) += dm-unstripe.o
 obj-$(CONFIG_DM_BUFIO) += dm-bufio.o
 obj-$(CONFIG_DM_BIO_PRISON) += dm-bio-prison.o
 obj-$(CONFIG_DM_CRYPT) += dm-crypt.o
drivers/md/dm-bufio.c (+20 -17)
@@ -662,7 +662,7 @@ static void submit_io(struct dm_buffer *b, int rw, bio_end_io_t *end_io)
 
 	sector = (b->block << b->c->sectors_per_block_bits) + b->c->start;
 
-	if (rw != WRITE) {
+	if (rw != REQ_OP_WRITE) {
 		n_sectors = 1 << b->c->sectors_per_block_bits;
 		offset = 0;
 	} else {
@@ -740,7 +740,7 @@ static void __write_dirty_buffer(struct dm_buffer *b,
 	b->write_end = b->dirty_end;
 
 	if (!write_list)
-		submit_io(b, WRITE, write_endio);
+		submit_io(b, REQ_OP_WRITE, write_endio);
 	else
 		list_add_tail(&b->write_list, write_list);
 }
@@ -753,7 +753,7 @@ static void __flush_write_list(struct list_head *write_list)
 		struct dm_buffer *b =
 			list_entry(write_list->next, struct dm_buffer, write_list);
 		list_del(&b->write_list);
-		submit_io(b, WRITE, write_endio);
+		submit_io(b, REQ_OP_WRITE, write_endio);
 		cond_resched();
 	}
 	blk_finish_plug(&plug);
@@ -1123,7 +1123,7 @@ static void *new_read(struct dm_bufio_client *c, sector_t block,
 		return NULL;
 
 	if (need_submit)
-		submit_io(b, READ, read_endio);
+		submit_io(b, REQ_OP_READ, read_endio);
 
 	wait_on_bit_io(&b->state, B_READING, TASK_UNINTERRUPTIBLE);
 
@@ -1193,7 +1193,7 @@ void dm_bufio_prefetch(struct dm_bufio_client *c,
 			dm_bufio_unlock(c);
 
 			if (need_submit)
-				submit_io(b, READ, read_endio);
+				submit_io(b, REQ_OP_READ, read_endio);
 			dm_bufio_release(b);
 
 			cond_resched();
@@ -1454,7 +1454,7 @@ retry:
 		old_block = b->block;
 		__unlink_buffer(b);
 		__link_buffer(b, new_block, b->list_mode);
-		submit_io(b, WRITE, write_endio);
+		submit_io(b, REQ_OP_WRITE, write_endio);
 		wait_on_bit_io(&b->state, B_WRITING,
 			       TASK_UNINTERRUPTIBLE);
 		__unlink_buffer(b);
@@ -1716,7 +1716,7 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign
 		if (!DM_BUFIO_CACHE_NAME(c)) {
 			r = -ENOMEM;
 			mutex_unlock(&dm_bufio_clients_lock);
-			goto bad_cache;
+			goto bad;
 		}
 	}
 
@@ -1727,7 +1727,7 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign
 		if (!DM_BUFIO_CACHE(c)) {
 			r = -ENOMEM;
 			mutex_unlock(&dm_bufio_clients_lock);
-			goto bad_cache;
+			goto bad;
 		}
 	}
 }
@@ -1738,27 +1738,28 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign
 
 		if (!b) {
 			r = -ENOMEM;
-			goto bad_buffer;
+			goto bad;
 		}
 		__free_buffer_wake(b);
 	}
 
+	c->shrinker.count_objects = dm_bufio_shrink_count;
+	c->shrinker.scan_objects = dm_bufio_shrink_scan;
+	c->shrinker.seeks = 1;
+	c->shrinker.batch = 0;
+	r = register_shrinker(&c->shrinker);
+	if (r)
+		goto bad;
+
 	mutex_lock(&dm_bufio_clients_lock);
 	dm_bufio_client_count++;
 	list_add(&c->client_list, &dm_bufio_all_clients);
 	__cache_size_refresh();
 	mutex_unlock(&dm_bufio_clients_lock);
 
-	c->shrinker.count_objects = dm_bufio_shrink_count;
-	c->shrinker.scan_objects = dm_bufio_shrink_scan;
-	c->shrinker.seeks = 1;
-	c->shrinker.batch = 0;
-	register_shrinker(&c->shrinker);
-
 	return c;
 
-bad_buffer:
-bad_cache:
+bad:
 	while (!list_empty(&c->reserved_buffers)) {
 		struct dm_buffer *b = list_entry(c->reserved_buffers.next,
 						 struct dm_buffer, lru_list);
@@ -1767,6 +1768,7 @@ bad_cache:
 	}
 	dm_io_client_destroy(c->dm_io);
 bad_dm_io:
+	mutex_destroy(&c->lock);
 	kfree(c);
 bad_client:
 	return ERR_PTR(r);
@@ -1811,6 +1813,7 @@ void dm_bufio_client_destroy(struct dm_bufio_client *c)
 		BUG_ON(c->n_buffers[i]);
 
 	dm_io_client_destroy(c->dm_io);
+	mutex_destroy(&c->lock);
 	kfree(c);
 }
 EXPORT_SYMBOL_GPL(dm_bufio_client_destroy);
drivers/md/dm-core.h
@@ -91,8 +91,7 @@ struct mapped_device {
 	/*
 	 * io objects are allocated from here.
 	 */
-	mempool_t *io_pool;
-
+	struct bio_set *io_bs;
 	struct bio_set *bs;
 
 	/*
@@ -130,8 +129,6 @@ struct mapped_device {
 	struct srcu_struct io_barrier;
 };
 
-void dm_init_md_queue(struct mapped_device *md);
 void dm_init_normal_md_queue(struct mapped_device *md);
 int md_in_flight(struct mapped_device *md);
 void disable_write_same(struct mapped_device *md);
 void disable_write_zeroes(struct mapped_device *md);
drivers/md/dm-crypt.c
@@ -2193,6 +2193,8 @@ static void crypt_dtr(struct dm_target *ti)
 	kzfree(cc->cipher_auth);
 	kzfree(cc->authenc_key);
 
+	mutex_destroy(&cc->bio_alloc_lock);
+
 	/* Must zero key material before freeing */
 	kzfree(cc);
 }
@@ -2702,8 +2704,7 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		goto bad;
 	}
 
-	cc->bs = bioset_create(MIN_IOS, 0, (BIOSET_NEED_BVECS |
-					    BIOSET_NEED_RESCUER));
+	cc->bs = bioset_create(MIN_IOS, 0, BIOSET_NEED_BVECS);
 	if (!cc->bs) {
 		ti->error = "Cannot allocate crypt bioset";
 		goto bad;
drivers/md/dm-delay.c
@@ -229,6 +229,8 @@ static void delay_dtr(struct dm_target *ti)
 	if (dc->dev_write)
 		dm_put_device(ti, dc->dev_write);
 
+	mutex_destroy(&dc->timer_lock);
+
 	kfree(dc);
 }
drivers/md/dm-flakey.c
@@ -70,6 +70,11 @@ static int parse_features(struct dm_arg_set *as, struct flakey_c *fc,
 		arg_name = dm_shift_arg(as);
 		argc--;
 
+		if (!arg_name) {
+			ti->error = "Insufficient feature arguments";
+			return -EINVAL;
+		}
+
 		/*
 		 * drop_writes
 		 */
drivers/md/dm-io.c (+1 -2)
@@ -58,8 +58,7 @@ struct dm_io_client *dm_io_client_create(void)
 	if (!client->pool)
 		goto bad;
 
-	client->bios = bioset_create(min_ios, 0, (BIOSET_NEED_BVECS |
-						  BIOSET_NEED_RESCUER));
+	client->bios = bioset_create(min_ios, 0, BIOSET_NEED_BVECS);
 	if (!client->bios)
 		goto bad;
drivers/md/dm-kcopyd.c
@@ -477,8 +477,10 @@ static int run_complete_job(struct kcopyd_job *job)
 	 * If this is the master job, the sub jobs have already
 	 * completed so we can free everything.
 	 */
-	if (job->master_job == job)
+	if (job->master_job == job) {
+		mutex_destroy(&job->lock);
 		mempool_free(job, kc->job_pool);
+	}
 	fn(read_err, write_err, context);
 
 	if (atomic_dec_and_test(&kc->nr_jobs))
@@ -750,6 +752,7 @@ int dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
 	 * followed by SPLIT_COUNT sub jobs.
 	 */
 	job = mempool_alloc(kc->job_pool, GFP_NOIO);
+	mutex_init(&job->lock);
 
 	/*
 	 * set up for the read.
@@ -811,7 +814,6 @@ int dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
 	if (job->source.count <= SUB_JOB_SIZE)
 		dispatch_job(job);
 	else {
-		mutex_init(&job->lock);
 		job->progress = 0;
 		split_job(job);
 	}
drivers/md/dm-log-writes.c
@@ -594,7 +594,7 @@ static int log_mark(struct log_writes_c *lc, char *data)
 		return -ENOMEM;
 	}
 
-	block->data = kstrndup(data, maxsize, GFP_KERNEL);
+	block->data = kstrndup(data, maxsize - 1, GFP_KERNEL);
 	if (!block->data) {
 		DMERR("Error copying mark data");
 		kfree(block);
(+187 -110) File diff suppressed because it is too large
@@ -195,9 +195,6 @@ static struct dm_path *ql_select_path(struct path_selector *ps, size_t nr_bytes)
|
||||
if (list_empty(&s->valid_paths))
|
||||
goto out;
|
||||
|
||||
/* Change preferred (first in list) path to evenly balance. */
|
||||
list_move_tail(s->valid_paths.next, &s->valid_paths);
|
||||
|
||||
list_for_each_entry(pi, &s->valid_paths, list) {
|
||||
if (!best ||
|
||||
(atomic_read(&pi->qlen) < atomic_read(&best->qlen)))
|
||||
@@ -210,6 +207,9 @@ static struct dm_path *ql_select_path(struct path_selector *ps, size_t nr_bytes)
|
||||
if (!best)
|
||||
goto out;
|
||||
|
||||
/* Move most recently used to least preferred to evenly balance. */
|
||||
list_move_tail(&best->list, &s->valid_paths);
|
||||
|
||||
ret = best->path;
|
||||
out:
|
||||
spin_unlock_irqrestore(&s->lock, flags);
|
||||
|
||||
(+254 -126) File diff suppressed because it is too large
drivers/md/dm-rq.c (+4 -2)
@@ -315,6 +315,10 @@ static void dm_done(struct request *clone, blk_status_t error, bool mapped)
 		/* The target wants to requeue the I/O */
 		dm_requeue_original_request(tio, false);
 		break;
+	case DM_ENDIO_DELAY_REQUEUE:
+		/* The target wants to requeue the I/O after a delay */
+		dm_requeue_original_request(tio, true);
+		break;
 	default:
 		DMWARN("unimplemented target endio return value: %d", r);
 		BUG();
@@ -713,7 +717,6 @@ int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t)
 	/* disable dm_old_request_fn's merge heuristic by default */
 	md->seq_rq_merge_deadline_usecs = 0;
 
-	dm_init_normal_md_queue(md);
 	blk_queue_softirq_done(md->queue, dm_softirq_done);
 
 	/* Initialize the request-based DM worker thread */
@@ -821,7 +824,6 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
 		err = PTR_ERR(q);
 		goto out_tag_set;
 	}
-	dm_init_md_queue(md);
 
 	return 0;
Some files were not shown because too many files have changed in this diff.