linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
Joe Thornber	88bf5184fa	dm cache: wake the worker thread every time we free a migration object When the cache is idle, writeback work was only being issued every second. With this change outstanding writebacks are streamed constantly. This offers a writeback performance improvement. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-06-11 17:13:00 -04:00
Joe Thornber	66a6363566	dm cache: add stochastic-multi-queue (smq) policy The stochastic-multi-queue (smq) policy addresses some of the problems with the current multiqueue (mq) policy. Memory usage ------------ The mq policy uses a lot of memory; 88 bytes per cache block on a 64-bit machine. SMQ uses 28-bit indexes to implement its data structures rather than pointers. It avoids storing an explicit hit count for each block. It has a 'hotspot' queue rather than a pre cache which uses a quarter of the entries (each hotspot block covers a larger area than a single cache block). All these mean smq uses ~25 bytes per cache block. Still a lot of memory, but a substantial improvement nonetheless. Level balancing --------------- MQ places entries in different levels of the multiqueue structures based on their hit count (~ln(hit count)). This means the bottom levels generally have the most entries, and the top ones have very few. Having unbalanced levels like this reduces the efficacy of the multiqueue. SMQ does not maintain a hit count, instead it swaps hit entries with the least recently used entry from the level above. The overall ordering being a side effect of this stochastic process. With this scheme we can decide how many entries occupy each multiqueue level, resulting in better promotion/demotion decisions. Adaptability ------------ The MQ policy maintains a hit count for each cache block. For a different block to get promoted to the cache its hit count has to exceed the lowest currently in the cache. This means it can take a long time for the cache to adapt between varying IO patterns. Periodically degrading the hit counts could help with this, but I haven't found a nice general solution. SMQ doesn't maintain hit counts, so a lot of this problem just goes away. In addition it tracks performance of the hotspot queue, which is used to decide which blocks to promote. If the hotspot queue is performing badly then it starts moving entries more quickly between levels.
This lets it adapt to new IO patterns very quickly. Performance ----------- In my tests SMQ shows substantially better performance than MQ. Once this matures a bit more I'm sure it'll become the default policy. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-06-11 17:12:59 -04:00
Joe Thornber	40775257b9	dm cache: boost promotion of blocks that will be overwritten When considering whether to move a block to the cache we already give preferential treatment to discarded blocks, since they are cheap to promote (no read of the origin required since the data is junk). The same is true of blocks that are about to be completely overwritten, so we likewise boost their promotion chances. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:07 -04:00
Joe Thornber	651f5fa2a3	dm cache: defer whole cells Currently individual bios are deferred to the worker thread if they cannot be processed immediately (eg, a block is in the process of being moved to the fast device). This patch passes whole cells across to the worker. This saves reacquiring the cell, and also collects bios destined for the same block together, which allows them to be mapped with a single look up to the policy. This reduces the overhead of using dm-cache. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:06 -04:00
Joe Thornber	3cdf93f9d8	dm bio prison: add dm_cell_promote_or_release() Rather than always releasing the prisoners in a cell, the client may want to promote one of them to be the new holder. There is a race here though between releasing an empty cell, and other threads adding new inmates. So this function makes the decision with its lock held. This function can have two outcomes: i) An inmate is promoted to be the holder of the cell (return value of 0). ii) The cell has no inmate for promotion and is released (return value of 1). Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:06 -04:00
Joe Thornber	451b9e0071	dm cache: pull out some bitset utility functions for reuse Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:05 -04:00
Joe Thornber	20f6814b94	dm cache: pass a new 'critical' flag to the policies when requesting writeback work We only allow non critical writeback if the origin is idle. It is up to the policy to decide what writeback work is critical. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:04 -04:00
Joe Thornber	066dbaa386	dm cache: track IO to the origin device using io_tracker Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:04 -04:00
Joe Thornber	77289d3207	dm cache: add io_tracker A little class that keeps track of the volume of io that is in flight, and the length of time that a device has been idle for. FIXME: rather than jiffies, may be best to use ktime_t (to support faster devices). Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:03 -04:00
Joe Thornber	fb4100ae7f	dm cache: fix race when issuing a POLICY_REPLACE operation There is a race between a policy deciding to replace a cache entry, the core target writing back any dirty data from this block, and other IO threads doing IO to the same block. This sort of problem is avoided most of the time by the core target grabbing a bio prison cell before making the request to the policy. But for a demotion the core target doesn't know which block will be demoted, so can't do this in advance. Fix this demotion race by introducing a callback to the policy interface that allows the policy to grab the cell on behalf of the core target. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org	2015-05-29 14:19:03 -04:00
Milan Broz	54cea3f668	dm crypt: add comments to better describe crypto processing logic A crypto driver can process requests synchronously or asynchronously and can use an internal driver queue to backlog requests. Add some comments to clarify internal logic and completion return codes. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:02 -04:00
Lidong Zhong	ed63287dd6	dm raid1: keep issuing IO after leg failure Currently if there is a leg failure, the bio will be put into the hold list until userspace does a remove/replace on the leg. Doing so in a cluster config (clvmd) is problematic because there may be a temporary path failure that results in cluster raid1 remove/replace. Such recovery takes a long time due to a full resync. Update dm-raid1 to optionally ignore these failures so bios continue being issued without interruption. To enable this feature userspace must pass "keep_log" when creating the dm-raid1 device. Signed-off-by: Lidong Zhong <lzhong@suse.com> Tested-by: Liuhua Wang <lwang@suse.com> Acked-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:02 -04:00
Geert Uytterhoeven	f4ad317aed	dm log writes: use ULL suffix for 64-bit constants On 32-bit: drivers/md/dm-log-writes.c: In function ‘log_super’: drivers/md/dm-log-writes.c:323: warning: integer constant is too large for ‘long’ type Add a ULL suffix to WRITE_LOG_MAGIC to fix this. Also add a ULL suffix to WRITE_LOG_VERSION as it's stored in a __le64 field. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:01 -04:00
Luis Henriques	e223e1de4f	dm stripe: drop useless exit point from dm_stripe_init() Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:01 -04:00
Heinz Mauelshagen	0cf4503174	dm raid: add support for the MD RAID0 personality Add dm-raid access to the MD RAID0 personality to enable single zone striping. The following changes enable that access: - add type definition to raid_types array - make bitmap creation conditional in super_validate(), because bitmaps are not allowed in raid0 - set rdev->sectors to the data image size in super_validate() to allow the raid0 personality to calculate the MD array size properly - use mddev_(un)lock() functions instead of direct mutex_(un)lock() (wrapped in here because it's a trivial change) - enhance raid_status() to always report full sync for raid0 so that userspace checks for 100% sync will succeed and allow for resize (and takeover/reshape once added in future patches) - enhance raid_resume() to not load bitmap in case of raid0 - add merge function to avoid data corruption (seen with readahead) that resulted from bio payloads that grew too large. This problem did not occur with the other raid levels because it either did not apply without striping (raid1) or was avoided via stripe caching. - raise version to 1.7.0 because of the raid0 API change Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:00 -04:00
Heinz Mauelshagen	c76d53f43e	dm raid: a few cleanups - ensure maximum device limit in superblock - rename DMPF_* (print flags) to CTR_FLAG_* (constructor flags) and their respective struct raid_set member - use strcasecmp() in raid10_format_to_md_layout() as in the constructor Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:19:00 -04:00
Heinz Mauelshagen	0f4106b32f	dm raid: fixup documentation for discard support Remove comment above parse_raid_params() that claims "devices_handle_discard_safely" is a table line argument when it is actually a module parameter. Also, backfill dm-raid target version 1.6.0 documentation. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:18:59 -04:00
Mike Snitzer	49f154c732	dm thin metadata: remove in-core 'read_only' flag Leverage the block manager's read_only flag instead of duplicating it; access with new dm_bm_is_read_only() method. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:18:59 -04:00
Mike Snitzer	f8ae75253e	dm thin: cleanup schedule_zero() to read more logically The overwrite has only ever been about optimizing away the need to zero a block if the entire block was being overwritten. As such it is only relevant when zeroing is enabled. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Joe Thornber <ejt@redhat.com>	2015-05-29 14:18:58 -04:00
Mike Snitzer	8b908f8e94	dm thin: cleanup overwrite's endio restore to be centralized Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:18:58 -04:00
Mike Snitzer	0f20972f7b	dm: factor out a common cleanup_mapped_device() Introduce a single common method for cleaning up a DM device's mapped_device. No functional change, just eliminates duplication of delicate mapped_device cleanup code. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:18:58 -04:00
Mike Snitzer	2d76fff18f	dm: cleanup methods that requeue requests More often than not a request that is requeued _is_ mapped (meaning the clone request is allocated and clone->q is initialized). Rename dm_requeue_unmapped_original_request() to avoid potential confusion due to function name containing "unmapped". Also, remove dm_requeue_unmapped_request() since callers can easily call the dm_requeue_original_request() directly. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:18:57 -04:00
Mike Snitzer	cbc4e3c135	dm: do not allocate any mempools for blk-mq request-based DM Do not allocate the io_pool mempool for blk-mq request-based DM (DM_TYPE_MQ_REQUEST_BASED) in dm_alloc_rq_mempools(). Also refine __bind_mempools() to have more precise awareness of which mempools each type of DM device uses -- avoids mempool churn when reloading DM tables (particularly for DM_TYPE_REQUEST_BASED). Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2015-05-29 14:18:57 -04:00
Mike Snitzer	183f7802e7	Merge remote-tracking branch 'jens/for-4.2/core' into dm-4.2	2015-05-29 14:17:16 -04:00
Joe Thornber	1c220c69ce	dm: fix casting bug in dm_merge_bvec() dm_merge_bvec() was originally added in f6fccb ("dm: introduce merge_bvec_fn"). In that commit a value in sectors is converted to bytes using << 9, and then assigned to an int. This code made assumptions about the value of BIO_MAX_SECTORS. A later commit 148e51 ("dm: improve documentation and code clarity in dm_merge_bvec") was meant to have no functional change but it removed the use of BIO_MAX_SECTORS in favor of using queue_max_sectors(). At this point the cast from sector_t to int resulted in a zero value. The fallout being dm_merge_bvec() would only allow a single page to be added to a bio. This interim fix is minimal for the benefit of stable@ because the more comprehensive cleanup of passing a sector_t to all DM targets' merge function will impact quite a few DM targets. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 3.19+	2015-05-29 13:41:16 -04:00
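
The truncation described in the dm_merge_bvec() entry above can be illustrated with a small sketch. This is not the kernel code: the helper names are hypothetical, `uint64_t` stands in for `sector_t`, and an unsigned 32-bit result is used instead of `int` so that the narrowing conversion is well defined in standard C. It only shows how a sector count shifted left by 9 can lose all of its significant bits when stored in a 32-bit variable.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors */

/*
 * Hypothetical helper: convert sectors to bytes, then store the result
 * in a 32-bit variable. The shift itself is done in 64 bits; the
 * conversion keeps only the low 32 bits of the byte count.
 */
uint32_t sectors_to_bytes_narrow(uint64_t sectors)
{
	return (uint32_t)(sectors << SECTOR_SHIFT);
}

/*
 * Hypothetical helper: the same conversion kept in a 64-bit type,
 * so no bits are discarded for realistic sector counts.
 */
uint64_t sectors_to_bytes_wide(uint64_t sectors)
{
	return sectors << SECTOR_SHIFT;
}
```

With a limit of 2^23 sectors (4 GiB), `sectors_to_bytes_narrow(1ULL << 23)` yields 0 while `sectors_to_bytes_wide(1ULL << 23)` yields 4294967296. A zero byte limit of this kind is consistent with the symptom the commit message reports, where only a single page could be added to a bio.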


519788 Commits