Commit Graph

589729 Commits

Author SHA1 Message Date
Mike Snitzer 13e4f8a695 dm thin: remove __bio_inc_remaining() and switch to using bio_inc_remaining()
DM thinp's use of bio_inc_remaining() is critical to ensure the original
parent discard bio isn't completed before sub-discards have.  DM thinp
needs this due to the extra quiescing that occurs, via multiple DM thinp
mappings, while processing large discards.  As such DM thinp must build
the async discard bio chain after some delay -- so bio_inc_remaining()
is used to enable DM thinp to take a reference on the original parent
discard bio for each mapping.  This allows the immediate use of
bio_endio() on that discard bio; but with the understanding that the
actual completion won't occur until each of the sub-discards'
per-mapping references are dropped.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
2016-05-13 09:03:52 -04:00
Heinz Mauelshagen 4c9971ca6a dm raid: make sure no feature flags are set in metadata
Given we don't yet support any feature flags in the dm-raid ondisk
metadata (see: 'features' member of 'struct dm_raid_superblock'),
add a check to ensure no flags are actually set, if any features are
set reject the activation of the RAID mapping.

This is to prevent possible data corruption in case of a kernel
downgrade when there'll potentially be feature flags set by a future
dm-raid target.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-05-13 09:03:51 -04:00
Michal Hocko 72f6d8d8c9 dm ioctl: drop use of __GFP_REPEAT in copy_params()'s __vmalloc() call
copy_params()'s use of __GFP_REPEAT for the __vmalloc() call doesn't make much
sense because vmalloc doesn't rely on costly high order allocations.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-05-05 15:25:55 -04:00
Eric Engestrom 52813d4046 dm stats: fix spelling mistake in Documentation
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-05-05 15:25:54 -04:00
Mike Snitzer 492d48db8d dm cache: update cache-policies.txt now that mq is an alias for smq
Also fix some typos and make all "smq" and "mq" references consistently
lowercase.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-05-05 15:25:53 -04:00
Mike Snitzer 2da1610ae2 dm mpath: eliminate use of spinlock in IO fast-paths
The primary motivation of this commit is to improve the scalability of
DM multipath on large NUMA systems where m->lock spinlock contention has
been proven to be a serious bottleneck on really fast storage.

The ability to atomically read a pointer, using lockless_dereference(),
is leveraged in this commit.  But all pointer writes are still protected
by the m->lock spinlock (which is fine since these all now occur in the
slow-path).

The following functions no longer require the m->lock spinlock in their
fast-path: multipath_busy(), __multipath_map(), and do_end_io()

And choose_pgpath() is modified to _not_ update m->current_pgpath unless
it also switches the path-group.  This is done to avoid needing to take
the m->lock everytime __multipath_map() calls choose_pgpath().
But m->current_pgpath will be reset if it is failed via fail_path().

Suggested-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-05-05 15:25:52 -04:00
Mike Snitzer 20800cb345 dm mpath: move trigger_event member to the end of 'struct multipath'
Allows the 'work_mutex' member to no longer cross a cacheline.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-05-05 15:25:52 -04:00
Mike Snitzer 91e968aa60 dm mpath: use atomic_t for counting members of 'struct multipath'
The use of atomic_t for nr_valid_paths, pg_init_in_progress and
pg_init_count will allow relaxing the use of the m->lock spinlock.

Suggested-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-05-05 15:25:51 -04:00
Mike Snitzer 518257b132 dm mpath: switch to using bitops for state flags
Mechanical change that doesn't make any real effort to reduce the use of
m->lock; that will come later (once atomics are used for counters, etc).

Suggested-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-05-05 15:25:50 -04:00
Amitoj Kaur Chawla 813923b1a2 dm thin: Remove return statement from void function
Return statement at the end of a void function is useless.

The Coccinelle semantic patch used to make this change is as follows:
//<smpl>
@@
identifier f;
expression e;
@@
void f(...) {
<...
- return
  e;
...>
}
//</smpl>

Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-05-05 15:25:50 -04:00
Mike Snitzer cfae7529b5 dm: remove unused mapped_device argument from free_tio()
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-05-05 15:25:49 -04:00
Mike Snitzer ef1d88ced1 Merge remote-tracking branch 'jens/for-4.7/core' into dm-4.7
Needed in order to update the DM thinp code to use the new async
__blkdev_issue_discard() interface.
2016-05-05 15:21:14 -04:00
Mike Snitzer 0ef5a50c16 block: make bio_inc_remaining() interface accessible again
Commit 326e1dbb57 ("block: remove management of bi_remaining when
restoring original bi_end_io") made bio_inc_remaining() private to bio.c
because the only use-case that made sense was confined to the
bio_chain() interface.

Since that time DM thinp went on to use bio_chain() in its relatively
complex implementation of async discard support.  That implementation,
even when converted over to use the new async __blkdev_issue_discard()
interface, depends on deferred completion of the original discard bio --
which is most appropriately implemented using bio_inc_remaining().

DM thinp foolishly duplicated bio_inc_remaining(), local to dm-thin.c as
__bio_inc_remaining(), so re-exporting bio_inc_remaining() allows us to
put an end to that foolishness.

All said, bio_inc_remaining() should really only be used in conjunction
with bio_chain().  It isn't intended for generic bio reference counting.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-05 13:03:29 -06:00
Mike Snitzer bbd848e0fa block: reinstate early return of -EOPNOTSUPP from blkdev_issue_discard
Commit 38f2525533 ("block: add __blkdev_issue_discard") incorrectly
disallowed the early return of -EOPNOTSUPP if the device doesn't support
discard (or secure discard).  This early return of -EOPNOTSUPP has
always been part of blkdev_issue_discard() interface so there isn't a
good reason to break that behaviour -- especially when it can be easily
reinstated.

The nuance of allowing early return of -EOPNOTSUPP vs disallowing late
return of -EOPNOTSUPP is: if the overall device never advertised support
for discards and one is issued to the device it is beneficial to inform
the caller that discards are not supported via -EOPNOTSUPP.  But if a
device advertises discard support it means that at least a subset of the
device does have discard support -- but it could be that discards issued
to some regions of a stacked device will not be supported.  In that case
the late return of -EOPNOTSUPP must be disallowed.

Fixes: 38f2525533 ("block: add __blkdev_issue_discard")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-05 13:03:26 -06:00
Michael Callahan a21f2a3ec6 block: Minor blk_account_io_start usage cleanup
blk_account_io_start does not need to be wrapped with blk_do_io_stat
ais it already checks for that condition.

Signed-off-by: Michael Callahan <michaelcallahan@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-03 09:26:58 -06:00
Christoph Hellwig 38f2525533 block: add __blkdev_issue_discard
This is a version of blkdev_issue_discard which doesn't wait for
the I/O to complete, but instead allows the caller to submit
the final bio and/or chain it to others.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Sagi Grimberg <sagig@grimberg.me>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-02 09:19:46 -06:00
Christoph Hellwig 9082e87bfb block: remove struct bio_batch
It can be replaced with a combination of bio_chain and submit_bio_wait.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Sagi Grimberg <sagig@grimberg.me>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-02 09:19:43 -06:00
Linus Torvalds 04974df804 Linux 4.6-rc6 2016-05-01 15:52:31 -07:00
Linus Torvalds da9373d67c Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal
Pull thermal fixes from Eduardo Valentin:
 "A couple of minor fixes for the thermal subsystem.

  Specifics in this pull request:

   - Fixes in hisilicon thermal driver
   - More fixes of unsigned to int type change in thermal_core.c"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
  thermal: use %d to print S32 parameters
  thermal: hisilicon: increase temperature resolution
2016-04-30 18:57:42 -07:00
Linus Torvalds 1b46bac627 Merge tag 'powerpc-4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
 "A few more powerpc fixes for 4.6:

   - cxl: Keep IRQ mappings on context teardown from Michael Neuling

   - cxl: Poll for outstanding IRQs when detaching a context from
     Michael Neuling

   - Wire up preadv2 and pwritev2 syscalls from Rui Salvaterra"

* tag 'powerpc-4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: wire up preadv2 and pwritev2 syscalls
  cxl: Poll for outstanding IRQs when detaching a context
  cxl: Keep IRQ mappings on context teardown
2016-04-29 18:50:08 -07:00
Linus Torvalds 65c4cbeba7 Merge tag 'edac_fix_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Pull EDAC fix from Borislav Petkov:
 "Make sure sb_edac and i7core_edac do not terminate MCE processing on
  the decoding callchain prematurely"

* tag 'edac_fix_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback
2016-04-29 17:59:26 -07:00
Linus Torvalds b49a5195e2 Merge tag 'pm+acpi-4.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "One revert of a recent cpufreq commit that introduced a regression and
  a fix for intel_pstate's Turbo Activation Ratio handling code.

  Specifics:

   - Revert cpufreq commit that attempted to fix a problem in the
     ondemand/conservative governor code, but did that incorrectly and
     introduced another problem instead (Rafael Wysocki).

   - Fix incorrect decoding of MSR contents related to the Turbo
     Activation Ratio (TAR) handling in the intel_pstate driver
     (Srinivas Pandruvada)"

* tag 'pm+acpi-4.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: intel_pstate: Fix processing for turbo activation ratio
  Revert "cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC"
2016-04-29 17:39:51 -07:00
Linus Torvalds a8feb78209 Merge tag 'mmc-v4.6-rc4' of git://git.linaro.org/people/ulf.hansson/mmc
Pull MMC fixes from Ulf Hansson:
 "Here are a two MMC host fixes:

  - sdhci-acpi: Reduce Baytrail eMMC/SD/SDIO hangs

  - sunxi: Disable eMMC HS-DDR for Allwinner A80"

* tag 'mmc-v4.6-rc4' of git://git.linaro.org/people/ulf.hansson/mmc:
  mmc: sunxi: Disable eMMC HS-DDR (MMC_CAP_1_8V_DDR) for Allwinner A80
  mmc: sdhci-acpi: Reduce Baytrail eMMC/SD/SDIO hangs
2016-04-29 17:32:19 -07:00
Linus Torvalds b9cc335ffa Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "A few fixes all over the place:

  radeon is probably the biggest standout, it's a fix for screen
  corruption or hung black outputs so I thought it was worth pulling in.

  Otherwise some amdgpu power control fixes, some misc vmwgfx fixes, one
  etnaviv fix, one virtio-gpu fix, two DP MST fixes, and a single TTM
  fix"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/vmwgfx: Fix order of operation
  drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands.
  drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATION
  drm/amdgpu: disable vm interrupts with vm_fault_stop=2
  drm/amdgpu: print a message if ATPX dGPU power control is missing
  Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control"
  drm/radeon: fix vertical bars appear on monitor (v2)
  drm/ttm: fix kref count mess in ttm_bo_move_to_lru_tail
  drm/virtio: send vblank event after crtc updates
  drm/dp/mst: Restore primary hub guid on resume
  drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()
  drm/etnaviv: don't move linear memory window on 3D cores without MC2.0
2016-04-29 17:18:55 -07:00
Linus Torvalds 925d96a0c9 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
 "Final set of -rc fixes for 4.6.

  I've collected up a number of patches that are all pretty small with
  the exception of only a couple.  The hfi1 driver has a number of
  important patches, and it is what really drives the line count of this
  pull request up.  These are all small and I've got this kernel built
  and running in the test lab (I have most of the hardware, I think nes
  is the only thing in this patch set that I can't say I've personally
  tested and have up and running).

  Summary:

   - A number of collected fixes for oopses, memory corruptions,
     deadlocks, etc.  All of these fixes are small (many only 5-10
     lines), obvious, and tested.

   - Fix for the security issue related to the use of write for
     bi-directional communications"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  RDMA/nes: don't leak skb if carrier down
  IB/security: Restrict use of the write() interface
  IB/hfi1: Use kernel default llseek for ui device
  IB/hfi1: Don't attempt to free resources if initialization failed
  IB/hfi1: Fix missing lock/unlock in verbs drain callback
  IB/rdmavt: Fix send scheduling
  IB/hfi1: Prevent unpinning of wrong pages
  IB/hfi1: Fix deadlock caused by locking with wrong scope
  IB/hfi1: Prevent NULL pointer deferences in caching code
  MAINTAINERS: Update iser/isert maintainer contact info
  IB/mlx5: Expose correct max_sge_rd limit
  RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips
  iw_cxgb4: handle draining an idle qp
  iw_cxgb3: initialize ibdev.iwcm->ifname for port mapping
  iw_cxgb4: initialize ibdev.iwcm->ifname for port mapping
  IB/core: Don't drain non-existent rq queue-pair
  IB/core: Fix oops in ib_cache_gid_set_default_gid
2016-04-29 17:07:54 -07:00