Pull the pending 4.21 changes for md from Shaohua.
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md: fix raid10 hang issue caused by barrier
raid10: refactor common wait code from regular read/write request
md: remvoe redundant condition check
lib/raid6: add option to skip algo benchmarking
lib/raid6: sort algos in rough performance order
lib/raid6: check for assembler SSSE3 support
lib/raid6: avoid __attribute_const__ redefinition
lib/raid6: add missing include for raid6test
md: remove set but not used variable 'bi_rdev'
Pull DMA mapping updates from Christoph Hellwig:
"A huge update this time, but a lot of that is just consolidating or
removing code:
- provide a common DMA_MAPPING_ERROR definition and avoid indirect
calls for dma_map_* error checking
- use direct calls for the DMA direct mapping case, avoiding huge
retpoline overhead for high performance workloads
- merge the swiotlb dma_map_ops into dma-direct
- provide a generic remapping DMA consistent allocator for
architectures that have devices that perform DMA that is not cache
coherent. Based on the existing arm64 implementation and also used
for csky now.
- improve the dma-debug infrastructure, including dynamic allocation
of entries (Robin Murphy)
- default to providing chaining scatterlist everywhere, with opt-outs
for the few architectures (alpha, parisc, most arm32 variants) that
can't cope with it
- misc sparc32 dma-related cleanups
- remove the dma_mark_clean arch hook used by swiotlb on ia64 and
replace it with the generic noncoherent infrastructure
- fix the return type of dma_set_max_seg_size (Niklas Söderlund)
- move the dummy dma ops for not DMA capable devices from arm64 to
common code (Robin Murphy)
- ensure dma_alloc_coherent returns zeroed memory to avoid kernel
data leaks through userspace. We already did this for most common
architectures, but this ensures we do it everywhere.
dma_zalloc_coherent has been deprecated and can hopefully be
removed after -rc1 with a coccinelle script"
* tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping: (73 commits)
dma-mapping: fix inverted logic in dma_supported
dma-mapping: deprecate dma_zalloc_coherent
dma-mapping: zero memory returned from dma_alloc_*
sparc/iommu: fix ->map_sg return value
sparc/io-unit: fix ->map_sg return value
arm64: default to the direct mapping in get_arch_dma_ops
PCI: Remove unused attr variable in pci_dma_configure
ia64: only select ARCH_HAS_DMA_COHERENT_TO_PFN if swiotlb is enabled
dma-mapping: bypass indirect calls for dma-direct
vmd: use the proper dma_* APIs instead of direct methods calls
dma-direct: merge swiotlb_dma_ops into the dma_direct code
dma-direct: use dma_direct_map_page to implement dma_direct_map_sg
dma-direct: improve addressability error reporting
swiotlb: remove dma_mark_clean
swiotlb: remove SWIOTLB_MAP_ERROR
ACPI / scan: Refactor _CCA enforcement
dma-mapping: factor out dummy DMA ops
dma-mapping: always build the direct mapping code
dma-mapping: move dma_cache_sync out of line
dma-mapping: move various slow path functions out of line
...
This is helpful for systems where fast startup time is important.
It is especially nice to avoid benchmarking RAID functions that are
never used (for example, BTRFS selects RAID6_PQ even if the parity RAID
mode is not in use).
This saves 250+ milliseconds of boot time on modern x86 and ARM systems
with a dozen or more available implementations.
The new option is defaulted to 'y' to match the previous behavior of
always benchmarking on init.
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
Signed-off-by: Shaohua Li <shli@fb.com>
These days architectures are mostly out of the business of dealing with
struct scatterlist at all, unless they have architecture specific iommu
drivers. Replace the ARCH_HAS_SG_CHAIN symbol with a ARCH_NO_SG_CHAIN
one only enabled for architectures with horrible legacy iommu drivers
like alpha and parisc, and conditionally for arm which wants to keep it
disable for legacy platforms.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
This lib tracks objects which could be of two types:
1) root object
2) nested object - with a "delta" which differentiates it from
the associated root object
The objects are tracked by a hashtable and reference-counted. User is
responsible of implementing callbacks to create/destroy root entity
related to each root object and callback to create/destroy nested object
delta.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull more RISC-V updates from Palmer Dabbelt:
"This contains the follow-on patches I'd like to target for the 4.20
merge window. I'm being somewhat conservative here, as while there are
a few patches on the mailing list that were posted early in the merge
window I'd like to let those bake for another round -- this was a
fairly big release as far as RISC-V is concerened, and we need to walk
before we can run.
As far as the patches that made it go:
- A patch to ignore offline CPUs when calculating AT_HWCAP. This
should fix GDB on the HiFive unleashed, which has an embedded core
for hart 0 which is exposed to Linux as an offline CPU.
- A move of EM_RISCV to elf-em.h, which is where it should have been
to begin with.
- I've also removed the 64-bit divide routines. I know I'm not really
playing by my own rules here because I posted the patches this
morning, but since they shouldn't be in the kernel I think it's
better to err on the side of going too fast here.
I don't anticipate any more patch sets for the merge window"
* tag 'riscv-for-linus-4.20-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
Move EM_RISCV into elf-em.h
RISC-V: properly determine hardware caps
Revert "lib: Add umoddi3 and udivmoddi4 of GCC library routines"
Revert "RISC-V: Select GENERIC_LIB_UMODDI3 on RV32"
Pull XArray conversion from Matthew Wilcox:
"The XArray provides an improved interface to the radix tree data
structure, providing locking as part of the API, specifying GFP flags
at allocation time, eliminating preloading, less re-walking the tree,
more efficient iterations and not exposing RCU-protected pointers to
its users.
This patch set
1. Introduces the XArray implementation
2. Converts the pagecache to use it
3. Converts memremap to use it
The page cache is the most complex and important user of the radix
tree, so converting it was most important. Converting the memremap
code removes the only other user of the multiorder code, which allows
us to remove the radix tree code that supported it.
I have 40+ followup patches to convert many other users of the radix
tree over to the XArray, but I'd like to get this part in first. The
other conversions haven't been in linux-next and aren't suitable for
applying yet, but you can see them in the xarray-conv branch if you're
interested"
* 'xarray' of git://git.infradead.org/users/willy/linux-dax: (90 commits)
radix tree: Remove multiorder support
radix tree test: Convert multiorder tests to XArray
radix tree tests: Convert item_delete_rcu to XArray
radix tree tests: Convert item_kill_tree to XArray
radix tree tests: Move item_insert_order
radix tree test suite: Remove multiorder benchmarking
radix tree test suite: Remove __item_insert
memremap: Convert to XArray
xarray: Add range store functionality
xarray: Move multiorder_check to in-kernel tests
xarray: Move multiorder_shrink to kernel tests
xarray: Move multiorder account test in-kernel
radix tree test suite: Convert iteration test to XArray
radix tree test suite: Convert tag_tagged_items to XArray
radix tree: Remove radix_tree_clear_tags
radix tree: Remove radix_tree_maybe_preload_order
radix tree: Remove split/join code
radix tree: Remove radix_tree_update_node_t
page cache: Finish XArray conversion
dax: Convert page fault handlers to XArray
...
Add umoddi3 and udivmoddi4 support for 32-bit.
The RV32 need the umoddi3 to do modulo when the operands are long long
type, like other libraries implementation such as ucmpdi2, lshrdi3 and
so on.
I encounter the undefined reference 'umoddi3' when I use the in
house dma driver, although it is in house driver, but I think that
umoddi3 is a common function for RV32.
The udivmoddi4 and umoddi3 are copies from libgcc in gcc. There are other
functions use the udivmoddi4 in libgcc, so I separate the umoddi3 and
udivmoddi4 for flexible extension in the future.
Signed-off-by: Zong Li <zong@andestech.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
All users have now been converted to the XArray. Removing the support
reduces code size and ensures new users will use the XArray instead.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Instead of storing a pointer to the slot containing the canonical entry,
store the offset of the slot. Produces slightly more efficient code
(~300 bytes) and simplifies the implementation.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Patch series "add crc64 calculation as kernel library", v5.
This patchset adds basic implementation of crc64 calculation as a Linux
kernel library. Since bcache already does crc64 by itself, this patchset
also modifies bcache code to use the new crc64 library routine.
Currently bcache is the only user of crc64 calculation, another potential
user is bcachefs which is on the way to be in mainline kernel. Therefore
it makes sense to make crc64 calculation to be a public library.
bcache uses crc64 as storage checksum, if a change of crc lib routines
results an inconsistent result, the unmatched checksum may make bcache
'think' the on-disk is corrupted, such a change should be avoided or
detected as early as possible. Therefore a patch is being prepared which
adds a crc test framework, to check consistency of different calculations.
This patch (of 2):
Add the re-write crc64 calculation routines for Linux kernel. The CRC64
polynomical arithmetic follows ECMA-182 specification, inspired by CRC
paper of Dr. Ross N. Williams (see
http://www.ross.net/crc/download/crc_v3.txt) and other public domain
implementations.
All the changes work in this way,
- When Linux kernel is built, host program lib/gen_crc64table.c will be
compiled to lib/gen_crc64table and executed.
- The output of gen_crc64table execution is an array called as lookup
table (a.k.a POLY 0x42f0e1eba9ea369) which contain 256 64-bit long
numbers, this table is dumped into header file lib/crc64table.h.
- Then the header file is included by lib/crc64.c for normal 64bit crc
calculation.
- Function declaration of the crc64 calculation routines is placed in
include/linux/crc64.h
Currently bcache is the only user of crc64_be(), another potential user is
bcachefs which is on the way to be in mainline kernel. Therefore it makes
sense to move crc64 calculation into lib/crc64.c as public code.
[colyli@suse.de: fix review comments from v4]
Link: http://lkml.kernel.org/r/20180726053352.2781-2-colyli@suse.de
Link: http://lkml.kernel.org/r/20180718165545.1622-2-colyli@suse.de
Signed-off-by: Coly Li <colyli@suse.de>
Co-developed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Michael Lyle <mlyle@lyle.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Noah Massey <noah.massey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull dma-mapping rename from Christoph Hellwig:
"Move all the dma-mapping code to kernel/dma and lose their dma-*
prefixes"
* tag 'dma-rename-4.18' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: move all DMA mapping code to kernel/dma
dma-mapping: use obj-y instead of lib-y for generic dma ops
Currently the code is split over various files with dma- prefixes in the
lib/ and drives/base directories, and the number of files keeps growing.
Move them into a single directory to keep the code together and remove
the file name prefixes. To match the irq infrastructure this directory
is placed under the kernel/ directory.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Pull MIPS updates from James Hogan:
"These are the main MIPS changes for 4.18.
Rough overview:
- MAINTAINERS: Add Paul Burton as MIPS co-maintainer
- Misc: Generic compiler intrinsics, Y2038 improvements, Perf+MT fixes
- Platform support: Netgear WNR1000 V3, Microsemi Ocelot integrated
switch, Ingenic watchdog cleanups
More detailed summary:
Maintainers:
- Add Paul Burton as MIPS co-maintainer, as I soon won't have access
to much MIPS hardware, nor enough time to properly maintain MIPS on
my own.
Miscellaneous:
- Use generic GCC library routines from lib/
- Add notrace to generic ucmpdi2 implementation
- Rename compiler intrinsic selects to GENERIC_LIB_*
- vmlinuz: Use generic ashldi3
- y2038: Convert update/read_persistent_clock() to *_clock64()
- sni: Remove read_persistent_clock()
- perf: Fix perf with MT counting other threads
- Probe for per-TC perf counters in cpu-probe.c
- Use correct VPE ID for VPE tracing
Minor cleanups:
- Avoid unneeded built-in.a in DTS dirs
- sc-debugfs: Re-use kstrtobool_from_user
- memset.S: Reinstate delay slot indentation
- VPE: Fix spelling "uneeded" -> "Unneeded"
Platform support:
BCM47xx:
- Add support for Netgear WNR1000 V3
- firmware: Support small NVRAM partitions
- Use __initdata for LEDs platform data
Ingenic:
- Watchdog driver & platform code improvements:
- Disable clock after stopping counter
- Use devm_* functions
- Drop module remove function
- Move platform reset code to restart handler in driver
- JZ4740: Convert watchdog instantiation to DT
- JZ4780: Fix watchdog DT node
- qi_lb60_defconfig: Enable watchdog driver
Microsemi:
- Ocelot: Add support for integrated switch
- pcb123: Connect phys to ports"
* tag 'mips_4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (30 commits)
MAINTAINERS: Add Paul Burton as MIPS co-maintainer
MIPS: ptrace: Make FPU context layout comments match reality
MIPS: memset.S: Reinstate delay slot indentation
MIPS: perf: Fix perf with MT counting other threads
MIPS: perf: Use correct VPE ID when setting up VPE tracing
MIPS: perf: More robustly probe for the presence of per-tc counters
MIPS: Probe for MIPS MT perf counters per TC
MIPS: mscc: Connect phys to ports on ocelot_pcb123
MIPS: mscc: Add switch to ocelot
MIPS: JZ4740: Drop old platform reset code
MIPS: qi_lb60: Enable the jz4740-wdt driver
MIPS: JZ4780: dts: Fix watchdog node
MIPS: JZ4740: dts: Add bindings for the jz4740-wdt driver
watchdog: JZ4740: Drop module remove function
watchdog: JZ4740: Register a restart handler
watchdog: JZ4740: Use devm_* functions
watchdog: JZ4740: Disable clock after stopping counter
MIPS: VPE: Fix spelling mistake: "uneeded" -> "unneeded"
MIPS: Re-use kstrtobool_from_user()
MIPS: Convert update_persistent_clock() to update_persistent_clock64()
...
Pull libnvdimm updates from Dan Williams:
"This adds a user for the new 'bytes-remaining' updates to
memcpy_mcsafe() that you already received through Ingo via the
x86-dax- for-linus pull.
Not included here, but still targeting this cycle, is support for
handling memory media errors (poison) consumed via userspace dax
mappings.
Summary:
- DAX broke a fundamental assumption of truncate of file mapped
pages. The truncate path assumed that it is safe to disconnect a
pinned page from a file and let the filesystem reclaim the physical
block. With DAX the page is equivalent to the filesystem block.
Introduce dax_layout_busy_page() to enable filesystems to wait for
pinned DAX pages to be released. Without this wait a filesystem
could allocate blocks under active device-DMA to a new file.
- DAX arranges for the block layer to be bypassed and uses
dax_direct_access() + copy_to_iter() to satisfy read(2) calls.
However, the memcpy_mcsafe() facility is available through the pmem
block driver. In order to safely handle media errors, via the DAX
block-layer bypass, introduce copy_to_iter_mcsafe().
- Fix cache management policy relative to the ACPI NFIT Platform
Capabilities Structure to properly elide cache flushes when they
are not necessary. The table indicates whether CPU caches are
power-fail protected. Clarify that a deep flush is always performed
on REQ_{FUA,PREFLUSH} requests"
* tag 'libnvdimm-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (21 commits)
dax: Use dax_write_cache* helpers
libnvdimm, pmem: Do not flush power-fail protected CPU caches
libnvdimm, pmem: Unconditionally deep flush on *sync
libnvdimm, pmem: Complete REQ_FLUSH => REQ_PREFLUSH
acpi, nfit: Remove ecc_unit_size
dax: dax_insert_mapping_entry always succeeds
libnvdimm, e820: Register all pmem resources
libnvdimm: Debug probe times
linvdimm, pmem: Preserve read-only setting for pmem devices
x86, nfit_test: Add unit test for memcpy_mcsafe()
pmem: Switch to copy_to_iter_mcsafe()
dax: Report bytes remaining in dax_iomap_actor()
dax: Introduce a ->copy_to_iter dax operation
uio, lib: Fix CONFIG_ARCH_HAS_UACCESS_MCSAFE compilation
xfs, dax: introduce xfs_break_dax_layouts()
xfs: prepare xfs_break_layouts() for another layout type
xfs: prepare xfs_break_layouts() to be called with XFS_MMAPLOCK_EXCL
mm, fs, dax: handle layout changes to pinned dax mappings
mm: fix __gup_device_huge vs unmap
mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS
...
Add a common Kconfig CONFIG_ARCH_HAS_UACCESS_MCSAFE that archs can
optionally select, and fixup the declaration of _copy_to_iter_mcsafe().
Fixes: 8780356ef6 ("x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe()")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add a new dma_map_ops implementation that uses dma-direct for the
address mapping of streaming mappings, and which requires arch-specific
implemenations of coherent allocate/free.
Architectures have to provide flushing helpers to ownership trasnfers
to the device and/or CPU, and can provide optional implementations of
the coherent mmap functionality, and the cache_flush routines for
non-coherent long term allocations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Alexey Brodkin <abrodkin@synopsys.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
ARCH_DMA_ADDR_T_64BIT is always true for 64-bit architectures now, so we
can skip the clause requiring it. 'n' is the default default, so no need
to explicitly state it.
Tested-by: Alexey Brodkin <abrodkin@synopsys.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This way we have one central definition of it, and user can select it as
needed. The new option is not user visible, which is the behavior
it had in most architectures, with a few notable exceptions:
- On x86_64 and mips/loongson3 it used to be user selectable, but
defaulted to y. It now is unconditional, which seems like the right
thing for 64-bit architectures without guaranteed availablity of
IOMMUs.
- on powerpc the symbol is user selectable and defaults to n, but
many boards select it. This change assumes no working setup
required a manual selection, but if that turned out to be wrong
we'll have to add another select statement or two for the respective
boards.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Define this symbol if the architecture either uses 64-bit pointers or the
PHYS_ADDR_T_64BIT is set. This covers 95% of the old arch magic. We only
need an additional select for Xen on ARM (why anyway?), and we now always
set ARCH_DMA_ADDR_T_64BIT on mips boards with 64-bit physical addressing
instead of only doing it when highmem is set.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: James Hogan <jhogan@kernel.org>
This way we have one central definition of it, and user can select it as
needed. Note that we now also always select it when CONFIG_DMA_API_DEBUG
is select, which fixes some incorrect checks in a few network drivers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
This way we have one central definition of it, and user can select it as
needed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>