Commit Graph

60 Commits

Author SHA1 Message Date
YueHaibing 27fad74a5a iov_iter: Fix build error without CONFIG_CRYPTO
If CONFIG_CRYPTO is not set or set to m,
gcc building warn this:

lib/iov_iter.o: In function `hash_and_copy_to_iter':
iov_iter.c:(.text+0x9129): undefined reference to `crypto_stats_get'
iov_iter.c:(.text+0x9152): undefined reference to `crypto_stats_ahash_update'

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: d05f443554 ("iov_iter: introduce hash_and_copy_to_iter helper")
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-03 22:37:41 -04:00
Eric Dumazet 6daef95b8c iov_iter: optimize page_copy_sane()
Avoid cache line miss dereferencing struct page if we can.

page_copy_sane() mostly deals with order-0 pages.

Extra cache line miss is visible on TCP recvmsg() calls dealing
with GRO packets (typically 45 page frags are attached to one skb).

Bringing the 45 struct pages into cpu cache while copying the data
is not free, since the freeing of the skb (and associated
page frags put_page()) can happen after cache lines have been evicted.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-26 14:05:20 -05:00
Linus Torvalds 9b286efeb5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull trivial vfs updates from Al Viro:
 "A few cleanups + Neil's namespace_unlock() optimization"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  exec: make prepare_bprm_creds static
  genheaders: %-<width>s had been there since v6; %-*s - since v7
  VFS: use synchronize_rcu_expedited() in namespace_unlock()
  iov_iter: reduce code duplication
2019-01-05 13:18:59 -08:00
Linus Torvalds 96d4f267e4 Remove 'type' argument from access_ok() function
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.

It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access.  But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.

A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model.  And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.

This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.

There were a couple of notable cases:

 - csky still had the old "verify_area()" name as an alias.

 - the iter_iov code had magical hardcoded knowledge of the actual
   values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
   really used it)

 - microblaze used the type argument for a debug printout

but other than those oddities this should be a total no-op patch.

I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something.  Any missed conversion should be trivially fixable, though.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-03 18:57:57 -08:00
Sagi Grimberg d05f443554 iov_iter: introduce hash_and_copy_to_iter helper
Allow consumers that want to use iov iterator helpers and also update
a predefined hash calculation online when copying data. This is useful
when copying incoming network buffers to a local iterator and calculate
a digest on the incoming stream. nvme-tcp host driver that will be
introduced in following patches is the first consumer via
skb_copy_and_hash_datagram_iter.

Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sagi Grimberg <sagi@lightbitslabs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-13 09:58:54 +01:00
Sagi Grimberg cb002d074d iov_iter: pass void csum pointer to csum_and_copy_to_iter
The single caller to csum_and_copy_to_iter is skb_copy_and_csum_datagram
and we are trying to unite its logic with skb_copy_datagram_iter by passing
a callback to the copy function that we want to apply. Thus, we need
to make the checksum pointer private to the function.

Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sagi Grimberg <sagi@lightbitslabs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-13 09:58:53 +01:00
Al Viro f91528955d iov_iter: reduce code duplication
The same combination of csum_partial_copy_nocheck() with csum_add_block()
is used in a bunch of places.  Add a helper doing just that and use it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-11-27 22:35:08 -05:00
Al Viro 78e1f38617 iov_iter: teach csum_and_copy_to_iter() to handle pipe-backed ones
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-11-25 16:24:49 -05:00
David Howells 9ea9ce0427 iov_iter: Add I/O discard iterator
Add a new iterator, ITER_DISCARD, that can only be used in READ mode and
just discards any data copied to it.

This is useful in a network filesystem for discarding any unwanted data
sent by a server.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-24 00:41:07 +01:00
David Howells aa563d7bca iov_iter: Separate type from direction and use accessor functions
In the iov_iter struct, separate the iterator type from the iterator
direction and use accessor functions to access them in most places.

Convert a bunch of places to use switch-statements to access them rather
then chains of bitwise-AND statements.  This makes it easier to add further
iterator types.  Also, this can be more efficient as to implement a switch
of small contiguous integers, the compiler can use ~50% fewer compare
instructions than it has to use bitwise-and instructions.

Further, cease passing the iterator type into the iterator setup function.
The iterator function can set that itself.  Only the direction is required.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-24 00:41:07 +01:00
David Howells 00e2370744 iov_iter: Use accessor function
Use accessor functions to access an iterator's type and direction.  This
allows for the possibility of using some other method of determining the
type of iterator than if-chains with bitwise-AND conditions.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-24 00:40:44 +01:00
Dan Williams ca146f6f09 lib/iov_iter: Fix pipe handling in _copy_to_iter_mcsafe()
By mistake the ITER_PIPE early-exit / warning from copy_from_iter() was
cargo-culted in _copy_to_iter_mcsafe() rather than a machine-check-safe
version of copy_to_iter_pipe().

Implement copy_pipe_to_iter_mcsafe() being careful to return the
indication of short copies due to a CPU exception.

Without this regression-fix all splice reads to dax-mode files fail.

Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Fixes: 8780356ef6 ("x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe()")
Link: http://lkml.kernel.org/r/153108277278.37979.3327916996902264102.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-16 00:05:05 +02:00
Dan Williams abd08d7d24 lib/iov_iter: Document _copy_to_iter_flushcache()
Add some theory of operation documentation to _copy_to_iter_flushcache().

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/153108276767.37979.9462477994086841699.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-16 00:05:05 +02:00
Dan Williams bf3eeb9b5f lib/iov_iter: Document _copy_to_iter_mcsafe()
Add some theory of operation documentation to _copy_to_iter_mcsafe().

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/153108276256.37979.1689794213845539316.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-16 00:05:05 +02:00
Linus Torvalds d09a8e6f2c Merge branch 'x86-dax-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 dax updates from Ingo Molnar:
 "This contains x86 memcpy_mcsafe() fault handling improvements the
  nvdimm tree would like to make more use of"

* 'x86-dax-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe()
  x86/asm/memcpy_mcsafe: Add write-protection-fault handling
  x86/asm/memcpy_mcsafe: Return bytes remaining
  x86/asm/memcpy_mcsafe: Add labels for __memcpy_mcsafe() write fault handling
  x86/asm/memcpy_mcsafe: Remove loop unrolling
2018-06-04 19:23:13 -07:00
Dan Williams 8780356ef6 x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe()
Use the updated memcpy_mcsafe() implementation to define
copy_user_mcsafe() and copy_to_iter_mcsafe(). The most significant
difference from typical copy_to_iter() is that the ITER_KVEC and
ITER_BVEC iterator types can fail to complete a full transfer.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: hch@lst.de
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-nvdimm@lists.01.org
Link: http://lkml.kernel.org/r/152539239150.31796.9189779163576449784.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-15 08:32:42 +02:00
Ilya Dryomov d7760d638b iov_iter: fix memory leak in pipe_get_pages_alloc()
Make n signed to avoid leaking the pages array if __pipe_get_pages()
fails to allocate any pages.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-02 15:43:19 -04:00
Ilya Dryomov e76b631239 iov_iter: fix return type of __pipe_get_pages()
It returns -EFAULT and happens to be a helper for pipe_get_pages()
whose return type is ssize_t.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-02 15:43:15 -04:00
Al Viro 09cf698a59 new primitive: iov_iter_for_each_range()
For kvec and bvec: feeds segments to given callback as long as it
returns 0.  For iovec and pipe: fails.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-10-11 22:36:54 -04:00
Petar Penkov a90bcb86ae iov_iter: fix page_copy_sane for compound pages
Issue is that if the data crosses a page boundary inside a compound
page, this check will incorrectly trigger a WARN_ON.

To fix this, compute the order using the head of the compound page and
adjust the offset to be relative to that head.

Fixes: 72e809ed81 ("iov_iter: sanity checks for copy to/from page
primitives")

Signed-off-by: Petar Penkov <ppenkov@google.com>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: Eric Dumazet <edumazet@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-09-20 23:27:48 -04:00
Linus Torvalds 6a37e94009 Merge branch 'uaccess-work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull iov_iter hardening from Al Viro:
 "This is the iov_iter/uaccess/hardening pile.

  For one thing, it trims the inline part of copy_to_user/copy_from_user
  to the minimum that *does* need to be inlined - object size checks,
  basically. For another, it sanitizes the checks for iov_iter
  primitives. There are 4 groups of checks: access_ok(), might_fault(),
  object size and KASAN.

   - access_ok() had been verified by whoever had set the iov_iter up.
     However, that has happened in a function far away, so proving that
     there's no path to actual copying bypassing those checks is hard
     and proving that iov_iter has not been buggered in the meanwhile is
     also not pleasant. So we want those redone in actual
     copyin/copyout.

   - might_fault() is better off consolidated - we know whether it needs
     to be checked as soon as we enter iov_iter primitive and observe
     the iov_iter flavour. No need to wait until the copyin/copyout. The
     call chains are short enough to make sure we won't miss anything -
     in fact, it's more robust that way, since there are cases where we
     do e.g. forced fault-in before getting to copyin/copyout. It's not
     quite what we need to check (in particular, combination of
     iovec-backed and set_fs(KERNEL_DS) is almost certainly a bug, not a
     cause to skip checks), but that's for later series. For now let's
     keep might_fault().

   - KASAN checks belong in copyin/copyout - at the same level where
     other iov_iter flavours would've hit them in memcpy().

   - object size checks should apply to *all* iov_iter flavours, not
     just iovec-backed ones.

  There are two groups of primitives - one gets the kernel object
  described as pointer + size (copy_to_iter(), etc.) while another gets
  it as page + offset + size (copy_page_to_iter(), etc.)

  For the first group the checks are best done where we actually have a
  chance to find the object size. In other words, those belong in inline
  wrappers in uio.h, before calling into iov_iter.c. Same kind as we
  have for inlined part of copy_to_user().

  For the second group there is no object to look at - offset in page is
  just a number, it bears no type information. So we do them in the
  common helper called by iov_iter.c primitives of that kind. All it
  currently does is checking that we are not trying to access outside of
  the compound page; eventually we might want to add some sanity checks
  on the page involved.

  So the things we need in copyin/copyout part of iov_iter.c do not
  quite match anything in uaccess.h (we want no zeroing, we *do* want
  access_ok() and KASAN and we want no might_fault() or object size
  checks done on that level). OTOH, these needs are simple enough to
  provide a couple of helpers (static in iov_iter.c) doing just what we
  need..."

* 'uaccess-work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  iov_iter: saner checks on copyin/copyout
  iov_iter: sanity checks for copy to/from page primitives
  iov_iter/hardening: move object size checks to inlined part
  copy_{to,from}_user(): consolidate object size checks
  copy_{from,to}_user(): move kasan checks and might_fault() out-of-line
2017-07-07 20:39:20 -07:00
Al Viro 09fc68dc66 iov_iter: saner checks on copyin/copyout
* might_fault() is better checked in caller (and e.g. fault-in + kmap_atomic
codepath also needs might_fault() coverage)
* we have already done object size checks
* we have *NOT* done access_ok() recently enough; we rely upon the
iovec array having passed sanity checks back when it had been created
and not nothing having buggered it since.  However, that's very much
non-local, so we'd better recheck that.

So the thing we want does not match anything in uaccess - we need
access_ok + kasan checks + raw copy without any zeroing.  Just define
such helpers and use them here.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-07-07 05:18:09 -04:00
Al Viro 72e809ed81 iov_iter: sanity checks for copy to/from page primitives
for now - just that we don't attempt to cross out of compound page

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29 22:21:23 -04:00
Al Viro aa28de275a iov_iter/hardening: move object size checks to inlined part
There we actually have useful information about object sizes.
Note: this patch has them done for all iov_iter flavours.
Right now we do them twice in iovec case, but that'll change
very shortly.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29 22:21:22 -04:00
Dan Williams 0aed55af88 x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations
The pmem driver has a need to transfer data with a persistent memory
destination and be able to rely on the fact that the destination writes are not
cached. It is sufficient for the writes to be flushed to a cpu-store-buffer
(non-temporal / "movnt" in x86 terms), as we expect userspace to call fsync()
to ensure data-writes have reached a power-fail-safe zone in the platform. The
fsync() triggers a REQ_FUA or REQ_FLUSH to the pmem driver which will turn
around and fence previous writes with an "sfence".

Implement a __copy_from_user_inatomic_flushcache, memcpy_page_flushcache, and
memcpy_flushcache, that guarantee that the destination buffer is not dirty in
the cpu cache on completion. The new copy_from_iter_flushcache and sub-routines
will be used to replace the "pmem api" (include/linux/pmem.h +
arch/x86/include/asm/pmem.h). The availability of copy_from_iter_flushcache()
and memcpy_flushcache() are gated by the CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
config symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
otherwise.

This is meant to satisfy the concern from Linus that if a driver wants to do
something beyond the normal nocache semantics it should be something private to
that driver [1], and Al's concern that anything uaccess related belongs with
the rest of the uaccess code [2].

The first consumer of this interface is a new 'copy_from_iter' dax operation so
that pmem can inject cache maintenance operations without imposing this
overhead on other dax-capable drivers.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
[2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html

Cc: <x86@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-09 09:09:56 -07:00