Commit Graph

170 Commits

Author SHA1 Message Date
Linus Torvalds
8d4a0b5d08 Merge tag '5.15-rc-cifs-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull smbfs updates from Steve French:
 "cifs/smb3 updates:

   - DFS reconnect fix

   - begin creating common headers for server and client

   - rename the cifs_common directory to smbfs_common to be more
     consistent ie change use of the name cifs to smb (smb3 or smbfs is
     more accurate, as the very old cifs dialect has long been
     superseded by smb3 dialects).

  In the future we can rename the fs/cifs directory to fs/smbfs.

  This does not include the set of multichannel fixes nor the two
  deferred close fixes (they are still being reviewed and tested)"

* tag '5.15-rc-cifs-part2' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: properly invalidate cached root handle when closing it
  cifs: move SMB FSCTL definitions to common code
  cifs: rename cifs_common to smbfs_common
  cifs: update FSCTL definitions
2021-09-12 10:10:21 -07:00
Linus Torvalds
c0f7e49fc4 Merge tag 'block-5.15-2021-09-11' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - NVMe pull request from Christoph:
     - fix nvmet command set reporting for passthrough controllers (Adam Manzanares)
     - update a MAINTAINERS email address (Chaitanya Kulkarni)
     - set QUEUE_FLAG_NOWAIT for nvme-multipth (me)
     - handle errors from add_disk() (Luis Chamberlain)
     - update the keep alive interval when kato is modified (Tatsuya Sasaki)
     - fix a buffer overrun in nvmet_subsys_attr_serial (Hannes Reinecke)
     - do not reset transport on data digest errors in nvme-tcp (Daniel Wagner)
     - only call synchronize_srcu when clearing current path (Daniel Wagner)
     - revalidate paths during rescan (Hannes Reinecke)

 - Split out the fs/block_dev into block/fops.c and block/bdev.c, which
   has been long overdue. Do this now before -rc1, to avoid annoying
   conflicts due to this (Christoph)

 - blk-throtl use-after-free fix (Li)

 - Improve plug depth for multi-device plugs, greatly increasing md
   resync performance (Song)

 - blkdev_show() locking fix (Tetsuo)

 - n64cart error check fix (Yang)

* tag 'block-5.15-2021-09-11' of git://git.kernel.dk/linux-block:
  n64cart: fix return value check in n64cart_probe()
  blk-mq: allow 4x BLK_MAX_REQUEST_COUNT at blk_plug for multiple_queues
  block: move fs/block_dev.c to block/bdev.c
  block: split out operations on block special files
  blk-throttle: fix UAF by deleteing timer in blk_throtl_exit()
  block: genhd: don't call blkdev_show() with major_names_lock held
  nvme: update MAINTAINERS email address
  nvme: add error handling support for add_disk()
  nvme: only call synchronize_srcu when clearing current path
  nvme: update keep alive interval when kato is modified
  nvme-tcp: Do not reset transport on data digest errors
  nvmet: fixup buffer overrun in nvmet_subsys_attr_serial()
  nvmet: return bool from nvmet_passthru_ctrl and nvmet_is_passthru_req
  nvmet: looks at the passthrough controller when initializing CAP
  nvme: move nvme_multi_css into nvme.h
  nvme-multipath: revalidate paths during rescan
  nvme-multipath: set QUEUE_FLAG_NOWAIT
2021-09-11 10:19:51 -07:00
Steve French
23e91d8b7c cifs: rename cifs_common to smbfs_common
As we move to common code between client and server, we have
been asked to make the names less confusing, and refer less
to "cifs" and more to words which include "smb" instead to
e.g. "smbfs" for the client (we already have "ksmbd" for the
kernel server, and "smbd" for the user space Samba daemon).
So to be more consistent in the naming of common code between
client and server and reduce the risk of merge conflicts as
more common code is added - rename "cifs_common" to
"smbfs_common" (in future releases we also will rename
the fs/cifs directory to fs/smbfs)

Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-08 23:59:26 -05:00
Christoph Hellwig
0dca4462ed block: move fs/block_dev.c to block/bdev.c
Move it together with the rest of the block layer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210907141303.1371844-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-07 08:39:40 -06:00
Linus Torvalds
f7464060f7 Merge git://github.com/Paragon-Software-Group/linux-ntfs3
Merge NTFSv3 filesystem from Konstantin Komarov:
 "This patch adds NTFS Read-Write driver to fs/ntfs3.

  Having decades of expertise in commercial file systems development and
  huge test coverage, we at Paragon Software GmbH want to make our
  contribution to the Open Source Community by providing implementation
  of NTFS Read-Write driver for the Linux Kernel.

  This is fully functional NTFS Read-Write driver. Current version works
  with NTFS (including v3.1) and normal/compressed/sparse files and
  supports journal replaying.

  We plan to support this version after the codebase once merged, and
  add new features and fix bugs. For example, full journaling support
  over JBD will be added in later updates"

Link: https://lore.kernel.org/lkml/20210729134943.778917-1-almaz.alexandrovich@paragon-software.com/
Link: https://lore.kernel.org/lkml/aa4aa155-b9b2-9099-b7a2-349d8d9d8fbd@paragon-software.com/

* git://github.com/Paragon-Software-Group/linux-ntfs3: (35 commits)
  fs/ntfs3: Change how module init/info messages are displayed
  fs/ntfs3: Remove GPL boilerplates from decompress lib files
  fs/ntfs3: Remove unnecessary condition checking from ntfs_file_read_iter
  fs/ntfs3: Fix integer overflow in ni_fiemap with fiemap_prep()
  fs/ntfs3: Restyle comments to better align with kernel-doc
  fs/ntfs3: Rework file operations
  fs/ntfs3: Remove fat ioctl's from ntfs3 driver for now
  fs/ntfs3: Restyle comments to better align with kernel-doc
  fs/ntfs3: Fix error handling in indx_insert_into_root()
  fs/ntfs3: Potential NULL dereference in hdr_find_split()
  fs/ntfs3: Fix error code in indx_add_allocate()
  fs/ntfs3: fix an error code in ntfs_get_acl_ex()
  fs/ntfs3: add checks for allocation failure
  fs/ntfs3: Use kcalloc/kmalloc_array over kzalloc/kmalloc
  fs/ntfs3: Do not use driver own alloc wrappers
  fs/ntfs3: Use kernel ALIGN macros over driver specific
  fs/ntfs3: Restyle comment block in ni_parse_reparse()
  fs/ntfs3: Remove unused including <linux/version.h>
  fs/ntfs3: Fix fall-through warnings for Clang
  fs/ntfs3: Fix one none utf8 char in source file
  ...
2021-09-04 11:15:50 -07:00
Linus Torvalds
9c849ce86e Merge tag '5.15-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs client updates from Steve French:
 "Eleven cifs/smb3 client fixes:

   - mostly restructuring to allow disabling less secure algorithms
     (this will allow eventual removing rc4 and md4 from general use in
     the kernel)

   - four fixes, including two for stable

   - enable r/w support with fscache and cifs.ko

  I am working on a larger set of changes (the usual ... multichannel,
  auth and signing improvements), but wanted to get these in earlier to
  reduce chance of merge conflicts later in the merge window"

* tag '5.15-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Do not leak EDEADLK to dgetents64 for STATUS_USER_SESSION_DELETED
  cifs: add cifs_common directory to MAINTAINERS file
  cifs: cifs_md4 convert to SPDX identifier
  cifs: create a MD4 module and switch cifs.ko to use it
  cifs: fork arc4 and create a separate module for it for cifs and other users
  cifs: remove support for NTLM and weaker authentication algorithms
  cifs: enable fscache usage even for files opened as rw
  oid_registry: Add OIDs for missing Spnego auth mechanisms to Macs
  smb3: fix posix extensions mount option
  cifs: fix wrong release in sess_alloc_buffer() failed path
  CIFS: Fix a potencially linear read overflow
2021-08-31 09:22:37 -07:00
Linus Torvalds
e24c567b7e Merge tag '5.15-rc-first-ksmbd-merge' of git://git.samba.org/ksmbd
Pull initial ksmbd implementation from Steve French:
 "Initial merge of kernel smb3 file server, ksmbd.

  The SMB family of protocols is the most widely deployed network
  filesystem protocol, the default on Windows and Macs (and even on many
  phones and tablets), with clients and servers on all major operating
  systems, but lacked a kernel server for Linux. For many cases the
  current userspace server choices were suboptimal either due to memory
  footprint, performance or difficulty integrating well with advanced
  Linux features.

  ksmbd is a new kernel module which implements the server-side of the
  SMB3 protocol. The target is to provide optimized performance, GPLv2
  SMB server, and better lease handling (distributed caching). The
  bigger goal is to add new features more rapidly (e.g. RDMA aka
  "smbdirect", and recent encryption and signing improvements to the
  protocol) which are easier to develop on a smaller, more tightly
  optimized kernel server than for example in Samba.

  The Samba project is much broader in scope (tools, security services,
  LDAP, Active Directory Domain Controller, and a cross platform file
  server for a wider variety of purposes) but the user space file server
  portion of Samba has proved hard to optimize for some Linux workloads,
  including for smaller devices.

  This is not meant to replace Samba, but rather be an extension to
  allow better optimizing for Linux, and will continue to integrate well
  with Samba user space tools and libraries where appropriate. Working
  with the Samba team we have already made sure that the configuration
  files and xattrs are in a compatible format between the kernel and
  user space server.

  Various types of functional and regression tests are regularly run
  against it. One example is the automated 'buildbot' regression tests
  which use the Linux client to test against ksmbd, e.g.

     http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com/#/builders/8/builds/56

  but other test suites, including Samba's smbtorture functional test
  suite are also used regularly"

* tag '5.15-rc-first-ksmbd-merge' of git://git.samba.org/ksmbd: (219 commits)
  ksmbd: fix __write_overflow warning in ndr_read_string
  MAINTAINERS: ksmbd: add cifs_common directory to ksmbd entry
  MAINTAINERS: ksmbd: update my email address
  ksmbd: fix permission check issue on chown and chmod
  ksmbd: don't set FILE DELETE and FILE_DELETE_CHILD in access mask by default
  MAINTAINERS: add git adddress of ksmbd
  ksmbd: update SMB3 multi-channel support in ksmbd.rst
  ksmbd: smbd: fix kernel oops during server shutdown
  ksmbd: remove select FS_POSIX_ACL in Kconfig
  ksmbd: use proper errno instead of -1 in smb2_get_ksmbd_tcon()
  ksmbd: update the comment for smb2_get_ksmbd_tcon()
  ksmbd: change int data type to boolean
  ksmbd: Fix multi-protocol negotiation
  ksmbd: fix an oops in error handling in smb2_open()
  ksmbd: add ipv6_addr_v4mapped check to know if connection from client is ipv4
  ksmbd: fix missing error code in smb2_lock
  ksmbd: use channel signingkey for binding SMB2 session setup
  ksmbd: don't set RSS capable in FSCTL_QUERY_NETWORK_INTERFACE_INFO
  ksmbd: Return STATUS_OBJECT_PATH_NOT_FOUND if smb2_creat() returns ENOENT
  ksmbd: fix -Wstringop-truncation warnings
  ...
2021-08-31 09:11:55 -07:00
Ronnie Sahlberg
71c0286324 cifs: fork arc4 and create a separate module for it for cifs and other users
We can not drop ARC4 and basically destroy CIFS connectivity for
almost all CIFS users so create a new forked ARC4 module that CIFS and other
subsystems that have a hard dependency on ARC4 can use.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-25 15:47:57 -05:00
Konstantin Komarov
6e5be40d32 fs/ntfs3: Add NTFS3 in fs/Kconfig and fs/Makefile
This adds NTFS3 in fs/Kconfig and fs/Makefile

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-08-13 07:59:09 -07:00
David Hildenbrand
6208721f13 binfmt: remove support for em86 (alpha only)
We have a fairly specific alpha binary loader in Linux: running x86
(i386, i486) binaries via the em86 [1] emulator. As noted in the Kconfig
option, the same behavior can be achieved via binfmt_misc, for example,
more nowadays used for running qemu-user.

An example on how to get binfmt_misc running with em86 can be found in
Documentation/admin-guide/binfmt-misc.rst

The defconfig does not have CONFIG_BINFMT_EM86=y set. And doing a
	make defconfig && make olddefconfig
results in
	# CONFIG_BINFMT_EM86 is not set

... as we don't seem to have any supported Linux distirbution for alpha
anymore, there isn't really any "default" user of that feature anymore.

Searching for "CONFIG_BINFMT_EM86=y" reveals mostly discussions from
around 20 years ago, like [2] describing how to get netscape via em86
running via em86, or [3] discussing that running wine or installing
Win 3.11 through em86 would be a nice feature.

The latest binaries available for em86 are from 2000, version 2.2.1 [4] --
which translates to "unsupported"; further, em86 doesn't even work with
glibc-2.x but only with glibc-2.0 [4, 5]. These are clear signs that
there might not be too many em86 users out there, especially users
relying on modern Linux kernels.

Even though the code footprint is relatively small, let's just get rid
of this blast from the past that's effectively unused.

[1] http://ftp.dreamtime.org/pub/linux/Linux-Alpha/em86/v0.4/docs/em86.html
[2] https://static.lwn.net/1998/1119/a/alpha-netscape.html
[3] https://groups.google.com/g/linux.debian.alpha/c/AkGuQHeCe0Y
[4] http://zeniv.linux.org.uk/pub/linux/alpha/em86/v2.2-1/relnotes.2.2.1.html
[5] https://forum.teamspeak.com/archive/index.php/t-1477.html

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2021-07-25 22:33:03 -07:00
Namjae Jeon
1a93084b9a ksmbd: move fs/cifsd to fs/ksmbd
Move fs/cifsd to fs/ksmbd and rename the remaining cifsd name to ksmbd.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-06-28 16:28:31 +09:00
Namjae Jeon
a848c4f15a cifsd: add Kconfig and Makefile
This adds the Kconfig and Makefile for cifsd.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-05-10 19:15:16 -05:00
David Howells
3d3c950467 netfs: Provide readahead and readpage netfs helpers
Add a pair of helper functions:

 (*) netfs_readahead()
 (*) netfs_readpage()

to do the work of handling a readahead or a readpage, where the page(s)
that form part of the request may be split between the local cache, the
server or just require clearing, and may be single pages and transparent
huge pages.  This is all handled within the helper.

Note that while both will read from the cache if there is data present,
only netfs_readahead() will expand the request beyond what it was asked to
do, and only netfs_readahead() will write back to the cache.

netfs_readpage(), on the other hand, is synchronous and only fetches the
page (which might be a THP) it is asked for.

The netfs gives the helper parameters from the VM, the cache cookie it
wants to use (or NULL) and a table of operations (only one of which is
mandatory):

 (*) expand_readahead() [optional]

     Called to allow the netfs to request an expansion of a readahead
     request to meet its own alignment requirements.  This is done by
     changing rreq->start and rreq->len.

 (*) clamp_length() [optional]

     Called to allow the netfs to cut down a subrequest to meet its own
     boundary requirements.  If it does this, the helper will generate
     additional subrequests until the full request is satisfied.

 (*) is_still_valid() [optional]

     Called to find out if the data just read from the cache has been
     invalidated and must be reread from the server.

 (*) issue_op() [required]

     Called to ask the netfs to issue a read to the server.  The subrequest
     describes the read.  The read request holds information about the file
     being accessed.

     The netfs can cache information in rreq->netfs_priv.

     Upon completion, the netfs should set the error, transferred and can
     also set FSCACHE_SREQ_CLEAR_TAIL and then call
     fscache_subreq_terminated().

 (*) done() [optional]

     Called after the pages have been unlocked.  The read request is still
     pinning the file and mapping and may still be pinning pages with
     PG_fscache.  rreq->error indicates any error that has been
     accumulated.

 (*) cleanup() [optional]

     Called when the helper is disposing of a finished read request.  This
     allows the netfs to clear rreq->netfs_priv.

Netfs support is enabled with CONFIG_NETFS_SUPPORT=y.  It will be built
even if CONFIG_FSCACHE=n and in this case much of it should be optimised
away, allowing the filesystem to use it even when caching is disabled.

Changes:
v5:
 - Comment why netfs_readahead() is putting pages[2].
 - Use page_file_mapping() rather than page->mapping[2].
 - Use page_index() rather than page->index[2].
 - Use set_page_fscache()[3] rather then SetPageFsCache() as this takes an
   appropriate ref too[4].

v4:
 - Folded in a kerneldoc comment fix.
 - Folded in a fix for the error handling in the case that ENOMEM occurs.
 - Added flag to netfs_subreq_terminated() to indicate that the caller may
   have been running async and stuff that might sleep needs punting to a
   workqueue (can't use in_softirq()[1]).

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-tested-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Dave Wysochanski <dwysocha@redhat.com>
Tested-By: Marc Dionne <marc.dionne@auristor.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-mm@kvack.org
cc: linux-cachefs@redhat.com
cc: linux-afs@lists.infradead.org
cc: linux-nfs@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20210216084230.GA23669@lst.de/ [1]
Link: https://lore.kernel.org/r/20210321014202.GF3420@casper.infradead.org/ [2]
Link: https://lore.kernel.org/r/2499407.1616505440@warthog.procyon.org.uk/ [3]
Link: https://lore.kernel.org/r/CAHk-=wh+2gbF7XEjYc=HV9w_2uVzVf7vs60BPz0gFA=+pUm3ww@mail.gmail.com/ [4]
Link: https://lore.kernel.org/r/160588497406.3465195.18003475695899726222.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/161118136849.1232039.8923686136144228724.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/161161032290.2537118.13400578415247339173.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/161340394873.1303470.6237319335883242536.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/161539537375.286939.16642940088716990995.stgit@warthog.procyon.org.uk/ # v4
Link: https://lore.kernel.org/r/161653795430.2770958.4947584573720000554.stgit@warthog.procyon.org.uk/ # v5
Link: https://lore.kernel.org/r/161789076581.6155.6745849361504760209.stgit@warthog.procyon.org.uk/ # v6
2021-04-23 10:14:32 +01:00
Viresh Kumar
be65de6b03 fs: Remove dcookies support
The dcookies stuff was only used by the kernel's old oprofile code. Now
that oprofile's support is removed from the kernel, there is no need for
dcookies as well. Remove it.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Robert Richter <rric@kernel.org>
Acked-by: William Cohen <wcohen@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2021-01-29 10:06:46 +05:30
Linus Torvalds
c4728cfbed Merge tag 'vfs-5.10-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull clone/dedupe/remap code refactoring from Darrick Wong:
 "Move the generic file range remap (aka reflink and dedupe) functions
  out of mm/filemap.c and fs/read_write.c and into fs/remap_range.c to
  reduce clutter in the first two files"

* tag 'vfs-5.10-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  vfs: move the generic write and copy checks out of mm
  vfs: move the remap range helpers to remap_range.c
  vfs: move generic_remap_checks out of mm
2020-10-23 11:33:41 -07:00
Linus Torvalds
726eb70e0d Merge tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
 "Here is the big set of char, misc, and other assorted driver subsystem
  patches for 5.10-rc1.

  There's a lot of different things in here, all over the drivers/
  directory. Some summaries:

   - soundwire driver updates

   - habanalabs driver updates

   - extcon driver updates

   - nitro_enclaves new driver

   - fsl-mc driver and core updates

   - mhi core and bus updates

   - nvmem driver updates

   - eeprom driver updates

   - binder driver updates and fixes

   - vbox minor bugfixes

   - fsi driver updates

   - w1 driver updates

   - coresight driver updates

   - interconnect driver updates

   - misc driver updates

   - other minor driver updates

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (396 commits)
  binder: fix UAF when releasing todo list
  docs: w1: w1_therm: Fix broken xref, mistakes, clarify text
  misc: Kconfig: fix a HISI_HIKEY_USB dependency
  LSM: Fix type of id parameter in kernel_post_load_data prototype
  misc: Kconfig: add a new dependency for HISI_HIKEY_USB
  firmware_loader: fix a kernel-doc markup
  w1: w1_therm: make w1_poll_completion static
  binder: simplify the return expression of binder_mmap
  test_firmware: Test partial read support
  firmware: Add request_partial_firmware_into_buf()
  firmware: Store opt_flags in fw_priv
  fs/kernel_file_read: Add "offset" arg for partial reads
  IMA: Add support for file reads without contents
  LSM: Add "contents" flag to kernel_read_file hook
  module: Call security_kernel_post_load_data()
  firmware_loader: Use security_post_load_data()
  LSM: Introduce kernel_post_load_data() hook
  fs/kernel_read_file: Add file_size output argument
  fs/kernel_read_file: Switch buffer size arg to size_t
  fs/kernel_read_file: Remove redundant size argument
  ...
2020-10-15 10:01:51 -07:00
Darrick J. Wong
02e83f46eb vfs: move generic_remap_checks out of mm
I would like to move all the generic helpers for the vfs remap range
functionality (aka clonerange and dedupe) into a separate file so that
they won't be scattered across the vfs and the mm subsystems.  The
eventual goal is to be able to deselect remap_range.c if none of the
filesystems need that code, but the tricky part here is picking a
stable(ish) part of the merge window to rearrange code.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-10-14 16:47:08 -07:00
Kees Cook
5287b07f6d fs/kernel_read_file: Split into separate source file
These routines are used in places outside of exec(2), so in preparation
for refactoring them, move them into a separate source file,
fs/kernel_read_file.c.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-5-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:34:18 +02:00
Christoph Hellwig
028abd9222 fs: remove compat_sys_mount
compat_sys_mount is identical to the regular sys_mount now, so remove it
and use the native version everywhere.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-22 23:45:57 -04:00
Christoph Hellwig
c60166f042 init: add an init_mount helper
Like do_mount, but takes a kernel pointer for the destination path.
Switch over the mounts in the init code and devtmpfs to it, which
just happen to work due to the implicit set_fs(KERNEL_DS) during early
init right now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-31 08:17:51 +02:00
Namjae Jeon
b9d1e2e626 exfat: add Kconfig and Makefile
This adds the Kconfig and Makefile for exfat.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-03-05 21:00:40 -05:00
Linus Torvalds
380a129eb2 Merge tag 'zonefs-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull new zonefs file system from Damien Le Moal:
 "Zonefs is a very simple file system exposing each zone of a zoned
  block device as a file.

  Unlike a regular file system with native zoned block device support
  (e.g. f2fs or the on-going btrfs effort), zonefs does not hide the
  sequential write constraint of zoned block devices to the user. As a
  result, zonefs is not a POSIX compliant file system. Its goal is to
  simplify the implementation of zoned block devices support in
  applications by replacing raw block device file accesses with a richer
  file based API, avoiding relying on direct block device file ioctls
  which may be more obscure to developers.

  One example of this approach is the implementation of LSM
  (log-structured merge) tree structures (such as used in RocksDB and
  LevelDB) on zoned block devices by allowing SSTables to be stored in a
  zone file similarly to a regular file system rather than as a range of
  sectors of a zoned device. The introduction of the higher level
  construct "one file is one zone" can help reducing the amount of
  changes needed in the application while at the same time allowing the
  use of zoned block devices with various programming languages other
  than C.

  Zonefs IO management implementation uses the new iomap generic code.
  Zonefs has been successfully tested using a functional test suite
  (available with zonefs userland format tool on github) and a prototype
  implementation of LevelDB on top of zonefs"

* tag 'zonefs-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
  zonefs: Add documentation
  fs: New zonefs file system
2020-02-09 15:51:46 -08:00
Linus Torvalds
5586c3c1e0 Merge branch 'work.vboxsf' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vboxfs from Al Viro:
 "This is the VirtualBox guest shared folder support by Hans de Goede,
  with fixups for fs_parse folded in to avoid bisection hazards from
  those API changes..."

* 'work.vboxsf' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: Add VirtualBox guest shared folder (vboxsf) support
2020-02-09 12:41:00 -08:00
Hans de Goede
0fd1695766 fs: Add VirtualBox guest shared folder (vboxsf) support
VirtualBox hosts can share folders with guests, this commit adds a
VFS driver implementing the Linux-guest side of this, allowing folders
exported by the host to be mounted under Linux.

This driver depends on the guest <-> host IPC functions exported by
the vboxguest driver.

Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-08 17:34:58 -05:00
Damien Le Moal
8dcc1a9d90 fs: New zonefs file system
zonefs is a very simple file system exposing each zone of a zoned block
device as a file. Unlike a regular file system with zoned block device
support (e.g. f2fs), zonefs does not hide the sequential write
constraint of zoned block devices to the user. Files representing
sequential write zones of the device must be written sequentially
starting from the end of the file (append only writes).

As such, zonefs is in essence closer to a raw block device access
interface than to a full featured POSIX file system. The goal of zonefs
is to simplify the implementation of zoned block device support in
applications by replacing raw block device file accesses with a richer
file API, avoiding relying on direct block device file ioctls which may
be more obscure to developers. One example of this approach is the
implementation of LSM (log-structured merge) tree structures (such as
used in RocksDB and LevelDB) on zoned block devices by allowing SSTables
to be stored in a zone file similarly to a regular file system rather
than as a range of sectors of a zoned device. The introduction of the
higher level construct "one file is one zone" can help reducing the
amount of changes needed in the application as well as introducing
support for different application programming languages.

Zonefs on-disk metadata is reduced to an immutable super block to
persistently store a magic number and optional feature flags and
values. On mount, zonefs uses blkdev_report_zones() to obtain the device
zone configuration and populates the mount point with a static file tree
solely based on this information. E.g. file sizes come from the device
zone type and write pointer offset managed by the device itself.

The zone files created on mount have the following characteristics.
1) Files representing zones of the same type are grouped together
   under a common sub-directory:
     * For conventional zones, the sub-directory "cnv" is used.
     * For sequential write zones, the sub-directory "seq" is used.
  These two directories are the only directories that exist in zonefs.
  Users cannot create other directories and cannot rename nor delete
  the "cnv" and "seq" sub-directories.
2) The name of zone files is the number of the file within the zone
   type sub-directory, in order of increasing zone start sector.
3) The size of conventional zone files is fixed to the device zone size.
   Conventional zone files cannot be truncated.
4) The size of sequential zone files represent the file's zone write
   pointer position relative to the zone start sector. Truncating these
   files is allowed only down to 0, in which case, the zone is reset to
   rewind the zone write pointer position to the start of the zone, or
   up to the zone size, in which case the file's zone is transitioned
   to the FULL state (finish zone operation).
5) All read and write operations to files are not allowed beyond the
   file zone size. Any access exceeding the zone size is failed with
   the -EFBIG error.
6) Creating, deleting, renaming or modifying any attribute of files and
   sub-directories is not allowed.
7) There are no restrictions on the type of read and write operations
   that can be issued to conventional zone files. Buffered, direct and
   mmap read & write operations are accepted. For sequential zone files,
   there are no restrictions on read operations, but all write
   operations must be direct IO append writes. mmap write of sequential
   files is not allowed.

Several optional features of zonefs can be enabled at format time.
* Conventional zone aggregation: ranges of contiguous conventional
  zones can be aggregated into a single larger file instead of the
  default one file per zone.
* File ownership: The owner UID and GID of zone files is by default 0
  (root) but can be changed to any valid UID/GID.
* File access permissions: the default 640 access permissions can be
  changed.

The mkzonefs tool is used to format zoned block devices for use with
zonefs. This tool is available on Github at:

git@github.com:damien-lemoal/zonefs-tools.git.

zonefs-tools also includes a test suite which can be run against any
zoned block device, including null_blk block device created with zoned
mode.

Example: the following formats a 15TB host-managed SMR HDD with 256 MB
zones with the conventional zones aggregation feature enabled.

$ sudo mkzonefs -o aggr_cnv /dev/sdX
$ sudo mount -t zonefs /dev/sdX /mnt
$ ls -l /mnt/
total 0
dr-xr-xr-x 2 root root     1 Nov 25 13:23 cnv
dr-xr-xr-x 2 root root 55356 Nov 25 13:23 seq

The size of the zone files sub-directories indicate the number of files
existing for each type of zones. In this example, there is only one
conventional zone file (all conventional zones are aggregated under a
single file).

$ ls -l /mnt/cnv
total 137101312
-rw-r----- 1 root root 140391743488 Nov 25 13:23 0

This aggregated conventional zone file can be used as a regular file.

$ sudo mkfs.ext4 /mnt/cnv/0
$ sudo mount -o loop /mnt/cnv/0 /data

The "seq" sub-directory grouping files for sequential write zones has
in this example 55356 zones.

$ ls -lv /mnt/seq
total 14511243264
-rw-r----- 1 root root 0 Nov 25 13:23 0
-rw-r----- 1 root root 0 Nov 25 13:23 1
-rw-r----- 1 root root 0 Nov 25 13:23 2
...
-rw-r----- 1 root root 0 Nov 25 13:23 55354
-rw-r----- 1 root root 0 Nov 25 13:23 55355

For sequential write zone files, the file size changes as data is
appended at the end of the file, similarly to any regular file system.

$ dd if=/dev/zero of=/mnt/seq/0 bs=4K count=1 conv=notrunc oflag=direct
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000452219 s, 9.1 MB/s

$ ls -l /mnt/seq/0
-rw-r----- 1 root root 4096 Nov 25 13:23 /mnt/seq/0

The written file can be truncated to the zone size, preventing any
further write operation.

$ truncate -s 268435456 /mnt/seq/0
$ ls -l /mnt/seq/0
-rw-r----- 1 root root 268435456 Nov 25 13:49 /mnt/seq/0

Truncation to 0 size allows freeing the file zone storage space and
restart append-writes to the file.

$ truncate -s 0 /mnt/seq/0
$ ls -l /mnt/seq/0
-rw-r----- 1 root root 0 Nov 25 13:49 /mnt/seq/0

Since files are statically mapped to zones on the disk, the number of
blocks of a file as reported by stat() and fstat() indicates the size
of the file zone.

$ stat /mnt/seq/0
  File: /mnt/seq/0
  Size: 0       Blocks: 524288     IO Block: 4096   regular empty file
Device: 870h/2160d      Inode: 50431       Links: 1
Access: (0640/-rw-r-----)  Uid: (    0/    root)   Gid: (    0/  root)
Access: 2019-11-25 13:23:57.048971997 +0900
Modify: 2019-11-25 13:52:25.553805765 +0900
Change: 2019-11-25 13:52:25.553805765 +0900
 Birth: -

The number of blocks of the file ("Blocks") in units of 512B blocks
gives the maximum file size of 524288 * 512 B = 256 MB, corresponding
to the device zone size in this example. Of note is that the "IO block"
field always indicates the minimum IO size for writes and corresponds
to the device physical sector size.

This code contains contributions from:
* Johannes Thumshirn <jthumshirn@suse.de>,
* Darrick J. Wong <darrick.wong@oracle.com>,
* Christoph Hellwig <hch@lst.de>,
* Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> and
* Ting Yao <tingyao@hust.edu.cn>.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-02-07 14:39:38 +09:00