linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
Linus Torvalds	050453295f	Merge branch 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Making sure that something like a referral point won't end up as pwd or root. The main part is the last commit (fixing mntns_install()); that one fixes a hard-to-hit race. The fchdir() commit is making fchdir(2) a bit more robust - it should be impossible to get opened files (even O_PATH ones) for referral points in the first place, so the existing checks are OK, but checking the same thing as in chdir(2) is just as cheap. The path_init() commit removes a redundant check that shouldn't have been there in the first place" * 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: make sure that mntns_install() doesn't end up with referral for root path_init(): don't bother with checking MAY_EXEC for LOOKUP_ROOT make sure that fchdir() won't accept referral points, etc.	2017-05-12 11:39:59 -07:00
Linus Torvalds	b948abf53a	Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs update from Miklos Szeredi: "The biggest part of this is making st_dev/st_ino on the overlay behave like a normal filesystem (i.e. st_ino doesn't change on copy up, st_dev is the same for all files and directories). Currently this only works if all layers are on the same filesystem, but future work will move the general case towards more sane behavior. There are also miscellaneous fixes, including fixes to handling append-only files. There's a small change in the VFS, but that only has an effect on overlayfs, since otherwise file->f_path.dentry->inode and file_inode(file) are always the same" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: update documentation w.r.t. constant inode numbers ovl: persistent inode numbers for upper hardlinks ovl: merge getattr for dir and nondir ovl: constant st_ino/st_dev across copy up ovl: persistent inode number for directories ovl: set the ORIGIN type flag ovl: lookup non-dir copy-up-origin by file handle ovl: use an auxiliary var for overlay root entry ovl: store file handle of lower inode on copy up ovl: check if all layers are on the same fs ovl: do not set overlay.opaque on non-dir create ovl: check IS_APPEND() on real upper inode vfs: ftruncate check IS_APPEND() on real upper inode ovl: Use designated initializers ovl: lockdep annotate of nested stacked overlayfs inode lock	2017-05-10 09:03:48 -07:00
Linus Torvalds	11fbf53d66	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Assorted bits and pieces from various people. No common topic in this pile, sorry" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs/affs: add rename exchange fs/affs: add rename2 to prepare multiple methods Make stat/lstat/fstatat pass AT_NO_AUTOMOUNT to vfs_statx() fs: don't set *REFERENCED on single use objects fs: compat: Remove warning from COMPATIBLE_IOCTL remove pointless extern of atime_need_update_rcu() fs: completely ignore unknown open flags fs: add a VALID_OPEN_FLAGS fs: remove _submit_bh() fs: constify tree_descr arrays passed to simple_fill_super() fs: drop duplicate header percpu-rwsem.h fs/affs: bugfix: Write files greater than page size on OFS fs/affs: bugfix: enable writes on OFS disks fs/affs: remove node generation check fs/affs: import amigaffs.h fs/affs: bugfix: make symbolic links work again	2017-05-09 09:12:53 -07:00
Christoph Hellwig	629e014bb8	fs: completely ignore unknown open flags Currently we just stash anything we got into file->f_flags, and the report it in fcntl(F_GETFD). This patch just clears out all unknown flags so that we don't pass them to the fs or report them. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-27 05:13:04 -04:00
Al Viro	159b095628	make sure that fchdir() won't accept referral points, etc. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-21 14:05:35 -04:00
Amir Goldstein	78757af651	vfs: ftruncate check IS_APPEND() on real upper inode ftruncate an overlayfs inode was checking IS_APPEND() on overlay inode, but overlay inode does not have the S_APPEND flag. Check IS_APPEND() on real upper inode instead. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2017-04-20 16:37:26 +02:00
Al Viro	e35d49f637	open: move compat syscalls from compat.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-17 12:52:25 -04:00
Amir Goldstein	bfe219d373	vfs: wrap write f_ops with file_{start,end}_write() Before calling write f_ops, call file_start_write() instead of sb_start_write(). Replace {sb,file}_start_write() for {copy,clone}_file_range() and for fallocate(). Beyond correct semantics, this avoids freeze protection to sb when operating on special inodes, such as fallocate() on a blockdev. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2017-02-07 15:05:04 +01:00
Amir Goldstein	9e79b13263	vfs: deny fallocate() on directory There was an obscure use case of fallocate of directory inode in the vfs helper with the comment: "Let individual file system decide if it supports preallocation for directories or not." But there is no in-tree file system that implements fallocate for directory operations. Deny an attempt to fallocate a directory with EISDIR error. This change is needed prior to converting sb_start_write() to file_start_write(), so freeze protection is correctly handled for cases of fallocate file and blockdev. Cc: linux-api@vger.kernel.org Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2017-02-07 15:05:04 +01:00
Linus Torvalds	7c0f6ba682	Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]#[[:blank:]]include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"\|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-24 11:46:01 -08:00
Linus Torvalds	35a891be96	Merge tag 'xfs-reflink-for-linus-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs < XFS has gained super CoW powers! > ---------------------------------- \ ^__^ \ (oo)\_______ (__)\ )\/\ \|\|----w \| \|\| \|\| Pull XFS support for shared data extents from Dave Chinner: "This is the second part of the XFS updates for this merge cycle. This pullreq contains the new shared data extents feature for XFS. Given the complexity and size of this change I am expecting - like the addition of reverse mapping last cycle - that there will be some follow-up bug fixes and cleanups around the -rc3 stage for issues that I'm sure will show up once the code hits a wider userbase. What it is: At the most basic level we are simply adding shared data extents to XFS - i.e. a single extent on disk can now have multiple owners. To do this we have to add new on-disk features to both track the shared extents and the number of times they've been shared. This is done by the new "refcount" btree that sits in every allocation group. When we share or unshare an extent, this tree gets updated. Along with this new tree, the reverse mapping tree needs to be updated to track each owner or a shared extent. This also needs to be updated ever share/unshare operation. These interactions at extent allocation and freeing time have complex ordering and recovery constraints, so there's a significant amount of new intent-based transaction code to ensure that operations are performed atomically from both the runtime and integrity/crash recovery perspectives. We also need to break sharing when writes hit a shared extent - this is where the new copy-on-write implementation comes in. We allocate new storage and copy the original data along with the overwrite data into the new location. We only do this for data as we don't share metadata at all - each inode has it's own metadata that tracks the shared data extents, the extents undergoing CoW and it's own private extents. Of course, being XFS, nothing is simple - we use delayed allocation for CoW similar to how we use it for normal writes. ENOSPC is a significant issue here - we build on the reservation code added in 4.8-rc1 with the reverse mapping feature to ensure we don't get spurious ENOSPC issues part way through a CoW operation. These mechanisms also help minimise fragmentation due to repeated CoW operations. To further reduce fragmentation overhead, we've also introduced a CoW extent size hint, which indicates how large a region we should allocate when we execute a CoW operation. With all this functionality in place, we can hook up .copy_file_range, .clone_file_range and .dedupe_file_range and we gain all the capabilities of reflink and other vfs provided functionality that enable manipulation to shared extents. We also added a fallocate mode that explicitly unshares a range of a file, which we implemented as an explicit CoW of all the shared extents in a file. As such, it's a huge chunk of new functionality with new on-disk format features and internal infrastructure. It warns at mount time as an experimental feature and that it may eat data (as we do with all new on-disk features until they stabilise). We have not released userspace suport for it yet - userspace support currently requires download from Darrick's xfsprogs repo and build from source, so the access to this feature is really developer/tester only at this point. Initial userspace support will be released at the same time the kernel with this code in it is released. The new code causes 5-6 new failures with xfstests - these aren't serious functional failures but things the output of tests changing slightly due to perturbations in layouts, space usage, etc. OTOH, we've added 150+ new tests to xfstests that specifically exercise this new functionality so it's got far better test coverage than any functionality we've previously added to XFS. Darrick has done a pretty amazing job getting us to this stage, and special mention also needs to go to Christoph (review, testing, improvements and bug fixes) and Brian (caught several intricate bugs during review) for the effort they've also put in. Summary: - unshare range (FALLOC_FL_UNSHARE) support for fallocate - copy-on-write extent size hints (FS_XFLAG_COWEXTSIZE) for fsxattr interface - shared extent support for XFS - copy-on-write support for shared extents - copy_file_range support - clone_file_range support (implements reflink) - dedupe_file_range support - defrag support for reverse mapping enabled filesystems" * tag 'xfs-reflink-for-linus-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (71 commits) xfs: convert COW blocks to real blocks before unwritten extent conversion xfs: rework refcount cow recovery error handling xfs: clear reflink flag if setting realtime flag xfs: fix error initialization xfs: fix label inaccuracies xfs: remove isize check from unshare operation xfs: reduce stack usage of _reflink_clear_inode_flag xfs: check inode reflink flag before calling reflink functions xfs: implement swapext for rmap filesystems xfs: refactor swapext code xfs: various swapext cleanups xfs: recognize the reflink feature bit xfs: simulate per-AG reservations being critically low xfs: don't mix reflink and DAX mode for now xfs: check for invalid inode reflink flags xfs: set a default CoW extent size of 32 blocks xfs: convert unwritten status of reverse mappings for shared files xfs: use interval query for rmap alloc operations on shared files xfs: add shared rmap map/unmap/convert log item types xfs: increase log reservations for reflink ...	2016-10-13 20:28:22 -07:00
Darrick J. Wong	25f4c41415	block: implement (some of) fallocate for block devices After much discussion, it seems that the fallocate feature flag FALLOC_FL_ZERO_RANGE maps nicely to SCSI WRITE SAME; and the feature FALLOC_FL_PUNCH_HOLE maps nicely to the devices that have been whitelisted for zeroing SCSI UNMAP. Punch still requires that FALLOC_FL_KEEP_SIZE is set. A length that goes past the end of the device will be clamped to the device size if KEEP_SIZE is set; or will return -EINVAL if not. Both start and length must be aligned to the device's logical block size. Since the semantics of fallocate are fairly well established already, wire up the two pieces. The other fallocate variants (collapse range, insert range, and allocate blocks) are not supported. Link: http://lkml.kernel.org/r/147518379992.22791.8849838163218235007.stgit@birch.djwong.org Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Mike Snitzer <snitzer@redhat.com> # tweaked header Cc: Brian Foster <bfoster@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Hannes Reinecke <hare@suse.de> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-10-11 15:06:30 -07:00
Darrick J. Wong	71be6b4942	vfs: add a FALLOC_FL_UNSHARE mode to fallocate to unshare a range of blocks Add a new fallocate mode flag that explicitly unshares blocks on filesystems that support such features. The new flag can only be used with an allocate-mode fallocate call. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2016-10-03 09:11:14 -07:00
Miklos Szeredi	4d0c5ba2ff	vfs: do get_write_access() on upper layer of overlayfs The problem with writecount is: we want consistent handling of it for underlying filesystems as well as overlayfs. Making sure i_writecount is correct on all layers is difficult. Instead this patch makes sure that when write access is acquired, it's always done on the underlying writable layer (called the upper layer). We must also make sure to look at the writecount on this layer when checking for conflicting leases. Open for write already updates the upper layer's writecount. Leaving only truncate. For truncate copy up must happen before get_write_access() so that the writecount is updated on the upper layer. Problem with this is if something fails after that, then copy-up was done needlessly. E.g. if break_lease() was interrupted. Probably not a big deal in practice. Another interesting case is if there's a denywrite on a lower file that is then opened for write or truncated. With this patch these will succeed, which is somewhat counterintuitive. But I think it's still acceptable, considering that the copy-up does actually create a different file, so the old, denywrite mapping won't be touched. On non-overlayfs d_real() is an identity function and d_real_inode() is equivalent to d_inode() so this patch doesn't change behavior in that case. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: Jeff Layton <jlayton@poochiereds.net> Cc: "J. Bruce Fields" <bfields@fieldses.org>	2016-09-16 12:44:21 +02:00
Miklos Szeredi	c568d68341	locks: fix file locking on overlayfs This patch allows flock, posix locks, ofd locks and leases to work correctly on overlayfs. Instead of using the underlying inode for storing lock context use the overlay inode. This allows locks to be persistent across copy-up. This is done by introducing locks_inode() helper and using it instead of file_inode() to get the inode in locking code. For non-overlayfs the two are equivalent, except for an extra pointer dereference in locks_inode(). Since lock operations are in "struct file_operations" we must also make sure not to call underlying filesystem's lock operations. Introcude a super block flag MS_NOREMOTELOCK to this effect. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: Jeff Layton <jlayton@poochiereds.net> Cc: "J. Bruce Fields" <bfields@fieldses.org>	2016-09-16 12:44:20 +02:00
Linus Torvalds	e9d488c311	Merge tag 'binfmt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/binfmt_misc Pull binfmt_misc update from James Bottomley: "This update is to allow architecture emulation containers to function such that the emulation binary can be housed outside the container itself. The container and fs parts both have acks from relevant experts. To use the new feature you have to add an F option to your binfmt_misc configuration" From the docs: "The usual behaviour of binfmt_misc is to spawn the binary lazily when the misc format file is invoked. However, this doesn't work very well in the face of mount namespaces and changeroots, so the F mode opens the binary as soon as the emulation is installed and uses the opened image to spawn the emulator, meaning it is always available once installed, regardless of how the environment changes" * tag 'binfmt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/binfmt_misc: binfmt_misc: add F option description to documentation binfmt_misc: add persistent opened binary handler for containers fs: add filp_clone_open API	2016-08-07 10:13:14 -04:00
Miklos Szeredi	2d902671ce	vfs: merge .d_select_inode() into .d_real() The two methods essentially do the same: find the real dentry/inode belonging to an overlay dentry. The difference is in the usage: vfs_open() uses ->d_select_inode() and expects the function to perform copy-up if necessary based on the open flags argument. file_dentry() uses ->d_real() passing in the overlay dentry as well as the underlying inode. vfs_rename() uses ->d_select_inode() but passes zero flags. ->d_real() with a zero inode would have worked just as well here. This patch merges the functionality of ->d_select_inode() into ->d_real() by adding an 'open_flags' argument to the latter. [Al Viro] Make the signature of d_real() match that of ->d_real() again. And constify the inode argument, while we are at it. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2016-06-30 08:53:27 +02:00
Linus Torvalds	c52b76185b	Merge branch 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull 'struct path' constification update from Al Viro: "'struct path' is passed by reference to a bunch of Linux security methods; in theory, there's nothing to stop them from modifying the damn thing and LSM community being what it is, sooner or later some enterprising soul is going to decide that it's a good idea. Let's remove the temptation and constify all of those..." * 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: constify ima_d_path() constify security_sb_pivotroot() constify security_path_chroot() constify security_path_{link,rename} apparmor: remove useless checks for NULL ->mnt constify security_path_{mkdir,mknod,symlink} constify security_path_{unlink,rmdir} apparmor: constify common_perm_...() apparmor: constify aa_path_link() apparmor: new helper - common_path_perm() constify chmod_common/security_path_chmod constify security_sb_mount() constify chown_common/security_path_chown tomoyo: constify assorted struct path * apparmor_path_truncate(): path->mnt is never NULL constify vfs_truncate() constify security_path_truncate() [apparmor] constify struct path * in a bunch of helpers	2016-05-17 14:41:03 -07:00
Al Viro	0e0162bb8c	Merge branch 'ovl-fixes' into for-linus Backmerge to resolve a conflict in ovl_lookup_real(); "ovl_lookup_real(): use lookup_one_len_unlocked()" instead, but it was too late in the cycle to rebase.	2016-05-17 02:17:59 -04:00
Miklos Szeredi	54d5ca871e	vfs: add vfs_select_inode() helper Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> # v4.2+	2016-05-10 23:55:01 -04:00
Al Viro	63b6df1413	give readdir(2)/getdents(2)/etc. uniform exclusion with lseek() same as read() on regular files has, and for the same reason. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:28 -04:00
James Bottomley	9a08c352d0	fs: add filp_clone_open API I need an API that allows me to obtain a clone of the current file pointer to pass in to an exec handler. I've labelled this as an internal API because I can't see how it would be useful outside of the fs subsystem. The use case will be a persistent binfmt_misc handler. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Jan Kara <jack@suse.cz>	2016-03-30 14:12:22 -07:00
Al Viro	be01f9f28e	constify chmod_common/security_path_chmod Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-03-28 00:47:25 -04:00
Al Viro	7fd25dac9a	constify chown_common/security_path_chown Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-03-28 00:47:24 -04:00
Al Viro	7df818b237	constify vfs_truncate() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-03-28 00:47:22 -04:00

1 2 3 4 5 ...

277 Commits