Commit Graph

953 Commits

Author SHA1 Message Date
Linus Torvalds 9ac0367501 Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
 "These are regression and bug fixes for ext4.

  We had a number of new features in ext4 during this merge window
  (ZERO_RANGE and COLLAPSE_RANGE fallocate modes, renameat, etc.) so
  there were many more regression and bug fixes this time around.  It
  didn't help that xfstests hadn't been fully updated to fully stress
  test COLLAPSE_RANGE until after -rc1"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (31 commits)
  ext4: disable COLLAPSE_RANGE for bigalloc
  ext4: fix COLLAPSE_RANGE failure with 1KB block size
  ext4: use EINVAL if not a regular file in ext4_collapse_range()
  ext4: enforce we are operating on a regular file in ext4_zero_range()
  ext4: fix extent merging in ext4_ext_shift_path_extents()
  ext4: discard preallocations after removing space
  ext4: no need to truncate pagecache twice in collapse range
  ext4: fix removing status extents in ext4_collapse_range()
  ext4: use filemap_write_and_wait_range() correctly in collapse range
  ext4: use truncate_pagecache() in collapse range
  ext4: remove temporary shim used to merge COLLAPSE_RANGE and ZERO_RANGE
  ext4: fix ext4_count_free_clusters() with EXT4FS_DEBUG and bigalloc enabled
  ext4: always check ext4_ext_find_extent result
  ext4: fix error handling in ext4_ext_shift_extents
  ext4: silence sparse check warning for function ext4_trim_extent
  ext4: COLLAPSE_RANGE only works on extent-based files
  ext4: fix byte order problems introduced by the COLLAPSE_RANGE patches
  ext4: use i_size_read in ext4_unaligned_aio()
  fs: disallow all fallocate operation on active swapfile
  fs: move falloc collapse range check into the filesystem methods
  ...
2014-04-20 20:43:47 -07:00
Linus Torvalds 96c57ade7e ceph: fix pr_fmt() redefinition
The vfs merge caused a latent bug to show up:

   In file included from fs/ceph/super.h:4:0,
                    from fs/ceph/ioctl.c:3:
   include/linux/ceph/ceph_debug.h:4:0: warning: "pr_fmt" redefined [enabled by default]
    #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
    ^
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/uio.h:12,
                    from include/linux/socket.h:7,
                    from include/uapi/linux/in.h:22,
                    from include/linux/in.h:23,
                    from fs/ceph/ioctl.c:1:
   include/linux/printk.h:214:0: note: this is the location of the previous definition
    #define pr_fmt(fmt) fmt
    ^

where the reason is that <linux/ceph_debug.h> is included much too late
for the "pr_fmt()" define.

The include of <linux/ceph_debug.h> needs to be the first include in the
file, but fs/ceph/ioctl.c had for some reason missed that, and it wasn't
noticeable until some unrelated header file changes brought in an
indirect earlier include of <linux/kernel.h>.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-12 15:39:53 -07:00
Linus Torvalds 5166701b36 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "The first vfs pile, with deep apologies for being very late in this
  window.

  Assorted cleanups and fixes, plus a large preparatory part of iov_iter
  work.  There's a lot more of that, but it'll probably go into the next
  merge window - it *does* shape up nicely, removes a lot of
  boilerplate, gets rid of locking inconsistencie between aio_write and
  splice_write and I hope to get Kent's direct-io rewrite merged into
  the same queue, but some of the stuff after this point is having
  (mostly trivial) conflicts with the things already merged into
  mainline and with some I want more testing.

  This one passes LTP and xfstests without regressions, in addition to
  usual beating.  BTW, readahead02 in ltp syscalls testsuite has started
  giving failures since "mm/readahead.c: fix readahead failure for
  memoryless NUMA nodes and limit readahead pages" - might be a false
  positive, might be a real regression..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  missing bits of "splice: fix racy pipe->buffers uses"
  cifs: fix the race in cifs_writev()
  ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
  kill generic_file_buffered_write()
  ocfs2_file_aio_write(): switch to generic_perform_write()
  ceph_aio_write(): switch to generic_perform_write()
  xfs_file_buffered_aio_write(): switch to generic_perform_write()
  export generic_perform_write(), start getting rid of generic_file_buffer_write()
  generic_file_direct_write(): get rid of ppos argument
  btrfs_file_aio_write(): get rid of ppos
  kill the 5th argument of generic_file_buffered_write()
  kill the 4th argument of __generic_file_aio_write()
  lustre: don't open-code kernel_recvmsg()
  ocfs2: don't open-code kernel_recvmsg()
  drbd: don't open-code kernel_recvmsg()
  constify blk_rq_map_user_iov() and friends
  lustre: switch to kernel_sendmsg()
  ocfs2: don't open-code kernel_sendmsg()
  take iov_iter stuff to mm/iov_iter.c
  process_vm_access: tidy up a bit
  ...
2014-04-12 14:49:50 -07:00
Lukas Czerner 0790b31b69 fs: disallow all fallocate operation on active swapfile
Currently some file system have IS_SWAPFILE check in their fallocate
implementations and some do not. However we should really prevent any
fallocate operation on swapfile so move the check to vfs and remove the
redundant checks from the file systems fallocate implementations.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-12 10:05:37 -04:00
Al Viro eab87235c0 ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
ceph_osdc_put_request(ERR_PTR(-error)) oopses.  What we want there
is break, not goto out.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-12 06:51:51 -04:00
Yan, Zheng a30be7cb2c ceph: skip invalid dentry during dcache readdir
skip dentries that were added before MDS issued FILE_SHARED to
client.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-06 09:13:14 -07:00
Yan, Zheng a56371d9d9 ceph: flush cap release queue when trimming session caps
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-04 21:08:26 -07:00
Yan, Zheng 4819301287 ceph: don't grabs open file reference for aborted request
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-04 21:08:25 -07:00
Yan, Zheng ab866549b3 ceph: drop extra open file reference in ceph_atomic_open()
ceph_atomic_open() calls ceph_open() after receiving the MDS reply.
ceph_open() grabs an extra open file reference. (The open request
already holds an open file reference)

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-04 21:08:23 -07:00
Yan, Zheng 54008399dc ceph: preallocate buffer for readdir reply
Preallocate buffer for readdir reply. Limit number of entries in
readdir reply according to the buffer size.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-04 21:08:22 -07:00
Yan, Zheng cc48c3e85f ceph: don't include ceph.{file,dir}.layout vxattr in listxattr()
This avoids 'cp -a' modifying layout of new files/directories.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-04 21:07:21 -07:00
Yan, Zheng 1e5c6649ff ceph: check buffer size in ceph_vxattrcb_layout()
If buffer size is zero, return the size of layout vxattr. If buffer
size is not zero, check if it is large enough for layout vxattr.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-04 21:07:19 -07:00
Yan, Zheng 00bd8edb86 ceph: fix null pointer dereference in discard_cap_releases()
send_mds_reconnect() may call discard_cap_releases() after all
release messages have been dropped by cleanup_cap_releases()

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-04 21:07:17 -07:00
Fabian Frederick 5f75ce5781 ceph: Remove get/set acl on symlinks
Remove unsupported symlink operations.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
2014-04-04 21:07:14 -07:00
Yan, Zheng d9ffc4f770 ceph: set mds_wanted when MDS reply changes a cap to auth cap
When adjusting caps client wants, MDS does not record caps that are
not allowed. For non-auth MDS, it does not record WR caps. So when
a MDS reply changes a non-auth cap to auth cap, client needs to set
cap's mds_wanted according to the reply.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-04 21:07:12 -07:00
Yan, Zheng eb13e832f8 ceph: use fl->fl_file as owner identifier of flock and posix lock
flock and posix lock should use fl->fl_file instead of process ID
as owner identifier. (posix lock uses fl->fl_owner. fl->fl_owner
is usually equal to fl->fl_file, but it also can be a customized
value). The process ID of who holds the lock is just for F_GETLK
fcntl(2).

The fix is rename the 'pid' fields of struct ceph_mds_request_args
and struct ceph_filelock to 'owner', rename 'pid_namespace' fields
to 'pid'. Assign fl->fl_file to the 'owner' field of lock messages.
We also set the most significant bit of the 'owner' field. MDS can
use that bit to distinguish between old and new clients.

The MDS counterpart of this patch modifies the flock code to not
take the 'pid_namespace' into consideration when checking conflict
locks.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-04 21:07:11 -07:00
Yan, Zheng eb70c0ce4e ceph: forbid mandatory file lock
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-04 21:07:09 -07:00
Yan, Zheng 0e8e95d6d7 ceph: use fl->fl_type to decide flock operation
VFS does not directly pass flock's operation code to filesystem's
flock callback. It translates the operation code to the form how
posix lock's parameters are presented.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-04 21:07:08 -07:00
Yan, Zheng 8c93cd610c ceph: update i_max_size even if inode version does not change
handle following sequence of events:
 - client releases a inode with i_max_size > 0. The release message
   is queued. (is not sent to the auth MDS)
 - a 'lookup' request reply from non-auth MDS returns the same inode.
 - client opens the inode in write mode. The version of inode trace
   in 'open' request reply is equal to the cached inode's version.
 - client requests new max size. The MDS ignores the request because
   it does not affect client's write range

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-04 21:07:06 -07:00
Yan, Zheng a255060451 ceph: make sure write caps are registered with auth MDS
Only auth MDS can issue write caps to clients, so don't consider
write caps registered with non-auth MDS as valid.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-04-04 21:07:05 -07:00
Yan, Zheng c137a32a40 ceph: print inode number for LOOKUPINO request
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 10:33:54 +08:00
Yan, Zheng 19913b4eac ceph: add get_name() NFS export callback
Use the newly introduced LOOKUPNAME MDS request to connect child
inode to its parent directory.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 10:33:53 +08:00
Yan, Zheng 8996f4f23d ceph: fix ceph_fh_to_parent()
ceph_fh_to_parent() returns dentry that corresponds to the 'ino' field
of struct ceph_nfs_confh. This is wrong, it should return dentry that
corresponds to the 'parent_ino' field.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 10:33:53 +08:00
Yan, Zheng 9017c2ec78 ceph: add get_parent() NFS export callback
The callback uses LOOKUPPARENT MDS request to find parent.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 10:33:53 +08:00
Yan, Zheng 4f32b42dca ceph: simplify ceph_fh_to_dentry()
MDS handles LOOKUPHASH and LOOKUPINO MDS requests in the same way.
So __cfh_to_dentry() is redundant.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03 10:33:53 +08:00