Commit Graph

40426 Commits

Author SHA1 Message Date
Yan, Zheng c0bd50e2ee ceph: fix null pointer dereference in send_mds_reconnect()
sb->s_root can be null when umounting

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-22 18:33:31 +03:00
Giuseppe Cantavenera bb7ffbf29e nfsd: fix nsfd startup race triggering BUG_ON
nfsd triggered a BUG_ON in net_generic(...) when rpc_pipefs_event(...)
in fs/nfsd/nfs4recover.c was called before assigning ntfsd_net_id.
The following was observed on a MIPS 32-core processor:
kernel: Call Trace:
kernel: [<ffffffffc00bc5e4>] rpc_pipefs_event+0x7c/0x158 [nfsd]
kernel: [<ffffffff8017a2a0>] notifier_call_chain+0x70/0xb8
kernel: [<ffffffff8017a4e4>] __blocking_notifier_call_chain+0x4c/0x70
kernel: [<ffffffff8053aff8>] rpc_fill_super+0xf8/0x1a0
kernel: [<ffffffff8022204c>] mount_ns+0xb4/0xf0
kernel: [<ffffffff80222b48>] mount_fs+0x50/0x1f8
kernel: [<ffffffff8023dc00>] vfs_kern_mount+0x58/0xf0
kernel: [<ffffffff802404ac>] do_mount+0x27c/0xa28
kernel: [<ffffffff80240cf0>] SyS_mount+0x98/0xe8
kernel: [<ffffffff80135d24>] handle_sys64+0x44/0x68
kernel:
kernel:
        Code: 0040f809  00000000  2e020001 <00020336> 3c12c00d
                3c02801a  de100000 6442eb98  0040f809
kernel: ---[ end trace 7471374335809536 ]---

Fixed this behaviour by calling register_pernet_subsys(&nfsd_net_ops) before
registering rpc_pipefs_event(...) with the notifier chain.

Signed-off-by: Giuseppe Cantavenera <giuseppe.cantavenera.ext@nokia.com>
Signed-off-by: Lorenzo Restelli <lorenzo.restelli.ext@nokia.com>
Reviewed-by: Kinlong Mee <kinglongmee@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 16:16:03 -04:00
Mark Salter 135dd002c2 nfsd: eliminate NFSD_DEBUG
Commit f895b252d4 ("sunrpc: eliminate RPC_DEBUG") introduced
use of IS_ENABLED() in a uapi header which leads to a build
failure for userspace apps trying to use <linux/nfsd/debug.h>:

   linux/nfsd/debug.h:18:15: error: missing binary operator before token "("
  #if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
                ^

Since this was only used to define NFSD_DEBUG if CONFIG_SUNRPC_DEBUG
is enabled, replace instances of NFSD_DEBUG with CONFIG_SUNRPC_DEBUG.

Cc: stable@vger.kernel.org
Fixes: f895b252d4 "sunrpc: eliminate RPC_DEBUG"
Signed-off-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 16:16:02 -04:00
J. Bruce Fields 6e4891dc28 nfsd4: fix READ permission checking
In the case we already have a struct file (derived from a stateid), we
still need to do permission-checking; otherwise an unauthorized user
could gain access to a file by sniffing or guessing somebody else's
stateid.

Cc: stable@vger.kernel.org
Fixes: dc97618ddd "nfsd4: separate splice and readv cases"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 16:16:01 -04:00
J. Bruce Fields 980608fb50 nfsd4: disallow SEEK with special stateids
If the client uses a special stateid then we'll pass a NULL file to
vfs_llseek.

Fixes: 24bab49122 " NFSD: Implement SEEK"
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: stable@vger.kernel.org
Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 16:16:01 -04:00
J. Bruce Fields 5ba4a25ab7 nfsd4: disallow ALLOCATE with special stateids
vfs_fallocate will hit a NULL dereference if the client tries an
ALLOCATE or DEALLOCATE with a special stateid.  Fix that.  (We also
depend on the open to have broken any conflicting leases or delegations
for us.)

(If it turns out we need to allow special stateid's then we could do a
temporary open here in the special-stateid case, as we do for read and
write.  For now I'm assuming it's not necessary.)

Fixes: 95d871f03c "nfsd: Add ALLOCATE support"
Cc: stable@vger.kernel.org
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 15:44:06 -04:00
Linus Torvalds 8f443e2372 Revert "ocfs2: incorrect check for debugfs returns"
This reverts commit e2ac55b6a8.

Huang Ying reports that this causes a hang at boot with debugfs disabled.

It is true that the debugfs error checks are kind of confusing, and this
code certainly merits more cleanup and thinking about it, but there's
something wrong with the trivial "check not just for NULL, but for error
pointers too" patch.

Yes, with debugfs disabled, we will end up setting the o2hb_debug_dir
pointer variable to an error pointer (-ENODEV), and then continue as if
everything was fine.  But since debugfs is disabled, all the _users_ of
that pointer end up being compiled away, so even though the pointer can
not be dereferenced, that's still fine.

So it's confusing and somewhat questionable, but the "more correct"
error checks end up causing more trouble than they fix.

Reported-by: Huang Ying <ying.huang@intel.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Chengyu Song <csong84@gatech.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-21 09:17:28 -07:00
Yan, Zheng 32ec439775 ceph: hold on to exclusive caps on complete directories
If a directory is complete, we want to keep the exclusive
cap. So that MDS does not end up revoking the shared cap
on every create/unlink operation.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 18:55:40 +03:00
Ilya Dryomov ff7eeb82cc ceph: show non-default options only
Don't pollute /proc/mounts with default options (presently these are
dcache, nofsc and acl).  Leave the acl/noacl however - it's a bit of
a special case due to CONFIG_CEPH_FS_POSIX_ACL.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-20 18:55:39 +03:00
Ilya Dryomov ff40f9ae95 libceph, ceph: split ceph_show_options()
Split ceph_show_options() into two pieces and move the piece
responsible for printing client (libceph) options into net/ceph.  This
way people adding a libceph option wouldn't have to remember to update
code in fs/ceph.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-20 18:55:38 +03:00
Yan, Zheng 1c841a96b5 ceph: cleanup unsafe requests when reconnecting is denied
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 18:55:37 +03:00
Yan, Zheng a9f6eb6185 ceph: don't zero i_wrbuffer_ref when reconnecting is denied
remove_session_caps_cb() does not truncate dirty data in page
cache, but zeros i_wrbuffer_ref/i_wrbuffer_ref_head. This will
result negtive i_wrbuffer_ref/i_wrbuffer_ref_head

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 18:55:36 +03:00
Yan, Zheng 571ade336a ceph: don't mark dirty caps when there is no auth cap
No i_auth_cap means reconnecting to MDS was denied. So don't
add new dirty caps.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 18:55:36 +03:00
Yan, Zheng db40cc1702 ceph: keep i_snap_realm while there are writers
when reconnecting to MDS is denied, we remove session caps
forcibly. But it's possible there are ongoing write, the
write code needs to reference i_snap_realm. So if there are
ongoing write, we keep i_snap_realm.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 18:55:35 +03:00
Sanidhya Kashyap a149bb9a28 ceph: kstrdup() memory handling
Currently, there is no check for the kstrdup() for r_path2,
r_path1 and snapdir_name as various locations as there is a
possibility of failure during memory pressure. Therefore,
returning ENOMEM where the checks have been missed.

Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 18:55:34 +03:00
Taesoo Kim c1d00b2d9c ceph: properly release page upon error
When ceph_update_writeable_page fails (including -EAGAIN), it
unlocks (w/ unlock_page) the page but does not 'release'
(w/ page_cache_release) properly.

Upon error, properly set *pagep to NULL, indicating an error.

Signed-off-by: Taesoo Kim <tsgatesv@gmail.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 18:55:34 +03:00
Nicholas Mc Guire 57e95460f0 ceph: match wait_for_completion_timeout return type
return type of wait_for_completion_timeout is unsigned long not int. An
appropriately named unsigned long is added and the assignment fixed up.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 17:30:23 +03:00
Nicholas Mc Guire 3563dbdd99 ceph: use msecs_to_jiffies for time conversion
This is only an API consolidation and should make things more readable
it replaces var * HZ / 1000 by msecs_to_jiffies(var).

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 17:30:22 +03:00
Fabian Frederick e1eba3ea02 ceph: remove redundant declaration
ceph_aops was already defined extern in addr.c section

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 17:30:22 +03:00
Yan, Zheng e2c3de046c ceph: fix dcache/nocache mount option
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 17:30:22 +03:00
Yan, Zheng 6e6f09231a ceph: drop cap releases in requests composed before cap reconnect
These cap releases are stale because MDS will re-establish client
caps according to the cap reconnect messages.

Note: MDS can detect stale cap messages, so these stale cap
releases are harmless even we don't drop them.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20 17:30:22 +03:00
Linus Torvalds 6162e4b0be Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
 "A few bug fixes and add support for file-system level encryption in
  ext4"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (31 commits)
  ext4 crypto: enable encryption feature flag
  ext4 crypto: add symlink encryption
  ext4 crypto: enable filename encryption
  ext4 crypto: filename encryption modifications
  ext4 crypto: partial update to namei.c for fname crypto
  ext4 crypto: insert encrypted filenames into a leaf directory block
  ext4 crypto: teach ext4_htree_store_dirent() to store decrypted filenames
  ext4 crypto: filename encryption facilities
  ext4 crypto: implement the ext4 decryption read path
  ext4 crypto: implement the ext4 encryption write path
  ext4 crypto: inherit encryption policies on inode and directory create
  ext4 crypto: enforce context consistency
  ext4 crypto: add encryption key management facilities
  ext4 crypto: add ext4 encryption facilities
  ext4 crypto: add encryption policy and password salt support
  ext4 crypto: add encryption xattr support
  ext4 crypto: export ext4_empty_dir()
  ext4 crypto: add ext4 encryption Kconfig
  ext4 crypto: reserve codepoints used by the ext4 encryption feature
  ext4 crypto: add ext4_mpage_readpages()
  ...
2015-04-19 14:26:31 -07:00
Jann Horn 8b01fc86b9 fs: take i_mutex during prepare_binprm for set[ug]id executables
This prevents a race between chown() and execve(), where chowning a
setuid-user binary to root would momentarily make the binary setuid
root.

This patch was mostly written by Linus Torvalds.

Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-19 13:46:21 -07:00
Linus Torvalds dba94f2155 Merge tag 'for-linus-4.1-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
Pull 9pfs updates from Eric Van Hensbergen:
 "Some accumulated cleanup patches for kerneldoc and unused variables as
  well as some lock bug fixes and adding privateport option for RDMA"

* tag 'for-linus-4.1-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  net/9p: add a privport option for RDMA transport.
  fs/9p: Initialize status in v9fs_file_do_lock.
  net/9p: Initialize opts->privport as it should be.
  net/9p: use memcpy() instead of snprintf() in p9_mount_tag_show()
  9p: use unsigned integers for nwqid/count
  9p: do not crash on unknown lock status code
  9p: fix error handling in v9fs_file_do_lock
  9p: remove unused variable in p9_fd_create()
  9p: kerneldoc warning fixes
2015-04-18 17:45:30 -04:00
Linus Torvalds 8f502d5b9e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull usernamespace mount fixes from Eric Biederman:
 "Way back in October Andrey Vagin reported that umount(MNT_DETACH)
  could be used to defeat MNT_LOCKED.  As I worked to fix this I
  discovered that combined with mount propagation and an appropriate
  selection of shared subtrees a reference to a directory on an
  unmounted filesystem is not necessary.

  That MNT_DETACH is allowed in user namespace in a form that can break
  MNT_LOCKED comes from my early misunderstanding what MNT_DETACH does.

  To avoid breaking existing userspace the conflict between MNT_DETACH
  and MNT_LOCKED is fixed by leaving mounts that are locked to their
  parents in the mount hash table until the last reference goes away.

  While investigating this issue I also found an issue with
  __detach_mounts.  The code was unnecessarily and incorrectly
  triggering mount propagation.  Resulting in too many mounts going away
  when a directory is deleted, and too many cpu cycles are burned while
  doing that.

  Looking some more I realized that __detach_mounts by only keeping
  mounts connected that were MNT_LOCKED it had the potential to still
  leak information so I tweaked the code to keep everything locked
  together that possibly could be.

  This code was almost ready last cycle but Al invented fs_pin which
  slightly simplifies this code but required rewrites and retesting, and
  I have not been in top form for a while so it took me a while to get
  all of that done.  Similiarly this pull request is late because I have
  been feeling absolutely miserable all week.

  The issue of being able to escape a bind mount has not yet been
  addressed, as the fixes are not yet mature"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  mnt: Update detach_mounts to leave mounts connected
  mnt: Fix the error check in __detach_mounts
  mnt: Honor MNT_LOCKED when detaching mounts
  fs_pin: Allow for the possibility that m_list or s_list go unused.
  mnt: Factor umount_mnt from umount_tree
  mnt: Factor out unhash_mnt from detach_mnt and umount_tree
  mnt: Fail collect_mounts when applied to unmounted mounts
  mnt: Don't propagate unmounts to locked mounts
  mnt: On an unmount propagate clearing of MNT_LOCKED
  mnt: Delay removal from the mount hash.
  mnt: Add MNT_UMOUNT flag
  mnt: In umount_tree reuse mnt_list instead of mnt_hash
  mnt: Don't propagate umounts in __detach_mounts
  mnt: Improve the umount_tree flags
  mnt: Use hlist_move_list in namespace_unlock
2015-04-18 11:20:31 -04:00