Commit Graph

349 Commits

Author SHA1 Message Date
J. Bruce Fields 1a0a397e41 dcache: d_obtain_alias callers don't all want DISCONNECTED
There are a few d_obtain_alias callers that are using it to get the
root of a filesystem which may already have an alias somewhere else.

This is not the same as the filehandle-lookup case, and none of them
actually need DCACHE_DISCONNECTED set.

It isn't really a serious problem, but it would really be clearer if we
reserved DCACHE_DISCONNECTED for those cases where it's actually needed.

In the btrfs case this was causing a spurious printk from
nfsd/nfsfh.c:fh_verify when it found an unexpected DCACHE_DISCONNECTED
dentry.  Josef worked around this by unsetting DCACHE_DISCONNECTED
manually in 3a0dfa6a12 "Btrfs: unset DCACHE_DISCONNECTED when mounting
default subvol", and this replaces that workaround.

Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07 14:40:10 -04:00
J. Bruce Fields da093a9b76 dcache: d_splice_alias should ignore DCACHE_DISCONNECTED
Any IS_ROOT() alias should be safe to use; there's nothing special about
DCACHE_DISCONNECTED dentries.

Note that this is in fact useful for filesystems such as btrfs which can
legimately encounter a directory with a preexisting IS_ROOT alias on a
lookup that crosses into a subvolume.  (Those aliases are currently
marked DCACHE_DISCONNECTED--but not really for any good reason, and
we'll change that soon.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07 14:40:10 -04:00
J. Bruce Fields 908790fa3b dcache: d_splice_alias mustn't create directory aliases
Currently if d_splice_alias finds a directory with an alias that is not
IS_ROOT or not DCACHE_DISCONNECTED, it creates a duplicate directory.

Duplicate directory dentries are unacceptable; it is better just to
error out.

(In the case of a local filesystem the most likely case is filesystem
corruption: for example, perhaps two directories point to the same child
directory, and the other parent has already been found and cached.)

Note that distributed filesystems may encounter this case in normal
operation if a remote host moves a directory to a location different
from the one we last cached in the dcache.  For that reason, such
filesystems should instead use d_materialise_unique, which tries to move
the old directory alias to the right place instead of erroring out.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07 14:40:10 -04:00
J. Bruce Fields 75a2352d01 dcache: close d_move race in d_splice_alias
d_splice_alias will d_move an IS_ROOT() directory dentry into place if
one exists.  This should be safe as long as the dentry remains IS_ROOT,
but I can't see what guarantees that: once we drop the i_lock all we
hold here is the i_mutex on an unrelated parent directory.

Instead copy the logic of d_materialise_unique.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07 14:40:10 -04:00
J. Bruce Fields 3f70bd51cb dcache: move d_splice_alias
Just a trivial move to locate it near (similar) d_materialise_unique
code and save some forward references in a following patch.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07 14:40:10 -04:00
Linus Torvalds 16b9057804 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "This the bunch that sat in -next + lock_parent() fix.  This is the
  minimal set; there's more pending stuff.

  In particular, I really hope to get acct.c fixes merged this cycle -
  we need that to deal sanely with delayed-mntput stuff.  In the next
  pile, hopefully - that series is fairly short and localized
  (kernel/acct.c, fs/super.c and fs/namespace.c).  In this pile: more
  iov_iter work.  Most of prereqs for ->splice_write with sane locking
  order are there and Kent's dio rewrite would also fit nicely on top of
  this pile"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits)
  lock_parent: don't step on stale ->d_parent of all-but-freed one
  kill generic_file_splice_write()
  ceph: switch to iter_file_splice_write()
  shmem: switch to iter_file_splice_write()
  nfs: switch to iter_splice_write_file()
  fs/splice.c: remove unneeded exports
  ocfs2: switch to iter_file_splice_write()
  ->splice_write() via ->write_iter()
  bio_vec-backed iov_iter
  optimize copy_page_{to,from}_iter()
  bury generic_file_aio_{read,write}
  lustre: get rid of messing with iovecs
  ceph: switch to ->write_iter()
  ceph_sync_direct_write: stop poking into iov_iter guts
  ceph_sync_read: stop poking into iov_iter guts
  new helper: copy_page_from_iter()
  fuse: switch to ->write_iter()
  btrfs: switch to ->write_iter()
  ocfs2: switch to ->write_iter()
  xfs: switch to ->write_iter()
  ...
2014-06-12 10:30:18 -07:00
Al Viro c2338f2dc7 lock_parent: don't step on stale ->d_parent of all-but-freed one
Dentry that had been through (or into) __dentry_kill() might be seen
by shrink_dentry_list(); that's normal, it'll be taken off the shrink
list and freed if __dentry_kill() has already finished.  The problem
is, its ->d_parent might be pointing to already freed dentry, so
lock_parent() needs to be careful.

We need to check that dentry hasn't already gone into __dentry_kill()
*and* grab rcu_read_lock() before dropping ->d_lock - the latter makes
sure that whatever we see in ->d_parent after dropping ->d_lock it
won't be freed until we drop rcu_read_lock().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-06-12 00:29:13 -04:00
Joe Perches 1f7e0616cd fs: convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:16 -07:00
Linus Torvalds 9f12600fe4 dcache: add missing lockdep annotation
lock_parent() very much on purpose does nested locking of dentries, and
is careful to maintain the right order (lock parent first).  But because
it didn't annotate the nested locking order, lockdep thought it might be
a deadlock on d_lock, and complained.

Add the proper annotation for the inner locking of the child dentry to
make lockdep happy.

Introduced by commit 046b961b45 ("shrink_dentry_list(): take parent's
->d_lock earlier").

Reported-and-tested-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-05-31 09:13:21 -07:00
Al Viro 8cbf74da43 dentry_kill() doesn't need the second argument now
it's 1 in the only remaining caller.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-30 11:10:33 -04:00
Al Viro b2b80195d8 dealing with the rest of shrink_dentry_list() livelock
We have the same problem with ->d_lock order in the inner loop, where
we are dropping references to ancestors.  Same solution, basically -
instead of using dentry_kill() we use lock_parent() (introduced in the
previous commit) to get that lock in a safe way, recheck ->d_count
(in case if lock_parent() has ended up dropping and retaking ->d_lock
and somebody managed to grab a reference during that window), trylock
the inode->i_lock and use __dentry_kill() to do the rest.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-30 11:10:33 -04:00
Al Viro 046b961b45 shrink_dentry_list(): take parent's ->d_lock earlier
The cause of livelocks there is that we are taking ->d_lock on
dentry and its parent in the wrong order, forcing us to use
trylock on the parent's one.  d_walk() takes them in the right
order, and unfortunately it's not hard to create a situation
when shrink_dentry_list() can't make progress since trylock
keeps failing, and shrink_dcache_parent() or check_submounts_and_drop()
keeps calling d_walk() disrupting the very shrink_dentry_list() it's
waiting for.

Solution is straightforward - if that trylock fails, let's unlock
the dentry itself and take locks in the right order.  We need to
stabilize ->d_parent without holding ->d_lock, but that's doable
using RCU.  And we'd better do that in the very beginning of the
loop in shrink_dentry_list(), since the checks on refcount, etc.
would need to be redone anyway.

That deals with a half of the problem - killing dentries on the
shrink list itself.  Another one (dropping their parents) is
in the next commit.

locking parent is interesting - it would be easy to do rcu_read_lock(),
lock whatever we think is a parent, lock dentry itself and check
if the parent is still the right one.  Except that we need to check
that *before* locking the dentry, or we are risking taking ->d_lock
out of order.  Fortunately, once the D1 is locked, we can check if
D2->d_parent is equal to D1 without the need to lock D2; D2->d_parent
can start or stop pointing to D1 only under D1->d_lock, so taking
D1->d_lock is enough.  In other words, the right solution is
rcu_read_lock/lock what looks like parent right now/check if it's
still our parent/rcu_read_unlock/lock the child.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-30 11:03:21 -04:00
Al Viro ff2fde9929 expand dentry_kill(dentry, 0) in shrink_dentry_list()
Result will be massaged to saner shape in the next commits.  It is
ugly, no questions - the point of that one is to be a provably
equivalent transformation (and it might be worth splitting a bit
more).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-29 08:50:08 -04:00
Al Viro e55fd01154 split dentry_kill()
... into trylocks and everything else.  The latter (actual killing)
is __dentry_kill().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-29 08:46:08 -04:00
Al Viro 64fd72e0a4 lift the "already marked killed" case into shrink_dentry_list()
It can happen only when dentry_kill() is called with unlock_on_failure
equal to 0 - other callers had dentry pinned until the moment they've
got ->d_lock and DCACHE_DENTRY_KILLED is set only after lockref_mark_dead().

IOW, only one of three call sites of dentry_kill() might end up reaching
that code.  Just move it there.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-28 09:48:44 -04:00
Miklos Szeredi 60942f2f23 dcache: don't need rcu in shrink_dentry_list()
Since now the shrink list is private and nobody can free the dentry while
it is on the shrink list, we can remove RCU protection from this.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-03 16:46:16 -04:00
Al Viro 9c8c10e262 more graceful recovery in umount_collect()
Start with shrink_dcache_parent(), then scan what remains.

First of all, BUG() is very much an overkill here; we are holding
->s_umount, and hitting BUG() means that a lot of interesting stuff
will be hanging after that point (sync(2), for example).  Moreover,
in cases when there had been more than one leak, we'll be better
off reporting all of them.  And more than just the last component
of pathname - %pd is there for just such uses...

That was the last user of dentry_lru_del(), so kill it off...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-03 16:46:13 -04:00
Al Viro fe91522a7b don't remove from shrink list in select_collect()
If we find something already on a shrink list, just increment
data->found and do nothing else.  Loops in shrink_dcache_parent() and
check_submounts_and_drop() will do the right thing - everything we
did put into our list will be evicted and if there had been nothing,
but data->found got non-zero, well, we have somebody else shrinking
those guys; just try again.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-03 16:45:06 -04:00
Al Viro 41edf278fc dentry_kill(): don't try to remove from shrink list
If the victim in on the shrink list, don't remove it from there.
If shrink_dentry_list() manages to remove it from the list before
we are done - fine, we'll just free it as usual.  If not - mark
it with new flag (DCACHE_MAY_FREE) and leave it there.

Eventually, shrink_dentry_list() will get to it, remove the sucker
from shrink list and call dentry_kill(dentry, 0).  Which is where
we'll deal with freeing.

Since now dentry_kill(dentry, 0) may happen after or during
dentry_kill(dentry, 1), we need to recognize that (by seeing
DCACHE_DENTRY_KILLED already set), unlock everything
and either free the sucker (in case DCACHE_MAY_FREE has been
set) or leave it for ongoing dentry_kill(dentry, 1) to deal with.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-01 10:30:00 -04:00
Al Viro 01b6035190 expand the call of dentry_lru_del() in dentry_kill()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-30 18:02:52 -04:00
Al Viro b4f0354e96 new helper: dentry_free()
The part of old d_free() that dealt with actual freeing of dentry.
Taken out of dentry_kill() into a separate function.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-30 18:02:52 -04:00
Al Viro 5c47e6d0ad fold try_prune_one_dentry()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-30 18:02:51 -04:00
Al Viro 03b3b889e7 fold d_kill() and d_free()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-30 18:02:51 -04:00
Al Viro 22213318af fix races between __d_instantiate() and checks of dentry flags
in non-lazy walk we need to be careful about dentry switching from
negative to positive - both ->d_flags and ->d_inode are updated,
and in some places we might see only one store.  The cases where
dentry has been obtained by dcache lookup with ->i_mutex held on
parent are safe - ->d_lock and ->i_mutex provide all the barriers
we need.  However, there are several places where we run into
trouble:
	* do_last() fetches ->d_inode, then checks ->d_flags and
assumes that inode won't be NULL unless d_is_negative() is true.
Race with e.g. creat() - we might have fetched the old value of
->d_inode (still NULL) and new value of ->d_flags (already not
DCACHE_MISS_TYPE).  Lin Ming has observed and reported the resulting
oops.
	* a bunch of places checks ->d_inode for being non-NULL,
then checks ->d_flags for "is it a symlink".  Race with symlink(2)
in case if our CPU sees ->d_inode update first - we see non-NULL
there, but ->d_flags still contains DCACHE_MISS_TYPE instead of
DCACHE_SYMLINK_TYPE.  Result: false negative on "should we follow
link here?", with subsequent unpleasantness.

Cc: stable@vger.kernel.org # 3.13 and 3.14 need that one
Reported-and-tested-by: Lin Ming <minggr@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-19 12:30:58 -04:00
Linus Torvalds e9f37d3a8d Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "Highlights:

   - drm:

     Generic display port aux features, primary plane support, drm
     master management fixes, logging cleanups, enforced locking checks
     (instead of docs), documentation improvements, minor number
     handling cleanup, pseudofs for shared inodes.

   - ttm:

     add ability to allocate from both ends

   - i915:

     broadwell features, power domain and runtime pm, per-process
     address space infrastructure (not enabled)

   - msm:

     power management, hdmi audio support

   - nouveau:

     ongoing GPU fault recovery, initial maxwell support, random fixes

   - exynos:

     refactored driver to clean up a lot of abstraction, DP support
     moved into drm, LVDS bridge support added, parallel panel support

   - gma500:

     SGX MMU support, SGX irq handling, asle irq work fixes

   - radeon:

     video engine bringup, ring handling fixes, use dp aux helpers

   - vmwgfx:

     add rendernode support"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (849 commits)
  DRM: armada: fix corruption while loading cursors
  drm/dp_helper: don't return EPROTO for defers (v2)
  drm/bridge: export ptn3460_init function
  drm/exynos: remove MODULE_DEVICE_TABLE definitions
  ARM: dts: exynos4412-trats2: enable exynos/fimd node
  ARM: dts: exynos4210-trats: enable exynos/fimd node
  ARM: dts: exynos4412-trats2: add panel node
  ARM: dts: exynos4210-trats: add panel node
  ARM: dts: exynos4: add MIPI DSI Master node
  drm/panel: add S6E8AA0 driver
  ARM: dts: exynos4210-universal_c210: add proper panel node
  drm/panel: add ld9040 driver
  panel/ld9040: add DT bindings
  panel/s6e8aa0: add DT bindings
  drm/exynos: add DSIM driver
  exynos/dsim: add DT bindings
  drm/exynos: disallow fbdev initialization if no device is connected
  drm/mipi_dsi: create dsi devices only for nodes with reg property
  drm/mipi_dsi: add flags to DSI messages
  Skip intel_crt_init for Dell XPS 8700
  ...
2014-04-08 09:52:16 -07:00