Commit Graph

589110 Commits

Author SHA1 Message Date
Al Viro 1643b43fbd lookup_open(): lift the "fallback to !O_CREAT" logics from atomic_open()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:51:15 -04:00
Al Viro b3d58eaffb atomic_open(): be paranoid about may_open() return value
It should never return positives; however, with Linux S&M crowd
involved, no bogosity is impossible.  Results would be unpleasant...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:51:14 -04:00
Al Viro 0fb1ea0933 atomic_open(): delay open_to_namei_flags() until the method call
nobody else needs that transformation.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:51:14 -04:00
Al Viro fe9ec8291f do_last(): take fput() on error after opening to out:
make it conditional on *opened & FILE_OPENED; in addition to getting
rid of exit_fput: thing, it simplifies atomic_open() cleanup on
may_open() failure.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:51:13 -04:00
Al Viro 47f9dbd387 do_last(): get rid of duplicate ELOOP check
may_open() will catch it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:51:13 -04:00
Al Viro 55db2fd936 atomic_open(): massage the create_error logics a bit
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:51:12 -04:00
Al Viro 9d0728e16e atomic_open(): consolidate "overridden ENOENT" in open-yourself cases
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:51:12 -04:00
Al Viro 5249e411b4 atomic_open(): don't bother with EEXIST check - it's done in do_last()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:51:11 -04:00
Al Viro df889b3631 Merge branch 'for-linus' into work.lookups 2016-05-02 19:49:46 -04:00
Al Viro ce8644fcad lookup_open(): expand the call of vfs_create()
Lift IS_DEADDIR handling up into the part common with atomic_open(),
remove it from the latter.  Collapse permission checks into the
call of may_o_create(), getting it closer to atomic_open() case.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:33 -04:00
Al Viro 6ac087099e path_openat(): take O_PATH handling out of do_last()
do_last() and lookup_open() simpler that way and so does O_PATH
itself.  As it bloody well should: we find what the pathname
resolves to, same way as in stat() et.al. and associate it with
FMODE_PATH struct file.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:33 -04:00
Al Viro 3b0a3c1ac1 simple local filesystems: switch to ->iterate_shared()
no changes needed (XFS isn't simple, but it has the same parallelism
in the interesting parts exercised from CXFS).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:32 -04:00
Al Viro 4e82901cd6 dcache_{readdir,dir_lseek}() users: switch to ->iterate_shared
no need to lock directory in dcache_dir_lseek(), while we are
at it - per-struct file exclusion is enough.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:32 -04:00
Al Viro 3125d2650c cifs: switch to ->iterate_shared()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:31 -04:00
Al Viro d9b3dbdcfd fuse: switch to ->iterate_shared()
Switch dcache pre-seeding on readdir to d_alloc_parallel();
nothing else is needed.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:31 -04:00
Al Viro f50752eaa0 switch all procfs directories ->iterate_shared()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:30 -04:00
Al Viro 76aab3ab61 proc_sys_fill_cache(): switch to d_alloc_parallel()
make it usable with directory locked shared

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:30 -04:00
Al Viro 3781764b5c proc_fill_cache(): switch to d_alloc_parallel()
... making it usable with directory locked shared

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:29 -04:00
Al Viro 6192269444 introduce a parallel variant of ->iterate()
New method: ->iterate_shared().  Same arguments as in ->iterate(),
called with the directory locked only shared.  Once all filesystems
switch, the old one will be gone.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:29 -04:00
Al Viro 63b6df1413 give readdir(2)/getdents(2)/etc. uniform exclusion with lseek()
same as read() on regular files has, and for the same reason.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:28 -04:00
Al Viro 9902af79c0 parallel lookups: actual switch to rwsem
ta-da!

The main issue is the lack of down_write_killable(), so the places
like readdir.c switched to plain inode_lock(); once killable
variants of rwsem primitives appear, that'll be dealt with.

lockdep side also might need more work

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:28 -04:00
Al Viro d9171b9345 parallel lookups machinery, part 4 (and last)
If we *do* run into an in-lookup match, we need to wait for it to
cease being in-lookup.  Fortunately, we do have unused space in
in-lookup dentries - d_lru is never looked at until it stops being
in-lookup.

So we can stash a pointer to wait_queue_head from stack frame of
the caller of ->lookup().  Some precautions are needed while
waiting, but it's not that hard - we do hold a reference to dentry
we are waiting for, so it can't go away.  If it's found to be
in-lookup the wait_queue_head is still alive and will remain so
at least while ->d_lock is held.  Moreover, the condition we
are waiting for becomes true at the same point where everything
on that wq gets woken up, so we can just add ourselves to the
queue once.

d_alloc_parallel() gets a pointer to wait_queue_head_t from its
caller; lookup_slow() adjusted, d_add_ci() taught to use
d_alloc_parallel() if the dentry passed to it happens to be
in-lookup one (i.e. if it's been called from the parallel lookup).

That's pretty much it - all that remains is to switch ->i_mutex
to rwsem and have lookup_slow() take it shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:27 -04:00
Al Viro 94bdd655ca parallel lookups machinery, part 3
We will need to be able to check if there is an in-lookup
dentry with matching parent/name.  Right now it's impossible,
but as soon as start locking directories shared such beasts
will appear.

Add a secondary hash for locating those.  Hash chains go through
the same space where d_alias will be once it's not in-lookup anymore.
Search is done under the same bitlock we use for modifications -
with the primary hash we can rely on d_rehash() into the wrong
chain being the worst that could happen, but here the pointers are
buggered once it's removed from the chain.  On the other hand,
the chains are not going to be long and normally we'll end up
adding to the chain anyway.  That allows us to avoid bothering with
->d_lock when doing the comparisons - everything is stable until
removed from chain.

New helper: d_alloc_parallel().  Right now it allocates, verifies
that no hashed and in-lookup matches exist and adds to in-lookup
hash.

Returns ERR_PTR() for error, hashed match (in the unlikely case it's
been found) or new dentry.  In-lookup matches trigger BUG() for
now; that will change in the next commit when we introduce waiting
for ongoing lookup to finish.  Note that in-lookup matches won't be
possible until we actually go for shared locking.

lookup_slow() switched to use of d_alloc_parallel().

Again, these commits are separated only for making it easier to
review.  All this machinery will start doing something useful only
when we go for shared locking; it's just that the combination is
too large for my taste.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:27 -04:00
Al Viro 84e710da2a parallel lookups machinery, part 2
We'll need to verify that there's neither a hashed nor in-lookup
dentry with desired parent/name before adding to in-lookup set.

One possible solution would be to hold the parent's ->d_lock through
both checks, but while the in-lookup set is relatively small at any
time, dcache is not.  And holding the parent's ->d_lock through
something like __d_lookup_rcu() would suck too badly.

So we leave the parent's ->d_lock alone, which means that we watch
out for the following scenario:
	* we verify that there's no hashed match
	* existing in-lookup match gets hashed by another process
	* we verify that there's no in-lookup matches and decide
that everything's fine.

Solution: per-directory kinda-sorta seqlock, bumped around the times
we hash something that used to be in-lookup or move (and hash)
something in place of in-lookup.  Then the above would turn into
	* read the counter
	* do dcache lookup
	* if no matches found, check for in-lookup matches
	* if there had been none of those either, check if the
counter has changed; repeat if it has.

The "kinda-sorta" part is due to the fact that we don't have much spare
space in inode.  There is a spare word (shared with i_bdev/i_cdev/i_pipe),
so the counter part is not a problem, but spinlock is a different story.

We could use the parent's ->d_lock, and it would be less painful in
terms of contention, for __d_add() it would be rather inconvenient to
grab; we could do that (using lock_parent()), but...

Fortunately, we can get serialization on the counter itself, and it
might be a good idea in general; we can use cmpxchg() in a loop to
get from even to odd and smp_store_release() from odd to even.

This commit adds the counter and updating logics; the readers will be
added in the next commit.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:26 -04:00
Al Viro 85c7f81041 beginning of transition to parallel lookups - marking in-lookup dentries
marked as such when (would be) parallel lookup is about to pass them
to actual ->lookup(); unmarked when
	* __d_add() is about to make it hashed, positive or not.
	* __d_move() (from d_splice_alias(), directly or via
__d_unalias()) puts a preexisting dentry in its place
	* in caller of ->lookup() if it has escaped all of the
above.  Bug (WARN_ON, actually) if it reaches the final dput()
or d_instantiate() while still marked such.

As the result, we are guaranteed that for as long as the flag is
set, dentry will
	* remain negative unhashed with positive refcount
	* never have its ->d_alias looked at
	* never have its ->d_lru looked at
	* never have its ->d_parent and ->d_name changed

Right now we have at most one such for any given parent directory.
With parallel lookups that restriction will weaken to
	* only exist when parent is locked shared
	* at most one with given (parent,name) pair (comparison of
names is according to ->d_compare())
	* only exist when there's no hashed dentry with the same
(parent,name)

Transition will take the next several commits; unfortunately, we'll
only be able to switch to rwsem at the end of this series.  The
reason for not making it a single patch is to simplify review.

New primitives: d_in_lookup() (a predicate checking if dentry is in
the in-lookup state) and d_lookup_done() (tells the system that
we are done with lookup and if it's still marked as in-lookup, it
should cease to be such).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:47:51 -04:00