Commit Graph

60 Commits

Author SHA1 Message Date
Al Viro
c63181e6b6 vfs: move fsnotify junk to struct mount
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:12 -05:00
Al Viro
52ba1621de vfs: move mnt_devname
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:11 -05:00
Al Viro
1a4eeaf2a8 vfs: move mnt_list to struct mount
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:11 -05:00
Al Viro
863d684f94 vfs: move the rest of int fields to struct mount
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:10 -05:00
Al Viro
15169fe784 vfs: mnt_id/mnt_group_id moved
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:10 -05:00
Al Viro
143c8c91ce vfs: mnt_ns moved to struct mount
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:09 -05:00
Al Viro
6776db3d32 vfs: take mnt_share/mnt_slave/mnt_slave_list and mnt_expire to struct mount
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:08 -05:00
Al Viro
d10e8def07 vfs: take mnt_master to struct mount
make IS_MNT_SLAVE take struct mount * at the same time

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:08 -05:00
Al Viro
6b41d536f7 vfs: take mnt_child/mnt_mounts to struct mount
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:06 -05:00
Al Viro
68e8a9feab vfs: all counters taken to struct mount
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:06 -05:00
Al Viro
a73324da7a vfs: move mnt_mountpoint to struct mount
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:05 -05:00
Al Viro
3376f34fff vfs: mnt_parent moved to struct mount
the second victim...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:04 -05:00
Al Viro
1b8e5564b9 vfs: the first spoils - mnt_hash moved
taken out of struct vfsmount into struct mount

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:02 -05:00
Al Viro
2a79f17e4a vfs: mnt_drop_write_file()
new helper (wrapper around mnt_drop_write()) to be used in pair with
mnt_want_write_file().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:40 -05:00
Al Viro
79e801a906 vfs: make do_kern_mount() static
the only user outside of fs/namespace.c has died

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:39 -05:00
Arun Sharma
60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Al Viro
f03c65993b sanitize vfsmount refcounting changes
Instead of splitting refcount between (per-cpu) mnt_count
and (SMP-only) mnt_longrefs, make all references contribute
to mnt_count again and keep track of how many are longterm
ones.

Accounting rules for longterm count:
	* 1 for each fs_struct.root.mnt
	* 1 for each fs_struct.pwd.mnt
	* 1 for having non-NULL ->mnt_ns
	* decrement to 0 happens only under vfsmount lock exclusive

That allows nice common case for mntput() - since we can't drop the
final reference until after mnt_longterm has reached 0 due to the rules
above, mntput() can grab vfsmount lock shared and check mnt_longterm.
If it turns out to be non-zero (which is the common case), we know
that this is not the final mntput() and can just blindly decrement
percpu mnt_count.  Otherwise we grab vfsmount lock exclusive and
do usual decrement-and-check of percpu mnt_count.

For fs_struct.c we have mnt_make_longterm() and mnt_make_shortterm();
namespace.c uses the latter in places where we don't already hold
vfsmount lock exclusive and opencodes a few remaining spots where
we need to manipulate mnt_longterm.

Note that we mostly revert the code outside of fs/namespace.c back
to what we used to have; in particular, normal code doesn't need
to care about two kinds of references, etc.  And we get to keep
the optimization Nick's variant had bought us...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-01-16 13:47:07 -05:00
David Howells
ea5b778a8b Unexport do_add_mount() and add in follow_automount(), not ->d_automount()
Unexport do_add_mount() and make ->d_automount() return the vfsmount to be
added rather than calling do_add_mount() itself.  follow_automount() will then
do the addition.

This slightly complicates things as ->d_automount() normally wants to add the
new vfsmount to an expiration list and start an expiration timer.  The problem
with that is that the vfsmount will be deleted if it has a refcount of 1 and
the timer will not repeat if the expiration list is empty.

To this end, we require the vfsmount to be returned from d_automount() with a
refcount of (at least) 2.  One of these refs will be dropped unconditionally.
In addition, follow_automount() must get a 3rd ref around the call to
do_add_mount() lest it eat a ref and return an error, leaving the mount we
have open to being expired as we would otherwise have only 1 ref on it.

d_automount() should also add the the vfsmount to the expiration list (by
calling mnt_set_expiry()) and start the expiration timer before returning, if
this mechanism is to be used.  The vfsmount will be unlinked from the
expiration list by follow_automount() if do_add_mount() fails.

This patch also fixes the call to do_add_mount() for AFS to propagate the mount
flags from the parent vfsmount.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-01-15 20:07:48 -05:00
Nick Piggin
b3e19d924b fs: scale mntget/mntput
The problem that this patch aims to fix is vfsmount refcounting scalability.
We need to take a reference on the vfsmount for every successful path lookup,
which often go to the same mount point.

The fundamental difficulty is that a "simple" reference count can never be made
scalable, because any time a reference is dropped, we must check whether that
was the last reference. To do that requires communication with all other CPUs
that may have taken a reference count.

We can make refcounts more scalable in a couple of ways, involving keeping
distributed counters, and checking for the global-zero condition less
frequently.

- check the global sum once every interval (this will delay zero detection
  for some interval, so it's probably a showstopper for vfsmounts).

- keep a local count and only taking the global sum when local reaches 0 (this
  is difficult for vfsmounts, because we can't hold preempt off for the life of
  a reference, so a counter would need to be per-thread or tied strongly to a
  particular CPU which requires more locking).

- keep a local difference of increments and decrements, which allows us to sum
  the total difference and hence find the refcount when summing all CPUs. Then,
  keep a single integer "long" refcount for slow and long lasting references,
  and only take the global sum of local counters when the long refcount is 0.

This last scheme is what I implemented here. Attached mounts and process root
and working directory references are "long" references, and everything else is
a short reference.

This allows scalable vfsmount references during path walking over mounted
subtrees and unattached (lazy umounted) mounts with processes still running
in them.

This results in one fewer atomic op in the fastpath: mntget is now just a
per-CPU inc, rather than an atomic inc; and mntput just requires a spinlock
and non-atomic decrement in the common case. However code is otherwise bigger
and heavier, so single threaded performance is basically a wash.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07 17:50:33 +11:00
Miklos Szeredi
532490f0a5 vfs: remove unused MNT_STRICTATIME
Commit d0adde574b added MNT_STRICTATIME
but it isn't actually used (MS_STRICTATIME clears MNT_RELATIME and
MNT_NOATIME rather than setting any mount flag).

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-11 00:29:47 -04:00
Andreas Gruenbacher
2504c5d63b fsnotify/vfsmount: add fsnotify fields to struct vfsmount
This patch adds the list and mask fields needed to support vfsmount marks.
These are the same fields fsnotify needs on an inode.  They are not used,
just declared and we note where the cleanup hook should be (the function is
not yet defined)

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Linus Torvalds
0f2cc4ecd8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
  init: Open /dev/console from rootfs
  mqueue: fix typo "failues" -> "failures"
  mqueue: only set error codes if they are really necessary
  mqueue: simplify do_open() error handling
  mqueue: apply mathematics distributivity on mq_bytes calculation
  mqueue: remove unneeded info->messages initialization
  mqueue: fix mq_open() file descriptor leak on user-space processes
  fix race in d_splice_alias()
  set S_DEAD on unlink() and non-directory rename() victims
  vfs: add NOFOLLOW flag to umount(2)
  get rid of ->mnt_parent in tomoyo/realpath
  hppfs can use existing proc_mnt, no need for do_kern_mount() in there
  Mirror MS_KERNMOUNT in ->mnt_flags
  get rid of useless vfsmount_lock use in put_mnt_ns()
  Take vfsmount_lock to fs/internal.h
  get rid of insanity with namespace roots in tomoyo
  take check for new events in namespace (guts of mounts_poll()) to namespace.c
  Don't mess with generic_permission() under ->d_lock in hpfs
  sanitize const/signedness for udf
  nilfs: sanitize const/signedness in dealing with ->d_name.name
  ...

Fix up fairly trivial (famous last words...) conflicts in
drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c
2010-03-04 08:15:33 -08:00
Al Viro
8089352a13 Mirror MS_KERNMOUNT in ->mnt_flags
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:08:00 -05:00
Al Viro
47cd813f29 Take vfsmount_lock to fs/internal.h
no more users left outside of fs/*.c (and very few outside of
fs/namespace.c, actually)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:59 -05:00
Valerie Aurora
495d6c9c65 VFS: Clean up shared mount flag propagation
The handling of mount flags in set_mnt_shared() got a little tangled
up during previous cleanups, with the following problems:

* MNT_PNODE_MASK is defined as a literal constant when it should be a
bitwise xor of other MNT_* flags
* set_mnt_shared() clears and then sets MNT_SHARED (part of MNT_PNODE_MASK)
* MNT_PNODE_MASK could use a comment in mount.h
* MNT_PNODE_MASK is a terrible name, change to MNT_SHARED_MASK

This patch fixes these problems.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:55 -05:00