Commit Graph

44 Commits

Author SHA1 Message Date
Al Viro d10e8def07 vfs: take mnt_master to struct mount
make IS_MNT_SLAVE take struct mount * at the same time

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:08 -05:00
Al Viro 14cf1fa8f5 vfs: spread struct mount - remaining argument of mnt_set_mountpoint()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:07 -05:00
Al Viro a8d56d8e4f vfs: spread struct mount - propagate_mnt()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:07 -05:00
Al Viro c937135d98 vfs: spread struct mount - shared subtree iterators
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:07 -05:00
Al Viro 6fc7871fed vfs: spread struct mount - get_dominating_id / do_make_slave
next pile of horrors, similar to mnt_parent one; this time it's
mnt_master.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:06 -05:00
Al Viro 6b41d536f7 vfs: take mnt_child/mnt_mounts to struct mount
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:06 -05:00
Al Viro 83adc75322 vfs: spread struct mount - work with counters
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:05 -05:00
Al Viro a73324da7a vfs: move mnt_mountpoint to struct mount
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:05 -05:00
Al Viro 0714a53380 vfs: now it can be done - make mnt_parent point to struct mount
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:05 -05:00
Al Viro 3376f34fff vfs: mnt_parent moved to struct mount
the second victim...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:04 -05:00
Al Viro 643822b41e vfs: spread struct mount - is_path_reachable
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:04 -05:00
Al Viro 1ab5973862 vfs: spread struct mount - do_umount/propagate_mount_busy
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:03 -05:00
Al Viro 44d964d609 vfs: spread struct mount mnt_set_mountpoint child argument
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:03 -05:00
Al Viro 87129cc0e3 vfs: spread struct mount - clone_mnt/copy_tree argument
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:03 -05:00
Al Viro 761d5c38eb vfs: spread struct mount - umount_tree argument
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:02 -05:00
Al Viro 1b8e5564b9 vfs: the first spoils - mnt_hash moved
taken out of struct vfsmount into struct mount

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:02 -05:00
Al Viro cb338d06e9 vfs: spread struct mount - clone_mnt/copy_tree result
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:01 -05:00
Al Viro 0f0afb1dcf vfs: spread struct mount - change_mnt_propagation/set_mnt_shared
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:57:01 -05:00
Al Viro 4b8b21f4fe vfs: spread struct mount - mount group id handling
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:56:59 -05:00
Al Viro 61ef47b1e4 vfs: spread struct mount - __propagate_umount() argument
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:56:58 -05:00
Al Viro c71053659e vfs: spread struct mount - __lookup_mnt() result
switch __lookup_mnt() to returning struct mount *; callers adjusted.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:56:58 -05:00
Al Viro afac7cba7e vfs: more mnt_parent cleanups
a) mount --move is checking that ->mnt_parent is non-NULL before
looking if that parent happens to be shared; ->mnt_parent is never
NULL and it's not even an misspelled !mnt_has_parent()

b) pivot_root open-codes is_path_reachable(), poorly.

c) so does path_is_under(), while we are at it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:36 -05:00
Al Viro b2dba1af3c vfs: new internal helper: mnt_has_parent(mnt)
vfsmounts have ->mnt_parent pointing either to a different vfsmount
or to itself; it's never NULL and termination condition in loops
traversing the tree towards root is mnt == mnt->mnt_parent.  At least
one place (see the next patch) is confused about what's going on;
let's add an explicit helper checking it right way and use it in
all places where we need it.  Not that there had been too many,
but...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:36 -05:00
Nick Piggin b3e19d924b fs: scale mntget/mntput
The problem that this patch aims to fix is vfsmount refcounting scalability.
We need to take a reference on the vfsmount for every successful path lookup,
which often go to the same mount point.

The fundamental difficulty is that a "simple" reference count can never be made
scalable, because any time a reference is dropped, we must check whether that
was the last reference. To do that requires communication with all other CPUs
that may have taken a reference count.

We can make refcounts more scalable in a couple of ways, involving keeping
distributed counters, and checking for the global-zero condition less
frequently.

- check the global sum once every interval (this will delay zero detection
  for some interval, so it's probably a showstopper for vfsmounts).

- keep a local count and only taking the global sum when local reaches 0 (this
  is difficult for vfsmounts, because we can't hold preempt off for the life of
  a reference, so a counter would need to be per-thread or tied strongly to a
  particular CPU which requires more locking).

- keep a local difference of increments and decrements, which allows us to sum
  the total difference and hence find the refcount when summing all CPUs. Then,
  keep a single integer "long" refcount for slow and long lasting references,
  and only take the global sum of local counters when the long refcount is 0.

This last scheme is what I implemented here. Attached mounts and process root
and working directory references are "long" references, and everything else is
a short reference.

This allows scalable vfsmount references during path walking over mounted
subtrees and unattached (lazy umounted) mounts with processes still running
in them.

This results in one fewer atomic op in the fastpath: mntget is now just a
per-CPU inc, rather than an atomic inc; and mntput just requires a spinlock
and non-atomic decrement in the common case. However code is otherwise bigger
and heavier, so single threaded performance is basically a wash.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07 17:50:33 +11:00
Nick Piggin 99b7db7b8f fs: brlock vfsmount_lock
fs: brlock vfsmount_lock

Use a brlock for the vfsmount lock. It must be taken for write whenever
modifying the mount hash or associated fields, and may be taken for read when
performing mount hash lookups.

A new lock is added for the mnt-id allocator, so it doesn't need to take
the heavy vfsmount write-lock.

The number of atomics should remain the same for fastpath rlock cases, though
code would be slightly slower due to per-cpu access. Scalability is not not be
much improved in common cases yet, due to other locks (ie. dcache_lock) getting
in the way. However path lookups crossing mountpoints should be one case where
scalability is improved (currently requiring the global lock).

The slowpath is slower due to use of brlock. On a 64 core, 64 socket, 32 node
Altix system (high latency to remote nodes), a simple umount microbenchmark
(mount --bind mnt mnt2 ; umount mnt2 loop 1000 times), before this patch it
took 6.8s, afterwards took 7.1s, about 5% slower.

Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-18 08:35:48 -04:00