120 Commits

Author SHA1 Message Date
Stanislav Kinsbursky 5ecebb7c7f SUNRPC: unregister service on creation in current network namespace
On service shutdown we can be sure that no users of it are left except the
current one. Thus using the current network namespace context looks safe in
this case.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:14 -05:00
Stanislav Kinsbursky bee42f688c SUNRPC: register service on creation in current network namespace
Services using rpcbind (Lockd, NFSd) are started by a userspace call, and thus
we can use the current network namespace.
There could be a problem with the NFSd service, because its creation can be
triggered through the NFSd fs from a different network namespace. But this is
part of the "NFSd per net ns" task and will be fixed in the future.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:14 -05:00
Stanislav Kinsbursky 5247fab5c8 SUNRPC: pass network namespace to service registering routines
Lockd and NFSd services will handle requests from and to many network
namespaces, and thus have to be registered and unregistered per network
namespace.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:13 -05:00
Stanislav Kinsbursky f7a30c18e8 SUNRPC: parametrize local rpcbind clients creation with net ns
These clients are per network namespace and thus can be created for different
network namespaces.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:11 -05:00
Stanislav Kinsbursky 977ac31573 SUNRPC: register rpcbind programs in passed network namespace context
Registering an rpcbind program requires rpcbind clients, which are per network
namespace context.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:10 -05:00
Linus Torvalds 0b48d42235 Merge branch 'for-3.3' of git://linux-nfs.org/~bfields/linux
* 'for-3.3' of git://linux-nfs.org/~bfields/linux: (31 commits)
  nfsd4: nfsd4_create_clid_dir return value is unused
  NFSD: Change name of extended attribute containing junction
  svcrpc: don't revert to SVC_POOL_DEFAULT on nfsd shutdown
  svcrpc: fix double-free on shutdown of nfsd after changing pool mode
  nfsd4: be forgiving in the absence of the recovery directory
  nfsd4: fix spurious 4.1 post-reboot failures
  NFSD: forget_delegations should use list_for_each_entry_safe
  NFSD: Only reinitialize the recall_lru list under the recall lock
  nfsd4: initialize special stateid's at compile time
  NFSd: use network-namespace-aware cache registering routines
  SUNRPC: create svc_xprt in proper network namespace
  svcrpc: update outdated BKL comment
  nfsd41: allow non-reclaim open-by-fh's in 4.1
  svcrpc: avoid memory-corruption on pool shutdown
  svcrpc: destroy server sockets all at once
  svcrpc: make svc_delete_xprt static
  nfsd: Fix oops when parsing a 0 length export
  nfsd4: Use kmemdup rather than duplicating its implementation
  nfsd4: add a separate (lockowner, inode) lookup
  nfsd4: fix CONFIG_NFSD_FAULT_INJECTION compile error
  ...
2012-01-14 12:26:41 -08:00
J. Bruce Fields 9689dcce0b svcrpc: don't revert to SVC_POOL_DEFAULT on nfsd shutdown
This was unexpected behavior (at least for me)--why would you want
configuration settings automatically lost on nfsd restart?

In practice this won't affect distributions, which likely set everything
on every startup.  But I'd expect the behavior to be less confusing to
someone manually restarting nfsd for testing.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-01-05 15:35:56 -05:00
J. Bruce Fields 61c8504c42 svcrpc: fix double-free on shutdown of nfsd after changing pool mode
The pool_to and to_pool fields of the global svc_pool_map are freed on
shutdown, but are initialized in nfsd startup only in the
SVC_POOL_PERCPU and SVC_POOL_PERNODE cases.

They *are* initialized to zero on kernel startup.  So as long as you use
only SVC_POOL_GLOBAL (the default), this will never be a problem.

You're also OK if you only ever use SVC_POOL_PERCPU or SVC_POOL_PERNODE.

However, the following sequence of events leads to a double-free:

	1. set SVC_POOL_PERCPU or SVC_POOL_PERNODE
	2. start nfsd: both fields are initialized.
	3. shutdown nfsd: both fields are freed.
	4. set SVC_POOL_GLOBAL
	5. start nfsd: the fields are left untouched.
	6. shutdown nfsd: now we try to free them again.

Step 4 is actually unnecessary, since (for some bizarre reason), nfsd
automatically resets the pool mode to SVC_POOL_GLOBAL on shutdown.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-01-05 15:35:55 -05:00
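
The six-step sequence in 61c8504c42 can be reproduced in miniature in user space. The sketch below is not the kernel code: `struct pool_map`, `map_start()`, and `map_stop()` are hypothetical stand-ins for svc_pool_map and the nfsd startup and shutdown paths. Clearing the pointers after freeing them is one common way to make a repeated shutdown harmless; it is shown here as an assumption about how such a bug can be closed, not as a copy of the patch.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's pool modes. */
enum pool_mode { POOL_GLOBAL, POOL_PERCPU, POOL_PERNODE };

/* Hypothetical miniature of the global svc_pool_map. */
struct pool_map {
	enum pool_mode mode;
	unsigned int *to_pool;	/* allocated only in PERCPU/PERNODE modes */
	unsigned int *pool_to;
};

/* Startup: allocate the lookup tables only for the per-CPU/per-node modes. */
static int map_start(struct pool_map *m, enum pool_mode mode, size_t n)
{
	m->mode = mode;
	if (mode == POOL_GLOBAL)
		return 0;	/* fields are left untouched, as in step 5 */

	m->to_pool = calloc(n, sizeof(*m->to_pool));
	m->pool_to = calloc(n, sizeof(*m->pool_to));
	if (!m->to_pool || !m->pool_to) {
		free(m->to_pool);
		free(m->pool_to);
		m->to_pool = NULL;
		m->pool_to = NULL;
		return -1;
	}
	return 0;
}

/*
 * Shutdown: free both tables and clear the pointers. Without the two
 * NULL assignments, a later GLOBAL start/stop cycle would pass the same
 * stale pointers to free() a second time (step 6).
 */
static void map_stop(struct pool_map *m)
{
	free(m->to_pool);
	free(m->pool_to);
	m->to_pool = NULL;
	m->pool_to = NULL;
}
```

Because free(NULL) is a no-op, running map_stop() after a POOL_GLOBAL start is safe once the pointers have been cleared.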
Eric Dumazet dfd56b8b38 net: use IS_ENABLED(CONFIG_IPV6)
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-11 18:25:16 -05:00
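
To illustrate what dfd56b8b38 replaces, here is a user-space sketch of an IS_ENABLED-style macro. The `SK_`-prefixed macros and the two helper functions are hypothetical names; the placeholder trick is modelled on the approach used in later versions of include/linux/kconfig.h and is not claimed to match the macro as it existed at the time of this commit. The `#define` lines for the CONFIG_ symbols simulate what Kconfig would generate.

```c
/* Pretend Kconfig built IPv6 as a module; CONFIG_IPV6 itself is undefined. */
#define CONFIG_IPV6_MODULE 1
/* Pretend a second, hypothetical option was built in. */
#define CONFIG_EXAMPLE 1

/*
 * If the option is defined to 1, pasting yields SK_ARG_PLACEHOLDER_1,
 * which expands to "0," and shifts a literal 1 into the second argument
 * slot. If the option is undefined, the pasted token is junk, no comma
 * appears, and the second argument is 0.
 */
#define SK_ARG_PLACEHOLDER_1 0,
#define SK_TAKE_SECOND(ignored, val, ...) val
#define SK_IS_DEFINED(x) SK_IS_DEFINED_1(x)
#define SK_IS_DEFINED_1(val) SK_IS_DEFINED_2(SK_ARG_PLACEHOLDER_##val)
#define SK_IS_DEFINED_2(arg1_or_junk) SK_TAKE_SECOND(arg1_or_junk 1, 0, 0)

/* True if the option is built in or built as a module. */
#define SK_IS_ENABLED(option) \
	(SK_IS_DEFINED(option) || SK_IS_DEFINED(option##_MODULE))

/* The form the commit removes. */
static int ipv6_old_style(void)
{
#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
	return 1;
#else
	return 0;
#endif
}

/* The form the commit introduces, using the sketch macro. */
static int ipv6_new_style(void)
{
	return SK_IS_ENABLED(CONFIG_IPV6);
}
```

Both helpers give the same answer for any combination of the two symbols; the second form can also be used in ordinary C expressions, not only in preprocessor conditionals.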
J. Bruce Fields 94cf3179cc svcrpc: update outdated BKL comment
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-12-06 16:20:42 -05:00
J. Bruce Fields b4f36f88b3 svcrpc: avoid memory-corruption on pool shutdown
Socket callbacks use svc_xprt_enqueue() to add an xprt to a
pool->sp_sockets list.  In normal operation a server thread will later
come along and take the xprt off that list.  On shutdown, after all the
threads have exited, we instead manually walk the sv_tempsocks and
sv_permsocks lists to find all the xprt's and delete them.

So the sp_sockets lists don't really matter any more.  As a result,
we've mostly just ignored them and hoped they would go away.

Which has gotten us into trouble; witness for example ebc63e531c
"svcrpc: fix list-corrupting race on nfsd shutdown", the result of Ben
Greear noticing that a still-running svc_xprt_enqueue() could re-add an
xprt to an sp_sockets list just before it was deleted.  The fix was to
remove it from the list at the end of svc_delete_xprt().  But that only
made corruption less likely--I can see nothing that prevents a
svc_xprt_enqueue() from adding another xprt to the list at the same
moment that we're removing this xprt from the list.  In fact, despite
the earlier xpo_detach(), I don't even see what guarantees that
svc_xprt_enqueue() couldn't still be running on this xprt.

So, instead, note that svc_xprt_enqueue() essentially does:
	lock sp_lock
		if XPT_BUSY unset
			add to sp_sockets
	unlock sp_lock

So, if we do:

	set XPT_BUSY on every xprt.
	Empty every sp_sockets list, under the sp_lock locks.

Then we're left knowing that the sp_sockets lists are all empty and will
stay that way, since any svc_xprt_enqueue() will check XPT_BUSY under
the sp_lock and see it set.

And *then* we can continue deleting the xprt's.

(Thanks to Jeff Layton for being correctly suspicious of this code....)

Cc: Ben Greear <greearb@candelatech.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-12-06 16:18:58 -05:00
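
The locking argument in b4f36f88b3 can be sketched in user space with a mutex. This is an illustration under stated assumptions, not the kernel implementation: `struct xprt`, `struct pool`, `pool_enqueue()`, and `pool_shutdown()` are hypothetical names, a plain `bool` stands in for the XPT_BUSY bit, and a pthread mutex stands in for sp_lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct xprt {
	bool busy;		/* stands in for the XPT_BUSY flag */
	struct xprt *next;	/* link in the pool's ready list */
};

struct pool {
	pthread_mutex_t lock;	/* stands in for sp_lock */
	struct xprt *ready;	/* stands in for sp_sockets */
};

/*
 * Mirrors the pseudocode: the flag is tested only while holding the lock.
 * This sketch also sets the flag on success so that the same xprt cannot
 * be queued twice.
 */
static bool pool_enqueue(struct pool *p, struct xprt *x)
{
	bool queued = false;

	pthread_mutex_lock(&p->lock);
	if (!x->busy) {
		x->busy = true;
		x->next = p->ready;
		p->ready = x;
		queued = true;
	}
	pthread_mutex_unlock(&p->lock);
	return queued;
}

/*
 * Shutdown: mark every xprt busy, then empty the ready list under the
 * lock. Any later pool_enqueue() sees the flag and does nothing, so the
 * list stays empty. A plain bool is written under the lock here; the
 * kernel uses atomic bit operations for the flag instead.
 */
static void pool_shutdown(struct pool *p, struct xprt **all, size_t n)
{
	size_t i;

	pthread_mutex_lock(&p->lock);
	for (i = 0; i < n; i++)
		all[i]->busy = true;
	while (p->ready) {
		struct xprt *x = p->ready;

		p->ready = x->next;
		x->next = NULL;
	}
	pthread_mutex_unlock(&p->lock);
}
```

After pool_shutdown() returns, the caller can delete the xprts without racing against a concurrent enqueue, which is the property the commit message argues for.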
J. Bruce Fields 2fefb8a09e svcrpc: destroy server sockets all at once
There's no reason I can see that we need to call sv_shutdown between
closing the two lists of sockets.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-12-06 16:18:52 -05:00
Trond Myklebust 31cbecb4ab Merge branch 'osd-devel' into nfs-for-next
2011-11-02 23:56:40 -04:00
Joe Perches b9075fa968 treewide: use __printf not __attribute__((format(printf,...)))
Standardize the style for compiler based printf format verification.
Standardized the location of __printf too.

Done via script and a little typing.

$ grep -rPl --include=*.[ch] -w "__attribute__" * | \
  grep -vP "^(tools|scripts|include/linux/compiler-gcc.h)" | \
  xargs perl -n -i -e 'local $/; while (<>) { s/\b__attribute__\s*\(\s*\(\s*format\s*\(\s*printf\s*,\s*(.+)\s*,\s*(.+)\s*\)\s*\)\s*\)/__printf($1, $2)/g ; print; }'

[akpm@linux-foundation.org: revert arch bits]
Signed-off-by: Joe Perches <joe@perches.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:54 -07:00
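
For readers unfamiliar with the macro that b9075fa968 standardizes on, the sketch below shows the two spellings side by side in user space. `log_old()` and `log_new()` are hypothetical functions. The macro definition copies the kernel's shape for illustration only; double-underscore names are reserved in user-space C, so a real program would pick a different name.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Arguments: index of the format string, index of the first vararg. */
#define __printf(a, b) __attribute__((format(printf, a, b)))

/* Old style: the attribute spelled out in full after the declarator. */
int log_old(char *buf, size_t len, const char *fmt, ...)
	__attribute__((format(printf, 3, 4)));

/* New style: same compile-time format checking, placed up front. */
__printf(3, 4)
int log_new(char *buf, size_t len, const char *fmt, ...);

int log_old(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}

int log_new(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}
```

With either spelling, a call such as `log_new(buf, sizeof(buf), "%s", 42)` draws a format warning from GCC or Clang when format checking is enabled (for example with -Wall).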
Stanislav Kinsbursky 16d0587090 NFSd: call svc rpcbind cleanup explicitly
We have to call svc_rpcb_cleanup() explicitly from nfsd_last_thread(), since
this function is registered as the service shutdown callback and thus nobody
else will do it for us.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-25 13:19:40 +02:00
Stanislav Kinsbursky 8e356b1e2a SUNRPC: cleanup service destruction
The svc_unregister() call has to be removed from svc_destroy(), since it will
be called in the sv_shutdown callback.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-25 13:19:13 +02:00
Stanislav Kinsbursky e40f5e29ef SUNRPC: setup rpcbind clients if service requires it
A new function, svc_uses_rpcbind(), will be used to detect whether a new
service will send portmapper register calls. For such services we will create
rpcbind clients and remove all stale portmap registrations.
Also, svc_rpcb_cleanup() will be set as the sv_shutdown callback for such
services if this field was not initialized earlier. This allows the rpcbind
clients to be destroyed when no other users of them are left.

Note: Currently, any service being created will be detected as a portmap user.
This is probably wrong, but for now it depends on the program versions'
"vs_hidden" flag.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-25 13:18:42 +02:00
Stanislav Kinsbursky d99085605c SUNRPC: introduce svc helpers for preparing rpcbind infrastructure
These helpers will be used only for services that send portmapper
registration calls.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-25 13:18:05 +02:00
Eric Dumazet 11fd165c68 sunrpc: use better NUMA affinities
Use NUMA aware allocations to reduce latencies and increase throughput.

sunrpc kthreads can use kthread_create_on_node() if pool_mode is
"percpu" or "pernode", and svc_prepare_thread()/svc_init_buffer() can
also take into account NUMA node affinity for memory allocations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: "J. Bruce Fields" <bfields@fieldses.org>
CC: Neil Brown <neilb@suse.de>
CC: David Miller <davem@davemloft.net>
Reviewed-by: Greg Banks <gnb@fastmail.fm>
[bfields@redhat.com: fix up caller nfs41_callback_up]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-19 13:25:36 -04:00
Trond Myklebust 0d961aa934 SUNRPC: Convert the backchannel exports to EXPORT_SYMBOL_GPL
Ensure that the backchannel exports conform to the existing sunrpc
practice.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-15 09:12:23 -04:00
Trond Myklebust 9e00abc3c2 SUNRPC: sunrpc should not explicitly depend on NFS config options
Change explicit references to CONFIG_NFS_V4_1 to implicit ones.
Get rid of the unnecessary defines in backchannel_rqst.c and
bc_svc.c: the Makefile takes care of those dependencies.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-15 09:12:23 -04:00
Chuck Lever 7402ab19cd SUNRPC: Use AF_LOCAL for rpcbind upcalls
As libtirpc does in user space, have our registration API try using an
AF_LOCAL transport first when registering and unregistering.

This means we don't chew up privileged ports, and our registration is
bound to an "owner" (the effective uid of the process on the sending
end of the transport).  Only that "owner" may unregister the service.

The kernel could probe rpcbind via an rpcbind query to determine
whether rpcbind has an AF_LOCAL service. For simplicity, we use the
same technique that libtirpc uses: simply fail over to network
loopback if creating an AF_LOCAL transport to the well-known rpcbind
service socket fails.

This means we open-code the pathname of the rpcbind socket in the
kernel.  For now we have to do that anyway because the kernel's
RPC over AF_LOCAL implementation does not support autobind.  That may
be undesirable in the long term.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-05-27 17:42:47 -04:00
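
The failover strategy described in 7402ab19cd (try AF_LOCAL first, fall back to network loopback) can be sketched in user space as follows. This is not the kernel's RPC code: `connect_local()`, `connect_loopback()`, and `connect_rpcbind()` are hypothetical helpers, and the socket path in `RPCBIND_SOCK_PATH` is an assumption about where rpcbind listens that should be checked on the target system. Port 111 is the well-known rpcbind port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Assumed rpcbind socket path; verify it on your system. */
#define RPCBIND_SOCK_PATH "/var/run/rpcbind.sock"
#define RPCBIND_PORT 111

/* Try a stream connection to an AF_LOCAL socket; return fd or -1. */
static int connect_local(const char *path)
{
	struct sockaddr_un addr;
	int fd;

	if (strlen(path) >= sizeof(addr.sun_path))
		return -1;
	fd = socket(AF_LOCAL, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_LOCAL;
	strcpy(addr.sun_path, path);	/* length checked above */
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Try a TCP connection to 127.0.0.1 on the given port; return fd or -1. */
static int connect_loopback(uint16_t port)
{
	struct sockaddr_in addr;
	int fd;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * AF_LOCAL first; on any failure, fall back to loopback. *used_local
 * reports which transport was chosen.
 */
static int connect_rpcbind(const char *path, uint16_t port, int *used_local)
{
	int fd = connect_local(path);

	*used_local = (fd >= 0);
	if (fd >= 0)
		return fd;
	return connect_loopback(port);
}
```

A caller would use `connect_rpcbind(RPCBIND_SOCK_PATH, RPCBIND_PORT, &used_local)`. No probing query is sent in advance: a failed AF_LOCAL connect is itself the signal to fall back, which is the simplification the commit message describes.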
Linus Torvalds 18bce371ae Merge branch 'for-2.6.38' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: (62 commits)
  nfsd4: fix callback restarting
  nfsd: break lease on unlink, link, and rename
  nfsd4: break lease on nfsd setattr
  nfsd: don't support msnfs export option
  nfsd4: initialize cb_per_client
  nfsd4: allow restarting callbacks
  nfsd4: simplify nfsd4_cb_prepare
  nfsd4: give out delegations more quickly in 4.1 case
  nfsd4: add helper function to run callbacks
  nfsd4: make sure sequence flags are set after destroy_session
  nfsd4: re-probe callback on connection loss
  nfsd4: set sequence flag when backchannel is down
  nfsd4: keep finer-grained callback status
  rpc: allow xprt_class->setup to return a preexisting xprt
  rpc: keep backchannel xprt as long as server connection
  rpc: move sk_bc_xprt to svc_xprt
  nfsd4: allow backchannel recovery
  nfsd4: support BIND_CONN_TO_SESSION
  nfsd4: modify session list under cl_lock
  Documentation: fl_mylease no longer exists
  ...

Fix up conflicts in fs/nfsd/vfs.c with the vfs-scale work.  The
vfs-scale work touched some msnfs cases, and this merge removes support
for that entirely, so the conflict was trivial to resolve.
2011-01-14 13:17:26 -08:00
Andy Adamson 4a19de0f4b NFS rename client back channel transport field
Differentiate from server backchannel

Signed-off-by: Andy Adamson <andros@netapp.com>
Acked-by: Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:25 -05:00
Andy Adamson 1f11a034cd SUNRPC new transport for the NFSv4.1 shared back channel
Move the current sock create and destroy routines into the new transport ops.
The back channel socket will be destroyed by the svc_close_all call in
svc_destroy.

Added check: only TCP supported on shared back channel.

Signed-off-by: Andy Adamson <andros@netapp.com>
Acked-by: Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:23 -05:00