Commit Graph

188 Commits

Author SHA1 Message Date
Trond Myklebust
7531d692d4 SUNRPC: Fix sparse warnings
- net/sunrpc/xprtsock.c:1635:5: warning: symbol 'init_socket_xprt' was not
   declared. Should it be static?
 - net/sunrpc/xprtsock.c:1649:6: warning: symbol 'cleanup_socket_xprt' was
   not declared. Should it be static?

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-05-14 19:33:47 -04:00
Jeff Layton
cd123012d9 RPC: add wrapper for svc_reserve to account for checksum
When the kernel calls svc_reserve to downsize the expected size of an RPC
reply, it fails to account for the possibility of a checksum at the end of
the packet.  If a client mounts a NFSv2/3 with sec=krb5i/p, and does I/O
then you'll generally see messages similar to this in the server's ring
buffer:

RPC request reserved 164 but used 208

While I was never able to verify it, I suspect that this problem is also
the root cause of some oopses I've seen under these conditions:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=227726

This is probably also a problem for other sec= types and for NFSv4.  The
large reserved size for NFSv4 compound packets seems to generally paper
over the problem, however.

This patch adds a wrapper for svc_reserve that accounts for the possibility
of a checksum.  It also fixes up the appropriate callers of svc_reserve to
call the wrapper.  For now, it just uses a hardcoded value that I
determined via testing.  That value may need to be revised upward as things
change, or we may want to eventually add a new auth_op that attempts to
calculate this somehow.

Unfortunately, there doesn't seem to be a good way to reliably determine
the expected checksum length prior to actually calculating it, particularly
with schemes like spkm3.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:54 -07:00
NeilBrown
7ac1bea550 knfsd: rename sk_defer_lock to sk_lock
Now that sk_defer_lock protects two different things, make the name more
generic.

Also don't bother with disabling _bh as the lock is only ever taken from
process context.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:54 -07:00
Chuck Lever
4c2eaf073f SUNRPC: remove old portmapper
net/sunrpc/pmap_clnt.c has been replaced by net/sunrpc/rpcb_clnt.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-04-30 22:17:15 -07:00
Chuck Lever
a509050bd3 SUNRPC: introduce rpcbind: replacement for in-kernel portmapper
Introduce a replacement for the in-kernel portmapper client that supports
all 3 versions of the rpcbind protocol.  This code is not used yet.

Original code by Groupe Bull updated for the latest kernel, with multiple
bug fixes.

Note that rpcb_clnt.c does not yet support registering via versions 3 and
4 of the rpcbind protocol.  That is planned for a later patch.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-04-30 22:17:12 -07:00
Chuck Lever
c5a4dd8b7c SUNRPC: Eliminate side effects from rpc_malloc
Currently rpc_malloc sets req->rq_buffer internally.  Make this a more
generic interface:  return a pointer to the new buffer (or NULL) and
make the caller set req->rq_buffer and req->rq_bufsize.  This looks much
more like kmalloc and eliminates the side effects.

To fix a potential deadlock, this patch also replaces GFP_NOFS with
GFP_NOWAIT in rpc_malloc.  This prevents async RPCs from sleeping outside
the RPC's task scheduler while allocating their buffer.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-04-30 22:17:11 -07:00
Chuck Lever
2bea90d43a SUNRPC: RPC buffer size estimates are too large
The RPC buffer size estimation logic in net/sunrpc/clnt.c always
significantly overestimates the requirements for the buffer size.
A little instrumentation demonstrated that in fact rpc_malloc was never
allocating the buffer from the mempool, but almost always called kmalloc.

To compute the size of the RPC buffer more precisely, split p_bufsiz into
two fields; one for the argument size, and one for the result size.

Then, compute the sum of the exact call and reply header sizes, and split
the RPC buffer precisely between the two.  That should keep almost all RPC
buffers within the 2KiB buffer mempool limit.

And, we can finally be rid of RPC_SLACK_SPACE!

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-04-30 22:17:10 -07:00
NeilBrown
cda1fd4abd [PATCH] knfsd: fix recently introduced problem with shutting down a busy NFS server
When the last thread of nfsd exits, it shuts down all related sockets.  It
currently uses svc_close_socket to do this, but that only is immediately
effective if the socket is not SK_BUSY.

If the socket is busy - i.e.  if a request has arrived that has not yet been
processes - svc_close_socket is not effective and the shutdown process spins.

So create a new svc_force_close_socket which removes the SK_BUSY flag is set
and then calls svc_close_socket.

Also change some open-codes loops in svc_destroy to use
list_for_each_entry_safe.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 09:30:26 -08:00
NeilBrown
5a05ed73e1 [PATCH] knfsd: remove CONFIG_IPV6 ifdefs from sunrpc server code
They don't really save that much, and aren't worth the hassle.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 09:30:26 -08:00
Ralf Baechle
7b8f850beb [PATCH] Fix build errors if bitop functions are do {} while macros
If one of clear_bit, change_bit or set_bit is defined as a do { } while (0)
function usage of these functions in parenthesis like

  (foo_bit(23, &var))

while be expaned to something like

  (do { ... } while (0)}).

resulting in a build error.  This patch removes the useless parenthesis.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:12 -08:00
Eric W. Biederman
50d851f722 [PATCH] sysctl: move CTL_SUNRPC to sysctl.h where it belongs
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:55 -08:00
Trond Myklebust
d9bc125caf Merge branch 'master' of /home/trondmy/kernel/linux-2.6/
Conflicts:

	net/sunrpc/auth_gss/gss_krb5_crypto.c
	net/sunrpc/auth_gss/gss_spkm3_token.c
	net/sunrpc/clnt.c

Merge with mainline and fix conflicts.
2007-02-12 22:43:25 -08:00
Chuck Lever
43d78ef2ba NFS: disconnect before retrying NFSv4 requests over TCP
RFC3530 section 3.1.1 states an NFSv4 client MUST NOT send a request
twice on the same connection unless it is the NULL procedure.  Section
3.1.1 suggests that the client should disconnect and reconnect if it
wants to retry a request.

Implement this by adding an rpc_clnt flag that an ULP can use to
specify that the underlying transport should be disconnected on a
major timeout.  The NFSv4 client asserts this new flag, and requests
no retries after a minor retransmit timeout.

Note that disconnecting on a retransmit is in general not safe to do
if the RPC client does not reuse the TCP port number when reconnecting.

See http://bugzilla.linux-nfs.org/show_bug.cgi?id=6

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-12 22:40:45 -08:00
Chuck Lever
95756482c9 [PATCH] knfsd: SUNRPC: support IPv6 addresses in RPC server's UDP receive path
Add support for IPv6 addresses in the RPC server's UDP receive path.

[akpm@linux-foundation.org: cleanups]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:36 -08:00
Chuck Lever
73df0dbaff [PATCH] knfsd: SUNRPC: Make rq_daddr field address-version independent
The rq_daddr field must support larger addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:36 -08:00
Chuck Lever
27459f0940 [PATCH] knfsd: SUNRPC: Provide room in svc_rqst for larger addresses
Expand the rq_addr field to allow it to contain larger addresses.

Specifically, we replace a 'sockaddr_in' with a 'sockaddr_storage', then
everywhere the 'sockaddr_in' was referenced, we use instead an accessor
function (svc_addr_in) which safely casts the _storage to _in.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:36 -08:00
Chuck Lever
2442222283 [PATCH] knfsd: SUNRPC: Use sockaddr_storage to store address in svc_deferred_req
Sockaddr_storage will allow us to store arbitrary socket addresses in the
svc_deferred_req struct.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:35 -08:00
Chuck Lever
ad06e4bd62 [PATCH] knfsd: SUNRPC: Add a function to format the address in an svc_rqst for printing
There are loads of places where the RPC server assumes that the rq_addr fields
contains an IPv4 address.  Top among these are error and debugging messages
that display the server's IP address.

Let's refactor the address printing into a separate function that's smart
enough to figure out the difference between IPv4 and IPv6 addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:35 -08:00
Chuck Lever
067d781731 [PATCH] knfsd: SUNRPC: Cache remote peer's address in svc_sock
The remote peer's address won't change after the socket has been accepted.  We
don't need to call ->getname on every incoming request.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:35 -08:00
Chuck Lever
482fb94e1b [PATCH] knfsd: SUNRPC: allow creating an RPC service without registering with portmapper
Sometimes we need to create an RPC service but not register it with the local
portmapper.  NFSv4 delegation callback, for example.

Change the svc_makesock() API to allow optionally creating temporary or
permanent sockets, optionally registering with the local portmapper, and make
it return the ephemeral port of the new socket.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:35 -08:00
Chuck Lever
6b174337e5 [PATCH] knfsd: SUNRPC: update internal API: separate pmap register and temp sockets
Currently in the RPC server, registering with the local portmapper and
creating "permanent" sockets are tied together.  Expand the internal APIs to
allow these two socket characteristics to be separately specified.

This will be externalized in the next patch.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:35 -08:00
NeilBrown
aaf68cfbf2 [PATCH] knfsd: fix a race in closing NFSd connections
If you lose this race, it can iput a socket inode twice and you get a BUG
in fs/inode.c

When I added the option for user-space to close a socket, I added some
cruft to svc_delete_socket so that I could call that function when closing
a socket per user-space request.

This was the wrong thing to do.  I should have just set SK_CLOSE and let
normal mechanisms do the work.

Not only wrong, but buggy.  The locking is all wrong and it openned up a
race where-by a socket could be closed twice.

So this patch:
  Introduces svc_close_socket which sets SK_CLOSE then either leave
  the close up to a thread, or calls svc_delete_socket if it can
  get SK_BUSY.

  Adds a bias to sk_busy which is removed when SK_DEAD is set,
  This avoid races around shutting down the socket.

  Changes several 'spin_lock' to 'spin_lock_bh' where the _bh
  was missing.

Bugzilla-url: http://bugzilla.kernel.org/show_bug.cgi?id=7916

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-09 09:25:47 -08:00
Trond Myklebust
2efef837fb RPC: Clean up rpc_execute...
The error values are already propagated through task->tk_status, and
none of the callers check one without checking the other, so we can
drop the return value.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-03 15:35:03 -08:00
NeilBrown
250f391518 [PATCH] knfsd: fix an NFSD bug with full sized, non-page-aligned reads
NFSd assumes that largest number of pages that will be needed for a
request+response is 2+N where N pages is the size of the largest permitted
read/write request.  The '2' are 1 for the non-data part of the request, and 1
for the non-data part of the reply.

However, when a read request is not page-aligned, and we choose to use
->sendfile to send it directly from the page cache, we may need N+1 pages to
hold the whole reply.  This can overflow and array and cause an Oops.

This patch increases size of the array for holding pages by one and makes sure
that entry is NULL when it is not in use.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26 13:50:59 -08:00
Trond Myklebust
bde8f00ce6 [PATCH] NFS: Fix Oops in rpc_call_sync()
Fix the Oops in http://bugzilla.linux-nfs.org/show_bug.cgi?id=138
We shouldn't be calling rpc_release_task() for tasks that are not active.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-24 12:31:06 -08:00