Pull more vfs stuff from Al Viro:
"O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including
making simple_lookup() usable for filesystems with non-NULL s_d_op,
which allows us to get rid of quite a bit of ugliness"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
sunrpc: now we can just set ->s_d_op
cgroup: we can use simple_lookup() now
efivarfs: we can use simple_lookup() now
make simple_lookup() usable for filesystems that set ->s_d_op
configfs: don't open-code d_alloc_name()
__rpc_lookup_create_exclusive: pass string instead of qstr
rpc_create_*_dir: don't bother with qstr
llist: llist_add() can use llist_add_batch()
llist: fix/simplify llist_add() and llist_add_batch()
fput: turn "list_head delayed_fput_list" into llist_head
fs/file_table.c:fput(): add comment
Safer ABI for O_TMPFILE
Pull nfsd changes from Bruce Fields:
"Changes this time include:
- 4.1 enabled on the server by default: the last 4.1-specific issues
I know of are fixed, so we're not going to find the rest of the
bugs without more exposure.
- Experimental support for NFSv4.2 MAC Labeling (to allow running
selinux over NFS), from Dave Quigley.
- Fixes for some delicate cache/upcall races that could cause rare
server hangs; thanks to Neil Brown and Bodo Stroesser for extreme
debugging persistence.
- Fixes for some bugs found at the recent NFS bakeathon, mostly v4
and v4.1-specific, but also a generic bug handling fragmented rpc
calls"
* 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits)
nfsd4: support minorversion 1 by default
nfsd4: allow destroy_session over destroyed session
svcrpc: fix failures to handle -1 uid's
sunrpc: Don't schedule an upcall on a replaced cache entry.
net/sunrpc: xpt_auth_cache should be ignored when expired.
sunrpc/cache: ensure items removed from cache do not have pending upcalls.
sunrpc/cache: use cache_fresh_unlocked consistently and correctly.
sunrpc/cache: remove races with queuing an upcall.
nfsd4: return delegation immediately if lease fails
nfsd4: do not throw away 4.1 lock state on last unlock
nfsd4: delegation-based open reclaims should bypass permissions
svcrpc: don't error out on small tcp fragment
svcrpc: fix handling of too-short rpc's
nfsd4: minor read_buf cleanup
nfsd4: fix decoding of compounds across page boundaries
nfsd4: clean up nfs4_open_delegation
NFSD: Don't give out read delegations on creates
nfsd4: allow client to send no cb_sec flavors
nfsd4: fail attempts to request gss on the backchannel
nfsd4: implement minimal SP4_MACH_CRED
...
commit d202cce896
sunrpc: never return expired entries in sunrpc_cache_lookup
moved the 'entry is expired' test from cache_check to
sunrpc_cache_lookup, so that it happened early and some races could
safely be ignored.
However the ip_map (in svcauth_unix.c) has a separate single-item
cache which allows quick lookup without locking. An entry in this
case would not be subject to the expiry test and so could be used
well after it has expired.
This is not normally a big problem because the first time it is used
after it is expired an up-call will be scheduled to refresh the entry
(if it hasn't been scheduled already) and the old entry will then
be invalidated. So on the second attempt to use it after it has
expired, ip_map_cached_get will discard it.
However that is subtle and not ideal, so replace the "!cache_valid"
test with "cache_is_expired".
In doing this we drop the test on the "CACHE_VALID" bit. This is
unnecessary as the bit is never cleared, and an entry will only
be cached if the bit is set.
Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
It is possible for a race to set CACHE_PENDING after cache_clean()
has removed a cache entry from the cache.
If CACHE_PENDING is still set when the entry is finally 'put',
the cache_dequeue() will never happen and we can leak memory.
So set a new flag 'CACHE_CLEANED' when we remove something from
the cache, and don't queue any upcall if it is set.
If CACHE_PENDING is set before CACHE_CLEANED, the call that
cache_clean() makes to cache_fresh_unlocked() will free memory
as needed. If CACHE_PENDING is set after CACHE_CLEANED, the
test in sunrpc_cache_pipe_upcall will ensure that the memory
is not allocated.
Reported-by: <bstroesser@ts.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Store a pointer to the gss mechanism used in the rq_cred and cl_cred.
This will make it easier to enforce SP4_MACH_CRED, which needs to
compare the mechanism used on the exchange_id with that used on
protected operations.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Note conflict: Chuck's patches modified (and made static)
gss_mech_get_by_OID, which is still needed by gss-proxy patches.
The conflict resolution is a bit minimal; we may want some more cleanup.
We expose this parameter for a future caller.
It will be used to extract the endtime from the gss-proxy upcall mechanism,
in order to set the rsc cache expiration time.
Signed-off-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
In the gss-proxy case we don't want to have to reconnect at random--we
want to connect only on gss-proxy startup when we can steal gss-proxy's
context to do the connect in the right namespace.
So, provide a flag that allows the rpc_create caller to turn off the
idle timeout.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Merging Trond's nfs-for-next branch, mainly to get
b7993cebb8 "SUNRPC: Allow rpc_create() to
request that TCP slots be unlimited", which a small piece of the
gss-proxy work depends on.
* rpcsec_gss-from_cel: (21 commits)
NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE
NFSv4: Don't clear the machine cred when client establish returns EACCES
NFSv4: Fix issues in nfs4_discover_server_trunking
NFSv4: Fix the fallback to AUTH_NULL if krb5i is not available
NFS: Use server-recommended security flavor by default (NFSv3)
SUNRPC: Don't recognize RPC_AUTH_MAXFLAVOR
NFS: Use "krb5i" to establish NFSv4 state whenever possible
NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSEC
NFS: Use static list of security flavors during root FH lookup recovery
NFS: Avoid PUTROOTFH when managing leases
NFS: Clean up nfs4_proc_get_rootfh
NFS: Handle missing rpc.gssd when looking up root FH
SUNRPC: Remove EXPORT_SYMBOL_GPL() from GSS mech switch
SUNRPC: Make gss_mech_get() static
SUNRPC: Refactor nfsd4_do_encode_secinfo()
SUNRPC: Consider qop when looking up pseudoflavors
SUNRPC: Load GSS kernel module by OID
SUNRPC: Introduce rpcauth_get_pseudoflavor()
SUNRPC: Define rpcsec_gss_info structure
NFS: Remove unneeded forward declaration
...
This is mainly for use by NFSv4.1, where the session negotiation
ultimately wants to decide how many RPC slots we can fill.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch ensures that we throttle new RPC requests if there are
requests already waiting in the xprt->backlog queue. The reason for
doing this is to fix livelock issues that can occur when an existing
(high priority) task is waiting in the backlog queue, gets woken up
by xprt_free_slot(), but a new task then steals the slot.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up. This matches a similar API for the client side, and
keeps ULP fingers out the of the GSS mech switch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The NFSv4 SECINFO operation returns a list of security flavors that
the server supports for a particular share. An NFSv4 client is
supposed to pick a pseudoflavor it supports that corresponds to one
of the flavors returned by the server.
GSS flavors in this list have a GSS tuple that identify a specific
GSS pseudoflavor.
Currently our client ignores the GSS tuple's "qop" value. A
matching pseudoflavor is chosen based only on the OID and service
value.
So far this omission has not had much effect on Linux. The NFSv4
protocol currently supports only one qop value: GSS_C_QOP_DEFAULT,
also known as zero.
However, if an NFSv4 server happens to return something other than
zero in the qop field, our client won't notice. This could cause
the client to behave in incorrect ways that could have security
implications.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
A SECINFO reply may contain flavors whose kernel module is not
yet loaded by the client's kernel. A new RPC client API, called
rpcauth_get_pseudoflavor(), is introduced to do proper checking
for support of a security flavor.
When this API is invoked, the RPC client now tries to load the
module for each flavor first before performing the "is this
supported?" check. This means if a module is available on the
client, but has not been loaded yet, it will be loaded and
registered automatically when the SECINFO reply is processed.
The new API can take a full GSS tuple (OID, QoP, and service).
Previously only the OID and service were considered.
nfs_find_best_sec() is updated to verify all flavors requested in a
SECINFO reply, including AUTH_NULL and AUTH_UNIX. Previously these
two flavors were simply assumed to be supported without consulting
the RPC client.
Note that the replaced version of nfs_find_best_sec() can return
RPC_AUTH_MAXFLAVOR if the server returns a recognized OID but an
unsupported "service" value. nfs_find_best_sec() now returns
RPC_AUTH_UNIX in this case.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The NFSv4 SECINFO procedure returns a list of security flavors. Any
GSS flavor also has a GSS tuple containing an OID, a quality-of-
protection value, and a service value, which specifies a particular
GSS pseudoflavor.
For simplicity and efficiency, I'd like to return each GSS tuple
from the NFSv4 SECINFO XDR decoder and pass it straight into the RPC
client.
Define a data structure that is visible to both the NFS client and
the RPC client. Take structure and field names from the relevant
standards to avoid confusion.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Pull NFS client bugfixes from Trond Myklebust:
"We've just concluded another Connectathon interoperability testing
week, and so here are the fixes for the bugs that were discovered:
- Don't allow NFS silly-renamed files to be deleted
- Don't start the retransmission timer when out of socket space
- Fix a couple of pnfs-related Oopses.
- Fix one more NFSv4 state recovery deadlock
- Don't loop forever when LAYOUTGET returns NFS4ERR_LAYOUTTRYLATER"
* tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: One line comment fix
NFSv4.1: LAYOUTGET EDELAY loops timeout to the MDS
SUNRPC: add call to get configured timeout
PNFS: set the default DS timeout to 60 seconds
NFSv4: Fix another open/open_recovery deadlock
nfs: don't allow nfs_find_actor to match inodes of the wrong type
NFSv4.1: Hold reference to layout hdr in layoutget
pnfs: fix resend_to_mds for directio
SUNRPC: Don't start the retransmission timer when out of socket space
NFS: Don't allow NFS silly-renamed files to be deleted, no signal
Pull nfsd changes from J Bruce Fields:
"Miscellaneous bugfixes, plus:
- An overhaul of the DRC cache by Jeff Layton. The main effect is
just to make it larger. This decreases the chances of intermittent
errors especially in the UDP case. But we'll need to watch for any
reports of performance regressions.
- Containerized nfsd: with some limitations, we now support
per-container nfs-service, thanks to extensive work from Stanislav
Kinsbursky over the last year."
Some notes about conflicts, since there were *two* non-data semantic
conflicts here:
- idr_remove_all() had been added by a memory leak fix, but has since
become deprecated since idr_destroy() does it for us now.
- xs_local_connect() had been added by this branch to make AF_LOCAL
connections be synchronous, but in the meantime Trond had changed the
calling convention in order to avoid a RCU dereference.
There were a couple of more obvious actual source-level conflicts due to
the hlist traversal changes and one just due to code changes next to
each other, but those were trivial.
* 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
SUNRPC: make AF_LOCAL connect synchronous
nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
svcrpc: fix rpc server shutdown races
svcrpc: make svc_age_temp_xprts enqueue under sv_lock
lockd: nlmclnt_reclaim(): avoid stack overflow
nfsd: enable NFSv4 state in containers
nfsd: disable usermode helper client tracker in container
nfsd: use proper net while reading "exports" file
nfsd: containerize NFSd filesystem
nfsd: fix comments on nfsd_cache_lookup
SUNRPC: move cache_detail->cache_request callback call to cache_read()
SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
SUNRPC: rework cache upcall logic
SUNRPC: introduce cache_detail->cache_request callback
NFS: simplify and clean cache library
NFS: use SUNRPC cache creation and destruction helper for DNS cache
nfsd4: free_stid can be static
nfsd: keep a checksum of the first 256 bytes of request
sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
sunrpc: fix comment in struct xdr_buf definition
...