The filehandle structs all declare host-endian fields, but we sometimes
stuff big-endian values into those fields. This is OK since these
values are opaque to the client, but it confuses sparse. Add __force to
make it clear that we are doing this intentionally.
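A minimal user-space sketch of the pattern, not the nfsd code itself.
DEMO_BITWISE and DEMO_FORCE are stand-ins for the kernel's __bitwise and
__force annotations, and struct demo_fh, demo_cpu_to_be32() and
demo_stuff_be() are hypothetical names used only for illustration:

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for the kernel's __bitwise/__force; only sparse sees them. */
#ifdef __CHECKER__
#define DEMO_BITWISE __attribute__((bitwise))
#define DEMO_FORCE   __attribute__((force))
#else
#define DEMO_BITWISE
#define DEMO_FORCE
#endif

typedef uint32_t DEMO_BITWISE demo_be32;  /* hypothetical __be32 analogue */

struct demo_fh {
	uint32_t opaque;  /* declared host-endian, never interpreted by clients */
};

/* Convert to big-endian byte order regardless of host endianness. */
static demo_be32 demo_cpu_to_be32(uint32_t v)
{
	unsigned char b[4] = {
		(unsigned char)(v >> 24), (unsigned char)(v >> 16),
		(unsigned char)(v >> 8),  (unsigned char)v
	};
	uint32_t raw;

	memcpy(&raw, b, sizeof(raw));
	return (DEMO_FORCE demo_be32)raw;
}

/* Deliberately store a big-endian value in the host-endian field. */
static void demo_stuff_be(struct demo_fh *fh, uint32_t host_val)
{
	/* The forced cast tells the checker the mismatch is intentional. */
	fh->opaque = (DEMO_FORCE uint32_t)demo_cpu_to_be32(host_val);
}
```

With a normal compiler both macros expand to nothing, so the casts only
matter when the file is run through the checker.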
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Don't use cache_get outside of export.h; use exp_get for exports.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
When debugging is enabled, rpc prints messages from
dprintk(KERN_WARNING ...) with a "^A4" prefix:
[ 2780.339988] ^A4nfsd: connect from unprivileged port: 127.0.0.1, port=35316
Trond says:
> dprintk != printk. We have NEVER supported dprintk(KERN_WARNING...)
This patch removes the uses of dprintk with KERN_WARNING.
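A rough user-space sketch of why the prefix leaks into the output. It
assumes that a log level is encoded as an SOH byte followed by a level
character, and that the dprintk-style wrapper prepends its own level;
the DEMO_* macros and demo_visible_text() are hypothetical stand-ins,
not the kernel definitions:

```c
#include <string.h>

/* Assumed encoding: SOH byte followed by one level character. */
#define DEMO_KERN_SOH     "\001"
#define DEMO_KERN_WARNING DEMO_KERN_SOH "4"
#define DEMO_KERN_DEFAULT DEMO_KERN_SOH "d"

/* Hypothetical dprintk-like wrapper: always prepends its own level. */
#define demo_dprintk_fmt(fmt) DEMO_KERN_DEFAULT fmt

/*
 * Only a level prefix at the very start of the message is consumed.
 * Returns the text that would end up in the log.
 */
static const char *demo_visible_text(const char *msg)
{
	if (msg[0] == '\001' && msg[1] != '\0')
		return msg + 2;
	return msg;
}
```

Passing demo_dprintk_fmt(DEMO_KERN_WARNING "nfsd: ...") through
demo_visible_text() strips only the wrapper's own prefix, so the second
prefix survives as the literal bytes SOH and '4', which a terminal
shows as "^A4".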
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Use fh_fsid when referring to the fsid part of the filehandle. The
variable length auth field envisioned in nfsfh wasn't ever implemented.
Also clean up some loose ends around this and document the file handle
format better.
Btw, why do we even export nfsfh.h to userspace? The file handle very
much is kernel private, and nothing in nfs-utils includes the header
either.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The reporter saw a NULL dereference when a filesystem's ->mknod returned
success but left the dentry negative, and then nfsd tried to dereference
d_inode (in this case because the CREATE was followed by a GETATTR in
the same nfsv4 compound).
fh_update already checks for this and another broken case, but for some
reason it returns success and leaves nfsd trying to soldier on. If it
failed we'd avoid the crash. There's only so much we can do with a
buggy filesystem, but it's easy enough to bail out here, so let's do
that.
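The shape of the check, as a minimal sketch with mock types. The
struct names, demo_fh_update() and the DEMO_* status values are
hypothetical; the real function works on kernel dentries and returns
nfsd status codes:

```c
#include <stddef.h>

struct demo_inode  { int unused; };
struct demo_dentry { struct demo_inode *d_inode; };  /* NULL: negative dentry */
struct demo_fh     { struct demo_dentry *fh_dentry; };

enum demo_status { DEMO_OK = 0, DEMO_ERR_SERVERFAULT = 1 };

/* Fail instead of reporting success when the handle is unusable. */
static enum demo_status demo_fh_update(const struct demo_fh *fhp)
{
	if (fhp->fh_dentry == NULL)
		return DEMO_ERR_SERVERFAULT;   /* handle was never verified */
	if (fhp->fh_dentry->d_inode == NULL)
		return DEMO_ERR_SERVERFAULT;   /* filesystem left it negative */
	return DEMO_OK;
}
```

A caller that checks the return value stops before any later
operation in the compound touches d_inode.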
Reported-by: Antti Tönkyrä <daedalus@pingtimeout.net>
Tested-by: Antti Tönkyrä <daedalus@pingtimeout.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This commit adds FILEID_INVALID = 0xff to enum fid_type to
indicate an invalid fid_type.
It avoids using the magic number 255.
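A small sketch of the idea, using a hypothetical demo enum rather than
the real enum fid_type (only the 0xff value is taken from this commit):

```c
/* Hypothetical subset of a file ID type enum. */
enum demo_fid_type {
	DEMO_FILEID_ROOT    = 0,
	DEMO_FILEID_INVALID = 0xff,   /* named replacement for the bare 255 */
};

/* Returns 1 when an encode step produced a usable file ID type. */
static int demo_fileid_type_ok(int type)
{
	return type != DEMO_FILEID_INVALID;
}
```

Callers compare against the named constant, so the intent is visible
at the call site.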
Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Vivek Trivedi <vtrivedi018@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
When mnt_want_write() starts to handle freezing it will get full lock
semantics, requiring proper lock ordering. So push the mnt_want_write()
call consistently outside of i_mutex.
CC: linux-nfs@vger.kernel.org
CC: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch replaces the cache_put() call for svc_export_cache with an
exp_put() call.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We allow the fh_verify caller to specify that any object *except* those
of a given type is allowed, by passing a negative type. But only one
caller actually uses it. Open-code that check in the one caller.
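Roughly what the open-coded form looks like, sketched with mock types.
The DEMO_* constants, demo_fh_verify() and demo_link_source_check()
are hypothetical stand-ins, and the mode bits are written out by hand
so the sketch needs no headers:

```c
/* Hand-written file type bits, mirroring the usual S_IF* layout. */
#define DEMO_IFMT  0170000u
#define DEMO_IFDIR 0040000u
#define DEMO_IFREG 0100000u

enum demo_status { DEMO_OK, DEMO_ERR_NOTDIR, DEMO_ERR_ISDIR, DEMO_ERR_INVAL };

/* type == 0 accepts any object; otherwise the type must match exactly. */
static enum demo_status demo_fh_verify(unsigned int mode, unsigned int type)
{
	if (type != 0 && (mode & DEMO_IFMT) != type)
		return type == DEMO_IFDIR ? DEMO_ERR_NOTDIR : DEMO_ERR_INVAL;
	return DEMO_OK;
}

/* The one caller that wants "anything except a directory". */
static enum demo_status demo_link_source_check(unsigned int mode)
{
	enum demo_status err = demo_fh_verify(mode, 0);

	if (err != DEMO_OK)
		return err;
	if ((mode & DEMO_IFMT) == DEMO_IFDIR)
		return DEMO_ERR_ISDIR;   /* open-coded exclusion */
	return DEMO_OK;
}
```

The verify helper no longer needs to interpret negative values, and
the exclusion sits next to the code that cares about it.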
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The new .h files have paths at the top that are now out of date. While
we're here, just remove all of those from fs/nfsd; they never served any
purpose.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
On V4ROOT exports, only accept filehandles that are the *root* of some
export. This allows mountd to allow or deny access to individual
directories and symlinks on the pseudofilesystem.
Note that the checks in readdir and lookup are not enough, since a
malicious host with access to the network could guess filehandles that
they weren't able to obtain through lookup or readdir.
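The check amounts to a pointer comparison against the export root. A
minimal sketch with mock types (struct demo_export, DEMO_EXP_V4ROOT
and demo_check_pseudo_root() are hypothetical names, and the real code
returns an nfsd status rather than -1):

```c
struct demo_dentry { int id; };

#define DEMO_EXP_V4ROOT 0x1u   /* hypothetical flag value */

struct demo_export {
	unsigned int flags;
	const struct demo_dentry *root;   /* dentry the export is rooted at */
};

/* Returns 0 if the filehandle's dentry is acceptable, -1 otherwise. */
static int demo_check_pseudo_root(const struct demo_export *exp,
				  const struct demo_dentry *dentry)
{
	if (!(exp->flags & DEMO_EXP_V4ROOT))
		return 0;    /* ordinary export: no extra restriction */
	if (dentry != exp->root)
		return -1;   /* guessed handle below the export root */
	return 0;
}
```

Because the test runs when the filehandle is verified, it applies even
to handles that never came from lookup or readdir.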
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
This was an oversight; it should be among the export flags that can be
allowed to vary by pseudoflavor. This allows an administrator to (for
example) allow auth_sys mounts only from low ports, but allow auth_krb5
mounts to use any port.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Lots of include/linux/nfsd/* headers are only used by the
nfsd module. Move them to the source directory.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Now that the headers are fixed and carry their own weight, all fs/nfsd/
source files can include a minimal set of headers and still compile just
fine.
This patch should improve the compilation speed of the nfsd module.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
All nfsd security depends on the security checks in fh_verify, and
especially on nfsd_setuser().
It therefore bothers me that the nfsd_setuser call may be made from
three different places, depending on whether the filehandle has already
been mapped to a dentry, and on whether subtreechecking is in force.
Instead, make an unconditional call in fh_verify(), so it's trivial to
verify that the call always occurs.
That leaves us with a redundant nfsd_setuser() call in the subtreecheck
case--it needs the correct user set earlier in order to check execute
permissions on the path to this filehandle--but I'm willing to accept
that minor inefficiency in the subtreecheck case in return for more
straightforward permission checking.
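The resulting control flow, as a simplified sketch with mock types
(struct demo_rqst, struct demo_fh and the demo_* helpers are
hypothetical; the subtreecheck path is left out):

```c
struct demo_fh   { int has_dentry; };
struct demo_rqst { int setuser_calls; };

/* Stand-in for the credential switch; counts how often it runs. */
static int demo_setuser(struct demo_rqst *rqstp)
{
	rqstp->setuser_calls++;
	return 0;
}

/* Stand-in for mapping a raw filehandle to a dentry. */
static int demo_map_fh(struct demo_fh *fhp)
{
	fhp->has_dentry = 1;
	return 0;
}

static int demo_fh_verify(struct demo_rqst *rqstp, struct demo_fh *fhp)
{
	int err;

	if (!fhp->has_dentry) {
		err = demo_map_fh(fhp);
		if (err)
			return err;
	}
	/* Unconditional: runs whether or not the handle was already mapped. */
	err = demo_setuser(rqstp);
	if (err)
		return err;
	return 0;
}
```

Every successful pass through the verify function goes through the
same single call, which is what makes the property easy to audit.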
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
None of this stuff is used outside nfsd, so move it out of the common
linux include directory.
Actually, probably none of the stuff in include/linux/nfsd/nfsd.h really
belongs there, so later we may remove that file entirely.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
A number of callers (nfsd4_encode_fattr(), at least) don't bother to
release the filehandle returned by fh_compose() if fh_compose() returns
an error. So, modify fh_compose() to release the filehandle before
returning an error.
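The pattern in miniature, with a mock handle that only tracks whether a
reference is held (struct demo_fh, demo_fh_put() and demo_fh_compose()
are hypothetical, and the failure is simulated by a flag):

```c
struct demo_fh { int held; };   /* 1 while a reference is held */

static void demo_fh_put(struct demo_fh *fhp)
{
	fhp->held = 0;
}

/* Returns 0 on success; on failure the handle is already released. */
static int demo_fh_compose(struct demo_fh *fhp, int simulate_failure)
{
	fhp->held = 1;              /* take the reference */
	if (simulate_failure) {
		demo_fh_put(fhp);   /* clean up before reporting the error */
		return -1;
	}
	return 0;
}
```

Callers that ignore the handle on error then no longer leak the
reference.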
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The file nfsfh.c contains two static variables nfsd_nr_verified and
nfsd_nr_put. These are counters which are incremented as a side
effect of the fh_verify(), fh_compose() and fh_put() operations,
i.e. at least twice per NFS call for any non-trivial workload.
Needless to say this makes the cacheline that contains them (and any
other innocent victims) a very hot contention point indeed under high
call-rate workloads on multiprocessor NFS servers. It also turns out
that these counters are not used anywhere. They're not reported to
userspace, they're not used in logic, they're not even exported from
the object file (let alone the module). All they do is waste CPU time.
So this patch removes them.
Tests were run on a 16 CPU Altix A4700 with 2 10gige Myricom cards,
configured separately (no bonding). Workload is 640 client threads doing
directory traversals with random small reads, from server RAM.
Before
======
Kernel profile:
%    cumulative      self            self   total
time    samples   samples   calls  1/call  1/call  name
6.05    2716.00   2716.00   30406    0.09    1.02  svc_process
4.44    4706.00   1990.00    1975    1.01    1.01  spin_unlock_irqrestore
3.72    6376.00   1670.00    1666    1.00    1.00  svc_export_put
3.41    7907.00   1531.00    1786    0.86    1.02  nfsd_ofcache_lookup
3.25    9363.00   1456.00   10965    0.13    1.01  nfsd_dispatch
3.10   10752.00   1389.00    1376    1.01    1.01  nfsd_cache_lookup
2.57   11907.00   1155.00    4517    0.26    1.03  svc_tcp_recvfrom
...
2.21   15352.00   1003.00    1081    0.93    1.00  nfsd_choose_ofc <----
^^^^
Here the function nfsd_choose_ofc() reads a global variable
which by accident happened to be located in the same cacheline as
nfsd_nr_verified.
Call rate:
nullarbor:~ # pmdumptext nfs3.server.calls
...
Thu Dec 13 00:15:27 184780.663
Thu Dec 13 00:15:28 184885.881
Thu Dec 13 00:15:29 184449.215
Thu Dec 13 00:15:30 184971.058
Thu Dec 13 00:15:31 185036.052
Thu Dec 13 00:15:32 185250.475
Thu Dec 13 00:15:33 184481.319
Thu Dec 13 00:15:34 185225.737
Thu Dec 13 00:15:35 185408.018
Thu Dec 13 00:15:36 185335.764
After
=====
Kernel profile:
%    cumulative      self            self   total
time    samples   samples   calls  1/call  1/call  name
6.33    2813.00   2813.00   29979    0.09    1.01  svc_process
4.66    4883.00   2070.00    2065    1.00    1.00  spin_unlock_irqrestore
4.06    6687.00   1804.00    2182    0.83    1.00  nfsd_ofcache_lookup
3.20    8110.00   1423.00   10932    0.13    1.00  nfsd_dispatch
3.03    9456.00   1346.00    1343    1.00    1.00  nfsd_cache_lookup
2.62   10622.00   1166.00    4645    0.25    1.01  svc_tcp_recvfrom
[...]
0.10   42586.00     44.00      74    0.59    1.00  nfsd_choose_ofc <--- HA!!
^^^^
Call rate:
nullarbor:~ # pmdumptext nfs3.server.calls
...
Thu Dec 13 01:45:28 194677.118
Thu Dec 13 01:45:29 193932.692
Thu Dec 13 01:45:30 194294.364
Thu Dec 13 01:45:31 194971.276
Thu Dec 13 01:45:32 194111.207
Thu Dec 13 01:45:33 194999.635
Thu Dec 13 01:45:34 195312.594
Thu Dec 13 01:45:35 195707.293
Thu Dec 13 01:45:36 194610.353
Thu Dec 13 01:45:37 195913.662
Thu Dec 13 01:45:38 194808.675
i.e. about a 5.3% improvement in call rate.
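The figure can be rechecked from the samples above by comparing the
mean call rates. A small sketch; the demo_* names are hypothetical and
the arrays simply copy the pmdumptext values listed in this message:

```c
#include <stddef.h>

/* Call rates (calls/s) copied from the "Before" listing. */
static const double demo_before[] = {
	184780.663, 184885.881, 184449.215, 184971.058, 185036.052,
	185250.475, 184481.319, 185225.737, 185408.018, 185335.764,
};

/* Call rates (calls/s) copied from the "After" listing. */
static const double demo_after[] = {
	194677.118, 193932.692, 194294.364, 194971.276, 194111.207,
	194999.635, 195312.594, 195707.293, 194610.353, 195913.662,
	194808.675,
};

static double demo_mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative change of the mean call rate, in percent. */
static double demo_improvement_pct(void)
{
	double before = demo_mean(demo_before,
				  sizeof(demo_before) / sizeof(demo_before[0]));
	double after = demo_mean(demo_after,
				 sizeof(demo_after) / sizeof(demo_after[0]));

	return (after - before) / before * 100.0;
}
```

The means come out near 184982 and 194849 calls per second, which is
consistent with the roughly 5.3% quoted.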
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Reviewed-by: David Chinner <dgc@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>