Commit Graph

49 Commits

Author SHA1 Message Date
Al Viro
7449af1e8b switch xattr syscalls to fget_light/fput_light
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-29 23:28:30 -04:00
Andrew Morton
44c824982f fs/xattr.c:setxattr(): improve handling of allocation failures
This allocation can be as large as 64k.

 - Add __GFP_NOWARN so the a falied kmalloc() is silent

 - Fall back to vmalloc() if the kmalloc() failed

Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: David Rientjes <rientjes@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05 15:25:50 -07:00
Andrew Morton
0d08d7b7e1 fs/xattr.c:listxattr(): fall back to vmalloc() if kmalloc() failed
This allocation can be as large as 64k.  As David points out, "falling
back to vmalloc here is much better solution than failing to retreive
the attribute - it will work no matter how fragmented memory gets.  That
means we don't get incomplete backups occurring after days or months of
uptime and successful backups".

Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: David Rientjes <rientjes@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05 15:25:50 -07:00
Dave Jones
703bf2d122 fs/xattr.c: suppress page allocation failure warnings from sys_listxattr()
This size is user controllable, up to a maximum of XATTR_LIST_MAX (64k).
So it's trivial for someone to trigger a stream of order:4 page
allocation errors.

Signed-off-by: Dave Jones <davej@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05 15:25:50 -07:00
Paul Gortmaker
630d9c4727 fs: reduce the use of module.h wherever possible
For files only using THIS_MODULE and/or EXPORT_SYMBOL, map
them onto including export.h -- or if the file isn't even
using those, then just delete the include.  Fix up any implicit
include dependencies that were being masked by module.h along
the way.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-28 19:31:58 -05:00
Al Viro
2a79f17e4a vfs: mnt_drop_write_file()
new helper (wrapper around mnt_drop_write()) to be used in pair with
mnt_want_write_file().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:40 -05:00
Mimi Zohar
c7b87de23b evm: evm_inode_post_removexattr
When an EVM protected extended attribute is removed, update 'security.evm'.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2011-07-18 12:29:43 -04:00
Mimi Zohar
1601fbad2b xattr: define vfs_getxattr_alloc and vfs_xattr_cmp
vfs_getxattr_alloc() and vfs_xattr_cmp() are two new kernel xattr helper
functions.  vfs_getxattr_alloc() first allocates memory for the requested
xattr and then retrieves it. vfs_xattr_cmp() compares a given value with
the contents of an extended attribute.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2011-07-18 12:29:39 -04:00
Andi Kleen
69b4573296 Cache xattr security drop check for write v2
Some recent benchmarking on btrfs showed that a major scaling bottleneck
on large systems on btrfs is currently the xattr lookup on every write.

Why xattr lookup on every write I hear you ask?

write wants to drop suid and security related xattrs that could set o
capabilities for executables.  To do that it currently looks up
security.capability on EVERY write (even for non executables) to decide
whether to drop it or not.

In btrfs this causes an additional tree walk, hitting some per file system
locks and quite bad scalability. In a simple read workload on a 8S
system I saw over 90% CPU time in spinlocks related to that.

Chris Mason tells me this is also a problem in ext4, where it hits
the global mbcache lock.

This patch adds a simple per inode to avoid this problem.  We only
do the lookup once per file and then if there is no xattr cache
the decision. All xattr changes clear the flag.

I also used the same flag to avoid the suid check, although
that one is pretty cheap.

A file system can also set this flag when it creates the inode,
if it has a cheap way to do so.  This is done for some common file systems
in followon patches.

With this patch a major part of the lock contention disappears
for btrfs. Some testing on smaller systems didn't show significant
performance changes, but at least it helps the larger systems
and is generally more efficient.

v2: Rename is_sgid. add file system helper.
Cc: chris.mason@oracle.com
Cc: josef@redhat.com
Cc: viro@zeniv.linux.org.uk
Cc: agruen@linbit.com
Cc: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 12:02:09 -04:00
Andreas Gruenbacher
55b23bde19 xattr: Fix error results for non-existent / invisible attributes
Return -ENODATA when trying to read a user.* attribute which cannot
exist: user space otherwise does not have a reasonable way to
distinguish between non-existent and inaccessible attributes.

Likewise, return -ENODATA when an unprivileged process tries to read a
trusted.* attribute: to unprivileged processes, those attributes are
invisible (listxattr() won't include them).

Related to this bug report: https://bugzilla.redhat.com/660613

Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-27 09:43:00 -04:00
Jan Kara
df7e130384 vfs: Pass setxattr(2) flags properly
For some reason generic_setxattr() did not pass flags (XATTR_CREATE,
XATTR_REPLACE) to the filesystem specific helper. This caused that
setxattr(2) syscall just ignored these flags.

Fix the bug by passing flags correctly.

Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-21 07:34:44 -07:00
Serge E. Hallyn
2e14967075 userns: rename is_owner_or_cap to inode_owner_or_capable
And give it a kernel-doc comment.

[akpm@linux-foundation.org: btrfs changed in linux-next]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:47:13 -07:00
Stephen Hemminger
bb4354538e fs: xattr_handler table should be const
The entries in xattr handler table should be immutable (ie const)
like other operation tables.

Later patches convert common filesystems. Uncoverted filesystems
will still work, but will generate a compiler warning.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:18 -04:00
Christoph Hellwig
431547b3c4 sanitize xattr handler prototypes
Add a flags argument to struct xattr_handler and pass it to all xattr
handler methods.  This allows using the same methods for multiple
handlers, e.g. for the ACL methods which perform exactly the same action
for the access and default ACLs, just using a different underlying
attribute.  With a little more groundwork it'll also allow sharing the
methods for the regular user/trusted/secure handlers in extN, ocfs2 and
jffs2 like it's already done for xfs in this patch.

Also change the inode argument to the handlers to a dentry to allow
using the handlers mechnism for filesystems that require it later,
e.g. cifs.

[with GFS2 bits updated by Steven Whitehouse <swhiteho@redhat.com>]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:49 -05:00
David P. Quigley
b1ab7e4b2a VFS: Factor out part of vfs_setxattr so it can be called from the SELinux hook for inode_setsecctx.
This factors out the part of the vfs_setxattr function that performs the
setting of the xattr and its notification. This is needed so the SELinux
implementation of inode_setsecctx can handle the setting of the xattr while
maintaining the proper separation of layers.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-10 10:11:22 +10:00
npiggin@suse.de
96029c4e09 fs: introduce mnt_clone_write
This patch speeds up lmbench lat_mmap test by about another 2% after the
first patch.

Before:
 avg = 462.286
 std = 5.46106

After:
 avg = 453.12
 std = 9.58257

(50 runs of each, stddev gives a reasonable confidence)

It does this by introducing mnt_clone_write, which avoids some heavyweight
operations of mnt_want_write if called on a vfsmount which we know already
has a write count; and mnt_want_write_file, which can call mnt_clone_write
if the file is open for write.

After these two patches, mnt_want_write and mnt_drop_write go from 7% on
the profile down to 1.3% (including mnt_clone_write).

[AV: mnt_want_write_file() should take file alone and derive mnt from it;
not only all callers have that form, but that's the only mnt about which
we know that it's already held for write if file is opened for write]

Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 21:36:02 -04:00
Li Zefan
3939fcde24 xattr: use memdup_user()
Remove open-coded memdup_user()

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-04-20 23:02:50 -04:00
Heiko Carstens
6a6160a7b5 [CVE-2009-0029] System call wrappers part 13
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14 14:15:23 +01:00
Heiko Carstens
64fd1de3d8 [CVE-2009-0029] System call wrappers part 12
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14 14:15:23 +01:00
Heiko Carstens
2ed7c03ec1 [CVE-2009-0029] Convert all system calls to return a long
Convert all system calls to return a long. This should be a NOP since all
converted types should have the same size anyway.
With the exception of sys_exit_group which returned void. But that doesn't
matter since the system call doesn't return.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14 14:15:14 +01:00
Al Viro
acfa4380ef inode->i_op is never NULL
We used to have rather schizophrenic set of checks for NULL ->i_op even
though it had been eliminated years ago.  You'd need to go out of your
way to set it to NULL explicitly _and_ a bunch of code would die on
such inodes anyway.  After killing two remaining places that still
did that bogosity, all that crap can go away.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-05 11:54:28 -05:00
Al Viro
2d8f30380a [PATCH] sanitize __user_walk_fd() et.al.
* do not pass nameidata; struct path is all the callers want.
* switch to new helpers:
	user_path_at(dfd, pathname, flags, &path)
	user_path(pathname, &path)
	user_lpath(pathname, &path)
	user_path_dir(pathname, &path)  (fail if not a directory)
  The last 3 are trivial macro wrappers for the first one.
* remove nameidata in callers.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:34 -04:00
Al Viro
f419a2e3b6 [PATCH] kill nameidata passing to permission(), rename to inode_permission()
Incidentally, the name that gives hundreds of false positives on grep
is not a good idea...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:31 -04:00
David Howells
8f0cfa52a1 xattr: add missing consts to function arguments
Add missing consts to xattr function arguments.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:06 -07:00
Al Viro
934b25c597 [PATCH] remove unused label in xattr.c (noise from ro-bind)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-23 00:04:04 -04:00