Commit Graph

1216 Commits

Author SHA1 Message Date
Al Viro
fc14f2fef6 convert get_sb_single() users
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-29 04:16:28 -04:00
Andi Kleen
27d6379894 Fix install_process_keyring error handling
Fix an incorrect error check that returns 1 for error instead of the
expected error code.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-28 09:02:15 -07:00
Linus Torvalds
426e1f5cec Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
  split invalidate_inodes()
  fs: skip I_FREEING inodes in writeback_sb_inodes
  fs: fold invalidate_list into invalidate_inodes
  fs: do not drop inode_lock in dispose_list
  fs: inode split IO and LRU lists
  fs: switch bdev inode bdi's correctly
  fs: fix buffer invalidation in invalidate_list
  fsnotify: use dget_parent
  smbfs: use dget_parent
  exportfs: use dget_parent
  fs: use RCU read side protection in d_validate
  fs: clean up dentry lru modification
  fs: split __shrink_dcache_sb
  fs: improve DCACHE_REFERENCED usage
  fs: use percpu counter for nr_dentry and nr_dentry_unused
  fs: simplify __d_free
  fs: take dcache_lock inside __d_path
  fs: do not assign default i_ino in new_inode
  fs: introduce a per-cpu last_ino allocator
  new helper: ihold()
  ...
2010-10-26 17:58:44 -07:00
Linus Torvalds
f9ba5375a8 Merge branch 'ima-memory-use-fixes'
* ima-memory-use-fixes:
  IMA: fix the ToMToU logic
  IMA: explicit IMA i_flag to remove global lock on inode_delete
  IMA: drop refcnt from ima_iint_cache since it isn't needed
  IMA: only allocate iint when needed
  IMA: move read counter into struct inode
  IMA: use i_writecount rather than a private counter
  IMA: use inode->i_lock to protect read and write counters
  IMA: convert internal flags from long to char
  IMA: use unsigned int instead of long for counters
  IMA: drop the inode opencount since it isn't needed for operation
  IMA: use rbtree instead of radix tree for inode information cache
2010-10-26 11:37:48 -07:00
Eric Paris
bade72d607 IMA: fix the ToMToU logic
Current logic looks like this:

        rc = ima_must_measure(NULL, inode, MAY_READ, FILE_CHECK);
        if (rc < 0)
                goto out;

        if (mode & FMODE_WRITE) {
                if (inode->i_readcount)
                        send_tomtou = true;
                goto out;
        }

        if (atomic_read(&inode->i_writecount) > 0)
                send_writers = true;

Lets assume we have a policy which states that all files opened for read
by root must be measured.

Lets assume the file has permissions 777.

Lets assume that root has the given file open for read.

Lets assume that a non-root process opens the file write.

The non-root process will get to ima_counts_get() and will check the
ima_must_measure().  Since it is not supposed to measure it will goto
out.

We should check the i_readcount no matter what since we might be causing
a ToMToU voilation!

This is close to correct, but still not quite perfect.  The situation
could have been that root, which was interested in the mesurement opened
and closed the file and another process which is not interested in the
measurement is the one holding the i_readcount ATM.  This is just overly
strict on ToMToU violations, which is better than not strict enough...

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:19 -07:00
Eric Paris
196f518128 IMA: explicit IMA i_flag to remove global lock on inode_delete
Currently for every removed inode IMA must take a global lock and search
the IMA rbtree looking for an associated integrity structure.  Instead
we explicitly mark an inode when we add an integrity structure so we
only have to take the global lock and do the removal if it exists.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:19 -07:00
Eric Paris
64c62f06be IMA: drop refcnt from ima_iint_cache since it isn't needed
Since finding a struct ima_iint_cache requires a valid struct inode, and
the struct ima_iint_cache is supposed to have the same lifetime as a
struct inode (technically they die together but don't need to be created
at the same time) we don't have to worry about the ima_iint_cache
outliving or dieing before the inode.  So the refcnt isn't useful.  Just
get rid of it and free the structure when the inode is freed.

Signed-off-by: Eric Paris <eapris@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:19 -07:00
Eric Paris
bc7d2a3e66 IMA: only allocate iint when needed
IMA always allocates an integrity structure to hold information about
every inode, but only needed this structure to track the number of
readers and writers currently accessing a given inode.  Since that
information was moved into struct inode instead of the integrity struct
this patch stops allocating the integrity stucture until it is needed.
Thus greatly reducing memory usage.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:18 -07:00
Eric Paris
a178d2027d IMA: move read counter into struct inode
IMA currently allocated an inode integrity structure for every inode in
core.  This stucture is about 120 bytes long.  Most files however
(especially on a system which doesn't make use of IMA) will never need
any of this space.  The problem is that if IMA is enabled we need to
know information about the number of readers and the number of writers
for every inode on the box.  At the moment we collect that information
in the per inode iint structure and waste the rest of the space.  This
patch moves those counters into the struct inode so we can eventually
stop allocating an IMA integrity structure except when absolutely
needed.

This patch does the minimum needed to move the location of the data.
Further cleanups, especially the location of counter updates, may still
be possible.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:18 -07:00
Eric Paris
b9593d309d IMA: use i_writecount rather than a private counter
IMA tracks the number of struct files which are holding a given inode
readonly and the number which are holding the inode write or r/w.  It
needs this information so when a new reader or writer comes in it can
tell if this new file will be able to invalidate results it already made
about existing files.

aka if a task is holding a struct file open RO, IMA measured the file
and recorded those measurements and then a task opens the file RW IMA
needs to note in the logs that the old measurement may not be correct.
It's called a "Time of Measure Time of Use" (ToMToU) issue.  The same is
true is a RO file is opened to an inode which has an open writer.  We
cannot, with any validity, measure the file in question since it could
be changing.

This patch attempts to use the i_writecount field to track writers.  The
i_writecount field actually embeds more information in it's value than
IMA needs but it should work for our purposes and allow us to shrink the
struct inode even more.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:18 -07:00
Eric Paris
ad16ad00c3 IMA: use inode->i_lock to protect read and write counters
Currently IMA used the iint->mutex to protect the i_readcount and
i_writecount.  This patch uses the inode->i_lock since we are going to
start using in inode objects and that is the most appropriate lock.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:18 -07:00
Eric Paris
15aac67677 IMA: convert internal flags from long to char
The IMA flags is an unsigned long but there is only 1 flag defined.
Lets save a little space and make it a char.  This packs nicely next to
the array of u8's.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:18 -07:00
Eric Paris
497f323370 IMA: use unsigned int instead of long for counters
Currently IMA uses 2 longs in struct inode.  To save space (and as it
seems impossible to overflow 32 bits) we switch these to unsigned int.
The switch to unsigned does require slightly different checks for
underflow, but it isn't complex.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:18 -07:00
Eric Paris
b575156daf IMA: drop the inode opencount since it isn't needed for operation
The opencount was used to help debugging to make sure that everything
which created a struct file also correctly made the IMA calls.  Since we
moved all of that into the VFS this isn't as necessary.  We should be
able to get the same amount of debugging out of just the reader and
write count.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:17 -07:00
Eric Paris
8549164143 IMA: use rbtree instead of radix tree for inode information cache
The IMA code needs to store the number of tasks which have an open fd
granting permission to write a file even when IMA is not in use.  It
needs this information in order to be enabled at a later point in time
without losing it's integrity garantees.

At the moment that means we store a little bit of data about every inode
in a cache.  We use a radix tree key'd on the inode's memory address.
Dave Chinner pointed out that a radix tree is a terrible data structure
for such a sparse key space.  This patch switches to using an rbtree
which should be more efficient.

Bug report from Dave:

 "I just noticed that slabtop was reporting an awfully high usage of
  radix tree nodes:

   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
  4200331 2778082  66%    0.55K 144839       29   2317424K radix_tree_node
  2321500 2060290  88%    1.00K  72581       32   2322592K xfs_inode
  2235648 2069791  92%    0.12K  69864       32    279456K iint_cache

  That is, 2.7M radix tree nodes are allocated, and the cache itself is
  consuming 2.3GB of RAM.  I know that the XFS inodei caches are indexed
  by radix tree node, but for 2 million cached inodes that would mean a
  density of 1 inode per radix tree node, which for a system with 16M
  inodes in the filsystems is an impossibly low density.  The worst I've
  seen in a production system like kernel.org is about 20-25% density,
  which would mean about 150-200k radix tree nodes for that many inodes.
  So it's not the inode cache.

  So I looked up what the iint_cache was.  It appears to used for
  storing per-inode IMA information, and uses a radix tree for indexing.
  It uses the *address* of the struct inode as the indexing key.  That
  means the key space is extremely sparse - for XFS the struct inode
  addresses are approximately 1000 bytes apart, which means the closest
  the radix tree index keys get is ~1000.  Which means that there is a
  single entry per radix tree leaf node, so the radix tree is using
  roughly 550 bytes for every 120byte structure being cached.  For the
  above example, it's probably wasting close to 1GB of RAM...."

Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:17 -07:00
Christoph Hellwig
be148247cf fs: take dcache_lock inside __d_path
All callers take dcache_lock just around the call to __d_path, so
take the lock into it in preparation of getting rid of dcache_lock.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25 21:26:12 -04:00
Christoph Hellwig
85fe4025c6 fs: do not assign default i_ino in new_inode
Instead of always assigning an increasing inode number in new_inode
move the call to assign it into those callers that actually need it.
For now callers that need it is estimated conservatively, that is
the call is added to all filesystems that do not assign an i_ino
by themselves.  For a few more filesystems we can avoid assigning
any inode number given that they aren't user visible, and for others
it could be done lazily when an inode number is actually needed,
but that's left for later patches.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25 21:26:11 -04:00
Linus Torvalds
092e0e7e52 Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  vfs: make no_llseek the default
  vfs: don't use BKL in default_llseek
  llseek: automatically add .llseek fop
  libfs: use generic_file_llseek for simple_attr
  mac80211: disallow seeks in minstrel debug code
  lirc: make chardev nonseekable
  viotape: use noop_llseek
  raw: use explicit llseek file operations
  ibmasmfs: use generic_file_llseek
  spufs: use llseek in all file operations
  arm/omap: use generic_file_llseek in iommu_debug
  lkdtm: use generic_file_llseek in debugfs
  net/wireless: use generic_file_llseek in debugfs
  drm: use noop_llseek
2010-10-22 10:52:56 -07:00
Stephen Rothwell
f0d3d9894e selinux: include vmalloc.h for vmalloc_user
Include vmalloc.h for vmalloc_user (fixes ppc build warning).
Acked-by: Eric Paris <eparis@redhat.com>

Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:13:01 +11:00
Eric Paris
845ca30fe9 selinux: implement mmap on /selinux/policy
/selinux/policy allows a user to copy the policy back out of the kernel.
This patch allows userspace to actually mmap that file and use it directly.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:59 +11:00
Eric Paris
cee74f47a6 SELinux: allow userspace to read policy back out of the kernel
There is interest in being able to see what the actual policy is that was
loaded into the kernel.  The patch creates a new selinuxfs file
/selinux/policy which can be read by userspace.  The actual policy that is
loaded into the kernel will be written back out to userspace.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:58 +11:00
Eric Paris
00d85c83ac SELinux: drop useless (and incorrect) AVTAB_MAX_SIZE
AVTAB_MAX_SIZE was a define which was supposed to be used in userspace to
define a maximally sized avtab when userspace wasn't sure how big of a table
it needed.  It doesn't make sense in the kernel since we always know our table
sizes.  The only place it is used we have a more appropiately named define
called AVTAB_MAX_HASH_BUCKETS, use that instead.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:57 +11:00
Eric Paris
4419aae1f4 SELinux: deterministic ordering of range transition rules
Range transition rules are placed in the hash table in an (almost)
arbitrary order.  This patch inserts them in a fixed order to make policy
retrival more predictable.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:56 +11:00
Eric Paris
d5630b9d27 security: secid_to_secctx returns len when data is NULL
With the (long ago) interface change to have the secid_to_secctx functions
do the string allocation instead of having the caller do the allocation we
lost the ability to query the security server for the length of the
upcoming string.  The SECMARK code would like to allocate a netlink skb
with enough length to hold the string but it is just too unclean to do the
string allocation twice or to do the allocation the first time and hold
onto the string and slen.  This patch adds the ability to call
security_secid_to_secctx() with a NULL data pointer and it will just set
the slen pointer.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:50 +11:00
Eric Paris
2606fd1fa5 secmark: make secmark object handling generic
Right now secmark has lots of direct selinux calls.  Use all LSM calls and
remove all SELinux specific knowledge.  The only SELinux specific knowledge
we leave is the mode.  The only point is to make sure that other LSMs at
least test this generic code before they assume it works.  (They may also
have to make changes if they do not represent labels as strings)

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:48 +11:00