Commit Graph

61 Commits

Author SHA1 Message Date
Wu Fengguang
381a80e6df inotify: use GFP_NOFS in kernel_event() to work around a lockdep false-positive
There is what we believe to be a false positive reported by lockdep.

inotify_inode_queue_event() => take inotify_mutex => kernel_event() =>
kmalloc() => SLOB => alloc_pages_node() => page reclaim => slab reclaim =>
dcache reclaim => inotify_inode_is_dead => take inotify_mutex => deadlock

The plan is to fix this via lockdep annotation, but that is proving to be
quite involved.

The patch flips the allocation over to GFP_NFS to shut the warning up, for
the 2.6.30 release.

Hopefully we will fix this for real in 2.6.31.  I'll queue a patch in -mm
to switch it back to GFP_KERNEL so we don't forget.

  =================================
  [ INFO: inconsistent lock state ]
  2.6.30-rc2-next-20090417 #203
  ---------------------------------
  inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
  kswapd0/380 [HC0[0]:SC0[0]:HE1:SE1] takes:
   (&inode->inotify_mutex){+.+.?.}, at: [<ffffffff8112f1b5>] inotify_inode_is_dead+0x35/0xb0
  {RECLAIM_FS-ON-W} state was registered at:
    [<ffffffff81079188>] mark_held_locks+0x68/0x90
    [<ffffffff810792a5>] lockdep_trace_alloc+0xf5/0x100
    [<ffffffff810f5261>] __kmalloc_node+0x31/0x1e0
    [<ffffffff81130652>] kernel_event+0xe2/0x190
    [<ffffffff81130826>] inotify_dev_queue_event+0x126/0x230
    [<ffffffff8112f096>] inotify_inode_queue_event+0xc6/0x110
    [<ffffffff8110444d>] vfs_create+0xcd/0x140
    [<ffffffff8110825d>] do_filp_open+0x88d/0xa20
    [<ffffffff810f6b68>] do_sys_open+0x98/0x140
    [<ffffffff810f6c50>] sys_open+0x20/0x30
    [<ffffffff8100c272>] system_call_fastpath+0x16/0x1b
    [<ffffffffffffffff>] 0xffffffffffffffff
  irq event stamp: 690455
  hardirqs last  enabled at (690455): [<ffffffff81564fe4>] _spin_unlock_irqrestore+0x44/0x80
  hardirqs last disabled at (690454): [<ffffffff81565372>] _spin_lock_irqsave+0x32/0xa0
  softirqs last  enabled at (690178): [<ffffffff81052282>] __do_softirq+0x202/0x220
  softirqs last disabled at (690157): [<ffffffff8100d50c>] call_softirq+0x1c/0x50

  other info that might help us debug this:
  2 locks held by kswapd0/380:
   #0:  (shrinker_rwsem){++++..}, at: [<ffffffff810d0bd7>] shrink_slab+0x37/0x180
   #1:  (&type->s_umount_key#17){++++..}, at: [<ffffffff8110cfbf>] shrink_dcache_memory+0x11f/0x1e0

  stack backtrace:
  Pid: 380, comm: kswapd0 Not tainted 2.6.30-rc2-next-20090417 #203
  Call Trace:
   [<ffffffff810789ef>] print_usage_bug+0x19f/0x200
   [<ffffffff81018bff>] ? save_stack_trace+0x2f/0x50
   [<ffffffff81078f0b>] mark_lock+0x4bb/0x6d0
   [<ffffffff810799e0>] ? check_usage_forwards+0x0/0xc0
   [<ffffffff8107b142>] __lock_acquire+0xc62/0x1ae0
   [<ffffffff810f478c>] ? slob_free+0x10c/0x370
   [<ffffffff8107c0a1>] lock_acquire+0xe1/0x120
   [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0
   [<ffffffff81562d43>] mutex_lock_nested+0x63/0x420
   [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0
   [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0
   [<ffffffff81012fe9>] ? sched_clock+0x9/0x10
   [<ffffffff81077165>] ? lock_release_holdtime+0x35/0x1c0
   [<ffffffff8112f1b5>] inotify_inode_is_dead+0x35/0xb0
   [<ffffffff8110c9dc>] dentry_iput+0xbc/0xe0
   [<ffffffff8110cb23>] d_kill+0x33/0x60
   [<ffffffff8110ce23>] __shrink_dcache_sb+0x2d3/0x350
   [<ffffffff8110cffa>] shrink_dcache_memory+0x15a/0x1e0
   [<ffffffff810d0cc5>] shrink_slab+0x125/0x180
   [<ffffffff810d1540>] kswapd+0x560/0x7a0
   [<ffffffff810ce160>] ? isolate_pages_global+0x0/0x2c0
   [<ffffffff81065a30>] ? autoremove_wake_function+0x0/0x40
   [<ffffffff8107953d>] ? trace_hardirqs_on+0xd/0x10
   [<ffffffff810d0fe0>] ? kswapd+0x0/0x7a0
   [<ffffffff8106555b>] kthread+0x5b/0xa0
   [<ffffffff8100d40a>] child_rip+0xa/0x20
   [<ffffffff8100cdd0>] ? restore_args+0x0/0x30
   [<ffffffff81065500>] ? kthread+0x0/0xa0
   [<ffffffff8100d400>] ? child_rip+0x0/0x20

[eparis@redhat.com: fix audit too]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-06 16:36:09 -07:00
Zhenwen Xu
c28bb7da74 make the e->rule.xxx shorter in kernel auditfilter.c
make the e->rule.xxx shorter in kernel/auditfilter.c
--
---------------------------------
Zhenwen Xu - Open and Free
Home Page:	http://zhwen.org
My Studio:	http://dim4.cn

>From 99692dc640b278f1cb1a15646ce42f22e89c0f77 Mon Sep 17 00:00:00 2001
From: Zhenwen Xu <Helight.Xu@gmail.com>
Date: Thu, 12 Mar 2009 22:04:59 +0800
Subject: [PATCH] make the e->rule.xxx shorter in kernel/auditfilter.c

Signed-off-by: Zhenwen Xu <Helight.Xu@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-04-05 13:40:33 -04:00
Al Viro
5af75d8d58 audit: validate comparison operations, store them in sane form
Don't store the field->op in the messy (and very inconvenient for e.g.
audit_comparator()) form; translate to dense set of values and do full
validation of userland-submitted value while we are at it.

->audit_init_rule() and ->audit_match_rule() get new values now; in-tree
instances updated.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-04 15:14:42 -05:00
Al Viro
36c4f1b18c clean up audit_rule_{add,del} a bit
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-04 15:14:42 -05:00
Al Viro
e45aa212ea audit rules ordering, part 2
Fix the actual rule listing; add per-type lists _not_ used for matching,
with all exit,... sitting on one such list.  Simplifies "do something
for all rules" logics, while we are at it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-04 15:14:42 -05:00
Al Viro
0590b9335a fixing audit rule ordering mess, part 1
Problem: ordering between the rules on exit chain is currently lost;
all watch and inode rules are listed after everything else _and_
exit,never on one kind doesn't stop exit,always on another from
being matched.

Solution: assign priorities to rules, keep track of the current
highest-priority matching rule and its result (always/never).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-04 15:14:41 -05:00
Al Viro
1a9d0797b8 audit_update_lsm_rules() misses the audit_inode_hash[] ones
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-04 15:14:41 -05:00
Al Viro
8f7b0ba1c8 Fix inotify watch removal/umount races
Inotify watch removals suck violently.

To kick the watch out we need (in this order) inode->inotify_mutex and
ih->mutex.  That's fine if we have a hold on inode; however, for all
other cases we need to make damn sure we don't race with umount.  We can
*NOT* just grab a reference to a watch - inotify_unmount_inodes() will
happily sail past it and we'll end with reference to inode potentially
outliving its superblock.

Ideally we just want to grab an active reference to superblock if we
can; that will make sure we won't go into inotify_umount_inodes() until
we are done.  Cleanup is just deactivate_super().

However, that leaves a messy case - what if we *are* racing with
umount() and active references to superblock can't be acquired anymore?
We can bump ->s_count, grab ->s_umount, which will almost certainly wait
until the superblock is shut down and the watch in question is pining
for fjords.  That's fine, but there is a problem - we might have hit the
window between ->s_active getting to 0 / ->s_count - below S_BIAS (i.e.
the moment when superblock is past the point of no return and is heading
for shutdown) and the moment when deactivate_super() acquires
->s_umount.

We could just do drop_super() yield() and retry, but that's rather
antisocial and this stuff is luser-triggerable.  OTOH, having grabbed
->s_umount and having found that we'd got there first (i.e.  that
->s_root is non-NULL) we know that we won't race with
inotify_umount_inodes().

So we could grab a reference to watch and do the rest as above, just
with drop_super() instead of deactivate_super(), right? Wrong.  We had
to drop ih->mutex before we could grab ->s_umount.  So the watch
could've been gone already.

That still can be dealt with - we need to save watch->wd, do idr_find()
and compare its result with our pointer.  If they match, we either have
the damn thing still alive or we'd lost not one but two races at once,
the watch had been killed and a new one got created with the same ->wd
at the same address.  That couldn't have happened in inotify_destroy(),
but inotify_rm_wd() could run into that.  Still, "new one got created"
is not a problem - we have every right to kill it or leave it alone,
whatever's more convenient.

So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
"grab it and kill it" check.  If it's been our original watch, we are
fine, if it's a newcomer - nevermind, just pretend that we'd won the
race and kill the fscker anyway; we are safe since we know that its
superblock won't be going away.

And yes, this is far beyond mere "not very pretty"; so's the entire
concept of inotify to start with.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-15 12:26:44 -08:00
zhangxiliang
036bbf76ad Re: [PATCH] the loginuid field should be output in all AUDIT_CONFIG_CHANGE audit messages
> shouldn't these be using the "audit_get_loginuid(current)"  and if we
> are going to output loginuid we also should be outputting sessionid

Thanks for your detailed explanation.
I have made a new patch for outputing "loginuid" and "sessionid" by audit_get_loginuid(current) and audit_get_sessionid(current).
If there are some deficiencies, please give me your indication.

Signed-off-by: Zhang Xiliang <zhangxiliang@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-01 12:15:03 -04:00
Peng Haitao
d8de72473e [PATCH] remove useless argument type in audit_filter_user()
The second argument "type" is not used in audit_filter_user(), so I think that type can be removed. If I'm wrong, please tell me.

Signed-off-by: Peng Haitao <penght@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-06-24 23:36:35 -04:00
Randy Dunlap
9f0aecdd1c [PATCH] audit: fix kernel-doc parameter notation
Fix auditfilter kernel-doc misssing parameter description:

Warning(lin2626-rc3//kernel/auditfilter.c:1551): No description found for parameter 'sessionid'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-06-24 23:36:28 -04:00
Linus Torvalds
9781db7b34 Merge branch 'audit.b50' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b50' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] new predicate - AUDIT_FILETYPE
  [patch 2/2] Use find_task_by_vpid in audit code
  [patch 1/2] audit: let userspace fully control TTY input auditing
  [PATCH 2/2] audit: fix sparse shadowed variable warnings
  [PATCH 1/2] audit: move extern declarations to audit.h
  Audit: MAINTAINERS update
  Audit: increase the maximum length of the key field
  Audit: standardize string audit interfaces
  Audit: stop deadlock from signals under load
  Audit: save audit_backlog_limit audit messages in case auditd comes back
  Audit: collect sessionid in netlink messages
  Audit: end printk with newline
2008-04-29 11:41:22 -07:00
Hirofumi Nakagawa
801678c5a3 Remove duplicated unlikely() in IS_ERR()
Some drivers have duplicated unlikely() macros.  IS_ERR() already has
unlikely() in itself.

This patch cleans up such pointless code.

Signed-off-by: Hirofumi Nakagawa <hnakagawa@miraclelinux.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jeff@garzik.org>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:25 -07:00
Al Viro
8b67dca942 [PATCH] new predicate - AUDIT_FILETYPE
Argument is S_IF... | <index>, where index is normally 0 or 1.
Triggers if chosen element of ctx->names[] is present and the
mode of object in question matches the upper bits of argument.
I.e. for things like "is the argument of that chmod a directory",
etc.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-28 06:28:37 -04:00
Harvey Harrison
7719e437fa [PATCH 2/2] audit: fix sparse shadowed variable warnings
Use msglen as the identifier.
kernel/audit.c:724:10: warning: symbol 'len' shadows an earlier one
kernel/audit.c:575:8: originally declared here

Don't use ino_f to check the inode field at the end of the functions.
kernel/auditfilter.c:429:22: warning: symbol 'f' shadows an earlier one
kernel/auditfilter.c:420:21: originally declared here
kernel/auditfilter.c:542:22: warning: symbol 'f' shadows an earlier one
kernel/auditfilter.c:529:21: originally declared here

i always used as a counter for a for loop and initialized to zero before
use.  Eliminate the inner i variables.
kernel/auditsc.c:1295:8: warning: symbol 'i' shadows an earlier one
kernel/auditsc.c:1152:6: originally declared here
kernel/auditsc.c:1320:7: warning: symbol 'i' shadows an earlier one
kernel/auditsc.c:1152:6: originally declared here

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-28 06:28:17 -04:00
Harvey Harrison
c782f242f0 [PATCH 1/2] audit: move extern declarations to audit.h
Leave audit_sig_{uid|pid|sid} protected by #ifdef CONFIG_AUDITSYSCALL.

Noticed by sparse:
kernel/audit.c:73:6: warning: symbol 'audit_ever_enabled' was not declared. Should it be static?
kernel/audit.c:100:8: warning: symbol 'audit_sig_uid' was not declared. Should it be static?
kernel/audit.c:101:8: warning: symbol 'audit_sig_pid' was not declared. Should it be static?
kernel/audit.c:102:6: warning: symbol 'audit_sig_sid' was not declared. Should it be static?
kernel/audit.c:117:23: warning: symbol 'audit_ih' was not declared. Should it be static?
kernel/auditfilter.c:78:18: warning: symbol 'audit_filter_list' was not declared. Should it be static?

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-28 06:28:04 -04:00
Eric Paris
2532386f48 Audit: collect sessionid in netlink messages
Previously I added sessionid output to all audit messages where it was
available but we still didn't know the sessionid of the sender of
netlink messages.  This patch adds that information to netlink messages
so we can audit who sent netlink messages.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-28 06:18:03 -04:00
Ahmed S. Darwish
04305e4aff Audit: Final renamings and cleanup
Rename the se_str and se_rule audit fields elements to
lsm_str and lsm_rule to avoid confusion.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
2008-04-19 09:59:43 +10:00
Ahmed S. Darwish
d7a96f3a1a Audit: internally use the new LSM audit hooks
Convert Audit to use the new LSM Audit hooks instead of
the exported SELinux interface.

Basically, use:
security_audit_rule_init
secuirty_audit_rule_free
security_audit_rule_known
security_audit_rule_match

instad of (respectively) :
selinux_audit_rule_init
selinux_audit_rule_free
audit_rule_has_selinux
selinux_audit_rule_match

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
2008-04-19 09:52:37 +10:00
Ahmed S. Darwish
2a862b32f3 Audit: use new LSM hooks instead of SELinux exports
Stop using the following exported SELinux interfaces:
selinux_get_inode_sid(inode, sid)
selinux_get_ipc_sid(ipcp, sid)
selinux_get_task_sid(tsk, sid)
selinux_sid_to_string(sid, ctx, len)
kfree(ctx)

and use following generic LSM equivalents respectively:
security_inode_getsecid(inode, secid)
security_ipc_getsecid*(ipcp, secid)
security_task_getsecid(tsk, secid)
security_sid_to_secctx(sid, ctx, len)
security_release_secctx(ctx, len)

Call security_release_secctx only if security_secid_to_secctx
succeeded.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Reviewed-by: Paul Moore <paul.moore@hp.com>
2008-04-19 09:52:34 +10:00
Jan Blunck
1d957f9bf8 Introduce path_put()
* Add path_put() functions for releasing a reference to the dentry and
  vfsmount of a struct path in the right order

* Switch from path_release(nd) to path_put(&nd->path)

* Rename dput_path() to path_put_conditional()

[akpm@linux-foundation.org: fix cifs]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Jan Blunck
4ac9137858 Embed a struct path into struct nameidata instead of nd->{dentry,mnt}
This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.

Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
  <dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
  struct path in every place where the stack can be traversed
- it reduces the overall code size:

without patch series:
   text    data     bss     dec     hex filename
5321639  858418  715768 6895825  6938d1 vmlinux

with patch series:
   text    data     bss     dec     hex filename
5320026  858418  715768 6894212  693284 vmlinux

This patch:

Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Eric Paris
1a6b9f2317 [AUDIT] make audit=0 really stop audit messages
Some audit messages (namely configuration changes) are still emitted even if
the audit subsystem has been explicitly disabled.  This patch turns those
messages off as well.

Signed-off-by: Eric Paris <eparis@redhat.com>
2008-02-01 14:24:33 -05:00
Al Viro
74c3cbe33b [PATCH] audit: watching subtrees
New kind of audit rule predicates: "object is visible in given subtree".
The part that can be sanely implemented, that is.  Limitations:
	* if you have hardlink from outside of tree, you'd better watch
it too (or just watch the object itself, obviously)
	* if you mount something under a watched tree, tell audit
that new chunk should be added to watched subtrees
	* if you umount something in a watched tree and it's still mounted
elsewhere, you will get matches on events happening there.  New command
tells audit to recalculate the trees, trimming such sources of false
positives.

Note that it's _not_ about path - if something mounted in several places
(multiple mount, bindings, different namespaces, etc.), the match does
_not_ depend on which one we are using for access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-10-21 02:37:45 -04:00
Daniel Walker
9ce34218a8 whitespace fixes: audit filtering
Signed-off-by: Daniel Walker <dwalker@mvista.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:24 -07:00