Commit Graph

79 Commits

Author SHA1 Message Date
David Howells
0bb80f2405 proc: Split the namespace stuff out into linux/proc_ns.h
Split the proc namespace stuff out into linux/proc_ns.h.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: netdev@vger.kernel.org
cc: Serge E. Hallyn <serge.hallyn@ubuntu.com>
cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-05-01 17:29:39 -04:00
David Howells
303eb7e2c9 Include missing linux/magic.h inclusions
Include missing linux/magic.h inclusions where the source file is currently
expecting to get magic numbers through linux/proc_fs.h.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-efi@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-29 15:42:01 -04:00
David Howells
3cb5bf1bf9 proc: Delete create_proc_read_entry()
Delete create_proc_read_entry() as it no longer has any users.

Also delete read_proc_t, write_proc_t, the read_proc member of the
proc_dir_entry struct and the support functions that use them.  This saves a
pointer for every PDE allocated.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-29 15:42:00 -04:00
Al Viro
05c0ae21c0 try a saner locking for pde_opener...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 15:16:52 -04:00
Al Viro
ca469f35a8 deal with races between remove_proc_entry() and proc_reg_release()
* serialize the call of ->release() on per-pdeo mutex
* don't remove pdeo from per-pde list until we are through with it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 15:16:51 -04:00
Al Viro
866ad9a747 procfs: preparations for remove_proc_entry() race fixes
* leave ->proc_fops alone; make ->pde_users negative instead
* trim pde_opener
* move relevant code in fs/proc/inode.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 15:16:51 -04:00
Al Viro
b6cdc73103 procfs: don't allow to use proc_create, create_proc_entry, etc. for directories
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:13:14 -04:00
Al Viro
021ada7dff procfs: switch /proc/self away from proc_dir_entry
Just have it pinned in dcache all along and let procfs ->kill_sb()
drop it before kill_anon_super().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:13:05 -04:00
Linus Torvalds
51f0885e54 vfs,proc: guarantee unique inodes in /proc
Dave Jones found another /proc issue with his Trinity tool: thanks to
the namespace model, we can have multiple /proc dentries that point to
the same inode, aliasing directories in /proc/<pid>/net/ for example.

This ends up being a total disaster, because it acts like hardlinked
directories, and causes locking problems.  We rely on the topological
sort of the inodes pointed to by dentries, and if we have aliased
directories, that odering becomes unreliable.

In short: don't do this.  Multiple dentries with the same (directory)
inode is just a bad idea, and the namespace code should never have
exposed things this way.  But we're kind of stuck with it.

This solves things by just always allocating a new inode during /proc
dentry lookup, instead of using "iget_locked()" to look up existing
inodes by superblock and number.  That actually simplies the code a bit,
at the cost of potentially doing more inode [de]allocations.

That said, the inode lookup wasn't free either (and did a lot of locking
of inodes), so it is probably not that noticeable.  We could easily keep
the old lookup model for non-directory entries, but rather than try to
be excessively clever this just implements the minimal and simplest
workaround for the problem.

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Analyzed-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-22 11:44:04 -07:00
Andrew Morton
87ebdc00ee fs/proc: clean up printks
- use pr_foo() throughout

- remove a couple of duplicated KERN_WARNINGs, via WARN(KERN_WARNING "...")

- nuke a few warnings which I've never seen happen, ever.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:11 -08:00
Al Viro
d3d009cb96 saner proc_get_inode() calling conventions
Make it drop the pde in *all* cases when no new reference to it is
put into an inode - both when an inode had already been set up
(as we were already doing) and when inode allocation has failed.
Makes for simpler logics in callers...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26 02:46:14 -05:00
Maxim Patlasov
87e0aab37f proc: avoid extra pde_put() in proc_fill_super()
If proc_get_inode() succeeded, but d_make_root() failed, pde_put() for
proc_root will be called twice: the first time due to iput() called from
d_make_root() and the second time directly in the end of
proc_fill_super().

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26 02:46:14 -05:00
Al Viro
496ad9aa8e new helper: file_inode(file)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-22 23:31:31 -05:00
Eric W. Biederman
bf056bfa80 proc: Fix the namespace inode permission checks.
Change the proc namespace files into symlinks so that
we won't cache the dentries for the namespace files
which can bypass the ptrace_may_access checks.

To support the symlinks create an additional namespace
inode with it's own set of operations distinct from the
proc pid inode and dentry methods as those no longer
make sense.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-11-20 04:19:48 -08:00
yan
0e06936057 proc: no need to initialize proc_inode->fd in proc_get_inode()
proc_get_inode() obtains the inode via a call to iget_locked().
iget_locked() calls alloc_inode() which will call proc_alloc_inode() which
clears proc_inode.fd, so there is no need to clear this field in
proc_get_inode().

If iget_locked() instead found the inode via find_inode_fast(), that inode
will not have I_NEW set so this change has no effect.

Signed-off-by: yan <clouds.yan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:05:18 +09:00
Linus Torvalds
90324cc1b1 Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux
Pull writeback tree from Wu Fengguang:
 "Mainly from Jan Kara to avoid iput() in the flusher threads."

* tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
  writeback: Avoid iput() from flusher thread
  vfs: Rename end_writeback() to clear_inode()
  vfs: Move waiting for inode writeback from end_writeback() to evict_inode()
  writeback: Refactor writeback_single_inode()
  writeback: Remove wb->list_lock from writeback_single_inode()
  writeback: Separate inode requeueing after writeback
  writeback: Move I_DIRTY_PAGES handling
  writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()
  writeback: Move clearing of I_SYNC into inode_sync_complete()
  writeback: initialize global_dirty_limit
  fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds
  mm: page-writeback.c: local functions should not be exposed globally
2012-05-28 09:54:45 -07:00
Eric W. Biederman
dcb0f22282 userns: Convert proc to use kuid/kgid where appropriate
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-05-15 14:59:28 -07:00
Jan Kara
dbd5768f87 vfs: Rename end_writeback() to clear_inode()
After we moved inode_sync_wait() from end_writeback() it doesn't make sense
to call the function end_writeback() anymore. Rename it to clear_inode()
which well says what the function really does - set I_CLEAR flag.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
2012-05-06 13:43:41 +08:00
David Howells
9ffc93f203 Remove all #inclusions of asm/system.h
Remove all #inclusions of asm/system.h preparatory to splitting and killing
it.  Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`

Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-28 18:30:03 +01:00
Al Viro
48fde701af switch open-coded instances of d_make_root() to new helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:35 -04:00
Al Viro
6b4231e2f9 procfs: clean proc_fill_super() up
First of all, there's no need to zero ->i_uid/->i_gid on root inode -
both had been set to zero already.  Moreover, let's take the iput()
on failure to the failure exit it belongs to...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:34 -04:00
Vasiliy Kulikov
0499680a42 procfs: add hidepid= and gid= mount options
Add support for mount options to restrict access to /proc/PID/
directories.  The default backward-compatible "relaxed" behaviour is left
untouched.

The first mount option is called "hidepid" and its value defines how much
info about processes we want to be available for non-owners:

hidepid=0 (default) means the old behavior - anybody may read all
world-readable /proc/PID/* files.

hidepid=1 means users may not access any /proc/<pid>/ directories, but
their own.  Sensitive files like cmdline, sched*, status are now protected
against other users.  As permission checking done in proc_pid_permission()
and files' permissions are left untouched, programs expecting specific
files' modes are not confused.

hidepid=2 means hidepid=1 plus all /proc/PID/ will be invisible to other
users.  It doesn't mean that it hides whether a process exists (it can be
learned by other means, e.g.  by kill -0 $PID), but it hides process' euid
and egid.  It compicates intruder's task of gathering info about running
processes, whether some daemon runs with elevated privileges, whether
another user runs some sensitive program, whether other users run any
program at all, etc.

gid=XXX defines a group that will be able to gather all processes' info
(as in hidepid=0 mode).  This group should be used instead of putting
nonroot user in sudoers file or something.  However, untrusted users (like
daemons, etc.) which are not supposed to monitor the tasks in the whole
system should not be added to the group.

hidepid=1 or higher is designed to restrict access to procfs files, which
might reveal some sensitive private information like precise keystrokes
timings:

http://www.openwall.com/lists/oss-security/2011/11/05/3

hidepid=1/2 doesn't break monitoring userspace tools.  ps, top, pgrep, and
conky gracefully handle EPERM/ENOENT and behave as if the current user is
the only user running processes.  pstree shows the process subtree which
contains "pstree" process.

Note: the patch doesn't deal with setuid/setgid issues of keeping
preopened descriptors of procfs files (like
https://lkml.org/lkml/2011/2/7/368).  We rely on that the leaked
information like the scheduling counters of setuid apps doesn't threaten
anybody's privacy - only the user started the setuid program may read the
counters.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg KH <greg@kroah.com>
Cc: Theodore Tso <tytso@MIT.EDU>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: James Morris <jmorris@namei.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-10 16:30:54 -08:00
Vasiliy Kulikov
97412950b1 procfs: parse mount options
Add support for procfs mount options.  Actual mount options are coming in
the next patches.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg KH <greg@kroah.com>
Cc: Theodore Tso <tytso@MIT.EDU>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: James Morris <jmorris@namei.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-10 16:30:54 -08:00
Al Viro
6b520e0565 vfs: fix the stupidity with i_dentry in inode destructors
Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
the cost of taking it into inode_init_always() will be negligible for pipes
and sockets and negative for everything else.  Not to mention the removal of
boilerplate code from ->destroy_inode() instances...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:40 -05:00
Miklos Szeredi
bfe8684869 filesystems: add set_nlink()
Replace remaining direct i_nlink updates with a new set_nlink()
updater function.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-11-02 12:53:43 +01:00