* de_get() is trivial -- make inline, save a few bits of code, drop
"refcount is 0" check -- it should be done in some generic refcount
code, don't recall it's was helpful
* rename GET and PUT functions to pde_get(), pde_put() for cool prefix!
* remove obvious and incorrent comments
* in remove_proc_entry() use pde_put(), when I fixed PDE refcounting to
be normal one, remove_proc_entry() was supposed to do "-1" and code now
reflects that.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
struct proc_dir_entry::owner is going to be removed. Now it's only necessary
to protect PDEs which are using ->read_proc, ->write_proc hooks.
However, ->owner assignments are racy and make it very easy for someone to switch
->owner on live PDE (as some subsystems do) without fixing refcounts and so on.
http://bugzilla.kernel.org/show_bug.cgi?id=12454
So, ->owner is on death row.
Proxy file operations exist already (proc_file_operations), just bump usecount
when necessary.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
There are four BKL users in proc: de_put(), proc_lookup_de(),
proc_readdir_de(), proc_root_readdir(),
1) de_put()
-----------
de_put() is classic atomic_dec_and_test() refcount wrapper -- no BKL
needed. BKL doesn't matter to possible refcount leak as well.
2) proc_lookup_de()
-------------------
Walking PDE list is protected by proc_subdir_lock(), proc_get_inode() is
potentially blocking, all callers of proc_lookup_de() eventually end up
from ->lookup hooks which is protected by directory's ->i_mutex -- BKL
doesn't protect anything.
3) proc_readdir_de()
--------------------
"." and ".." part doesn't need BKL, walking PDE list is under
proc_subdir_lock, calling filldir callback is potentially blocking
because it writes to luserspace. All proc_readdir_de() callers
eventually come from ->readdir hook which is under directory's
->i_mutex -- BKL doesn't protect anything.
4) proc_root_readdir_de()
-------------------------
proc_root_readdir_de is ->readdir hook, see (3).
Since readdir hooks doesn't use BKL anymore, switch to
generic_file_llseek, since it also takes directory's i_mutex.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Use WARN() rather than a printk() + backtrace();
this gives a more standard format message as well as complete
information (including line numbers etc) that will be collected
by kerneloops.org
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Print parent directory name as well.
The aim is to catch non-creation of parent directory when proc_mkdir will
return NULL and all subsequent registrations go directly in /proc instead
of intended directory.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Fixed insane printk string while at it. - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ouch, if number taken from IDA is too big, the intent was to signal an
error, not check for overflow and still do overflowing addition.
One still needs 2^28 proc entries to notice this.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
proc doesn't use "associate pointer with id" feature of IDR, so switch
to IDA.
NOTE, NOTE, NOTE:
Do not apply if release_inode_number() still mantions MAX_ID_MASK!
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Id which proc gets from IDR for inode number and id which proc removes
from IDR do not match. E.g. 0x11a transforms into 0x8000011a.
Which stayed unnoticed for a long time because, surprise, idr_remove()
masks out that high bit before doing anything.
All of this due to "| ~MAX_ID_MASK" in release_inode_number().
I still don't understand how it's supposed to work, because "| ~MASK"
is not an inversion for "& MAX" operation.
So, use just one nice, working addition. Make start offset unsigned int,
while I'm at it. It's longness is not used anywhere.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Use WARN() instead of a printk+WARN_ON() pair; this way the message
becomes part of the warning section for better reporting/collection.
This way, the entire if() {} section can collapse into the WARN() as well.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current two-stage scheme of removing PDE emphasizes one bug in proc:
open
rmmod
remove_proc_entry
close
->release won't be called because ->proc_fops were cleared. In simple
cases it's small memory leak.
For every ->open, ->release has to be done. List of openers is introduced
which is traversed at remove_proc_entry() if neeeded.
Discussions with Al long ago (sigh).
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In this unfortunate case, proc_mkdir_mode wrapper can't be used anymore and
this is no way to reuse proc_create_data due to nlinks assignment. So,
copy the code from proc_mkdir and assign PDE->data at the appropriate
moment.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that last dozen or so users of ->get_info were removed, ditch it too.
Everyone sane shouldd have switched to seq_file interface long ago.
P.S.: Co-existing 3 interfaces (->get_info/->read_proc/->proc_fops) for proc
is long-standing crap, BTW, thus
a) put ->read_proc/->write_proc/read_proc_entry() users on death row,
b) new such users should be rejected,
c) everyone is encouraged to convert his favourite ->read_proc user or
I'll do it, lazy bastards.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
proc-misc code is noticeably full of "if (de)" checks when PDE passed is
always valid. Remove them.
Addition of such check in proc_lookup_de() is for failed lookup case.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If valid "parent" is passed to proc_create/remove_proc_entry(), then name of
PDE should consist of only one path component, otherwise creation or or
removal will fail. However, if NULL is passed as parent then create/remove
accept full path as a argument. This is arbitrary restriction -- all
infrastructure is in place.
So, patch allows the following to succeed:
create_proc_entry("foo/bar", 0, pde_baz);
remove_proc_entry("baz/foo/bar", &proc_root);
Also makes the following to behave identically:
create_proc_entry("foo/bar", 0, NULL);
create_proc_entry("foo/bar", 0, &proc_root);
Discrepancy noticed by Den Lunev (IIRC).
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
proc_subdir_lock protects only modifying and walking through PDE lists, so
after we've found PDE to remove and actually removed it from lists, there is
no need to hold proc_subdir_lock for the rest of operation.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current /proc/net is done with so called "shadows", but current
implementation is broken and has little chances to get fixed.
The problem is that dentries subtree of /proc/net directory has
fancy revalidation rules to make processes living in different
net namespaces see different entries in /proc/net subtree, but
currently, tasks see in the /proc/net subdir the contents of any
other namespace, depending on who opened the file first.
The proposed fix is to turn /proc/net into a symlink, which points
to /proc/self/net, which in turn shows what previously was in
/proc/net - the network-related info, from the net namespace the
appropriate task lives in.
# ls -l /proc/net
lrwxrwxrwx 1 root root 8 Mar 5 15:17 /proc/net -> self/net
In other words - this behaves like /proc/mounts, but unlike
"mounts", "net" is not a file, but a directory.
Changes from v2:
* Fixed discrepancy of /proc/net nlink count and selinux labeling
screwup pointed out by Stephen.
To get the correct nlink count the ->getattr callback for /proc/net
is overridden to read one from the net->proc_net entry.
To make selinux still work the net->proc_net entry is initialized
properly, i.e. with the "net" name and the proc_net parent.
Selinux fixes are
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Changes from v1:
* Fixed a task_struct leak in get_proc_task_net, pointed out by Paul.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Typical PDE creation code looks like:
pde = create_proc_entry("foo", 0, NULL);
if (pde)
pde->proc_fops = &foo_proc_fops;
Notice that PDE is first created, only then ->proc_fops is set up to
final value. This is a problem because right after creation
a) PDE is fully visible in /proc , and
b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's
possible to ->read without ->open (see one class of oopses below).
The fix is new API called proc_create() which makes sure ->proc_fops are
set up before gluing PDE to main tree. Typical new code looks like:
pde = proc_create("foo", 0, NULL, &foo_proc_fops);
if (!pde)
return -ENOMEM;
Fix most networking users for a start.
In the long run, create_proc_entry() for regular files will go.
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000024
printing eip: c1188c1b *pdpt = 000000002929e001 *pde = 0000000000000000
Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/block/sda/sda1/dev
Modules linked in: foo af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom
Pid: 24679, comm: cat Not tainted (2.6.24-rc3-mm1 #2)
EIP: 0060:[<c1188c1b>] EFLAGS: 00210002 CPU: 0
EIP is at mutex_lock_nested+0x75/0x25d
EAX: 000006fe EBX: fffffffb ECX: 00001000 EDX: e9340570
ESI: 00000020 EDI: 00200246 EBP: e9340570 ESP: e8ea1ef8
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 24679, ti=E8EA1000 task=E9340570 task.ti=E8EA1000)
Stack: 00000000 c106f7ce e8ee05b4 00000000 00000001 458003d0 f6fb6f20 fffffffb
00000000 c106f7aa 00001000 c106f7ce 08ae9000 f6db53f0 00000020 00200246
00000000 00000002 00000000 00200246 00200246 e8ee05a0 fffffffb e8ee0550
Call Trace:
[<c106f7ce>] seq_read+0x24/0x28a
[<c106f7aa>] seq_read+0x0/0x28a
[<c106f7ce>] seq_read+0x24/0x28a
[<c106f7aa>] seq_read+0x0/0x28a
[<c10818b8>] proc_reg_read+0x60/0x73
[<c1081858>] proc_reg_read+0x0/0x73
[<c105a34f>] vfs_read+0x6c/0x8b
[<c105a6f3>] sys_read+0x3c/0x63
[<c10025f2>] sysenter_past_esp+0x5f/0xa5
[<c10697a7>] destroy_inode+0x24/0x33
=======================
INFO: lockdep is turned off.
Code: 75 21 68 e1 1a 19 c1 68 87 00 00 00 68 b8 e8 1f c1 68 25 73 1f c1 e8 84 06 e9 ff e8 52 b8 e7 ff 83 c4 10 9c 5f fa e8 28 89 ea ff <f0> fe 4e 04 79 0a f3 90 80 7e 04 00 7e f8 eb f0 39 76 34 74 33
EIP: [<c1188c1b>] mutex_lock_nested+0x75/0x25d SS:ESP 0068:e8ea1ef8
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Print a warning if PDE is registered with a name which already exists in
target directory.
Bug report and a simple fix can be found here:
http://bugzilla.kernel.org/show_bug.cgi?id=8798
[\n fixlet and no undescriptive variable usage --adobriyan]
[akpm@linux-foundation.org: make printk comprehensible]
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
proc symlinks always have valid ->data containing destination of symlink. No
need to check it on removal -- proc_symlink() already done it.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pseudo-code for lookup effectively is:
LOCK kernel
LOCK proc_subdir_lock
find PDE
UNLOCK proc_subdir_lock
get inode
LOCK proc_subdir_lock
goto unlock
UNLOCK proc_subdir_lock
UNLOCK kernel
We can get rid of LOCK/UNLOCK pair after getting inode simply by jumping
to unlock_kernel() directly.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>