commit 67245ff332064c01b760afa7a384ccda024bfd24 upstream.
This gets rid of the horrible notion of having that
struct inode *ptmx_inode
be the linchpin of the interface between the pty code and devpts.
By de-emphasizing the ptmx inode, a lot of things actually get cleaner,
and we will have a much saner way forward. In particular, this will
allow us to associate with any particular devpts instance at open-time,
and not be artificially tied to one particular ptmx inode.
The patch itself is actually fairly straightforward, and apart from some
locking and return path cleanups it's pretty mechanical:
- the interfaces that devpts exposes all take "struct pts_fs_info *"
instead of "struct inode *ptmx_inode" now.
NOTE! The "struct pts_fs_info" thing is a completely opaque structure
as far as the pty driver is concerned: it's still declared entirely
internally to devpts. So the pty code can't actually access it in any
way, just pass it as a "cookie" to the devpts code.
- the "look up the pts fs info" is now a single clear operation, that
also does the reference count increment on the pts superblock.
So "devpts_add/del_ref()" is gone, and replaced by a "lookup and get
ref" operation (devpts_get_ref(inode)), along with a "put ref" op
(devpts_put_ref()).
- the pty master "tty->driver_data" field now contains the pts_fs_info,
not the ptmx inode.
- because we don't care about the ptmx inode any more as some kind of
base index, the ref counting can now drop the inode games - it just
gets the ref on the superblock.
- the pts_fs_info now has a back-pointer to the super_block. That's so
that we can easily look up the information we actually need. Although
quite often, the pts fs info was actually all we wanted, and not having
to look it up based on some magical inode makes things more
straightforward.
In particular, now that "devpts_get_ref(inode)" operation should really
be the *only* place we need to look up what devpts instance we're
associated with, and we do it exactly once, at ptmx_open() time.
The other side of this is that one ptmx node could now be associated
with multiple different devpts instances - you could have a single
/dev/ptmx node, and then have multiple mount namespaces with their own
instances of devpts mounted on /dev/pts/. And that's all perfectly sane
in a model where we just look up the pts instance at open time.
This will eventually allow us to get rid of our odd single-vs-multiple
pts instance model, but this patch in itself changes no semantics, only
an internal binding model.
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Jann Horn <jann@thejh.net>
Cc: Greg KH <greg@kroah.com>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Florian Weimer <fw@deneb.enyo.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Francesco Ruggeri <fruggeri@arista.com>
Cc: "Herton R. Krzesinski" <herton@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f55c718c290616889c04946864a13ef30f64929 upstream.
Considering current pty code and multiple devpts instances, it's possible
to umount a devpts file system while a program still has /dev/tty opened
pointing to a previosuly closed pty pair in that instance. In the case all
ptmx and pts/N files are closed, umount can be done. If the program closes
/dev/tty after umount is done, devpts_kill_index will use now an invalid
super_block, which was already destroyed in the umount operation after
running ->kill_sb. This is another "use after free" type of issue, but now
related to the allocated super_block instance.
To avoid the problem (warning at ida_remove and potential crashes) for
this specific case, I added two functions in devpts which grabs additional
references to the super_block, which pty code now uses so it makes sure
the super block structure is still valid until pty shutdown is done.
I also moved the additional inode references to the same functions, which
also covered similar case with inode being freed before /dev/tty final
close/shutdown.
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If devpts failed to initialize, it would store an ERR_PTR in the global
devpts_mnt. A subsequent open of /dev/ptmx would call devpts_new_index,
which would dereference devpts_mnt and crash.
Avoid storing invalid values in devpts_mnt; leave it NULL instead. Make
both devpts_new_index and devpts_pty_new fail gracefully with ENODEV in
that case, which then becomes the return value to the userspace open call
on /dev/ptmx.
[akpm@linux-foundation.org: remove unneeded static]
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
that's the bulk of filesystem drivers dealing with inodes of their own
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Also convert spaces to tabs (checkpatch warnings) if (!dentry) KERN_NOTICE
converted to pr_err (like if (!inode) error process)
[akpm@linux-foundation.org: use KBUILD_MODNAME, per Joe]
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When devpts is unmounted, there may be a no-longer-used IDR tree hanging
off the superblock we are about to kill. This needs to be cleaned up
before destroying the SB.
The leak is usually not a big deal because unmounting devpts is typically
done when shutting down the whole machine. However, shutting down an LXC
container instead of a physical machine exposes the problem (the garbage
is detectable with kmemleak).
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- The context in which devpts is mounted has no effect on the creation
of ptys as the /dev/ptmx interface has been used by unprivileged
users for many years.
- Only support unprivileged mounts in combination with the newinstance
option to ensure that mounting of /dev/pts in a user namespace will
not allow the options of an existing mount of devpts to be modified.
- Create /dev/pts/ptmx as the root user in the user namespace that
mounts devpts so that it's permissions to be changed.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
The goal is to stop setting and using tty->driver_data in devpts code.
It should be used solely by the driver's code, pty in this case.
Now driver_data are managed only in the pty driver. devpts_pty_new is
switched to accept what we used to dig out of tty_struct, i.e. device
node number and index.
This also removes a note about driver_data being set outside of the
driver.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The goal is to stop setting and using tty->driver_data in devpts code.
It should be used solely by the driver's code, pty in this case.
For the cleanup of layering, we will need the inode created in
devpts_pty_new to be stored into slave's driver_data. So we convert
devpts_pty_new to return the inode or an ERR_PTR-encoded error in case
of failure.
The move of 'inode = new_inode(sb);' from declarators to the code is
only cosmetical, but it makes the code easier to read.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The goal is to stop setting and using tty->driver_data in devpts code.
It should be used solely by the driver's code, pty in this case.
First, here we remove TTY from devpts_get_tty and rename it to
devpts_get_priv. Note we do not remove type safety, we just shift the
[implicit] (void *) cast one layer up.
index was unused in devpts_get_tty, so remove that from the prototype
too.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pass mount flags to sget() so that it can use them in initialising a new
superblock before the set function is called. They could also be passed to the
compare function.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull vfs pile 1 from Al Viro:
"This is _not_ all; in particular, Miklos' and Jan's stuff is not there
yet."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits)
ext4: initialization of ext4_li_mtx needs to be done earlier
debugfs-related mode_t whack-a-mole
hfsplus: add an ioctl to bless files
hfsplus: change finder_info to u32
hfsplus: initialise userflags
qnx4: new helper - try_extent()
qnx4: get rid of qnx4_bread/qnx4_getblk
take removal of PF_FORKNOEXEC to flush_old_exec()
trim includes in inode.c
um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it
um: embed ->stub_pages[] into mmu_context
gadgetfs: list_for_each_safe() misuse
ocfs2: fix leaks on failure exits in module_init
ecryptfs: make register_filesystem() the last potential failure exit
ntfs: forgets to unregister sysctls on register_filesystem() failure
logfs: missing cleanup on register_filesystem() failure
jfs: mising cleanup on register_filesystem() failure
make configfs_pin_fs() return root dentry on success
configfs: configfs_create_dir() has parent dentry in dentry->d_parent
configfs: sanitize configfs_create()
...
After adding devpts multiple-insrances sysctl kernel.pty.max limit pty count for
each devpts instance independently, while kernel.pty.nr shows total pty count.
This patch restores sysctl kernel.pty.max as global limit (4096 by default),
adds pty reseve for main devpts (mounted without "newinstance" argument),
and new sysctl to tune it: kernel.pty.reserve (1024 by default)
Also it adds devpts mount option "max=%d" to limit pty count for each devpts
instance independently. (by default NR_UNIX98_PTY_MAX == 2^20)
Thus devpts instances in containers cannot eat up all available pty even if we didn't
set any limits, while with "max" argument we can adjust limits more precisely.
Plus, now open("/dev/ptmx") return -ENOSPC in case lack of pty indexes,
this is more informative than -EIO.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Let's move this stuff to the better place, where we can account pty right in
tty-indexes managing code.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
devpts_kill_sb() is called even if devpts_fill_super() fails;
we should not do that kfree() in the latter, especially not
with ->s_fs_info left pointing to freed object. Double kfree()
is a Bad Thing(tm)...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Replace remaining direct i_nlink updates with a new set_nlink()
updater function.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Replace direct i_nlink updates with the respective updater function
(inc_nlink, drop_nlink, clear_nlink, inode_dec_link_count).
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
In this case nobody can open a slave point, so will be better return
from devpts_pty_new()
Now we should not check error code from d_find_alias() in
devpts_pty_kill(), because the dentry exists all times.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>