The whole write-room thing is something that is up to the _caller_ to
worry about, not the pty layer itself. The total buffer space will
still be limited by the buffering routines themselves, so there is no
advantage or need in having pty_write() artificially limit the size
somehow.
And what happened was that the caller (the n_tty line discipline, in
this case) may have verified that there is room for 2 bytes to be
written (for NL -> CRNL expansion), and it used to then do those writes
as two single-byte writes. And if the first byte written (CR) then
caused a new tty buffer to be allocated, pty_space() may have returned
zero when trying to write the second byte (LF), and then incorrectly
failed the write - leading to a lost newline character.
This should finally fix
http://bugzilla.kernel.org/show_bug.cgi?id=14015
Reported-by: Mikael Pettersson <mikpe@it.uu.se>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When translating CR to CRNL in the n_tty line discipline, we did it as
two tty_put_char() calls. Which works, but is stupid, and has caused
problems before too with bad interactions with the write_room() logic.
The generic USB serial driver had that problem, for example.
Now the pty layer had similar issues after being moved to the generic
tty buffering code (in commit d945cb9cce:
"pty: Rework the pty layer to use the normal buffering logic").
So stop doing the silly separate two writes, and do it as a single write
instead. That's what the n_tty layer already does for the space
expansion of tabs (XTABS), and it means that we'll now always have just
a single write for the CRNL to match the single 'tty_write_room()' test,
which hopefully means that the next time somebody screws up buffering,
it won't cause weeks of debugging.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When I rewrote tty ldisc code to use proper reference counts (commits
65b770468e and cbe9352fa0) in order to avoid a race with hangup, the
test-program that Eric Biederman used to trigger the original problem
seems to have exposed another long-standing bug: the hangup code did the
'tty_ldisc_halt()' to stop any buffer flushing activity, but unlike the
other call sites it never actually flushed any pending work.
As a result, if you get just the right timing, the pending work may be
just about to execute (ie the timer has already triggered and thus
cancel_delayed_work() was a no-op), when we then re-initialize the ldisc
from under it.
That, in turn, results in various random problems, usually seen as a
NULL pointer dereference in run_timer_softirq() or a BUG() in
worker_thread (but it can be almost anything).
Fix it by adding the required 'flush_scheduled_work()' after doing the
tty_ldisc_halt() (this also requires us to move the ldisc halt to before
taking the ldisc mutex in order to avoid a deadlock with the workqueue
executing do_tty_hangup, which requires the mutex).
The locking should be cleaned up one day (the requirement to do this
outside the ldisc_mutex is very annoying, and weakens the lock), but
that's a larger and separate undertaking.
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Tested-by: Xiaotian Feng <xtfeng@gmail.com>
Tested-by: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit d945cb9cc ("pty: Rework the pty layer to use the normal buffering
logic") dropped the test for 'tty->stopped' in pty_write_room(), which
then causes the n_tty line discipline thing to not throttle the data
properly when the tty is stopped.
So instead of pausing the write due to the tty being stopped, the ldisc
layer would go ahead and push it down to the pty. The pty write()
routine would then refuse to take the data (because it _did_ check
'stopped'), and the data wouldn't actually be written.
This whole stopped test should eventually be moved into the tty ldisc
layer rather than have low-level tty drivers care about these things,
but right now the fix is to just re-instate the missing pty 'stopped'
handling.
Reported-and-tested-by: Artur Skawina <art.08.09@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
tty-ldisc: be more careful in 'put_ldisc' locking
tty-ldisc: turn ldisc user count into a proper refcount
tty-ldisc: make refcount be atomic_t 'users' count
Use 'atomic_dec_and_lock()' to make sure that we always hold the
tty_ldisc_lock when the ldisc count goes to zero. That way we can never
race against 'tty_ldisc_try()' increasing the count again.
Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
By using the user count for the actual lifetime rules, we can get rid of
the silly "wait_for_idle" logic, because any busy ldisc will
automatically stay around until the last user releases it. This avoids
a host of odd issues, and simplifies the code.
So now, when the last ldisc reference is dropped, we just release the
ldisc operations struct reference, and free the ldisc.
It looks obvious enough, and it does work for me, but the counting
_could_ be off. It probably isn't (bad counting in the new version would
generally imply that the old code did something really bad, like free an
ldisc with a non-zero count), but it does need some testing, and
preferably somebody looking at it.
With this change, both 'tty_ldisc_put()' and 'tty_ldisc_deref()' are
just aliases for the new ref-counting 'put_ldisc()'. Both of them
decrement the ldisc user count and free it if it goes down to zero.
They're identical functions, in other words.
But the reason they still exist as sepate functions is that one of them
was exported (tty_ldisc_deref) and had a stupid name (so I don't want to
use it as the main name), and the other one was used in multiple places
(and I didn't want to make the patch larger just to rename the users).
In addition to the refcounting, I did do some minimal cleanup. For
example, now "tty_ldisc_try()" actually returns the ldisc it got under
the lock, rather than returning true/false and then the caller would
look up the ldisc again (now without the protection of the lock).
That said, there's tons of dubious use of 'tty->ldisc' without obviously
proper locking or refcounting left. I expressly did _not_ want to try to
fix it all, keeping the patch minimal. There may or may not be bugs in
that kind of code, but they wouldn't be _new_ bugs.
That said, even if the bugs aren't new, the timing and lifetime will
change. For example, some silly code may depend on the 'tty->ldisc'
pointer not changing because they hold a refcount on the 'ldisc'. And
that's no longer true - if you hold a ref on the ldisc, the 'ldisc'
itself is safe, but tty->ldisc may change.
So the proper locking (remains) to hold tty->ldisc_mutex if you expect
tty->ldisc to be stable. That's not really a _new_ rule, but it's an
example of something that the old code might have unintentionally
depended on and hidden bugs.
Whatever. The patch _looks_ sensible to me. The only users of
ldisc->users are:
- get_ldisc() - atomically increment the count
- put_ldisc() - atomically decrements the count and releases if zero
- tty_ldisc_try_get() - creates the ldisc, and sets the count to 1.
The ldisc should then either be released, or be attached to a tty.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is pure preparation of changing the ldisc reference counting to be
a true refcount that defines the lifetime of the ldisc. But this is a
purely syntactic change for now to make the next steps easier.
This patch should make no semantic changes at all. But I wanted to make
the ldisc refcount be an atomic (I will be touching it without locks
soon enough), and I wanted to rename it so that there isn't quite as
much confusion between 'ldo->refcount' (ldisk operations refcount) and
'ld->refcount' (ldisc refcount itself) in the same file.
So it's now an atomic 'ld->users' count. It still starts at zero,
despite having a reference from 'tty->ldisc', but that will change once
we turn it into a _real_ refcount.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix those compiler warnings, which indeed point to a bug:
drivers/char/agp/parisc-agp.c:228: warning: initialization from incompatible pointer type
drivers/char/agp/parisc-agp.c:201: warning: 'parisc_agp_page_mask_memory' defined but not used
Signed-off-by: Helge Deller <deller@gmx.de>
commit d6580a9f15 ("kexec: sysrq: simplify
sysrq-c handler") changed the behavior of sysrq-c to unconditional
dereference of NULL pointer. So in cases with CONFIG_KEXEC, where
crash_kexec() was directly called from sysrq-c before, now it can be said
that a step of "real oops" was inserted before starting kdump.
However, in contrast to oops via SysRq-c from keyboard which results in
panic due to in_interrupt(), oops via "echo c > /proc/sysrq-trigger" will
not become panic unless panic_on_oops=1. It means that even if dump is
properly configured to be taken on panic, the sysrq-c from proc interface
might not start crashdump while the sysrq-c from keyboard can start
crashdump. This confuses traditional users of kdump, i.e. people who
expect sysrq-c to do common behavior in both of the keyboard and proc
interface.
This patch brings the keyboard and proc interface behavior of sysrq-c in
line, by forcing panic_on_oops=1 before oops in sysrq-c handler.
And some updates in documentation are included, to clarify that there is
no longer dependency with CONFIG_KEXEC, and that now the system can just
crash by sysrq-c if no dump mechanism is configured.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Brayan Arraes <brayan@yack.com.br>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We really don't want to mark the pty as a low-latency device, because as
Alan points out, the ->write method can be called from an IRQ (ppp?),
and that means we can't use ->low_latency=1 as we take mutexes in the
low_latency case.
So rather than using low_latency to force the written data to be pushed
to the ldisc handling at 'write()' time, just make the reader side (or
the poll function) do the flush when it checks whether there is data to
be had.
This also fixes the problem with lost data in an emacs compile buffer
(bugzilla 13815), and we can thus revert the low_latency pty hack
(commit 3a54297478: "pty: quickfix for the
pty ENXIO timing problems").
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[ Modified to do the tty_flush_to_ldisc() inside input_available_p() so
that it triggers for both read and poll() - Linus]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This function does not have an error return and returning an error is
instead interpreted as having a lot of pending bytes.
Reported by Jeff Harris who provided a list of some of the remaining
offenders.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If spin_lock_irqsave is called twice in a row with the same second
argument, the interrupt state at the point of the second call overwrites
the value saved by the first call. Indeed, the second call does not
need to save the interrupt state, so it is changed to a simple
spin_lock.
The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
expression lock1,lock2;
expression flags;
@@
*spin_lock_irqsave(lock1,flags)
... when != flags
*spin_lock_irqsave(lock2,flags)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The buffer for the consoles are unconditionally allocated at con_init()
time, which miss the creation of the vcs(a) devices.
Since 2.6.30 (commit 4995f8ef9d, 'vcs:
hook sysfs devices into object lifetime instead of "binding"' to be
exact) these devices are no longer created at open() and removed on
close(), but controlled by the lifetime of the buffers.
Reported-by: Gerardo Exequiel Pozzi <vmlinuz386@yahoo.com.ar>
Tested-by: Gerardo Exequiel Pozzi <vmlinuz386@yahoo.com.ar>
Cc: stable@kernel.org
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a tty in N_TTY mode with echo enabled manages to get itself into a state
where
- echo characters are pending
- FASYNC is enabled
- tty_write_wakeup is called from either
- a device write path (pty)
- an IRQ (serial)
then it either deadlocks or explodes taking a mutex in the IRQ path.
On the serial side it is almost impossible to reproduce because you have to
go from a full serial port to a near empty one with echo characters
pending. The pty case happens to have become possible to trigger using
emacs and ptys, the pty changes having created a scenario which shows up
this bug.
The code path is
n_tty:process_echoes() (takes mutex)
tty_io:tty_put_char()
pty:pty_write (or serial paths)
tty_wakeup (from pty_write or serial IRQ)
n_tty_write_wakeup()
process_echoes()
*KABOOM*
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bootmem is not used for the vt screen buffer anymore as slab is now
available at the time the console is initialized.
Get rid of the now superfluous distinction between slab and bootmem,
it's always slab.
This also fixes a kmalloc leak which Catalin described thusly:
Commit a5f4f52e ("vt: use kzalloc() instead of the bootmem allocator")
replaced the alloc_bootmem() with kzalloc() but didn't set vc_kmalloced to
1 and the memory block is later leaked. The corresponding kmemleak trace:
unreferenced object 0xdf828000 (size 8192):
comm "swapper", pid 0, jiffies 4294937296
backtrace:
[<c006d473>] __save_stack_trace+0x17/0x1c
[<c000d869>] log_early+0x55/0x84
[<c01cfa4b>] kmemleak_alloc+0x33/0x3c
[<c006c013>] __kmalloc+0xd7/0xe4
[<c00108c7>] con_init+0xbf/0x1b8
[<c0010149>] console_init+0x11/0x20
[<c0008797>] start_kernel+0x137/0x1e4
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We can get a situation where a hangup occurs during or after a close. In
that case the ldisc gets disposed of by the close and the hangup then
explodes.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT
This will make hardirq.h inclusion cheaper for every PREEMPT=n config
(which includes allmodconfig/allyesconfig, BTW)
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 5fd29d6ccb ("printk: clean up
handling of log-levels and newlines") changed printk semantics. printk
lines with multiple KERN_<level> prefixes are no longer emitted as
before the patch.
<level> is now included in the output on each additional use.
Remove all uses of multiple KERN_<level>s in formats.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes the ppp problems and various other issues with call locking
caused by one side of a pty called in one locking context trying to match
another with differing rules on the other side. We also get a big slack
space to work with that means we can bury the flow control deadlock case
for any conceivable real world situation.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>