On final port close (and thus final tty close), only output flow
control requests in the input data should be processed. Ignore all
other input data, including parity errors, overruns and breaks.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to fcntl(2), "a SIGIO signal is sent whenever input
or output becomes possible on that file descriptor", i.e.
after the output buffer was full and now has space for new data.
But in fact SIGIO is sent after every write.
n_tty_write() should set TTY_DO_WRITE_WAKEUP only when
not all data could be written to the buffer.
[pjh: Also fixes missed SIGIO if amt written just happens to be
[ amount still to write
Signed-off-by: Johannes Stezenbach <js@sig21.net>
[pjh: minor patch edits and re-submit]
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since n_tty_check_unthrottle() is only called from n_tty_read()
which only originates from a userspace read(), the tty count cannot
be 0; the read() guarantees the file descriptor has not yet been
released.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If signal-driven i/o is disabled while write wakeup is pending (ie.,
n_tty_write() has set TTY_DO_WRITE_WAKEUP but then signal-driven i/o
is disabled), the TTY_DO_WRITE_WAKEUP bit will never be cleared and
will cause tty_wakeup() to always call n_tty_write_wakeup.
Unconditionally clear the write wakeup, and since kill_fasync()
already checks if the fasync ptr is null, call kill_fasync()
unconditionally as well.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Only the N_TTY line discipline implements the signal-driven i/o
notification enabled/disabled by fcntl(F_SETFL, O_ASYNC). The ldisc
fasync() notification is sent to the ldisc when the enable state has
changed (the tty core is notified via the fasync() VFS file operation).
The N_TTY line discipline used the enable state to change the wakeup
condition (minimum_to_wake = 1) for notifying the signal handler i/o is
available. However, just the presence of data is sufficient and necessary
to signal i/o is available, so changing minimum_to_wake is unnecessary
(and creates a race condition with read() and poll() which may be
concurrently updating minimum_to_wake).
Furthermore, since the kill_fasync() VFS helper performs no action if
the fasync list is empty, calling unconditionally is preferred; if
signal driven i/o just has been disabled, no signal will be sent by
kill_fasync() anyway so notification of the change via the ldisc
fasync() method is superfluous.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A read() in non-canonical mode when VMIN > 0 and VTIME == 0 does not
complete until at least VMIN chars have been read (or the user buffer is
full). In this infrequent read mode, n_tty_read() attempts to reduce
wakeups by computing the amount of data still necessary to complete the
read (minimum_to_wake) and only waking the read()/poll() when that much
unread data has been processed. This is the only read mode for which
new data does not necessarily generate a wakeup.
However, this optimization is broken and commonly leads to hung reads
even though the necessary amount of data has been received. Since the
optimization is of marginal value anyway, just remove the whole
thing. This also remedies a race between a concurrent poll() and
read() in this mode, where the poll() can reset the minimum_to_wake
of the read() (and vice versa).
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In canonical read mode, each line read and logged is pushed separately
with tty_audit_push(). For all single-threaded processes and multi-threaded
processes reading from only one tty, this patch has no effect; the last line
read will still be the entry pushed to the audit log because the tty
association cannot have changed between tty_audit_add_data() and
tty_audit_push().
For multi-threaded processes reading from different ttys concurrently,
the audit log will have mixed log entries anyway. Consider two ttys
audited concurrently:
CPU0 CPU1
---------- ------------
tty_audit_add_data(ttyA)
tty_audit_add_data(ttyB)
tty_audit_push()
tty_audit_add_data(ttyB)
tty_audit_push()
This patch will now cause the ttyB output to be split into separate
audit log entries.
However, this possibility is equally likely without this patch:
CPU0 CPU1
---------- ------------
tty_audit_add_data(ttyB)
tty_audit_add_data(ttyA)
tty_audit_push()
tty_audit_add_data(ttyB)
tty_audit_push()
Mixed canonical and non-canonical reads have similar races.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The tty termios bits cannot change while n_tty_read() is in the
i/o loop; the termios_rwsem ensures mutual exclusion with termios
changes in n_tty_set_termios(). Check L_ICANON() directly and
eliminate icanon parameter.
NB: tty_audit_add_data() => tty_audit_buf_get() => tty_audit_buf_alloc()
is a single path; ie., tty_audit_buf_get() and tty_audit_buf_alloc()
have no other callers.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tty audit never logs pty master reads, but packet mode only works for
pty masters, so tty_audit_add_data() was never logging packet mode
anyway.
Don't audit packet mode data. As those are the lone call sites, remove
tty_put_user().
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reduce global tty symbols; move and rename tty_ldisc_begin() as
n_tty_init() and redefine the N_TTY ldisc ops as file scope.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The chars_in_buffer() line discipline method serves no functional
purpose, other than as a (dubious) debugging aid for mostly bit-rotting
drivers. Despite being documented as an optional method, every caller
is unconditionally executed (although conditionally compiled).
Furthermore, direct tty->ldisc access without an ldisc ref is unsafe.
Lastly, N_TTY's chars_in_buffer() has warned of removal since 3.12.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Although n_tty_check_unthrottle() has a valid ldisc reference (since
the tty core gets the ldisc ref in tty_read() before calling the line
discipline read() method), it does not have a valid ldisc reference to
the "other" pty of a pty pair. Since getting an ldisc reference for
tty->link essentially open-codes tty_wakeup(), just replace with the
equivalent tty_wakeup().
Cc: <stable@vger.kernel.org>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a temporary for the computed source address and substitute
where appropriate. No functional change.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge the multiple tty_copy_to_user() calls into a single copy
sequence within tty_copy_to_user().
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since not all ttys are devices (eg., SysV ptys), dev_*() printk macros
cannot be used. Define tty_*() printk macros that output in similar
format to dev_*() macros (ie., <driver> <tty>: .....).
Transform the most-trivial printk( LEVEL ...) usage to tty_*() usage.
NB: The function name has been eliminated from messages with unique
context, or prefixed to the format when given.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40d5e0905a ("n_tty: Fix EOF push handling") fixed EOF push
for reads. However, that approach still allows a condition mismatch
between poll() and read(), where poll() returns POLLIN but read()
blocks. This state can happen when a previous read() returned because
the user buffer was full and the next character was an EOF not at the
beginning of the line. While the next read() will properly identify
the condition and advance the read buffer tail without improperly
indicating an EOF file condition (ie., read() will not mistakenly
return 0), poll() will mistakenly indicate POLLIN.
Although a possible solution would be to peek at the input buffer
in n_tty_poll(), the better solution in this patch is to eat the
EOF during the previous read() (ie., fix the problem by eliminating
the condition).
The current canon line buffer copy limits the scan for next end-of-line
to the smaller of either,
a. the remaining user buffer size
b. completed lines in the input buffer
When the remaining user buffer size is exactly one less than the
end-of-line marked by EOF push, the EOF is not scanned nor skipped
but left for subsequent reads. In the example below, the scan
index 'eol' has stopped at the EOF because it is past the scan
limit of 5 (not because it has found the next set bit in read_flags)
user buffer [*nr = 5] _ _ _ _ _
read_flags 0 0 0 0 0 1
input buffer h e l l o [EOF]
^ ^
/ /
tail eol
result: found = 0, tail += 5, *nr += 5
Instead, allow the scan to peek ahead 1 byte (while still limiting the
scan to completed lines in the input buffer). For the example above,
result: found = 1, tail += 6, *nr += 5
Because the scan limit is now bumped +1 byte, when the scan is
completed, the tail advance and the user buffer copy limit is
re-clamped to *nr when EOF is _not_ found.
Fixes: 40d5e0905a ("n_tty: Fix EOF push handling")
Cc: <stable@vger.kernel.org> # 3.12+
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce API functions to restart and cancel tty buffer work, rather
than manipulate buffer work directly.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The job_control() check in n_tty_read() has nearly identical purpose
and results as tty_check_change(). Both functions' purpose is to
determine if the current task's pgrp is the foreground pgrp for the tty,
and if not, to signal the current pgrp.
Introduce __tty_check_change() which takes the signal to send
and performs the shared operations for job control() and
tty_check_change().
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Waking the reader immediately upon receipt of TTY_BREAK or TTY_PARITY
chars has no effect on the outcome of read():
1. Only non-canonical/EXTPROC mode applies since canonical mode
will not return data until a line termination is received anyway
2. EXTPROC mode - the reader will always be woken by the input worker
3. Non-canonical modes
a. MIN == 0, TIME == 0
b. MIN == 0, TIME > 0
c. MIN > 0, TIME > 0
minimum_to_wake is always 1 in these modes so the reader will always
be woken by the input worker
d. MIN > 0, TIME == 0
although the reader will not be woken by the input worker unless the
minimum data is received, the reader would not otherwise have
returned the received data
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
My colleague ran into a program stall on a x86_64 server, where
n_tty_read() was waiting for data even if there was data in the buffer
in the pty. kernel stack for the stuck process looks like below.
#0 [ffff88303d107b58] __schedule at ffffffff815c4b20
#1 [ffff88303d107bd0] schedule at ffffffff815c513e
#2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
#3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
#4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
#5 [ffff88303d107dd0] tty_read at ffffffff81368013
#6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
#7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
#8 [ffff88303d107f00] sys_read at ffffffff811a4306
#9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7
There seems to be two problems causing this issue.
First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
updates ldata->commit_head using smp_store_release() and then checks
the wait queue using waitqueue_active(). However, since there is no
memory barrier, __receive_buf() could return without calling
wake_up_interactive_poll(), and at the same time, n_tty_read() could
start to wait in wait_woken() as in the following chart.
__receive_buf() n_tty_read()
------------------------------------------------------------------------
if (waitqueue_active(&tty->read_wait))
/* Memory operations issued after the
RELEASE may be completed before the
RELEASE operation has completed */
add_wait_queue(&tty->read_wait, &wait);
...
if (!input_available_p(tty, 0)) {
smp_store_release(&ldata->commit_head,
ldata->read_head);
...
timeout = wait_woken(&wait,
TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------
The second problem is that n_tty_read() also lacks a memory barrier
call and could also cause __receive_buf() to return without calling
wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken()
as in the chart below.
__receive_buf() n_tty_read()
------------------------------------------------------------------------
spin_lock_irqsave(&q->lock, flags);
/* from add_wait_queue() */
...
if (!input_available_p(tty, 0)) {
/* Memory operations issued after the
RELEASE may be completed before the
RELEASE operation has completed */
smp_store_release(&ldata->commit_head,
ldata->read_head);
if (waitqueue_active(&tty->read_wait))
__add_wait_queue(q, wait);
spin_unlock_irqrestore(&q->lock,flags);
/* from add_wait_queue() */
...
timeout = wait_woken(&wait,
TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------
There are also other places in drivers/tty/n_tty.c which have similar
calls to waitqueue_active(), so instead of adding many memory barrier
calls, this patch simply removes the call to waitqueue_active(),
leaving just wake_up*() behind.
This fixes both problems because, even though the memory access before
or after the spinlocks in both wake_up*() and add_wait_queue() can
sneak into the critical section, it cannot go past it and the critical
section assures that they will be serialized (please see "INTER-CPU
ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
better explanation). Moreover, the resulting code is much simpler.
Latency measurement using a ping-pong test over a pty doesn't show any
visible performance drop.
Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>