Commit Graph

197 Commits

Author SHA1 Message Date
Linus Torvalds
d75ae5bdf2 Merge tag 'printk-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:

 - Help userspace log daemons to catch up with a flood of messages. They
   will get woken after each message even if the console is far behind
   and handled by another process.

 - Flush printk safe buffers safely even when panic() happens in the
   normal context.

 - Fix possible va_list reuse when race happened in printk_safe().

 - Remove %pCr printf format to prevent sleeping in the atomic context.

 - Misc vsprintf code cleanup.

* tag 'printk-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
  printk: drop in_nmi check from printk_safe_flush_on_panic()
  lib/vsprintf: Remove atomic-unsafe support for %pCr
  serial: sh-sci: Stop using printk format %pCr
  thermal: bcm2835: Stop using printk format %pCr
  clk: renesas: cpg-mssr: Stop using printk format %pCr
  printk: fix possible reuse of va_list variable
  printk: wake up klogd in vprintk_emit
  vsprintf: Tweak pF/pf comment
  lib/vsprintf: Mark expected switch fall-through
  lib/vsprintf: Replace space with '_' before crng is ready
  lib/vsprintf: Deduplicate pointer_string()
  lib/vsprintf: Move pointer_string() upper
  lib/vsprintf: Make flag_spec global
  lib/vsprintf: Make strspec global
  lib/vsprintf: Make dec_spec global
  lib/test_printf: Mark big constant with UL
2018-06-06 16:04:55 -07:00
Sergey Senozhatsky
554755be08 printk: drop in_nmi check from printk_safe_flush_on_panic()
Drop the in_nmi() check from printk_safe_flush_on_panic()
and attempt to re-init (IOW unlock) locked logbuf spinlock
from panic CPU regardless of its context.

Otherwise, theoretically, we can deadlock on logbuf trying to flush
per-CPU buffers:

  a) Panic CPU is running in non-NMI context
  b) Panic CPU sends out shutdown IPI via reboot vector
  c) Panic CPU fails to stop all remote CPUs
  d) Panic CPU sends out shutdown IPI via NMI vector
     One of the CPUs that we bring down via NMI vector can hold
     logbuf spin lock (theoretically).

Link: http://lkml.kernel.org/r/20180530070350.10131-1-sergey.senozhatsky@gmail.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-06-05 13:38:15 +02:00
Tetsuo Handa
988a35f8da printk: fix possible reuse of va_list variable
I noticed that there is a possibility that printk_safe_log_store() causes
kernel oops because "args" parameter is passed to vsnprintf() again when
atomic_cmpxchg() detected that we raced. Fix this by using va_copy().

Link: http://lkml.kernel.org/r/201805112002.GIF21216.OFVHFOMLJtQFSO@I-love.SAKURA.ne.jp
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: dvyukov@google.com
Cc: syzkaller@googlegroups.com
Cc: fengguang.wu@intel.com
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 42a0bb3f71 ("printk/nmi: generic solution for safe printk in NMI")
Cc: 4.7+ <stable@vger.kernel.org> # v4.7+
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-05-16 14:00:46 +02:00
Sergey Senozhatsky
43a17111c2 printk: wake up klogd in vprintk_emit
We wake up klogd very late - only when current console_sem owner
is done pushing pending kernel messages to the serial/net consoles.
In some cases this results in lost syslog messages, because kernel
log buffer is a circular buffer and if we don't wakeup syslog long
enough there are chances that logbuf simply will wrap around.

The patch moves the klogd wake up call to vprintk_emit(), which is
the only legit way for a kernel message to appear in the logbuf,
right after the attempt to handle consoles. As a result, klogd
will get waken either after flushing the new message to consoles
or immediately when consoles are still busy with older messages.

Link: http://lkml.kernel.org/r/20180419014250.5692-1-sergey.senozhatsky@gmail.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-04-25 13:26:02 +02:00
Linus Torvalds
2a56bb596b Merge tag 'trace-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
 "New features:

   - Tom Zanussi's extended histogram work.

     This adds the synthetic events to have histograms from multiple
     event data Adds triggers "onmatch" and "onmax" to call the
     synthetic events Several updates to the histogram code from this

   - Allow way to nest ring buffer calls in the same context

   - Allow absolute time stamps in ring buffer

   - Rewrite of filter code parsing based on Al Viro's suggestions

   - Setting of trace_clock to global if TSC is unstable (on boot)

   - Better OOM handling when allocating large ring buffers

   - Added initcall tracepoints (consolidated initcall_debug code with
     them)

  And other various fixes and clean ups"

* tag 'trace-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (68 commits)
  init: Have initcall_debug still work without CONFIG_TRACEPOINTS
  init, tracing: Have printk come through the trace events for initcall_debug
  init, tracing: instrument security and console initcall trace events
  init, tracing: Add initcall trace events
  tracing: Add rcu dereference annotation for test func that touches filter->prog
  tracing: Add rcu dereference annotation for filter->prog
  tracing: Fixup logic inversion on setting trace_global_clock defaults
  tracing: Hide global trace clock from lockdep
  ring-buffer: Add set/clear_current_oom_origin() during allocations
  ring-buffer: Check if memory is available before allocation
  lockdep: Add print_irqtrace_events() to __warn
  vsprintf: Do not preprocess non-dereferenced pointers for bprintf (%px and %pK)
  tracing: Uninitialized variable in create_tracing_map_fields()
  tracing: Make sure variable string fields are NULL-terminated
  tracing: Add action comparisons when testing matching hist triggers
  tracing: Don't add flag strings when displaying variable references
  tracing: Fix display of hist trigger expressions containing timestamps
  ftrace: Drop a VLA in module_exists()
  tracing: Mention trace_clock=global when warning about unstable clocks
  tracing: Default to using trace_global_clock if sched_clock is unstable
  ...
2018-04-10 11:27:30 -07:00
Abderrahmane Benbachir
58eacfffc4 init, tracing: instrument security and console initcall trace events
Trace events have been added around the initcall functions defined in
init/main.c. But console and security have their own initcalls. This adds
the trace events associated for those initcall functions.

Link: http://lkml.kernel.org/r/1521765208.19745.2.camel@polymtl.ca

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Abderrahmane Benbachir <abderrahmane.benbachir@polymtl.ca>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-04-06 08:56:55 -04:00
Linus Torvalds
357aa6aefe Merge branch 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:

 - Add info about loaded kdump kernel into the dump stack header

 - Move dump-stack related code from printk.c to lib/dump_stack.c

 - Write message about suspending consoles in KERN_INFO log level

* 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
  printk: change message to pr_info
  printk: move dump stack related code to lib/dump_stack.c
  print kdump kernel loaded status in stack dump
2018-04-05 19:53:30 -07:00
Tomeu Vizoso
47319f7186 printk: change message to pr_info
To allow userspace to prevent this message from appearing in the
console by changing the log priority.

This matches other informative messages that the power subsystem emits
when the system changes power states.

Link: http://lkml.kernel.org/r/20180322135833.16602-1-tomeu.vizoso@collabora.com
To: linux-kernel@vger.kernel.org
Cc: kernel@collabora.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-03-23 15:41:59 +01:00
Dave Young
e36df28f53 printk: move dump stack related code to lib/dump_stack.c
dump_stack related stuff should belong to lib/dump_stack.c thus move them
there. Also conditionally compile lib/dump_stack.c since dump_stack code
does not make sense if printk is disabled.

Link: http://lkml.kernel.org/r/20180213072834.GA24784@dhcp-128-65.nay.redhat.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Dave Young <dyoung@redhat.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-03-15 13:25:36 +01:00
Linus Torvalds
7bec4a9646 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk fix from Petr Mladek:
 "Make sure that we wake up userspace loggers. This fixes a race
  introduced by the console waiter logic during this merge window"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
  printk: Wake klogd when passing console_lock owner
2018-03-01 10:06:39 -08:00
Petr Mladek
c14376de3a printk: Wake klogd when passing console_lock owner
wake_klogd is a local variable in console_unlock(). The information
is lost when the console_lock owner using the busy wait added by
the commit dbdda842fe ("printk: Add console owner and waiter
logic to load balance console writes"). The following race is
possible:

CPU0				CPU1
console_unlock()

  for (;;)
     /* calling console for last message */

				printk()
				  log_store()
				    log_next_seq++;

     /* see new message */
     if (seen_seq != log_next_seq) {
	wake_klogd = true;
	seen_seq = log_next_seq;
     }

     console_lock_spinning_enable();

				  if (console_trylock_spinning())
				     /* spinning */

     if (console_lock_spinning_disable_and_check()) {
	printk_safe_exit_irqrestore(flags);
	return;

				  console_unlock()
				    if (seen_seq != log_next_seq) {
				    /* already seen */
				    /* nothing to do */

Result: Nobody would wakeup klogd.

One solution would be to make a global variable from wake_klogd.
But then we would need to manipulate it under a lock or so.

This patch wakes klogd also when console_lock is passed to the
spinning waiter. It looks like the right way to go. Also userspace
should have a chance to see and store any "flood" of messages.

Note that the very late klogd wake up was a historic solution.
It made sense on single CPU systems or when sys_syslog() operations
were synchronized using the big kernel lock like in v2.1.113.
But it is questionable these days.

Fixes: dbdda842fe ("printk: Add console owner and waiter logic to load balance console writes")
Link: http://lkml.kernel.org/r/20180226155734.dzwg3aovqnwtvkoy@pathway.suse.cz
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Suggested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-02-27 10:25:50 +01:00
Linus Torvalds
a9a08845e9 vfs: do bulk POLL* -> EPOLL* replacement
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
        L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
        for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

with de-mangling cleanups yet to come.

NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do.  But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.

The next patch from Al will sort out the final differences, and we
should be all done.

Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-11 14:34:03 -08:00
Dave Young
097114aa6e print kdump kernel loaded status in stack dump
It is useful to print kdump kernel loaded status in dump_stack()
especially when panic happens so that we can differenciate
kdump kernel early hang and a normal panic in a bug report.

Link: http://lkml.kernel.org/r/20180127041129.GA29016@dhcp-128-65.nay.redhat.com
To: Steven Rostedt <rostedt@goodmis.org>
To: Andi Kleen <ak@linux.intel.com>
To: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org
Cc: kexec@lists.infradead.org
Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-02-08 14:47:25 +01:00
Linus Torvalds
ab486bc9a5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:

 - Add a console_msg_format command line option:

     The value "default" keeps the old "[time stamp] text\n" format. The
     value "syslog" allows to see the syslog-like "<log
     level>[timestamp] text" format.

     This feature was requested by people doing regression tests, for
     example, 0day robot. They want to have both filtered and full logs
     at hands.

 - Reduce the risk of softlockup:

     Pass the console owner in a busy loop.

     This is a new approach to the old problem. It was first proposed by
     Steven Rostedt on Kernel Summit 2017. It marks a context in which
     the console_lock owner calls console drivers and could not sleep.
     On the other side, printk() callers could detect this state and use
     a busy wait instead of a simple console_trylock(). Finally, the
     console_lock owner checks if there is a busy waiter at the end of
     the special context and eventually passes the console_lock to the
     waiter.

     The hand-off works surprisingly well and helps in many situations.
     Well, there is still a possibility of the softlockup, for example,
     when the flood of messages stops and the last owner still has too
     much to flush.

     There is increasing number of people having problems with
     printk-related softlockups. We might eventually need to get better
     solution. Anyway, this looks like a good start and promising
     direction.

 - Do not allow to schedule in console_unlock() called from printk():

     This reverts an older controversial commit. The reschedule helped
     to avoid softlockups. But it also slowed down the console output.
     This patch is obsoleted by the new console waiter logic described
     above. In fact, the reschedule made the hand-off less effective.

 - Deprecate "%pf" and "%pF" format specifier:

     It was needed on ia64, ppc64 and parisc64 to dereference function
     descriptors and show the real function address. It is done
     transparently by "%ps" and "pS" format specifier now.

     Sergey Senozhatsky found that all the function descriptors were in
     a special elf section and could be easily detected.

 - Remove printk_symbol() API:

     It has been obsoleted by "%pS" format specifier, and this change
     helped to remove few continuous lines and a less intuitive old API.

 - Remove redundant memsets:

     Sergey removed unnecessary memset when processing printk.devkmsg
     command line option.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (27 commits)
  printk: drop redundant devkmsg_log_str memsets
  printk: Never set console_may_schedule in console_trylock()
  printk: Hide console waiter logic into helpers
  printk: Add console owner and waiter logic to load balance console writes
  kallsyms: remove print_symbol() function
  checkpatch: add pF/pf deprecation warning
  symbol lookup: introduce dereference_symbol_descriptor()
  parisc64: Add .opd based function descriptor dereference
  powerpc64: Add .opd based function descriptor dereference
  ia64: Add .opd based function descriptor dereference
  sections: split dereference_function_descriptor()
  openrisc: Fix conflicting types for _exext and _stext
  lib: do not use print_symbol()
  irq debug: do not use print_symbol()
  sysfs: do not use print_symbol()
  drivers: do not use print_symbol()
  x86: do not use print_symbol()
  unicore32: do not use print_symbol()
  sh: do not use print_symbol()
  mn10300: do not use print_symbol()
  ...
2018-02-01 13:36:15 -08:00
Linus Torvalds
168fe32a07 Merge branch 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull poll annotations from Al Viro:
 "This introduces a __bitwise type for POLL### bitmap, and propagates
  the annotations through the tree. Most of that stuff is as simple as
  'make ->poll() instances return __poll_t and do the same to local
  variables used to hold the future return value'.

  Some of the obvious brainos found in process are fixed (e.g. POLLIN
  misspelled as POLL_IN). At that point the amount of sparse warnings is
  low and most of them are for genuine bugs - e.g. ->poll() instance
  deciding to return -EINVAL instead of a bitmap. I hadn't touched those
  in this series - it's large enough as it is.

  Another problem it has caught was eventpoll() ABI mess; select.c and
  eventpoll.c assumed that corresponding POLL### and EPOLL### were
  equal. That's true for some, but not all of them - EPOLL### are
  arch-independent, but POLL### are not.

  The last commit in this series separates userland POLL### values from
  the (now arch-independent) kernel-side ones, converting between them
  in the few places where they are copied to/from userland. AFAICS, this
  is the least disruptive fix preserving poll(2) ABI and making epoll()
  work on all architectures.

  As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
  it will trigger only on what would've triggered EPOLLWRBAND on other
  architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
  at all on sparc. With this patch they should work consistently on all
  architectures"

* 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
  make kernel-side POLL... arch-independent
  eventpoll: no need to mask the result of epi_item_poll() again
  eventpoll: constify struct epoll_event pointers
  debugging printk in sg_poll() uses %x to print POLL... bitmap
  annotate poll(2) guts
  9p: untangle ->poll() mess
  ->si_band gets POLL... bitmap stored into a user-visible long field
  ring_buffer_poll_wait() return value used as return value of ->poll()
  the rest of drivers/*: annotate ->poll() instances
  media: annotate ->poll() instances
  fs: annotate ->poll() instances
  ipc, kernel, mm: annotate ->poll() instances
  net: annotate ->poll() instances
  apparmor: annotate ->poll() instances
  tomoyo: annotate ->poll() instances
  sound: annotate ->poll() instances
  acpi: annotate ->poll() instances
  crypto: annotate ->poll() instances
  block: annotate ->poll() instances
  x86: annotate ->poll() instances
  ...
2018-01-30 17:58:07 -08:00
Petr Mladek
bb4f552a59 Merge branch 'for-4.16-console-waiter-logic' into for-4.16 2018-01-22 10:40:59 +01:00
Sergey Senozhatsky
6fd78a1a99 printk: drop redundant devkmsg_log_str memsets
We copy in null terminated strings "on" and "off", no
need to zero out devkmsg_log_str in control_devkmsg().

Link: http://lkml.kernel.org/r/20180119043901.1728-1-sergey.senozhatsky@gmail.com
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-22 10:33:04 +01:00
Sergey Senozhatsky
fd5f7cde1b printk: Never set console_may_schedule in console_trylock()
This patch, basically, reverts commit 6b97a20d3a ("printk:
set may_schedule for some of console_trylock() callers").
That commit was a mistake, it introduced a big dependency
on the scheduler, by enabling preemption under console_sem
in printk()->console_unlock() path, which is rather too
critical. The patch did not significantly reduce the
possibilities of printk() lockups, but made it possible to
stall printk(), as has been reported by Tetsuo Handa [1].

Another issues is that preemption under console_sem also
messes up with Steven Rostedt's hand off scheme, by making
it possible to sleep with console_sem both in console_unlock()
and in vprintk_emit(), after acquiring the console_sem
ownership (anywhere between printk_safe_exit_irqrestore() in
console_trylock_spinning() and printk_safe_enter_irqsave()
in console_unlock()). This makes hand off less likely and,
at the same time, may result in a significant amount of
pending logbuf messages. Preempted console_sem owner makes
it impossible for other CPUs to emit logbuf messages, but
does not make it impossible for other CPUs to append new
messages to the logbuf.

Reinstate the old behavior and make printk() non-preemptible.
Should any printk() lockup reports arrive they must be handled
in a different way.

[1] http://lkml.kernel.org/r/201603022101.CAH73907.OVOOMFHFFtQJSL%20()%20I-love%20!%20SAKURA%20!%20ne%20!%20jp
Fixes: 6b97a20d3a ("printk: set may_schedule for some of console_trylock() callers")
Link: http://lkml.kernel.org/r/20180116044716.GE6607@jagdpanzerIV
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: akpm@linux-foundation.org
Cc: linux-mm@kvack.org
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-16 17:21:28 +01:00
Petr Mladek
c162d5b433 printk: Hide console waiter logic into helpers
The commit ("printk: Add console owner and waiter logic to load balance
console writes") made vprintk_emit() and console_unlock() even more
complicated.

This patch extracts the new code into 3 helper functions. They should
help to keep it rather self-contained. It will be easier to use and
maintain.

This patch just shuffles the existing code. It does not change
the functionality.

Link: http://lkml.kernel.org/r/20180112160837.GD24497@linux.suse
Cc: akpm@linux-foundation.org
Cc: linux-mm@kvack.org
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: rostedt@home.goodmis.org
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-16 17:21:16 +01:00
Steven Rostedt (VMware)
dbdda842fe printk: Add console owner and waiter logic to load balance console writes
This patch implements what I discussed in Kernel Summit. I added
lockdep annotation (hopefully correctly), and it hasn't had any splats
(since I fixed some bugs in the first iterations). It did catch
problems when I had the owner covering too much. But now that the owner
is only set when actively calling the consoles, lockdep has stayed
quiet.

Here's the design again:

I added a "console_owner" which is set to a task that is actively
writing to the consoles. It is *not* the same as the owner of the
console_lock. It is only set when doing the calls to the console
functions. It is protected by a console_owner_lock which is a raw spin
lock.

There is a console_waiter. This is set when there is an active console
owner that is not current, and waiter is not set. This too is protected
by console_owner_lock.

In printk() when it tries to write to the consoles, we have:

	if (console_trylock())
		console_unlock();

Now I added an else, which will check if there is an active owner, and
no current waiter. If that is the case, then console_waiter is set, and
the task goes into a spin until it is no longer set.

When the active console owner finishes writing the current message to
the consoles, it grabs the console_owner_lock and sees if there is a
waiter, and clears console_owner.

If there is a waiter, then it breaks out of the loop, clears the waiter
flag (because that will release the waiter from its spin), and exits.
Note, it does *not* release the console semaphore. Because it is a
semaphore, there is no owner. Another task may release it. This means
that the waiter is guaranteed to be the new console owner! Which it
becomes.

Then the waiter calls console_unlock() and continues to write to the
consoles.

If another task comes along and does a printk() it too can become the
new waiter, and we wash rinse and repeat!

By Petr Mladek about possible new deadlocks:

The thing is that we move console_sem only to printk() call
that normally calls console_unlock() as well. It means that
the transferred owner should not bring new type of dependencies.
As Steven said somewhere: "If there is a deadlock, it was
there even before."

We could look at it from this side. The possible deadlock would
look like:

CPU0                            CPU1

console_unlock()

  console_owner = current;

				spin_lockA()
				  printk()
				    spin = true;
				    while (...)

    call_console_drivers()
      spin_lockA()

This would be a deadlock. CPU0 would wait for the lock A.
While CPU1 would own the lockA and would wait for CPU0
to finish calling the console drivers and pass the console_sem
owner.

But if the above is true than the following scenario was
already possible before:

CPU0

spin_lockA()
  printk()
    console_unlock()
      call_console_drivers()
	spin_lockA()

By other words, this deadlock was there even before. Such
deadlocks are prevented by using printk_deferred() in
the sections guarded by the lock A.

By Steven Rostedt:

To demonstrate the issue, this module has been shown to lock up a
system with 4 CPUs and a slow console (like a serial console). It is
also able to lock up a 8 CPU system with only a fast (VGA) console, by
passing in "loops=100". The changes in this commit prevent this module
from locking up the system.

 #include <linux/module.h>
 #include <linux/delay.h>
 #include <linux/sched.h>
 #include <linux/mutex.h>
 #include <linux/workqueue.h>
 #include <linux/hrtimer.h>

 static bool stop_testing;
 static unsigned int loops = 1;

 static void preempt_printk_workfn(struct work_struct *work)
 {
 	int i;

 	while (!READ_ONCE(stop_testing)) {
 		for (i = 0; i < loops && !READ_ONCE(stop_testing); i++) {
 			preempt_disable();
 			pr_emerg("%5d%-75s\n", smp_processor_id(),
 				 " XXX NOPREEMPT");
 			preempt_enable();
 		}
 		msleep(1);
 	}
 }

 static struct work_struct __percpu *works;

 static void finish(void)
 {
 	int cpu;

 	WRITE_ONCE(stop_testing, true);
 	for_each_online_cpu(cpu)
 		flush_work(per_cpu_ptr(works, cpu));
 	free_percpu(works);
 }

 static int __init test_init(void)
 {
 	int cpu;

 	works = alloc_percpu(struct work_struct);
 	if (!works)
 		return -ENOMEM;

 	/*
 	 * This is just a test module. This will break if you
 	 * do any CPU hot plugging between loading and
 	 * unloading the module.
 	 */

 	for_each_online_cpu(cpu) {
 		struct work_struct *work = per_cpu_ptr(works, cpu);

 		INIT_WORK(work, &preempt_printk_workfn);
 		schedule_work_on(cpu, work);
 	}

 	return 0;
 }

 static void __exit test_exit(void)
 {
 	finish();
 }

 module_param(loops, uint, 0);
 module_init(test_init);
 module_exit(test_exit);
 MODULE_LICENSE("GPL");

Link: http://lkml.kernel.org/r/20180110132418.7080-2-pmladek@suse.com
Cc: akpm@linux-foundation.org
Cc: linux-mm@kvack.org
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
[pmladek@suse.com: Commit message about possible deadlocks]
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-16 17:21:09 +01:00
Sergey Senozhatsky
cca10d58d2 printk: add console_msg_format command line option
0day and kernelCI automatically parse kernel log - basically some sort
of grepping using the pre-defined text patterns - in order to detect
and report regressions/errors. There are several sources they get the
kernel logs from:

a) dmesg or /proc/ksmg

   This is the preferred way. Because `dmesg --raw' (see later Note)
   and /proc/kmsg output contains facility and log level, which greatly
   simplifies grepping for EMERG/ALERT/CRIT/ERR messages.

b) serial consoles

   This option is harder to maintain, because serial console messages
   don't contain facility and log level.

This patch introduces a `console_msg_format=' command line option,
to switch between different message formatting on serial consoles.
For the time being we have just two options - default and syslog.
The "default" option just keeps the existing format. While the
"syslog" option makes serial console messages to appear in syslog
format [syslog() syscall], matching the `dmesg -S --raw' and
`cat /proc/kmsg' output formats:

- facility and log level
- time stamp (depends on printk_time/PRINTK_TIME)
- message

	<%u>[time stamp] text\n

NOTE: while Kevin and Fengguang talk about "dmesg --raw", it's actually
"dmesg -S --raw" that always prints messages in syslog format [per
Petr Mladek]. Running "dmesg --raw" may produce output in non-syslog
format sometimes. console_msg_format=syslog enables syslog format,
thus in documentation we mention "dmesg -S --raw", not "dmesg --raw".

Per Kevin Hilman:

: Right now we can get this info from a "dmesg --raw" after bootup,
: but it would be really nice in certain automation frameworks to
: have a kernel command-line option to enable printing of loglevels
: in default boot log.
:
: This is especially useful when ingesting kernel logs into advanced
: search/analytics frameworks (I'm playing with and ELK stack: Elastic
: Search, Logstash, Kibana).
:
: The other important reason for having this on the command line is that
: for testing linux-next (and other bleeding edge developer branches),
: it's common that we never make it to userspace, so can't even run
: "dmesg --raw" (or equivalent.)  So we really want this on the primary
: boot (serial) console.

Per Fengguang Wu, 0day scripts should quickly benefit from that
feature, because they will be able to switch to a more reliable
parsing, based on messages' facility and log levels [1]:

`#{grep} -a -E -e '^<[0123]>' -e '^kern  :(err   |crit  |alert |emerg )'

instead of doing text pattern matching

`#{grep} -a -F -f /lkp/printk-error-messages #{kmsg_file} |
      grep -a -v -E -f #{LKP_SRC}/etc/oops-pattern |
      grep -a -v -F -f #{LKP_SRC}/etc/kmsg-blacklist`

[1] https://github.com/fengguang/lkp-tests/blob/master/lib/dmesg.rb

Link: http://lkml.kernel.org/r/20171221054149.4398-1-sergey.senozhatsky@gmail.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-04 14:51:27 +01:00
Linus Torvalds
b7ad7ef742 remove task and stack pointer printout from oops dump
Geert Uytterhoeven reported a NFS oops, and pointed out that some of the
numbers were hashed and useless.

We could just turn them from '%p' into '%px', but those numbers are
really just legacy, and useless even when not hashed.

So just remove them entirely.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-05 08:23:20 -08:00
Al Viro
9dd957485d ipc, kernel, mm: annotate ->poll() instances
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-11-27 16:20:05 -05:00
Linus Torvalds
11ca75d2d6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:

 - print the warning about dropped messages on consoles on a separate
   line.   It makes it more legible.

 - one typo fix and small code clean up.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
  added new line symbol after warning about dropped messages
  printk: fix typo in printk_safe.c
  printk: simplify no_printk()
2017-11-21 05:28:13 -10:00
Linus Torvalds
2dcd9c71c1 Merge tag 'trace-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from

 - allow module init functions to be traced

 - clean up some unused or not used by config events (saves space)

 - clean up of trace histogram code

 - add support for preempt and interrupt enabled/disable events

 - other various clean ups

* tag 'trace-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (30 commits)
  tracing, thermal: Hide cpu cooling trace events when not in use
  tracing, thermal: Hide devfreq trace events when not in use
  ftrace: Kill FTRACE_OPS_FL_PER_CPU
  perf/ftrace: Small cleanup
  perf/ftrace: Fix function trace events
  perf/ftrace: Revert ("perf/ftrace: Fix double traces of perf on ftrace:function")
  tracing, dma-buf: Remove unused trace event dma_fence_annotate_wait_on
  tracing, memcg, vmscan: Hide trace events when not in use
  tracing/xen: Hide events that are not used when X86_PAE is not defined
  tracing: mark trace_test_buffer as __maybe_unused
  printk: Remove superfluous memory barriers from printk_safe
  ftrace: Clear hashes of stale ips of init memory
  tracing: Add support for preempt and irq enable/disable events
  tracing: Prepare to add preempt and irq trace events
  ftrace/kallsyms: Have /proc/kallsyms show saved mod init functions
  ftrace: Add freeing algorithm to free ftrace_mod_maps
  ftrace: Save module init functions kallsyms symbols for tracing
  ftrace: Allow module init functions to be traced
  ftrace: Add a ftrace_free_mem() function for modules to use
  tracing: Reimplement log2
  ...
2017-11-17 14:58:01 -08:00