Commit Graph

188 Commits

Author SHA1 Message Date
Kay Sievers
c313af145b printk() - isolate KERN_CONT users from ordinary complete lines
Arrange the continuation printk() buffering to be fully separated from the
ordinary full line users.

Limit the exposure to races and wrong printk() line merges to users of
continuation only. Ordinary full line users racing against continuation
users will no longer affect each other.

Multiple continuation users from different threads, racing against each
other will not wrongly be merged into a single line, but printed as
separate lines.

Test output of a kernel module which starts two separate threads which
race against each other, one of them printing a single full terminated
line:
  printk("(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)\n");

The other one printing the line, every character separate in a
continuation loop:
  printk("(C");
  for (i = 0; i < 58; i++)
          printk(KERN_CONT "C");
  printk(KERN_CONT "C)\n");

Behavior of single and non-thread-aware printk() buffer:
  # modprobe printk-race
  printk test init
  (CC(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  C(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  CC(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  C(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  CC(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  C(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  C(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  CC(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  C(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  C(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC)
  (CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC)

New behavior with separate and thread-aware continuation buffer:
  # modprobe printk-race
  printk test init
  (AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  (AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  (AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  (CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC)
  (AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  (AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  (AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  (AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA)
  (CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC)
  (CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC)

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Ingo Molnar  <mingo@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-14 12:36:45 -07:00
Kay Sievers
3ce9a7c0ac printk() - restore prefix/timestamp printing for multi-newline strings
Calls like:
  printk("\n *** DEADLOCK ***\n\n");
will print 3 properly indented, separated, syslog + timestamp prefixed lines in
the log output.

Reported-By: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-14 08:42:22 -07:00
Randy Dunlap
1fce677971 printk: add stub for prepend_timestamp()
Add a stub for prepend_timestamp() when CONFIG_PRINTK is not
enabled.  Fixes this build error:

kernel/printk.c:1770:3: error: implicit declaration of function 'prepend_timestamp'

Cc: Kay Sievers <kay@vrfy.org>
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-11 16:44:54 -07:00
Stephen Warren
f8450fca6e printk: correctly align __log_buf
__log_buf must be aligned, because a 64-bit value is written directly
to it as part of struct log. Alignment of the log entries is typically
handled by log_store(), but this only triggers for subsequent entries,
not the very first (or wrapped) entries.

Cc: Kay Sievers <kay@vrfy.org>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-10 15:36:59 -07:00
Kay Sievers
649e6ee33f printk() - restore timestamp printing at console output
The output of the timestamps got lost with the conversion of the
kmsg buffer to records; restore the old behavior.

Document, that CONFIG_PRINTK_TIME now only controls the output of
the timestamps in the syslog() system call and on the console, and
not the recording of the timestamps.

Cc: Joe Perches <joe@perches.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-09 20:35:06 -07:00
Kay Sievers
5c5d5ca51a printk() - do not merge continuation lines of different threads
This prevents the merging of printk() continuation lines of different
threads, in the case they race against each other.

It should properly isolate "atomic" single-line printk() users from
continuation users, to make sure the single-line users will never be
merged with the racy continuation ones.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-09 20:29:59 -07:00
Kay Sievers
7f3a781d6f printk - fix compilation for CONFIG_PRINTK=n
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-09 15:51:09 -07:00
Kay Sievers
5fc3249068 kmsg: use do_div() to divide 64bit integer
On Tue, May 8, 2012 at 10:02 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> kernel/built-in.o: In function `devkmsg_read':
> printk.c:(.text+0x27e8): undefined reference to `__udivdi3'
> Most probably the "msg->ts_nsec / 1000" since
> ts_nsec is a u64 and this is a 32 bit build ...

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-08 08:55:26 -07:00
Kay Sievers
e11fea92e1 kmsg: export printk records to the /dev/kmsg interface
Support for multiple concurrent readers of /dev/kmsg, with read(),
seek(), poll() support. Output of message sequence numbers, to allow
userspace log consumers to reliably reconnect and reconstruct their
state at any given time. After open("/dev/kmsg"), read() always
returns *all* buffered records. If only future messages should be
read, SEEK_END can be used. In case records get overwritten while
/dev/kmsg is held open, or records get faster overwritten than they
are read, the next read() will return -EPIPE and the current reading
position gets updated to the next available record. The passed
sequence numbers allow the log consumer to calculate the amount of
lost messages.

  [root@mop ~]# cat /dev/kmsg
  5,0,0;Linux version 3.4.0-rc1+ (kay@mop) (gcc version 4.7.0 20120315 ...
  6,159,423091;ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
  7,160,424069;pci_root PNP0A03:00: host bridge window [io  0x0000-0x0cf7] (ignored)
   SUBSYSTEM=acpi
   DEVICE=+acpi:PNP0A03:00
  6,339,5140900;NET: Registered protocol family 10
  30,340,5690716;udevd[80]: starting version 181
  6,341,6081421;FDC 0 is a S82078B
  6,345,6154686;microcode: CPU0 sig=0x623, pf=0x0, revision=0x0
  7,346,6156968;sr 1:0:0:0: Attached scsi CD-ROM sr0
   SUBSYSTEM=scsi
   DEVICE=+scsi:1:0:0:0
  6,347,6289375;microcode: CPU1 sig=0x623, pf=0x0, revision=0x0

Cc: Karel Zak <kzak@redhat.com>
Tested-by: William Douglas <william.douglas@intel.com>
Signed-off-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-07 17:03:27 -07:00
Kay Sievers
7ff9554bb5 printk: convert byte-buffer to variable-length record buffer
- Record-based stream instead of the traditional byte stream
  buffer. All records carry a 64 bit timestamp, the syslog facility
  and priority in the record header.

- Records consume almost the same amount, sometimes less memory than
  the traditional byte stream buffer (if printk_time is enabled). The record
  header is 16 bytes long, plus some padding bytes at the end if needed.
  The byte-stream buffer needed 3 chars for the syslog prefix, 15 char for
  the timestamp and a newline.

- Buffer management is based on message sequence numbers. When records
  need to be discarded, the reading heads move on to the next full
  record. Unlike the byte-stream buffer, no old logged lines get
  truncated or partly overwritten by new ones. Sequence numbers also
  allow consumers of the log stream to get notified if any message in
  the stream they are about to read gets discarded during the time
  of reading.

- Better buffered IO support for KERN_CONT continuation lines, when printk()
  is called multiple times for a single line. The use of KERN_CONT is now
  mandatory to use continuation; a few places in the kernel need trivial fixes
  here. The buffering could possibly be extended to per-cpu variables to allow
  better thread-safety for multiple printk() invocations for a single line.

- Full-featured syslog facility value support. Different facilities
  can tag their messages. All userspace-injected messages enforce a
  facility value > 0 now, to be able to reliably distinguish them from
  the kernel-generated messages. Independent subsystems like a
  baseband processor running its own firmware, or a kernel-related
  userspace process can use their own unique facility values. Multiple
  independent log streams can co-exist that way in the same
  buffer. All share the same global sequence number counter to ensure
  proper ordering (and interleaving) and to allow the consumers of the
  log to reliably correlate the events from different facilities.

Tested-by: William Douglas <william.douglas@intel.com>
Signed-off-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-05-07 16:53:02 -07:00
Linus Torvalds
2ba68940c8 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes for v3.4 from Ingo Molnar

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  printk: Make it compile with !CONFIG_PRINTK
  sched/x86: Fix overflow in cyc2ns_offset
  sched: Fix nohz load accounting -- again!
  sched: Update yield() docs
  printk/sched: Introduce special printk_sched() for those awkward moments
  sched/nohz: Correctly initialize 'next_balance' in 'nohz' idle balancer
  sched: Cleanup cpu_active madness
  sched: Fix load-balance wreckage
  sched: Clean up parameter passing of proc_sched_autogroup_set_nice()
  sched: Ditch per cgroup task lists for load-balancing
  sched: Rename load-balancing fields
  sched: Move load-balancing arguments into helper struct
  sched/rt: Do not submit new work when PI-blocked
  sched/rt: Prevent idle task boosting
  sched/wait: Add __wake_up_all_locked() API
  sched/rt: Document scheduler related skip-resched-check sites
  sched/rt: Use schedule_preempt_disabled()
  sched/rt: Add schedule_preempt_disabled()
  sched/rt: Do not throttle when PI boosting
  sched/rt: Keep period timer ticking when rt throttling is active
  ...
2012-03-20 10:31:44 -07:00
Peter Zijlstra
600e145882 printk: Make it compile with !CONFIG_PRINTK
Commit 3ccf3e8306 ("printk/sched: Introduce special
printk_sched() for those awkward moments") overlooked
an #ifdef, so move code around to respect these directives.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Link: http://lkml.kernel.org/r/1331811337.18960.179.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-15 13:38:54 +01:00
Ingo Molnar
47258cf3c4 Merge tag 'v3.3-rc7' into sched/core
Merge reason: merge back final fixes, prepare for the merge window.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-13 16:26:52 +01:00
Ingo Molnar
35239e23c6 Merge branch 'perf/urgent' into perf/core
Merge reason: We are going to queue up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12 20:44:11 +01:00
Peter Zijlstra
3ccf3e8306 printk/sched: Introduce special printk_sched() for those awkward moments
There's a few awkward printk()s inside of scheduler guts that people
prefer to keep but really are rather deadlock prone. Fudge around it
by storing the text in a per-cpu buffer and poll it using the existing
printk_tick() handler.

This will drop output when its more frequent than once a tick, however
only the affinity thing could possible go that fast and for that just
one should suffice to notify the admin he's done something silly..

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-wua3lmkt3dg8nfts66o6brne@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12 20:43:16 +01:00
Matthew Garrett
c22ab33290 kmsg_dump: don't run on non-error paths by default
Since commit 04c6862c05 ("kmsg_dump: add kmsg_dump() calls to the
reboot, halt, poweroff and emergency_restart paths"), kmsg_dump() gets
run on normal paths including poweroff and reboot.

This is less than ideal given pstore implementations that can only
represent single backtraces, since a reboot may overwrite a stored oops
before it's been picked up by userspace.  In addition, some pstore
backends may have low performance and provide a significant delay in
reboot as a result.

This patch adds a printk.always_kmsg_dump kernel parameter (which can also
be changed from userspace).  Without it, the code will only be run on
failure paths rather than on normal paths.  The option can be enabled in
environments where there's a desire to attempt to audit whether or not a
reboot was cleanly requested or not.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Marco Stornelli <marco.stornelli@gmail.com>
Cc: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05 15:49:42 -08:00
Johannes Berg
9510035849 printk/tracing: Add console output tracing
Add a printk.console trace point to record any printk
messages into the trace, regardless of the current
console loglevel. This can help correlate (existing)
printk debugging with other tracing.

Link: http://lkml.kernel.org/r/1322161388.5366.54.camel@jlt3.sipsolutions.net

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-02-13 13:46:05 -05:00
Rusty Russell
2329abfa34 module_param: make bool parameters really bool (core code)
module_param(bool) used to counter-intuitively take an int.  In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option.  For this version
it'll simply give a warning, but it'll break next kernel version.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:18 +10:30
Rusty Russell
29d4d6df10 printk: fix unnecessary module_param_name.
You don't need module_param_name if the name is the same!

Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:17 +10:30
Linus Torvalds
1483b38235 Merge branch 'core-printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'core-printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  printk, lockdep: Switch to tracked irq ops
  printk, lockdep: Remove superfluous preempt_disable()
  printk, lockdep: Disable lock debugging on zap_locks()
2012-01-06 08:01:59 -08:00
Ingo Molnar
45aa0663cc Merge branch 'memblock-kill-early_node_map' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/memblock 2011-12-20 12:14:26 +01:00
Peter Zijlstra
09dc3cf93f printk: avoid double lock acquire
Commit 4f2a8d3cf5 ("printk: Fix console_sem vs logbuf_lock unlock race")
introduced another silly bug where we would want to acquire an already
held lock.  Avoid this.

Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-09 07:50:28 -08:00
Tejun Heo
d4bbf7e775 Merge branch 'master' into x86/memblock
Conflicts & resolutions:

* arch/x86/xen/setup.c

	dc91c728fd "xen: allow extra memory to be in multiple regions"
	24aa07882b "memblock, x86: Replace memblock_x86_reserve/free..."

	conflicted on xen_add_extra_mem() updates.  The resolution is
	trivial as the latter just want to replace
	memblock_x86_reserve_range() with memblock_reserve().

* drivers/pci/intel-iommu.c

	166e9278a3 "x86/ia64: intel-iommu: move to drivers/iommu/"
	5dfe8660a3 "bootmem: Replace work_with_active_regions() with..."

	conflicted as the former moved the file under drivers/iommu/.
	Resolved by applying the chnages from the latter on the moved
	file.

* mm/Kconfig

	6661672053 "memblock: add NO_BOOTMEM config symbol"
	c378ddd53f "memblock, x86: Make ARCH_DISCARD_MEMBLOCK a config option"

	conflicted trivially.  Both added config options.  Just
	letting both add their own options resolves the conflict.

* mm/memblock.c

	d1f0ece6cd "mm/memblock.c: small function definition fixes"
	ed7b56a799 "memblock: Remove memblock_memory_can_coalesce()"

	confliected.  The former updates function removed by the
	latter.  Resolution is trivial.

Signed-off-by: Tejun Heo <tj@kernel.org>
2011-11-28 09:46:22 -08:00
Peter Zijlstra
1a9a8aefa8 printk, lockdep: Switch to tracked irq ops
Switch to local_irq_ ops so that the irq state is properly tracked
(raw_local_irq_* isn't tracked by lockdep, causing confusion).

Possible now that commit dd4e5d3ac4 ("lockdep: Fix
trace_[soft,hard]irqs_[on,off]() recursion") cured the reason we
needed the raw_ ops.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-14 13:35:20 +01:00
Peter Zijlstra
47ff5c95db printk, lockdep: Remove superfluous preempt_disable()
The raw_lock_irq_{save,restore}() already implies a
non-preemptibility.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-14 13:35:18 +01:00