Commit Graph

20620 Commits

Author SHA1 Message Date
Lorenzo Stoakes
3e44c471a2 gcov: add support for GCC 5.1
Fix kernel gcov support for GCC 5.1.  Similar to commit a992bf836f
("gcov: add support for GCC 4.9"), this patch takes into account the
existence of a new gcov counter (see gcc's gcc/gcov-counter.def.)

Firstly, it increments GCOV_COUNTERS (to 10), which makes the data
structure struct gcov_info compatible with GCC 5.1.

Secondly, a corresponding counter function __gcov_merge_icall_topn (Top N
value tracking for indirect calls) is included in base.c with the other
gcov counters unused for kernel profiling.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Yuan Pengfei <coolypf@qq.com>
Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-30 19:44:57 -07:00
HATAYAMA Daisuke
5375b708f2 kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
Commit f06e5153f4 ("kernel/panic.c: add "crash_kexec_post_notifiers"
option for kdump after panic_notifers") introduced
"crash_kexec_post_notifiers" kernel boot option, which toggles wheather
panic() calls crash_kexec() before panic_notifiers and dump kmsg or after.

The problem is that the commit overlooks panic_on_oops kernel boot option.
 If it is enabled, crash_kexec() is called directly without going through
panic() in oops path.

To fix this issue, this patch adds a check to "crash_kexec_post_notifiers"
in the condition of kexec_should_crash().

Also, put a comment in kexec_should_crash() to explain not obvious things
on this patch.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-30 19:44:57 -07:00
HATAYAMA Daisuke
f45d85ff1f kernel/panic: call the 2nd crash_kexec() only if crash_kexec_post_notifiers is enabled
For compatibility with the behaviour before the commit f06e5153f4
("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after
panic_notifers"), the 2nd crash_kexec() should be called only if
crash_kexec_post_notifiers is enabled.

Note that crash_kexec() returns immediately if kdump crash kernel is not
loaded, so in this case, this patch makes no functionality change, but the
point is to make it explicit, from the caller panic() side, that the 2nd
crash_kexec() does nothing.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-30 19:44:57 -07:00
Linus Torvalds
4a10a91756 Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit
Pull audit updates from Paul Moore:
 "Four small audit patches for v4.2, all bug fixes.  Only 10 lines of
  change this time so very unremarkable, the patch subject lines pretty
  much tell the whole story"

* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
  audit: Fix check of return value of strnlen_user()
  audit: obsolete audit_context check is removed in audit_filter_rules()
  audit: fix for typo in comment to function audit_log_link_denied()
  lsm: rename duplicate labels in LSM_AUDIT_DATA_TASK audit message type
2015-06-27 13:53:16 -07:00
Linus Torvalds
e22619a29f Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "The main change in this kernel is Casey's generalized LSM stacking
  work, which removes the hard-coding of Capabilities and Yama stacking,
  allowing multiple arbitrary "small" LSMs to be stacked with a default
  monolithic module (e.g.  SELinux, Smack, AppArmor).

  See
        https://lwn.net/Articles/636056/

  This will allow smaller, simpler LSMs to be incorporated into the
  mainline kernel and arbitrarily stacked by users.  Also, this is a
  useful cleanup of the LSM code in its own right"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (38 commits)
  tpm, tpm_crb: fix le64_to_cpu conversions in crb_acpi_add()
  vTPM: set virtual device before passing to ibmvtpm_reset_crq
  tpm_ibmvtpm: remove unneccessary message level.
  ima: update builtin policies
  ima: extend "mask" policy matching support
  ima: add support for new "euid" policy condition
  ima: fix ima_show_template_data_ascii()
  Smack: freeing an error pointer in smk_write_revoke_subj()
  selinux: fix setting of security labels on NFS
  selinux: Remove unused permission definitions
  selinux: enable genfscon labeling for sysfs and pstore files
  selinux: enable per-file labeling for debugfs files.
  selinux: update netlink socket classes
  signals: don't abuse __flush_signals() in selinux_bprm_committed_creds()
  selinux: Print 'sclass' as string when unrecognized netlink message occurs
  Smack: allow multiple labels in onlycap
  Smack: fix seq operations in smackfs
  ima: pass iint to ima_add_violation()
  ima: wrap event related data to the new ima_event_data structure
  integrity: add validity checks for 'path' parameter
  ...
2015-06-27 13:26:03 -07:00
Linus Torvalds
e0dd880a54 Merge branch 'for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo:
 "Most of the changes are around implementing and fixing fallouts from
  sysfs and internal interface to limit the CPUs available to all
  unbound workqueues to help isolating CPUs.  It needs more work as
  ordered workqueues can roam unrestricted but still is a significant
  improvement"

* 'for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix typos in comments
  workqueue: move flush_scheduled_work() to workqueue.h
  workqueue: remove the lock from wq_sysfs_prep_attrs()
  workqueue: remove the declaration of copy_workqueue_attrs()
  workqueue: ensure attrs changes are properly synchronized
  workqueue: separate out and refactor the locking of applying attrs
  workqueue: simplify wq_update_unbound_numa()
  workqueue: wq_pool_mutex protects the attrs-installation
  workqueue: fix a typo
  workqueue: function name in the comment differs from the real function name
  workqueue: fix trivial typo in Documentation/workqueue.txt
  workqueue: Allow modifying low level unbound workqueue cpumask
  workqueue: Create low-level unbound workqueues cpumask
  workqueue: split apply_workqueue_attrs() into 3 stages
2015-06-26 20:12:21 -07:00
Linus Torvalds
bbe179f88d Merge branch 'for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:

 - threadgroup_lock got reorganized so that its users can pick the
   actual locking mechanism to use.  Its only user - cgroups - is
   updated to use a percpu_rwsem instead of per-process rwsem.

   This makes things a bit lighter on hot paths and allows cgroups to
   perform and fail multi-task (a process) migrations atomically.
   Multi-task migrations are used in several places including the
   unified hierarchy.

 - Delegation rule and documentation added to unified hierarchy.  This
   will likely be the last interface update from the cgroup core side
   for unified hierarchy before lifting the devel mask.

 - Some groundwork for the pids controller which is scheduled to be
   merged in the coming devel cycle.

* 'for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: add delegation section to unified hierarchy documentation
  cgroup: require write perm on common ancestor when moving processes on the default hierarchy
  cgroup: separate out cgroup_procs_write_permission() from __cgroup_procs_write()
  kernfs: make kernfs_get_inode() public
  MAINTAINERS: add a cgroup core co-maintainer
  cgroup: fix uninitialised iterator in for_each_subsys_which
  cgroup: replace explicit ss_mask checking with for_each_subsys_which
  cgroup: use bitmask to filter for_each_subsys
  cgroup: add seq_file forward declaration for struct cftype
  cgroup: simplify threadgroup locking
  sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem
  sched, cgroup: reorganize threadgroup locking
  cgroup: switch to unsigned long for bitmasks
  cgroup: reorganize include/linux/cgroup.h
  cgroup: separate out include/linux/cgroup-defs.h
  cgroup: fix some comment typos
2015-06-26 19:50:04 -07:00
Linus Torvalds
8d7804a2f0 Merge tag 'driver-core-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
 "Here is the driver core / firmware changes for 4.2-rc1.

  A number of small changes all over the place in the driver core, and
  in the firmware subsystem.  Nothing really major, full details in the
  shortlog.  Some of it is a bit of churn, given that the platform
  driver probing changes was found to not work well, so they were
  reverted.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (31 commits)
  Revert "base/platform: Only insert MEM and IO resources"
  Revert "base/platform: Continue on insert_resource() error"
  Revert "of/platform: Use platform_device interface"
  Revert "base/platform: Remove code duplication"
  firmware: add missing kfree for work on async call
  fs: sysfs: don't pass count == 0 to bin file readers
  base:dd - Fix for typo in comment to function driver_deferred_probe_trigger().
  base/platform: Remove code duplication
  of/platform: Use platform_device interface
  base/platform: Continue on insert_resource() error
  base/platform: Only insert MEM and IO resources
  firmware: use const for remaining firmware names
  firmware: fix possible use after free on name on asynchronous request
  firmware: check for file truncation on direct firmware loading
  firmware: fix __getname() missing failure check
  drivers: of/base: move of_init to driver_init
  drivers/base: cacheinfo: fix annoying typo when DT nodes are absent
  sysfs: disambiguate between "error code" and "failure" in comments
  driver-core: fix build for !CONFIG_MODULES
  driver-core: make __device_attach() static
  ...
2015-06-26 15:07:37 -07:00
Linus Torvalds
e382608254 Merge tag 'trace-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
 "This patch series contains several clean ups and even a new trace
  clock "monitonic raw".  Also some enhancements to make the ring buffer
  even faster.  But the biggest and most noticeable change is the
  renaming of the ftrace* files, structures and variables that have to
  deal with trace events.

  Over the years I've had several developers tell me about their
  confusion with what ftrace is compared to events.  Technically,
  "ftrace" is the infrastructure to do the function hooks, which include
  tracing and also helps with live kernel patching.  But the trace
  events are a separate entity altogether, and the files that affect the
  trace events should not be named "ftrace".  These include:

    include/trace/ftrace.h         ->    include/trace/trace_events.h
    include/linux/ftrace_event.h   ->    include/linux/trace_events.h

  Also, functions that are specific for trace events have also been renamed:

    ftrace_print_*()               ->    trace_print_*()
    (un)register_ftrace_event()    ->    (un)register_trace_event()
    ftrace_event_name()            ->    trace_event_name()
    ftrace_trigger_soft_disabled() ->    trace_trigger_soft_disabled()
    ftrace_define_fields_##call()  ->    trace_define_fields_##call()
    ftrace_get_offsets_##call()    ->    trace_get_offsets_##call()

  Structures have been renamed:

    ftrace_event_file              ->    trace_event_file
    ftrace_event_{call,class}      ->    trace_event_{call,class}
    ftrace_event_buffer            ->    trace_event_buffer
    ftrace_subsystem_dir           ->    trace_subsystem_dir
    ftrace_event_raw_##call        ->    trace_event_raw_##call
    ftrace_event_data_offset_##call->    trace_event_data_offset_##call
    ftrace_event_type_funcs_##call ->    trace_event_type_funcs_##call

  And a few various variables and flags have also been updated.

  This has been sitting in linux-next for some time, and I have not
  heard a single complaint about this rename breaking anything.  Mostly
  because these functions, variables and structures are mostly internal
  to the tracing system and are seldom (if ever) used by anything
  external to that"

* tag 'trace-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
  ring_buffer: Allow to exit the ring buffer benchmark immediately
  ring-buffer-benchmark: Fix the wrong type
  ring-buffer-benchmark: Fix the wrong param in module_param
  ring-buffer: Add enum names for the context levels
  ring-buffer: Remove useless unused tracing_off_permanent()
  ring-buffer: Give NMIs a chance to lock the reader_lock
  ring-buffer: Add trace_recursive checks to ring_buffer_write()
  ring-buffer: Allways do the trace_recursive checks
  ring-buffer: Move recursive check to per_cpu descriptor
  ring-buffer: Add unlikelys to make fast path the default
  tracing: Rename ftrace_get_offsets_##call() to trace_event_get_offsets_##call()
  tracing: Rename ftrace_define_fields_##call() to trace_event_define_fields_##call()
  tracing: Rename ftrace_event_type_funcs_##call to trace_event_type_funcs_##call
  tracing: Rename ftrace_data_offset_##call to trace_event_data_offset_##call
  tracing: Rename ftrace_raw_##call event structures to trace_event_raw_##call
  tracing: Rename ftrace_trigger_soft_disabled() to trace_trigger_soft_disabled()
  tracing: Rename FTRACE_EVENT_FL_* flags to EVENT_FILE_FL_*
  tracing: Rename struct ftrace_subsystem_dir to trace_subsystem_dir
  tracing: Rename ftrace_event_name() to trace_event_name()
  tracing: Rename FTRACE_MAX_EVENT to TRACE_EVENT_TYPE_MAX
  ...
2015-06-26 14:02:43 -07:00
Linus Torvalds
fcbc1777ce Merge tag 'trace-fixes-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
 "This isn't my 4.2 pull request (yet).  I found a few more bugs that I
  would have sent to fix 4.1, but since 4.1 is already out, I'm sending
  this before sending my 4.2 request (which is ready to go).

  After fixing the previous filter issue reported by Vince Weaver, I
  could not come up with a situation where the operand counter (cnt)
  could go below zero, so I added a WARN_ON_ONCE(cnt < 0).  Vince was
  able to trigger that warn on with his fuzzer test, but didn't have a
  filter input that caused it.

  Later, Sasha Levin was able to trigger that same warning, and was able
  to give me the filter string that triggered it.  It was simply a
  single operation ">".

  I wrapped the filtering code in a userspace program such that I could
  single step through the logic.  With a single operator the operand
  counter can legitimately go below zero, and should be reported to the
  user as an error, but should not produce a kernel warning.  The
  WARN_ON_ONCE(cnt < 0) should be just a "if (cnt < 0) break;" and the
  code following it will produce the error message for the user.

  While debugging this, I found that there was another bug that let the
  pointer to the filter string go beyond the filter string.  This too
  was fixed.

  Finally, there was a typo in a stub function that only gets compiled
  if trace events is disabled but tracing is enabled (I'm not even sure
  that's possible)"

* tag 'trace-fixes-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix typo from "static inlin" to "static inline"
  tracing/filter: Do not allow infix to exceed end of string
  tracing/filter: Do not WARN on operand count going below zero
2015-06-26 13:56:55 -07:00
Linus Torvalds
e8a0b37d28 Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
 "Bigger items included in this update are:

   - A series of updates from Arnd for ARM randconfig build failures
   - Updates from Dmitry for StrongARM SA-1100 to move IRQ handling to
     drivers/irqchip/
   - Move ARMs SP804 timer to drivers/clocksource/
   - Perf updates from Mark Rutland in preparation to move the ARM perf
     code into drivers/ so it can be shared with ARM64.
   - MCPM updates from Nicolas
   - Add support for taking platform serial number from DT
   - Re-implement Keystone2 physical address space switch to conform to
     architecture requirements
   - Clean up ARMv7 LPAE code, which goes in hand with the Keystone2
     changes.
   - L2C cleanups to avoid unlocking caches if we're prevented by the
     secure support to unlock.
   - Avoid cleaning a potentially dirty cache containing stale data on
     CPU initialisation
   - Add ARM-only entry point for secondary startup (for machines that
     can only call into a Thumb kernel in ARM mode).  Same thing is also
     done for the resume entry point.
   - Provide arch_irqs_disabled via asm-generic
   - Enlarge ARMv7M vector table
   - Always use BFD linker for VDSO, as gold doesn't accept some of the
     options we need.
   - Fix an incorrect BSYM (for Thumb symbols) usage, and convert all
     BSYM compiler macros to a "badr" (for branch address).
   - Shut up compiler warnings provoked by our cmpxchg() implementation.
   - Ensure bad xchg sizes fail to link"

* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (75 commits)
  ARM: Fix build if CLKDEV_LOOKUP is not configured
  ARM: fix new BSYM() usage introduced via for-arm-soc branch
  ARM: 8383/1: nommu: avoid deprecated source register on mov
  ARM: 8391/1: l2c: add options to overwrite prefetching behavior
  ARM: 8390/1: irqflags: Get arch_irqs_disabled from asm-generic
  ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmap
  ARM: 8388/1: tcm: Don't crash when TCM banks are protected by TrustZone
  ARM: 8384/1: VDSO: force use of BFD linker
  ARM: 8385/1: VDSO: group link options
  ARM: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations
  ARM: remove __bad_xchg definition
  ARM: 8369/1: ARMv7M: define size of vector table for Vybrid
  ARM: 8382/1: clocksource: make ARM_TIMER_SP804 depend on GENERIC_SCHED_CLOCK
  ARM: 8366/1: move Dual-Timer SP804 driver to drivers/clocksource
  ARM: 8365/1: introduce sp804_timer_disable and remove arm_timer.h inclusion
  ARM: 8364/1: fix BE32 module loading
  ARM: 8360/1: add secondary_startup_arm prototype in header file
  ARM: 8359/1: correct secondary_startup_arm mode
  ARM: proc-v7: sanitise and document registers around errata
  ARM: proc-v7: clean up MIDR access
  ...
2015-06-26 12:20:00 -07:00
Linus Torvalds
47a469421d Merge branch 'akpm' (patches from Andrew)
Merge second patchbomb from Andrew Morton:

 - most of the rest of MM

 - lots of misc things

 - procfs updates

 - printk feature work

 - updates to get_maintainer, MAINTAINERS, checkpatch

 - lib/ updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (96 commits)
  exit,stats: /* obey this comment */
  coredump: add __printf attribute to cn_*printf functions
  coredump: use from_kuid/kgid when formatting corename
  fs/reiserfs: remove unneeded cast
  NILFS2: support NFSv2 export
  fs/befs/btree.c: remove unneeded initializations
  fs/minix: remove unneeded cast
  init/do_mounts.c: add create_dev() failure log
  kasan: remove duplicate definition of the macro KASAN_FREE_PAGE
  fs/efs: femove unneeded cast
  checkpatch: emit "NOTE: <types>" message only once after multiple files
  checkpatch: emit an error when there's a diff in a changelog
  checkpatch: validate MODULE_LICENSE content
  checkpatch: add multi-line handling for PREFER_ETHER_ADDR_COPY
  checkpatch: suggest using eth_zero_addr() and eth_broadcast_addr()
  checkpatch: fix processing of MEMSET issues
  checkpatch: suggest using ether_addr_equal*()
  checkpatch: avoid NOT_UNIFIED_DIFF errors on cover-letter.patch files
  checkpatch: remove local from codespell path
  checkpatch: add --showfile to allow input via pipe to show filenames
  ...
2015-06-26 09:52:05 -07:00
Rik van Riel
51229b4953 exit,stats: /* obey this comment */
There is a helpful comment in do_exit() that states we sync the mm's RSS
info before statistics gathering.

The function that does the statistics gathering is called right above that
comment.

Change the code to obey the comment.

Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:43 -07:00
Rasmus Villemoes
ff14417c0a kernel/trace/blktrace.c: use strreplace() in do_blk_trace_setup()
Part of the disassembly of do_blk_trace_setup:

    231b:       e8 00 00 00 00          callq  2320 <do_blk_trace_setup+0x50>
                        231c: R_X86_64_PC32     strlen+0xfffffffffffffffc
    2320:       eb 0a                   jmp    232c <do_blk_trace_setup+0x5c>
    2322:       66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
    2328:       48 83 c3 01             add    $0x1,%rbx
    232c:       48 39 d8                cmp    %rbx,%rax
    232f:       76 47                   jbe    2378 <do_blk_trace_setup+0xa8>
    2331:       41 80 3c 1c 2f          cmpb   $0x2f,(%r12,%rbx,1)
    2336:       75 f0                   jne    2328 <do_blk_trace_setup+0x58>
    2338:       41 c6 04 1c 5f          movb   $0x5f,(%r12,%rbx,1)
    233d:       4c 89 e7                mov    %r12,%rdi
    2340:       e8 00 00 00 00          callq  2345 <do_blk_trace_setup+0x75>
                        2341: R_X86_64_PC32     strlen+0xfffffffffffffffc
    2345:       eb e1                   jmp    2328 <do_blk_trace_setup+0x58>

Yep, that's right: gcc isn't smart enough to realize that replacing '/' by
'_' cannot change the strlen(), so we call it again and again (at least
when a '/' is found).  Even if gcc were that smart, this construction
would still loop over the string twice, once for the initial strlen() call
and then the open-coded loop.

Let's simply use strreplace() instead.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Liked-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:40 -07:00
Rasmus Villemoes
1bb564718f kernel/trace/trace_events_filter.c: use strreplace()
There's no point in starting over every time we see a ','...

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:40 -07:00
Vasily Averin
3ea4331c60 check_syslog_permissions() cleanup
Patch fixes drawbacks in heck_syslog_permissions() noticed by AKPM:
"from_file handling makes me cry.

That's not a boolean - it's an enumerated value with two values
currently defined.

But the code in check_syslog_permissions() treats it as a boolean and
also hardwires the knowledge that SYSLOG_FROM_PROC == 1 (or == `true`).

And the name is wrong: it should be called from_proc to match
SYSLOG_FROM_PROC."

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:39 -07:00
Vasily Averin
d194e5d666 security_syslog() should be called once only
The final version of commit 637241a900 ("kmsg: honor dmesg_restrict
sysctl on /dev/kmsg") lost few hooks, as result security_syslog() are
processed incorrectly:

- open of /dev/kmsg checks syslog access permissions by using
  check_syslog_permissions() where security_syslog() is not called if
  dmesg_restrict is set.

- syslog syscall and /proc/kmsg calls do_syslog() where security_syslog
  can be executed twice (inside check_syslog_permissions() and then
  directly in do_syslog())

With this patch security_syslog() is called once only in all
syslog-related operations regardless of dmesg_restrict value.

Fixes: 637241a900 ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:39 -07:00
Tejun Heo
6fe29354be printk: implement support for extended console drivers
printk log_buf keeps various metadata for each message including its
sequence number and timestamp.  The metadata is currently available only
through /dev/kmsg and stripped out before passed onto console drivers.  We
want this metadata to be available to console drivers too so that console
consumers can get full information including the metadata and dictionary,
which among other things can be used to detect whether messages got lost
in transit.

This patch implements support for extended console drivers.  Consoles can
indicate that they want extended messages by setting the new CON_EXTENDED
flag and they'll be fed messages formatted the same way as /dev/kmsg.

 "<level>,<sequnum>,<timestamp>,<contflag>;<message text>\n"

If extended consoles exist, in-kernel fragment assembly is disabled.  This
ensures that all messages emitted to consoles have full metadata including
sequence number.  The contflag carries enough information to reassemble
the fragments from the reader side trivially.  Note that this only affects
/dev/kmsg.  Regular console and /proc/kmsg outputs are not affected by
this change.

* Extended message formatting for console drivers is enabled iff there
  are registered extended consoles.

* Comment describing /dev/kmsg message format updated to add missing
  contflag field and help distinguishing variable from verbatim terms.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Miller <davem@davemloft.net>
Cc: Kay Sievers <kay@vrfy.org>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:38 -07:00
Tejun Heo
0a295e67ec printk: factor out message formatting from devkmsg_read()
The extended message formatting used for /dev/kmsg will be used implement
extended consoles.  Factor out msg_print_ext_header() and
msg_print_ext_body() from devkmsg_read().

This is pure restructuring.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Miller <davem@davemloft.net>
Cc: Kay Sievers <kay@vrfy.org>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:38 -07:00
Tejun Heo
d43ff430f4 printk: guard the amount written per line by devkmsg_read()
This patchset updates netconsole so that it can emit messages with the
same header as used in /dev/kmsg which gives neconsole receiver full log
information which enables things like structured logging and detection
of lost messages.

This patch (of 7):

devkmsg_read() uses 8k buffer and assumes that the formatted output
message won't overrun which seems safe given LOG_LINE_MAX, the current use
of dict and the escaping method being used; however, we're planning to use
devkmsg formatting wider and accounting for the buffer size properly isn't
that complicated.

This patch defines CONSOLE_EXT_LOG_MAX as 8192 and updates devkmsg_read()
so that it limits output accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Miller <davem@davemloft.net>
Cc: Kay Sievers <kay@vrfy.org>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:38 -07:00
Josh Triplett
3033f14ab7 clone: support passing tls argument via C rather than pt_regs magic
clone has some of the quirkiest syscall handling in the kernel, with a
pile of special cases, historical curiosities, and architecture-specific
calling conventions.  In particular, clone with CLONE_SETTLS accepts a
parameter "tls" that the C entry point completely ignores and some
assembly entry points overwrite; instead, the low-level arch-specific
code pulls the tls parameter out of the arch-specific register captured
as part of pt_regs on entry to the kernel.  That's a massive hack, and
it makes the arch-specific code only work when called via the specific
existing syscall entry points; because of this hack, any new clone-like
system call would have to accept an identical tls argument in exactly
the same arch-specific position, rather than providing a unified system
call entry point across architectures.

The first patch allows architectures to handle the tls argument via
normal C parameter passing, if they opt in by selecting
HAVE_COPY_THREAD_TLS.  The second patch makes 32-bit and 64-bit x86 opt
into this.

These two patches came out of the clone4 series, which isn't ready for
this merge window, but these first two cleanup patches were entirely
uncontroversial and have acks.  I'd like to go ahead and submit these
two so that other architectures can begin building on top of this and
opting into HAVE_COPY_THREAD_TLS.  However, I'm also happy to wait and
send these through the next merge window (along with v3 of clone4) if
anyone would prefer that.

This patch (of 2):

clone with CLONE_SETTLS accepts an argument to set the thread-local
storage area for the new thread.  sys_clone declares an int argument
tls_val in the appropriate point in the argument list (based on the
various CLONE_BACKWARDS variants), but doesn't actually use or pass along
that argument.  Instead, sys_clone calls do_fork, which calls
copy_process, which calls the arch-specific copy_thread, and copy_thread
pulls the corresponding syscall argument out of the pt_regs captured at
kernel entry (knowing what argument of clone that architecture passes tls
in).

Apart from being awful and inscrutable, that also only works because only
one code path into copy_thread can pass the CLONE_SETTLS flag, and that
code path comes from sys_clone with its architecture-specific
argument-passing order.  This prevents introducing a new version of the
clone system call without propagating the same architecture-specific
position of the tls argument.

However, there's no reason to pull the argument out of pt_regs when
sys_clone could just pass it down via C function call arguments.

Introduce a new CONFIG_HAVE_COPY_THREAD_TLS for architectures to opt into,
and a new copy_thread_tls that accepts the tls parameter as an additional
unsigned long (syscall-argument-sized) argument.  Change sys_clone's tls
argument to an unsigned long (which does not change the ABI), and pass
that down to copy_thread_tls.

Architectures that don't opt into copy_thread_tls will continue to ignore
the C argument to sys_clone in favor of the pt_regs captured at kernel
entry, and thus will be unable to introduce new versions of the clone
syscall.

Patch co-authored by Josh Triplett and Thiago Macieira.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thiago Macieira <thiago.macieira@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:38 -07:00
Alexey Dobriyan
4a00e9df29 prctl: more prctl(PR_SET_MM_*) checks
Individual prctl(PR_SET_MM_*) calls do some checking to maintain a
consistent view of mm->arg_start et al fields, but not enough.  In
particular PR_SET_MM_ARG_START/PR_SET_MM_ARG_END/ R_SET_MM_ENV_START/
PR_SET_MM_ENV_END only check that the address lies in an existing VMA,
but don't check that the start address is lower than the end address _at
all_.

Consolidate all consistency checks, so there will be no difference in
the future between PR_SET_MM_MAP and individual PR_SET_MM_* calls.

The program below makes both ARGV and ENVP areas be reversed.  It makes
/proc/$PID/cmdline show garbage (it doesn't oops by luck).

#include <sys/mman.h>
#include <sys/prctl.h>
#include <unistd.h>

enum {PAGE_SIZE=4096};

int main(void)
{
	void *p;

	p = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

#define PR_SET_MM               35
#define PR_SET_MM_ARG_START     8
#define PR_SET_MM_ARG_END       9
#define PR_SET_MM_ENV_START     10
#define PR_SET_MM_ENV_END       11
	prctl(PR_SET_MM, PR_SET_MM_ARG_START, (unsigned long)p + PAGE_SIZE - 1, 0, 0);
	prctl(PR_SET_MM, PR_SET_MM_ARG_END,   (unsigned long)p, 0, 0);
	prctl(PR_SET_MM, PR_SET_MM_ENV_START, (unsigned long)p + PAGE_SIZE - 1, 0, 0);
	prctl(PR_SET_MM, PR_SET_MM_ENV_END,   (unsigned long)p, 0, 0);

	pause();
	return 0;
}

[akpm@linux-foundation.org: tidy code, tweak comment]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jarod Wilson <jarod@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:37 -07:00
Steven Rostedt (Red Hat)
cc9e4bde03 tracing: Fix typo from "static inlin" to "static inline"
The trace.h header when called without CONFIG_EVENT_TRACING enabled
(seldom done), will not compile because of a typo in the protocol
of trace_event_enum_update().

Cc: stable@vger.kernel.org # 4.1+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-06-25 18:21:34 -04:00
Steven Rostedt (Red Hat)
6b88f44e16 tracing/filter: Do not allow infix to exceed end of string
While debugging a WARN_ON() for filtering, I found that it is possible
for the filter string to be referenced after its end. With the filter:

 # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter

The filter_parse() function can call infix_get_op() which calls
infix_advance() that updates the infix filter pointers for the cnt
and tail without checking if the filter is already at the end, which
will put the cnt to zero and the tail beyond the end. The loop then calls
infix_next() that has

	ps->infix.cnt--;
	return ps->infix.string[ps->infix.tail++];

The cnt will now be below zero, and the tail that is returned is
already passed the end of the filter string. So far the allocation
of the filter string usually has some buffer that is zeroed out, but
if the filter string is of the exact size of the allocated buffer
there's no guarantee that the charater after the nul terminating
character will be zero.

Luckily, only root can write to the filter.

Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-06-25 18:18:17 -04:00
Steven Rostedt (Red Hat)
b4875bbe7e tracing/filter: Do not WARN on operand count going below zero
When testing the fix for the trace filter, I could not come up with
a scenario where the operand count goes below zero, so I added a
WARN_ON_ONCE(cnt < 0) to the logic. But there is legitimate case
that it can happen (although the filter would be wrong).

 # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter

That is, a single operation without any operands will hit the path
where the WARN_ON_ONCE() can trigger. Although this is harmless,
and the filter is reported as a error. But instead of spitting out
a warning to the kernel dmesg, just fail nicely and report it via
the proper channels.

Link: http://lkml.kernel.org/r/558C6082.90608@oracle.com

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-06-25 18:02:29 -04:00