Commit Graph

26853 Commits

Author SHA1 Message Date
Linus Torvalds
9ca2c16f3b Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Thomas Gleixner:
 "Perf tool updates and kprobe fixes:

   - perf_mmap overwrite mode fixes/overhaul, prep work to get 'perf
     top' using it, making it bearable to use it in large core count
     systems such as Knights Landing/Mill Intel systems (Kan Liang)

   - s/390 now uses syscall.tbl, just like x86-64 to generate the
     syscall table id -> string tables used by 'perf trace' (Hendrik
     Brueckner)

   - Use strtoull() instead of home grown function (Andy Shevchenko)

   - Synchronize kernel ABI headers, v4.16-rc1 (Ingo Molnar)

   - Document missing 'perf data --force' option (Sangwon Hong)

   - Add perf vendor JSON metrics for ARM Cortex-A53 Processor (William
     Cohen)

   - Improve error handling and error propagation of ftrace based
     kprobes so failures when installing kprobes are not silently
     ignored and create disfunctional tracepoints"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  kprobes: Propagate error from disarm_kprobe_ftrace()
  kprobes: Propagate error from arm_kprobe_ftrace()
  Revert "tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h"
  perf s390: Rework system call table creation by using syscall.tbl
  perf s390: Grab a copy of arch/s390/kernel/syscall/syscall.tbl
  tools/headers: Synchronize kernel ABI headers, v4.16-rc1
  perf test: Fix test trace+probe_libc_inet_pton.sh for s390x
  perf data: Document missing --force option
  perf tools: Substitute yet another strtoull()
  perf top: Check the latency of perf_top__mmap_read()
  perf top: Switch default mode to overwrite mode
  perf top: Remove lost events checking
  perf hists browser: Add parameter to disable lost event warning
  perf top: Add overwrite fall back
  perf evsel: Expose the perf_missing_features struct
  perf top: Check per-event overwrite term
  perf mmap: Discard legacy interface for mmap read
  perf test: Update mmap read functions for backward-ring-buffer test
  perf mmap: Introduce perf_mmap__read_event()
  perf mmap: Introduce perf_mmap__read_done()
  ...
2018-02-18 12:38:40 -08:00
Linus Torvalds
2d6c4e40ab Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "A small set of updates mostly for irq chip drivers:

   - MIPS GIC fix for spurious, masked interrupts

   - fix for a subtle IPI bug in GICv3

   - do not probe GICv3 ITSs that are marked as disabled

   - multi-MSI support for GICv2m

   - various small cleanups"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqdomain: Re-use DEFINE_SHOW_ATTRIBUTE() macro
  irqchip/bcm: Remove hashed address printing
  irqchip/gic-v2m: Add PCI Multi-MSI support
  irqchip/gic-v3: Ignore disabled ITS nodes
  irqchip/gic-v3: Use wmb() instead of smb_wmb() in gic_raise_softirq()
  irqchip/gic-v3: Change pr_debug message to pr_devel
  irqchip/mips-gic: Avoid spuriously handling masked interrupts
2018-02-18 12:22:04 -08:00
Andy Shevchenko
0b24a0bbe2 irqdomain: Re-use DEFINE_SHOW_ATTRIBUTE() macro
...instead of open coding file operations followed by custom ->open()
callbacks per each attribute.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-02-16 14:22:34 +00:00
Jessica Yu
297f9233b5 kprobes: Propagate error from disarm_kprobe_ftrace()
Improve error handling when disarming ftrace-based kprobes. Like with
arm_kprobe_ftrace(), propagate any errors from disarm_kprobe_ftrace() so
that we do not disable/unregister kprobes that are still armed. In other
words, unregister_kprobe() and disable_kprobe() should not report success
if the kprobe could not be disarmed.

disarm_all_kprobes() keeps its current behavior and attempts to
disarm all kprobes. It returns the last encountered error and gives a
warning if not all probes could be disarmed.

This patch is based on Petr Mladek's original patchset (patches 2 and 3)
back in 2015, which improved kprobes error handling, found here:

   https://lkml.org/lkml/2015/2/26/452

However, further work on this had been paused since then and the patches
were not upstreamed.

Based-on-patches-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/20180109235124.30886-3-jeyu@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-16 09:12:58 +01:00
Jessica Yu
12310e3437 kprobes: Propagate error from arm_kprobe_ftrace()
Improve error handling when arming ftrace-based kprobes. Specifically, if
we fail to arm a ftrace-based kprobe, register_kprobe()/enable_kprobe()
should report an error instead of success. Previously, this has lead to
confusing situations where register_kprobe() would return 0 indicating
success, but the kprobe would not be functional if ftrace registration
during the kprobe arming process had failed. We should therefore take any
errors returned by ftrace into account and propagate this error so that we
do not register/enable kprobes that cannot be armed. This can happen if,
for example, register_ftrace_function() finds an IPMODIFY conflict (since
kprobe_ftrace_ops has this flag set) and returns an error. Such a conflict
is possible since livepatches also set the IPMODIFY flag for their ftrace_ops.

arm_all_kprobes() keeps its current behavior and attempts to arm all
kprobes. It returns the last encountered error and gives a warning if
not all probes could be armed.

This patch is based on Petr Mladek's original patchset (patches 2 and 3)
back in 2015, which improved kprobes error handling, found here:

   https://lkml.org/lkml/2015/2/26/452

However, further work on this had been paused since then and the patches
were not upstreamed.

Based-on-patches-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/20180109235124.30886-2-jeyu@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-16 09:12:52 +01:00
Linus Torvalds
1388c80438 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Misc fixes:

   - fix rq->lock lockdep annotation bug

   - fix/improve update_curr_rt() and update_curr_dl() accounting

   - update documentation

   - remove unused macro"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/cpufreq: Remove unused SUGOV_KTHREAD_PRIORITY macro
  sched/core: Fix DEBUG_SPINLOCK annotation for rq->lock
  sched/rt: Make update_curr_rt() more accurate
  sched/deadline: Make update_curr_dl() more accurate
  membarrier-sync-core: Document architecture support
2018-02-15 09:28:47 -08:00
Linus Torvalds
e9e3b3002f Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "This contains two qspinlock fixes and three documentation and comment
  fixes"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/semaphore: Update the file path in documentation
  locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit()
  locking/qspinlock: Ensure node->count is updated before initialising node
  locking/qspinlock: Ensure node is initialised before updating prev->next
  Documentation/locking/mutex-design: Update to reflect latest changes
2018-02-15 09:05:26 -08:00
Will Deacon
11dc13224c locking/qspinlock: Ensure node->count is updated before initialising node
When queuing on the qspinlock, the count field for the current CPU's head
node is incremented. This needn't be atomic because locking in e.g. IRQ
context is balanced and so an IRQ will return with node->count as it
found it.

However, the compiler could in theory reorder the initialisation of
node[idx] before the increment of the head node->count, causing an
IRQ to overwrite the initialised node and potentially corrupt the lock
state.

Avoid the potential for this harmful compiler reordering by placing a
barrier() between the increment of the head node->count and the subsequent
node initialisation.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1518528177-19169-3-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 14:50:14 +01:00
Will Deacon
95bcade33a locking/qspinlock: Ensure node is initialised before updating prev->next
When a locker ends up queuing on the qspinlock locking slowpath, we
initialise the relevant mcs node and publish it indirectly by updating
the tail portion of the lock word using xchg_tail. If we find that there
was a pre-existing locker in the queue, we subsequently update their
->next field to point at our node so that we are notified when it's our
turn to take the lock.

This can be roughly illustrated as follows:

  /* Initialise the fields in node and encode a pointer to node in tail */
  tail = initialise_node(node);

  /*
   * Exchange tail into the lockword using an atomic read-modify-write
   * operation with release semantics
   */
  old = xchg_tail(lock, tail);

  /* If there was a pre-existing waiter ... */
  if (old & _Q_TAIL_MASK) {
	prev = decode_tail(old);
	smp_read_barrier_depends();

	/* ... then update their ->next field to point to node.
	WRITE_ONCE(prev->next, node);
  }

The conditional update of prev->next therefore relies on the address
dependency from the result of xchg_tail ensuring order against the
prior initialisation of node. However, since the release semantics of
the xchg_tail operation apply only to the write portion of the RmW,
then this ordering is not guaranteed and it is possible for the CPU
to return old before the writes to node have been published, consequently
allowing us to point prev->next to an uninitialised node.

This patch fixes the problem by making the update of prev->next a RELEASE
operation, which also removes the reliance on dependency ordering.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1518528177-19169-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 14:50:14 +01:00
Leo Yan
43d1b29b27 sched/cpufreq: Remove unused SUGOV_KTHREAD_PRIORITY macro
Since schedutil kernel thread directly set priority to 0, the macro
SUGOV_KTHREAD_PRIORITY is not used.  So remove it.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vikram Mulukutla <markivx@codeaurora.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Link: http://lkml.kernel.org/r/1518097702-9665-1-git-send-email-leo.yan@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 13:04:03 +01:00
Peter Zijlstra
269d599271 sched/core: Fix DEBUG_SPINLOCK annotation for rq->lock
Mark noticed that he had sporadic "spinlock recursion" warnings from
the DEBUG_SPINLOCK code. Now rq->lock is special in that the owner
changes in the middle of a context switch.

It so happens that we fix up the lock.owner too late, @prev can run
(remotely) the moment prev->on_cpu is cleared, this then allows @prev
to again try and acquire this rq->lock and trigger this warning.

So we have to switch lock.owner before clearing prev->on_cpu.

Do this by moving the DEBUG_SPINLOCK annotation from after switch_to()
to before switch_to() and collect all lockdep annotations there into
prepare_lock_switch() to mirror the existing finish_lock_switch().

Debugged-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 11:44:41 +01:00
Wen Yang
a7711602c7 sched/rt: Make update_curr_rt() more accurate
rq->clock_task may be updated between the two calls of
rq_clock_task() in update_curr_rt(). Calling rq_clock_task() only
once makes it more accurate and efficient, taking update_curr() as
reference.

Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: zhong.weidong@zte.com.cn
Link: http://lkml.kernel.org/r/1517882008-44552-1-git-send-email-wen.yang99@zte.com.cn
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 11:44:41 +01:00
Wen Yang
6fe0ce1eb0 sched/deadline: Make update_curr_dl() more accurate
rq->clock_task may be updated between the two calls of
rq_clock_task() in update_curr_dl(). Calling rq_clock_task() only
once makes it more accurate and efficient, taking update_curr() as
reference.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: zhong.weidong@zte.com.cn
Link: http://lkml.kernel.org/r/1517882148-44599-1-git-send-email-wen.yang99@zte.com.cn
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 11:44:40 +01:00
Linus Torvalds
a9a08845e9 vfs: do bulk POLL* -> EPOLL* replacement
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
        L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
        for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

with de-mangling cleanups yet to come.

NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do.  But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.

The next patch from Al will sort out the final differences, and we
should be all done.

Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-11 14:34:03 -08:00
Linus Torvalds
15303ba5d1 Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Radim Krčmář:
 "ARM:

   - icache invalidation optimizations, improving VM startup time

   - support for forwarded level-triggered interrupts, improving
     performance for timers and passthrough platform devices

   - a small fix for power-management notifiers, and some cosmetic
     changes

  PPC:

   - add MMIO emulation for vector loads and stores

   - allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
     requiring the complex thread synchronization of older CPU versions

   - improve the handling of escalation interrupts with the XIVE
     interrupt controller

   - support decrement register migration

   - various cleanups and bugfixes.

  s390:

   - Cornelia Huck passed maintainership to Janosch Frank

   - exitless interrupts for emulated devices

   - cleanup of cpuflag handling

   - kvm_stat counter improvements

   - VSIE improvements

   - mm cleanup

  x86:

   - hypervisor part of SEV

   - UMIP, RDPID, and MSR_SMI_COUNT emulation

   - paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit

   - allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more
     AVX512 features

   - show vcpu id in its anonymous inode name

   - many fixes and cleanups

   - per-VCPU MSR bitmaps (already merged through x86/pti branch)

   - stable KVM clock when nesting on Hyper-V (merged through
     x86/hyperv)"

* tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits)
  KVM: PPC: Book3S: Add MMIO emulation for VMX instructions
  KVM: PPC: Book3S HV: Branch inside feature section
  KVM: PPC: Book3S HV: Make HPT resizing work on POWER9
  KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code
  KVM: PPC: Book3S PR: Fix broken select due to misspelling
  KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs()
  KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled
  KVM: PPC: Book3S HV: Drop locks before reading guest memory
  kvm: x86: remove efer_reload entry in kvm_vcpu_stat
  KVM: x86: AMD Processor Topology Information
  x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested
  kvm: embed vcpu id to dentry of vcpu anon inode
  kvm: Map PFN-type memory regions as writable (if possible)
  x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n
  KVM: arm/arm64: Fixup userspace irqchip static key optimization
  KVM: arm/arm64: Fix userspace_irqchip_in_use counting
  KVM: arm/arm64: Fix incorrect timer_is_pending logic
  MAINTAINERS: update KVM/s390 maintainers
  MAINTAINERS: add Halil as additional vfio-ccw maintainer
  MAINTAINERS: add David as a reviewer for KVM/s390
  ...
2018-02-10 13:16:35 -08:00
Linus Torvalds
c839682c71 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Make allocations less aggressive in x_tables, from Minchal Hocko.

 2) Fix netfilter flowtable Kconfig deps, from Pablo Neira Ayuso.

 3) Fix connection loss problems in rtlwifi, from Larry Finger.

 4) Correct DRAM dump length for some chips in ath10k driver, from Yu
    Wang.

 5) Fix ABORT handling in rxrpc, from David Howells.

 6) Add SPDX tags to Sun networking drivers, from Shannon Nelson.

 7) Some ipv6 onlink handling fixes, from David Ahern.

 8) Netem packet scheduler interval calcualtion fix from Md. Islam.

 9) Don't put crypto buffers on-stack in rxrpc, from David Howells.

10) Fix handling of error non-delivery status in netlink multicast
    delivery over multiple namespaces, from Nicolas Dichtel.

11) Missing xdp flush in tuntap driver, from Jason Wang.

12) Synchonize RDS protocol netns/module teardown with rds object
    management, from Sowini Varadhan.

13) Add nospec annotations to mpls, from Dan Williams.

14) Fix SKB truesize handling in TIPC, from Hoang Le.

15) Interrupt masking fixes in stammc from Niklas Cassel.

16) Don't allow ptr_ring objects to be sized outside of kmalloc's
    limits, from Jason Wang.

17) Don't allow SCTP chunks to be built which will have a length
    exceeding the chunk header's 16-bit length field, from Alexey
    Kodanev.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (82 commits)
  ibmvnic: Remove skb->protocol checks in ibmvnic_xmit
  bpf: fix rlimit in reuseport net selftest
  sctp: verify size of a new chunk in _sctp_make_chunk()
  s390/qeth: fix SETIP command handling
  s390/qeth: fix underestimated count of buffer elements
  ptr_ring: try vmalloc() when kmalloc() fails
  ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE
  net: stmmac: remove redundant enable of PMT irq
  net: stmmac: rename GMAC_INT_DEFAULT_MASK for dwmac4
  net: stmmac: discard disabled flags in interrupt status register
  ibmvnic: Reset long term map ID counter
  tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
  selftests/bpf: add selftest that use test_libbpf_open
  selftests/bpf: add test program for loading BPF ELF files
  tools/libbpf: improve the pr_debug statements to contain section numbers
  bpf: Sync kernel ABI header with tooling header for bpf_common.h
  net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT
  net: thunder: change q_len's type to handle max ring size
  tipc: fix skb truesize/datasize ratio control
  net/sched: cls_u32: fix cls_u32 on filter replace
  ...
2018-02-09 15:34:18 -08:00
Linus Torvalds
8158c2ffa4 Merge tag 'trace-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
 "Al Viro discovered some breakage with the parsing of the
  set_ftrace_filter as well as the removing of function probes.

  This fixes the code with Al's suggestions. I also added a few
  selftests to test the broken cases such that they wont happen
  again"

* tag 'trace-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  selftests/ftrace: Add more tests for removing of function probes
  selftests/ftrace: Add some missing glob checks
  selftests/ftrace: Have reset_ftrace_filter handle multiple instances
  selftests/ftrace: Have reset_ftrace_filter handle modules
  tracing: Fix parsing of globs with a wildcard at the beginning
  ftrace: Remove incorrect setting of glob search field
2018-02-09 14:47:09 -08:00
David S. Miller
437a4db66d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2018-02-09

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Two fixes for BPF sockmap in order to break up circular map references
   from programs attached to sockmap, and detaching related sockets in
   case of socket close() event. For the latter we get rid of the
   smap_state_change() and plug into ULP infrastructure, which will later
   also be used for additional features anyway such as TX hooks. For the
   second issue, dependency chain is broken up via map release callback
   to free parse/verdict programs, all from John.

2) Fix a libbpf relocation issue that was found while implementing XDP
   support for Suricata project. Issue was that when clang was invoked
   with default target instead of bpf target, then various other e.g.
   debugging relevant sections are added to the ELF file that contained
   relocation entries pointing to non-BPF related sections which libbpf
   trips over instead of skipping them. Test cases for libbpf are added
   as well, from Jesper.

3) Various misc fixes for bpftool and one for libbpf: a small addition
   to libbpf to make sure it recognizes all standard section prefixes.
   Then, the Makefile in bpftool/Documentation is improved to explicitly
   check for rst2man being installed on the system as we otherwise risk
   installing empty man pages; the man page for bpftool-map is corrected
   and a set of missing bash completions added in order to avoid shipping
   bpftool where the completions are only partially working, from Quentin.

4) Fix applying the relocation to immediate load instructions in the
   nfp JIT which were missing a shift, from Jakub.

5) Two fixes for the BPF kernel selftests: handle CONFIG_BPF_JIT_ALWAYS_ON=y
   gracefully in test_bpf.ko module and mark them as FLAG_EXPECTED_FAIL
   in this case; and explicitly delete the veth devices in the two tests
   test_xdp_{meta,redirect}.sh before dismantling the netnses as when
   selftests are run in batch mode, then workqueue to handle destruction
   might not have finished yet and thus veth creation in next test under
   same dev name would fail, from Yonghong.

6) Fix test_kmod.sh to check the test_bpf.ko module path before performing
   an insmod, and fallback to modprobe. Especially the latter is useful
   when having a device under test that has the modules installed instead,
   from Naresh.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-09 14:05:10 -05:00
Steven Rostedt (VMware)
0723402141 tracing: Fix parsing of globs with a wildcard at the beginning
Al Viro reported:

    For substring - sure, but what about something like "*a*b" and "a*b"?
    AFAICS, filter_parse_regex() ends up with identical results in both
    cases - MATCH_GLOB and *search = "a*b".  And no way for the caller
    to tell one from another.

Testing this with the following:

 # cd /sys/kernel/tracing
 # echo '*raw*lock' > set_ftrace_filter
 bash: echo: write error: Invalid argument

With this patch:

 # echo '*raw*lock' > set_ftrace_filter
 # cat set_ftrace_filter
_raw_read_trylock
_raw_write_trylock
_raw_read_unlock
_raw_spin_unlock
_raw_write_unlock
_raw_spin_trylock
_raw_spin_lock
_raw_write_lock
_raw_read_lock

Al recommended not setting the search buffer to skip the first '*' unless we
know we are not using MATCH_GLOB. This implements his suggested logic.

Link: http://lkml.kernel.org/r/20180127170748.GF13338@ZenIV.linux.org.uk

Cc: stable@vger.kernel.org
Fixes: 60f1d5e3ba ("ftrace: Support full glob matching")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Suggsted-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-02-08 10:11:47 -05:00
Steven Rostedt (VMware)
7b65865627 ftrace: Remove incorrect setting of glob search field
__unregister_ftrace_function_probe() will incorrectly parse the glob filter
because it resets the search variable that was setup by filter_parse_regex().

Al Viro reported this:

    After that call of filter_parse_regex() we could have func_g.search not
    equal to glob only if glob started with '!' or '*'.  In the former case
    we would've buggered off with -EINVAL (not = 1).  In the latter we
    would've set func_g.search equal to glob + 1, calculated the length of
    that thing in func_g.len and proceeded to reset func_g.search back to
    glob.

    Suppose the glob is e.g. *foo*.  We end up with
	    func_g.type = MATCH_MIDDLE_ONLY;
	    func_g.len = 3;
	    func_g.search = "*foo";
    Feeding that to ftrace_match_record() will not do anything sane - we
    will be looking for names containing "*foo" (->len is ignored for that
    one).

Link: http://lkml.kernel.org/r/20180127031706.GE13338@ZenIV.linux.org.uk

Cc: stable@vger.kernel.org
Fixes: 3ba0092971 ("ftrace: Introduce ftrace_glob structure")
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-02-08 10:11:11 -05:00
Linus Torvalds
581e400ff9 Merge tag 'modules-for-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull modules updates from Jessica Yu:
 "Minor code cleanups and MAINTAINERS update"

* tag 'modules-for-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  modpost: Remove trailing semicolon
  ftrace/module: Move ftrace_release_mod() to ddebug_cleanup label
  MAINTAINERS: Remove from module & paravirt maintenance
2018-02-07 14:29:34 -08:00
Linus Torvalds
a2e5790d84 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - kasan updates

 - procfs

 - lib/bitmap updates

 - other lib/ updates

 - checkpatch tweaks

 - rapidio

 - ubsan

 - pipe fixes and cleanups

 - lots of other misc bits

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits)
  Documentation/sysctl/user.txt: fix typo
  MAINTAINERS: update ARM/QUALCOMM SUPPORT patterns
  MAINTAINERS: update various PALM patterns
  MAINTAINERS: update "ARM/OXNAS platform support" patterns
  MAINTAINERS: update Cortina/Gemini patterns
  MAINTAINERS: remove ARM/CLKDEV SUPPORT file pattern
  MAINTAINERS: remove ANDROID ION pattern
  mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors
  mm: docs: fix parameter names mismatch
  mm: docs: fixup punctuation
  pipe: read buffer limits atomically
  pipe: simplify round_pipe_size()
  pipe: reject F_SETPIPE_SZ with size over UINT_MAX
  pipe: fix off-by-one error when checking buffer limits
  pipe: actually allow root to exceed the pipe buffer limits
  pipe, sysctl: remove pipe_proc_fn()
  pipe, sysctl: drop 'min' parameter from pipe-max-size converter
  kasan: rework Kconfig settings
  crash_dump: is_kdump_kernel can be boolean
  kernel/mutex: mutex_is_locked can be boolean
  ...
2018-02-06 22:15:42 -08:00
Linus Torvalds
ab2d92ad88 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:

 - membarrier updates (Mathieu Desnoyers)

 - SMP balancing optimizations (Mel Gorman)

 - stats update optimizations (Peter Zijlstra)

 - RT scheduler race fixes (Steven Rostedt)

 - misc fixes and updates

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Use a recently used CPU as an idle candidate and the basis for SIS
  sched/fair: Do not migrate if the prev_cpu is idle
  sched/fair: Restructure wake_affine*() to return a CPU id
  sched/fair: Remove unnecessary parameters from wake_affine_idle()
  sched/rt: Make update_curr_rt() more accurate
  sched/rt: Up the root domain ref count when passing it around via IPIs
  sched/rt: Use container_of() to get root domain in rto_push_irq_work_func()
  sched/core: Optimize update_stats_*()
  sched/core: Optimize ttwu_stat()
  membarrier/selftest: Test private expedited sync core command
  membarrier/arm64: Provide core serializing command
  membarrier/x86: Provide core serializing command
  membarrier: Provide core serializing command, *_SYNC_CORE
  lockin/x86: Implement sync_core_before_usermode()
  locking: Introduce sync_core_before_usermode()
  membarrier/selftest: Test global expedited command
  membarrier: Provide GLOBAL_EXPEDITED command
  membarrier: Document scheduler barrier requirements
  powerpc, membarrier: Skip memory barrier in switch_mm()
  membarrier/selftest: Test private expedited command
2018-02-06 19:57:31 -08:00
Linus Torvalds
0dc400f41f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix error path in netdevsim, from Jakub Kicinski.

 2) Default values listed in tcp_wmem and tcp_rmem documentation were
    inaccurate, from Tonghao Zhang.

 3) Fix route leaks in SCTP, both for ipv4 and ipv6. From Alexey Kodanev
    and Tommi Rantala.

 4) Fix "MASK < Y" meant to be "MASK << Y" in xgbe driver, from Wolfram
    Sang.

 5) Use after free in u32_destroy_key(), from Paolo Abeni.

 6) Fix two TX issues in be2net driver, from Suredh Reddy.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (25 commits)
  be2net: Handle transmit completion errors in Lancer
  be2net: Fix HW stall issue in Lancer
  RDS: IB: Fix null pointer issue
  nfp: fix kdoc warnings on nested structures
  sample/bpf: fix erspan metadata
  net: erspan: fix erspan config overwrite
  net: erspan: fix metadata extraction
  cls_u32: fix use after free in u32_destroy_key()
  net: amd-xgbe: fix comparison to bitshift when dealing with a mask
  net: phy: Handle not having GPIO enabled in the kernel
  ibmvnic: fix empty firmware version and errors cleanup
  sctp: fix dst refcnt leak in sctp_v4_get_dst
  sctp: fix dst refcnt leak in sctp_v6_get_dst()
  dwc-xlgmac: remove Jie Deng as co-maintainer
  doc: Change the min default value of tcp_wmem/tcp_rmem.
  samples/bpf: use bpf_set_link_xdp_fd
  libbpf: add missing SPDX-License-Identifier
  libbpf: add error reporting in XDP
  libbpf: add function to setup XDP
  tools: add netlink.h and if_link.h in tools uapi
  ...
2018-02-06 19:00:42 -08:00
Eric Biggers
96e99be40e pipe: reject F_SETPIPE_SZ with size over UINT_MAX
A pipe's size is represented as an 'unsigned int'.  As expected, writing a
value greater than UINT_MAX to /proc/sys/fs/pipe-max-size fails with
EINVAL.  However, the F_SETPIPE_SZ fcntl silently truncates such values to
32 bits, rather than failing with EINVAL as expected.  (It *does* fail
with EINVAL for values above (1 << 31) but <= UINT_MAX.)

Fix this by moving the check against UINT_MAX into round_pipe_size() which
is called in both cases.

Link: http://lkml.kernel.org/r/20180111052902.14409-6-ebiggers3@gmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof@kernel.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:47 -08:00