Commit Graph

25568 Commits

Author SHA1 Message Date
Roman Gushchin 65f3975f35 cgroup: revert fa06235b8e ("cgroup: reset css on destruction")
Commit fa06235b8e ("cgroup: reset css on destruction") caused
css_reset callback to be called from the offlining path.  Although it
solves the problem mentioned in the commit description ("For instance,
memory cgroup needs to reset memory.low, otherwise pages charged to a
dead cgroup might never get reclaimed."), generally speaking, it's not
correct.

An offline cgroup can still be a resource domain, and we shouldn't grant
it more resources than it had before deletion.

For instance, if an offline memory cgroup has dirty pages, we should
still imply i/o limits during writeback.

The css_reset callback is designed to return the cgroup state into the
original state, that means reset all limits and counters.  It's
spomething different from the offlining, and we shouldn't use it from
the offlining path.  Instead, we should adjust necessary settings from
the per-controller css_offline callbacks (e.g.  reset memory.low).

Link: http://lkml.kernel.org/r/20170727130428.28856-2-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:27 -07:00
Linus Torvalds e7d0c41ecc Merge tag 'devprop-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties framework updates from Rafael Wysocki:
 "These introduce fwnode operations for all of the separate types of
  'firmware nodes' that can be handled by the device properties
  framework, make the framework use const fwnode arguments all over, add
  a helper for the consolidated handling of node references and switch
  over the framework to the new UUID API.

  Specifics:

   - Introduce fwnode operations for all of the separate types of
     'firmware nodes' that can be handled by the device properties
     framework and drop the type field from struct fwnode_handle (Sakari
     Ailus, Arnd Bergmann).

   - Make the device properties framework use const fwnode arguments
     where possible (Sakari Ailus).

   - Add a helper for the consolidated handling of node references to
     the device properties framework (Sakari Ailus).

   - Switch over the ACPI part of the device properties framework to the
     new UUID API (Andy Shevchenko)"

* tag 'devprop-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: device property: Switch to use new generic UUID API
  device property: export irqchip_fwnode_ops
  device property: Introduce fwnode_property_get_reference_args
  device property: Constify fwnode property API
  device property: Constify argument to pset fwnode backend
  ACPI: Constify internal fwnode arguments
  ACPI: Constify acpi_bus helper functions, switch to macros
  ACPI: Prepare for constifying acpi_get_next_subnode() fwnode argument
  device property: Get rid of struct fwnode_handle type field
  ACPI: Use IS_ERR_OR_NULL() instead of non-NULL check in is_acpi_data_node()
2017-09-05 12:50:00 -07:00
Linus Torvalds 439644096c Merge tag 'pm-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
 "This time (again) cpufreq gets the majority of changes which mostly
  are driver updates (including a major consolidation of intel_pstate),
  some schedutil governor modifications and core cleanups.

  There also are some changes in the system suspend area, mostly related
  to diagnostics and debug messages plus some renames of things related
  to suspend-to-idle. One major change here is that suspend-to-idle is
  now going to be preferred over S3 on systems where the ACPI tables
  indicate to do so and provide requsite support (the Low Power Idle S0
  _DSM in particular). The system sleep documentation and the tools
  related to it are updated too.

  The rest is a few cpuidle changes (nothing major), devfreq updates,
  generic power domains (genpd) framework updates and a few assorted
  modifications elsewhere.

  Specifics:

   - Drop the P-state selection algorithm based on a PID controller from
     intel_pstate and make it use the same P-state selection method
     (based on the CPU load) for all types of systems in the active mode
     (Rafael Wysocki, Srinivas Pandruvada).

   - Rework the cpufreq core and governors to make it possible to take
     cross-CPU utilization updates into account and modify the schedutil
     governor to actually do so (Viresh Kumar).

   - Clean up the handling of transition latency information in the
     cpufreq core and untangle it from the information on which drivers
     cannot do dynamic frequency switching (Viresh Kumar).

   - Add support for new SoCs (MT2701/MT7623 and MT7622) to the mediatek
     cpufreq driver and update its DT bindings (Sean Wang).

   - Modify the cpufreq dt-platdev driver to autimatically create
     cpufreq devices for the new (v2) Operating Performance Points (OPP)
     DT bindings and update its whitelist of supported systems (Viresh
     Kumar, Shubhrajyoti Datta, Marc Gonzalez, Khiem Nguyen, Finley
     Xiao).

   - Add support for Ux500 to the cpufreq-dt driver and drop the
     obsolete dbx500 cpufreq driver (Linus Walleij, Arnd Bergmann).

   - Add new SoC (R8A7795) support to the cpufreq rcar driver (Khiem
     Nguyen).

   - Fix and clean up assorted issues in the cpufreq drivers and core
     (Arvind Yadav, Christophe Jaillet, Colin Ian King, Gustavo Silva,
     Julia Lawall, Leonard Crestez, Rob Herring, Sudeep Holla).

   - Update the IO-wait boost handling in the schedutil governor to make
     it less aggressive (Joel Fernandes).

   - Rework system suspend diagnostics to make it print fewer messages
     to the kernel log by default, add a sysfs knob to allow more
     suspend-related messages to be printed and add Low Power S0 Idle
     constraints checks to the ACPI suspend-to-idle code (Rafael
     Wysocki, Srinivas Pandruvada).

   - Prefer suspend-to-idle over S3 on ACPI-based systems with the
     ACPI_FADT_LOW_POWER_S0 flag set and the Low Power Idle S0 _DSM
     interface present in the ACPI tables (Rafael Wysocki).

   - Update documentation related to system sleep and rename a number of
     items in the code to make it cleare that they are related to
     suspend-to-idle (Rafael Wysocki).

   - Export a variable allowing device drivers to check the target
     system sleep state from the core system suspend code (Florian
     Fainelli).

   - Clean up the cpuidle subsystem to handle the polling state on x86
     in a more straightforward way and to use %pOF instead of full_name
     (Rafael Wysocki, Rob Herring).

   - Update the devfreq framework to fix and clean up a few minor issues
     (Chanwoo Choi, Rob Herring).

   - Extend diagnostics in the generic power domains (genpd) framework
     and clean it up slightly (Thara Gopinath, Rob Herring).

   - Fix and clean up a couple of issues in the operating performance
     points (OPP) framework (Viresh Kumar, Waldemar Rymarkiewicz).

   - Add support for RV1108 to the rockchip-io Adaptive Voltage Scaling
     (AVS) driver (David Wu).

   - Fix the usage of notifiers in CPU power management on some
     platforms (Alex Shi).

   - Update the pm-graph system suspend/hibernation and boot profiling
     utility (Todd Brandt).

   - Make it possible to run the cpupower utility without CPU0 (Prarit
     Bhargava)"

* tag 'pm-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (87 commits)
  cpuidle: Make drivers initialize polling state
  cpuidle: Move polling state initialization code to separate file
  cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol
  cpufreq: imx6q: Fix imx6sx low frequency support
  cpufreq: speedstep-lib: make several arrays static, makes code smaller
  PM: docs: Delete the obsolete states.txt document
  PM: docs: Describe high-level PM strategies and sleep states
  PM / devfreq: Fix memory leak when fail to register device
  PM / devfreq: Add dependency on PM_OPP
  PM / devfreq: Move private devfreq_update_stats() into devfreq
  PM / devfreq: Convert to using %pOF instead of full_name
  PM / AVS: rockchip-io: add io selectors and supplies for RV1108
  cpufreq: ti: Fix 'of_node_put' being called twice in error handling path
  cpufreq: dt-platdev: Drop few entries from whitelist
  cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2
  ARM: ux500: don't select CPUFREQ_DT
  cpuidle: Convert to using %pOF instead of full_name
  cpufreq: Convert to using %pOF instead of full_name
  PM / Domains: Convert to using %pOF instead of full_name
  cpufreq: Cap the default transition delay value to 10 ms
  ...
2017-09-05 12:19:08 -07:00
Linus Torvalds bafb0762cb Merge tag 'char-misc-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
 "Here is the big char/misc driver update for 4.14-rc1.

  Lots of different stuff in here, it's been an active development cycle
  for some reason. Highlights are:

   - updated binder driver, this brings binder up to date with what
     shipped in the Android O release, plus some more changes that
     happened since then that are in the Android development trees.

   - coresight updates and fixes

   - mux driver file renames to be a bit "nicer"

   - intel_th driver updates

   - normal set of hyper-v updates and changes

   - small fpga subsystem and driver updates

   - lots of const code changes all over the driver trees

   - extcon driver updates

   - fmc driver subsystem upadates

   - w1 subsystem minor reworks and new features and drivers added

   - spmi driver updates

  Plus a smattering of other minor driver updates and fixes.

  All of these have been in linux-next with no reported issues for a
  while"

* tag 'char-misc-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (244 commits)
  ANDROID: binder: don't queue async transactions to thread.
  ANDROID: binder: don't enqueue death notifications to thread todo.
  ANDROID: binder: Don't BUG_ON(!spin_is_locked()).
  ANDROID: binder: Add BINDER_GET_NODE_DEBUG_INFO ioctl
  ANDROID: binder: push new transactions to waiting threads.
  ANDROID: binder: remove proc waitqueue
  android: binder: Add page usage in binder stats
  android: binder: fixup crash introduced by moving buffer hdr
  drivers: w1: add hwmon temp support for w1_therm
  drivers: w1: refactor w1_slave_show to make the temp reading functionality separate
  drivers: w1: add hwmon support structures
  eeprom: idt_89hpesx: Support both ACPI and OF probing
  mcb: Fix an error handling path in 'chameleon_parse_cells()'
  MCB: add support for SC31 to mcb-lpc
  mux: make device_type const
  char: virtio: constify attribute_group structures.
  Documentation/ABI: document the nvmem sysfs files
  lkdtm: fix spelling mistake: "incremeted" -> "incremented"
  perf: cs-etm: Fix ETMv4 CONFIGR entry in perf.data file
  nvmem: include linux/err.h from header
  ...
2017-09-05 11:08:17 -07:00
Linus Torvalds 04759194dc Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:

 - VMAP_STACK support, allowing the kernel stacks to be allocated in the
   vmalloc space with a guard page for trapping stack overflows. One of
   the patches introduces THREAD_ALIGN and changes the generic
   alloc_thread_stack_node() to use this instead of THREAD_SIZE (no
   functional change for other architectures)

 - Contiguous PTE hugetlb support re-enabled (after being reverted a
   couple of times). We now have the semantics agreed in the generic mm
   layer together with API improvements so that the architecture code
   can detect between contiguous and non-contiguous huge PTEs

 - Initial support for persistent memory on ARM: DC CVAP instruction
   exposed to user space (HWCAP) and the in-kernel pmem API implemented

 - raid6 improvements for arm64: faster algorithm for the delta syndrome
   and implementation of the recovery routines using Neon

 - FP/SIMD refactoring and removal of support for Neon in interrupt
   context. This is in preparation for full SVE support

 - PTE accessors converted from inline asm to cmpxchg so that we can use
   LSE atomics if available (ARMv8.1)

 - Perf support for Cortex-A35 and A73

 - Non-urgent fixes and cleanups

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (75 commits)
  arm64: cleanup {COMPAT_,}SET_PERSONALITY() macro
  arm64: introduce separated bits for mm_context_t flags
  arm64: hugetlb: Cleanup setup_hugepagesz
  arm64: Re-enable support for contiguous hugepages
  arm64: hugetlb: Override set_huge_swap_pte_at() to support contiguous hugepages
  arm64: hugetlb: Override huge_pte_clear() to support contiguous hugepages
  arm64: hugetlb: Handle swap entries in huge_pte_offset() for contiguous hugepages
  arm64: hugetlb: Add break-before-make logic for contiguous entries
  arm64: hugetlb: Spring clean huge pte accessors
  arm64: hugetlb: Introduce pte_pgprot helper
  arm64: hugetlb: set_huge_pte_at Add WARN_ON on !pte_present
  arm64: kexec: have own crash_smp_send_stop() for crash dump for nonpanic cores
  arm64: dma-mapping: Mark atomic_pool as __ro_after_init
  arm64: dma-mapping: Do not pass data to gen_pool_set_algo()
  arm64: Remove the !CONFIG_ARM64_HW_AFDBM alternative code paths
  arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect()
  arm64: Move PTE_RDONLY bit handling out of set_pte_at()
  kvm: arm64: Convert kvm_set_s2pte_readonly() from inline asm to cmpxchg()
  arm64: Convert pte handling from inline asm to using (cmp)xchg
  arm64: neon/efi: Make EFI fpsimd save/restore variables static
  ...
2017-09-05 09:53:37 -07:00
Linus Torvalds f57091767a Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cache quality monitoring update from Thomas Gleixner:
 "This update provides a complete rewrite of the Cache Quality
  Monitoring (CQM) facility.

  The existing CQM support was duct taped into perf with a lot of issues
  and the attempts to fix those turned out to be incomplete and
  horrible.

  After lengthy discussions it was decided to integrate the CQM support
  into the Resource Director Technology (RDT) facility, which is the
  obvious choise as in hardware CQM is part of RDT. This allowed to add
  Memory Bandwidth Monitoring support on top.

  As a result the mechanisms for allocating cache/memory bandwidth and
  the corresponding monitoring mechanisms are integrated into a single
  management facility with a consistent user interface"

* 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
  x86/intel_rdt: Turn off most RDT features on Skylake
  x86/intel_rdt: Add command line options for resource director technology
  x86/intel_rdt: Move special case code for Haswell to a quirk function
  x86/intel_rdt: Remove redundant ternary operator on return
  x86/intel_rdt/cqm: Improve limbo list processing
  x86/intel_rdt/mbm: Fix MBM overflow handler during CPU hotplug
  x86/intel_rdt: Modify the intel_pqr_state for better performance
  x86/intel_rdt/cqm: Clear the default RMID during hotcpu
  x86/intel_rdt: Show bitmask of shareable resource with other executing units
  x86/intel_rdt/mbm: Handle counter overflow
  x86/intel_rdt/mbm: Add mbm counter initialization
  x86/intel_rdt/mbm: Basic counting of MBM events (total and local)
  x86/intel_rdt/cqm: Add CPU hotplug support
  x86/intel_rdt/cqm: Add sched_in support
  x86/intel_rdt: Introduce rdt_enable_key for scheduling
  x86/intel_rdt/cqm: Add mount,umount support
  x86/intel_rdt/cqm: Add rmdir support
  x86/intel_rdt: Separate the ctrl bits from rmdir
  x86/intel_rdt/cqm: Add mon_data
  x86/intel_rdt: Prepare for RDT monitor data support
  ...
2017-09-04 13:56:37 -07:00
Linus Torvalds d725c7ac8b Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug fix from Thomas Gleixner:
 "A single fix to handle the removal of the first dynamic CPU hotplug
  state correctly"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  smp/hotplug: Handle removal correctly in cpuhp_store_callbacks()
2017-09-04 13:53:53 -07:00
Linus Torvalds 93cc1228b4 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The interrupt subsystem delivers this time:

   - Refactoring of the GIC-V3 driver to prepare for the GIC-V4 support

   - Initial GIC-V4 support

   - Consolidation of the FSL MSI support

   - Utilize the effective affinity interface in various ARM irqchip
     drivers

   - Yet another interrupt chip driver (UniPhier AIDET)

   - Bulk conversion of the irq chip driver to use %pOF

   - The usual small fixes and improvements all over the place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
  irqchip/ls-scfg-msi: Add MSI affinity support
  irqchip/ls-scfg-msi: Add LS1043a v1.1 MSI support
  irqchip/ls-scfg-msi: Add LS1046a MSI support
  arm64: dts: ls1046a: Add MSI dts node
  arm64: dts: ls1043a: Share all MSIs
  arm: dts: ls1021a: Share all MSIs
  arm64: dts: ls1043a: Fix typo of MSI compatible string
  arm: dts: ls1021a: Fix typo of MSI compatible string
  irqchip/ls-scfg-msi: Fix typo of MSI compatible strings
  irqchip/irq-bcm7120-l2: Use correct I/O accessors for irq_fwd_mask
  irqchip/mmp: Make mmp_intc_conf const
  irqchip/gic: Make irq_chip const
  irqchip/gic-v3: Advertise GICv4 support to KVM
  irqchip/gic-v4: Enable low-level GICv4 operations
  irqchip/gic-v4: Add some basic documentation
  irqchip/gic-v4: Add VLPI configuration interface
  irqchip/gic-v4: Add VPE command interface
  irqchip/gic-v4: Add per-VM VPE domain creation
  irqchip/gic-v3-its: Set implementation defined bit to enable VLPIs
  irqchip/gic-v3-its: Allow doorbell interrupts to be injected/cleared
  ...
2017-09-04 13:08:27 -07:00
Linus Torvalds dd90cccffc Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "A rather small update for the time(r) subsystem:

   - A new clocksource driver IMX-TPM

   - Minor fixes to the alarmtimer facility

   - Device tree cleanups for Renesas drivers

   - A new kselftest and fixes for the timer related tests

   - Conversion of the clocksource drivers to use %pOF

   - Use the proper helpers to access rlimits in the posix-cpu-timer
     code"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  alarmtimer: Ensure RTC module is not unloaded
  clocksource: Convert to using %pOF instead of full_name
  clocksource/drivers/bcm2835: Remove message for a memory allocation failure
  devicetree: bindings: Remove deprecated properties
  devicetree: bindings: Remove unused 32-bit CMT bindings
  devicetree: bindings: Deprecate property, update example
  devicetree: bindings: r8a73a4 and R-Car Gen2 CMT bindings
  devicetree: bindings: R-Car Gen2 CMT0 and CMT1 bindings
  devicetree: bindings: Remove sh7372 CMT binding
  clocksource/drivers/imx-tpm: Add imx tpm timer support
  dt-bindings: timer: Add nxp tpm timer binding doc
  posix-cpu-timers: Use dedicated helper to access rlimit values
  alarmtimer: Fix unavailable wake-up source in sysfs
  timekeeping: Use proper timekeeper for debug code
  kselftests: timers: set-timer-lat: Add one-shot timer test cases
  kselftests: timers: set-timer-lat: Tweak reporting when timer fires early
  kselftests: timers: freq-step: Fix build warning
  kselftests: timers: freq-step: Define ADJ_SETOFFSET if device has older kernel headers
2017-09-04 13:06:34 -07:00
Linus Torvalds b1b6f83ac9 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm changes from Ingo Molnar:
 "PCID support, 5-level paging support, Secure Memory Encryption support

  The main changes in this cycle are support for three new, complex
  hardware features of x86 CPUs:

   - Add 5-level paging support, which is a new hardware feature on
     upcoming Intel CPUs allowing up to 128 PB of virtual address space
     and 4 PB of physical RAM space - a 512-fold increase over the old
     limits. (Supercomputers of the future forecasting hurricanes on an
     ever warming planet can certainly make good use of more RAM.)

     Many of the necessary changes went upstream in previous cycles,
     v4.14 is the first kernel that can enable 5-level paging.

     This feature is activated via CONFIG_X86_5LEVEL=y - disabled by
     default.

     (By Kirill A. Shutemov)

   - Add 'encrypted memory' support, which is a new hardware feature on
     upcoming AMD CPUs ('Secure Memory Encryption', SME) allowing system
     RAM to be encrypted and decrypted (mostly) transparently by the
     CPU, with a little help from the kernel to transition to/from
     encrypted RAM. Such RAM should be more secure against various
     attacks like RAM access via the memory bus and should make the
     radio signature of memory bus traffic harder to intercept (and
     decrypt) as well.

     This feature is activated via CONFIG_AMD_MEM_ENCRYPT=y - disabled
     by default.

     (By Tom Lendacky)

   - Enable PCID optimized TLB flushing on newer Intel CPUs: PCID is a
     hardware feature that attaches an address space tag to TLB entries
     and thus allows to skip TLB flushing in many cases, even if we
     switch mm's.

     (By Andy Lutomirski)

  All three of these features were in the works for a long time, and
  it's coincidence of the three independent development paths that they
  are all enabled in v4.14 at once"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (65 commits)
  x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)
  x86/mm: Use pr_cont() in dump_pagetable()
  x86/mm: Fix SME encryption stack ptr handling
  kvm/x86: Avoid clearing the C-bit in rsvd_bits()
  x86/CPU: Align CR3 defines
  x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages
  acpi, x86/mm: Remove encryption mask from ACPI page protection type
  x86/mm, kexec: Fix memory corruption with SME on successive kexecs
  x86/mm/pkeys: Fix typo in Documentation/x86/protection-keys.txt
  x86/mm/dump_pagetables: Speed up page tables dump for CONFIG_KASAN=y
  x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID
  x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y
  x86/mm: Allow userspace have mappings above 47-bit
  x86/mm: Prepare to expose larger address space to userspace
  x86/mpx: Do not allow MPX if we have mappings above 47-bit
  x86/mm: Rename tasksize_32bit/64bit to task_size_32bit/64bit()
  x86/xen: Redefine XEN_ELFNOTE_INIT_P2M using PUD_SIZE * PTRS_PER_PUD
  x86/mm/dump_pagetables: Fix printout of p4d level
  x86/mm/dump_pagetables: Generalize address normalization
  x86/boot: Fix memremap() related build failure
  ...
2017-09-04 12:21:28 -07:00
Linus Torvalds 5f82e71a00 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:

 - Add 'cross-release' support to lockdep, which allows APIs like
   completions, where it's not the 'owner' who releases the lock, to be
   tracked. It's all activated automatically under
   CONFIG_PROVE_LOCKING=y.

 - Clean up (restructure) the x86 atomics op implementation to be more
   readable, in preparation of KASAN annotations. (Dmitry Vyukov)

 - Fix static keys (Paolo Bonzini)

 - Add killable versions of down_read() et al (Kirill Tkhai)

 - Rework and fix jump_label locking (Marc Zyngier, Paolo Bonzini)

 - Rework (and fix) tlb_flush_pending() barriers (Peter Zijlstra)

 - Remove smp_mb__before_spinlock() and convert its usages, introduce
   smp_mb__after_spinlock() (Peter Zijlstra)

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
  locking/lockdep/selftests: Fix mixed read-write ABBA tests
  sched/completion: Avoid unnecessary stack allocation for COMPLETION_INITIALIZER_ONSTACK()
  acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuse
  locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures
  smp: Avoid using two cache lines for struct call_single_data
  locking/lockdep: Untangle xhlock history save/restore from task independence
  locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being
  futex: Remove duplicated code and fix undefined behaviour
  Documentation/locking/atomic: Finish the document...
  locking/lockdep: Fix workqueue crossrelease annotation
  workqueue/lockdep: 'Fix' flush_work() annotation
  locking/lockdep/selftests: Add mixed read-write ABBA tests
  mm, locking/barriers: Clarify tlb_flush_pending() barriers
  locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS truly non-interactive
  locking/lockdep: Explicitly initialize wq_barrier::done::map
  locking/lockdep: Rename CONFIG_LOCKDEP_COMPLETE to CONFIG_LOCKDEP_COMPLETIONS
  locking/lockdep: Reword title of LOCKDEP_CROSSRELEASE config
  locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE part of CONFIG_PROVE_LOCKING
  locking/refcounts, x86/asm: Implement fast refcount overflow protection
  locking/lockdep: Fix the rollback and overwrite detection logic in crossrelease
  ...
2017-09-04 11:52:29 -07:00
Linus Torvalds f213a6c84c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - fix affine wakeups (Peter Zijlstra)

   - improve CPU onlining (and general bootup) scalability on systems
     with ridiculous number (thousands) of CPUs (Peter Zijlstra)

   - sched/numa updates (Rik van Riel)

   - sched/deadline updates (Byungchul Park)

   - sched/cpufreq enhancements and related cleanups (Viresh Kumar)

   - sched/debug enhancements (Xie XiuQi)

   - various fixes"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  sched/debug: Optimize sched_domain sysctl generation
  sched/topology: Avoid pointless rebuild
  sched/topology, cpuset: Avoid spurious/wrong domain rebuilds
  sched/topology: Improve comments
  sched/topology: Fix memory leak in __sdt_alloc()
  sched/completion: Document that reinit_completion() must be called after complete_all()
  sched/autogroup: Fix error reporting printk text in autogroup_create()
  sched/fair: Fix wake_affine() for !NUMA_BALANCING
  sched/debug: Intruduce task_state_to_char() helper function
  sched/debug: Show task state in /proc/sched_debug
  sched/debug: Use task_pid_nr_ns in /proc/$pid/sched
  sched/core: Remove unnecessary initialization init_idle_bootup_task()
  sched/deadline: Change return value of cpudl_find()
  sched/deadline: Make find_later_rq() choose a closer CPU in topology
  sched/numa: Scale scan period with tasks in group and shared/private
  sched/numa: Slow down scan rate if shared faults dominate
  sched/pelt: Fix false running accounting
  sched: Mark pick_next_task_dl() and build_sched_domain() as static
  sched/cpupri: Don't re-initialize 'struct cpupri'
  sched/deadline: Don't re-initialize 'struct cpudl'
  ...
2017-09-04 09:10:24 -07:00
Linus Torvalds 9657752cb5 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Kernel side changes:

   - Add branch type profiling/tracing support. (Jin Yao)

   - Add the PERF_SAMPLE_PHYS_ADDR ABI to allow the tracing/profiling of
     physical memory addresses, where the PMU supports it. (Kan Liang)

   - Export some PMU capability details in the new
     /sys/bus/event_source/devices/cpu/caps/ sysfs directory. (Andi
     Kleen)

   - Aux data fixes and updates (Will Deacon)

   - kprobes fixes and updates (Masami Hiramatsu)

   - AMD uncore PMU driver fixes and updates (Janakarajan Natarajan)

  On the tooling side, here's a (limited!) list of highlights - there
  were many other changes that I could not list, see the shortlog and
  git history for details:

  UI improvements:

   - Implement a visual marker for fused x86 instructions in the
     annotate TUI browser, available now in 'perf report', more work
     needed to have it available as well in 'perf top' (Jin Yao)

     Further explanation from one of Jin's patches:

             │   ┌──cmpl   $0x0,argp_program_version_hook
       81.93 │   ├──je     20
             │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
             │   │↓ jne    29
             │   │↓ jmp    43
       11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

     That means the cmpl+je is a fused instruction pair and they should
     be considered together.

   - Record the branch type and then show statistics and info about in
     callchain entries (Jin Yao)

     Example from one of Jin's patches:

        # perf record -g -j any,save_type
        # perf report --branch-history --stdio --no-children

        38.50%  div.c:45                [.] main                    div
                |
                ---main div.c:42 (RET CROSS_2M cycles:2)
                   compute_flag div.c:28 (cycles:2)
                   compute_flag div.c:27 (RET CROSS_2M cycles:1)
                   rand rand.c:28 (cycles:1)
                   rand rand.c:28 (RET CROSS_2M cycles:1)
                   __random random.c:298 (cycles:1)
                   __random random.c:297 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (RET CROSS_2M cycles:9)

  namespaces support:

   - Add initial support for namespaces, using setns to access files in
     namespaces, grabbing their build-ids, etc. (Krister Johansen)

  perf trace enhancements:

   - Beautify pkey_{alloc,free,mprotect} arguments in 'perf trace'
     (Arnaldo Carvalho de Melo)

   - Add initial 'clone' syscall args beautifier in 'perf trace'
     (Arnaldo Carvalho de Melo)

   - Ignore 'fd' and 'offset' args for MAP_ANONYMOUS in 'perf trace'
     (Arnaldo Carvalho de Melo)

   - Beautifiers for the 'cmd' arg of several ioctl types, including:
     sound, DRM, KVM, vhost virtio and perf_events. (Arnaldo Carvalho de
     Melo)

   - Add PERF_SAMPLE_CALLCHAIN and PERF_RECORD_MMAP[2] to 'perf data'
     CTF conversion, allowing CTF trace visualization tools to show
     callchains and to resolve symbols (Geneviève Bastien)

   - Beautify the fcntl syscall, which is an interesting one in the
     sense that infrastructure had to be put in place to change the
     formatters of some arguments according to the value in a previous
     one, i.e. cmd dictates how arg and the syscall return will be
     formatted. (Arnaldo Carvalho de Melo

  perf stat enhancements:

   - Use group read for event groups in 'perf stat', reducing overhead
     when groups are defined in the event specification, i.e. when using
     {} to enclose a list of events, asking them to be read at the same
     time, e.g.: "perf stat -e '{cycles,instructions}'" (Jiri Olsa)

  pipe mode improvements:

   - Process tracing data in 'perf annotate' pipe mode (David
     Carrillo-Cisneros)

   - Add header record types to pipe-mode, now this command:

        $ perf record -o - -e cycles sleep 1 | perf report --stdio --header

     Will show the same as in non-pipe mode, i.e. involving a perf.data
     file (David Carrillo-Cisneros)

  Vendor specific hardware event support updates/enhancements:

   - Update POWER9 vendor events tables (Sukadev Bhattiprolu)

   - Add POWER9 PMU events Sukadev (Bhattiprolu)

   - Support additional POWER8+ PVR in PMU mapfile (Shriya)

   - Add Skylake server uncore JSON vendor events (Andi Kleen)

   - Support exporting Intel PT data to sqlite3 with python perf
     scripts, this is in addition to the postgresql support that was
     already there (Adrian Hunter)"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (253 commits)
  perf symbols: Fix plt entry calculation for ARM and AARCH64
  perf probe: Fix kprobe blacklist checking condition
  perf/x86: Fix caps/ for !Intel
  perf/core, x86: Add PERF_SAMPLE_PHYS_ADDR
  perf/core, pt, bts: Get rid of itrace_started
  perf trace beauty: Beautify pkey_{alloc,free,mprotect} arguments
  tools headers: Sync cpu features kernel ABI headers with tooling headers
  perf tools: Pass full path of FEATURES_DUMP
  perf tools: Robustify detection of clang binary
  tools lib: Allow external definition of CC, AR and LD
  perf tools: Allow external definition of flex and bison binary names
  tools build tests: Don't hardcode gcc name
  perf report: Group stat values on global event id
  perf values: Zero value buffers
  perf values: Fix allocation check
  perf values: Fix thread index bug
  perf report: Add dump_read function
  perf record: Set read_format for inherit_stat
  perf c2c: Fix remote HITM detection for Skylake
  perf tools: Fix static build with newer toolchains
  ...
2017-09-04 08:39:02 -07:00
Linus Torvalds 0081a0ce80 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnad:
 "The main RCU related changes in this cycle were:

   - Removal of spin_unlock_wait()
   - SRCU updates
   - RCU torture-test updates
   - RCU Documentation updates
   - Extend the sys_membarrier() ABI with the MEMBARRIER_CMD_PRIVATE_EXPEDITED variant
   - Miscellaneous RCU fixes
   - CPU-hotplug fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  arch: Remove spin_unlock_wait() arch-specific definitions
  locking: Remove spin_unlock_wait() generic definitions
  drivers/ata: Replace spin_unlock_wait() with lock/unlock pair
  ipc: Replace spin_unlock_wait() with lock/unlock pair
  exit: Replace spin_unlock_wait() with lock/unlock pair
  completion: Replace spin_unlock_wait() with lock/unlock pair
  doc: Set down RCU's scheduling-clock-interrupt needs
  doc: No longer allowed to use rcu_dereference on non-pointers
  doc: Add RCU files to docbook-generation files
  doc: Update memory-barriers.txt for read-to-write dependencies
  doc: Update RCU documentation
  membarrier: Provide expedited private command
  rcu: Remove exports from rcu_idle_exit() and rcu_idle_enter()
  rcu: Add warning to rcu_idle_enter() for irqs enabled
  rcu: Make rcu_idle_enter() rely on callers disabling irqs
  rcu: Add assertions verifying blocked-tasks list
  rcu/tracing: Set disable_rcu_irq_enter on rcu_eqs_exit()
  rcu: Add TPS() protection for _rcu_barrier_trace strings
  rcu: Use idle versions of swait to make idle-hack clear
  swait: Add idle variants which don't contribute to load average
  ...
2017-09-04 08:13:52 -07:00
Ingo Molnar edc2988c54 Merge branch 'linus' into locking/core, to fix up conflicts
Conflicts:
	mm/page_alloc.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-04 11:01:18 +02:00
Rafael J. Wysocki 7b01463e51 Merge branch 'pm-sleep'
* pm-sleep:
  ACPI / PM: Check low power idle constraints for debug only
  PM / s2idle: Rename platform operations structure
  PM / s2idle: Rename ->enter_freeze to ->enter_s2idle
  PM / s2idle: Rename freeze_state enum and related items
  PM / s2idle: Rename PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLE
  ACPI / PM: Prefer suspend-to-idle over S3 on some systems
  platform/x86: intel-hid: Wake up Dell Latitude 7275 from suspend-to-idle
  PM / suspend: Define pr_fmt() in suspend.c
  PM / suspend: Use mem_sleep_labels[] strings in messages
  PM / sleep: Put pm_test under CONFIG_PM_SLEEP_DEBUG
  PM / sleep: Check pm_wakeup_pending() in __device_suspend_noirq()
  PM / core: Add error argument to dpm_show_time()
  PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq()
  PM / s2idle: Rearrange the main suspend-to-idle loop
  PM / timekeeping: Print debug messages when requested
  PM / sleep: Mark suspend/hibernation start and finish
  PM / sleep: Do not print debug messages by default
  PM / suspend: Export pm_suspend_target_state
2017-09-04 00:06:02 +02:00
Rafael J. Wysocki 08a10002be Merge branch 'pm-cpufreq-sched'
* pm-cpufreq-sched:
  cpufreq: schedutil: Always process remote callback with slow switching
  cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily
  cpufreq: Return 0 from ->fast_switch() on errors
  cpufreq: Simplify cpufreq_can_do_remote_dvfs()
  cpufreq: Process remote callbacks from any CPU if the platform permits
  sched: cpufreq: Allow remote cpufreq callbacks
  cpufreq: schedutil: Use unsigned int for iowait boost
  cpufreq: schedutil: Make iowait boost more energy efficient
2017-09-04 00:05:22 +02:00
Rafael J. Wysocki bd87c8fb9d Merge branch 'pm-cpufreq'
* pm-cpufreq: (33 commits)
  cpufreq: imx6q: Fix imx6sx low frequency support
  cpufreq: speedstep-lib: make several arrays static, makes code smaller
  cpufreq: ti: Fix 'of_node_put' being called twice in error handling path
  cpufreq: dt-platdev: Drop few entries from whitelist
  cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2
  ARM: ux500: don't select CPUFREQ_DT
  cpufreq: Convert to using %pOF instead of full_name
  cpufreq: Cap the default transition delay value to 10 ms
  cpufreq: dbx500: Delete obsolete driver
  mfd: db8500-prcmu: Get rid of cpufreq dependency
  cpufreq: enable the DT cpufreq driver on the Ux500
  cpufreq: Loongson2: constify platform_device_id
  cpufreq: dt: Add r8a7796 support to to use generic cpufreq driver
  cpufreq: remove setting of policy->cpu in policy->cpus during init
  cpufreq: mediatek: add support of cpufreq to MT7622 SoC
  cpufreq: mediatek: add cleanups with the more generic naming
  cpufreq: rcar: Add support for R8A7795 SoC
  cpufreq: dt: Add rk3328 compatible to use generic cpufreq driver
  cpufreq: s5pv210: add missing of_node_put()
  cpufreq: Allow dynamic switching with CPUFREQ_ETERNAL latency
  ...
2017-09-04 00:05:13 +02:00
Rafael J. Wysocki 45a7953c83 Merge branches 'pm-core', 'pm-opp', 'pm-domains', 'pm-cpu' and 'pm-avs'
* pm-core:
  PM / wakeup: Set power.can_wakeup if wakeup_sysfs_add() fails

* pm-opp:
  PM / OPP: Fix get sharing CPUs when hotplug is used
  PM / OPP: OF: Use pr_debug() instead of pr_err() while adding OPP table

* pm-domains:
  PM / Domains: Convert to using %pOF instead of full_name
  PM / Domains: Extend generic power domain debugfs
  PM / Domains: Add time accounting to various genpd states

* pm-cpu:
  PM / CPU: replace raw_notifier with atomic_notifier

* pm-avs:
  PM / AVS: rockchip-io: add io selectors and supplies for RV1108
2017-09-04 00:04:49 +02:00
Linus Torvalds 3b62dc6c38 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
 "A single fix for a thinko in the raw timekeeper update which causes
  clock MONOTONIC_RAW to run with erratically increased frequency"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time: Fix ktime_get_raw() incorrect base accumulation
2017-09-03 09:30:40 -07:00
Linus Torvalds e92d51aff5 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:

 - Prevent a potential inconistency in the perf user space access which
   might lead to evading sanity checks.

 - Prevent perf recording function trace entries twice

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/ftrace: Fix double traces of perf on ftrace:function
  perf/core: Fix potential double-fetch bug
2017-09-03 09:23:23 -07:00
Linus Torvalds 8cf9f2a29f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix handling of pinned BPF map nodes in hash of maps, from Daniel
    Borkmann.

 2) IPSEC ESP error paths leak memory, from Steffen Klassert.

 3) We need an RCU grace period before freeing fib6_node objects, from
    Wei Wang.

 4) Must check skb_put_padto() return value in HSR driver, from FLorian
    Fainelli.

 5) Fix oops on PHY probe failure in ftgmac100 driver, from Andrew
    Jeffery.

 6) Fix infinite loop in UDP queue when using SO_PEEK_OFF, from Eric
    Dumazet.

 7) Use after free when tcf_chain_destroy() called multiple times, from
    Jiri Pirko.

 8) Fix KSZ DSA tag layer multiple free of SKBS, from Florian Fainelli.

 9) Fix leak of uninitialized memory in sctp_get_sctp_info(),
    inet_diag_msg_sctpladdrs_fill() and inet_diag_msg_sctpaddrs_fill().
    From Stefano Brivio.

10) L2TP tunnel refcount fixes from Guillaume Nault.

11) Don't leak UDP secpath in udp_set_dev_scratch(), from Yossi
    Kauperman.

12) Revert a PHY layer change wrt. handling of PHY_HALTED state in
    phy_stop_machine(), it causes regressions for multiple people. From
    Florian Fainelli.

13) When packets are sent out of br0 we have to clear the
    offload_fwdq_mark value.

14) Several NULL pointer deref fixes in packet schedulers when their
    ->init() routine fails. From Nikolay Aleksandrov.

15) Aquantium devices cannot checksum offload correctly when the packet
    is <= 60 bytes. From Pavel Belous.

16) Fix vnet header access past end of buffer in AF_PACKET, from
    Benjamin Poirier.

17) Double free in probe error paths of nfp driver, from Dan Carpenter.

18) QOS capability not checked properly in DCB init paths of mlx5
    driver, from Huy Nguyen.

19) Fix conflicts between firmware load failure and health_care timer in
    mlx5, also from Huy Nguyen.

20) Fix dangling page pointer when DMA mapping errors occur in mlx5,
    from Eran Ben ELisha.

21) ->ndo_setup_tc() in bnxt_en driver doesn't count rings properly,
    from Michael Chan.

22) Missing MSIX vector free in bnxt_en, also from Michael Chan.

23) Refcount leak in xfrm layer when using sk_policy, from Lorenzo
    Colitti.

24) Fix copy of uninitialized data in qlge driver, from Arnd Bergmann.

25) bpf_setsockopts() erroneously always returns -EINVAL even on
    success. Fix from Yuchung Cheng.

26) tipc_rcv() needs to linearize the SKB before parsing the inner
    headers, from Parthasarathy Bhuvaragan.

27) Fix deadlock between link status updates and link removal in netvsc
    driver, from Stephen Hemminger.

28) Missed locking of page fragment handling in ESP output, from Steffen
    Klassert.

29) Fix refcnt leak in ebpf congestion control code, from Sabrina
    Dubroca.

30) sxgbe_probe_config_dt() doesn't check devm_kzalloc()'s return value,
    from Christophe Jaillet.

31) Fix missing ipv6 rx_dst_cookie update when rx_dst is updated during
    early demux, from Paolo Abeni.

32) Several info leaks in xfrm_user layer, from Mathias Krause.

33) Fix out of bounds read in cxgb4 driver, from Stefano Brivio.

34) Properly propagate obsolete state of route upwards in ipv6 so that
    upper holders like xfrm can see it. From Xin Long.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (118 commits)
  udp: fix secpath leak
  bridge: switchdev: Clear forward mark when transmitting packet
  mlxsw: spectrum: Forbid linking to devices that have uppers
  wl1251: add a missing spin_lock_init()
  Revert "net: phy: Correctly process PHY_HALTED in phy_stop_machine()"
  net: dsa: bcm_sf2: Fix number of CFP entries for BCM7278
  kcm: do not attach PF_KCM sockets to avoid deadlock
  sch_tbf: fix two null pointer dereferences on init failure
  sch_sfq: fix null pointer dereference on init failure
  sch_netem: avoid null pointer deref on init failure
  sch_fq_codel: avoid double free on init failure
  sch_cbq: fix null pointer dereferences on init failure
  sch_hfsc: fix null pointer deref and double free on init failure
  sch_hhf: fix null pointer dereference on init failure
  sch_multiq: fix double free on init failure
  sch_htb: fix crash on init failure
  net/mlx5e: Fix CQ moderation mode not set properly
  net/mlx5e: Fix inline header size for small packets
  net/mlx5: E-Switch, Unload the representors in the correct order
  net/mlx5e: Properly resolve TC offloaded ipv6 vxlan tunnel source address
  ...
2017-09-01 12:49:03 -07:00
Eric Biggers 355627f518 mm, uprobes: fix multiple free of ->uprobes_state.xol_area
Commit 7c05126793 ("mm, fork: make dup_mmap wait for mmap_sem for
write killable") made it possible to kill a forking task while it is
waiting to acquire its ->mmap_sem for write, in dup_mmap().

However, it was overlooked that this introduced an new error path before
the new mm_struct's ->uprobes_state.xol_area has been set to NULL after
being copied from the old mm_struct by the memcpy in dup_mm().  For a
task that has previously hit a uprobe tracepoint, this resulted in the
'struct xol_area' being freed multiple times if the task was killed at
just the right time while forking.

Fix it by setting ->uprobes_state.xol_area to NULL in mm_init() rather
than in uprobe_dup_mmap().

With CONFIG_UPROBE_EVENTS=y, the bug can be reproduced by the same C
program given by commit 2b7e8665b4 ("fork: fix incorrect fput of
->exe_file causing use-after-free"), provided that a uprobe tracepoint
has been set on the fork_thread() function.  For example:

    $ gcc reproducer.c -o reproducer -lpthread
    $ nm reproducer | grep fork_thread
    0000000000400719 t fork_thread
    $ echo "p $PWD/reproducer:0x719" > /sys/kernel/debug/tracing/uprobe_events
    $ echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable
    $ ./reproducer

Here is the use-after-free reported by KASAN:

    BUG: KASAN: use-after-free in uprobe_clear_state+0x1c4/0x200
    Read of size 8 at addr ffff8800320a8b88 by task reproducer/198

    CPU: 1 PID: 198 Comm: reproducer Not tainted 4.13.0-rc7-00015-g36fde05f3fb5 #255
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
    Call Trace:
     dump_stack+0xdb/0x185
     print_address_description+0x7e/0x290
     kasan_report+0x23b/0x350
     __asan_report_load8_noabort+0x19/0x20
     uprobe_clear_state+0x1c4/0x200
     mmput+0xd6/0x360
     do_exit+0x740/0x1670
     do_group_exit+0x13f/0x380
     get_signal+0x597/0x17d0
     do_signal+0x99/0x1df0
     exit_to_usermode_loop+0x166/0x1e0
     syscall_return_slowpath+0x258/0x2c0
     entry_SYSCALL_64_fastpath+0xbc/0xbe

    ...

    Allocated by task 199:
     save_stack_trace+0x1b/0x20
     kasan_kmalloc+0xfc/0x180
     kmem_cache_alloc_trace+0xf3/0x330
     __create_xol_area+0x10f/0x780
     uprobe_notify_resume+0x1674/0x2210
     exit_to_usermode_loop+0x150/0x1e0
     prepare_exit_to_usermode+0x14b/0x180
     retint_user+0x8/0x20

    Freed by task 199:
     save_stack_trace+0x1b/0x20
     kasan_slab_free+0xa8/0x1a0
     kfree+0xba/0x210
     uprobe_clear_state+0x151/0x200
     mmput+0xd6/0x360
     copy_process.part.8+0x605f/0x65d0
     _do_fork+0x1a5/0xbd0
     SyS_clone+0x19/0x20
     do_syscall_64+0x22f/0x660
     return_from_SYSCALL_64+0x0/0x7a

Note: without KASAN, you may instead see a "Bad page state" message, or
simply a general protection fault.

Link: http://lkml.kernel.org/r/20170830033303.17927-1-ebiggers3@gmail.com
Fixes: 7c05126793 ("mm, fork: make dup_mmap wait for mmap_sem for write killable")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>    [4.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31 16:33:15 -07:00
Shaohua Li 22cf8bc6cb kernel/kthread.c: kthread_worker: don't hog the cpu
If the worker thread continues getting work, it will hog the cpu and rcu
stall complains.  Make it a good citizen.  This is triggered in a loop
block device test.

Link: http://lkml.kernel.org/r/5de0a179b3184e1a2183fc503448b0269f24d75b.1503697127.git.shli@fb.com
Signed-off-by: Shaohua Li <shli@fb.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31 16:33:15 -07:00
Alexandre Belloni 51218298a2 alarmtimer: Ensure RTC module is not unloaded
When registering the rtc device to be used to handle alarm timers,
get_device is used to ensure the device doesn't go away but the module can
still be unloaded.

Call try_module_get to ensure the rtc driver will not go away.

Reported-and-tested-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Link: http://lkml.kernel.org/r/20170820220146.30969-1-alexandre.belloni@free-electrons.com
2017-08-31 21:36:45 +02:00