mirror of
https://github.com/linux-apfs/linux-apfs.git
synced 2026-05-01 15:00:59 -07:00
2a58d42c1e018ad514d4e23fd33fb2ded95d3ee6
21691 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
67990608c8 |
Merge tag 'pm+acpi-4.5-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull oower management and ACPI updates from Rafael Wysocki:
"As far as the number of commits goes, ACPICA takes the lead this time,
followed by cpufreq and the device properties framework changes.
The most significant new feature is the debugfs-based interface to the
ACPICA's AML debugger added in the previous cycle and a new user space
tool for accessing it.
On the cpufreq front, the core is updated to handle governors more
efficiently, particularly on systems where a single cpufreq policy
object is shared between multiple CPUs, and there are quite a few
changes in drivers (intel_pstate, cpufreq-dt etc).
The device properties framework is updated to handle built-in (ie
included in the kernel itself) device properties better, among other
things by adding a fallback mechanism that will allow drivers to
provide default properties to be used in case the plaform firmware
doesn't provide the properties expected by them.
The Operating Performance Points (OPP) framework gets new DT bindings
and debugfs support.
A new cpufreq driver for ST platforms is added and the ACPI driver for
AMD SoCs will now support the APM X-Gene ACPI I2C device.
The rest is mostly fixes and cleanups all over.
Specifics:
- Add a debugfs-based interface for interacting with the ACPICA's AML
debugger introduced in the previous cycle and a new user space tool
for that, fix some bugs related to the AML debugger and clean up
the code in question (Lv Zheng, Dan Carpenter, Colin Ian King,
Markus Elfring).
- Update ACPICA to upstream revision 20151218 including a number of
fixes and cleanups in the ACPICA core (Bob Moore, Lv Zheng, Labbe
Corentin, Prarit Bhargava, Colin Ian King, David E Box, Rafael
Wysocki).
In particular, the previously added erroneous support for the _SUB
object is dropped, the concatenate operator will support all ACPI
objects now, the Debug Object handling is improved, the SuperName
handling of parameters being control methods is fixed, the
ObjectType operator handling is updated to follow ACPI 5.0A and the
handling of CondRefOf and RefOf is updated accordingly, module-
level code will be executed after loading each ACPI table now
(instead of being run once after all tables containing AML have
been loaded), the Operation Region handlers management is updated
to fix some reported problems and a the ACPICA code in the kernel
is more in line with the upstream now.
- Update the ACPI backlight driver to provide information on whether
or not it will generate key-presses for brightness change hotkeys
and update some platform drivers (dell-wmi, thinkpad_acpi) to use
that information to avoid sending double key-events to users pace
for these, add new ACPI backlight quirks (Hans de Goede, Aaron Lu,
Adrien Schildknecht).
- Improve the ACPI handling of interrupt GPIOs (Christophe Ricard).
- Fix the handling of the list of device IDs of device objects found
in the ACPI namespace and add a helper for checking if there is a
device object for a given device ID (Lukas Wunner).
- Change the logic in the ACPI namespace scanning code to create
struct acpi_device objects for all ACPI device objects found in the
namespace even if _STA fails for them which helps to avoid device
enumeration problems on Microsoft Surface 3 (Aaron Lu).
- Add support for the APM X-Gene ACPI I2C device to the ACPI driver
for AMD SoCs (Loc Ho).
- Fix the long-standing issue with the DMA controller on Intel SoCs
where ACPI tables have no power management support for the DMA
controller itself, but it can be powered off automatically when the
last (other) device on the SoC is powered off via ACPI and clean up
the ACPI driver for Intel SoCs (acpi-lpss) after previous attempts
to fix that problem (Andy Shevchenko).
- Assorted ACPI fixes and cleanups (Andy Lutomirski, Colin Ian King,
Javier Martinez Canillas, Ken Xue, Mathias Krause, Rafael Wysocki,
Sinan Kaya).
- Update the device properties framework for better handling of
built-in properties, add support for built-in properties to the
platform bus type, update the MFD subsystem's handling of device
properties and add support for passing default configuration data
as device properties to the intel-lpss MFD drivers, convert the
designware I2C driver to use the unified device properties API and
add a fallback mechanism for using default built-in properties if
the platform firmware fails to provide the properties as expected
by drivers (Andy Shevchenko, Mika Westerberg, Heikki Krogerus,
Andrew Morton).
- Add new Device Tree bindings to the Operating Performance Points
(OPP) framework and update the exynos4412 DT binding accordingly,
introduce debugfs support for the OPP framework (Viresh Kumar,
Bartlomiej Zolnierkiewicz).
- Migrate the mt8173 cpufreq driver to the new OPP bindings (Pi-Cheng
Chen).
- Update the cpufreq core to make the handling of governors more
efficient, especially on systems where policy objects are shared
between multiple CPUs (Viresh Kumar, Rafael Wysocki).
- Fix cpufreq governor handling on configurations with
CONFIG_HZ_PERIODIC set (Chen Yu).
- Clean up the cpufreq core code related to the boost sysfs knob
support and update the ACPI cpufreq driver accordingly (Rafael
Wysocki).
- Add a new cpufreq driver for ST platforms and corresponding Device
Tree bindings (Lee Jones).
- Update the intel_pstate driver to allow the P-state selection
algorithm used by it to depend on the CPU ID of the processor it is
running on, make it use a special P-state selection algorithm (with
an IO wait time compensation tweak) on Atom CPUs based on the
Airmont and Silvermont cores so as to reduce their energy
consumption and improve intel_pstate documentation (Philippe
Longepe, Srinivas Pandruvada).
- Update the cpufreq-dt driver to support registering cooling devices
that use the (P * V^2 * f) dynamic power draw formula where V is
the voltage, f is the frequency and P is a constant coefficient
provided by Device Tree and update the arm_big_little cpufreq
driver to use that support (Punit Agrawal).
- Assorted cpufreq driver (cpufreq-dt, qoriq, pcc-cpufreq,
blackfin-cpufreq) updates (Andrzej Hajda, Hongtao Jia, Jacob
Tanenbaum, Markus Elfring).
- cpuidle core tweaks related to polling and measured_us calculation
(Rik van Riel).
- Removal of modularity from a few cpuidle drivers (clps711x, ux500,
exynos) that cannot be built as modules in practice (Paul
Gortmaker).
- PM core update to prevent devices from being probed during system
suspend/resume which is generally problematic and may lead to
inconsistent behavior (Grygorii Strashko).
- Assorted updates of the PM core and related code (Julia Lawall,
Manuel Pégourié-Gonnard, Maruthi Bayyavarapu, Rafael Wysocki, Ulf
Hansson).
- PNP bus type updates (Christophe Le Roy, Heiner Kallweit).
- PCI PM code cleanups (Jarkko Nikula, Julia Lawall).
- cpupower tool updates (Jacob Tanenbaum, Thomas Renninger)"
* tag 'pm+acpi-4.5-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (177 commits)
PM / clk: don't leave clocks enabled when driver not bound
i2c: dw: Add APM X-Gene ACPI I2C device support
ACPI / APD: Add APM X-Gene ACPI I2C device support
ACPI / LPSS: change 'does not have' to 'has' in comment
Revert "dmaengine: dw: platform: provide platform data for Intel"
dmaengine: dw: return immediately from IRQ when DMA isn't in use
dmaengine: dw: platform: power on device on shutdown
ACPI / LPSS: override power state for LPSS DMA device
PM / OPP: Use snprintf() instead of sprintf()
Documentation: cpufreq: intel_pstate: enhance documentation
ACPI, PCI, irq: remove redundant check for null string pointer
ACPI / video: driver must be registered before checking for keypresses
cpufreq-dt: fix handling regulator_get_voltage() result
cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC
PM / sleep: Add support for read-only sysfs attributes
ACPI: Fix white space in a structure definition
ACPI / SBS: fix inconsistent indenting inside if statement
PNP: respect PNP_DRIVER_RES_DO_NOT_CHANGE when detaching
ACPI / PNP: constify device IDs
ACPI / PCI: Simplify acpi_penalize_isa_irq()
...
|
||
|
|
c17488d066 |
Merge tag 'trace-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"Not much new with tracing for this release. Mostly just clean ups and
minor fixes.
Here's what else is new:
- A new TRACE_EVENT_FN_COND macro, combining both _FN and _COND for
those that want both.
- New selftest to test the instance create and delete
- Better debug output when ftrace fails"
* tag 'trace-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (24 commits)
ftrace: Fix the race between ftrace and insmod
ftrace: Add infrastructure for delayed enabling of module functions
x86: ftrace: Fix the comments for ftrace_modify_code_direct()
tracing: Fix comment to use tracing_on over tracing_enable
metag: ftrace: Fix the comments for ftrace_modify_code
sh: ftrace: Fix the comments for ftrace_modify_code()
ia64: ftrace: Fix the comments for ftrace_modify_code()
ftrace: Clean up ftrace_module_init() code
ftrace: Join functions ftrace_module_init() and ftrace_init_module()
tracing: Introduce TRACE_EVENT_FN_COND macro
tracing: Use seq_buf_used() in seq_buf_to_user() instead of len
bpf: Constify bpf_verifier_ops structure
ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too
ftrace: Remove use of control list and ops
ftrace: Fix output of enabled_functions for showing tramp
ftrace: Fix a typo in comment
ftrace: Show all tramps registered to a record on ftrace_bug()
ftrace: Add variable ftrace_expected for archs to show expected code
ftrace: Add new type to distinguish what kind of ftrace_bug()
tracing: Update cond flag when enabling or disabling a trigger
...
|
||
|
|
34a9304a96 |
Merge branch 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo: - cgroup v2 interface is now official. It's no longer hidden behind a devel flag and can be mounted using the new cgroup2 fs type. Unfortunately, cpu v2 interface hasn't made it yet due to the discussion around in-process hierarchical resource distribution and only memory and io controllers can be used on the v2 interface at the moment. - The existing documentation which has always been a bit of mess is relocated under Documentation/cgroup-v1/. Documentation/cgroup-v2.txt is added as the authoritative documentation for the v2 interface. - Some features are added through for-4.5-ancestor-test branch to enable netfilter xt_cgroup match to use cgroup v2 paths. The actual netfilter changes will be merged through the net tree which pulled in the said branch. - Various cleanups * 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: rename cgroup documentations cgroup: fix a typo. cgroup: Remove resource_counter.txt in Documentation/cgroup-legacy/00-INDEX. cgroup: demote subsystem init messages to KERN_DEBUG cgroup: Fix uninitialized variable warning cgroup: put controller Kconfig options in meaningful order cgroup: clean up the kernel configuration menu nomenclature cgroup_pids: fix a typo. Subject: cgroup: Fix incomplete dd command in blkio documentation cgroup: kill cgrp_ss_priv[CGROUP_CANFORK_COUNT] and friends cpuset: Replace all instances of time_t with time64_t cgroup: replace unified-hierarchy.txt with a proper cgroup v2 documentation cgroup: rename Documentation/cgroups/ to Documentation/cgroup-legacy/ cgroup: replace __DEVEL__sane_behavior with cgroup2 fs type |
||
|
|
aee3bfa330 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from Davic Miller:
1) Support busy polling generically, for all NAPI drivers. From Eric
Dumazet.
2) Add byte/packet counter support to nft_ct, from Floriani Westphal.
3) Add RSS/XPS support to mvneta driver, from Gregory Clement.
4) Implement IPV6_HDRINCL socket option for raw sockets, from Hannes
Frederic Sowa.
5) Add support for T6 adapter to cxgb4 driver, from Hariprasad Shenai.
6) Add support for VLAN device bridging to mlxsw switch driver, from
Ido Schimmel.
7) Add driver for Netronome NFP4000/NFP6000, from Jakub Kicinski.
8) Provide hwmon interface to mlxsw switch driver, from Jiri Pirko.
9) Reorganize wireless drivers into per-vendor directories just like we
do for ethernet drivers. From Kalle Valo.
10) Provide a way for administrators "destroy" connected sockets via the
SOCK_DESTROY socket netlink diag operation. From Lorenzo Colitti.
11) Add support to add/remove multicast routes via netlink, from Nikolay
Aleksandrov.
12) Make TCP keepalive settings per-namespace, from Nikolay Borisov.
13) Add forwarding and packet duplication facilities to nf_tables, from
Pablo Neira Ayuso.
14) Dead route support in MPLS, from Roopa Prabhu.
15) TSO support for thunderx chips, from Sunil Goutham.
16) Add driver for IBM's System i/p VNIC protocol, from Thomas Falcon.
17) Rationalize, consolidate, and more completely document the checksum
offloading facilities in the networking stack. From Tom Herbert.
18) Support aborting an ongoing scan in mac80211/cfg80211, from
Vidyullatha Kanchanapally.
19) Use per-bucket spinlock for bpf hash facility, from Tom Leiming.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1375 commits)
net: bnxt: always return values from _bnxt_get_max_rings
net: bpf: reject invalid shifts
phonet: properly unshare skbs in phonet_rcv()
dwc_eth_qos: Fix dma address for multi-fragment skbs
phy: remove an unneeded condition
mdio: remove an unneed condition
mdio_bus: NULL dereference on allocation error
net: Fix typo in netdev_intersect_features
net: freescale: mac-fec: Fix build error from phy_device API change
net: freescale: ucc_geth: Fix build error from phy_device API change
bonding: Prevent IPv6 link local address on enslaved devices
IB/mlx5: Add flow steering support
net/mlx5_core: Export flow steering API
net/mlx5_core: Make ipv4/ipv6 location more clear
net/mlx5_core: Enable flow steering support for the IB driver
net/mlx5_core: Initialize namespaces only when supported by device
net/mlx5_core: Set priority attributes
net/mlx5_core: Connect flow tables
net/mlx5_core: Introduce modify flow table command
net/mlx5_core: Managing root flow table
...
|
||
|
|
33caf82acf |
Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
"All kinds of stuff. That probably should've been 5 or 6 separate
branches, but by the time I'd realized how large and mixed that bag
had become it had been too close to -final to play with rebasing.
Some fs/namei.c cleanups there, memdup_user_nul() introduction and
switching open-coded instances, burying long-dead code, whack-a-mole
of various kinds, several new helpers for ->llseek(), assorted
cleanups and fixes from various people, etc.
One piece probably deserves special mention - Neil's
lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets
called without ->i_mutex and tries to avoid ever taking it. That, of
course, means that it's not useful for any directory modifications,
but things like getting inode attributes in nfds readdirplus are fine
with that. I really should've asked for moratorium on lookup-related
changes this cycle, but since I hadn't done that early enough... I
*am* asking for that for the coming cycle, though - I'm going to try
and get conversion of i_mutex to rwsem with ->lookup() done under lock
taken shared.
There will be a patch closer to the end of the window, along the lines
of the one Linus had posted last May - mechanical conversion of
->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/
inode_is_locked()/inode_lock_nested(). To quote Linus back then:
-----
| This is an automated patch using
|
| sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/'
| sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/'
| sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/'
| sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/'
| sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/'
|
| with a very few manual fixups
-----
I'm going to send that once the ->i_mutex-affecting stuff in -next
gets mostly merged (or when Linus says he's about to stop taking
merges)"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
nfsd: don't hold i_mutex over userspace upcalls
fs:affs:Replace time_t with time64_t
fs/9p: use fscache mutex rather than spinlock
proc: add a reschedule point in proc_readfd_common()
logfs: constify logfs_block_ops structures
fcntl: allow to set O_DIRECT flag on pipe
fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
fs: xattr: Use kvfree()
[s390] page_to_phys() always returns a multiple of PAGE_SIZE
nbd: use ->compat_ioctl()
fs: use block_device name vsprintf helper
lib/vsprintf: add %*pg format specifier
fs: use gendisk->disk_name where possible
poll: plug an unused argument to do_poll
amdkfd: don't open-code memdup_user()
cdrom: don't open-code memdup_user()
rsxx: don't open-code memdup_user()
mtip32xx: don't open-code memdup_user()
[um] mconsole: don't open-code memdup_user_nul()
[um] hostaudio: don't open-code memdup_user()
...
|
||
|
|
fce205e9da |
Merge branch 'work.copy_file_range' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs copy_file_range updates from Al Viro: "Several series around copy_file_range/CLONE" * 'work.copy_file_range' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: btrfs: use new dedupe data function pointer vfs: hoist the btrfs deduplication ioctl to the vfs vfs: wire up compat ioctl for CLONE/CLONE_RANGE cifs: avoid unused variable and label nfsd: implement the NFSv4.2 CLONE operation nfsd: Pass filehandle to nfs4_preprocess_stateid_op() vfs: pull btrfs clone API to vfs layer locks: new locks_mandatory_area calling convention vfs: Add vfs_copy_file_range() support for pagecache copies btrfs: add .copy_file_range file operation x86: add sys_copy_file_range to syscall tables vfs: add copy_file_range syscall and vfs helper |
||
|
|
229394e8e6 |
net: bpf: reject invalid shifts
On ARM64, a BUG() is triggered in the eBPF JIT if a filter with a constant shift that can't be encoded in the immediate field of the UBFM/SBFM instructions is passed to the JIT. Since these shifts amounts, which are negative or >= regsize, are invalid, reject them in the eBPF verifier and the classic BPF filter checker, for all architectures. Signed-off-by: Rabin Vincent <rabin@rab.in> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
c9bed1cf51 |
Merge tag 'for-linus-4.5-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from David Vrabel: "Xen features and fixes for 4.5-rc0: - Stolen ticks and PV wallclock support for arm/arm64 - Add grant copy ioctl to gntdev device" * tag 'for-linus-4.5-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/gntdev: add ioctl for grant copy x86/xen: don't reset vcpu_info on a cancelled suspend xen/gntdev: constify mmu_notifier_ops structures xen/grant-table: constify gnttab_ops structure xen/time: use READ_ONCE xen/x86: convert remaining timespec to timespec64 in xen_pvclock_gtod_notify xen/x86: support XENPF_settime64 xen/arm: set the system time in Xen via the XENPF_settime64 hypercall xen/arm: introduce xen_read_wallclock arm: extend pvclock_wall_clock with sec_hi xen: introduce XENPF_settime64 xen/arm: introduce HYPERVISOR_platform_op on arm and arm64 xen: rename dom0_op to platform_op xen/arm: account for stolen ticks arm64: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops arm: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops missing include asm/paravirt.h in cputime.c xen: move xen_setup_runstate_info and get_runstate_snapshot to drivers/xen/time.c |
||
|
|
0f8c790103 |
Merge branch 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue update from Tejun Heo: "Workqueue changes for v4.5. One cleanup patch and three to improve the debuggability. Workqueue now has a stall detector which dumps workqueue state if any worker pool hasn't made forward progress over a certain amount of time (30s by default) and also triggers a warning if a workqueue which can be used in memory reclaim path tries to wait on something which can't be. These should make workqueue hangs a lot easier to debug." * 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: simplify the apply_workqueue_attrs_locked() workqueue: implement lockup detector watchdog: introduce touch_softlockup_watchdog_sched() workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue |
||
|
|
3d116a66ed |
Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"The irq department provides:
- Support for MSI to wire bridges and a first user of it
- More ACPI support for ARM/GIC
- A new TS-4800 interrupt controller driver
- RCU based free of interrupt descriptors to support the upcoming
Intel VMD technology without introducing a locking nightmare
- The usual pile of fixes and updates to drivers and core code"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
irqchip/omap-intc: Add support for spurious irq handling
irqchip/zevio: Use irq_data_get_chip_type() helper
irqchip/omap-intc: Remove duplicate setup for IRQ chip type handler
irqchip/ts4800: Add TS-4800 interrupt controller
irqchip/ts4800: Add documentation for TS-4800 interrupt controller
irq/platform-MSI: Increase the maximum MSIs the MSI framework can support
irqchip/gicv2m: Miscellaneous fixes for v2m resources and SPI ranges
irqchip/bcm2836: Make code more readable
irqchip/bcm2836: Tolerate IRQs while no flag is set in ISR
irqchip/bcm2836: Add SMP support for the 2836
irqchip/bcm2836: Fix initialization of the LOCAL_IRQ_CNT timers
irqchip/gic-v2m: acpi: Introducing GICv2m ACPI support
irqchip/gic-v2m: Refactor to prepare for ACPI support
irqdomain: Introduce is_fwnode_irqchip helper
acpi: pci: Setup MSI domain for ACPI based pci devices
genirq/msi: Export functions to allow MSI domains in modules
irqchip/mbigen: Implement the mbigen irq chip operation functions
irqchip/mbigen: Create irq domain for each mbigen device
irqchip/mgigen: Add platform device driver for mbigen device
dt-bindings: Documents the mbigen bindings
...
|
||
|
|
b4cee21ee0 |
Merge branches 'timers-core-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates - and a leftover fix - from Thomas Gleixner:
"A rather large (commit wise) update from the timer side:
- A bulk update to make compile tests work in the clocksource drivers
- An overhaul of the h8300 timers
- Some more Y2038 work
- A few overflow prevention checks in the timekeeping/ntp code
- The usual pile of fixes and improvements to the various
clocksource/clockevent drivers and core code"
Also:
"A single fix for the posix-clock poll code which did not make it into
4.4"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (84 commits)
clocksource/drivers/acpi_pm: Convert to pr_* macros
clocksource: Make clocksource validation work for all clocksources
timekeeping: Cap adjustments so they don't exceed the maxadj value
ntp: Fix second_overflow's input parameter type to be 64bits
ntp: Change time_reftime to time64_t and utilize 64bit __ktime_get_real_seconds
timekeeping: Provide internal function __ktime_get_real_seconds
clocksource/drivers/h8300: Use ioread / iowrite
clocksource/drivers/h8300: Initializer cleanup.
clocksource/drivers/h8300: Simplify delta handling
clocksource/drivers/h8300: Fix timer not overflow case
clocksource/drivers/h8300: Change to overflow interrupt
clocksource/drivers/lpc32: Correct pr_err() output format
clocksource/drivers/arm_global_timer: Fix suspend resume
clocksource/drivers/pistachio: Fix wrong calculated clocksource read value
clockevents/drivers/arm_global_timer: Use writel_relaxed in gt_compare_set
clocksource/drivers/dw_apb_timer: Inline apbt_readl and apbt_writel
clocksource/drivers/dw_apb_timer: Use {readl|writel}_relaxed in critical path
clocksource/drivers/dw_apb_timer: Fix apbt_readl return types
clocksource/drivers/tango-xtal: Replace code by clocksource_mmio_init
clocksource/drivers/h8300: Increase the compilation test coverage
...
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
posix-clock: Fix return code on the poll method's error path
|
||
|
|
8f053a56df |
Merge branches 'pm-sleep' and 'pm-tools'
* pm-sleep: PM / sleep: Add support for read-only sysfs attributes * pm-tools: cpupower: fix how "cpupower frequency-info" interprets latency cpupower: rework the "cpupower frequency-info" command cpupower: Do not analyse offlined cpus cpupower: Provide STATIC variable in Makefile for debug builds cpupower: Fix precedence issue |
||
|
|
88cbfd0711 |
Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar: "The main changes in this cycle were: - vDSO and asm entry improvements (Andy Lutomirski) - Xen paravirt entry enhancements (Boris Ostrovsky) - asm entry labels enhancement (Borislav Petkov) - and other misc changes (Thomas Gleixner, me)" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vsdo: Fix build on PARAVIRT_CLOCK=y, KVM_GUEST=n Revert "x86/kvm: On KVM re-enable (e.g. after suspend), update clocks" x86/entry/64_compat: Make labels local x86/platform/uv: Include clocksource.h for clocksource_touch_watchdog() x86/vdso: Enable vdso pvclock access on all vdso variants x86/vdso: Remove pvclock fixmap machinery x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader x86/kvm: On KVM re-enable (e.g. after suspend), update clocks x86/entry/64: Bypass enter_from_user_mode on non-context-tracking boots x86/asm: Add asm macros for static keys/jump labels x86/asm: Error out if asm/jump_label.h is included inappropriately context_tracking: Switch to new static_branch API x86/entry, x86/paravirt: Remove the unused usergs_sysret32 PV op x86/paravirt: Remove the unused irq_enable_sysexit pv op x86/xen: Avoid fast syscall path for Xen PV guests |
||
|
|
4f19b8803b |
Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Ingo Molnar:
"The main changes in this cycle were:
- introduce optimized single IPI sending methods on modern APICs
(Linus Torvalds, Thomas Gleixner)
- kexec/crash APIC handling fixes and enhancements (Hidehiro Kawai)
- extend lapic vector saving/restoring to the CMCI (MCE) vector as
well (Juergen Gross)
- various fixes and enhancements (Jake Oshins, Len Brown)"
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
x86/irq: Export functions to allow MSI domains in modules
Documentation: Document kernel.panic_on_io_nmi sysctl
x86/nmi: Save regs in crash dump on external NMI
x86/apic: Introduce apic_extnmi command line parameter
kexec: Fix race between panic() and crash_kexec()
panic, x86: Allow CPUs to save registers even if looping in NMI context
panic, x86: Fix re-entrance problem due to panic on NMI
x86/apic: Fix the saving and restoring of lapic vectors during suspend/resume
x86/smpboot: Re-enable init_udelay=0 by default on modern CPUs
x86/smp: Remove single IPI wrapper
x86/apic: Use default send single IPI wrapper
x86/apic: Provide default send single IPI wrapper
x86/apic: Implement single IPI for apic_noop
x86/apic: Wire up single IPI for apic_numachip
x86/apic: Wire up single IPI for x2apic_uv
x86/apic: Implement single IPI for x2apic_phys
x86/apic: Wire up single IPI for bigsmp_apic
x86/apic: Remove pointless indirections from bigsmp_apic
x86/apic: Wire up single IPI for apic_physflat
x86/apic: Remove pointless indirections from apic_physflat
...
|
||
|
|
af345201ea |
Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"The main changes in this cycle were:
- tickless load average calculation enhancements (Byungchul Park)
- vtime handling enhancements (Frederic Weisbecker)
- scalability improvement via properly aligning a key structure field
(Jiri Olsa)
- various stop_machine() fixes (Oleg Nesterov)
- sched/numa enhancement (Rik van Riel)
- various fixes and improvements (Andi Kleen, Dietmar Eggemann,
Geliang Tang, Hiroshi Shimamoto, Joonwoo Park, Peter Zijlstra,
Waiman Long, Wanpeng Li, Yuyang Du)"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
sched/fair: Fix new task's load avg removed from source CPU in wake_up_new_task()
sched/core: Move sched_entity::avg into separate cache line
x86/fpu: Properly align size in CHECK_MEMBER_AT_END_OF() macro
sched/deadline: Fix the earliest_dl.next logic
sched/fair: Disable the task group load_avg update for the root_task_group
sched/fair: Move the cache-hot 'load_avg' variable into its own cacheline
sched/fair: Avoid redundant idle_cpu() call in update_sg_lb_stats()
sched/core: Move the sched_to_prio[] arrays out of line
sched/cputime: Convert vtime_seqlock to seqcount
sched/cputime: Introduce vtime accounting check for readers
sched/cputime: Rename vtime_accounting_enabled() to vtime_accounting_cpu_enabled()
sched/cputime: Correctly handle task guest time on housekeepers
sched/cputime: Clarify vtime symbols and document them
sched/cputime: Remove extra cost in task_cputime()
sched/fair: Make it possible to account fair load avg consistently
sched/fair: Modify the comment about lock assumptions in migrate_task_rq_fair()
stop_machine: Clean up the usage of the preemption counter in cpu_stopper_thread()
stop_machine: Shift the 'done != NULL' check from cpu_stop_signal_done() to callers
stop_machine: Kill cpu_stop_done->executed
stop_machine: Change __stop_cpus() to rely on cpu_stop_queue_work()
...
|
||
|
|
5cb52b5e16 |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Intel Knights Landing support. (Harish Chegondi)
- Intel Broadwell-EP uncore PMU support. (Kan Liang)
- Core code improvements. (Peter Zijlstra.)
- Event filter, LBR and PEBS fixes. (Stephane Eranian)
- Enable cycles:pp on Intel Atom. (Stephane Eranian)
- Add cycles:ppp support for Skylake. (Andi Kleen)
- Various x86 NMI overhead optimizations. (Andi Kleen)
- Intel PT enhancements. (Takao Indoh)
- AMD cache events fix. (Vince Weaver)
Tons of tooling changes:
- Show random perf tool tips in the 'perf report' bottom line
(Namhyung Kim)
- perf report now defaults to --group if the perf.data file has
grouped events, try it with:
# perf record -e '{cycles,instructions}' -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.093 MB perf.data (1247 samples) ]
# perf report
# Samples: 1K of event 'anon group { cycles, instructions }'
# Event count (approx.): 1955219195
#
# Overhead Command Shared Object Symbol
2.86% 0.22% swapper [kernel.kallsyms] [k] intel_idle
1.05% 0.33% firefox libxul.so [.] js::SetObjectElement
1.05% 0.00% kworker/0:3 [kernel.kallsyms] [k] gen6_ring_get_seqno
0.88% 0.17% chrome chrome [.] 0x0000000000ee27ab
0.65% 0.86% firefox libxul.so [.] js::ValueToId<(js::AllowGC)1>
0.64% 0.23% JS Helper libxul.so [.] js::SplayTree<js::jit::LiveRange*, js::jit::LiveRange>::splay
0.62% 1.27% firefox libxul.so [.] js::GetIterator
0.61% 1.74% firefox libxul.so [.] js::NativeSetProperty
0.61% 0.31% firefox libxul.so [.] js::SetPropertyByDefining
- Introduce the 'perf stat record/report' workflow:
Generate perf.data files from 'perf stat', to tap into the
scripting capabilities perf has instead of defining a 'perf stat'
specific scripting support to calculate event ratios, etc.
Simple example:
$ perf stat record -e cycles usleep 1
Performance counter stats for 'usleep 1':
1,134,996 cycles
0.000670644 seconds time elapsed
$ perf stat report
Performance counter stats for '/home/acme/bin/perf stat record -e cycles usleep 1':
1,134,996 cycles
0.000670644 seconds time elapsed
$
It generates PERF_RECORD_ userspace records to store the details:
$ perf report -D | grep PERF_RECORD
0xf0 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 27637
0x118 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
0x12a [0x40]: PERF_RECORD_STAT_CONFIG
0x16a [0x30]: PERF_RECORD_STAT
-1 -1 0x19a [0x40]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
0x1da [0x18]: PERF_RECORD_STAT_ROUND
[acme@ssdandy linux]$
An effort was made to make perf.data files generated like this to
not generate cryptic messages when processed by older tools.
The 'perf script' bits need rebasing, will go up later.
- Make command line options always available, even when they depend
on some feature being enabled, warning the user about use of such
options (Wang Nan)
- Support hw breakpoint events (mem:0xAddress) in the default output
mode in 'perf script' (Wang Nan)
- Fixes and improvements for supporting annotating ARM binaries,
support ARM call and jump instructions, more work needed to have
arch specific stuff separated into tools/perf/arch/*/annotate/
(Russell King)
- Add initial 'perf config' command, for now just with a --list
command to the contents of the configuration file in use and a
basic man page describing its format, commands for doing edits and
detailed documentation are being reviewed and proof-read. (Taeung
Song)
- Allows BPF scriptlets specify arguments to be fetched using DWARF
info, using a prologue generated at compile/build time (He Kuang,
Wang Nan)
- Allow attaching BPF scriptlets to module symbols (Wang Nan)
- Allow attaching BPF scriptlets to userspace code using uprobe (Wang
Nan)
- BPF programs now can specify 'perf probe' tunables via its section
name, separating key=val values using semicolons (Wang Nan)
Testing some of these new BPF features:
Use case: get callchains when receiving SSL packets, filter then in the
kernel, at arbitrary place.
# cat ssl.bpf.c
#define SEC(NAME) __attribute__((section(NAME), used))
struct pt_regs;
SEC("func=__inet_lookup_established hnum")
int func(struct pt_regs *ctx, int err, unsigned short port)
{
return err == 0 && port == 443;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
#
# perf record -a -g -e ssl.bpf.c
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ]
# perf script | head -30
swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux)
856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux)
2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux)
2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux)
96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux)
969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux)
2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux)
95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux)
1163ffa start_kernel ([kernel.vmlinux].init.text)
11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text)
1163623 x86_64_start_kernel ([kernel.vmlinux].init.text)
qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux)
8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux)
430a br_handle_frame_finish ([bridge])
48bc br_handle_frame ([bridge])
855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
#
- Use 'perf probe' various options to list functions, see what
variables can be collected at any given point, experiment first
collecting without a filter, then filter, use it together with
'perf trace', 'perf top', with or without callchains, if it
explodes, please tell us!
- Introduce a new callchain mode: "folded", that will list per line
representations of all callchains for a give histogram entry,
facilitating 'perf report' output processing by other tools, such
as Brendan Gregg's flamegraph tools (Namhyung Kim)
E.g:
# perf report | grep -v ^# | head
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
|
---cpu_startup_entry
|
|--12.07%--start_secondary
|
--6.30%--rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
#
Becomes, in "folded" mode:
# perf report -g folded | grep -v ^# | head -5
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
12.07% cpu_startup_entry;start_secondary
6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle
11.23% call_cpuidle;cpu_startup_entry;start_secondary
5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter
11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state
#
The user can also select one of "count", "period" or "percent" as
the first column.
... and lots of infrastructure enhancements, plus fixes and other
changes, features I failed to list - see the shortlog and the git log
for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (271 commits)
perf evlist: Add --trace-fields option to show trace fields
perf record: Store data mmaps for dwarf unwind
perf libdw: Check for mmaps also in MAP__VARIABLE tree
perf unwind: Check for mmaps also in MAP__VARIABLE tree
perf unwind: Use find_map function in access_dso_mem
perf evlist: Remove perf_evlist__(enable|disable)_event functions
perf evlist: Make perf_evlist__open() open evsels with their cpus and threads (like perf record does)
perf report: Show random usage tip on the help line
perf hists: Export a couple of hist functions
perf diff: Use perf_hpp__register_sort_field interface
perf tools: Add overhead/overhead_children keys defaults via string
perf tools: Remove list entry from struct sort_entry
perf tools: Include all tools/lib directory for tags/cscope/TAGS targets
perf script: Align event name properly
perf tools: Add missing headers in perf's MANIFEST
perf tools: Do not show trace command if it's not compiled in
perf report: Change default to use event group view
perf top: Decay periods in callchains
tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/
tools lib: Sync tools/lib/find_bit.c with the kernel
...
|
||
|
|
24af98c4cf |
Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"So we have a laundry list of locking subsystem changes:
- continuing barrier API and code improvements
- futex enhancements
- atomics API improvements
- pvqspinlock enhancements: in particular lock stealing and adaptive
spinning
- qspinlock micro-enhancements"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op
futex: Cleanup the goto confusion in requeue_pi()
futex: Remove pointless put_pi_state calls in requeue()
futex: Document pi_state refcounting in requeue code
futex: Rename free_pi_state() to put_pi_state()
futex: Drop refcount if requeue_pi() acquired the rtmutex
locking/barriers, arch: Remove ambiguous statement in the smp_store_mb() documentation
lcoking/barriers, arch: Use smp barriers in smp_store_release()
locking/cmpxchg, arch: Remove tas() definitions
locking/pvqspinlock: Queue node adaptive spinning
locking/pvqspinlock: Allow limited lock stealing
locking/pvqspinlock: Collect slowpath lock statistics
sched/core, locking: Document Program-Order guarantees
locking, sched: Introduce smp_cond_acquire() and use it
locking/pvqspinlock, x86: Optimize the PV unlock code path
locking/qspinlock: Avoid redundant read of next pointer
locking/qspinlock: Prefetch the next node cacheline
locking/qspinlock: Use _acquire/_release() versions of cmpxchg() & xchg()
atomics: Add test for atomic operations with _relaxed variants
|
||
|
|
9061cbe62a |
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
"The changes in this cycle were:
- Adding transitivity uniformly to rcu_node structure ->lock
acquisitions. (This is implemented by the first two commits on top
of v4.4-rc2 due to the pervasive nature of this change.)
- Documentation updates, including RCU requirements.
- Expedited grace-period changes.
- Miscellaneous fixes.
- Linked-list fixes, courtesy of KTSAN.
- Torture-test updates.
- Late-breaking fix to sysrq-generated crash.
One thing I should note is that these pieces of documentation are
fairly large files:
.../RCU/Design/Requirements/Requirements.html | 2897 ++++++++++++++++++++
.../RCU/Design/Requirements/Requirements.htmlx | 2741 ++++++++++++++++++
and are written in HTML, not the usual .txt style. I hope they are
fine"
Paul McKenney explains the html docs:
"For whatever it is worth, the reason for this unconventional choice
was that attempts to do the diagrams in ASCII art failed miserably.
And attempts to do ASCII art for the upcoming documentation of the
data structures failed even more miserably"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
sysrq: Fix warning in sysrq generated crash.
list: Add lockless list traversal primitives
rcu: Make rcu_gp_init() be bool rather than int
rcu: Move wakeup out from under rnp->lock
rcu: Fix comment for rcu_dereference_raw_notrace
rcu: Don't redundantly disable irqs in rcu_irq_{enter,exit}()
rcu: Make cpu_needs_another_gp() be bool
rcu: Eliminate unused rcu_init_one() argument
rcu: Remove TINY_RCU bloat from pointless boot parameters
torture: Place console.log files correctly from the get-go
torture: Abbreviate console error dump
rcutorture: Print symbolic name for ->gp_state
rcutorture: Print symbolic name for rcu_torture_writer_state
rcutorture: Remove CONFIG_RCU_USER_QS from rcutorture selftest doc
rcutorture: Default grace period to three minutes, allow override
rcutorture: Dump stack when GP kthread stalls
rcutorture: Flag nonexistent RCU GP kthread
rcutorture: Add batch number to script printout
Documentation/memory-barriers.txt: Fix ACCESS_ONCE thinko
documentation: Update RCU requirements based on expedited changes
...
|
||
|
|
2cfa2b191e |
cgroup: fix a typo.
This patch fixes a typo in a comment in cgroup.c. Signed-off-by: Rami Rosen <rami.rosen@intel.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
|
|
de03017958 |
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar: "Misc scheduler fixes" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Reset task's lockless wake-queues on fork() sched/core: Fix unserialized r-m-w scribbling stuff sched/core: Check tgid in is_global_init() sched/fair: Fix multiplication overflow on 32-bit systems |
||
|
|
3ab6d1ebd5 |
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar: "Two core subsystem fixes, plus a handful of tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix race in swevent hash perf: Fix race in perf_event_exec() perf list: Robustify event printing routine perf list: Add support for PERF_COUNT_SW_BPF_OUT perf hists browser: Fix segfault if use symbol filter in cmdline perf hists browser: Reset selection when refresh perf hists browser: Add NULL pointer check to prevent crash perf buildid-list: Fix return value of perf buildid-list -k perf buildid-list: Show running kernel build id fix |
||
|
|
ea83ae2fb3 |
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar: "Fixes a core IRQ subsystem deadlock" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Prevent chip buslock deadlock |
||
|
|
5156dca34a |
ftrace: Fix the race between ftrace and insmod
We hit ftrace_bug report when booting Android on a 64bit ATOM SOC chip.
Basically, there is a race between insmod and ftrace_run_update_code.
After load_module=>ftrace_module_init, another thread jumps in to call
ftrace_run_update_code=>ftrace_arch_code_modify_prepare
=>set_all_modules_text_rw, to change all modules
as RW. Since the new module is at MODULE_STATE_UNFORMED, the text attribute
is not changed. Then, the 2nd thread goes ahead to change codes.
However, load_module continues to call complete_formation=>set_section_ro_nx,
then 2nd thread would fail when probing the module's TEXT.
The patch fixes it by using notifier to delay the enabling of ftrace
records to the time when module is at state MODULE_STATE_COMING.
Link: http://lkml.kernel.org/r/567CE628.3000609@intel.com
Signed-off-by: Qiu Peiyang <peiyangx.qiu@intel.com>
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
||
|
|
2626820d83 |
Merge tag 'trace-v4.4-rc4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ftrace fix from Steven Rostedt: "PeiyangX Qiu reported that if a module fails to load between calling ftrace_module_init() and do_init_module() that the allocations made in ftrace_module_init() will not be freed, resulting in a memory leak. The solution is to call ftrace_release_mod() on the failing module in the fail path befor do_init_module() is called. This will remove any allocations made for that module, and nothing if ftrace_module_init() wasn't called yet for that module. Note, once do_init_module() is called, the MODULE_GOING notifiers are called for the failed module, which calls into the ftrace code to do the proper clean up (basically calling ftrace_release_mod())" * tag 'trace-v4.4-rc4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace/module: Call clean up function when module init fails early |
||
|
|
b7ffffbb46 |
ftrace: Add infrastructure for delayed enabling of module functions
Qiu Peiyang pointed out that there's a race when enabling function tracing and loading a module. In order to make the modifications of converting nops in the prologue of functions into callbacks, the text needs to be converted from read-only to read-write. When enabling function tracing, the text permission is updated, the functions are modified, and then they are put back. When loading a module, the updates to convert function calls to mcount is done before the module text is set to read-only. But after it is done, the module text is visible by the function tracer. Thus we have the following race: CPU 0 CPU 1 ----- ----- start function tracing set text to read-write load_module add functions to ftrace set module text read-only update all functions to callbacks modify module functions too < Can't it's read-only > When this happens, ftrace detects the issue and disables itself till the next reboot. To fix this, a new DISABLED flag is added for ftrace records, which all module functions get when they are added. Then later, after the module code is all set, the records will have the DISABLED flag cleared, and they will be enabled if any callback wants all functions to be traced. Note, this doesn't add the delay to later. It simply changes the ftrace_module_init() to do both the setting of DISABLED records, and then immediately calls the enable code. This helps with testing this new code as it has the same behavior as previously. Another change will come after this to have the ftrace_module_enable() called after the text is set to read-only. Cc: Qiu Peiyang <peiyangx.qiu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |