It's unnecessary to spam the kernel log every time the BTS buffer
cannot be allocated, so make this allocation __GFP_NOWARN.
The user will probably still want to find some trace that the
allocation has failed in the past, probably due to fragmentation because
of its large size, when it's not allocated at bootstrap. Thus, add a
WARN_ONCE() so something is left behind for them to understand why perf
commands that require PEBS are not working properly.
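The intended behaviour (stay quiet on each failed allocation, but leave
one trace behind) can be modelled in user space. This is only a sketch
in Python, not the kernel change itself: kernel_log, warn_once(),
reserve_bts_buffer() and try_alloc are hypothetical names invented for
the illustration, and the message text is made up.

```python
kernel_log = []   # stand-in for the kernel log in this model
_warned = False   # models the one-shot state behind WARN_ONCE()


def warn_once(message):
    """Append `message` to the log the first time only."""
    global _warned
    if not _warned:
        _warned = True
        kernel_log.append(message)


def reserve_bts_buffer(try_alloc):
    """Attempt an allocation through the hypothetical callable `try_alloc`.

    `try_alloc` returns a buffer or None and is assumed to log nothing
    itself on failure, which is the role __GFP_NOWARN plays for the real
    allocation.
    """
    buf = try_alloc()
    if buf is None:
        # Only the first failure leaves a trace; later ones stay silent.
        warn_once("BTS buffer allocation failure")
    return buf
```

Calling reserve_bts_buffer() repeatedly with a failing allocator leaves
a single entry in kernel_log, however many times the allocation fails.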
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1406301600460.26302@chino.kir.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With -cpu host, KVM reports LBR and extra_regs support, if the host has
support.
When the guest perf driver tries to access the LBR or extra_regs MSRs,
every such access #GPs, since KVM doesn't handle LBR and extra_regs support.
So check the access rights of the related MSRs once at initialization time
to avoid the faulting accesses at runtime.
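The idea of probing once at initialization can be modelled in user
space. The following Python sketch is an illustration under stated
assumptions, not the driver code: GPFault, ModelCPU, msr_accessible()
and init_features() are hypothetical names, and the MSR addresses used
in the usage note are placeholders that are not taken from this commit.

```python
class GPFault(Exception):
    """Models the #GP raised when an unhandled MSR is accessed."""


class ModelCPU:
    """Hypothetical MSR file: only addresses in `backed` are accessible."""

    def __init__(self, backed):
        self.regs = {addr: 0 for addr in backed}

    def rdmsr(self, addr):
        if addr not in self.regs:
            raise GPFault(hex(addr))
        return self.regs[addr]

    def wrmsr(self, addr, value):
        if addr not in self.regs:
            raise GPFault(hex(addr))
        self.regs[addr] = value


def msr_accessible(cpu, addr):
    """Probe one MSR with a fault-tolerant read and write-back."""
    try:
        old = cpu.rdmsr(addr)
        cpu.wrmsr(addr, old)  # write back the value that was just read
    except GPFault:
        return False
    return True


def init_features(cpu, feature_msrs):
    """Return the features whose MSRs all passed the one-time probe."""
    return {name for name, addrs in feature_msrs.items()
            if all(msr_accessible(cpu, a) for a in addrs)}
```

For example, init_features(ModelCPU(backed={0x10}), {"lbr": [0x1c9],
"basic": [0x10]}) keeps only "basic", so later code never touches the
MSR that would fault.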
To reproduce the issue, build the host kernel with CONFIG_KVM_INTEL=y
and the guest kernel with CONFIG_PARAVIRT=n and CONFIG_KVM_GUEST=n.
Start the guest with -cpu host.
Run perf record with --branch-any or --branch-filter in guest to trigger LBR
#GP.
Run perf stat offcore events (E.g. LLC-loads/LLC-load-misses ...) in guest to
trigger offcore_rsp #GP.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Cc: Mark Davies <junk@eslaf.co.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1405365957-20202-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently, any NMI is falsely handled by the NMI watchdog's NMI handler
if the CondChgd bit in the MSR_CORE_PERF_GLOBAL_STATUS MSR is set.
For example, we use an external NMI to make the system panic and get a
crash dump, but in this case the external NMI is falsely handled due to
the issue.
This commit deals with the issue simply by ignoring the CondChgd bit.
Here is the explanation in detail.
On x86, the NMI watchdog uses the performance monitoring feature to
periodically signal an NMI each time a performance counter overflows.
intel_pmu_handle_irq() is called as an NMI_LOCAL handler from the NMI
watchdog's NMI handler, perf_event_nmi_handler(). It identifies the
owner of a given NMI by looking at the overflow status bits in the
MSR_CORE_PERF_GLOBAL_STATUS MSR. If any of the bits are set, it
handles the given NMI as its own NMI.
The problem is that intel_pmu_handle_irq() doesn't distinguish the
CondChgd bit from the other bits. Unlike the other status bits, the
CondChgd bit doesn't represent overflow status for performance counters.
Thus, the CondChgd bit cannot be taken as a mark indicating that a given
NMI is the NMI watchdog's.
As a result, if the CondChgd bit is set, any NMI is falsely handled by the
NMI handler of the NMI watchdog. Also, if the type of the falsely handled
NMI is NMI_UNKNOWN, NMI_SERR or NMI_IO_CHECK, the corresponding
action is never performed until the CondChgd bit is cleared.
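The difference between treating every status bit as an overflow
indicator and ignoring CondChgd can be shown with a small user-space
model. This Python sketch is an illustration only: COND_CHGD,
owns_nmi_before() and owns_nmi_after() are hypothetical names, and the
bit position (63) is the one given in the manual's MSR table for this
register.

```python
COND_CHGD = 1 << 63  # bit 63 of the global status value in this model


def owns_nmi_before(status):
    """Old behaviour: any set bit makes the handler claim the NMI."""
    return status != 0


def owns_nmi_after(status):
    """With CondChgd ignored: only the remaining status bits count."""
    return (status & ~COND_CHGD) != 0
```

With only CondChgd set, owns_nmi_before() claims the NMI while
owns_nmi_after() does not, so an external NMI would fall through to the
other handlers. A status with both CondChgd and a counter overflow bit
set is still claimed.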
I noticed this behavior on systems with Ivy Bridge processors: Intel
Xeon CPU E5-2630 v2 and Intel Xeon CPU E7-8890 v2. On both systems,
the CondChgd bit in the MSR_CORE_PERF_GLOBAL_STATUS MSR is already set
at boot. The CondChgd bit is then immediately cleared by the next wrmsr
to the MSR_CORE_PERF_GLOBAL_CTRL MSR and appears to remain 0.
On the other hand, on older processors such as Nehalem (Xeon E7540), the
CondChgd bit is not set at boot.
I'm not sure about the exact behavior of the CondChgd bit, in particular
when this bit is set. I read the Intel System Programmer's Manual to
figure that out, but the only descriptions I found are:
In 18.9.1:
"The MSR_PERF_GLOBAL_STATUS MSR also provides a 'sticky bit' to
indicate changes to the state of performance monitoring hardware"
In Table 35-2 IA-32 Architectural MSRs
63 CondChg: status bits of this register has changed.
These are different from the behaviour I see on the actual system, as I
explained above.
At least, I think ignoring the CondChgd bit should be enough from the NMI
watchdog's perspective.
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20140625.103503.409316067.d.hatayama@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull i2c new drivers from Wolfram Sang:
"Here is a pull request from i2c hoping for the "new driver" rule.
Originally, I wanted to send this request during the merge window, but
code checkers with very recent additions complained, so a few fixups
were needed. So, some more time went by and I merged rc1 to get a
stable base"
So the "new driver" rule is really about drivers that people absolutely
need for the kernel to work on new hardware, which is not so much the
case for i2c. So I considered not pulling this, but eventually
relented.
Just for FYI: the whole (and only) point of "new drivers" is not that
new drivers cannot regress things (they can, and they have - by
triggering badly tested code on machines that never triggered that code
before), but because they can bring to life machines that otherwise
wouldn't be useful at all without the drivers.
So the new driver rule is for essential things that actual consumers
would care about, ie devices like networking or disk drivers that matter
to normal people (not server people - they run old kernels anyway, so
mainlining new drivers is irrelevant for them).
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: sun6-p2wi: fix call to snprintf
i2c: rk3x: add NULL entry to the end of_device_id array
i2c: sun6i-p2wi: use proper return value in probe
i2c: sunxi: add P2WI (Push/Pull 2 Wire Interface) controller support
i2c: sunxi: add P2WI DT bindings documentation
i2c: rk3x: add driver for Rockchip RK3xxx SoC I2C adapter
Pull file locking fixes from Jeff Layton:
"File locking related bugfixes
Nothing too earth-shattering here. A fix for a potential regression
due to a patch in pile #1, and the addition of a memory barrier to
prevent a race condition between break_deleg and generic_add_lease"
* tag 'locks-v3.16-2' of git://git.samba.org/jlayton/linux:
locks: set fl_owner for leases back to current->files
locks: add missing memory barrier in break_deleg
Pull kbuild fixes from Michal Marek:
"There are three fixes for regressions caused by the relative paths
series: deb-pkg, tar-pkg and *docs did not work with O=.
Plus, there is a fix for the linux-headers deb package and a fixed
typo. These are not regression fixes but are safe enough"
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kbuild: fix a typo in a kbuild document
builddeb: fix missing headers in linux-headers package
Documentation: Fix DocBook build with relative $(srctree)
kbuild: Fix tar-pkg with relative $(objtree)
deb-pkg: Fix for relative paths
Pull btrfs fixes from Chris Mason:
"This fixes some lockups in btrfs reported with rc1. It probably has
some performance impact because it is backing off our spinning locks
more often and switching to a blocking lock. I'll be able to nail
that down next week, but for now I want to get the lockups taken care
of.
Otherwise some more stack reduction and assorted fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix wrong error handle when the device is missing or is not writeable
Btrfs: fix deadlock when mounting a degraded fs
Btrfs: use bio_endio_nodec instead of open code
Btrfs: fix NULL pointer crash when running balance and scrub concurrently
btrfs: Skip scrubbing removed chunks to avoid -ENOENT.
Btrfs: fix broken free space cache after the system crashed
Btrfs: make free space cache write out functions more readable
Btrfs: remove unused wait queue in struct extent_buffer
Btrfs: fix deadlocks with trylock on tree nodes
Pull nfsd bugfixes from Bruce Fields:
"Fixes for a new regression from the xdr encoding rewrite, and a
delegation problem we've had for a while (made somewhat more annoying
by the vfs delegation support added in 3.13)"
* 'for-3.16' of git://linux-nfs.org/~bfields/linux:
NFSD: fix bug for readdir of pseudofs
NFSD: Don't hand out delegations for 30 seconds after recalling them.
Pull perf fixes from Ingo Molnar:
"This is larger than usual: the main reason is the ARM symbol lookup
speedups that came in late and were hard to resist.
There's also a kprobes fix and various tooling fixes, plus the minimal
re-enablement of the mmap2 support interface"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
x86/kprobes: Fix build errors and blacklist context_track_user
perf tests: Add test for closing dso objects on EMFILE error
perf tests: Add test for caching dso file descriptors
perf tests: Allow reuse of test_file function
perf tests: Spawn child for each test
perf tools: Add dso__data_* interface descriptons
perf tools: Allow to close dso fd in case of open failure
perf tools: Add file size check and factor dso__data_read_offset
perf tools: Cache dso data file descriptor
perf tools: Add global count of opened dso objects
perf tools: Add global list of opened dso objects
perf tools: Add data_fd into dso object
perf tools: Separate dso data related variables
perf tools: Cache register accesses for unwind processing
perf record: Fix to honor user freq/interval properly
perf timechart: Reflow documentation
perf probe: Improve error messages in --line option
perf probe: Improve an error message of perf probe --vars mode
perf probe: Show error code and description in verbose mode
perf probe: Improve error message for unknown member of data structure
...
Pull rtmutex fixes from Thomas Gleixner:
"Another three patches to make the rtmutex code more robust. That's
the last urgent fallout from the big futex/rtmutex investigation"
* 'locking-urgent-for-linus.patch' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rtmutex: Plug slow unlock race
rtmutex: Detect changes in the pi lock chain
rtmutex: Handle deadlock detection smarter
Pull s390 patches from Martin Schwidefsky:
"A couple of bug fixes, a debug change for qdio, an update for the
default config, and one small extension.
The watchdog module based on diagnose 0x288 is converted to the
watchdog API and it now works under LPAR as well"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ccwgroup: use ccwgroup_ungroup wrapper
s390/ccwgroup: fix an uninitialized return code
s390/ccwgroup: obtain extra reference for asynchronous processing
qdio: Keep device-specific dbf entries
s390/compat: correct ucontext layout for high gprs
s390/cio: set device name as early as possible
s390: update default configuration
s390: avoid format strings leaking into names
s390/airq: silence lockdep warning
s390/watchdog: add support for LPAR operation (diag288)
s390/watchdog: use watchdog API
s390/sclp_vt220: Enable ASCII console per default
s390/qdio: replace shift loop by ilog2
s390/cio: silence lockdep warning
s390/uaccess: always load the kernel ASCE after task switch
s390/ap_bus: Make modules parameters visible in sysfs
Pull UniCore32 bug fixes from Guan Xuetao:
"This includes bugfixes to make unicore32 successfully build under
defconfig, and some changes for allmodconfig (though not finished)"
* tag 'for-linus' of git://github.com/gxt/linux:
unicore32: Remove ARCH_HAS_CPUFREQ config option
UniCore32: Change git tree location information in MAINTAINERS
arch: unicore32: ksyms: export '__cpuc_coherent_kern_range' to avoid compiling failure
arch: unicore32: ksyms: export 'pm_power_off' to avoid compiling failure.
arch: unicore32: ksyms: export additional find_first_*() to avoid compiling failure
arch:unicore32:mm: add devmem_is_allowed() to support STRICT_DEVMEM
unicore32: include: asm: add missing ')' for PAGE_* macros in pgtable.h
arch/unicore32/kernel/setup.c: add generic 'screen_info' to avoid compiling failure
drivers: scsi: mvsas: fix compiling issue by adding 'MVS_' for "enum pci_interrupt_cause"
arch: unicore32: kernel: ksyms: remove 'bswapsi2' and 'muldi3' to avoid compiling failure
arch/unicore32/kernel/ksyms.c: remove 2 export symbols to avoid compiling failure
drivers/rtc/rtc-puv3.c: remove "&dev->" for typo issue
drivers/rtc/rtc-puv3.c: use dev_dbg() instead of dev_debug() for typo issue
arch/unicore32/include/asm/io.h: add readl_relaxed() generic definition
arch/unicore32/include/asm/ptrace.h: add generic definition for profile_pc()
arch/unicore32/mm/alignment.c: include "asm/pgtable.h" to avoid compiling error
arch/unicore32/kernel/clock.c: add readl() and writel() for 'PM_' macros
arch/unicore32/kernel/module.c: use __vmalloc_node_range() instead of __vmalloc_area()
arch/unicore32/kernel/ksyms.c: remove several undefined exported symbols
Pull char / misc driver fixes from Greg KH:
"Here are 3 patches, one a revert of the UIO patch you objected to in
3.16-rc1 and that no one wanted to defend, a w1 driver bugfix, and a
MAINTAINERS update for the vmware balloon driver.
All of these, except for the MAINTAINERS update which just got added,
have been in linux-next just fine"
* tag 'char-misc-3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
MAINTAINERS: add entry for VMware Balloon driver
w1: mxc_w1: Fix incorrect "presence" status
Revert "uio: fix vma io range check in mmap"
Pull staging driver fixes from Greg KH:
"Here are a few fixes for staging and iio drivers that resolve issues
reported in 3.16-rc1.
All have been in linux-next just fine"
* tag 'staging-3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
imx-drm: parallel-display: Fix DPMS default state.
staging: android: timed_output: fix use after free of dev
staging: comedi: addi_apci_1564: add addi_watchdog dependency
staging: rtl8723au: Reference correct firmwarefiles with MODULE_FIRMWARE()
staging: rtl8723au: Request correct firmware file for A-cut parts
iio: adc: checking for NULL instead of IS_ERR() in probe
iio: adc: at91: signedness bug in at91_adc_get_trigger_value_by_name()
iio: mxs-lradc: fix divider
iio: Fix endianness issue in ak8975_read_axis()
staging/iio: IIO_SIMPLE_DUMMY_BUFFER neds IIO_BUFFER
twl4030-madc: Request processed values in twl4030_get_madc_conversion
staging: iio: tsl2x7x_core: fix proximity treshold
iio: Fix two mpl3115 issues in measurement conversion
iio: hid-sensors: Get feature report from sensor hub after changing power state
Pull tty/serial bugfixes from Greg KH:
"Here are some tty / serial driver bugfixes for 3.16-rc2 that resolve
some reported issues. The samsung driver build error itself has been
reported by a bunch of people, sorry about that one. The others are
all tiny and everyone seems to like them in linux-next so far"
* tag 'tty-3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty/serial: fix 8250 early console option passing to regular console
tty: Correct INPCK handling
serial: Fix IGNBRK handling
serial: samsung: Fix build error
Pull USB fixes from Greg KH:
"Here are some USB fixes for 3.16-rc2 that resolve some reported
issues. All of these have been in linux-next for a while with no
problems"
* tag 'usb-3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: usbtest: add a timeout for scatter-gather tests
USB: EHCI: avoid BIOS handover on the HASEE E200
usb: fix hub-port pm_runtime_enable() vs runtime pm transitions
usb: quiet peer failure warning, disable poweroff
usb: improve "not suspended yet" message in hub_suspend()
xhci: Fix sleeping with IRQs disabled in xhci_stop_device()
usb: fix ->update_hub_device() vs hdev->maxchild