Commit Graph

16565 Commits

Author SHA1 Message Date
Linus Torvalds ed06ef318a Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

 . revert 20b279 - require exclude_guest to use PEBS - kernel side, now
   older binaries will continue working for things like cycles:pp
   without needing to pass extra modifiers, from David Ahern.

 . Fix building from 'make perf-*-src-pkg' tarballs, broken by UAPI,
   from Sebastian Andrzej Siewior

[ Pulling directly, Ingo would normally pull but has been unresponsive ]

* tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf tools: Fix building from 'make perf-*-src-pkg' tarballs
  perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel side
2013-01-22 14:32:07 -08:00
Oleg Nesterov 9899d11f65 ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL
putreg() assumes that the tracee is not running and pt_regs_access() can
safely play with its stack.  However a killed tracee can return from
ptrace_stop() to the low-level asm code and do RESTORE_REST, this means
that debugger can actually read/modify the kernel stack until the tracee
does SAVE_REST again.

set_task_blockstep() can race with SIGKILL too and in some sense this
race is even worse, the very fact the tracee can be woken up breaks the
logic.

As Linus suggested we can clear TASK_WAKEKILL around the arch_ptrace()
call, this ensures that nobody can ever wakeup the tracee while the
debugger looks at it.  Not only this fixes the mentioned problems, we
can do some cleanups/simplifications in arch_ptrace() paths.

Probably ptrace_unfreeze_traced() needs more callers, for example it
makes sense to make the tracee killable for oom-killer before
access_process_vm().

While at it, add the comment into may_ptrace_stop() to explain why
ptrace_stop() still can't rely on SIGKILL and signal_pending_state().

Reported-by: Salman Qazi <sqazi@google.com>
Reported-by: Suleiman Souhlal <suleiman@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-22 10:08:00 -08:00
Linus Torvalds 5c69bed266 Merge tag 'stable/for-linus-3.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen fixes from Konrad Rzeszutek Wilk:
 - CVE-2013-0190/XSA-40 (or stack corruption for 32-bit PV kernels)
 - Fix racy vma access spotted by Al Viro
 - Fix mmap batch ioctl potentially resulting in large O(n) page allcations.
 - Fix vcpu online/offline BUG:scheduling while atomic..
 - Fix unbound buffer scanning for more than 32 vCPUs.
 - Fix grant table being incorrectly initialized
 - Fix incorrect check in pciback
 - Allow privcmd in backend domains.

Fix up whitespace conflict due to ugly merge resolution in Xen tree in
arch/arm/xen/enlighten.c

* tag 'stable/for-linus-3.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests.
  Revert "xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic."
  xen/gntdev: remove erronous use of copy_to_user
  xen/gntdev: correctly unmap unlinked maps in mmu notifier
  xen/gntdev: fix unsafe vma access
  xen/privcmd: Fix mmap batch ioctl.
  Xen: properly bound buffer access when parsing cpu/*/availability
  xen/grant-table: correctly initialize grant table version 1
  x86/xen : Fix the wrong check in pciback
  xen/privcmd: Relax access control in privcmd_ioctl_mmap
2013-01-18 12:02:52 -08:00
Andrew Cooper 9174adbee4 xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests.
This fixes CVE-2013-0190 / XSA-40

There has been an error on the xen_failsafe_callback path for failed
iret, which causes the stack pointer to be wrong when entering the
iret_exc error path.  This can result in the kernel crashing.

In the classic kernel case, the relevant code looked a little like:

        popl %eax      # Error code from hypervisor
        jz 5f
        addl $16,%esp
        jmp iret_exc   # Hypervisor said iret fault
5:      addl $16,%esp
                       # Hypervisor said segment selector fault

Here, there are two identical addls on either option of a branch which
appears to have been optimised by hoisting it above the jz, and
converting it to an lea, which leaves the flags register unaffected.

In the PVOPS case, the code looks like:

        popl_cfi %eax         # Error from the hypervisor
        lea 16(%esp),%esp     # Add $16 before choosing fault path
        CFI_ADJUST_CFA_OFFSET -16
        jz 5f
        addl $16,%esp         # Incorrectly adjust %esp again
        jmp iret_exc

It is possible unprivileged userspace applications to cause this
behaviour, for example by loading an LDT code selector, then changing
the code selector to be not-present.  At this point, there is a race
condition where it is possible for the hypervisor to return back to
userspace from an interrupt, fault on its own iret, and inject a
failsafe_callback into the kernel.

This bug has been present since the introduction of Xen PVOPS support
in commit 5ead97c84 (xen: Core Xen implementation), in 2.6.23.

Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-16 16:17:42 -05:00
Linus Torvalds 2409c873be Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
 "This is mainly a workaround for a bug in Sandy Bridge graphics which
  causes corruption of certain memory pages."

* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/Sandy Bridge: Sandy Bridge workaround depends on CONFIG_PCI
  x86/Sandy Bridge: mark arrays in __init functions as __initconst
  x86/Sandy Bridge: reserve pages when integrated graphics is present
  x86, efi: correct precedence of operators in setup_efi_pci
2013-01-16 09:11:50 -08:00
Konrad Rzeszutek Wilk d55bf532d7 Revert "xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic."
This reverts commit 41bd956de3.

The fix is incorrect and not appropiate for the latest kernels.
In fact it _causes_ the BUG: scheduling while atomic while
doing vCPU hotplug.

Suggested-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-15 22:41:27 -05:00
Konrad Rzeszutek Wilk 7bcc1ec077 Merge tag 'v3.7' into stable/for-linus-3.8
Linux 3.7

* tag 'v3.7': (833 commits)
  Linux 3.7
  Input: matrix-keymap - provide proper module license
  Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage
  ipv4: ip_check_defrag must not modify skb before unsharing
  Revert "mm: avoid waking kswapd for THP allocations when compaction is deferred or contended"
  inet_diag: validate port comparison byte code to prevent unsafe reads
  inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run()
  inet_diag: validate byte code to prevent oops in inet_diag_bc_run()
  inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state
  mm: vmscan: fix inappropriate zone congestion clearing
  vfs: fix O_DIRECT read past end of block device
  net: gro: fix possible panic in skb_gro_receive()
  tcp: bug fix Fast Open client retransmission
  tmpfs: fix shared mempolicy leak
  mm: vmscan: do not keep kswapd looping forever due to individual uncompactable zones
  mm: compaction: validate pfn range passed to isolate_freepages_block
  mmc: sh-mmcif: avoid oops on spurious interrupts (second try)
  Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts"
  mmc: sdhci-s3c: fix missing clock for gpio card-detect
  lib/Makefile: Fix oid_registry build dependency
  ...

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Conflicts:
	arch/arm/xen/enlighten.c
	drivers/xen/Makefile

[We need to have the v3.7 base as the 'for-3.8' was based off v3.7-rc3
and there are some patches in v3.7-rc6 that we to have in our branch]
2013-01-15 15:58:25 -05:00
H. Peter Anvin e43b3cec71 x86/Sandy Bridge: Sandy Bridge workaround depends on CONFIG_PCI
early_pci_allowed() and read_pci_config_16() are only available if
CONFIG_PCI is defined.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
2013-01-13 20:58:57 -08:00
H. Peter Anvin ab3cd8670e x86/Sandy Bridge: mark arrays in __init functions as __initconst
Mark static arrays as __initconst so they get removed when the init
sections are flushed.

Reported-by: Mathias Krause <minipli@googlemail.com>
Link: http://lkml.kernel.org/r/75F4BEE6-CB0E-4426-B40B-697451677738@googlemail.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-13 20:36:39 -08:00
Jesse Barnes a9acc5365d x86/Sandy Bridge: reserve pages when integrated graphics is present
SNB graphics devices have a bug that prevent them from accessing certain
memory ranges, namely anything below 1M and in the pages listed in the
table.  So reserve those at boot if set detect a SNB gfx device on the
CPU to avoid GPU hangs.

Stephane Marchesin had a similar patch to the page allocator awhile
back, but rather than reserving pages up front, it leaked them at
allocation time.

[ hpa: made a number of stylistic changes, marked arrays as static
  const, and made less verbose; use "memblock=debug" for full
  verbosity. ]

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-11 14:26:38 -08:00
Linus Torvalds ccae663cd4 Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM bugfixes from Marcelo Tosatti.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: use dynamic percpu allocations for shared msrs area
  KVM: PPC: Book3S HV: Fix compilation without CONFIG_PPC_POWERNV
  powerpc: Corrected include header path in kvm_para.h
  Add rcu user eqs exception hooks for async page fault
2013-01-10 09:05:18 -08:00
David Ahern a706d965dc perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel side
This patch is brought to you by the letter 'H'.

Commit 20b279 breaks compatiblity with older perf binaries when run with
precise modifier (:p or :pp) by requiring the exclude_guest attribute to be
set. Older binaries default exclude_guest to 0 (ie., wanting guest-based
samples) unless host only profiling is requested (:H modifier). The workaround
for older binaries is to add H to the modifier list (e.g., -e cycles:ppH -
toggles exclude_guest to 1). This was deemed unacceptable by Linus:

https://lkml.org/lkml/2012/12/12/570

Between family in town and the fresh snow in Breckenridge there is no time left
to be working on the proper fix for this over the holidays. In the New Year I
have more pressing problems to resolve -- like some memory leaks in perf which
are proving to be elusive -- although the aforementioned snow is probably why
they are proving to be elusive. Either way I do not have any spare time to work
on this and from the time I have managed to spend on it the solution is more
difficult than just moving to a new exclude_guest flag (does not work) or
flipping the logic to include_guest (which is not as trivial as one would
think).

So, two options: silently force exclude_guest on as suggested by Gleb which
means no impact to older perf binaries or revert the original patch which
caused the breakage.

This patch does the latter -- reverts the original patch that introduced the
regression. The problem can be revisited in the future as time allows.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Link: http://lkml.kernel.org/r/1356749767-17322-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-10 09:21:19 -03:00
Marcelo Tosatti 013f6a5d3d KVM: x86: use dynamic percpu allocations for shared msrs area
Use dynamic percpu allocations for the shared msrs structure,
to avoid using the limited reserved percpu space.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-01-08 12:51:56 -02:00
Greg Kroah-Hartman a18e3690a5 X86: drivers: remove __dev* attributes.
CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the use of __devinit, __devexit_p, __devinitconst,
and __devexit from these drivers.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Drake <dsd@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-03 15:57:04 -08:00
Myron Stowe 1278998f8f PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)
Commit 284f5f9 was intended to disable the "only_one_child()" optimization
on Stratus ftServer systems, but its DMI check is wrong.  It looks for
DMI_SYS_VENDOR that contains "ftServer", when it should look for
DMI_SYS_VENDOR containing "Stratus" and DMI_PRODUCT_NAME containing
"ftServer".

Tested on Stratus ftServer 6400.

Reported-by: Fadeeva Marina <astarta@rat.ru>
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=51331
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org	# v3.5+
2012-12-26 10:39:23 -07:00
Linus Torvalds 54d46ea993 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull signal handling cleanups from Al Viro:
 "sigaltstack infrastructure + conversion for x86, alpha and um,
  COMPAT_SYSCALL_DEFINE infrastructure.

  Note that there are several conflicts between "unify
  SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline;
  resolution is trivial - just remove definitions of SS_ONSTACK and
  SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and
  include/uapi/linux/signal.h contains the unified variant."

Fixed up conflicts as per Al.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  alpha: switch to generic sigaltstack
  new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those
  generic compat_sys_sigaltstack()
  introduce generic sys_sigaltstack(), switch x86 and um to it
  new helper: compat_user_stack_pointer()
  new helper: restore_altstack()
  unify SS_ONSTACK/SS_DISABLE definitions
  new helper: current_user_stack_pointer()
  missing user_stack_pointer() instances
  Bury the conditionals from kernel_thread/kernel_execve series
  COMPAT_SYSCALL_DEFINE: infrastructure
2012-12-20 18:05:28 -08:00
Sasha Levin 886d751a2e x86, efi: correct precedence of operators in setup_efi_pci
With the current code, the condition in the if() doesn't make much sense due to
precedence of operators.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Link: http://lkml.kernel.org/r/1356030701-16284-25-git-send-email-sasha.levin@oracle.com
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-12-20 11:47:14 -08:00
Linus Torvalds 787314c35f Merge tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
 "A few new features this merge-window.  The most important one is
  probably, that dma-debug now warns if a dma-handle is not checked with
  dma_mapping_error by the device driver.  This requires minor changes
  to some architectures which make use of dma-debug.  Most of these
  changes have the respective Acks by the Arch-Maintainers.

  Besides that there are updates to the AMD IOMMU driver for refactor
  the IOMMU-Groups support and to make sure it does not trigger a
  hardware erratum.

  The OMAP changes (for which I pulled in a branch from Tony Lindgren's
  tree) have a conflict in linux-next with the arm-soc tree.  The
  conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is
  deleted in the arm-soc tree.  It is safe to delete the file too so
  solve the conflict.  Similar changes are done in the arm-soc tree in
  the common clock framework migration.  A missing hunk from the patch
  in the IOMMU tree will be submitted as a seperate patch when the
  merge-window is closed."

* tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits)
  ARM: dma-mapping: support debug_dma_mapping_error
  ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks
  iommu/omap: Adapt to runtime pm
  iommu/omap: Migrate to hwmod framework
  iommu/omap: Keep mmu enabled when requested
  iommu/omap: Remove redundant clock handling on ISR
  iommu/amd: Remove obsolete comment
  iommu/amd: Don't use 512GB pages
  iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch
  iommu/tegra: gart: Move bus_set_iommu after probe for multi arch
  iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all
  tile: dma_debug: add debug_dma_mapping_error support
  sh: dma_debug: add debug_dma_mapping_error support
  powerpc: dma_debug: add debug_dma_mapping_error support
  mips: dma_debug: add debug_dma_mapping_error support
  microblaze: dma-mapping: support debug_dma_mapping_error
  ia64: dma_debug: add debug_dma_mapping_error support
  c6x: dma_debug: add debug_dma_mapping_error support
  ARM64: dma_debug: add debug_dma_mapping_error support
  intel-iommu: Prevent devices with RMRRs from being placed into SI Domain
  ...
2012-12-20 10:07:25 -08:00
Al Viro c40702c49f new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those
note that they are relying on access_ok() already checked by caller.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:41 -05:00
Al Viro 9026843952 generic compat_sys_sigaltstack()
Again, conditional on CONFIG_GENERIC_SIGALTSTACK

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:41 -05:00
Al Viro 6bf9adfc90 introduce generic sys_sigaltstack(), switch x86 and um to it
Conditional on CONFIG_GENERIC_SIGALTSTACK; architectures that do not
select it are completely unaffected

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:40 -05:00
Al Viro 9b064fc3f9 new helper: compat_user_stack_pointer()
Compat counterpart of current_user_stack_pointer(); for most of the biarch
architectures those two are identical, but e.g. arm64 and arm use different
registers for stack pointer...

Note that amd64 variants of current_user_stack_pointer/compat_user_stack_pointer
do *not* rely on pt_regs having been through FIXUP_TOP_OF_STACK.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:40 -05:00
Al Viro 031b656698 unify SS_ONSTACK/SS_DISABLE definitions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:39 -05:00
Al Viro 5208ba24e7 missing user_stack_pointer() instances
for the architectures that have usp in pt_regs and do not have
user_stack_pointer() already defined.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:39 -05:00
Al Viro ae903caae2 Bury the conditionals from kernel_thread/kernel_execve series
All architectures have
	CONFIG_GENERIC_KERNEL_THREAD
	CONFIG_GENERIC_KERNEL_EXECVE
	__ARCH_WANT_SYS_EXECVE
None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers
of kernel_execve() (which is a trivial wrapper for do_execve() now) left.
Kill the conditionals and make both callers use do_execve().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19 18:07:38 -05:00