Commit Graph

495796 Commits

Author SHA1 Message Date
Alex Williamson
cac80d6e38 vfio-pci: Generalize setup of simple eventfds
We want another single vector IRQ index to support signaling of
the device request to userspace.  Generalize the error reporting
IRQ index to avoid code duplication.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-10 12:37:57 -07:00
Alex Williamson
13060b64b8 vfio: Add and use device request op for vfio bus drivers
When a request is made to unbind a device from a vfio bus driver,
we need to wait for the device to become unused, i.e. for userspace
to release the device.  However, we have a long-standing TODO in
the code to do something proactive to make that happen.  To enable
this, we add a request callback on the vfio bus driver struct,
which is intended to signal the user through the vfio device
interface to release the device.  Instead of passively waiting for
the device to become unused, we can now pester the user to give
it up.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-10 12:37:47 -07:00
Alex Williamson
4a68810dbb vfio: Tie IOMMU group reference to vfio group
Move the iommu_group reference from the device to the vfio_group.
This ensures that the iommu_group persists as long as the vfio_group
remains.  This can be important if all of the devices from an
iommu_group are removed, but we still have an outstanding vfio_group
reference; we can still walk the empty list of devices.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-06 15:05:06 -07:00
Alex Williamson
60720a0fc6 vfio: Add device tracking during unbind
There's a small window between the vfio bus driver calling
vfio_del_group_dev() and the device being completely unbound where
the vfio group appears to be non-viable.  This creates a race for
users like QEMU/KVM where the kvm-vfio module tries to get an
external reference to the group in order to match and release an
existing reference, while the device is potentially being removed
from the vfio bus driver.  If the group is momentarily non-viable,
kvm-vfio may not be able to release the group reference until VM
shutdown, making the group unusable until that point.

Bridge the gap between device removal from the group and completion
of the driver unbind by tracking it in a list.  The device is added
to the list before the bus driver reference is released and removed
using the existing unbind notifier.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-06 15:05:06 -07:00
Alex Williamson
c5e6688752 vfio/type1: Add conditional rescheduling
IOMMU operations can be expensive and it's not very difficult for a
user to give us a lot of work to do for a map or unmap operation.
Killing a large VM with vfio assigned devices can result in soft
lockups and IOMMU tracing shows that we can easily spend 80% of our
time with need-resched set.  A sprinkling of cond_resched() calls
after map and unmap calls has a very tiny effect on performance
while resulting in traces with <1% of calls overflowing into needs-
resched.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-06 14:19:12 -07:00
Alex Williamson
babbf17609 vfio/type1: Chunk contiguous reserved/invalid page mappings
We currently map invalid and reserved pages, such as often occur from
mapping MMIO regions of a VM through the IOMMU, using single pages.
There's really no reason we can't instead follow the methodology we
use for normal pages and find the largest possible physically
contiguous chunk for mapping.  The only difference is that we don't
do locked memory accounting for these since they're not backed by RAM.

In most applications this will be a very minor improvement, but when
graphics and GPGPU devices are in play, MMIO BARs become non-trivial.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-06 10:59:16 -07:00
Alex Williamson
6fe1010d6d vfio/type1: DMA unmap chunking
When unmapping DMA entries we try to rely on the IOMMU API behavior
that allows the IOMMU to unmap a larger area than requested, up to
the size of the original mapping.  This works great when the IOMMU
supports superpages *and* they're in use.  Otherwise, each PAGE_SIZE
increment is unmapped separately, resulting in poor performance.

Instead we can use the IOVA-to-physical-address translation provided
by the IOMMU API and unmap using the largest contiguous physical
memory chunk available, which is also how vfio/type1 would have
mapped the region.  For a synthetic 1TB guest VM mapping and shutdown
test on Intel VT-d (2M IOMMU pagesize support), this achieves about
a 30% overall improvement mapping standard 4K pages, regardless of
IOMMU superpage enabling, and about a 40% improvement mapping 2M
hugetlbfs pages when IOMMU superpages are not available.  Hugetlbfs
with IOMMU superpages enabled is effectively unchanged.
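
The chunked unmap walk described above can be sketched in userspace C.
This is only an illustrative model, not the vfio/type1 code: toy_domain,
toy_iova_to_phys(), contiguous_run() and count_unmap_calls() are
hypothetical names, and a plain array stands in for the IOMMU page
table that the real IOVA-to-physical lookup would consult.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Hypothetical stand-in for an IOMMU domain: phys[i] is the physical
 * address backing the i-th page of the mapped IOVA range. */
typedef struct {
    const uint64_t *phys;
    size_t npages;
} toy_domain;

/* Stand-in for an IOVA-to-physical lookup (page index in, address out). */
static uint64_t toy_iova_to_phys(const toy_domain *d, size_t page)
{
    assert(page < d->npages);
    return d->phys[page];
}

/* Length, in pages, of the physically contiguous run starting at start. */
static size_t contiguous_run(const toy_domain *d, size_t start)
{
    uint64_t base = toy_iova_to_phys(d, start);
    size_t len = 1;

    while (start + len < d->npages &&
           toy_iova_to_phys(d, start + len) ==
               base + (uint64_t)len * TOY_PAGE_SIZE)
        len++;
    return len;
}

/* Walk the whole range, issuing one (simulated) unmap per contiguous
 * chunk, and return how many unmap calls that takes. */
static size_t count_unmap_calls(const toy_domain *d)
{
    size_t page = 0, calls = 0;

    while (page < d->npages) {
        page += contiguous_run(d, page);
        calls++;
    }
    return calls;
}
```

With a six-page range whose backing pages form runs of three, two and
one pages, the model issues three unmap calls rather than six, which is
the effect the optimization is after when superpages are not in use.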

Unfortunately the same algorithm does not work well on IOMMUs with
fine-grained superpages, like AMD-Vi, costing about 25% extra since
the IOMMU will automatically unmap any power-of-two contiguous
mapping we've provided it.  We add a routine and a domain flag to
detect this feature, leaving AMD-Vi unaffected by this unmap
optimization.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-06 10:58:56 -07:00
Linus Torvalds
e36f014edf Linux 3.19-rc7
2015-02-01 20:07:21 -08:00
Linus Torvalds
fba7e99458 Merge tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
 "One more week's worth of fixes.  Worth pointing out here are:

   - A patch fixing detaching of iommu registrations when a device is
     removed -- earlier the ops pointer wasn't managed properly
   - Another set of Renesas boards get the same GIC setup fixup as
     others have in previous -rcs
   - Serial port aliases fixups for sunxi.  We did the same to tegra but
     we caught that in time before the merge window due to more machines
     being affected.  Here it took longer for anyone to notice.
   - A couple more DT tweaks on sunxi
   - A follow-up patch for the mvebu coherency disabling in last -rc
     batch"

* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm: dma-mapping: Set DMA IOMMU ops in arm_iommu_attach_device()
  ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds
  ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy builds
  ARM: mvebu: don't set the PL310 in I/O coherency mode when I/O coherency is disabled
  ARM: sunxi: dt: Fix aliases
  ARM: dts: sun4i: Add simplefb node with de_fe0-de_be0-lcd0-hdmi pipeline
  ARM: dts: sun6i: ippo-q8h-v5: Fix serial0 alias
  ARM: dts: sunxi: Fix usb-phy support for sun4i/sun5i
2015-02-01 13:20:47 -08:00
Linus Torvalds
3441456bfa Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input layer updates from Dmitry Torokhov:
 "Just a few quirks for PS/2 this time"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: elantech - add more Fujtisu notebooks to force crc_enabled
  Input: i8042 - add noloop quirk for Medion Akoya E7225 (MD98857)
  Input: synaptics - adjust min/max for Lenovo ThinkPad X1 Carbon 2nd
2015-02-01 13:16:40 -08:00
Linus Torvalds
00845eb968 sched: don't cause task state changes in nested sleep debugging
Commit 8eb23b9f35 ("sched: Debug nested sleeps") added code to report
on nested sleep conditions, which we generally want to avoid because the
inner sleeping operation can re-set the thread state to TASK_RUNNING,
but that will then cause the outer sleep loop to not actually sleep when it
calls schedule.

However, that's actually valid traditional behavior, with the inner
sleep being some fairly rare case (like taking a sleeping lock that
normally doesn't actually need to sleep).

And the debug code would actually change the state of the task to
TASK_RUNNING internally, which makes that kind of traditional and
working code not work at all, because now the nested sleep doesn't just
sometimes cause the outer one to not block, but will cause it to happen
every time.

In particular, it will cause the cardbus kernel daemon (pccardd) to
basically busy-loop doing scheduling, converting a laptop into a heater,
as reported by Bruno Prémont.  But there may be other legacy uses of
that nested sleep model in other drivers that are also likely to never
get converted to the new model.

This fixes both cases:

 - don't set TASK_RUNNING when the nested condition happens (note: even
   if WARN_ONCE() only _warns_ once, the return value isn't whether the
   warning happened, but whether the condition for the warning was true.
   So despite the warning only happening once, the "if (WARN_ON(..))"
   would trigger for every nested sleep.)

 - in the cases where we knowingly disable the warning by using
   "sched_annotate_sleep()", don't change the task state (that is used
   for all core scheduling decisions), instead use '->task_state_change'
   that is used for the debugging decision itself.
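
The return-value point in the first item can be illustrated with a small
userspace model.  This is a sketch under assumptions, not the kernel
macro: warn_site, toy_warn_once() and count_branch_hits() are
hypothetical names, and an explicit struct replaces the hidden static
flag a real once-only warning macro would keep per call site.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-call-site state for a once-only warning. */
typedef struct {
    bool warned;       /* has the message been printed already? */
    unsigned printed;  /* how many times the message was printed */
} warn_site;

/* Prints the warning at most once, but returns the condition itself,
 * not whether a warning was printed on this call. */
static bool toy_warn_once(warn_site *site, bool condition, const char *msg)
{
    assert(site != NULL);
    if (condition && !site->warned) {
        site->warned = true;
        site->printed++;
        fprintf(stderr, "WARNING: %s\n", msg);
    }
    return condition;
}

/* Counts how often the body of an "if (warn_once(...))" test would run
 * across a number of nested sleeps. */
static unsigned count_branch_hits(warn_site *site, unsigned nested_sleeps)
{
    unsigned hits = 0;

    for (unsigned i = 0; i < nested_sleeps; i++)
        if (toy_warn_once(site, true, "nested sleep"))
            hits++;
    return hits;
}
```

Five nested sleeps print the message once but take the branch five
times, so any side effect placed in that branch (such as resetting the
task state) is applied on every nested sleep.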

(Credit for the second part of the fix goes to Oleg Nesterov: "Can't we
avoid this subtle change in behaviour DEBUG_ATOMIC_SLEEP adds?" with the
suggested change to use 'task_state_change' as part of the test)

Reported-and-bisected-by: Bruno Prémont <bonbons@linux-vserver.org>
Tested-by: Rafael J Wysocki <rjw@rjwysocki.net>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ilya Dryomov <ilya.dryomov@inktank.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-01 12:23:32 -08:00
Rainer Koenig
47c1ffb2b6 Input: elantech - add more Fujtisu notebooks to force crc_enabled
Add two more Fujitsu LIFEBOOK models that also ship with the Elantech
touchpad and don't work with crc_disabled to the quirk list.

Signed-off-by: Rainer Koenig <Rainer.Koenig@ts.fujitsu.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2015-02-01 11:51:26 -08:00
Olof Johansson
28111dda37 Merge tag 'renesas-soc-fixes3-for-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
Merge "Third Round of Renesas ARM Based SoC Fixes for v3.19" from Simon Horman:

* Instantiate GIC from C board code in legacy builds on r8a7790 and r8a73a4

* tag 'renesas-soc-fixes3-for-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds
  ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy builds

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-02-01 08:51:12 -08:00
Linus Torvalds
788807d7ca Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "i2c driver bugfixes (s3c2410, slave-eeprom, sh_mobile), size
  regression "bugfix" (i2c slave), documentation bugfix (st).

  Also, one documentation update (da9063), so some devicetrees can now
  be verified"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: sh_mobile: terminate DMA reads properly
  i2c: Only include slave support if selected
  i2c: s3c2410: fix ABBA deadlock by keeping clock prepared
  i2c: slave-eeprom: fix boundary check when using sysfs
  i2c: st: Rename clock reference to something that exists
  DT: i2c: Add devices handled by the da9063 MFD driver
2015-01-31 10:34:25 -08:00
Linus Torvalds
2141fd0181 Merge tag 'char-misc-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
 "Here are two tiny patches, one fixing up the drivers/Kconfig file, and
  one adding a MAINTAINERS entry for the UIO git tree"

* tag 'char-misc-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  drivers/Kconfig: remove duplicate entry for soc
  MAINTAINERS: add git url entry for UIO
2015-01-30 19:49:44 -08:00
Linus Torvalds
5921dfe8dc Merge tag 'staging-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging tree fixes from Greg KH:
 "Here are two tiny staging tree fixes.  One for the nvec driver to
  resolve a reported problem, and one to add a MAINTAINERS entry for the
  Android drivers"

* tag 'staging-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  MAINTAINERS: add Android driver entries
  staging: nvec: specify a platform-device base id
2015-01-30 19:44:56 -08:00
Linus Torvalds
73dc61cb38 Merge tag 'usb-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are some small USB fixes and quirk additions for 3.19-rc7.

  All have been in linux-next for a while with no reported problems"

* tag 'usb-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: Add OTG PET device to TPL
  usb-storage/SCSI: blacklist FUA on JMicron 152d:2566 USB-SATA controller
  uas: Add no-report-opcodes quirk for Simpletech devices with id 4971:8017
  storage: Revise/fix quirk for 04E6:000F SCM USB-SCSI converter
  usb: phy: never defer probe in non-OF case
  usb: dwc2: call dwc2_is_controller_alive() under spinlock
2015-01-30 19:35:35 -08:00
Linus Torvalds
6155bc1431 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, but also an event groups fix, two PMU driver
  fixes and a CPU model variant addition"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Tighten (and fix) the grouping condition
  perf/x86/intel: Add model number for Airmont
  perf/rapl: Fix crash in rapl_scale()
  perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization
  perf probe: Fix probing kretprobes
  perf symbols: Introduce 'for' method to iterate over the symbols with a given name
  perf probe: Do not rely on map__load() filter to find symbols
  perf symbols: Introduce method to iterate symbols ordered by name
  perf symbols: Return the first entry with a given name in find_by_name method
  perf annotate: Fix memory leaks in LOCK handling
  perf annotate: Handle ins parsing failures
  perf scripting perl: Force to use stdbool
  perf evlist: Remove extraneous 'was' on error message
2015-01-30 14:34:55 -08:00
Linus Torvalds
bc208e0ee0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fix from Chris Mason:
 "We have one more fix for btrfs in my for-linus branch - this was a bug
  in the new raid5/6 scrubbing support"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  btrfs: fix raid56 scrub failed in xfstests btrfs/072
2015-01-30 14:25:52 -08:00
Linus Torvalds
92ef9ce301 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota and UDF fix from Jan Kara:
 "A fix for UDF to properly free preallocated blocks and a fix for quota
  so that Q_GETQUOTA quotactl reports correct numbers for XFS filesystem
  (and similarly Q_XGETQUOTA quotactl works properly for other
  filesystems)"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: Switch ->get_dqblk() and ->set_dqblk() to use bytes as space units
  udf: Release preallocation on last writeable close
2015-01-30 13:46:04 -08:00
Linus Torvalds
1f59fe7667 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "The ARM changes are largish, but not too scary.  And a simple fix for
  x86 (bug introduced in 3.19)"

(Paolo says these are the "Final" fixes. We'll see).

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: check LAPIC presence when building apic_map
  arm/arm64: KVM: Use kernel mapping to perform invalidation on page fault
  arm/arm64: KVM: Invalidate data cache on unmap
  arm/arm64: KVM: Use set/way op trapping to track the state of the caches
2015-01-30 10:45:24 -08:00
Linus Torvalds
f3a3404162 Merge tag 'iommu-fixes-v3.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
 "Two small fixes for the Tegra GART IOMMU driver:

   - provide a .map_sg function for iommu_ops
   - do not register Tegra GART driver as a workaround because of issues
     with it when used from DRM code"

* tag 'iommu-fixes-v3.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/tegra: gart: Provide default ->map_sg() callback
  iommu/tegra: gart: Do not register with bus
2015-01-30 10:41:26 -08:00
Linus Torvalds
c6591c8131 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull intel and dp mst drm fixes from Dave Airlie:
 "Intel had a few more fixes lined up and no point me sitting on them,
  along with a DP MST fix from Rob for a race at undock + vt switch"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm: fix fb-helper vs MST dangling connector ptrs (v2)
  drm/i915: BDW Fix Halo PCI IDs marked as ULT.
  drm/i915: Fix and clean BDW PCH identification
  drm/i915: Only fence tiled region of object.
  drm/i915: fix inconsistent brightness after resume
  drm/i915: Init PPGTT before context enable
2015-01-30 10:34:24 -08:00
Guenter Roeck
e262eb9381 arc: mm: Fix build failure
Fix misspelled define.

Fixes: 33692f2759 ("vm: add VM_FAULT_SIGSEGV handling support")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-01-30 10:31:14 -08:00
Wolfram Sang
32e224090f i2c: sh_mobile: terminate DMA reads properly
DMA read requests could miss proper termination, so two more bytes would
have been read via PIO overwriting the end of the buffer with wrong
data. Make DMA stop handling more readable while we are here.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2015-01-30 17:58:43 +01:00