Current wq_pool_mutex doesn't proctect the attrs-installation, it results
that ->unbound_attrs, ->numa_pwq_tbl[] and ->dfl_pwq can only be accessed
under wq->mutex and causes some inconveniences. Example, wq_update_unbound_numa()
has to acquire wq->mutex before fetching the wq->unbound_attrs->no_numa
and the old_pwq.
attrs-installation is a short operation, so this change will no cause any
latency for other operations which also acquire the wq_pool_mutex.
The only unprotected attrs-installation code is in apply_workqueue_attrs(),
so this patch touches code less than comments.
It is also a preparation patch for next several patches which read
wq->unbound_attrs, wq->numa_pwq_tbl[] and wq->dfl_pwq with
only wq_pool_mutex held.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Allow to modify the low-level unbound workqueues cpumask through
sysfs. This is performed by traversing the entire workqueue list
and calling apply_wqattrs_prepare() on the unbound workqueues
with the new low level mask. Only after all the preparation are done,
we commit them all together.
Ordered workqueues are ignored from the low level unbound workqueue
cpumask, it will be handled in near future.
All the (default & per-node) pwqs are mandatorily controlled by
the low level cpumask. If the user configured cpumask doesn't overlap
with the low level cpumask, the low level cpumask will be used for the
wq instead.
The comment of wq_calc_node_cpumask() is updated and explicitly
requires that its first argument should be the attrs of the default
pwq.
The default wq_unbound_cpumask is cpu_possible_mask. The workqueue
subsystem doesn't know its best default value, let the system manager
or the other subsystem set it when needed.
Changed from V8:
merge the calculating code for the attrs of the default pwq together.
minor change the code&comments for saving the user configured attrs.
remove unnecessary list_del().
minor update the comment of wq_calc_node_cpumask().
update the comment of workqueue_set_unbound_cpumask();
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Create a cpumask that limits the affinity of all unbound workqueues.
This cpumask is controlled through a file at the root of the workqueue
sysfs directory.
It works on a lower-level than the per WQ_SYSFS workqueues cpumask files
such that the effective cpumask applied for a given unbound workqueue is
the intersection of /sys/devices/virtual/workqueue/$WORKQUEUE/cpumask and
the new /sys/devices/virtual/workqueue/cpumask file.
This patch implements the basic infrastructure and the read interface.
wq_unbound_cpumask is initially set to cpu_possible_mask.
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Current apply_workqueue_attrs() includes pwqs-allocation and pwqs-installation,
so when we batch multiple apply_workqueue_attrs()s as a transaction, we can't
ensure the transaction must succeed or fail as a complete unit.
To solve this, we split apply_workqueue_attrs() into three stages.
The first stage does the preparation: allocation memory, pwqs.
The second stage does the attrs-installaion and pwqs-installation.
The third stage frees the allocated memory and (old or unused) pwqs.
As the result, batching multiple apply_workqueue_attrs()s can
succeed or fail as a complete unit:
1) batch do all the first stage for all the workqueues
2) only commit all when all the above succeed.
This patch is a preparation for the next patch ("Allow modifying low level
unbound workqueue cpumask") which will do a multiple apply_workqueue_attrs().
The patch doesn't have functionality changed except two minor adjustment:
1) free_unbound_pwq() for the error path is removed, we use the
heavier version put_pwq_unlocked() instead since the error path
is rare. this adjustment simplifies the code.
2) the memory-allocation is also moved into wq_pool_mutex.
this is needed to avoid to do the further splitting.
tj: minor updates to comments.
Suggested-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET with
SS == 0 results in an invalid usermode state in which SS is apparently
equal to __USER_DS but causes #SS if used.
Work around the issue by setting SS to __KERNEL_DS __switch_to, thus
ensuring that SYSRET never happens with SS set to NULL.
This was exposed by a recent vDSO cleanup.
Fixes: e7d6eefaaa x86/vdso32/syscall.S: Do not load __USER32_DS to %ss
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull intel drm fixes from Dave Airlie.
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: vlv: fix save/restore of GFX_MAX_REQ_COUNT reg
drm/i915: Workaround to avoid lite restore with HEAD==TAIL
drm/i915: cope with large i2c transfers
Pull intel iommu updates from David Woodhouse:
"This lays a little of the groundwork for upcoming Shared Virtual
Memory support — fixing some bogus #defines for capability bits and
adding the new ones, and starting to use the new wider page tables
where we can, in anticipation of actually filling in the new fields
therein.
It also allows graphics devices to be assigned to VM guests again.
This got broken in 3.17 by disallowing assignment of RMRR-afflicted
devices. Like USB, we do understand why there's an RMRR for graphics
devices — and unlike USB, it's actually sane. So we can make an
exception for graphics devices, just as we do USB controllers.
Finally, tone down the warning about the X2APIC_OPT_OUT bit, due to
persistent requests. X2APIC_OPT_OUT was added to the spec as a nasty
hack to allow broken BIOSes to forbid us from using X2APIC when they
do stupid and invasive things and would break if we did.
Someone noticed that since Windows doesn't have full IOMMU support for
DMA protection, setting the X2APIC_OPT_OUT bit made Windows avoid
initialising the IOMMU on the graphics unit altogether.
This means that it would be available for use in "driver mode", where
the IOMMU registers are made available through a BAR of the graphics
device and the graphics driver can do SVM all for itself.
So they started setting the X2APIC_OPT_OUT bit on *all* platforms with
SVM capabilities. And even the platforms which *might*, if the
planets had been aligned correctly, possibly have had SVM capability
but which in practice actually don't"
* git://git.infradead.org/intel-iommu:
iommu/vt-d: support extended root and context entries
iommu/vt-d: Add new extended capabilities from v2.3 VT-d specification
iommu/vt-d: Allow RMRR on graphics devices too
iommu/vt-d: Print x2apic opt out info instead of printing a warning
iommu/vt-d: kill bogus ecap_niotlb_iunits()
Pull i2c fixes from Wolfram Sang:
"This has a mixture of merge window cleanups and bugfixes"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: st: add include for pinctrl
i2c: mux: use proper dev when removing "channel-X" symlinks
i2c: digicolor: remove duplicate include
i2c: Mark adapter devices with pm_runtime_no_callbacks
i2c: pca-platform: fix broken email address
i2c: mxs: fix broken email address
i2c: rk3x: report number of messages transmitted
Pull btrfs fixes from Chris Mason:
"Filipe hit two problems in my block group cache patches. We finalized
the fixes last week and ran through more tests"
* 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: prevent list corruption during free space cache processing
Btrfs: fix inode cache writeout
three fixes for i915.
* tag 'drm-intel-next-fixes-2015-04-25' of git://anongit.freedesktop.org/drm-intel:
drm/i915: vlv: fix save/restore of GFX_MAX_REQ_COUNT reg
drm/i915: Workaround to avoid lite restore with HEAD==TAIL
drm/i915: cope with large i2c transfers
Pull NFS client updates from Trond Myklebust:
"Another set of mainly bugfixes and a couple of cleanups. No new
functionality in this round.
Highlights include:
Stable patches:
- Fix a regression in /proc/self/mountstats
- Fix the pNFS flexfiles O_DIRECT support
- Fix high load average due to callback thread sleeping
Bugfixes:
- Various patches to fix the pNFS layoutcommit support
- Do not cache pNFS deviceids unless server notifications are enabled
- Fix a SUNRPC transport reconnection regression
- make debugfs file creation failure non-fatal in SUNRPC
- Another fix for circular directory warnings on NFSv4 "junctioned"
mountpoints
- Fix locking around NFSv4.2 fallocate() support
- Truncating NFSv4 file opens should also sync O_DIRECT writes
- Prevent infinite loop in rpcrdma_ep_create()
Features:
- Various improvements to the RDMA transport code's handling of
memory registration
- Various code cleanups"
* tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (55 commits)
fs/nfs: fix new compiler warning about boolean in switch
nfs: Remove unneeded casts in nfs
NFS: Don't attempt to decode missing directory entries
Revert "nfs: replace nfs_add_stats with nfs_inc_stats when add one"
NFS: Rename idmap.c to nfs4idmap.c
NFS: Move nfs_idmap.h into fs/nfs/
NFS: Remove CONFIG_NFS_V4 checks from nfs_idmap.h
NFS: Add a stub for GETDEVICELIST
nfs: remove WARN_ON_ONCE from nfs_direct_good_bytes
nfs: fix DIO good bytes calculation
nfs: Fetch MOUNTED_ON_FILEID when updating an inode
sunrpc: make debugfs file creation failure non-fatal
nfs: fix high load average due to callback thread sleeping
NFS: Reduce time spent holding the i_mutex during fallocate()
NFS: Don't zap caches on fallocate()
xprtrdma: Make rpcrdma_{un}map_one() into inline functions
xprtrdma: Handle non-SEND completions via a callout
xprtrdma: Add "open" memreg op
xprtrdma: Add "destroy MRs" memreg op
xprtrdma: Add "reset MRs" memreg op
...
Pull fourth vfs update from Al Viro:
"d_inode() annotations from David Howells (sat in for-next since before
the beginning of merge window) + four assorted fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
RCU pathwalk breakage when running into a symlink overmounting something
fix I_DIO_WAKEUP definition
direct-io: only inc/dec inode->i_dio_count for file systems
fs/9p: fix readdir()
VFS: assorted d_backing_inode() annotations
VFS: fs/inode.c helpers: d_inode() annotations
VFS: fs/cachefiles: d_backing_inode() annotations
VFS: fs library helpers: d_inode() annotations
VFS: assorted weird filesystems: d_inode() annotations
VFS: normal filesystems (and lustre): d_inode() annotations
VFS: security/: d_inode() annotations
VFS: security/: d_backing_inode() annotations
VFS: net/: d_inode() annotations
VFS: net/unix: d_backing_inode() annotations
VFS: kernel/: d_inode() annotations
VFS: audit: d_backing_inode() annotations
VFS: Fix up some ->d_inode accesses in the chelsio driver
VFS: Cachefiles should perform fs modifications on the top layer only
VFS: AF_UNIX sockets should call mknod on the top layer only
Pull more power management and ACPI updates from Rafael Wysocki:
"These are fixes mostly (intel_pstate, ACPI core, ACPI EC driver,
cpupower tool), a new CPU ID for the Intel RAPL driver and one
intel_pstate driver improvement that didn't make it to my previous
pull requests due to timing.
Specifics:
- Fix a build warning in the intel_pstate driver showing up in
non-SMP builds (Borislav Petkov)
- Change one of the intel_pstate's P-state selection parameters for
Baytrail and Cherrytrail CPUs to significantly improve performance
at the cost of a small increase in energy consumption (Kristen
Carlson Accardi)
- Fix a NULL pointer dereference in the ACPI EC driver due to an
unsafe list walk in the query handler removal routine (Chris
Bainbridge)
- Get rid of a false-positive lockdep warning in the ACPI container
hot-remove code (Rafael J Wysocki)
- Prevent the ACPI device enumeration code from creating device
objects of a wrong type in some cases (Rafael J Wysocki)
- Add Skylake processors support to the Intel RAPL power capping
driver (Brian Bian)
- Drop the stale MAINTAINERS entry for the ACPI dock driver that is
regarded as part of the ACPI core and maintained along with it now
(Chao Yu)
- Fix cpupower tool breakage caused by a library API change in libpci
3.3.0 (Lucas Stach)"
* tag 'pm+acpi-4.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / scan: Add a scan handler for PRP0001
ACPI / scan: Annotate physical_node_lock in acpi_scan_is_offline()
ACPI / EC: fix NULL pointer dereference in acpi_ec_remove_query_handler()
MAINTAINERS: remove maintainship entry of docking station driver
powercap / RAPL: Add support for Intel Skylake processors
cpufreq: intel_pstate: Fix an annoying !CONFIG_SMP warning
intel_pstate: Change the setpoint for Atom params
cpupower: fix breakage from libpci API change
Pull crypto fixes from Herbert Xu:
"This push fixes a build problem with img-hash under non-standard
configurations and a serious regression with sha512_ssse3 which can
lead to boot failures"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: img-hash - CRYPTO_DEV_IMGTEC_HASH should depend on HAS_DMA
crypto: x86/sha512_ssse3 - fixup for asm function prototype change
Pull x86 platform driver updates from Darren Hart:
"This series includes significant updates to the toshiba_acpi driver
and the reintroduction of the dell-laptop keyboard backlight additions
I had to revert previously. Also included are various fixes for
typos, warnings, correctness, and minor bugs.
Specifics:
dell-laptop:
- add support for keyboard backlight.
toshiba_acpi:
- adaptive keyboard, hotkey, USB sleep and charge, and backlight
updates. Update sysfs documentation.
toshiba_bluetooth:
- fix enabling/disabling loop on recent devices
apple-gmux:
- lock iGP IO to protect from vgaarb changes
other:
- Fix typos, clear gcc warnings, clarify pr_* messages, correct
return types, update MAINTAINERS"
* tag 'platform-drivers-x86-v4.1-1' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86: (25 commits)
toshiba_acpi: Do not register vendor backlight when acpi_video bl is available
MAINTAINERS: Add me on list of Dell laptop drivers
platform: x86: dell-laptop: Add support for keyboard backlight
Documentation/ABI: Update sysfs-driver-toshiba_acpi entry
toshiba_acpi: Fix pr_* messages from USB Sleep Functions
toshiba_acpi: Update and fix USB Sleep and Charge modes
wmi: Use bool function return values of true/false not 1/0
toshiba_bluetooth: Fix enabling/disabling loop on recent devices
toshiba_bluetooth: Clean up *_add function and disable BT device at removal
toshiba_bluetooth: Add three new functions to the driver
toshiba_acpi: Fix the enabling of the Special Functions
toshiba_acpi: Use the Hotkey Event Type function for keymap choosing
toshiba_acpi: Add Hotkey Event Type function and definitions
x86/wmi: delete unused wmi_data_lock mutex causing gcc warning
apple-gmux: lock iGP IO to protect from vgaarb changes
MAINTAINERS: Add missing Toshiba devices and add myself as maintainer
toshiba_acpi: Update events in toshiba_acpi_notify
intel-oaktrail: Fix trivial typo in comment
thinkpad_acpi: off by one in adaptive_keyboard_hotkey_notify_hotkey()
thinkpad_acpi: signedness bugs getting current_mode
...
Pull chrome platform updates from Olof Johansson:
"Here's a set of updates to the Chrome OS platform drivers for this
merge window.
Main new things this cycle is:
- Driver changes to expose the lightbar to users. With this, you can
make your own blinkenlights on Chromebook Pixels.
- Changes in the way that the atmel_mxt trackpads are probed. The
laptop driver is trying to be smart and not instantiate the devices
that don't answer to probe. For the trackpad that can come up in
two modes (bootloader or regular), this gets complicated since the
driver already knows how to handle the two modes including the
actual addresses used. So now the laptop driver needs to know more
too, instantiating the regular address even if the bootloader one
is the probe that passed.
- mfd driver improvements by Javier Martines Canillas, and a few
bugfixes from him, kbuild and myself"
* tag 'chrome-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform:
platform/chrome: chromeos_laptop - instantiate Atmel at primary address
platform/chrome: cros_ec_lpc - Depend on X86 || COMPILE_TEST
platform/chrome: cros_ec_lpc - Include linux/io.h header file
platform/chrome: fix platform_no_drv_owner.cocci warnings
platform/chrome: cros_ec_lightbar - fix duplicate const warning
platform/chrome: cros_ec_dev - fix Unknown escape '%' warning
platform/chrome: Expose Chrome OS Lightbar to users
platform/chrome: Create sysfs attributes for the ChromeOS EC
mfd: cros_ec: Instantiate ChromeOS EC character device
platform/chrome: Add Chrome OS EC userspace device interface
platform/chrome: Add cros_ec_lpc driver for x86 devices
mfd: cros_ec: Add char dev and virtual dev pointers
mfd: cros_ec: Use fixed size arrays to transfer data with the EC
Pull arch/cris updates from Jesper Nilsson:
"Some much needed love for the CRIS-port.
There's a bunch of changes this time, giving the CRISv32 port a bit of
modern makeover with device-tree, irq domain and gpiolib support, and
more switchover to generic frameworks.
Some small fixes and removal of the theoretical SMP support brings up
the rear"
* tag 'cris-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris:
cris: fix integer overflow in ELF_ET_DYN_BASE
CRISv32: use GENERIC_SCHED_CLOCK
CRISv32: use MMIO clocksource
CRISv32: use generic clockevents
CRIS: use generic headers via Kbuild
CRIS: use generic cmpxchg.h
CRIS: use generic atomic.h
CRIS: use generic atomic bitops
CRISv10: remove redundant macros from system.h
CRIS: remove SMP code
CRISv32: don't enable irqs in INIT_THREAD
CRISv32: handle multiple signals
CRISv32: prevent bogus restarts on sigreturn
CRISv32: don't attempt syscall restart on irq exit
Add binding documentation for CRIS
CRIS: add Axis 88 board device tree
CRISv32: add device tree support
CRISv32: add irq domains support
CRIS: enable GPIOLIB
Pull powerpc fixes from Michael Ellerman:
- fix for mm_dec_nr_pmds() from Scott.
- fixes for oopses seen with KVM + THP from Aneesh.
- build fixes from Aneesh & Shreyas.
* tag 'powerpc-4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
powerpc/mm: Fix build error with CONFIG_PPC_TRANSACTIONAL_MEM disabled
powerpc/kvm: Fix ppc64_defconfig + PPC_POWERNV=n build error
powerpc/mm/thp: Return pte address if we find trans_splitting.
powerpc/mm/thp: Make page table walk safe against thp split/collapse
KVM: PPC: Remove page table walk helpers
KVM: PPC: Use READ_ONCE when dereferencing pte_t pointer
powerpc/hugetlb: Call mm_dec_nr_pmds() in hugetlb_free_pmd_range()
Pull second batch of KVM changes from Paolo Bonzini:
"This mostly includes the PPC changes for 4.1, which this time cover
Book3S HV only (debugging aids, minor performance improvements and
some cleanups). But there are also bug fixes and small cleanups for
ARM, x86 and s390.
The task_migration_notifier revert and real fix is still pending
review, but I'll send it as soon as possible after -rc1"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (29 commits)
KVM: arm/arm64: check IRQ number on userland injection
KVM: arm: irqfd: fix value returned by kvm_irq_map_gsi
KVM: VMX: Preserve host CR4.MCE value while in guest mode.
KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8
KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C
KVM: PPC: Book3S HV: Streamline guest entry and exit
KVM: PPC: Book3S HV: Use bitmap of active threads rather than count
KVM: PPC: Book3S HV: Use decrementer to wake napping threads
KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI
KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken
KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu
KVM: PPC: Book3S HV: Minor cleanups
KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update
KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT
KVM: PPC: Book3S HV: Add ICP real mode counters
KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode
KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock
KVM: PPC: Book3S HV: Add guest->host real mode completion counters
KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte
...
The new Atmel MXT driver expects i2c client's address contain the
primary (main address) of the chip, and calculates the expected
bootloader address form the primary address. Unfortunately chrome_laptop
does probe the devices and if touchpad (or touchscreen, or both) comes
up in bootloader mode the i2c device gets instantiated with the
bootloader address which confuses the driver.
To work around this issue let's probe the primary address first. If the
device is not detected at the primary address we'll probe alternative
addresses as "dummy" devices. If any of them are found, destroy the
dummy client and instantiate client with proper name at primary address
still.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Olof Johansson <olof@lixom.net>