This commit implements an SRCU state machine in support of call_srcu().
The state machine is preemptible, light-weight, and single-threaded,
minimizing synchronization overhead. In particular, there is no longer
any need for synchronize_srcu() to be guarded by a mutex.
Expedited processing is handled, at least in the absence of concurrent
grace-period operations on that same srcu_struct structure, by having
the synchronize_srcu_expedited() thread take on the role of the
workqueue thread for one iteration.
There is a reasonable probability that a given SRCU callback will
be invoked on the same CPU that registered it, however, there is no
guarantee. Concurrent SRCU grace-period primitives can cause callbacks
to be executed elsewhere, even in absence of CPU-hotplug operations.
Callbacks execute in process context, but under the influence of
local_bh_disable(), so it is illegal to sleep in an SRCU callback
function.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The earlier algorithm used an "expedited" flag combined with a "trycount"
counter to differentiate between normal and expedited SRCU grace periods.
However, the difference can be encoded into a single counter with a cutoff
value and different initial values for expedited and normal SRCU grace
periods. This commit makes that change.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Conflicts:
kernel/srcu.c
Expand the calls to srcu_readers_active_idx() from srcu_readers_active()
inline. This change improves cache locality by interating over the CPUs
once rather than twice.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The old srcu_barrier() macro is now unused. This commit removes it so
that it may be used for the SRCU flavor of rcu_barrier(), which will in
turn be needed to allow the upcoming call_srcu() to be used from within
modules.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit implements a variant of Peter's algorithm, which may be found
at https://lkml.org/lkml/2012/2/1/119.
o Make the checking lock-free to enable parallel checking.
Parallel checking is required when (1) the original checking
task is preempted for a long time, (2) sychronize_srcu_expedited()
starts during an ongoing SRCU grace period, or (3) we wish to
avoid acquiring a lock.
o Since the checking is lock-free, we avoid a mutex in state machine
for call_srcu().
o Remove the SRCU_REF_MASK and remove the coupling with the flipping.
This might allow us to remove the preempt_disable() in future
versions, though such removal will need great care because it
rescinds the one-old-reader-per-CPU guarantee.
o Remove a smp_mb(), simplify the comments and make the smp_mb() pairs
more intuitive.
Inspired-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The safety of SRCU is provided byy wait_idx() rather than flipping.
The flipping actually prevents starvation.
This commit therefore updates the comments to more accurately and
precisely describe what is going on.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This is an optimization of the SRCU grace period. To guard against
preempted readers with old values of the counter, it suffices to scan the
old counters once more, then flip ->completed only one time. The reason
this works is that the old readers must have incremented the old set of
counters (if they have not yet incremented, then their critical section
starts after this grace period, so they may be safely ignored).
This commit therefore optimizes the second flip out in favor of a simple
rescan.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The purpose of the upper bit of SRCU's per-CPU counters is to guarantee
that no reasonable series of srcu_read_lock() and srcu_read_unlock()
operations can return the value of the counter to its original value.
This guarantee is require only after the index has been switched to
the other set of counters, so at most one srcu_read_lock() can affect
a given CPU's counter. The number of srcu_read_unlock() operations
on a given counter is limited to the number of tasks in the system,
which given the Linux kernel's current structure is limited to far less
than 2^30 on 32-bit systems and far less than 2^62 on 64-bit systems.
(Something about a limited number of bytes in the kernel's address space.)
Therefore, if srcu_read_lock() increments the upper bits, then
srcu_read_unlock() need not do so. In this case, an srcu_read_lock() and
an srcu_read_unlock() will flip the lower bit of the upper field of the
counter. An unreasonably large additional number of srcu_read_unlock()
operations would be required to return the counter to its initial value,
thus preserving the guarantee.
This commit takes this approach, which further allows it to shrink
the size of the upper field to one bit, making the number of
srcu_read_unlock() operations required to return the counter to its
initial value even more unreasonable than before.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The fastpath in __synchronize_srcu() is designed to handle cases where
there are a large number of concurrent calls for the same srcu_struct
structure. However, the Linux kernel currently does not use SRCU in
this manner, so remove the fastpath checks for simplicity.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The current implementation of synchronize_srcu_expedited() can cause
severe OS jitter due to its use of synchronize_sched(), which in turn
invokes try_stop_cpus(), which causes each CPU to be sent an IPI.
This can result in severe performance degradation for real-time workloads
and especially for short-interation-length HPC workloads. Furthermore,
because only one instance of try_stop_cpus() can be making forward progress
at a given time, only one instance of synchronize_srcu_expedited() can
make forward progress at a time, even if they are all operating on
distinct srcu_struct structures.
This commit, inspired by an earlier implementation by Peter Zijlstra
(https://lkml.org/lkml/2012/1/31/211) and by further offline discussions,
takes a strictly algorithmic bits-in-memory approach. This has the
disadvantage of requiring one explicit memory-barrier instruction in
each of srcu_read_lock() and srcu_read_unlock(), but on the other hand
completely dispenses with OS jitter and furthermore allows SRCU to be
used freely by CPUs that RCU believes to be idle or offline.
The update-side implementation handles the single read-side memory
barrier by rechecking the per-CPU counters after summing them and
by running through the update-side state machine twice.
This implementation has passed moderate rcutorture testing on both
x86 and Power. Also updated to use this_cpu_ptr() instead of per_cpu_ptr(),
as suggested by Peter Zijlstra.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Although rcutorture does invoke rcu_barrier() and friends, it cannot
really be called a torture test given that it invokes them only once
at the end of the test. This commit therefore introduces heavy-duty
rcutorture testing for rcu_barrier(), which may be carried out
concurrently with normal rcutorture testing.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The rcutorture initialization code ignored the error returns from
rcu_torture_onoff_init() and rcu_torture_stall_init(). The rcutorture
cleanup code failed to NULL out a number of pointers. These bugs will
normally have no effect, but this commit fixes them nevertheless.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Pull MMC fixes from Chris Ball:
- Build fix for omap_hsmmc with OF against 3.4-rc1.
- Fix CONFIG_MMC_UNSAFE_RESUME semantics regression against 3.3, which
broke hotplug card detection when UNSAFE_RESUME is set.
- Fix a race condition in omap_hsmmc with runtime PM.
- Fix two libertas SDIO-powered-resume regressions.
- Small fixes for discard/sanitize, dw_mmc, cd-gpio and esdhc-imx.
* tag 'mmc-fixes-for-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
mmc: core: Do not pre-claim host in suspend
mmc: dw_mmc: prevent NULL dereference for dma_ops
mmc: unbreak sdhci-esdhc-imx on i.MX25
mmc: cd-gpio: Include header to pickup exported symbol prototypes
mmc: sdhci: refine non-removable card checking for card detection
mmc: dw_mmc: Fix switch from DMA to PIO
mmc: remove MMC bus legacy suspend/resume method
mmc: omap_hsmmc: Get rid of of_have_populated_dt() usage
mmc: omap_hsmmc: build fix for CONFIG_OF=y and CONFIG_MMC_OMAP_HS=m
mmc: fixes for eMMC v4.5 sanitize operation
mmc: fixes for eMMC v4.5 discard operation
Pull media fixes from Mauro Carvalho Chehab:
- Fixes a regression at DVB core when switching from DVB-S2 to DVB-S on
Kaffeine (Fedora 16 Bugzilla #812895);
- Fixes a mutex unlock at an error condition at drx-k;
- Fix winbond-cir set mode;
- mt9m032: Fix a compilation breakage with some random Kconfig;
- mt9m032: fix two dead locks;
- xc5000: don't require an special firmware (that won't be provided by
the vendor) just because the xtal frequency is different;
- V4L DocBook: fix some typos at multi-plane formats description.
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] xc5000: support 32MHz & 31.875MHz xtal using the 41.024.5 firmware
[media] V4L: mt9m032: fix compilation breakage
[media] V4L: DocBook: Fix typos in the multi-plane formats description
[media] V4L: mt9m032: fix two dead-locks
[media] rc-core: set mode for winbond-cir
[media] drxk: Does not unlock mutex if sanity check failed in scu_command()
[media] dvb_frontend: Fix a regression when switching back to DVB-S
Pull MFD fixes from Samuel Ortiz:
"We have 3 build fixes, a OMAP USB host PHY reset fix and the twl6040
conversion to an i2c driver. The latter may not sound like a fix but
the twl6040 MFD driver won't probe without it, triggering an OMAP4
audio regression."
* tag 'mfd-for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
mfd: Fix modular builds of rc5t583 regulator support
mfd: Fix asic3_gpio_to_irq
ARM: OMAP3: USB: Fix the EHCI ULPI PHY reset issue
mfd: Convert twl6040 to i2c driver, and separate it from twl core
mfd : Fix dbx500 compilation error
pfm_vm_munmap() is simply vm_munmap() and pfm_remove_smpl_mapping()
always get current as the first argument.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
... since exit_mmap() is coming and it will munmap() everything anyway.
In all other cases aio_free_ring() has ctx->mm == current->mm; moreover,
all other callers of vm_munmap() have mm == current->mm, so this will
allow us to get rid of mm argument of vm_munmap().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Since SDIO drivers may want to do some SDIO operations in their suspend
callback functions, we must not keep the host claimed when calling them.
Daniel Drake reported that libertas_sdio encountered a deadlock in its
suspend function.
Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com>
Tested-by: Daniel Drake <dsd@laptop.org>
[stable@: please apply to 3.2-stable and 3.3-stable]
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Now, dma_ops is assumed that use the IDMAC. But if dma_ops is assigned
the pdata->dma_ops, we didn't ensure that callback function is defined.
If the callback isn't defined, then we should run in PIO mode.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
This was broken by me in 37865fe915
("mmc: sdhci-esdhc-imx: fix timeout on i.MX's sdhci") where more
extensive tests would have shown that read or write of data to the
card were failing (even if the partition table was correctly read).
Signed-off-by: Eric Bénard <eric@eukrea.com>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Include the linux/mmc/cd-gpio.h header to pickup the prototypes
for the two exported symbols.
This quiets the sparse warnings:
warning: symbol 'mmc_cd_gpio_request' was not declared. Should it be static?
warning: symbol 'mmc_cd_gpio_free' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Venkatraman S <svenkatr@ti.com>
Signed-off-by: Chris Ball <cjb@laptop.org>