On PREEMPT_RT, seqlock_t is transformed to a sleeping lock that do not
disable preemption. A seqlock_t reader can thus preempt its write side
section and spin for the enter scheduler tick. If that reader belongs to
a real-time scheduling class, it can spin forever and the kernel will
livelock.
To break this livelock possibility on PREEMPT_RT, implement seqlock_t in
terms of "seqcount_spinlock_t" instead of plain "seqcount_t".
Beside its pure annotational value, this will leverage the existing
seqcount_LOCKNAME_T PREEMPT_RT anti-livelock mechanisms, without adding
any extra code.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200904153231.11994-6-a.darwish@linutronix.de
Preemption must be disabled before entering a sequence counter write
side critical section. Otherwise the read side section can preempt the
write side section and spin for the entire scheduler tick. If that
reader belongs to a real-time scheduling class, it can spin forever and
the kernel will livelock.
Disabling preemption cannot be done for PREEMPT_RT though: it can lead
to higher latencies, and the write side sections will not be able to
acquire locks which become sleeping locks (e.g. spinlock_t).
To remain preemptible, while avoiding a possible livelock caused by the
reader preempting the writer, use a different technique: let the reader
detect if a seqcount_LOCKNAME_t writer is in progress. If that's the
case, acquire then release the associated LOCKNAME writer serialization
lock. This will allow any possibly-preempted writer to make progress
until the end of its writer serialization lock critical section.
Implement this lock-unlock technique for all seqcount_LOCKNAME_t with
an associated (PREEMPT_RT) sleeping lock.
References: 55f3560df9 ("seqlock: Extend seqcount API with associated locks")
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200519214547.352050-1-a.darwish@linutronix.de
The sequence counters read APIs are implemented as CPP macros, so they
can take either seqcount_t or any of the seqcount_LOCKNAME_t variants.
Such macros then get *directly* transformed to internal C functions that
only take plain seqcount_t.
Further commits need access to seqcount_LOCKNAME_t inside of the actual
read APIs code. Thus transform all of the seqcount read APIs to pure GCC
statement expressions instead.
This will not break type-safety: all of the transformed APIs resolve to
a _Generic() selection that does not have a "default" case.
This will also not affect the transformed APIs readability: previously
added kernel-doc above all of seqlock.h functions makes the expectations
quite clear for call-site developers.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200904153231.11994-4-a.darwish@linutronix.de
At seqlock.h, the following set of functions:
- __seqcount_ptr()
- __seqcount_preemptible()
- __seqcount_assert()
act as plain seqcount_t "property" accessors. Meanwhile, the following
group:
- __seqcount_ptr()
- __seqcount_lock_preemptible()
- __seqcount_assert_lock_held()
act as the equivalent set, but in the generic form, taking either
seqcount_t or any of the seqcount_LOCKNAME_t variants.
This is quite confusing, especially the first member where it is called
exactly the same in both groups.
Differentiate the first group by using "__seqprop" as prefix, and also
use that same prefix for all of seqcount_LOCKNAME_t property accessors.
While at it, constify the property accessors first parameter when
appropriate.
References: 55f3560df9 ("seqlock: Extend seqcount API with associated locks")
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200904153231.11994-3-a.darwish@linutronix.de
At seqlock.h, sequence counters with associated locks are either called
seqcount_LOCKNAME_t, seqcount_LOCKTYPE_t, or seqcount_locktype_t.
Standardize on seqcount_LOCKNAME_t for all instances in comments,
kernel-doc, and SEQCOUNT_LOCKNAME() generative macro paramters.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200904153231.11994-2-a.darwish@linutronix.de
Latch sequence counters are a multiversion concurrency control mechanism
where the seqcount_t counter even/odd value is used to switch between
two copies of protected data. This allows the seqcount_t read path to
safely interrupt its write side critical section (e.g. from NMIs).
Initially, latch sequence counters were implemented as a single write
function above plain seqcount_t: raw_write_seqcount_latch(). The read
side was expected to use plain seqcount_t raw_read_seqcount().
A specialized latch read function, raw_read_seqcount_latch(), was later
added. It became the standardized way for latch read paths. Due to the
dependent load, it has one read memory barrier less than the plain
seqcount_t raw_read_seqcount() API.
Only raw_write_seqcount_latch() and raw_read_seqcount_latch() should be
used with latch sequence counters. Having *unique* read and write path
APIs means that latch sequence counters are actually a data type of
their own -- just inappropriately overloading plain seqcount_t.
Introduce seqcount_latch_t. This adds type-safety and ensures that only
the correct latch-safe APIs are to be used.
Not to break bisection, let the latch APIs also accept plain seqcount_t
or seqcount_raw_spinlock_t. After converting all call sites to
seqcount_latch_t, only that new data type will be allowed.
References: 9b0fd802e8 ("seqcount: Add raw_write_seqcount_latch()")
References: 7fc26327b7 ("seqlock: Introduce raw_read_seqcount_latch()")
References: aadd6e5caa ("time/sched_clock: Use raw_read_seqcount_latch()")
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200827114044.11173-4-a.darwish@linutronix.de
Fix kernel-doc warnings in <linux/seqlock.h>.
../include/linux/seqlock.h:152: warning: Incorrect use of kernel-doc format: * seqcount_LOCKNAME_init() - runtime initializer for seqcount_LOCKNAME_t
../include/linux/seqlock.h:164: warning: Incorrect use of kernel-doc format: * SEQCOUNT_LOCKTYPE() - Instantiate seqcount_LOCKNAME_t and helpers
../include/linux/seqlock.h:229: warning: Function parameter or member 'seq_name' not described in 'SEQCOUNT_LOCKTYPE_ZERO'
../include/linux/seqlock.h:229: warning: Function parameter or member 'assoc_lock' not described in 'SEQCOUNT_LOCKTYPE_ZERO'
../include/linux/seqlock.h:229: warning: Excess function parameter 'name' description in 'SEQCOUNT_LOCKTYPE_ZERO'
../include/linux/seqlock.h:229: warning: Excess function parameter 'lock' description in 'SEQCOUNT_LOCKTYPE_ZERO'
../include/linux/seqlock.h:695: warning: duplicate section name 'NOTE'
Demote kernel-doc notation for the macros "seqcount_LOCKNAME_init()" and
"SEQCOUNT_LOCKTYPE()"; scripts/kernel-doc does not handle them correctly.
Rename function parameters in SEQCNT_LOCKNAME_ZERO() documentation
to match the macro's argument names. Change the macro name in the
documentation to SEQCOUNT_LOCKTYPE_ZERO() to match the macro's name.
For raw_write_seqcount_latch(), rename the second NOTE: to NOTE2:
to prevent a kernel-doc warning. However, the generated output is not
quite as nice as it could be for this.
Fix a typo: s/LOCKTYPR/LOCKTYPE/
Fixes: 0efc94c5d1 ("seqcount: Compress SEQCNT_LOCKNAME_ZERO()")
Fixes: e4e9ab3f9f ("seqlock: Fold seqcount_LOCKNAME_init() definition")
Fixes: a8772dccb2 ("seqlock: Fold seqcount_LOCKNAME_t definition")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200817000200.20993-1-rdunlap@infradead.org
__SEQ_LOCKDEP() is an expression gate for the
seqcount_LOCKNAME_t::lock member. Rename it to be about the member,
not the gate condition.
Later (PREEMPT_RT) patches will make the member available for !LOCKDEP
configs.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the write side critical section.
There is no built-in debugging mechanism to verify that the lock used
for writer serialization is held and preemption is disabled. Some usage
sites like dma-buf have explicit lockdep checks for the writer-side
lock, but this covers only a small portion of the sequence counter usage
in the kernel.
Add new sequence counter types which allows to associate a lock to the
sequence counter at initialization time. The seqcount API functions are
extended to provide appropriate lockdep assertions depending on the
seqcount/lock type.
For sequence counters with associated locks that do not implicitly
disable preemption, preemption protection is enforced in the sequence
counter write side functions. This removes the need to explicitly add
preempt_disable/enable() around the write side critical sections: the
write_begin/end() functions for these new sequence counter types
automatically do this.
Introduce the following seqcount types with associated locks:
seqcount_spinlock_t
seqcount_raw_spinlock_t
seqcount_rwlock_t
seqcount_mutex_t
seqcount_ww_mutex_t
Extend the seqcount read and write functions to branch out to the
specific seqcount_LOCKTYPE_t implementation at compile-time. This avoids
kernel API explosion per each new seqcount_LOCKTYPE_t added. Add such
compile-time type detection logic into a new, internal, seqlock header.
Document the proper seqcount_LOCKTYPE_t usage, and rationale, at
Documentation/locking/seqlock.rst.
If lockdep is disabled, this lock association is compiled out and has
neither storage size nor runtime overhead.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-10-a.darwish@linutronix.de
Preemption must be disabled before entering a sequence count write side
critical section. Failing to do so, the seqcount read side can preempt
the write side section and spin for the entire scheduler tick. If that
reader belongs to a real-time scheduling class, it can spin forever and
the kernel will livelock.
Assert through lockdep that preemption is disabled for seqcount writers.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-9-a.darwish@linutronix.de
raw_seqcount_begin() has the same code as raw_read_seqcount(), with the
exception of masking the sequence counter's LSB before returning it to
the caller.
Note, raw_seqcount_begin() masks the counter's LSB before returning it
to the caller so that read_seqcount_retry() can fail if the counter is
odd -- without the overhead of an extra branching instruction.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-7-a.darwish@linutronix.de
The seqlock.h seqcount_t and seqlock_t API definitions are presented in
the chronological order of their development rather than the order that
makes most sense to readers. This makes it hard to follow and understand
the header file code.
Group and reorder all of the exported seqlock.h functions according to
their function.
First, group together the seqcount_t standard read path functions:
- __read_seqcount_begin()
- raw_read_seqcount_begin()
- read_seqcount_begin()
since each function is implemented exactly in terms of the one above
it. Then, group the special-case seqcount_t readers on their own as:
- raw_read_seqcount()
- raw_seqcount_begin()
since the only difference between the two functions is that the second
one masks the sequence counter LSB while the first one does not. Note
that raw_seqcount_begin() can actually be implemented in terms of
raw_read_seqcount(), which will be done in a follow-up commit.
Then, group the seqcount_t write path functions, instead of injecting
unrelated seqcount_t latch functions between them, and order them as:
- raw_write_seqcount_begin()
- raw_write_seqcount_end()
- write_seqcount_begin_nested()
- write_seqcount_begin()
- write_seqcount_end()
- raw_write_seqcount_barrier()
- write_seqcount_invalidate()
which is the expected natural order. This also isolates the seqcount_t
latch functions into their own area, at the end of the sequence counters
section, and before jumping to the next one: sequential locks
(seqlock_t).
Do a similar grouping and reordering for seqlock_t "locking" readers vs.
the "conditionally locking or lockless" ones.
No implementation code was changed in any of the reordering above.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-5-a.darwish@linutronix.de
The seqcount_t latch reader example at the raw_write_seqcount_latch()
kernel-doc comment ends the latch read section with a manual smp memory
barrier and sequence counter comparison.
This is technically correct, but it is suboptimal: read_seqcount_retry()
already contains the same logic of an smp memory barrier and sequence
counter comparison.
End the latch read critical section example with read_seqcount_retry().
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-4-a.darwish@linutronix.de