Commit Graph

428 Commits

Author SHA1 Message Date
Peter Zijlstra ae813308f4 locking/lockdep: Avoid creating redundant links
Two boots + a make defconfig, the first didn't have the redundant bit
in, the second did:

 lock-classes:                         1168       1169 [max: 8191]
 direct dependencies:                  7688       5812 [max: 32768]
 indirect dependencies:               25492      25937
 all direct dependencies:            220113     217512
 dependency chains:                    9005       9008 [max: 65536]
 dependency chain hlocks:             34450      34366 [max: 327680]
 in-hardirq chains:                      55         51
 in-softirq chains:                     371        378
 in-process chains:                    8579       8579
 stack-trace entries:                108073      88474 [max: 524288]
 combined max dependencies:       178738560  169094640

 max locking depth:                      15         15
 max bfs queue depth:                   320        329

 cyclic checks:                        9123       9190

 redundant checks:                                5046
 redundant links:                                 1828

 find-mask forwards checks:            2564       2599
 find-mask backwards checks:          39521      39789

So it saves nearly 2k links and a fair chunk of stack-trace entries, but
as expected, makes no real difference on the indirect dependencies.

At the same time, you see the max BFS depth increase, which is also
expected, although it could easily be boot variance -- these numbers are
not entirely stable between boots.

The down side is that the cycles in the graph become larger and thus
the reports harder to read.

XXX: do we want this as a CONFIG variable, implied by LOCKDEP_SMALL?

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Nikolay Borisov <nborisov@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: boqun.feng@gmail.com
Cc: iamjoonsoo.kim@lge.com
Cc: kernel-team@lge.com
Cc: kirill@shutemov.name
Cc: npiggin@gmail.com
Cc: walken@google.com
Link: http://lkml.kernel.org/r/20170303091338.GH6536@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:29:04 +02:00
Peter Zijlstra d92a8cfcb3 locking/lockdep: Rework FS_RECLAIM annotation
A while ago someone, and I cannot find the email just now, asked if we
could not implement the RECLAIM_FS inversion stuff with a 'fake' lock
like we use for other things like workqueues etc. I think this should
be possible which allows reducing the 'irq' states and will reduce the
amount of __bfs() lookups we do.

Removing the 1 IRQ state results in 4 less __bfs() walks per
dependency, improving lockdep performance. And by moving this
annotation out of the lockdep code it becomes easier for the mm people
to extend.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Nikolay Borisov <nborisov@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: boqun.feng@gmail.com
Cc: iamjoonsoo.kim@lge.com
Cc: kernel-team@lge.com
Cc: kirill@shutemov.name
Cc: npiggin@gmail.com
Cc: walken@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:29:03 +02:00
Kirill Tkhai 83ced169d9 locking/rwsem-xadd: Add killable versions of rwsem_down_read_failed()
Rename rwsem_down_read_failed() in __rwsem_down_read_failed_common()
and teach it to abort waiting in case of pending signals and killable
state argument passed.

Note, that we shouldn't wake anybody up in EINTR path, as:

We check for (waiter.task) under spinlock before we go to out_nolock
path. Current task wasn't able to be woken up, so there are
a writer, owning the sem, or a writer, which is the first waiter.
In the both cases we shouldn't wake anybody. If there is a writer,
owning the sem, and we were the only waiter, remove RWSEM_WAITING_BIAS,
as there are no waiters anymore.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: arnd@arndb.de
Cc: avagin@virtuozzo.com
Cc: davem@davemloft.net
Cc: fenghua.yu@intel.com
Cc: gorcunov@virtuozzo.com
Cc: heiko.carstens@de.ibm.com
Cc: hpa@zytor.com
Cc: ink@jurassic.park.msu.ru
Cc: mattst88@gmail.com
Cc: rth@twiddle.net
Cc: schwidefsky@de.ibm.com
Cc: tony.luck@intel.com
Link: http://lkml.kernel.org/r/149789534632.9059.2901382369609922565.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:28:55 +02:00
Kirill Tkhai 0aa1125fa8 locking/rwsem-spinlock: Add killable versions of __down_read()
Rename __down_read() in __down_read_common() and teach it
to abort waiting in case of pending signals and killable
state argument passed.

Note, that we shouldn't wake anybody up in EINTR path, as:

We check for signal_pending_state() after (!waiter.task)
test and under spinlock. So, current task wasn't able to
be woken up. It may be in two cases: a writer is owner
of the sem, or a writer is a first waiter of the sem.

If a writer is owner of the sem, no one else may work
with it in parallel. It will wake somebody, when it
call up_write() or downgrade_write().

If a writer is the first waiter, it will be woken up,
when the last active reader releases the sem, and
sem->count became 0.

Also note, that set_current_state() may be moved down
to schedule() (after !waiter.task check), as all
assignments in this type of semaphore (including wake_up),
occur under spinlock, so we can't miss anything.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: arnd@arndb.de
Cc: avagin@virtuozzo.com
Cc: davem@davemloft.net
Cc: fenghua.yu@intel.com
Cc: gorcunov@virtuozzo.com
Cc: heiko.carstens@de.ibm.com
Cc: hpa@zytor.com
Cc: ink@jurassic.park.msu.ru
Cc: mattst88@gmail.com
Cc: rth@twiddle.net
Cc: schwidefsky@de.ibm.com
Cc: tony.luck@intel.com
Link: http://lkml.kernel.org/r/149789533283.9059.9829416940494747182.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:28:55 +02:00
Prateek Sood 50972fe78f locking/osq_lock: Fix osq_lock queue corruption
Fix ordering of link creation between node->prev and prev->next in
osq_lock(). A case in which the status of optimistic spin queue is
CPU6->CPU2 in which CPU6 has acquired the lock.

        tail
          v
  ,-. <- ,-.
  |6|    |2|
  `-' -> `-'

At this point if CPU0 comes in to acquire osq_lock, it will update the
tail count.

  CPU2			CPU0
  ----------------------------------

				       tail
				         v
			  ,-. <- ,-.    ,-.
			  |6|    |2|    |0|
			  `-' -> `-'    `-'

After tail count update if CPU2 starts to unqueue itself from
optimistic spin queue, it will find an updated tail count with CPU0 and
update CPU2 node->next to NULL in osq_wait_next().

  unqueue-A

	       tail
	         v
  ,-. <- ,-.    ,-.
  |6|    |2|    |0|
  `-'    `-'    `-'

  unqueue-B

  ->tail != curr && !node->next

If reordering of following stores happen then prev->next where prev
being CPU2 would be updated to point to CPU0 node:

				       tail
				         v
			  ,-. <- ,-.    ,-.
			  |6|    |2|    |0|
			  `-'    `-' -> `-'

  osq_wait_next()
    node->next <- 0
    xchg(node->next, NULL)

	       tail
	         v
  ,-. <- ,-.    ,-.
  |6|    |2|    |0|
  `-'    `-'    `-'

  unqueue-C

At this point if next instruction
	WRITE_ONCE(next->prev, prev);
in CPU2 path is committed before the update of CPU0 node->prev = prev then
CPU0 node->prev will point to CPU6 node.

	       tail
    v----------. v
  ,-. <- ,-.    ,-.
  |6|    |2|    |0|
  `-'    `-'    `-'
     `----------^

At this point if CPU0 path's node->prev = prev is committed resulting
in change of CPU0 prev back to CPU2 node. CPU2 node->next is NULL
currently,

				       tail
			                 v
			  ,-. <- ,-. <- ,-.
			  |6|    |2|    |0|
			  `-'    `-'    `-'
			     `----------^

so if CPU0 gets into unqueue path of osq_lock it will keep spinning
in infinite loop as condition prev->next == node will never be true.

Signed-off-by: Prateek Sood <prsood@codeaurora.org>
[ Added pictures, rewrote comments. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: sramana@codeaurora.org
Link: http://lkml.kernel.org/r/1500040076-27626-1-git-send-email-prsood@codeaurora.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:28:54 +02:00
Nicolas Pitre bc2eecd7ec futex: Allow for compiling out PI support
This makes it possible to preserve basic futex support and compile out the
PI support when RT mutexes are not available.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Darren Hart <dvhart@infradead.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1708010024190.5981@knanqh.ubzr
2017-08-01 14:36:35 +02:00
Linus Torvalds 8b810a3a35 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixlet from Ingo Molnar:
 "Remove an unnecessary priority adjustment in the rtmutex code"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rtmutex: Remove unnecessary priority adjustment
2017-07-21 11:11:23 -07:00
Alex Shi 69f0d429c4 locking/rtmutex: Remove unnecessary priority adjustment
We don't need to adjust priority before adding a new pi_waiter, the
priority only needs to be updated after pi_waiter change or task
priority change.

Steven Rostedt pointed out:

  "Interesting, I did some git mining and this was added with the original
   entry of the rtmutex.c (23f78d4a03). Looking at even that version, I
   don't see the purpose of adjusting the task prio here. It is done
   before anything changes in the task."

Signed-off-by: Alex Shi <alex.shi@linaro.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1499926704-28841-1-git-send-email-alex.shi@linaro.org
[ Enhance the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-13 11:44:06 +02:00
Linus Torvalds c8b2ba83fb Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:

 - Fix the EINTR logic in rwsem-spinlock to avoid double locking by a
   writer and a reader

 - Add a missing include to qspinlocks

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/qspinlock: Explicitly include asm/prefetch.h
  locking/rwsem-spinlock: Fix EINTR branch in __down_write_common()
2017-07-09 10:47:50 -07:00
Linus Torvalds fe1b518075 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next
Pull sparc updates from David Miller:

 1) Queued spinlocks and rwlocks for sparc64, from Babu Moger.

 2) Some const'ification from Arvind Yadav.

 3) LDC/VIO driver infrastructure changes to facilitate future upcoming
    drivers, from Jag Raman.

 4) Initialize sched_clock() et al. early so that the initial printk
    timestamps are all done while the implementation is available and
    functioning. From Pavel Tatashin.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next: (38 commits)
  sparc: kernel: pmc: make of_device_ids const.
  sparc64: fix typo in property
  sparc64: add port_id to VIO device metadata
  sparc64: Enhance search for VIO device in MDESC
  sparc64: enhance VIO device probing
  sparc64: check if a client is allowed to register for MDESC notifications
  sparc64: remove restriction on VIO device name size
  sparc64: refactor code to obtain cfg_handle property from MDESC
  sparc64: add MDESC node name property to VIO device metadata
  sparc64: mdesc: use __GFP_REPEAT action modifier for VM allocation
  sparc64: expand MDESC interface
  sparc64: skip handshake for LDC channels in RAW mode
  sparc64: specify the device class in VIO version info. packet
  sparc64: ensure VIO operations are defined while being used
  sparc: kernel: apc: make of_device_ids const
  sparc/time: make of_device_ids const
  sparc64: broken %tick frequency on spitfire cpus
  sparc64: use prom interface to get %stick frequency
  sparc64: optimize functions that access tick
  sparc64: add hot-patched and inlined get_tick()
  ...
2017-07-08 12:14:14 -07:00
Stafford Horne 5671360f29 locking/qspinlock: Explicitly include asm/prefetch.h
In architectures that use qspinlock, like x86, prefetch is loaded
indirectly via the asm/qspinlock.h include.  On other architectures, like
OpenRISC, which may want to use asm-generic/qspinlock.h the built will
fail without the asm/prefetch.h include.

Fix this by including directly.

Signed-off-by: Stafford Horne <shorne@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170707195658.23840-1-shorne@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-08 11:01:11 +02:00
Pavel Tatashin 3d375d7859 mm: update callers to use HASH_ZERO flag
Update dcache, inode, pid, mountpoint, and mount hash tables to use
HASH_ZERO, and remove initialization after allocations.  In case of
places where HASH_EARLY was used such as in __pv_init_lock_hash the
zeroed hash table was already assumed, because memblock zeroes the
memory.

CPU: SPARC M6, Memory: 7T
Before fix:
  Dentry cache hash table entries: 1073741824
  Inode-cache hash table entries: 536870912
  Mount-cache hash table entries: 16777216
  Mountpoint-cache hash table entries: 16777216
  ftrace: allocating 20414 entries in 40 pages
  Total time: 11.798s

After fix:
  Dentry cache hash table entries: 1073741824
  Inode-cache hash table entries: 536870912
  Mount-cache hash table entries: 16777216
  Mountpoint-cache hash table entries: 16777216
  ftrace: allocating 20414 entries in 40 pages
  Total time: 3.198s

CPU: Intel Xeon E5-2630, Memory: 2.2T:
Before fix:
  Dentry cache hash table entries: 536870912
  Inode-cache hash table entries: 268435456
  Mount-cache hash table entries: 8388608
  Mountpoint-cache hash table entries: 8388608
  CPU: Physical Processor ID: 0
  Total time: 3.245s

After fix:
  Dentry cache hash table entries: 536870912
  Inode-cache hash table entries: 268435456
  Mount-cache hash table entries: 8388608
  Mountpoint-cache hash table entries: 8388608
  CPU: Physical Processor ID: 0
  Total time: 3.244s

Link: http://lkml.kernel.org/r/1488432825-92126-4-git-send-email-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Cc: David Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-06 16:24:33 -07:00
Kirill Tkhai a0c4acd2c2 locking/rwsem-spinlock: Fix EINTR branch in __down_write_common()
If a writer could been woken up, the above branch

	if (sem->count == 0)
		break;

would have moved us to taking the sem. So, it's
not the time to wake a writer now, and only readers
are allowed now. Thus, 0 must be passed to __rwsem_do_wake().

Next, __rwsem_do_wake() wakes readers unconditionally.
But we mustn't do that if the sem is owned by writer
in the moment. Otherwise, writer and reader own the sem
the same time, which leads to memory corruption in
callers.

rwsem-xadd.c does not need that, as:

  1) the similar check is made lockless there,
  2) in __rwsem_mark_wake::try_reader_grant we test,

that sem is not owned by writer.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Niklas Cassel <niklas.cassel@axis.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 17fcbd590d "locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y"
Link: http://lkml.kernel.org/r/149762063282.19811.9129615532201147826.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05 12:26:29 +02:00
Linus Torvalds 650fc870a2 Merge tag 'docs-4.13' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
 "There has been a fair amount of activity in the docs tree this time
  around. Highlights include:

   - Conversion of a bunch of security documentation into RST

   - The conversion of the remaining DocBook templates by The Amazing
     Mauro Machine. We can now drop the entire DocBook build chain.

   - The usual collection of fixes and minor updates"

* tag 'docs-4.13' of git://git.lwn.net/linux: (90 commits)
  scripts/kernel-doc: handle DECLARE_HASHTABLE
  Documentation: atomic_ops.txt is core-api/atomic_ops.rst
  Docs: clean up some DocBook loose ends
  Make the main documentation title less Geocities
  Docs: Use kernel-figure in vidioc-g-selection.rst
  Docs: fix table problems in ras.rst
  Docs: Fix breakage with Sphinx 1.5 and upper
  Docs: Include the Latex "ifthen" package
  doc/kokr/howto: Only send regression fixes after -rc1
  docs-rst: fix broken links to dynamic-debug-howto in kernel-parameters
  doc: Document suitability of IBM Verse for kernel development
  Doc: fix a markup error in coding-style.rst
  docs: driver-api: i2c: remove some outdated information
  Documentation: DMA API: fix a typo in a function name
  Docs: Insert missing space to separate link from text
  doc/ko_KR/memory-barriers: Update control-dependencies example
  Documentation, kbuild: fix typo "minimun" -> "minimum"
  docs: Fix some formatting issues in request-key.rst
  doc: ReSTify keys-trusted-encrypted.txt
  doc: ReSTify keys-request-key.txt
  ...
2017-07-03 21:13:25 -07:00
Linus Torvalds 892ad5acca Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Add CONFIG_REFCOUNT_FULL=y to allow the disabling of the 'full'
     (robustness checked) refcount_t implementation with slightly lower
     runtime overhead. (Kees Cook)

     The lighter weight variant is the default. The two variants use the
     same API. Having this variant was a precondition by some
     maintainers to merge refcount_t cleanups.

   - Add lockdep support for rtmutexes (Peter Zijlstra)

   - liblockdep fixes and improvements (Sasha Levin, Ben Hutchings)

   - ... misc fixes and improvements"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
  locking/refcount: Remove the half-implemented refcount_sub() API
  locking/refcount: Create unchecked atomic_t implementation
  locking/rtmutex: Don't initialize lockdep when not required
  locking/selftest: Add RT-mutex support
  locking/selftest: Remove the bad unlock ordering test
  rt_mutex: Add lockdep annotations
  MAINTAINERS: Claim atomic*_t maintainership
  locking/x86: Remove the unused atomic_inc_short() methd
  tools/lib/lockdep: Remove private kernel headers
  tools/lib/lockdep: Hide liblockdep output from test results
  tools/lib/lockdep: Add dummy current_gfp_context()
  tools/include: Add IS_ERR_OR_NULL to err.h
  tools/lib/lockdep: Add empty __is_[module,kernel]_percpu_address
  tools/lib/lockdep: Include err.h
  tools/include: Add (mostly) empty include/linux/sched/mm.h
  tools/lib/lockdep: Use LDFLAGS
  tools/lib/lockdep: Remove double-quotes from soname
  tools/lib/lockdep: Fix object file paths used in an out-of-tree build
  tools/lib/lockdep: Fix compilation for 4.11
  tools/lib/lockdep: Don't mix fd-based and stream IO
  ...
2017-07-03 12:14:18 -07:00
Levin, Alexander (Sasha Levin) cde50a6739 locking/rtmutex: Don't initialize lockdep when not required
pi_mutex isn't supposed to be tracked by lockdep, but just
passing NULLs for name and key will cause lockdep to spew a
warning and die, which is not what we want it to do.

Skip lockdep initialization if the caller passed NULLs for
name and key, suggesting such initialization isn't desired.

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: f5694788ad ("rt_mutex: Add lockdep annotations")
Link: http://lkml.kernel.org/r/20170618140548.4763-1-alexander.levin@verizon.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20 11:53:09 +02:00
David S. Miller 95c4629d92 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 2017-06-10 14:06:46 -07:00
Paul E. McKenney c4a09ff752 rcu: Remove the now-obsolete PROVE_RCU_REPEATEDLY Kconfig option
The PROVE_RCU_REPEATEDLY Kconfig option was initially added due to
the volume of messages from PROVE_RCU: Doing just one per boot would
have required excessive numbers of boots to locate them all.  However,
PROVE_RCU messages are now relatively rare, so there is no longer any
reason to need more than one such message per boot.  This commit therefore
removes the PROVE_RCU_REPEATEDLY Kconfig option.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
2017-06-08 18:52:41 -07:00
Paul E. McKenney 681fbec881 lockdep: Use consistent printing primitives
Commit a5dd63efda ("lockdep: Use "WARNING" tag on lockdep splats")
substituted pr_warn() for printk() in places called out by Dmitry Vyukov.
However, this resulted in an ugly mix of pr_warn() and printk().  This
commit therefore changes printk() to pr_warn() or pr_cont(), depending
on the absence or presence of KERN_CONT.  This is done in all functions
that had printk() changed to pr_warn() by the aforementioned commit.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08 18:52:36 -07:00
Peter Zijlstra f5694788ad rt_mutex: Add lockdep annotations
Now that (PI) futexes have their own private RT-mutex interface and
implementation we can easily add lockdep annotations to the existing
RT-mutex interface.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-08 10:35:49 +02:00
Babu Moger 9ab6055f95 kernel/locking: Fix compile error with qrwlock.c
Saw these compile errors on SPARC when queued rwlock feature is enabled.

 CC      kernel/locking/qrwlock.o
kernel/locking/qrwlock.c: In function ‘queued_read_lock_slowpath’:
kernel/locking/qrwlock.c:89: error: implicit declaration of function ‘arch_spin_lock’
kernel/locking/qrwlock.c:102: error: implicit declaration of function ‘arch_spin_unlock’
make[4]: *** [kernel/locking/qrwlock.o] Error 1

Include spinlock.h in qrwlock.c to fix it.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Reviewed-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-25 12:06:50 -07:00
Peter Zijlstra 04dc1b2fff futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()
Markus reported that the glibc/nptl/tst-robustpi8 test was failing after
commit:

  cfafcd117d ("futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()")

The following trace shows the problem:

 ld-linux-x86-64-2161  [019] ....   410.760971: SyS_futex: 00007ffbeb76b028: 80000875  op=FUTEX_LOCK_PI
 ld-linux-x86-64-2161  [019] ...1   410.760972: lock_pi_update_atomic: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000875 ret=0
 ld-linux-x86-64-2165  [011] ....   410.760978: SyS_futex: 00007ffbeb76b028: 80000875  op=FUTEX_UNLOCK_PI
 ld-linux-x86-64-2165  [011] d..1   410.760979: do_futex: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000871 ret=0
 ld-linux-x86-64-2165  [011] ....   410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=0000
 ld-linux-x86-64-2161  [019] ....   410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=ETIMEDOUT

Task 2165 does an UNLOCK_PI, assigning the lock to the waiter task 2161
which then returns with -ETIMEDOUT. That wrecks the lock state, because now
the owner isn't aware it acquired the lock and removes the pending robust
list entry.

If 2161 is killed, the robust list will not clear out this futex and the
subsequent acquire on this futex will then (correctly) result in -ESRCH
which is unexpected by glibc, triggers an internal assertion and dies.

Task 2161			Task 2165

rt_mutex_wait_proxy_lock()
   timeout();
   /* T2161 is still queued in  the waiter list */
   return -ETIMEDOUT;

				futex_unlock_pi()
				spin_lock(hb->lock);
				rtmutex_unlock()
				  remove_rtmutex_waiter(T2161);
				   mark_lock_available();
				/* Make the next waiter owner of the user space side */
				futex_uval = 2161;
				spin_unlock(hb->lock);
spin_lock(hb->lock);
rt_mutex_cleanup_proxy_lock()
  if (rtmutex_owner() !== current)
     ...
     return FAIL;
....
return -ETIMEOUT;

This means that rt_mutex_cleanup_proxy_lock() needs to call
try_to_take_rt_mutex() so it can take over the rtmutex correctly which was
assigned by the waker. If the rtmutex is owned by some other task then this
call is harmless and just confirmes that the waiter is not able to acquire
it.

While there, fix what looks like a merge error which resulted in
rt_mutex_cleanup_proxy_lock() having two calls to
fixup_rt_mutex_waiters() and rt_mutex_wait_proxy_lock() not having any.
Both should have one, since both potentially touch the waiter list.

Fixes: 38d589f2fd ("futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()")
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Bug-Spotted-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Link: http://lkml.kernel.org/r/20170519154850.mlomgdsd26drq5j6@hirez.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-05-22 21:57:18 +02:00
Jonathan Corbet 6312811be2 Merge remote-tracking branch 'mauro-exp/docbook3' into death-to-docbook
Mauro says:

This patch series convert the remaining DocBooks to ReST.

The first version was originally
send as 3 patch series:

   [PATCH 00/36] Convert DocBook documents to ReST
   [PATCH 0/5] Convert more books to ReST
   [PATCH 00/13] Get rid of DocBook

The lsm book was added as if it were a text file under
Documentation. The plan is to merge it with another file
under Documentation/security, after both this series and
a security Documentation patch series gets merged.

It also adjusts some Sphinx-pedantic errors/warnings on
some kernel-doc markups.

I also added some patches here to add PDF output for all
existing ReST books.
2017-05-18 11:03:08 -06:00
Mauro Carvalho Chehab 7b4ff1adb5 mutex, futex: adjust kernel-doc markups to generate ReST
There are a few issues on some kernel-doc markups that was
causing troubles with kernel-doc output on ReST format:

./kernel/futex.c:492: WARNING: Inline emphasis start-string without end-string.
./kernel/futex.c:1264: WARNING: Block quote ends without a blank line; unexpected unindent.
./kernel/futex.c:1721: WARNING: Block quote ends without a blank line; unexpected unindent.
./kernel/futex.c:2338: WARNING: Block quote ends without a blank line; unexpected unindent.
./kernel/futex.c:2426: WARNING: Block quote ends without a blank line; unexpected unindent.
./kernel/futex.c:2899: WARNING: Block quote ends without a blank line; unexpected unindent.
./kernel/futex.c:2972: WARNING: Block quote ends without a blank line; unexpected unindent.

Fix them.

No functional changes.

Acked-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-05-16 08:43:25 -03:00
Linus Torvalds de4d195308 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main changes are:

   - Debloat RCU headers

   - Parallelize SRCU callback handling (plus overlapping patches)

   - Improve the performance of Tree SRCU on a CPU-hotplug stress test

   - Documentation updates

   - Miscellaneous fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits)
  rcu: Open-code the rcu_cblist_n_lazy_cbs() function
  rcu: Open-code the rcu_cblist_n_cbs() function
  rcu: Open-code the rcu_cblist_empty() function
  rcu: Separately compile large rcu_segcblist functions
  srcu: Debloat the <linux/rcu_segcblist.h> header
  srcu: Adjust default auto-expediting holdoff
  srcu: Specify auto-expedite holdoff time
  srcu: Expedite first synchronize_srcu() when idle
  srcu: Expedited grace periods with reduced memory contention
  srcu: Make rcutorture writer stalls print SRCU GP state
  srcu: Exact tracking of srcu_data structures containing callbacks
  srcu: Make SRCU be built by default
  srcu: Fix Kconfig botch when SRCU not selected
  rcu: Make non-preemptive schedule be Tasks RCU quiescent state
  srcu: Expedite srcu_schedule_cbs_snp() callback invocation
  srcu: Parallelize callback handling
  kvm: Move srcu_struct fields to end of struct kvm
  rcu: Fix typo in PER_RCU_NODE_PERIOD header comment
  rcu: Use true/false in assignment to bool
  rcu: Use bool value directly
  ...
2017-05-10 10:30:46 -07:00