If the lockdep code really runs out of stack_trace entries, it is
likely that a buffer overrun will happen and the data immediately after
stack_trace[] will be corrupted.

If there are fewer than LOCK_TRACE_SIZE_IN_LONGS entries left before
the call to save_trace(), the max_entries computation leaves it with a
very large positive number because of its unsigned nature. The
subsequent call to stack_trace_save() will then corrupt the data after
stack_trace[]. Fix that by changing max_entries to a signed integer and
checking for a negative value before calling stack_trace_save().
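
The underflow is easy to demonstrate in isolation. A minimal userspace
sketch of the arithmetic (the constants are illustrative, much smaller
than the kernel's actual values):

    #include <stdio.h>

    #define MAX_STACK_TRACE_ENTRIES   16
    #define LOCK_TRACE_SIZE_IN_LONGS   4

    int main(void)
    {
        /* Only 2 entries left, fewer than LOCK_TRACE_SIZE_IN_LONGS. */
        unsigned int nr_stack_trace_entries = 14;

        /* Unsigned arithmetic wraps to a huge positive number... */
        unsigned int bad = MAX_STACK_TRACE_ENTRIES -
                           nr_stack_trace_entries - LOCK_TRACE_SIZE_IN_LONGS;

        /* ...whereas signed arithmetic goes negative and can be checked. */
        int good = MAX_STACK_TRACE_ENTRIES -
                   (int)nr_stack_trace_entries - LOCK_TRACE_SIZE_IN_LONGS;

        printf("unsigned max_entries = %u\n", bad);  /* 4294967294 */
        printf("signed   max_entries = %d\n", good); /* -2 */
        return 0;
    }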
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 12593b7467 ("locking/lockdep: Reduce space occupied by stack traces")
Link: https://lkml.kernel.org/r/20191220135128.14876-1-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull locking updates from Ingo Molnar:
- improve rwsem scalability
- add uninitialized rwsem debugging check
- reduce lockdep's stacktrace memory usage and add diagnostics
- misc cleanups, code consolidation and constification
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
mutex: Fix up mutex_waiter usage
locking/mutex: Use mutex flags macro instead of hard code
locking/mutex: Make __mutex_owner static to mutex.c
locking/qspinlock,x86: Clarify virt_spin_lock_key
locking/rwsem: Check for operations on an uninitialized rwsem
locking/rwsem: Make handoff writer optimistically spin on owner
locking/lockdep: Report more stack trace statistics
locking/lockdep: Reduce space occupied by stack traces
stacktrace: Constify 'entries' arguments
locking/lockdep: Make it clear that what lock_class::key points at is not modified
Security is a wonderful thing, but so is the ability to debug based on
lockdep warnings. This commit therefore makes lockdep lock addresses
visible in the clear.
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Although commit 669de8bda8 ("kernel/workqueue: Use dynamic lockdep keys
for workqueues") unregisters dynamic lockdep keys when a workqueue is
destroyed, a side effect of that commit is that all stack traces
associated with the lockdep key are leaked when a workqueue is destroyed.
Fix this by storing each unique stack trace once. Other changes in this
patch are:
- Use NULL instead of { .nr_entries = 0 } to represent 'no trace'.
- Store a pointer to a stack trace in struct lock_class and struct
lock_list instead of storing 'nr_entries' and 'offset'.
This patch prevents the following program from triggering the
"BUG: MAX_STACK_TRACE_ENTRIES too low!" complaint:

    #include <fcntl.h>
    #include <unistd.h>

    int main()
    {
        for (;;) {
            int fd = open("/dev/infiniband/rdma_cm", O_RDWR);

            close(fd);
        }
    }
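
The interning itself can be modeled with a hash table keyed on the
trace contents. A hypothetical userspace sketch of the idea (the names
and the table layout are illustrative, not the kernel's actual data
structures):

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical model of "store each unique stack trace once". */
    struct trace {
        struct trace *next;     /* hash-chain link */
        uint32_t hash;
        uint32_t nr_entries;
        unsigned long entries[];
    };

    #define NBUCKETS 128
    static struct trace *buckets[NBUCKETS];

    /* FNV-1a over the raw trace words. */
    static uint32_t hash_entries(const unsigned long *e, uint32_t n)
    {
        uint32_t h = 2166136261u;
        const unsigned char *p = (const unsigned char *)e;

        for (size_t i = 0; i < (size_t)n * sizeof(*e); i++)
            h = (h ^ p[i]) * 16777619u;
        return h;
    }

    /* Return the shared copy of a trace, allocating only when unseen. */
    static struct trace *save_trace_once(const unsigned long *e, uint32_t n)
    {
        uint32_t h = hash_entries(e, n);
        struct trace *t;

        for (t = buckets[h % NBUCKETS]; t; t = t->next)
            if (t->hash == h && t->nr_entries == n &&
                !memcmp(t->entries, e, n * sizeof(*e)))
                return t;       /* already stored once */

        t = malloc(sizeof(*t) + n * sizeof(*e));
        t->hash = h;
        t->nr_entries = n;
        memcpy(t->entries, e, n * sizeof(*e));
        t->next = buckets[h % NBUCKETS];
        buckets[h % NBUCKETS] = t;
        return t;
    }

    int main(void)
    {
        unsigned long tr[] = { 0x1234, 0x5678 };

        /* Saving the same trace twice yields the same pointer. */
        return save_trace_once(tr, 2) != save_trace_once(tr, 2);
    }

In this model, saving the same trace any number of times costs one
allocation in total, so repeatedly creating and destroying a workqueue
no longer leaks trace storage.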
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yuyang Du <duyuyang@gmail.com>
Link: https://lkml.kernel.org/r/20190722182443.216015-4-bvanassche@acm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The sequence

    static DEFINE_WW_CLASS(test_ww_class);

    struct ww_acquire_ctx ww_ctx;
    struct ww_mutex ww_lock_a;
    struct ww_mutex ww_lock_b;
    struct ww_mutex ww_lock_c;
    struct mutex lock_c;

    ww_acquire_init(&ww_ctx, &test_ww_class);

    ww_mutex_init(&ww_lock_a, &test_ww_class);
    ww_mutex_init(&ww_lock_b, &test_ww_class);
    ww_mutex_init(&ww_lock_c, &test_ww_class);

    mutex_init(&lock_c);

    ww_mutex_lock(&ww_lock_a, &ww_ctx);
    mutex_lock(&lock_c);
    ww_mutex_lock(&ww_lock_b, &ww_ctx);
    ww_mutex_lock(&ww_lock_c, &ww_ctx);

    mutex_unlock(&lock_c);          (*)

    ww_mutex_unlock(&ww_lock_c);
    ww_mutex_unlock(&ww_lock_b);
    ww_mutex_unlock(&ww_lock_a);

    ww_acquire_fini(&ww_ctx);       (**)

will trigger the following error in __lock_release() when calling
mutex_release() at (**):

    DEBUG_LOCKS_WARN_ON(depth <= 0)

The problem is that the hlock merging happening at (*) incorrectly
updates the references for test_ww_class to 3, whereas it should have
updated them to 4 (representing all the instances for ww_ctx and
ww_lock_[abc]).

Fix this by updating the references correctly during merging, taking
into account that both the hlock being merged and the hlock it is
merged into can have non-zero references.
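
The accounting rule can be modeled in a few lines. The helper below is
hypothetical and only illustrates the rule described above: a reference
count of zero stands for a single implicit reference on either side of
the merge:

    #include <stdio.h>

    /*
     * Hypothetical model of the merge accounting: a count of 0 means
     * "held exactly once", so normalize both sides before adding.
     */
    static unsigned int merge_references(unsigned int existing,
                                         unsigned int incoming)
    {
        if (!existing)
            existing = 1;
        if (!incoming)
            incoming = 1;
        return existing + incoming;
    }

    int main(void)
    {
        /*
         * Merging ww_lock_c (held once, references 0) into an hlock
         * already covering three instances must yield 4, not 3.
         */
        printf("%u\n", merge_references(3, 0));
        return 0;
    }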
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/r/20190524201509.9199-2-imre.deak@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The sequence

    static DEFINE_WW_CLASS(test_ww_class);

    struct ww_acquire_ctx ww_ctx;
    struct ww_mutex ww_lock_a;
    struct ww_mutex ww_lock_b;
    struct mutex lock_c;
    struct mutex lock_d;

    ww_acquire_init(&ww_ctx, &test_ww_class);

    ww_mutex_init(&ww_lock_a, &test_ww_class);
    ww_mutex_init(&ww_lock_b, &test_ww_class);

    mutex_init(&lock_c);

    ww_mutex_lock(&ww_lock_a, &ww_ctx);
    mutex_lock(&lock_c);
    ww_mutex_lock(&ww_lock_b, &ww_ctx);

    mutex_unlock(&lock_c);          (*)

    ww_mutex_unlock(&ww_lock_b);
    ww_mutex_unlock(&ww_lock_a);

    ww_acquire_fini(&ww_ctx);

triggers the following WARN in __lock_release() when doing the unlock
at (*):

    DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth - 1);

The problem is that the WARN check doesn't take into account the
merging of ww_lock_a and ww_lock_b, which results in decreasing
curr->lockdep_depth by 2 rather than only 1.

Note that the following sequence doesn't trigger the WARN, since there
won't be any hlock merging:

    ww_acquire_init(&ww_ctx, &test_ww_class);

    ww_mutex_init(&ww_lock_a, &test_ww_class);
    ww_mutex_init(&ww_lock_b, &test_ww_class);

    mutex_init(&lock_c);
    mutex_init(&lock_d);

    ww_mutex_lock(&ww_lock_a, &ww_ctx);
    mutex_lock(&lock_c);
    mutex_lock(&lock_d);
    ww_mutex_lock(&ww_lock_b, &ww_ctx);

    mutex_unlock(&lock_d);

    ww_mutex_unlock(&ww_lock_b);
    ww_mutex_unlock(&ww_lock_a);

    mutex_unlock(&lock_c);

    ww_acquire_fini(&ww_ctx);

In general both of the above sequences are valid and shouldn't trigger
any lockdep warning.

Fix this by taking the decrement due to hlock merging into account
during lock release and hlock class re-setting, as modeled below.
Merging can't happen during lock downgrading, since there is no new
possibility to merge hlocks in that case, so add a WARN if merging
still happens then.
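
The adjusted invariant can be modeled as follows (a hypothetical
helper, not the kernel's exact code): releasing a lock removes one
entry from the held-lock stack, and each hlock merge during
reacquisition removes one more:

    #include <stdio.h>

    /* Hypothetical model of the relaxed consistency check. */
    static int release_depth_ok(int depth_before, int depth_after,
                                int merges)
    {
        return depth_after == depth_before - 1 - merges;
    }

    int main(void)
    {
        /* No merging: depth 4 -> 3 after the release. */
        printf("%d\n", release_depth_ok(4, 3, 0));
        /* ww_lock_a/ww_lock_b merge on reacquire: depth 4 -> 2. */
        printf("%d\n", release_depth_ok(4, 2, 1));
        return 0;
    }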
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: ville.syrjala@linux.intel.com
Link: https://lkml.kernel.org/r/20190524201509.9199-1-imre.deak@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In mark_lock_irq(), the following checks are performed:

     -------------------------------------
    |     ->     | unsafe | read unsafe |
    |------------|--------|-------------|
    | safe       |  F  B  |   F*  B*    |
    |------------|--------|-------------|
    | read safe  | F?  B* |      -      |
     -------------------------------------

Where:

    F: check_usage_forwards
    B: check_usage_backwards
    *: check enabled by STRICT_READ_CHECKS
    ?: check enabled by the !dir condition

From a checking point of view, the special F? case does not make sense;
it was presumably added for performance reasons. Since a later patch
will address this issue, remove the exception now, which makes the
checks consistent.

With STRICT_READ_CHECKS = 1, which is the default, there is no
functional change.
Signed-off-by: Yuyang Du <duyuyang@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bvanassche@acm.org
Cc: frederic@kernel.org
Cc: ming.lei@redhat.com
Cc: will.deacon@arm.com
Link: https://lkml.kernel.org/r/20190506081939.74287-24-duyuyang@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter has explained the rationale for this change soundly and
completely, so I simply quote him:
"It (check_redundant) was added for cross-release (which has since been
reverted) which would generate a lot of redundant links (IIRC) but
having it makes the reports more convoluted -- basically, if we had an
A-B-C relation, then A-C will not be added to the graph because it is
already covered. This then means any report will include B, even though
a shorter cycle might have been possible."
Removing the check increases the number of direct dependencies. For a
simple workload (make clean; reboot; make vmlinux -j8), the data looks
like this:

    CONFIG_LOCKDEP_SMALL:  direct dependencies: 6926
    !CONFIG_LOCKDEP_SMALL: direct dependencies: 9052  (+30.7%)
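
The mechanism being removed can be illustrated with a tiny reachability
check over the dependency graph. This is a hypothetical userspace
model, not the lockdep implementation:

    #include <stdio.h>

    #define NLOCKS 3
    static int edge[NLOCKS][NLOCKS];

    /* Depth-first reachability query over the dependency graph. */
    static int reachable(int from, int to, int *seen)
    {
        if (from == to)
            return 1;
        seen[from] = 1;
        for (int next = 0; next < NLOCKS; next++)
            if (edge[from][next] && !seen[next] &&
                reachable(next, to, seen))
                return 1;
        return 0;
    }

    /* The check being removed: drop A->C when C is already reachable. */
    static int add_dependency(int from, int to, int check_redundant)
    {
        int seen[NLOCKS] = { 0 };

        if (check_redundant && reachable(from, to, seen))
            return 0;           /* redundant, not recorded */
        edge[from][to] = 1;
        return 1;
    }

    int main(void)
    {
        enum { A, B, C };

        add_dependency(A, B, 1);
        add_dependency(B, C, 1);
        /* With the check, the direct A->C link is suppressed... */
        printf("A->C recorded: %d\n", add_dependency(A, C, 1));
        /* ...without it, the shorter A->C edge stays in the graph. */
        printf("A->C recorded: %d\n", add_dependency(A, C, 0));
        return 0;
    }

With the direct edge kept, a report about an A->C cycle can show the
short path instead of detouring through B.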
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Yuyang Du <duyuyang@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bvanassche@acm.org
Cc: frederic@kernel.org
Cc: ming.lei@redhat.com
Cc: will.deacon@arm.com
Link: https://lkml.kernel.org/r/20190506081939.74287-21-duyuyang@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>