When a NOQUEUE request fails, the rsb res_master field is unnecessarily
reset to -1, instead of leaving the valid master setting in place. We
want to save the looked-up master values while the rsb is on the "toss
list" so that another lookup can be avoided if the rsb is soon reused.
The fix is to simply leave res_master value alone.
Signed-off-by: David Teigland <teigland@redhat.com>
We *can* get there from receive_request() and dlm_recover_master_copy()
with namelen too large if incoming request is invalid; BUG() from
DLM_ASSERT() in allocate_rsb() is a bit excessive reaction to that
and in case of dlm_recover_master_copy() we would actually oops before
that while calculating hash of up to 64Kb worth of data - with data
actually being 64 _bytes_ in kmalloc()'ed struct.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Teigland <teigland@redhat.com>
* check that length is large enough to cover the non-variable part of message or
rcom resp. (after checking that it's large enough to cover the header, of
course).
* kill more pointless casts
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Teigland <teigland@redhat.com>
a) don't cast the pointer to dlm_header *, we use it as dlm_message *
anyway.
b) we copy the message into a queue element, then pass the pointer to
copy to dlm_receive_message_saved(); declare it properly to make sure
that we have the right alignment.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Teigland <teigland@redhat.com>
To prevent the master of an rsb from changing rapidly, an unused rsb is kept
on the "toss list" for a period of time to be reused. The toss list was
being cleared completely for each recovery, which is unnecessary. Much of
the benefit of the toss list can be maintained if nodes keep rsb's in their
toss list that they are the master of. These rsb's need to be included
when the resource directory is rebuilt during recovery.
Signed-off-by: David Teigland <teigland@redhat.com>
The invalid lockspace messages are normal and can appear relatively
often. They should be suppressed without debugging enabled.
Signed-off-by: David Teigland <teigland@redhat.com>
In a rare case we may need to repeat a local resource directory lookup
due to a race with removing the rsb and removing the resdir record.
We'll never need to do more than a single additional lookup, though,
so the infinite loop around the lookup can be removed. In addition
to being unnecessary, the infinite loop is dangerous since some other
unknown condition may appear causing the loop to never break.
Signed-off-by: David Teigland <teigland@redhat.com>
Non-forced unlocks should be rejected if the lock is waiting on the
rsb_lookup list for another lock to establish the master node.
Signed-off-by: David Teigland <teigland@redhat.com>
There was some hit and miss validation of messages that has now been
cleaned up and unified. Before processing a message, the new
validate_message() function checks that the lkb is the appropriate type,
process-copy or master-copy, and that the message is from the correct
nodeid for the the given lkb. Other checks and assertions on the
lkb type and nodeid have been removed. The assertions were particularly
bad since they would panic the machine instead of just ignoring the bad
message.
Although other recent patches have made processing old message unlikely,
it still may be possible for an old message to be processed and caught
by these checks.
Signed-off-by: David Teigland <teigland@redhat.com>
Messages from nodes that are no longer members of the lockspace should be
ignored. When nodes are removed from the lockspace, recovery can
sometimes complete quickly enough that messages arrive from a removed node
after recovery has completed. When processed, these messages would often
cause an error message, and could in some cases change some state, causing
problems.
Signed-off-by: David Teigland <teigland@redhat.com>
When a failed request (EBADR or ENOTBLK) is unlocked/canceled instead of
retried, there may be other lkb's waiting on the rsb_lookup list for it
to complete. A call to confirm_master() is needed to move on to the next
waiting lkb since the current one won't be retried.
Signed-off-by: David Teigland <teigland@redhat.com>
When recovery looks at locks waiting for replies, it fails to consider
locks that have already received a reply for their first remote operation,
but not received a reply for secondary, overlapping unlock/cancel. The
appropriate stub reply needs to be called for these waiters.
Appears when we start doing recovery in the presence of a many overlapping
unlock/cancel ops.
Signed-off-by: David Teigland <teigland@redhat.com>
The lkb_ast_type field indicates whether the lkb is on the astqueue list.
When clearing locks for a process, lkb's were being removed from the astqueue
list without clearing the field. If release_lockspace then happened
immediately afterward, it could try to remove the lkb from the list a second
time.
Appears when process calls libdlm dlm_release_lockspace() which first
closes the ls dev triggering clear_proc_locks, and then removes the ls
(a write to control dev) causing release_lockspace().
Signed-off-by: David Teigland <teigland@redhat.com>
The dlm functions in memory.c should use the dlm_ prefix. Also, use
kzalloc/kfree directly for dlm_direntry's, removing the wrapper functions.
Signed-off-by: David Teigland <teigland@redhat.com>
Change log_error() to log_debug() for conditions that can occur in
large number in normal operation.
Signed-off-by: David Teigland <teigland@redhat.com>
This patch adds a proper prototype for some functions in
fs/dlm/dlm_internal.h
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David Teigland <teigland@redhat.com>
Introduce a per-lockspace rwsem that's held in read mode by dlm_recv
threads while working in the dlm. This allows dlm_recv activity to be
suspended when the lockspace transitions to, from and between recovery
cycles.
The specific bug prompting this change is one where an in-progress
recovery cycle is aborted by a new recovery cycle. While dlm_recv was
processing a recovery message, the recovery cycle was aborted and
dlm_recoverd began cleaning up. dlm_recv decremented recover_locks_count
on an rsb after dlm_recoverd had reset it to zero. This is fixed by
suspending dlm_recv (taking write lock on the rwsem) before aborting the
current recovery.
The transitions to/from normal and recovery modes are simplified by using
this new ability to block dlm_recv. The switch from normal to recovery
mode means dlm_recv goes from processing locking messages, to saving them
for later, and vice versa. Races are avoided by blocking dlm_recv when
setting the flag that switches between modes.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
If the castaddr passed to the userland API is NULL then don't overwrite the
existing castparam. This allows a different thread to cancel a lock request and
the CANCEL AST gets delivered to the original thread.
bz#306391 (for RHEL4) refers.
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>